ELK stands for Elasticsearch, Logstash, and Kibana. In the ELK Stack,
Logstash extracts logging data or other events from different input sources,
processes the events, and then stores them in Elasticsearch. Kibana is a web
interface that accesses the logging data from Elasticsearch and visualizes it.
In this tutorial, you will learn:
- Introduction to ELK Stack
- ELK Stack Architecture
- Introduction to Elasticsearch
- What is Logstash?
- What is Kibana?
- Case Studies in ELK Stack
Introduction to ELK Stack
The ELK Stack is a collection of three open-source products:
Elasticsearch, Logstash, and Kibana. All three are developed,
maintained, and managed by the company Elastic.
- E stands for Elasticsearch: used for storing logs
- L stands for Logstash: used for shipping, processing, and storing logs
- K stands for Kibana: a visualization tool (a web interface) that is
typically served behind a web server such as Nginx or Apache
The ELK Stack is designed to allow users to take data from any
source, in any format, and to search, visualize, and analyze that data in
real time.
ELK provides centralized logging, which is useful when you are trying to identify problems with applications or servers. It allows you to search all your logs in a single place. It also helps you find issues that span multiple servers by correlating their logs over a specific time frame.
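To make the idea of correlating logs over a time frame concrete, here is a minimal sketch in plain Python. It does not use ELK itself; the records, server names, and the `logs_in_window` helper are all hypothetical and only illustrate what a time-window search over centralized logs does.

```python
from datetime import datetime, timedelta

# Hypothetical, already-centralized log records from two servers.
LOGS = [
    {"server": "web-1", "time": "2024-01-01T10:00:05", "message": "upstream timeout"},
    {"server": "db-1", "time": "2024-01-01T10:00:07", "message": "too many connections"},
    {"server": "web-1", "time": "2024-01-01T10:15:00", "message": "request served"},
    {"server": "db-1", "time": "2024-01-01T09:30:00", "message": "checkpoint complete"},
]


def logs_in_window(logs, center, seconds):
    """Return records from every server within +/- `seconds` of `center`, oldest first."""
    start = center - timedelta(seconds=seconds)
    end = center + timedelta(seconds=seconds)
    hits = [r for r in logs if start <= datetime.fromisoformat(r["time"]) <= end]
    return sorted(hits, key=lambda r: r["time"])


incident = datetime.fromisoformat("2024-01-01T10:00:06")
related = logs_in_window(LOGS, incident, seconds=30)
for record in related:
    print(record["time"], record["server"], record["message"])
```

With all logs in one place, a single query around the incident time surfaces the web server timeout and the database connection error side by side.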
ELK Stack Architecture
The ELK Stack architecture shows the order in which logs flow
through ELK. Logs generated by various sources are collected and
processed by Logstash, based on the provided filter criteria.
Logstash then pipes those logs to Elasticsearch, which stores and indexes
the data so that it can be searched and analyzed. Finally, Kibana is used
to visualize and manage the logs as required.
- Logs: The server logs that need to be analyzed are identified
- Logstash: Collects logs and event data. It also parses and transforms the data
- Elasticsearch: Stores, indexes, and searches the transformed data from Logstash
- Kibana: Uses the data in Elasticsearch to explore, visualize, and share
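The flow described above can be sketched as a toy program. This is not how the real products are implemented; the log format, `ToyStore`, and the function names are assumptions chosen only to mirror the three roles (collect and parse, store and search, summarize for display).

```python
import re
from collections import Counter

# Hypothetical raw log lines in a "timestamp LEVEL message" layout.
RAW_LOGS = [
    "2024-01-01T10:00:00 ERROR payment failed",
    "2024-01-01T10:00:01 INFO user logged in",
    "2024-01-01T10:00:02 ERROR database timeout",
]

LINE_PATTERN = re.compile(r"(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.+)")


def collect_and_parse(lines):
    """Logstash role: turn raw lines into structured events, skipping unparsable ones."""
    events = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            events.append(match.groupdict())
    return events


class ToyStore:
    """Elasticsearch role: keep events and answer simple field lookups."""

    def __init__(self):
        self.documents = []

    def index(self, event):
        self.documents.append(event)

    def search(self, field, value):
        return [doc for doc in self.documents if doc.get(field) == value]


def summarize(store):
    """Kibana role: reduce stored events to numbers a chart could display."""
    return Counter(doc["level"] for doc in store.documents)


store = ToyStore()
for event in collect_and_parse(RAW_LOGS):
    store.index(event)

errors = store.search("level", "ERROR")
summary = summarize(store)
print(len(errors), dict(summary))
```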
Introduction to Elasticsearch
Elasticsearch is a NoSQL database. It is based on the Lucene
search engine and exposes RESTful APIs. It offers reliability,
simple deployment, and easy management. It also offers advanced queries for
detailed analysis and stores all the data centrally, which makes it quick
to search the documents.
Elasticsearch also allows you to store, search, and analyze
large volumes of data. It is mostly used as the underlying engine that powers
applications with complex search requirements. It has been adopted in search
engine platforms for modern web and mobile applications. Apart from quick
search, the tool offers complex analytics and many advanced features.
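As a rough illustration of the RESTful, JSON-based interface, the sketch below builds a full-text search request with Python's standard library. The index name `app-logs`, the `message` field, and the local address are assumptions; the request is only constructed and printed, not sent.

```python
import json
import urllib.request

# Assumptions: a local node on the default port and a hypothetical index name.
ES_URL = "http://localhost:9200"
INDEX = "app-logs"


def build_search_request(text, size=5):
    """Build (but do not send) a full-text search request for the `message` field."""
    body = {
        "size": size,
        "query": {"match": {"message": text}},
    }
    return urllib.request.Request(
        url=f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("database timeout")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running cluster: urllib.request.urlopen(request)
```

The response from a real cluster would also be JSON, so the same `json` module can be used to read the matching documents.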
Features of Elasticsearch
- Open-source search server written in Java
- Used to index any kind of heterogeneous data
- Has a REST API web interface with JSON output
- Full-text search
- Near real-time (NRT) search
- Sharded, replicated, searchable JSON document store
- Schema-free, JSON- and REST-based distributed document store
- Multi-language and geolocation support
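Full-text search in Lucene-based engines rests on an inverted index, which maps each term to the documents that contain it. The following is a deliberately simplified sketch of that idea, not Elasticsearch's actual implementation: the `InvertedIndex` class and its tokenizer are hypothetical, and real engines add analysis steps and relevance scoring on top.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Map each term to the set of document IDs that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return IDs of documents containing every term in the query (AND semantics)."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = InvertedIndex()
index.add(1, "Database timeout while saving order")
index.add(2, "User login succeeded")
index.add(3, "Timeout contacting payment database")

print(sorted(index.search("database timeout")))  # prints [1, 3]
```

Because the lookup goes from term to documents, a search touches only the postings for the query terms instead of scanning every document.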
Advantages of Elasticsearch
- Stores schema-less data and also creates a schema for your data
- Lets you manipulate your data record by record with the help of multi-document APIs
- Lets you filter and query your data for insights
- Is based on Apache Lucene and provides a RESTful API
- Provides horizontal scalability, reliability, and multitenant capability for real-time indexing, which makes search faster
- Helps you scale vertically and horizontally
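One of the multi-document APIs is the bulk API, which accepts newline-delimited JSON: an action line followed by the document source, repeated per document. The sketch below only builds such a request body; the index name, IDs, and documents are hypothetical, and nothing is sent to a cluster.

```python
import json


def build_bulk_body(index_name, documents):
    """Build newline-delimited JSON: one action line, then one source line, per document."""
    lines = []
    for doc_id, source in documents.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


docs = {
    "1": {"level": "ERROR", "message": "payment failed"},
    "2": {"level": "INFO", "message": "user logged in"},
}
body = build_bulk_body("app-logs", docs)
print(body, end="")
# This body would be POSTed to /_bulk with Content-Type: application/x-ndjson.
```

Sending many documents in one request avoids a network round trip per record, which is the main reason to prefer it for log ingestion.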
What is Logstash?
Logstash is a data collection pipeline tool. It collects
data inputs and feeds them into Elasticsearch. It gathers all types of data
from different sources and makes it available for further use.
Logstash can unify data from disparate sources and normalize
the data before sending it to your desired destinations. It allows you to
cleanse and democratize all your data for analytics and visualization use cases.
It consists of three components:
- Input: Passes logs in so they can be processed into a machine-understandable format
- Filters: A set of conditions used to perform a particular action or event
- Output: The decision maker for a processed log or event
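The three components can be pictured as chained stages. The sketch below is written in Python rather than Logstash's own configuration language, and every name in it (the stage functions, the `parse_failure` tag, the `alerts` and `archive` destinations) is hypothetical. It only illustrates how an event moves from input, through conditional filters, to an output that decides its destination.

```python
import re

# Hypothetical line layout: "LEVEL message".
LINE_PATTERN = re.compile(r"(?P<level>[A-Z]+) (?P<message>.+)")


def input_stage(lines):
    """Input: wrap each raw line in an event dictionary."""
    for line in lines:
        yield {"raw": line.strip()}


def filter_stage(events):
    """Filter: parse fields out of the raw line and drop DEBUG events."""
    for event in events:
        match = LINE_PATTERN.match(event["raw"])
        if match is None:
            event["tags"] = ["parse_failure"]
        else:
            event.update(match.groupdict())
            if event["level"] == "DEBUG":
                continue  # condition met: discard the event
        yield event


def output_stage(events):
    """Output: decide where each processed event goes."""
    destinations = {"alerts": [], "archive": []}
    for event in events:
        target = "alerts" if event.get("level") == "ERROR" else "archive"
        destinations[target].append(event)
    return destinations


lines = ["ERROR disk full", "DEBUG cache hit", "INFO job finished", "???"]
routed = output_stage(filter_stage(input_stage(lines)))
print(len(routed["alerts"]), len(routed["archive"]))  # prints 1 2
```

Keeping unparsable lines and tagging them, instead of dropping them silently, makes parsing problems visible downstream.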
Features of Logstash
- Events are passed through each phase using internal queues
- Allows different inputs for your logs
- Filtering/parsing for your logs
Advantages of Logstash
- Offers centralized data processing
- Analyzes a large variety of structured and unstructured data and events
- Offers plugins to connect with different types of input sources and platforms
What is Kibana?
Kibana is a data visualization tool that completes the ELK stack.
It is used to visualize Elasticsearch documents and helps
developers gain quick insight into them. The Kibana dashboard offers different
interactive diagrams, geospatial data, and graphs to visualize complex queries.
It can be used to search, view, and interact with data
stored in Elasticsearch indices. Kibana helps you perform advanced data
analysis and visualize your data in a variety of tables, charts, and maps.
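A time-based chart is essentially a count of events per time bucket. The sketch below reproduces that idea locally in Python with hypothetical timestamps and a text bar chart; it is an assumption-laden illustration of what a dashboard panel displays, not Kibana's own code.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event timestamps, as they might be stored in Elasticsearch.
TIMESTAMPS = [
    "2024-01-01T10:00:05",
    "2024-01-01T10:00:40",
    "2024-01-01T10:01:10",
    "2024-01-01T10:03:59",
    "2024-01-01T10:03:01",
    "2024-01-01T10:03:30",
]


def per_minute_counts(timestamps):
    """Count events per minute by truncating each timestamp to the minute."""
    buckets = Counter()
    for value in timestamps:
        minute = datetime.fromisoformat(value).replace(second=0, microsecond=0)
        buckets[minute.strftime("%H:%M")] += 1
    return dict(sorted(buckets.items()))


counts = per_minute_counts(TIMESTAMPS)
for minute, count in counts.items():
    print(f"{minute} {'#' * count} {count}")
```

In a real deployment the bucketing would normally be computed by Elasticsearch and only the resulting counts would be drawn by the dashboard.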
Features of Kibana
- Powerful front-end dashboard capable of visualizing indexed information from the Elasticsearch cluster
- Enables real-time search of indexed information
- You can search, view, and interact with data stored in Elasticsearch
- Execute queries on data and visualize results in charts, tables, and maps
- Configurable dashboard to slice and dice Logstash logs in Elasticsearch
- Capable of providing historical data in the form of graphs, charts, etc.
- Real-time dashboards which are easily configurable
Advantages of Kibana
- Easy visualization
- Fully integrated with Elasticsearch
- Visualization tool
- Offers real-time analysis, charting, summarization, and debugging capabilities
- Provides an intuitive and user-friendly interface
- Allows sharing of snapshots of the logs searched through
- Permits saving the dashboard and managing multiple dashboards
Case Studies in ELK Stack
Netflix
Netflix relies heavily on the ELK stack. The company uses
the ELK stack to monitor and analyze the security logs of its customer service
operations. It allows them to store, index, and search documents from more than
fifteen clusters, which comprise almost 800 nodes.
LinkedIn
LinkedIn, the well-known social media site, uses the
ELK stack to monitor performance and security. The IT team integrated ELK with
Kafka to support its load in real time. Their ELK operation consists of more
than 100 clusters across six different data centers.
Tripwire
Tripwire is a worldwide Security Information and Event
Management (SIEM) system. The business uses ELK to support information packet
log analysis.
Medium
Medium is a famous blog-publishing platform. They use the
ELK stack to debug their production issues. The business also uses ELK to
detect DynamoDB hotspots. Moreover, using this stack, the company can support
thousands of published posts each week as well as 25 million unique readers.