ELK Stack: Elasticsearch, Logstash & Kibana Logs
Build centralized logging with Elasticsearch, Logstash pipelines, Beats shippers, and Kibana visualization
Greetings, brave adventurer! High in the Watchtower of the Warrior tier you have learned to read the three pillars of observability. Now you descend into the great Hall of Records - the realm of logs - where every service in your kingdom scribbles its story in a thousand scattered scrolls. Alone, those scrolls are noise. Gathered, parsed, and indexed, they become an oracle that answers any question about what happened, when, and why. This quest, The ELK Stack, teaches you to build that oracle.
Whether you have only ever run tail -f on a single log file, or you already grep across servers and yearn for something better, this adventure forges the discipline every Warrior of the logs needs: structured logging, parsing pipelines, indices and mappings, and dashboards that turn raw events into insight.
📖 The Legend Behind This Quest
In the early ages, a service ran on one machine and its log lived in one file. When the great cities of microservices rose, a single user request might leave footprints in a dozen log files across a dozen hosts. Debugging by SSHing into each box and grepping became impossible. The operators who survived learned a single truth: logs must be shipped, parsed, and centralized the moment they are born, or they are useless when the fire starts.
The ELK Stack - Elasticsearch, Logstash, and Kibana, joined by the lightweight Beats shippers - is one of the most widely deployed answers to that problem. Master it and you can query the logs of every service you ship from a single search bar.
🎯 Quest Objectives
By the time you complete this journey, you will have mastered:
Primary Objectives (Required for Quest Completion)
- The ELK Architecture - Explain what Elasticsearch, Logstash, Kibana, and Beats each do and how data flows between them
- Elasticsearch Indexing - Create indices, understand mappings and the inverted index, and query with the search API
- Logstash Pipelines - Write an input → filter → output pipeline that parses unstructured logs with grok
- Kibana Visualization - Build a data view (index pattern), a Discover search, and a dashboard
Secondary Objectives (Bonus Achievements)
- Structured Logging - Emit JSON logs at the source so parsing becomes trivial
- Beats Shippers - Use Filebeat to ship logs without a heavy Logstash on every host
- Index Lifecycle Management - Roll over and delete old indices so storage does not explode
Mastery Indicators
You’ll know you’ve truly mastered this quest when you can:
- Draw the data flow from a log line to a Kibana chart without notes
- Write a grok pattern that turns 127.0.0.1 - GET /api 200 53ms into structured fields
- Explain why high-cardinality fields and unbounded indices are dangerous
- Decide when Filebeat alone is enough and when you need Logstash in the middle
🗺️ Quest Prerequisites
📋 Knowledge Requirements
- Comfort on the command line and reading/writing YAML
- The three pillars of observability (complete Monitoring Fundamentals first)
- A mental model of what a log line is and where services write them
🛠️ System Requirements
- Modern operating system (Windows 10+, macOS 10.14+, or Linux)
- Docker and Docker Compose installed (allocate at least 4 GB, ideally 8 GB, to Docker)
- A terminal and a text editor or IDE (VS Code recommended)
🧠 Skill Level Indicators
This 🔴 Hard quest expects:
- You can run multi-container apps with Docker Compose
- You are ready to think about parsing, indexing, and searching at scale
- You can set aside 120-150 minutes for focused, hands-on building
🌍 Choose Your Adventure Platform
The stack runs in containers, so the lab is identical everywhere. The only platform difference is how you install Docker. Then everyone meets at the same docker compose up.
🍎 macOS Kingdom Path
```bash
# Install Docker Desktop (or colima) and confirm Compose is available
brew install --cask docker
docker --version
docker compose version
# Elasticsearch needs a higher mmap limit; Docker Desktop's VM handles it,
# but if you hit a vm.max_map_count error, raise it inside the VM.
```
🪟 Windows Empire Path
```powershell
# Install Docker Desktop with the WSL2 backend
winget install Docker.DockerDesktop
docker --version
docker compose version
```
🐧 Linux Territory Path
```bash
# Debian/Ubuntu. Note: docker-compose-plugin ships in Docker's own apt repository;
# on a stock Ubuntu install the Compose package may instead be named docker-compose-v2.
sudo apt update && sudo apt install -y docker.io docker-compose-plugin
sudo systemctl enable --now docker
# Elasticsearch requires a raised virtual-memory map count on Linux hosts:
sudo sysctl -w vm.max_map_count=262144
echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-elasticsearch.conf
```
☁️ Cloud Realms Path
```bash
# In a Codespace or any container host, the same compose file works.
# Forward ports 5601 (Kibana) and 9200 (Elasticsearch) to your browser.
docker compose up -d
```
> ⚠️ This single-node stack disables security for learning convenience. Never expose it to the public internet, and never use these settings in production.
🧙♂️ Chapter 1: The Architecture of the Hall of Records
Before you ship a single log, learn what each component does. The ELK Stack is four cooperating services, each with one job.
⚔️ Skills You’ll Forge in This Chapter
- The role of Elasticsearch, Logstash, Kibana, and Beats
- How a log line travels from source to dashboard
- Where parsing should happen and why
🏗️ The Four Components
| Component | Job | Analogy |
|---|---|---|
| Beats (e.g. Filebeat) | Lightweight shipper that tails files and forwards lines | The runner who carries scrolls from each tower |
| Logstash | Heavy pipeline: ingest, parse, enrich, and route | The scribe who reads, structures, and stamps each scroll |
| Elasticsearch | Distributed search engine that indexes and stores | The vast indexed library you can query instantly |
| Kibana | Web UI for search, visualization, and dashboards | The reading room where you ask the oracle questions |
The canonical data flow:
[ app log file ] --> Filebeat --> Logstash (grok/filter) --> Elasticsearch (index) --> Kibana (search/chart)
For simple JSON logs you can skip Logstash entirely: Filebeat --> Elasticsearch. You add Logstash when you need to parse unstructured text, enrich records (GeoIP, lookups), or fan out to multiple destinations.
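The routing decision described above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not something Filebeat or Logstash runs: `route`, `DIRECT`, and `VIA_LOGSTASH` are hypothetical names, and the check looks only at whether a line is already a JSON object. Enrichment and fan-out needs would still push you toward Logstash even for JSON logs.

```python
import json

DIRECT = "filebeat -> elasticsearch"
VIA_LOGSTASH = "filebeat -> logstash -> elasticsearch"


def route(line):
    """Pick a path for one log line (hypothetical helper, for illustration only)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        # Unstructured text: something has to parse it before it is indexed.
        return VIA_LOGSTASH
    # Only a JSON object carries named fields; a bare number or string does not.
    return DIRECT if isinstance(event, dict) else VIA_LOGSTASH


print(route('{"level": "error", "message": "payment gateway timeout"}'))
print(route("127.0.0.1 - - GET /api/orders 200 53"))
```

The first line is already structured, so it can go straight to Elasticsearch; the second is plain text and needs the parsing stage.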
🔍 Knowledge Check: Architecture
- Which component stores and searches the data?
- When can you drop Logstash from the pipeline?
- What does Filebeat do that Logstash can also do, but at a higher resource cost?
⚡ Quick Wins and Checkpoints
- Named the four roles: You can say what each component does in one line
- Drew the flow: You sketched source → ship → parse → index → visualize
🧙♂️ Chapter 2: Stand Up the Stack with Docker Compose
Now make it real. This single Compose file brings up Elasticsearch, Logstash, and Kibana, wired together.
⚔️ Skills You’ll Forge in This Chapter
- Running the full stack locally
- Verifying cluster health from the Elasticsearch API
🏗️ The Compose File
# docker-compose.yml: single-node ELK for learning (security disabled)
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.13.4
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - ES_JAVA_OPTS=-Xms1g -Xmx1g  # cap heap so it fits a laptop
    ports:
      - "9200:9200"
    healthcheck:
      test: ["CMD-SHELL", "curl -s http://localhost:9200/_cluster/health | grep -q '\"status\"'"]
      interval: 10s
      retries: 12

  logstash:
    image: docker.elastic.co/logstash/logstash:8.13.4
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf:ro
    ports:
      - "5044:5044"  # Beats input
    depends_on:
      elasticsearch:
        condition: service_healthy

  kibana:
    image: docker.elastic.co/kibana/kibana:8.13.4
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      elasticsearch:
        condition: service_healthy
Bring it up and confirm the cluster is alive:
docker compose up -d
# Wait for green/yellow status (yellow is normal for a single node)
curl -s 'http://localhost:9200/_cluster/health?pretty'
# Kibana takes a minute; then open it
open http://localhost:5601 # macOS; use start/xdg-open elsewhere
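If you would rather script that health check than read curl output by eye, here is a minimal sketch using only the Python standard library. The helper names `fetch_health` and `is_ready` are hypothetical, and the URL assumes the 9200 port mapping from the Compose file. A single node tends to report yellow once indices with replica shards exist, because a replica cannot be placed on the same node as its primary.

```python
import json
import urllib.request


def fetch_health(base_url="http://localhost:9200"):
    """GET /_cluster/health and decode the JSON body (needs the stack running)."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def is_ready(health):
    """Green and yellow both serve reads and writes; red means a primary shard is unassigned."""
    return health.get("status") in {"green", "yellow"}


# Offline demonstration with hand-written payloads (not real cluster output).
print(is_ready({"status": "yellow"}))  # True
print(is_ready({"status": "red"}))     # False
```

With the stack up, `is_ready(fetch_health())` gives the same answer from the live cluster.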
🔍 Knowledge Check: Standing It Up
- Why is a single-node cluster yellow and not green?
- What does ES_JAVA_OPTS=-Xms1g -Xmx1g protect against on a laptop?
- Why does Logstash use depends_on to wait for Elasticsearch to be healthy?
🧙♂️ Chapter 3: Indexing in Elasticsearch
Elasticsearch is a search engine, not just a database. Understanding indices, mappings, and the inverted index is what separates a log searcher from a log struggler.
⚔️ Skills You’ll Forge in This Chapter
- Indices, documents, shards, and mappings
- How the inverted index makes full-text search fast
- Indexing and querying documents via the REST API
🏗️ Core Concepts
- Document - a single JSON record (one log event).
- Index - a named collection of documents (e.g. logs-2026.06.14), like a table.
- Shard - a horizontal slice of an index; shards are how Elasticsearch scales and replicates.
- Mapping - the schema: which fields exist and their types (keyword for exact match, text for full-text, date, long, etc.).
- Inverted index - instead of scanning rows, Elasticsearch maps each term to the documents containing it, so searching billions of logs for “timeout” is near-instant.
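To make the inverted index concrete, here is a toy version in Python. It is a sketch of the idea only: real Elasticsearch indices (built on Lucene) add analysis, scoring, and compressed on-disk structures. The sample documents and the function names `build_inverted_index` and `search` are invented for illustration.

```python
from collections import defaultdict

# Three made-up log messages keyed by document ID.
docs = {
    1: "payment gateway timeout",
    2: "user login succeeded",
    3: "database connection timeout",
}


def build_inverted_index(documents):
    """Map each lowercased term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, *terms):
    """Return IDs of documents that contain every term (AND semantics)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(docs)
print(sorted(search(index, "timeout")))             # [1, 3]
print(sorted(search(index, "timeout", "payment")))  # [1]
```

A search becomes a dictionary lookup plus a set intersection, so its cost depends on how many documents contain the term rather than on how many documents exist.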
Create an index with an explicit mapping, then index and search a document:
# 1. Create an index with a mapping (keyword vs text matters for search/aggregation)
curl -s -X PUT 'http://localhost:9200/app-logs' -H 'Content-Type: application/json' -d '{
  "mappings": {
    "properties": {
      "@timestamp":  { "type": "date" },
      "level":       { "type": "keyword" },
      "service":     { "type": "keyword" },
      "message":     { "type": "text" },
      "duration_ms": { "type": "long" }
    }
  }
}'

# 2. Index a structured log document
curl -s -X POST 'http://localhost:9200/app-logs/_doc' -H 'Content-Type: application/json' -d '{
  "@timestamp": "2026-06-14T10:31:02Z",
  "level": "error",
  "service": "checkout",
  "message": "payment gateway timeout",
  "duration_ms": 5012
}'

# 3. Search it: find all error logs from the checkout service
curl -s 'http://localhost:9200/app-logs/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": { "bool": { "must": [
    { "match": { "level": "error" } },
    { "match": { "service": "checkout" } }
  ]}}
}'
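Field types matter: use keyword for fields you filter and aggregate on (level, service) and text for fields you full-text search (message). Mixing them up is a common beginner mistake: a keyword field only matches the whole stored value, and a text field cannot be aggregated by default.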
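The difference is easier to see side by side. The sketch below imitates the two behaviors in plain Python: `analyze` is a crude stand-in for a text analyzer (lowercase, split on whitespace), and `text_matches` assumes OR semantics across query terms. None of these helper names come from Elasticsearch, and the real analyzers do considerably more.

```python
def analyze(value):
    """Rough stand-in for a text analyzer: lowercase, then split on whitespace."""
    return value.lower().split()


def keyword_matches(stored, query):
    """keyword-style: the whole stored value must equal the query exactly."""
    return stored == query


def text_matches(stored, query):
    """text-style: match if any analyzed query term appears among the stored terms."""
    stored_terms = set(analyze(stored))
    return any(term in stored_terms for term in analyze(query))


message = "Payment gateway timeout"
print(keyword_matches(message, "timeout"))      # False: not the whole value
print(text_matches(message, "timeout"))         # True: one term matches
print(keyword_matches("checkout", "checkout"))  # True: exact value
```

This is why level and service work well as keyword (you want exact values to filter and count on) while message works better as text (you want to find a word inside a sentence).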
🔍 Knowledge Check: Indexing
- What is the difference between a keyword and a text field?
- Why is the inverted index faster than scanning every document?
- What does a mapping define, and why set it explicitly?
🧙♂️ Chapter 4: Parsing Logs with Logstash and grok
Most real logs are unstructured text. Logstash’s job is to turn 127.0.0.1 - - GET /api/orders 200 53 into searchable fields. The grok filter is the spell that does it.
⚔️ Skills You’ll Forge in This Chapter
- The input → filter → output pipeline shape
- Parsing unstructured lines with grok
- Why structured logging at the source beats parsing later
🏗️ A Logstash Pipeline
# logstash.conf: input (Beats) -> filter (parse) -> output (Elasticsearch)
input {
  beats { port => 5044 }
}

filter {
  # grok parses a common access-log line into named fields.
  grok {
    match => { "message" => "%{IP:client_ip} %{USER:ident} %{USER:auth} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:status:int} %{NUMBER:duration_ms:int}" }
  }

  # Promote a parsed timestamp to the event's @timestamp.
  # The grok pattern above does not capture a "timestamp" field, so for the sample
  # line this filter does nothing and @timestamp stays at ingest time. If your lines
  # carry a timestamp, capture it in grok (e.g. %{TIMESTAMP_ISO8601:timestamp}).
  date {
    match => [ "timestamp", "ISO8601" ]
  }

  # Drop the placeholder fields we do not need. Add "message" to this list
  # if you also want to discard the raw line once it is parsed (optional).
  mutate { remove_field => [ "ident", "auth" ] }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "access-logs-%{+YYYY.MM.dd}"  # daily indices, easy to roll over
  }
}
The three stages map directly onto the architecture: input receives from Beats, filter structures the data, output writes to Elasticsearch. The %{NUMBER:status:int} syntax both extracts and casts, so status becomes a real number you can aggregate.
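If grok syntax feels opaque, it can help to see the same extraction written as an ordinary regular expression. The Python sketch below mirrors the pipeline above under simplifying assumptions: the sub-patterns are narrower than the real grok definitions (IPv4 only, whole-number status and duration), and `parse_access_line` and `daily_index` are hypothetical helpers, not Logstash APIs. The second function imitates how the output's date-based index name is derived from the event timestamp in UTC.

```python
import re
from datetime import datetime, timezone

# Simplified stand-ins for the grok patterns used in logstash.conf.
ACCESS_LINE = re.compile(
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}) "  # ~ %{IP:client_ip}, IPv4 only
    r"(?P<ident>[\w.-]+) "                      # ~ %{USER:ident}
    r"(?P<auth>[\w.-]+) "                       # ~ %{USER:auth}
    r"(?P<method>\w+) "                         # ~ %{WORD:method}
    r"(?P<request>\S+) "                        # ~ %{URIPATHPARAM:request}
    r"(?P<status>\d+) "                         # ~ %{NUMBER:status:int}
    r"(?P<duration_ms>\d+)"                     # ~ %{NUMBER:duration_ms:int}
)


def parse_access_line(line):
    """Extract typed fields from one access-log line, or None if it does not match."""
    match = ACCESS_LINE.match(line)
    if match is None:
        return None  # Logstash would tag such an event with _grokparsefailure
    fields = match.groupdict()
    fields["status"] = int(fields["status"])            # the ":int" cast
    fields["duration_ms"] = int(fields["duration_ms"])
    del fields["ident"], fields["auth"]                 # the mutate/remove_field step
    return fields


def daily_index(timestamp, prefix="access-logs"):
    """Build a daily index name from a timezone-aware timestamp, evaluated in UTC."""
    return f"{prefix}-{timestamp.astimezone(timezone.utc):%Y.%m.%d}"


event = parse_access_line("127.0.0.1 - - GET /api/orders 200 53")
print(event)
print(daily_index(datetime(2026, 6, 14, 10, 31, 2, tzinfo=timezone.utc)))
```

The parsed event carries status and duration_ms as integers, which is what lets Elasticsearch run range filters and aggregations on them later.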
🏗️ The Better Path: Structured Logging at the Source
Parsing text is fragile - one log-format change breaks your grok. The superior approach is to emit JSON logs at the application, so no parsing is needed:
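# Emit JSON logs the moment they are born: no grok required downstream.
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    def format(self, record):
        # ISO 8601 in UTC; the default formatTime() output ("2026-06-14 10:31:02,123")
        # is not ISO 8601 and would not parse as a date with Elasticsearch's defaults.
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        return json.dumps({
            "@timestamp": timestamp.isoformat(timespec="milliseconds"),
            "level": record.levelname.lower(),
            "service": "checkout",
            "message": record.getMessage(),
        })


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.error("payment gateway timeout")  # -> one JSON object per line, ready to index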
🔍 Knowledge Check: Parsing
- What do the three Logstash stages (input/filter/output) each do?
- Why is structured JSON logging more robust than grok parsing?
- What does %{NUMBER:status:int} accomplish beyond extraction?
🧙♂️ Chapter 5: Visualizing in Kibana
Elasticsearch holds the answers; Kibana asks the questions. In a few clicks you turn indexed logs into searches, charts, and dashboards.
⚔️ Skills You’ll Forge in This Chapter
- Creating a data view (index pattern)
- Searching in Discover with KQL
- Building a visualization and a dashboard
🏗️ From Index to Dashboard
- Create a data view: in Kibana, go to Stack Management → Data Views and add access-logs-* with @timestamp as the time field.
- Explore in Discover: open Discover, pick the data view, and filter with KQL (Kibana Query Language):
# KQL examples you type in the Discover search bar (one query per line)
status >= 500 and method : "POST"
service : "checkout" and duration_ms > 1000
not status : 200
- Build a visualization: in Lens, drop @timestamp on the X axis and Count on the Y axis, then break it down by the status field (numeric here, because grok cast it to an integer) to see error spikes over time.
- Assemble a dashboard: combine a request-rate line chart, a top-errors table, and a p95-latency metric into one board your team watches.
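Before you trust a dashboard panel, it helps to know what number it is supposed to show. The sketch below computes an error rate and a p95 latency by hand over a made-up batch of events. It uses the nearest-rank percentile definition as an assumption for illustration; Elasticsearch's percentile aggregation is approximate, so Kibana may show a slightly different value on large datasets. `error_rate` and `percentile` are hypothetical helper names.

```python
import math


def error_rate(events):
    """Fraction of events whose status is 5xx (the KQL filter: status >= 500)."""
    if not events:
        return 0.0
    errors = sum(1 for event in events if event["status"] >= 500)
    return errors / len(events)


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up sample: 19 fast successes (10..190 ms) and one slow gateway failure.
events = [{"status": 200, "duration_ms": ms} for ms in range(10, 200, 10)]
events.append({"status": 503, "duration_ms": 5012})

durations = [event["duration_ms"] for event in events]
print(error_rate(events))         # 0.05
print(percentile(durations, 95))  # 190
```

Notice that the single 5012 ms outlier does not move p95 in this sample, which is why latency dashboards often pair p95 with p99 or max.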
🔍 Knowledge Check: Visualization
- What does a Kibana data view connect to?
- Write a KQL query for “5xx responses on the checkout service”
- Why must you choose a time field when creating the data view?
🎮 Mastery Challenges
🟢 Novice Challenge: Index and Search by Hand
Objective: Using only the Elasticsearch REST API, create an index, add three log documents, and run a search that returns exactly one of them.
Requirements:
- An index with an explicit mapping (at least one keyword and one text field)
- Three documents indexed via _doc
- A bool/match query that returns exactly one document
Validation: The search response’s hits.total.value equals 1.
🟡 Intermediate Challenge: Parse a Real Log Line
Objective: Write a Logstash grok filter that turns a raw access-log line into structured, typed fields.
Requirements:
- grok extracts at least method, path, status, and duration
- status and duration_ms are cast to integers
- The output lands in a daily-rolled index in Elasticsearch
Validation: In Kibana Discover, you can filter status >= 500 and see only matching events.
🔴 Advanced Challenge: Build the Observability Dashboard
Objective: Build a Kibana dashboard for one service with request rate, error rate, and latency.
Requirements:
- A time-series of request count broken down by status
- A table of the top error messages
- A metric showing p95 duration_ms
Validation: Generate some traffic, then watch the dashboard reflect the error spike within a minute.
🏆 Quest Rewards & Achievements
🎖️ Badges Earned:
- 🏆 Loremaster of the Logs - You centralized scattered logs into a searchable archive
- 📜 Keeper of the Index - You tamed Elasticsearch mappings and Logstash pipelines
🛠️ Skills Unlocked:
- Centralized Log Aggregation - Ship, parse, and index logs from anywhere
- Structured Logging & Pipeline Design - Make logs searchable by design, not by luck
🔓 Unlocked Quests:
- Distributed Tracing - Follow a single request across services
- Alerting Systems - Turn indexed signals into actionable pages
📊 Progression Points: +75 XP
🗺️ Next Steps in Your Journey
Continue the Main Story:
- 🎯 Distributed Tracing - Master the traces pillar with OpenTelemetry and Jaeger
Explore Side Adventures:
- ⚔️ Alerting Systems - Route and respond to what your logs reveal
Character Class Recommendations
💻 Software Developer: Continue to Distributed Tracing
🏗️ System Engineer: Explore Alerting Systems
🛡️ Security Specialist: Revisit Monitoring Fundamentals for SLO grounding
📚 Resources
Official Documentation
- Elasticsearch Guide - Indices, mappings, and the query DSL
- Logstash Reference - Pipelines, inputs, filters, outputs
- Kibana Guide - Data views, Discover, Lens, dashboards
- Filebeat Reference - Lightweight log shipping
Community Resources
- Grok Debugger - Test grok patterns interactively
- Elastic Common Schema (ECS) - A shared field naming standard
- Awesome Elasticsearch - Curated tools and reading
Learning Materials
- Elastic Stack Quickstart - Get the whole stack running
- Index Lifecycle Management - Roll over and retire indices
🤝 Quest Completion Checklist
- ✅ Completed all primary objectives
- ✅ Stood up a running ELK stack with Docker Compose
- ✅ Indexed and searched documents in Elasticsearch
- ✅ Parsed a log line with a Logstash grok filter
- ✅ Built a Kibana dashboard
- ✅ Identified your next quest in the journey
🕸️ Knowledge Graph
Structured wiki-links connect this quest to the IT-Journey knowledge graph. Open the Obsidian Graph View to explore connections.
Level hub: [[Level 1010 - Monitoring & Observability]]
Overworld: [[🏰 Overworld - Master Quest Map]]
Requires: [[Monitoring Fundamentals: Metrics, Logs, and Traces for Observability]]
Unlocks: [[Distributed Tracing: OpenTelemetry and Jaeger]] · [[Alerting Systems: Alertmanager, Routing, and On-Call]]
Obsidian docs: [[Obsidian Knowledge Graph and Wiki Links]]