Why Loki Over ELK?
The short version: ELK indexes everything, Loki indexes almost nothing. Loki only stores labels (like job name, hostname, environment) as indexed fields. The actual log lines sit compressed on disk and get scanned at query time. That approach uses a fraction of the memory because there's no inverted index sitting in RAM.
It plugs straight into Grafana, so if you already run Grafana for Prometheus dashboards, you don't need a separate UI like Kibana. One less thing to update, one less thing to break.
The tradeoff is real though — searches across unindexed content are slower, and if you need full-text search across millions of log lines, Loki will feel sluggish compared to Elasticsearch. For filtering by service, time range, and grepping for error patterns, it's more than enough.
Architecture
- Loki - The log storage and query engine
- Promtail - Agent that ships logs to Loki
- Grafana - Where you view and query logs
Docker Compose Setup
version: "3"
services:
  loki:
    image: grafana/loki:2.9.2   # pinned to a 2.x release so the config format used in this post applies
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
      - loki-data:/loki
    command: -config.file=/etc/loki/local-config.yaml
  promtail:
    image: grafana/promtail:2.9.2   # match the Loki version
    volumes:
      - ./promtail-config.yaml:/etc/promtail/config.yml
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro   # needed for the container log scrape job below
    command: -config.file=/etc/promtail/config.yml
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
volumes:
  loki-data:
  grafana-data:
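Before starting anything, Compose can sanity-check the file (the --quiet flag only validates without printing the rendered config; drop it if your Compose version doesn't support it):
docker compose config --quiet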
Loki Configuration
Create loki-config.yaml:
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
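Recent Loki releases accept a -verify-config flag, so you can check the file for typos without starting the full service. A one-off run against the same image (assuming the pinned 2.9.2 tag from the compose file):
docker run --rm -v "$PWD/loki-config.yaml:/etc/loki/local-config.yaml" \
  grafana/loki:2.9.2 -config.file=/etc/loki/local-config.yaml -verify-config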
Promtail Configuration
Promtail is honestly annoying to configure compared to Filebeat. Filebeat's YAML is straightforward — you point it at files and it ships them. Promtail's config has this label-centric model where __path__ is technically a label, not a path directive, and the glob behavior doesn't always match what you'd expect. I spent an hour wondering why no logs appeared until I realized my path had /var/log/nginx/*.log but the actual files were in /var/log/nginx/access.log and error.log separately. Filebeat would have just worked.
Create promtail-config.yaml:
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  # Promtail records how far it has read each file here; persist this path
  # if you don't want logs re-shipped after a container restart
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          # __path__ is a label that Promtail interprets as a file glob
          __path__: /var/log/*.log
  - job_name: containers
    static_configs:
      - targets:
          - localhost
        labels:
          job: docker
          __path__: /var/lib/docker/containers/*/*log
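The containers job above ships the raw JSON envelope that Docker's json-file logging driver writes (log, stream, time fields). Promtail has a built-in docker pipeline stage that unwraps it; a sketch of the same job with that stage enabled, as a drop-in replacement for the containers entry:
  - job_name: containers
    pipeline_stages:
      - docker: {}   # parse Docker's json-file format: extracts the log line, stream, and timestamp
    static_configs:
      - targets:
          - localhost
        labels:
          job: docker
          __path__: /var/lib/docker/containers/*/*log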
Start the Stack
docker compose up -d
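Once the containers are up, a couple of quick checks (assuming the ports from the compose file above):
# Loki answers on /ready once it has finished starting up
curl -s http://localhost:3100/ready
# List the labels Loki has ingested so far; if this stays empty after a minute
# or two, check Promtail's output with: docker compose logs promtail
curl -s http://localhost:3100/loki/api/v1/labels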
Configure Grafana
- Open Grafana at http://localhost:3000
- Log in (admin/admin by default)
- Go to Configuration → Data Sources → Add data source
- Select Loki
- Set the URL to http://loki:3100
- Save & test
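If you'd rather skip the clicking, Grafana can also pick the data source up from a provisioning file. A minimal sketch, assuming you mount it into the grafana container as /etc/grafana/provisioning/datasources/loki.yaml (the file name is arbitrary):
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100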
Querying Logs (LogQL)
Loki uses LogQL, similar to PromQL:
# All logs from a job
{job="varlogs"}
# Filter by content
{job="varlogs"} |= "error"
# Exclude patterns
{job="varlogs"} != "debug"
# Regex matching
{job="varlogs"} |~ "fail(ed|ure)"
# Parse and filter
{job="nginx"} | json | status >= 400
Labels Are Key
Loki only indexes labels, so the labels you pick determine how fast your queries run:
- job - What service/application
- host - Which server
- environment - prod/staging/dev
Don't use high-cardinality labels (like user IDs or request IDs). Every unique label combination becomes its own stream, and too many streams bloats the index and makes queries slower, not faster.
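If you do need to look things up by a high-cardinality value, keep it in the log line and filter at query time instead. A sketch, assuming a hypothetical app that logs JSON with a user_id field:
# user_id is parsed out of the log line at query time, not stored as a label
{job="myapp"} | json | user_id="12345"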
Retention
In loki-config.yaml:
compactor:
  working_directory: /loki/compactor
  shared_store: filesystem
  retention_enabled: true
  retention_delete_delay: 2h

limits_config:
  retention_period: 744h  # 31 days
Alerting
Loki's ruler can evaluate alerting rules written as LogQL queries, using the same rule-file format as Prometheus:
groups:
  - name: error-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate({job="myapp"} |= "error" [5m])) > 10
        for: 5m
        labels:
          severity: critical
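For Loki to actually evaluate that file, the ruler needs to be pointed at a rules directory and, if you want notifications, at an Alertmanager. A minimal sketch for loki-config.yaml, assuming the rule file is dropped into /loki/rules/fake/alerts.yaml (with auth disabled, Loki uses "fake" as the tenant directory) and that an Alertmanager is reachable at http://alertmanager:9093, which the compose file above doesn't include:
ruler:
  storage:
    type: local
    local:
      directory: /loki/rules        # rule files go under /loki/rules/<tenant>/
  rule_path: /tmp/loki-rules-scratch  # scratch space the ruler uses during evaluation
  alertmanager_url: http://alertmanager:9093  # assumption: an Alertmanager running elsewhere
  enable_api: true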
Production Tips
- I set retention to 30 days; anything longer and my 256GB disk filled up within a month
- If you're storing more than 5GB/day, move chunk storage to S3 or MinIO. Local filesystem won't keep up
- Point Prometheus at Loki's /metrics endpoint. I missed a disk-full event because I wasn't monitoring Loki itself (a scrape config sketch follows this list)
- Read/write path separation only matters past ~50GB/day. Below that, single-binary is fine
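A minimal Prometheus scrape job for that, assuming Prometheus runs on the same Docker network and can resolve the loki service name:
scrape_configs:
  - job_name: loki
    static_configs:
      - targets: ["loki:3100"]   # metrics are served at /metrics, the Prometheus default path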
ELK vs Loki:
- RAM: ELK needed 4GB minimum on my VPS (Elasticsearch 2GB heap + Kibana + Logstash). Loki + Grafana + Promtail together use around 530MB.
- Query speed: Elasticsearch is faster for full-text search across large datasets. Loki is fast for label-filtered queries but noticeably slower when scanning unindexed content across wide time ranges.
- Setup time: ELK took me most of a weekend to get right, between Elasticsearch tuning, Logstash pipelines, and Kibana dashboards. Loki was running in about 45 minutes with Docker Compose.
- Cost: ELK forced me onto a bigger VPS ($24/month). Loki fits on a $6/month VPS alongside other services.
Full-text search is genuinely worse. If you need to search for arbitrary substrings across all your logs without knowing which service produced them, Elasticsearch will return results in seconds where Loki might take 30+ seconds or time out entirely. Loki assumes you know roughly where to look (which job, which time range) before you query.
Complex aggregations are limited. ELK lets you do things like "show me the top 10 IP addresses by request count, broken down by hour, filtered by status code" in a single Kibana visualization. LogQL can handle basic rates and counts, but multi-dimensional aggregations either require awkward workarounds or just aren't possible. If your use case is log analytics rather than log monitoring, ELK is still the better tool.
Right now Loki is using 380MB on the same VPS that couldn't run Elasticsearch. Grafana adds another 150MB. Total: 530MB for logs + dashboards. ELK wanted 4GB minimum. Eight services push about 2GB of logs per day into it and the VPS still has headroom for other things. That was never true with Elasticsearch.