Table of Contents
- Overview
- Prerequisites
- Quick Architecture
- Install / Setup
- Base Configuration
- Reload/Enable & Health Checks
- Security / Hardening
- Performance & Optimization
- Backup & Restore
- Troubleshooting (Top issues)
- Key Takeaways & Next Steps
Overview
This guide builds a production-ready monitoring stack with Prometheus, Grafana, and Docker Compose. We’ll run Prometheus, Grafana, and Node Exporter as Compose services,
using persistent volumes, health checks, and declarative configs. You’ll get a ready-to-run docker-compose.yml, a basic
Prometheus scrape config, and a starter alert rule. Official docs: Prometheus · Grafana · Docker Compose.
Prerequisites
- A Linux host with Docker Engine + Docker Compose v2.
- Open ports: 9090/tcp (Prometheus), 9100/tcp (Node Exporter), 3000/tcp (Grafana). Restrict externally if needed.
- Server time in sync (chrony/systemd-timesyncd) to avoid skewed metrics.
Quick Architecture

Install Docker + Compose
Use your distro’s method. After install, ensure docker works and Compose v2 is available as docker compose.
Ubuntu/Debian
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker
docker --version && docker compose version
RHEL/Rocky/CentOS Stream/Fedora
curl -fsSL https://get.docker.com | sh
sudo systemctl enable --now docker
docker --version && docker compose version
Arch/Manjaro
sudo pacman -Syu --noconfirm docker docker-compose
sudo systemctl enable --now docker
docker --version && docker compose version
openSUSE/SLE
sudo zypper refresh
sudo zypper install -y docker docker-compose
sudo systemctl enable --now docker
docker --version && docker compose version
Install / Setup
We’ll prepare a working directory with Compose files and Prometheus configs. Node Exporter runs on the same host exposing Linux metrics on 9100.
Grafana connects to Prometheus as a data source. All services restart automatically and store data in named volumes.
# Working directory
mkdir -p ~/monitoring/{prometheus,grafana}
cd ~/monitoring
# Prometheus configuration
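cat > prometheus/prometheus.yml <<'YAML'
global:
  scrape_interval: 15s
# Load the alert rules file (mounted at this path by docker-compose.yml)
rule_files:
  - /etc/prometheus/alert.rules.yml
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']
YAML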
# Basic alert rule (optional)
cat > prometheus/alert.rules.yml <<'YAML'
groups:
  - name: node-alerts
    rules:
      - alert: HostDown
        expr: up{job="node"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Node exporter down"
          description: "No scrape data for 1 minute."
YAML
Base Configuration
Create the docker-compose.yml with three services: Prometheus, Node Exporter, and Grafana.
Prometheus loads the config and rules, exposes 9090, and depends on Node Exporter.
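Grafana exposes 3000 with persistent storage. Health checks report each container’s status in docker compose ps; Docker does not restart a container just because it turns unhealthy, so treat them as a signal to act on rather than as self-healing.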
# docker-compose.yml
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=15d"
      - "--web.console.libraries=/usr/share/prometheus/console_libraries"
      - "--web.console.templates=/usr/share/prometheus/consoles"
      - "--web.external-url=http://prometheus.local"
      - "--web.route-prefix=/"
      # Lifecycle API enables POST /-/reload
      - "--web.enable-lifecycle"
      # Admin API exposes snapshot and delete endpoints; drop it if unused
      - "--web.enable-admin-api"
      - "--web.enable-remote-write-receiver"
      - "--alertmanager.notification-queue-capacity=10000"
    volumes:
      - prometheus-data:/prometheus
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./prometheus/alert.rules.yml:/etc/prometheus/alert.rules.yml:ro
    ports:
      - "9090:9090"
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:9090/-/ready"]
      interval: 15s
      timeout: 5s
      retries: 5
    depends_on:
      - node-exporter
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    pid: "host"
    # Stays on the Compose project network so Prometheus can resolve "node-exporter"
    command: ["--path.rootfs=/host"]
    volumes:
      - /:/host:ro,rslave
    ports:
      - "9100:9100"
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:9100/metrics"]
      interval: 30s
      timeout: 5s
      retries: 5
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/login"]
      interval: 30s
      timeout: 5s
      retries: 5
volumes:
  prometheus-data:
  grafana-data:
Start the stack in detached mode. The first run will pull images and create volumes.
docker compose up -d
docker compose ps
Reload/Enable & Health Checks
Use this sequence when you change configs:
- Edit files (e.g., prometheus/prometheus.yml).
- Validate the Prometheus config with promtool inside the container.
- Apply: for Compose file changes use docker compose up -d (idempotent). For Prometheus scrape/rules changes, hot-reload via /-/reload.
- Restart only if reload fails or you changed container args/images.
- Check health: verify readiness endpoints and container health.
Validate & Apply changes
# Recreate containers if compose file changed; otherwise idempotent
docker compose up -d
# Validate Prometheus config (promtool is inside the image)
docker exec -it prometheus promtool check config /etc/prometheus/prometheus.yml
docker exec -it prometheus promtool check rules /etc/prometheus/alert.rules.yml
# Hot-reload Prometheus scrape/rule config (no restart)
curl -X POST http://localhost:9090/-/reload
Health checks & Logs
# Container health and status
docker compose ps
docker inspect --format='{{ .State.Health.Status }}' prometheus
# Readiness endpoints (expect HTTP 200)
curl -I http://localhost:9090/-/ready
curl -I http://localhost:9090/-/healthy
# Tail logs
docker logs --tail=100 -f prometheus
docker logs --tail=100 -f grafana
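When the health check step is scripted (for example right after docker compose up -d), a small retry loop can wait for readiness instead of re-running curl by hand. The sketch below uses a hypothetical helper named wait_for, which is not part of Docker or Prometheus; it simply retries whatever command it is given.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command> [args...]
wait_for() {
  _wf_attempts=$1
  _wf_delay=$2
  shift 2
  _wf_i=1
  while [ "$_wf_i" -le "$_wf_attempts" ]; do
    # Succeed as soon as the probe command exits 0
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$_wf_delay"
    _wf_i=$((_wf_i + 1))
  done
  return 1
}
```

For instance, wait_for 20 3 curl -fsS http://localhost:9090/-/ready would wait up to about a minute for Prometheus; the attempt count and delay are arbitrary and should be tuned to your host.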
Security / Hardening
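Limit remote access to the monitoring stack. Use the firewall that matches your OS and open only the ports you really need. Keep in mind that Docker publishes ports through its own iptables rules, which are typically evaluated ahead of UFW or firewalld rules, so a published port can be reachable even when the host firewall does not allow it. To keep a service private, bind it to loopback in docker-compose.yml (for example "127.0.0.1:9090:9090") or remove its ports entry.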
- Ubuntu/Debian → UFW
- RHEL/Rocky/CentOS Stream/Fedora/openSUSE/SLE → firewalld
Ubuntu/Debian (UFW)
sudo ufw allow OpenSSH
# Open only if remote access required
sudo ufw allow 3000/tcp # Grafana
sudo ufw allow 9090/tcp # Prometheus
sudo ufw allow 9100/tcp # Node Exporter
sudo ufw reload
sudo ufw status
RHEL/Rocky/CentOS/Fedora/openSUSE/SLE (firewalld)
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --permanent --add-port=9100/tcp
sudo firewall-cmd --reload
sudo firewall-cmd --list-ports
TLS & Auth: put the stack behind a reverse proxy (Caddy/Traefik/Nginx) with HTTPS and auth. Change Grafana admin password on first login and prefer SSO for production.
Performance & Optimization
Improve retention and dashboard performance:
- TSDB Retention: adjust --storage.tsdb.retention.time (e.g., 30d/90d) based on disk.
- Remote Write (optional): if long-term storage is needed, enable remote write to a back end like Cortex/Thanos.
- Grafana provisioning: use provisioning files to pre-load data sources and dashboards for repeatable deploys.
- Resource limits: cap container CPU/memory in Compose for noisy neighbors.
# Limit Grafana resources (example)
services:
  grafana:
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 1g
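To size the disk for the retention setting mentioned above, the Prometheus storage documentation gives a rough rule: retention time in seconds, times ingested samples per second, times bytes per sample (roughly 1 to 2 bytes). The sketch below wraps that arithmetic in a hypothetical function, estimate_tsdb_bytes; the series count used in the example call is an illustrative assumption, not a measurement from this stack.

```shell
# Rough TSDB disk estimate:
#   retention_seconds * samples_per_second * bytes_per_sample
# where samples_per_second is approximated as active_series / scrape_interval.
estimate_tsdb_bytes() {
  retention_days=$1    # e.g. 15, matching --storage.tsdb.retention.time=15d
  active_series=$2     # assumed number of active time series
  scrape_interval=$3   # seconds, e.g. 15
  bytes_per_sample=$4  # Prometheus docs cite roughly 1-2 bytes per sample
  echo $(( retention_days * 86400 * active_series * bytes_per_sample / scrape_interval ))
}

# Example: 15 days, 1000 series, 15s interval, 2 bytes per sample
estimate_tsdb_bytes 15 1000 15 2   # prints 172800000 (about 173 MB)
```

On a running server, rate(prometheus_tsdb_head_samples_appended_total[5m]) gives the real ingestion rate to plug in instead of the series-count approximation.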
Backup & Restore
Back up both volumes and configuration files. Volumes contain Prometheus TSDB and Grafana data; configs are your source of truth.
Backup
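cd ~/monitoring
# Stop briefly for a consistent snapshot (containers and volumes are kept)
docker compose stop
sudo tar -C ~ -czf /root/monitoring-backup-$(date +%F).tgz monitoring
# Named volumes live under Docker's data root, not in ~/monitoring. Compose prefixes them with the
# project name (assumed: "monitoring", the directory name); alpine is only a small helper image with tar.
for v in prometheus-data grafana-data; do
  docker run --rm -v "monitoring_$v:/data:ro" -v /root:/backup alpine \
    tar -C /data -czf "/backup/monitoring-$v-$(date +%F).tgz" .
done
docker compose start
sudo sh -c 'sha256sum /root/monitoring-*.tgz'
Restore
sudo tar -C ~ -xzf /root/monitoring-backup-YYYY-MM-DD.tgz
cd ~/monitoring
# Create containers and volumes without starting them, then refill the volumes
docker compose up --no-start
for v in prometheus-data grafana-data; do
  docker run --rm -v "monitoring_$v:/data" -v /root:/backup:ro alpine \
    tar -C /data -xzf "/backup/monitoring-$v-YYYY-MM-DD.tgz"
done
docker compose up -d
docker compose ps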
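Before unpacking an archive during a restore, it can help to confirm that it still matches the checksum recorded at backup time. The sketch below uses two hypothetical helpers, write_checksum and verify_checksum, and demonstrates them on a throwaway archive in a temporary directory; the file names are placeholders, not the real backup paths.

```shell
# Hypothetical helpers: store a .sha256 file next to an archive, and check it later.
write_checksum() {
  ( cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}
verify_checksum() {
  ( cd "$(dirname "$1")" && sha256sum -c "$(basename "$1").sha256" >/dev/null 2>&1 )
}

# Demo on a throwaway archive (placeholder paths under a temp dir)
demo_dir=$(mktemp -d)
echo "scrape_interval: 15s" > "$demo_dir/prometheus.yml"
tar -C "$demo_dir" -czf "$demo_dir/demo-backup.tgz" prometheus.yml
write_checksum "$demo_dir/demo-backup.tgz"
verify_checksum "$demo_dir/demo-backup.tgz" && echo "checksum OK"
```

Keeping the .sha256 file alongside the archive means the check still works after the backup has been copied to another machine.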
Troubleshooting (Top issues)
Prometheus shows “target down” — Node Exporter not reachable or port blocked.
curl -sS http://localhost:9100/metrics | head
docker logs node-exporter | tail -n 50
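Grafana cannot connect to Prometheus — Verify Prometheus URL inside Grafana (default http://prometheus:9090 when using Compose network), then test that URL from inside the Grafana container.
docker exec -it grafana wget -qO- http://prometheus:9090/-/ready
# If the problem is a lost Grafana login rather than connectivity, reset the admin password
docker exec -it grafana grafana-cli admin reset-admin-password 'StrongP@ss!'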
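One way to avoid a mistyped data source URL is to provision it from a file rather than entering it in the UI. The sketch below writes a minimal Grafana data source provisioning file; it assumes you run it from ~/monitoring and that you add a mount such as ./grafana/provisioning:/etc/grafana/provisioning:ro to the grafana service, which the Compose file in this guide does not include by default.

```shell
# Sketch: provision the Prometheus data source from a file.
# Assumption: ./grafana/provisioning is mounted at /etc/grafana/provisioning
# in the grafana service (not part of the Compose file shown in this guide).
mkdir -p grafana/provisioning/datasources
cat > grafana/provisioning/datasources/prometheus.yml <<'YAML'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    # Service name resolves on the Compose project network
    url: http://prometheus:9090
    isDefault: true
YAML
```

After adding the mount, docker compose up -d recreates the Grafana container and the data source appears without manual setup.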
High disk usage — Reduce retention or enable remote write.
docker exec -it prometheus du -sh /prometheus
Key Takeaways & Next Steps
- Prometheus, Grafana, and Node Exporter on Docker Compose give you a fast, repeatable monitoring stack.
- Secure with firewall + reverse proxy + strong auth.
- Next: add Alertmanager, Grafana provisioning, and long‑term storage.
