Resource Limits and Performance Tuning for Docker Compose in Production

Setting CPU and memory limits, configuring logging drivers, and monitoring container performance in a Docker Compose production environment.

Jean-Pierre Broeders

Freelance DevOps Engineer

March 22, 2026 · 6 min read

Deploying a Docker Compose stack without resource limits is like driving a car without a speedometer. It works — until it doesn't. And by then, it's usually too late.

Why resource limits aren't optional

Without explicit limits, a single container can consume all available memory. The Linux OOM killer steps in, but it doesn't distinguish between your database and your logging sidecar. The result: random containers getting killed, often exactly the wrong ones.

The deploy.resources section in Compose fixes this:

services:
  api:
    image: myapp/api:latest
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 128M

There's a subtle but important difference between limits and reservations. Limits are hard caps: the container cannot exceed them. Reservations are soft guarantees: Docker ensures those resources are available at startup. On a server with 4GB RAM running five services, the sum of all reservations should stay under 4GB, while limits can go higher to accommodate peak loads.
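As a concrete sketch: on a hypothetical 4GB host, the reservations below sum to exactly 1GB while the limits total 2.25GB. Overcommitting limits like this is fine as long as the services don't all peak at once. Service names and numbers are illustrative:

```yaml
services:
  api:
    deploy:
      resources:
        limits: { cpus: '1.0', memory: 1G }
        reservations: { cpus: '0.25', memory: 512M }
  worker:
    deploy:
      resources:
        limits: { memory: 768M }
        reservations: { memory: 256M }
  redis:
    deploy:
      resources:
        limits: { memory: 512M }
        reservations: { memory: 256M }
```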

CPU limits: shares vs quota

CPU limiting has more nuance than it appears. Docker uses two mechanisms: CPU shares and CPU quota.

services:
  worker:
    image: myapp/worker:latest
    deploy:
      resources:
        limits:
          cpus: '0.5'
    cpu_shares: 512

The cpus: '0.5' setting is a hard quota: the container never gets more than 50% of one core, regardless of how busy the server is. CPU shares work differently: they're a relative weight. A container with 512 shares gets half the CPU of a container with 1024 shares, but only under actual contention. On an idle server, shares make no difference.

For most production setups, a combination works best: hard CPU limits on CPU-intensive workers, shares for services that occasionally spike but don't consistently need much CPU.
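Under full contention, each container's slice of CPU is simply its shares divided by the sum of all shares. A quick back-of-the-envelope calculation (the share values are illustrative):

```shell
#!/bin/sh
# Relative CPU entitlement under full contention:
#   fraction = container_shares / sum_of_all_shares
worker_shares=512
api_shares=1024
total=$((worker_shares + api_shares))

# 512/1536 ≈ 33% for the worker, 1024/1536 ≈ 67% for the api
awk -v w="$worker_shares" -v a="$api_shares" -v t="$total" \
  'BEGIN { printf "worker: %.0f%%  api: %.0f%%\n", 100*w/t, 100*a/t }'
```

On an idle server both containers can still burst to 100% of a core; the ratio only kicks in when they compete.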

Memory: the silent killer

Memory issues are trickier than CPU problems. A Node.js API with a slow leak, a Java service with an oversized heap, a Redis instance without maxmemory: the damage creeps in gradually rather than announcing itself.

A few practical settings that help:

services:
  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
    deploy:
      resources:
        limits:
          memory: 300M

  api:
    image: myapp/api:latest
    environment:
      - NODE_OPTIONS=--max-old-space-size=384
    deploy:
      resources:
        limits:
          memory: 512M

The trick is setting the application limit slightly lower than the container limit. Redis gets 256MB as its own limit, but the container allows 300MB. That margin covers the overhead of the Redis process itself. Without it, the kernel OOM-kills the container before Redis gets the chance to evict keys gracefully.
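The same headroom principle applies to the Java services mentioned earlier: cap the heap below the container limit so metaspace, thread stacks, and off-heap buffers have room. A sketch, with a hypothetical service name and illustrative sizes:

```yaml
services:
  billing:
    image: myapp/billing:latest
    environment:
      # Heap capped at 384MB inside a 512MB container; the ~128MB margin
      # covers metaspace, thread stacks, and off-heap allocations
      - JAVA_TOOL_OPTIONS=-Xmx384m
    deploy:
      resources:
        limits:
          memory: 512M
```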

Logging: the forgotten resource hog

By default, Docker sends all stdout/stderr to JSON files on disk. Without configuration, those grow indefinitely. On a busy API with debug logging enabled, producing gigabytes per day is entirely realistic.

services:
  api:
    image: myapp/api:latest
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

Three rotating files of at most 10MB each, so every container's logs stay bounded around 30MB without disappearing entirely. For serious monitoring, an external logging driver works better:

services:
  api:
    logging:
      driver: syslog
      options:
        syslog-address: "tcp://logserver:514"
        tag: "api"

Or the appropriate driver for a Loki or Elasticsearch stack. The point: think about logging before the disk fills up.
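For a Loki stack specifically, the configuration looks roughly like this. Note that this assumes the Grafana Loki Docker driver plugin is installed on the host, and the URL is illustrative:

```yaml
services:
  api:
    logging:
      driver: loki
      options:
        loki-url: "http://logserver:3100/loki/api/v1/push"
```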

Health checks that actually matter

A health check that only verifies whether the container is running is useless. The container can run perfectly fine while the application inside is completely deadlocked.

services:
  api:
    image: myapp/api:latest
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

That /health endpoint should do more than return 200 OK. Check the database connection. Check if Redis is reachable. Check if the event queue isn't backing up. A health endpoint that genuinely reflects service health makes the difference between a problem that auto-recovers and one that triggers a 3 AM phone call.
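A compound check can also be expressed directly in Compose with CMD-SHELL. This is only a sketch: it assumes both wget and redis-cli are available inside the api image, which often isn't the case in slim images.

```yaml
services:
  api:
    healthcheck:
      # Unhealthy if either the HTTP endpoint or Redis is down
      test: ["CMD-SHELL", "wget --spider -q http://localhost:3000/health && redis-cli -h redis ping | grep -q PONG"]
      interval: 30s
      timeout: 10s
      retries: 3
```

Checking dependencies inside the application's own /health handler is usually cleaner, since the app already holds the connections.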

Performance monitoring without extra tools

Docker itself provides more insight than most teams realize:

# Real-time resource usage per container
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"

# Detailed resource usage inspection
docker inspect --format='{{.HostConfig.Memory}}' container_name

For something more structured, a simple script that logs stats every minute:

#!/bin/bash
while true; do
  docker stats --no-stream --format \
    "{{.Name}},{{.CPUPerc}},{{.MemPerc}},{{.MemUsage}}" \
    >> /var/log/docker-stats.csv
  sleep 60
done

Not flashy, but effective. A CSV loadable in Grafana or even a spreadsheet gives a clear picture of actual resource usage after a few days. That data is essential for tuning limits properly — limits too tight cause OOM kills, limits too loose waste resources.
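Once the CSV has accumulated, a little awk goes a long way. For example, the peak memory percentage per container, assuming the four-column format written by the script above (the inline sample data stands in for /var/log/docker-stats.csv):

```shell
#!/bin/sh
# Peak MemPerc per container from the stats CSV.
# Columns: name,cpu%,mem%,mem usage
peak_mem() {
  awk -F, '{
    gsub(/%/, "", $3)                      # strip the % sign from MemPerc
    if ($3 + 0 > peak[$1]) peak[$1] = $3 + 0
  }
  END { for (c in peak) printf "%s peak mem: %s%%\n", c, peak[c] }' "$@"
}

# Example run with inline sample data:
peak_mem <<'EOF'
api,12.3%,41.2%,211MiB / 512MiB
api,15.1%,48.7%,249MiB / 512MiB
redis,2.0%,35.0%,105MiB / 300MiB
EOF
```

Point the function at the real CSV instead of the here-document and it reports the worst case per container, which is exactly the number to compare against each memory limit.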

The right restart policy

Closely related to resource management is the restart policy. A container that gets OOM-killed and immediately restarts with restart: always can enter a crash loop that takes down the entire server.

services:
  api:
    restart: unless-stopped
    deploy:
      resources:
        limits:
          memory: 512M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s

The delay and max_attempts prevent a failing container from restarting endlessly. After three failed attempts within two minutes, Docker stops restarting it. That gives room to investigate the problem instead of masking it.

Wrapping up

Setting resource limits costs maybe an hour during initial deployment. That hour pays for itself the first time a memory leak doesn't take down your entire server. Start with conservative limits, monitor actual usage, and adjust based on data. Not the other way around.
