Docker 101: Commands, Troubleshooting & Debugging — Intermediate to Advanced

Who this is for: You know what a container is. You've run docker run hello-world. Now you want to actually understand what's happening under the hood and handle production-level problems without Googling everything.


1. Architecture Refresher (What Actually Runs Your Containers)

Before commands make sense, internalize this mental model:

Docker CLI  ──►  dockerd (daemon)  ──►  containerd  ──►  runc  ──►  Linux kernel
     (API calls via UNIX socket)           (OCI runtime)      (namespaces + cgroups)
  • dockerd — The long-running background process. All CLI commands are REST calls to it.
  • containerd — Manages container lifecycle (pull, start, stop, snapshot).
  • runc — Spawns the actual process using Linux namespaces + cgroups.
  • /var/run/docker.sock — the UNIX socket the CLI talks to the daemon over. If you lose this, you lose Docker.
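You can verify the "the CLI is just a REST client" claim directly. A quick sketch, assuming a local daemon listening on the default socket path:

```shell
# Talk to dockerd over the UNIX socket without the docker CLI at all;
# these mirror `docker version` and `docker ps`
curl --unix-socket /var/run/docker.sock http://localhost/version
curl --unix-socket /var/run/docker.sock http://localhost/containers/json
```

Everything the CLI does, including build and exec, goes through endpoints like these.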

2. Image Management — Beyond docker pull

Building Images

# Standard build
docker build -t myapp:1.0 .

# Build with a specific Dockerfile
docker build -f Dockerfile.prod -t myapp:prod .

# Pass build arguments
docker build --build-arg ENV=production --build-arg PORT=8080 -t myapp:prod .

# No cache — force full rebuild
docker build --no-cache -t myapp:fresh .

# Target a specific stage in a multi-stage build
docker build --target builder -t myapp:build-stage .

# Build for a different platform (cross-compiling, or QEMU emulation)
docker buildx build --platform linux/arm64 -t myapp:arm64 --load .

# Build and push in one command
docker buildx build --platform linux/amd64,linux/arm64 \
  -t registry.example.com/myapp:latest --push .

Inspecting Images

# List images with full SHA
docker images --no-trunc

# Show image layers and their sizes
docker history myapp:1.0
docker history --no-trunc --format "{{.CreatedBy}}\t{{.Size}}" myapp:1.0

# Full image metadata as JSON
docker inspect myapp:1.0

# Extract specific field with Go template
docker inspect --format '{{.Config.Env}}' myapp:1.0
docker inspect --format '{{.RootFS.Layers}}' myapp:1.0   # list all layers

# Layer digests as JSON (pipe to jq for readability)
docker image inspect --format='{{json .RootFS}}' myapp:1.0 | jq

# List dangling images (untagged, leftover from builds)
docker images -f dangling=true

Pruning & Cleanup

# Remove a specific image
docker rmi myapp:old

# Force remove even if containers use it
docker rmi -f myapp:old

# Remove all dangling images
docker image prune

# Remove ALL unused images (not just dangling)
docker image prune -a

# Nuclear option — remove everything unused
docker system prune -a --volumes

# There is no prune dry run — preview what's reclaimable first with:
docker system df

3. Container Lifecycle — Full Control

Running Containers

# Detached mode with name, port mapping, and restart policy
docker run -d \
  --name webapp \
  -p 8080:3000 \
  --restart unless-stopped \
  myapp:1.0

# With environment variables
docker run -d \
  --name webapp \
  -e DATABASE_URL=postgres://user:pass@db:5432/mydb \
  -e NODE_ENV=production \
  myapp:1.0

# Mount storage: a bind mount, a named volume, and an in-memory tmpfs
# (comments can't follow a trailing backslash, so they live up here)
docker run -d \
  --name webapp \
  -v /host/data:/app/data \
  -v app_data:/app/cache \
  --tmpfs /app/tmp \
  myapp:1.0

# Resource constraints (critical for production);
# setting --memory-swap equal to --memory disables swap
docker run -d \
  --name webapp \
  --memory 512m \
  --memory-swap 512m \
  --cpus 1.5 \
  --pids-limit 100 \
  myapp:1.0

# Read-only filesystem with writable tmpfs
docker run -d \
  --name webapp \
  --read-only \
  --tmpfs /tmp \
  --tmpfs /var/run \
  myapp:1.0

# Run as non-root user
docker run -d --user 1001:1001 myapp:1.0

# Override entrypoint
docker run --entrypoint /bin/sh myapp:1.0 -c "echo hello"

Managing Running Containers

# List all containers (including stopped)
docker ps -a

# Custom format — much more readable
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}\t{{.Image}}"

# Filter by status
docker ps -f status=exited
docker ps -f name=webapp
docker ps -f label=env=production

# Stop, start, restart
docker stop webapp                            # SIGTERM → 10s → SIGKILL
docker stop -t 30 webapp                     # wait 30s before SIGKILL
docker kill webapp                            # immediate SIGKILL
docker restart webapp
docker restart --time 5 webapp               # 5s grace period

# Pause/unpause (freeze, useful for snapshots)
docker pause webapp
docker unpause webapp

# Rename a running container
docker rename webapp webapp-v1

4. Exec, Logs & Attach — Seeing Inside

# Interactive shell inside a running container
docker exec -it webapp /bin/bash
docker exec -it webapp /bin/sh               # Alpine/BusyBox

# Run a one-off command
docker exec webapp env
docker exec webapp cat /etc/os-release
docker exec webapp ps aux

# Run as root inside a non-root container
docker exec -u root -it webapp /bin/bash

# Set environment variable for the exec session
docker exec -e DEBUG=1 webapp node -e "console.log(process.env.DEBUG)"

# Live logs (like tail -f)
docker logs -f webapp

# Last 100 lines
docker logs --tail 100 webapp

# Logs with timestamps
docker logs -t webapp

# Logs since a specific time
docker logs --since 2024-01-01T00:00:00 webapp
docker logs --since 10m webapp               # last 10 minutes

# Attach to a running container's stdin/stdout (use with caution)
docker attach webapp
# Detach without stopping: Ctrl+P, Ctrl+Q

5. Networking Deep Dive

Network Basics

# List all networks
docker network ls

# Create a custom bridge network
docker network create --driver bridge \
  --subnet 172.20.0.0/16 \
  --gateway 172.20.0.1 \
  app_network

# Create an overlay network (for Swarm/multi-host)
docker network create --driver overlay --attachable prod_net

# Connect a running container to a network
docker network connect app_network webapp

# Disconnect
docker network disconnect app_network webapp

# Inspect network — see connected containers, IPs
docker network inspect app_network

# Run container on specific network
docker run -d --network app_network --name db postgres:15

# Give a container a static IP on a custom network
docker run -d \
  --network app_network \
  --ip 172.20.0.10 \
  --name db postgres:15

Network Troubleshooting Commands

# Check if two containers can reach each other
docker exec webapp ping db
docker exec webapp curl http://db:5432

# DNS resolution inside container
docker exec webapp nslookup db
docker exec webapp cat /etc/resolv.conf
docker exec webapp cat /etc/hosts

# Check open ports from inside the container
docker exec webapp ss -tulpn
docker exec webapp netstat -tulpn

# Trace route between containers
docker exec webapp traceroute db

# Inspect the container's network config
docker inspect webapp --format '{{json .NetworkSettings.Networks}}' | jq

# Check iptables rules Docker created
sudo iptables -L DOCKER -n
sudo iptables -t nat -L DOCKER -n

6. Volumes & Storage

# Create a named volume
docker volume create app_data

# List volumes
docker volume ls

# Inspect volume (find the actual mount path on host)
docker volume inspect app_data
# Output includes: "Mountpoint": "/var/lib/docker/volumes/app_data/_data"

# Remove a volume
docker volume rm app_data

# Remove all unused volumes
docker volume prune

# Copy files from container to host (no exec needed)
docker cp webapp:/app/config.json ./config.json

# Copy files from host to container
docker cp ./new_config.json webapp:/app/config.json

# Backup a volume
docker run --rm \
  -v app_data:/source:ro \
  -v $(pwd):/backup \
  alpine tar czf /backup/app_data_backup.tar.gz -C /source .

# Restore a volume
docker run --rm \
  -v app_data:/target \
  -v $(pwd):/backup \
  alpine tar xzf /backup/app_data_backup.tar.gz -C /target

7. Docker Compose — Production Patterns

Essential Commands

# Start all services (detached)
docker compose up -d

# Rebuild images before starting
docker compose up -d --build

# Scale a specific service
docker compose up -d --scale worker=5

# Pull latest images without starting
docker compose pull

# Stop and remove containers (keep volumes)
docker compose down

# Stop and remove containers AND volumes
docker compose down -v

# Stop and remove containers, volumes, AND images
docker compose down -v --rmi all

# View logs for all services
docker compose logs -f

# Logs for a specific service
docker compose logs -f webapp

# Exec into a compose service
docker compose exec webapp /bin/bash

# Run a one-off command in a service
docker compose run --rm webapp npm run migrate

# Validate your compose file
docker compose config

# See which services are running
docker compose ps

# Restart a single service
docker compose restart webapp

Multi-file Compose (Override Pattern)

# Base + override (common pattern)
docker compose -f docker-compose.yml -f docker-compose.override.yml up -d

# Use a production override
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

# Use a specific env file
docker compose --env-file .env.production up -d

8. Resource Monitoring & Stats

# Live CPU, memory, net I/O for all containers
docker stats

# Stats for specific containers, no-stream (one-shot)
docker stats --no-stream webapp db

# Custom stats format
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"

# Check container resource limits
docker inspect webapp \
  --format 'Memory: {{.HostConfig.Memory}} | CPUs: {{.HostConfig.NanoCpus}}'

# Disk usage breakdown
docker system df
docker system df -v                          # verbose — per container/image/volume

# Top processes inside a container
docker top webapp
docker top webapp -o pid,ppid,comm,%cpu,%mem

9. Troubleshooting Scenarios

Scenario 1: Container Exits Immediately (CrashLoopBackOff)

Symptom: docker ps -a shows Exited (1) seconds after starting.

# Step 1: Check the exit code and logs
docker ps -a --filter name=webapp
docker logs webapp

# Step 2: Override entrypoint to get a shell and investigate
docker run --entrypoint /bin/sh -it myapp:1.0

# Step 3: Check if the binary exists and is executable
docker run --entrypoint /bin/sh myapp:1.0 -c "which node && node --version"

# Step 4: Check if required env vars are missing
docker run --entrypoint /bin/sh myapp:1.0 -c "env | sort"

# Step 5: Check if ports are already in use on the host
sudo ss -tulpn | grep 3000
sudo lsof -i :3000

Common exit codes:

Code   Meaning
0      Success
1      General application error
125    Docker daemon error
126    Command found but not executable
127    Command not found
137    SIGKILL (128 + 9), usually the OOM killer
139    Segfault (128 + SIGSEGV = 11)
143    SIGTERM received (128 + 15)
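The 128 + signal-number convention is easy to verify without a container at all; a process that dies from SIGTERM (15) exits with 143:

```shell
# A shell that kills itself with SIGTERM exits 128 + 15 = 143,
# the same code docker ps -a reports for a SIGTERM-stopped container
sh -c 'kill -TERM $$' || code=$?
echo "$code"        # 143
```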

Scenario 2: Container OOM Killed

Symptom: Exit code 137; docker stats showed high memory usage before the crash.

# Confirm OOM kill
docker inspect webapp --format '{{.State.OOMKilled}}'
# Output: true

# Check the kernel's OOM log
sudo dmesg | grep -i "oom\|killed" | tail -20

# Check what memory limit was set
docker inspect webapp \
  --format '{{.HostConfig.Memory}}'
# 0 = unlimited, otherwise bytes

# Run with more memory
docker run -d --memory 1g --name webapp myapp:1.0

# Monitor memory in real time before it dies
docker stats webapp --no-stream --format \
  "{{.MemUsage}} / {{.MemPerc}}"

# Quick memory snapshot if it's a Node.js app (note: this spawns a
# fresh node process; for the real app, take a heap snapshot in-process)
docker exec webapp node -e "console.log(process.memoryUsage())"

Scenario 3: Can't Connect to Container Port

Symptom: curl http://localhost:8080 times out or refuses connection.

# Step 1: Verify the container is running and port is mapped
docker ps --filter name=webapp

# Step 2: Check if the app is actually listening inside the container
docker exec webapp ss -tulpn | grep 3000
# If nothing shows, the app crashed or wrong port in code

# Step 3: Check the port mapping is correct
docker inspect webapp \
  --format '{{json .NetworkSettings.Ports}}' | jq

# Step 4: Verify Docker's iptables rules
sudo iptables -t nat -L DOCKER -n | grep 8080

# Step 5: Check if another process on the host has the port
sudo ss -tulpn | grep :8080

# Step 6: Test from inside the container's network namespace
# (curlimages/curl already has curl as its entrypoint, so pass only args)
docker run --rm --network container:webapp \
  curlimages/curl -v http://localhost:3000

# Step 6b: Test over the Docker network (container-name DNS only works
# on user-defined networks, not the default bridge)
docker run --rm --network app_network alpine wget -qO- http://webapp:3000

Scenario 4: Container Can't Reach the Internet

Symptom: docker exec webapp curl https://google.com fails.

# Step 1: Check DNS
docker exec webapp cat /etc/resolv.conf
docker exec webapp nslookup google.com

# Step 2: Ping the gateway
docker exec webapp ping 172.17.0.1          # default bridge gateway

# Step 3: Check Docker's DNS config
docker inspect webapp \
  --format '{{json .HostConfig.Dns}}' | jq

# Step 4: Check if IP forwarding is enabled on the host
cat /proc/sys/net/ipv4/ip_forward
# Should be 1. If 0:
sudo sysctl -w net.ipv4.ip_forward=1

# Step 5: Check iptables FORWARD chain
sudo iptables -L FORWARD -n | grep DROP

# Step 6: Check if Docker's bridge network is blocked
sudo iptables -L DOCKER-USER -n

# Fix — add a rule to allow forward traffic
sudo iptables -I DOCKER-USER -j ACCEPT

# Step 7: Set custom DNS servers
docker run --dns 8.8.8.8 --dns 1.1.1.1 myapp:1.0
# Or globally in /etc/docker/daemon.json:
# { "dns": ["8.8.8.8", "1.1.1.1"] }

Scenario 5: Slow Container Build / Huge Image Size

Symptom: docker build takes 10 minutes. Image is 2GB.

# Step 1: Inspect layer sizes (sort on the SIZE column)
docker history --format '{{.Size}}\t{{.CreatedBy}}' myapp:1.0 | sort -rh | head -10

# Step 2: Use dive to visually explore layers
docker run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  wagoodman/dive myapp:1.0

# Step 3: Check what's being ignored (or not)
cat .dockerignore
# You should be ignoring: node_modules, .git, *.log, dist, __pycache__

# Step 4: Use BuildKit for parallel builds and better caching
DOCKER_BUILDKIT=1 docker build -t myapp:1.0 .
# Or set in daemon.json: { "features": { "buildkit": true } }

# Step 5: Export and analyze the build cache
docker buildx du

# Step 6: Multi-stage build to discard build dependencies
# In Dockerfile:
# FROM node:20-alpine AS builder
# RUN npm ci && npm run build
# FROM node:20-alpine AS runner     <-- only production deps
# COPY --from=builder /app/dist .

Scenario 6: Volume Permission Denied

Symptom: Container writes fail with Permission denied on mounted volume.

# Step 1: Check what user the container runs as
docker exec webapp id
docker inspect webapp --format '{{.Config.User}}'

# Step 2: Check ownership of the bind mount on the host
ls -la /host/data

# Step 3: Fix — match UID on the host
sudo chown -R 1001:1001 /host/data

# Step 4: Or run the container as root for the volume
docker run -u root -v /host/data:/app/data myapp:1.0

# Step 5: For named volumes — use an init container
docker run --rm \
  -v app_data:/data \
  --user root \
  alpine chown -R 1001:1001 /data

# Step 6: Set permissions at build time via entrypoint
# In your Dockerfile:
# COPY entrypoint.sh /entrypoint.sh
# ENTRYPOINT ["/entrypoint.sh"]
# entrypoint.sh:
# #!/bin/sh
# chown -R app:app /app/data
# exec su-exec app "$@"

Scenario 7: Docker Compose Services Can't Talk to Each Other

Symptom: webapp can't reach db even though both services are in the same compose file.

# Step 1: Verify both services are on the same network
docker compose ps
docker network ls | grep <project>

# Step 2: Check the default network
docker network inspect <project>_default | jq '.[0].Containers'

# Step 3: DNS inside compose uses service name
docker compose exec webapp ping db
docker compose exec webapp curl http://db:5432

# Step 4: Force explicit network in compose file:
# services:
#   webapp:
#     networks:
#       - app_net
#   db:
#     networks:
#       - app_net
# networks:
#   app_net:
#     driver: bridge

# Step 5: Check if service names are correct
docker compose config | grep -A5 'services:'

# Step 6: Check if the DB is actually listening
docker compose exec db pg_isready -U myuser
docker compose logs db | tail -30
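Often everything is wired correctly and db is simply not ready yet when webapp starts. A minimal retry sketch for an entrypoint wrapper, assuming the service name db and port 5432 from this scenario and that nc exists in the image (a depends_on with condition: service_healthy in the compose file is the cleaner long-term fix):

```shell
# Retry TCP connects to db:5432 before starting the app
for i in $(seq 1 30); do
  if nc -z db 5432 2>/dev/null; then
    echo "db is reachable"
    break
  fi
  echo "waiting for db (attempt $i)"
  sleep 2
done
```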

Scenario 8: docker pull Fails with Registry Auth Errors

Symptom: unauthorized: authentication required or connection refused.

# Step 1: Log in explicitly
docker login registry.example.com
echo "$TOKEN" | docker login registry.example.com \
  -u user --password-stdin        # non-interactive; keeps the token out of shell history

# Step 2: Check stored credentials
cat ~/.docker/config.json | jq '.auths'

# Step 3: Verify the registry is reachable
curl -v https://registry.example.com/v2/

# Step 4: For self-hosted HTTP registries (like Harbor without TLS)
# Edit /etc/docker/daemon.json:
# { "insecure-registries": ["registry.myriadcara.com"] }
sudo systemctl reload docker

# Step 5: For containerd (Kubernetes nodes)
# Edit /etc/containerd/config.toml:
# [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.myriadcara.com"]
#   endpoint = ["http://registry.myriadcara.com"]
sudo systemctl restart containerd

# Step 6: Test pull with debug output
docker --debug pull registry.example.com/myapp:latest

10. Advanced Debugging Techniques

nsenter — Entering Container Namespaces from the Host

When docker exec isn't available (distroless images, minimal containers):

# Get the container's PID on the host
PID=$(docker inspect webapp --format '{{.State.Pid}}')
echo "Container PID: $PID"

# Enter the container's network namespace
sudo nsenter -t $PID -n -- ss -tulpn
sudo nsenter -t $PID -n -- ip addr
sudo nsenter -t $PID -n -- tcpdump -i eth0

# Enter all namespaces (like a full shell)
sudo nsenter -t $PID -m -u -i -n -p -- /bin/bash

# Run a debugging tool inside the container's namespace
# even if the tool isn't in the container image
sudo nsenter -t $PID -n -- curl http://localhost:3000
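The PID-lookup-plus-nsenter pair is worth wrapping. A sketch of a helper (the function name is mine, not a Docker command) that runs any host-installed tool inside a container's network namespace:

```shell
# Run a host tool inside a container's network namespace
netns_exec() {
  local container="$1"; shift
  local pid
  pid=$(docker inspect "$container" --format '{{.State.Pid}}') || return 1
  sudo nsenter -t "$pid" -n -- "$@"
}

# Usage (works even if the image ships no shell or tools):
#   netns_exec webapp ss -tulpn
#   netns_exec webapp curl -v http://localhost:3000
```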

strace — Syscall Tracing Inside Containers

# Attach to PID 1 of a running container (requires the SYS_PTRACE cap,
# granted at run time, and strace present in the image)
docker run -d --name webapp --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined myapp:1.0
docker exec -it webapp strace -p 1

# Trace a new process
docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  myapp:1.0 strace node server.js

# Count syscall frequency
docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  myapp:1.0 strace -c node server.js

# From the host using nsenter
PID=$(docker inspect webapp --format '{{.State.Pid}}')
sudo strace -p $PID -f -e trace=network

tcpdump — Network Packet Capture

# Capture traffic on Docker's bridge interface
sudo tcpdump -i docker0 -w /tmp/capture.pcap

# Filter by container IP
CONTAINER_IP=$(docker inspect webapp \
  --format '{{.NetworkSettings.IPAddress}}')
sudo tcpdump -i docker0 host $CONTAINER_IP -w /tmp/capture.pcap

# Capture inside the container (needs tcpdump in image)
docker exec webapp tcpdump -i eth0 -w /tmp/cap.pcap
docker cp webapp:/tmp/cap.pcap ./cap.pcap    # copy out for Wireshark

# Sidecar approach — inject tcpdump container sharing network namespace
docker run --rm \
  --network container:webapp \
  nicolaka/netshoot tcpdump -i eth0 -n port 3000

Using netshoot — The Network Swiss Army Knife

# Attach to a running container's network namespace
docker run --rm -it \
  --network container:webapp \
  nicolaka/netshoot

# Available tools: curl, dig, nslookup, nmap, iperf, tcpdump,
# traceroute, ss, ip, iptables, bpftrace, tshark, and more

# Quick connectivity check
docker run --rm \
  --network container:webapp \
  nicolaka/netshoot curl -s http://db:5432

# Full port scan of another container
docker run --rm \
  --network container:webapp \
  nicolaka/netshoot nmap -p 1-65535 db

Inspecting the Docker Daemon

# Check daemon status and logs
sudo systemctl status docker
sudo journalctl -u docker -f --since "10 minutes ago"

# Daemon info — versions, driver, storage info
docker info

# Check current daemon config
sudo cat /etc/docker/daemon.json

# Enable debug mode on the daemon: set { "debug": true } in daemon.json,
# then live-reload the config
sudo systemctl reload docker

# SIGUSR1 dumps the daemon's goroutine stack traces into its logs
# (useful when dockerd hangs; it does NOT toggle debug mode)
sudo kill -SIGUSR1 $(pidof dockerd)

# Docker events stream — audit all container lifecycle events
docker events
docker events --since 1h
docker events --filter type=container --filter event=die

Build Debugging

# BuildKit: print internal solve steps
BUILDKIT_PROGRESS=plain docker build -t myapp:1.0 .

# Run a build up to a specific layer and shell in
docker build --target debug-stage -t debug:1 . && docker run -it debug:1 /bin/sh

# Check build cache hit/miss
docker buildx build --no-cache \
  --cache-from type=registry,ref=registry.example.com/myapp:cache \
  -t myapp:1.0 .

# Export build cache to local directory
docker buildx build \
  --cache-to type=local,dest=/tmp/build-cache \
  --cache-from type=local,src=/tmp/build-cache \
  -t myapp:1.0 .

11. Security Scanning & Hardening

# Scan image for CVEs with Trivy
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  aquasec/trivy image myapp:1.0

# Scan only critical/high vulnerabilities
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  aquasec/trivy image --severity CRITICAL,HIGH myapp:1.0

# Scan Dockerfile for misconfigurations
docker run --rm \
  -v $(pwd):/workspace \
  aquasec/trivy config /workspace

# Benchmark Docker daemon config against CIS
docker run --rm --net host --pid host --userns host --cap-add audit_control \
  -v /etc:/etc:ro \
  -v /usr/bin/containerd:/usr/bin/containerd:ro \
  -v /var/lib:/var/lib:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  docker/docker-bench-security

# Scan the images behind all running containers
docker ps --format '{{.Image}}' | sort -u | \
  xargs -I{} docker run --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    aquasec/trivy image {}

# Drop all capabilities except what's needed
docker run -d \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --security-opt no-new-privileges \
  --read-only \
  myapp:1.0

12. One-liner Toolkit

# Kill all running containers
docker kill $(docker ps -q)

# Remove all stopped containers
docker rm $(docker ps -aq -f status=exited)

# Remove all images with a specific tag
docker rmi $(docker images | grep 'myapp' | awk '{print $3}')

# Get IPs of all running containers
docker ps -q | xargs docker inspect \
  --format '{{.Name}} — {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'

# Enter a container by partial name match
docker exec -it $(docker ps -qf name=web) /bin/sh

# Watch docker events live with formatting
docker events --format '{{.Time}} {{.Type}} {{.Action}} {{.Actor.Attributes.name}}'

# Check which containers are consuming the most memory
docker stats --no-stream \
  --format "{{.Name}}: {{.MemUsage}}" | \
  sort -t':' -k2 -rh | head -5

# Pull and run a temporary postgres for local dev
docker run --rm -d \
  --name tmpdb \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  postgres:15-alpine

# Run an ad-hoc redis CLI against a running redis container
docker exec -it $(docker ps -qf name=redis) redis-cli

# Generate a docker-compose.yml from a running container
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  ghcr.io/red5d/docker-autocompose webapp

# Watch container logs with grep filter
docker logs -f webapp 2>&1 | grep --line-buffered "ERROR\|WARN"

# Export a container's filesystem as tar
docker export webapp -o webapp_fs.tar

# Diff what changed in a container's filesystem vs the image
docker diff webapp
# A = Added, C = Changed, D = Deleted

13. daemon.json — Configuration Reference

The full Docker daemon config file at /etc/docker/daemon.json:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  },
  "storage-driver": "overlay2",
  "insecure-registries": ["registry.myriadcara.com"],
  "registry-mirrors": ["https://mirror.gcr.io"],
  "default-address-pools": [
    { "base": "172.20.0.0/16", "size": 24 }
  ],
  "features": {
    "buildkit": true
  },
  "debug": false,
  "live-restore": true,
  "max-concurrent-downloads": 10,
  "max-concurrent-uploads": 5,
  "dns": ["8.8.8.8", "1.1.1.1"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "metrics-addr": "0.0.0.0:9323"
}

Apply changes:

sudo systemctl restart docker      # daemon-reload is only needed if the unit file changed

# For reloadable options (debug, registry-mirrors, insecure-registries,
# max-concurrent-*), live-reload without killing containers:
sudo kill -SIGHUP $(pidof dockerd)
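A malformed daemon.json prevents dockerd from starting at all, so validate it before any restart. A sketch assuming jq is installed and the default Linux config path:

```shell
# Refuse to restart if daemon.json isn't valid JSON
CONFIG="${1:-/etc/docker/daemon.json}"
if jq empty "$CONFIG" 2>/dev/null; then
  echo "valid JSON, safe to restart"
else
  echo "fix $CONFIG before restarting" >&2
fi
```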

14. Checklist: Before Every Production Deploy

[ ] Image tagged with a specific version (not :latest)
[ ] .dockerignore excludes dev files, .git, node_modules
[ ] Multi-stage build — final image has no build tools
[ ] Non-root USER defined in Dockerfile
[ ] Health check defined: HEALTHCHECK CMD ...
[ ] Resource limits set: --memory, --cpus
[ ] Read-only filesystem where possible
[ ] Secrets not baked into image (use secrets/env at runtime)
[ ] Trivy scan: no CRITICAL CVEs
[ ] Restart policy set: --restart unless-stopped
[ ] Log driver configured with rotation
[ ] Only the necessary ports published
[ ] Container joins only the networks it needs
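Several of these checks live in the Dockerfile itself. A hedged sketch for a Node app (paths, the node user, and the /health endpoint are illustrative assumptions):

```dockerfile
# Multi-stage: build tools stay out of the final image
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine AS runner
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev              # production deps only
COPY --from=builder /app/dist ./dist
USER node                          # non-root user shipped with the base image
HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["node", "dist/server.js"]
```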


Written for engineers who run things in production and need answers fast. If something's wrong or you have a scenario I missed — ping me on IRC at chat.mrsubodh.xyz.