starred/healthchecks

mirror of https://github.com/healthchecks/healthchecks.git synced 2026-04-25 23:15:49 +03:00

[GH-ISSUE #1181] DB Container fails: received fast shutdown request #809

Closed

opened 2026-02-25 23:43:39 +03:00 by kerem · 4 comments

kerem commented

2026-02-25 23:43:39 +03:00

Owner

Originally created by @ppittle on GitHub (Jun 18, 2025).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/1181

Healthchecks will run for days and then the database container will exit with this log message:

2025-06-17 19:22:20.625 UTC [44] FATAL:  terminating connection due to administrator command

And once the DB goes down, healthchecks web app also goes down.

AFAIK there wasn't any corresponding user or automated action that would cause db to terminate.

Any steps to prevent this from happening?

Full log context

2025-06-17 18:37:51.909 UTC [27] LOG:  checkpoint starting: time

2025-06-17 18:37:54.245 UTC [27] LOG:  checkpoint complete: wrote 24 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.320 s, sync=0.005 s, total=2.337 s; sync files=24, longest=0.004 s, average=0.001 s; distance=52 kB, estimate=52 kB; lsn=0/1F526E8, redo lsn=0/1F526B0

2025-06-17 19:22:20.609 UTC [1] LOG:  received fast shutdown request

2025-06-17 19:22:20.621 UTC [1] LOG:  aborting any active transactions

2025-06-17 19:22:20.625 UTC [44] FATAL:  terminating connection due to administrator command

2025-06-17 19:22:20.625 UTC [43] FATAL:  terminating connection due to administrator command

2025-06-17 19:22:20.637 UTC [1] LOG:  background worker "logical replication launcher" (PID 32) exited with exit code 1

2025-06-17 19:22:20.653 UTC [27] LOG:  shutting down

2025-06-17 19:22:20.666 UTC [27] LOG:  checkpoint starting: shutdown immediate

2025-06-17 19:22:20.818 UTC [27] LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.046 s, sync=0.001 s, total=0.165 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=47 kB; lsn=0/1F52798, redo lsn=0/1F52798

2025-06-17 19:22:20.825 UTC [1] LOG:  database system is shut down


kerem closed this issue

2026-02-25 23:43:39 +03:00

kerem commented

2026-02-25 23:43:39 +03:00

Author

Owner

@cuu508 commented on GitHub (Jun 18, 2025):

Healthchecks does not send any administrator commands to Postgres. I think this is an issue with either the database container, or the container engine running it.

A quick search turned up a similar report here: https://github.com/containers/podman/discussions/20294. Are you by any chance using podman?
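As a rough illustration of why the log points outside Healthchecks: according to the PostgreSQL documentation, the server picks its shutdown mode from the signal it receives (SIGTERM for smart, SIGINT for fast, SIGQUIT for immediate). The sketch below assumes log lines in the exact format shown in the report, with timestamps in UTC, and extracts each shutdown request together with the matching signal. The names `SHUTDOWN_SIGNALS`, `SHUTDOWN_RE`, and `find_shutdown_requests` are made up for this example.

```python
import re
from datetime import datetime, timezone

# Shutdown mode -> signal, as described in the PostgreSQL documentation.
SHUTDOWN_SIGNALS = {"smart": "SIGTERM", "fast": "SIGINT", "immediate": "SIGQUIT"}

# Assumed line format (taken from the report), e.g.:
# 2025-06-17 19:22:20.609 UTC [1] LOG:  received fast shutdown request
SHUTDOWN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC \[(?P<pid>\d+)\] "
    r"LOG:\s+received (?P<mode>smart|fast|immediate) shutdown request"
)


def find_shutdown_requests(log_text):
    """Return a list of (timestamp, mode, signal) tuples, one per shutdown request."""
    events = []
    for line in log_text.splitlines():
        match = SHUTDOWN_RE.match(line.strip())
        if match is None:
            continue
        timestamp = datetime.strptime(
            match["ts"], "%Y-%m-%d %H:%M:%S.%f"
        ).replace(tzinfo=timezone.utc)
        mode = match["mode"]
        events.append((timestamp, mode, SHUTDOWN_SIGNALS[mode]))
    return events
```

For the log in this report, the single match is a fast shutdown, which corresponds to SIGINT being delivered to the Postgres server process. That is consistent with something stopping the container from the outside rather than with a query sent by the application.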

kerem commented

2026-02-25 23:43:39 +03:00

Author

Owner

@ppittle commented on GitHub (Jun 19, 2025):

@cuu508 - thanks for looking into this with me!

Unfortunately no podman. I have portainer running, but not rootless (afaik) - and I am just using it for monitoring.

I spun everything up with a docker compose file:

(I've removed some environment variables for brevity/privacy)

  healthchecks-db:
    image: postgres:16
    container_name: healthchecks-db
    networks:
      - monitoring
    volumes:
      - /portainer/Files/AppData/Config/healthchecks/db-data:/var/lib/postgresql/data
    environment:
      -

  # https://github.com/healthchecks/healthchecks/blob/master/README.md
  healthchecks:
    image: healthchecks/healthchecks:latest
    container_name: healthchecks
    networks:
      - monitoring
    ports:
      - "345:8000"
    depends_on:
      - healthchecks-db
    volumes:
      - /portainer/Files/AppData/Config/healthchecks/hc-data:/data
    environment:
      -
      - DB_HOST=healthchecks-db
      - DB_NAME=healthchecks
      - DB_PORT=5432
      - DB_SSLMODE=prefer
      - DB_TARGET_SESSION_ATTRS=read-write
      - DB_USER=postgres
      - DEBUG=False
      - VICTOROPS_ENABLED=True
      - WEBHOOKS_ENABLED=True
    command: bash -c 'while !</dev/tcp/healthchecks-db/5432; do sleep 1; done; uwsgi /opt/healthchecks/docker/uwsgi.ini'
    labels:
      - "traefik.enable=true"
      -
    restart: unless-stopped

Any chance there could be an issue with this health check:

command: bash -c 'while !</dev/tcp/healthchecks-db/5432; do sleep 1; done; uwsgi /opt/healthchecks/docker/uwsgi.ini'


kerem commented

2026-02-25 23:43:39 +03:00

Author

Owner

@cuu508 commented on GitHub (Jun 20, 2025):

The `while !</dev/tcp/healthchecks-db/5432; do sleep 1; done;` loop waits for postgres to start listening on port 5432, and only then starts uwsgi. It only runs on container startup, not continuously.

You mentioned the system runs for a few days, so I don't think this plays a role.
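To make the one-shot nature of that loop concrete, here is a minimal Python sketch of the same idea: keep trying to open a TCP connection, sleep between attempts, and return once a connection succeeds. It is an illustration only, not what the container runs. The function name `wait_for_port` and the `max_attempts` parameter are assumptions added for this example (the bash loop has no attempt limit).

```python
import socket
import time


def wait_for_port(host, port, interval=1.0, max_attempts=None):
    """Block until a TCP connection to host:port succeeds.

    Returns True once a connection is made, or False if max_attempts
    is set and every attempt failed. With max_attempts=None it retries
    forever, like the bash loop.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            # Opening and immediately closing the connection is enough
            # to show that something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Once the function returns, nothing polls the port again, so a wait of this kind cannot be what shuts the database down days later.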

You could check the healthchecks container's logs around the time the database shuts down to see if there's anything out of the ordinary there: does the healthchecks container restart as well, and are there any HTTP requests with matching timestamps?

Perhaps something on your system auto-updates containers (watchtower, for example, or perhaps portainer itself can do this)?

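One way to do the timestamp comparison suggested above is to export the other container's log with `docker logs --timestamps` and keep only the lines close to the moment of the shutdown request. The sketch below assumes each exported line starts with an RFC 3339 UTC timestamp ending in `Z`, followed by a space and the message. Lines in any other format are skipped. The helper names `parse_line` and `lines_near` and the 60-second default window are arbitrary choices for this example.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed input: one log line per row, prefixed with a UTC timestamp such as
# 2025-06-17T19:22:20.609123456Z, then a space, then the original message.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z (.*)$")


def parse_line(line):
    """Split a line into (timestamp, message), or return None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    base, fraction, message = match.groups()
    # Keep at most microsecond precision; pad short fractions with zeros.
    microsecond = int((fraction or "")[:6].ljust(6, "0"))
    timestamp = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(
        microsecond=microsecond, tzinfo=timezone.utc
    )
    return timestamp, message


def lines_near(log_text, moment, window_seconds=60):
    """Return parsed lines whose timestamp is within window_seconds of moment."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is not None and abs(parsed[0] - moment) <= window:
            hits.append(parsed)
    return hits
```

Here `moment` would be the time of the `received fast shutdown request` line, 2025-06-17 19:22:20.609 UTC in this report, passed as a timezone-aware `datetime`.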

kerem commented

2026-02-25 23:43:39 +03:00

Author

Owner

@cuu508 commented on GitHub (Jul 2, 2025):

No reply, closing.
