How YOU doin'? The Quirks of Checking MySQL Container Health

Recently, I've been working on a Rails app that relies on a MySQL database. Most of the apps we work on use PostgreSQL, but of course Rails works with MySQL just fine as well. For smaller applications, there's rarely much difference in configuration overhead.

As part of my effort to move this app off of a legacy platform-as-a-service (one which I won't name but has to do with trains) and onto less expensive VMs, I've been setting up Docker containers for the app and its related services. Would you believe that in production, at the barest minimum, this app requires 8 separate containers? It's true! One for the app, one for the database, one for a background worker, one for a cache store for the worker, and four more for an open source video conferencing service.

Fortunately, the hard work of getting those 8 containers orchestrated, using docker compose, is behind me, but I did learn a thing or two about container health checks along the way.

You can't just start everything up at the same time

This may make me sound naive, but it wasn't immediately obvious to me. I was used to working with managed relational database services, which of course are always up and running for you, even when you provision a fresh app instance. When you're running your own database, it takes some time to set itself up (the first time it runs) and isn't immediately available to other containers, like the app container.

Docker compose is the solution here, specifically the depends_on attribute. With depends_on you can tell docker compose to only start a particular service after some other service meets some success criteria:

services:
 web:
   build: .
   depends_on:
     db:
       condition: service_healthy
       restart: true
     redis:
       condition: service_started

Your options are:

  • service_started (the dependency has started, not necessarily healthy)
  • service_healthy (the dependency has passed its health check; see below)
  • service_completed_successfully (for services that you want to run and then exit with code 0, indicating that they completed their task without issues)
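For instance, service_completed_successfully is a good fit for one-off setup tasks like database migrations. Here's a sketch, assuming a hypothetical migrations service (the service name and the rails command are illustrative, not from the app above):

```yaml
services:
  web:
    build: .
    depends_on:
      migrations:
        condition: service_completed_successfully
  migrations:
    build: .
    command: ["bin/rails", "db:prepare"]   # hypothetical one-off task
    depends_on:
      db:
        condition: service_healthy
```

With this shape, the web container won't start until the migration job has run to completion and exited 0.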

Health checks can be defined either within the container or in docker compose

One of the ways that you can define how Docker checks a container's health is in the Dockerfile itself, like this:

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 CMD curl -f http://localhost/ || exit 1

Adding this line to the Dockerfile tells docker compose how to "ask" the container whether it is up, healthy, and ready to do some work. In the basic example above, a simple cURL command is run, sending a request to http://localhost. If this fails, the health check command exits with status 1, and docker compose knows that the container is not yet healthy.
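As a rough mental model for how those flags interact, here's a sketch of the retry loop (illustrative only, not Docker's actual implementation; the real scheduler also enforces --timeout and --start-period, and the check here is a stand-in for curl):

```shell
#!/bin/bash
# Sketch: a check command is retried up to --retries times; the container
# is marked unhealthy only after every attempt fails.
retries=3
health_cmd=${HEALTH_CMD:-true}   # stand-in; the real check would be: curl -f http://localhost/

status="unhealthy"
for ((attempt = 1; attempt <= retries; attempt++)); do
  if $health_cmd; then
    status="healthy"
    break
  fi
  # in Docker, the next attempt would wait --interval seconds
done

echo "$status"
```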

The other way you can define health checks is within the docker-compose.yml file, like this:

services:
 web:
   healthcheck:
     test: ["CMD", "curl", "-f", "http://localhost/"]
     interval: 30s
     timeout: 10s
     retries: 3
     start_period: 5s

The healthcheck section in the YAML above achieves the same thing: it queries a specific URI, and if the cURL command exits with anything other than 0, docker compose knows that the container isn't ready.
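One detail worth knowing about the test key: the CMD form runs the command directly, with no shell, so shell operators like || won't work there. If you want the explicit `|| exit 1` from the Dockerfile example, use the CMD-SHELL form, which hands the string to the container's shell:

```yaml
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost/ || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 5s
```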

MySQL uses a temporary server during the initialization phase

In my case, the container that was causing the most issues (when initialized on a fresh server) was MySQL. We're running one of the official MySQL images, which itself does not define a health check in its Dockerfile.

If you do some digging online, the most common advice is to define a health check for MySQL containers like this:

test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "--silent"]
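In a compose file, that advice typically ends up looking something like this (the image tag and timings here are illustrative):

```yaml
db:
  image: mysql:8
  healthcheck:
    test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "--silent"]
    interval: 10s
    retries: 5
```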

But... BUT!!!

Did you know that, unlike PostgreSQL, MySQL typically starts a temporary server on initialization? Often referred to as the "bootstrap" server, it performs initialization tasks like creating system tables and loading initial configurations. The temporary server only listens on a local Unix socket and is not available to external clients. After initialization, the temporary server shuts down and the real MySQL server starts, listening on the specified ports, ready to accept external connections.

The MySQL container may appear "up" from Docker's perspective, because even the temporary server will respond correctly to the ping request: with -h localhost, the MySQL client connects over the local Unix socket rather than TCP, and the temporary server answers on that socket even though the real database server is not yet ready. In my case, this issue surfaced because my Rails app was attempting to make requests to the MySQL database, but the database was still initializing and was using the temporary server.

I did some more digging, and I came across a very nice solution to this, from the official docker-library repo.

First, we need to make a new (and very simple) Dockerfile (call it Dockerfile.db) for the database container, which just adds a custom HEALTHCHECK parameter, like this:

FROM mysql

COPY docker-healthcheck /usr/local/bin/

HEALTHCHECK CMD ["docker-healthcheck"]

This file just pulls the latest mysql image, copies a health check script into it, and then defines that script as the health check that docker compose should use.
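One gotcha worth noting: COPY preserves the file's permission bits from the host, so the script needs to be executable before you build the image (i.e., run `chmod +x docker-healthcheck` once in the repo). A self-contained demonstration of the idea, using a temporary file:

```shell
#!/bin/bash
# Demo: a file is only runnable once it carries the executable bit,
# and COPY carries that bit into the image as-is.
tmp=$(mktemp)
chmod +x "$tmp"
if [ -x "$tmp" ]; then
  result="executable"
else
  result="not executable"
fi
rm -f "$tmp"
echo "$result"
```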

Next, we write the health check script itself:

#!/bin/bash
set -eo pipefail

if [ "$MYSQL_RANDOM_ROOT_PASSWORD" ] && [ -z "$MYSQL_USER" ] && [ -z "$MYSQL_PASSWORD" ]; then
    # there's no way we can guess what the random MySQL password was
    echo >&2 'healthcheck error: cannot determine random root password (and MYSQL_USER and MYSQL_PASSWORD were not set)'
    exit 0
fi

host="$(hostname --ip-address || echo '127.0.0.1')"
user="${MYSQL_USER:-root}"
export MYSQL_PWD="${MYSQL_PASSWORD:-$MYSQL_ROOT_PASSWORD}"

args=(
    # force mysql to not use the local "mysqld.sock" (test "external" connectivity)
    -h"$host"
    -u"$user"
    --silent
)

if command -v mysqladmin &> /dev/null; then
    if mysqladmin "${args[@]}" ping > /dev/null; then
        exit 0
    fi
else
    if select="$(echo 'SELECT 1' | mysql "${args[@]}")" && [ "$select" = '1' ]; then
        exit 0
    fi
fi

exit 1

This script verifies that the real server is running, not the temporary one: by connecting to the container's own IP address rather than localhost, it forces a TCP connection, which the socket-only temporary server cannot answer.
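A couple of the script's idioms are worth unpacking. The ${var:-default} parameter expansions pick sensible fallbacks when MYSQL_USER and MYSQL_PASSWORD aren't set for the container. A minimal standalone illustration (the variable values here are made up):

```shell
#!/bin/bash
# Demo of the fallback idiom from the health check script, in isolation.
MYSQL_USER=""                    # not set for this hypothetical container
MYSQL_PASSWORD=""
MYSQL_ROOT_PASSWORD="s3cret"     # hypothetical value

# ${var:-default} uses the default when var is unset or empty
user="${MYSQL_USER:-root}"
export MYSQL_PWD="${MYSQL_PASSWORD:-$MYSQL_ROOT_PASSWORD}"

echo "$user"       # root
echo "$MYSQL_PWD"  # s3cret
```

Exporting MYSQL_PWD is what lets mysqladmin and mysql authenticate without the password appearing on the command line.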

Finally, our docker-compose.yml file will look like this:

services:
 app:
   depends_on:
     db:
       condition: service_healthy
   
# ...
 db:
   build:
     dockerfile: ./Dockerfile.db

Note that now the app container will check that the db container is healthy before it starts. Also note that the db service now refers to the new Dockerfile.db that we created, instead of the public official MySQL container image.

Do I have to do this in PostgreSQL?

Nope! PostgreSQL does not use a temporary server during its initialization phase, so none of this would be necessary in that case.
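For completeness, a typical PostgreSQL health check just uses pg_isready, which ships with the official postgres image (the timings here are illustrative):

```yaml
db:
  image: postgres
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U postgres"]
    interval: 10s
    timeout: 5s
    retries: 5
```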

Happy containerizing!