Skip to content

Add additional extensions to generated ssl certs #536

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 19, 2025

Conversation

jnoordsij
Copy link
Contributor

@jnoordsij jnoordsij commented May 19, 2025

This MR adds some additional extensions to the self-generated CA and server certificates the dind image may employ to set up TLS connections. This ensures the certificates (better) match RFCs specifying these extensions as required (see https://datatracker.ietf.org/doc/html/rfc3280 and https://datatracker.ietf.org/doc/html/rfc5280). This is particularly relevant for clients enforcing checks on this while connecting.

Source of the problem

When using my combination of:

  • GitLab CI
  • Docker dind image as service in CI job
  • Python Docker image employing the Docker SDK for Python to connect to the Docker daemon

I was presented with errors among the following lines:

docker.errors.DockerException: Error while fetching server API version: HTTPSConnectionPool(host='docker', port=2376): Max retries exceeded with url: /version (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: CA cert does not include key usage extension (_ssl.c:1028)')))

This is due to a recent change in the Python urllib3 package used by the Python Docker SDK, which follows the stricter checks Python introduced with >=3.13.

MWE + proof of fix

I've created https://gitlab.com/jnoordsij/test-docker-ci/-/tree/e5289dcdc5a736663d66ea4199f95f5c94d73fec as basic example of attempting to use the Python Docker SDK with the dind image that fails; see https://gitlab.com/jnoordsij/test-docker-ci/-/jobs/10076832026.

After building a patched dind image (https://gitlab.com/jnoordsij/test-docker-ci/-/commit/0c6a1ac70fd6d6f26e29af8d5d797b3d70d41967) and using that in the CI job, everything seems to work fine
(see also https://gitlab.com/jnoordsij/test-docker-ci/-/pipelines/1824971402).

Further notes

I think technically speaking only the second commit, updating the CA cert, is required to fix this particular issue. However when looking into this, I figured adding the server cert extensions is probably good practice, especially with this being recommended in official documentation.

Copy link
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, thank you ❤️

This is 100% neither your fault nor your problem, but even as the original author of most of this code, I hate it - I hate that we have to have it, I hate how complicated it is, and I especially hate that it's so fiddly to get "right"

I would love to deprecate it all, tell everyone to use the Unix socket instead, and call it a day, but that's not terribly realistic. 😞

-subj '/CN=docker:dind CA' \
-x509 \
-days "$certValidDays" \
-addext keyUsage=critical,digitalSignature,keyCertSign
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the OpenSSL docs say about critical here is this:

If critical is present then the extension will be marked as critical.

Do you have a reference for why this is necessary? 🤔

The docs also suggest this can be more whitespace-y, so if we need to touch this again I think I might try something like this:

Suggested change
-addext keyUsage=critical,digitalSignature,keyCertSign
-addext 'keyUsage = critical, digitalSignature, keyCertSign'

But we should probably add a new test or update our existing test to make sure this stays working before we touch it again (since you've done good validation on this change as-is). ❤️

Copy link
Contributor Author

@jnoordsij jnoordsij May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the RFC section 4.2.1.3 Key Usage:

This extension MUST appear in certificates that contain public keys
that are used to validate digital signatures on other public key
certificates or CRLs. When this extension appears, it SHOULD be
marked critical.

Edit: also definitely not an expert on this in any way, mostly looked for fixes I found elsewhere with people encountering the same type of issue when using Python 3.13 combined with some server application employing self-signed certificates.

@tianon tianon merged commit 8979f92 into docker-library:master May 19, 2025
6 checks passed
docker-library-bot added a commit to docker-library-bot/official-images that referenced this pull request May 19, 2025
Changes:

- docker-library/docker@861512c: Update 28-rc
- docker-library/docker@8979f92: Merge pull request docker-library/docker#536 from jnoordsij/add-ssl-ext
- docker-library/docker@8daf440: Update 28-rc to 28.2.0-rc.1, buildx 0.23.0, compose 2.36.1
- docker-library/docker@d370b5c: Update 28 to compose 2.36.1
- docker-library/docker@52c8bfa: Add keyUsage extension to dind generated CA cert
- docker-library/docker@b5332f2: Add extendedKeyUsage = serverAuth to dind generated server cert
@jnoordsij jnoordsij deleted the add-ssl-ext branch May 20, 2025 07:20
@jnoordsij
Copy link
Contributor Author

Oof, thank you ❤️

This is 100% neither your fault nor your problem, but even as the original author of most of this code, I hate it - I hate that we have to have it, I hate how complicated it is, and I especially hate that it's so fiddly to get "right"

I would love to deprecate it all, tell everyone to use the Unix socket instead, and call it a day, but that's not terribly realistic. 😞

Yeah I understand your feelings on this, it feels a bit too over-complicated to have this here. Deprecating it will probably be quite a hard thing to achieve. However, at least it does what it have to do right now pretty good, and I think generally it should be more or less a stable approach apart from cases like this enforcing a bit more strictness.

For changing towards sockets, just looking at GitLab for example, they do list using a socket as an option in their docs, but not as the preferred one (section starts with "You should use Docker-in-Docker with TLS enabled, which is supported by GitLab.com instance runners.") and moreover one that thus does not seem to be supported by their shared instance runners.

Although I could not find current shared runner configuration in their current docs, an older version only had a mount bind configured for the TLS based setup.

So before any of this could happen that would probably require quite a bit of effort in getting them to migrate, and then there's probably still lots of other usecases for the TLS approach out there 😓

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants