Skip to content

Use multi-stage Docker builds #587

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Use multi-stage Docker builds #587

wants to merge 1 commit into from

Conversation

lukebakken
Copy link
Collaborator

@lukebakken
Copy link
Collaborator Author

@tianon I did this because I'm also working on a branch to migrate to OpenSSL 3.0 (https://github.com/lukebakken/rabbitmq-1/tree/lukebakken/openssl-3.0).

I find using a multi-stage build a bit easier to understand than the long strings of shell commands. What do you think? cc @michaelklishin

@lukebakken lukebakken marked this pull request as ready for review November 3, 2022 15:55
@lukebakken
Copy link
Collaborator Author

@tianon all set. The images are the same size as what is available now:

lbakken@shostakovich ~
$ docker images 'rabbitmq*-local*'
REPOSITORY                   TAG             IMAGE ID       CREATED        SIZE
rabbitmq-local-multi-stage   latest          c146709d09e4   26 hours ago   229MB

lbakken@shostakovich ~
$ docker images 'rabbitmq:3'
REPOSITORY   TAG       IMAGE ID       CREATED      SIZE
rabbitmq     3         8d91aa7f6d1d   9 days ago   229MB

I could come up with a way to verify the contents of the rabbitmq:3 image vs one built using this PR if you'd like.

https://docs.docker.com/build/building/multi-stage/

Move most of Ubuntu build to multi-stage, still in progress

Finish up multi-stage build

Convert Alpine to multi-stage
@tianon
Copy link
Member

tianon commented Nov 7, 2022

Sorry, I'm struggling with this one a little bit 🙈

I'm not sure I understand the argument that it's easier to read? It's still a series of long RUN lines in basically the same order with the same ordering dependency between them but now slightly less obviously dependent 😅

(There's also https://github.com/docker-library/faq#multi-stage-builds 😞)

When I looked at updating to OpenSSL 3 in #574 (comment), one of the big issues I ran into was that the way we're specifying our build architecture to OpenSSL is technically wrong and I think needs to update to be a mapping to https://wiki.openssl.org/index.php/Compilation_and_Installation#Supported_Platforms 😩

@lukebakken
Copy link
Collaborator Author

lukebakken commented Nov 7, 2022

(There's also https://github.com/docker-library/faq#multi-stage-builds 😞)

That's fine, my PR is in essence a two-stage build anyway, but this could be explicit.

When I first started working on this branch it was pretty much my first exposure to how these docker files work. It seemed like a lot of shenanigans to make sure just the right deps were installed and then removed so that the final image was as small as possible.

In my opinion, by using a "build" phase (or phases) and "run" phase, it makes the distinction clearer and simplifies cleanup. Also, you don't have to worry about creating excessive layers in the build phase as they will all be discarded anyway.

I spent enough time working on this PR that the original code makes sense, so feel free to close it if you'd like.

Thanks for pointing out the OpenSSL 3.0 potential issues, I will take that into account with my work.

@lukebakken
Copy link
Collaborator Author

@tianon I remembered another feature of having these stages. When working on the OpenSSL 3.0 changes, I could use the --target argument to only run the openssl-builder stage. Of course, I could just comment everything else out but it's handy.

@michaelklishin
Copy link
Collaborator

michaelklishin commented Nov 7, 2022

My conclusion after reading https://github.com/docker-library/faq#multi-stage-builds and this discussion is that switching to a multi-stage build would be a certain quality of life improvement to some contributors. However, most typical benefits of multi-stage builds would
not necessarily apply to RabbitMQ and the way we build Erlang/OpenSSL/RabbitMQ today.

I am still in favor of merging this but it could be simply due to my long time familiarity and bias towards multi-stage builds. They are "how Docker was meant to be used" in my head, not like I have either broad or deep understanding of this space.

@lukebakken
Copy link
Collaborator Author

lukebakken commented Nov 7, 2022

I did think of another benefit to this approach vs the current one, but it only applies to occasions when someone is developing and iterating on this docker image. By having a multi-stage build with multiple statements, the OpenSSL and Erlang parts of the process are more easily cached (due to multiple RUN and other cache-able statements). So, for me, when I'm working on the OpenSSL 3.0 bits, I will get the "vast" speedup of not downloading stuff over-and-over.

Of course, in reality, how often does this happen?

Note that the final stage (the "runner" stage) is still as few statements as possible to minimize layers.

If this PR isn't merged I'll just keep my fork, iterate on that, and can submit final PRs to the original build files.

@tianon
Copy link
Member

tianon commented Nov 15, 2022

Thanks for your patience 🙇

After sitting with it for a while, I think I'm on board.

One use case that came to my mind after thinking about it longer is cases where (for whatever reason) the build of Erlang fails and we have to re-download and re-compile OpenSSL, and on some architectures that's heavier than others.

What I'd like to do though is make the conversion in more minimal/incremental steps (and I'm happy to do legwork here - I'm not asking you to do work you don't want to do!). For example, I think there's probably a minimal PR we could do to get most of the way here with a much smaller diff by adding a few line breaks with new FROM in them, a few new instances of RUN, and moving some of the currently-intermixed environment variables around, and then we can review other changes you'd like to make (like the changes for OpenSSL 3.x) in more isolated PRs.

We'll also want to update generate-stackbrew-library.sh to add Builder: buildkit to the "global entry" (right after GitRepo:, for example), which will help mitigate the downstream cache issues of multi-stage (docker-library/bashbrew#43).

Feel free to let me know if this is something you're interested in doing or whether you'd prefer I take the first stab. I don't have any problem with keeping this PR open as a reminder / marker of the full set of changes you'd like to get to, if that's helpful. 👍

@lukebakken
Copy link
Collaborator Author

I'll get right on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants