Skip to content

enabled_plugins file should be stored in /var/lib/rabbitmq #2234

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
lukebakken opened this issue Feb 7, 2020 · 28 comments
Open

enabled_plugins file should be stored in /var/lib/rabbitmq #2234

lukebakken opened this issue Feb 7, 2020 · 28 comments

Comments

@lukebakken
Copy link
Collaborator

[159039271]

System daemons should not expect to write to /etc. This causes issues in environments like docker where /etc is frequently on a read-only FS.

https://groups.google.com/d/topic/rabbitmq-users/DuxRP0zrMdA/discussion

It's unusual for an application to write to /etc, and as the above mailing list discussion shows, /etc is sometimes mounted read-only.

We should investigate supporting the plugins file first in /var/lib/rabbitmq, falling back to /etc/rabbitmq.

@michaelklishin
Copy link
Collaborator

Makes sense 👍

@HoloRin
Copy link
Contributor

HoloRin commented Mar 30, 2020

@lukebakken @michaelklishin I believe the broker won't start without a writable /var/lib/rabbitmq, so when we say fallback, do we mean to first look for /var/lib/rabbitmq/enabled_plugins, and if it does not exist look for it at the current location? I presume we would also want to log a warning if it is in /etc that the default has now changed (not to mention update all the docs).

@HoloRin
Copy link
Contributor

HoloRin commented Mar 30, 2020

So the behavior that I now have in mind is:
IF /var/lib/rabbitmq/enabled_plugins is present
we use it
ELSE
IF /etc/rabbitmq/enabled_plugins is present and writable
we use the /etc location, but log a console message that the location is deprecated
ELSE
we use the new /var location, but log a message that the file was ignored because it was not writable.

I believe this should be safe for upgrades, as with an existing read and writable file in /etc, the behavior is unchanged aside from the log messages.

@michaelklishin
Copy link
Collaborator

That sounds reasonable to me.

@lukebakken
Copy link
Collaborator Author

OK!

@HoloRin
Copy link
Contributor

HoloRin commented Mar 30, 2020

Work in progress is here: rabbitmq/rabbitmq-common#369
In short, I want to validate it's effects on the docker images before calling the PR ready.

@HoloRin
Copy link
Contributor

HoloRin commented Mar 31, 2020

After some further consideration, if the file is present in /etc but not writable, I think rather than ignoring it, we should copy it to the new location. Since it seems possible that if one never changes the enabled plugins through the cli, the broker could have been working for them with that file not writable. Doing the copy means that we won't suddenly disable their plugins, and if they later enable/disable and restart, that state will be honored as the new location would be writable.

@HoloRin
Copy link
Contributor

HoloRin commented Apr 1, 2020

I have solved the docker issue that I was running into, and the fix is here rabbitmq/rabbitmq-server-release#126

I believe these changes are now ready to go.

@HoloRin
Copy link
Contributor

HoloRin commented Apr 2, 2020

#2298 handles the change in the broker, but I'll need to make a corresponding change to the cli to handle use of the rabbitmq-plugins cli when used with --offline

HoloRin added a commit to rabbitmq/rabbitmq-cli that referenced this issue Apr 16, 2020
We look for the enabled_plugins file at the deprecated location, and at the new location. If it exists only at the deprecated location, we attempt to use that file. If the deprecated location is read-only, we allow the cli command to fail (or not) as it always used to.

Part of the changes intended to address rabbitmq/rabbitmq-server#2234
@HoloRin
Copy link
Contributor

HoloRin commented Apr 16, 2020

Now addressed in the CLI with rabbitmq/rabbitmq-cli#407

HoloRin added a commit to rabbitmq/rabbitmq-cli that referenced this issue Apr 16, 2020
We look for the enabled_plugins file at the deprecated location, and at the new location. If it exists only at the deprecated location, we attempt to use that file. If the deprecated location is read-only, we allow the cli command to fail (or not) as it always used to.

Part of the changes intended to address rabbitmq/rabbitmq-server#2234
HoloRin added a commit to rabbitmq/rabbitmq-cli that referenced this issue Apr 16, 2020
We look for the enabled_plugins file at the deprecated location, and at the new location. If it exists only at the deprecated location, we attempt to use that file. If the deprecated location is read-only, we allow the cli command to fail (or not) as it always used to.

Part of the changes intended to address rabbitmq/rabbitmq-server#2234
@HoloRin
Copy link
Contributor

HoloRin commented Apr 16, 2020

To elaborate on the CLI changes, based on @michaelklishin 's helpful input I decided to simplify the approach taken there. In the CLI, if the enabled_plugins file is present at the old location only, we use that location and let the CLI subsequently succeed or fail as it would normally, without a guard for read only access. Otherwise, we use the new location, again allowing the cli to fail normally based on that location. This should reduce the behavior change, but be compatible with the way things are now handled in the broker.

HoloRin added a commit to rabbitmq/rabbitmq-cli that referenced this issue Apr 17, 2020
We look for the enabled_plugins file at the deprecated location, and at the new location. If it exists only at the deprecated location, we attempt to use that file. If the deprecated location is read-only, we allow the cli command to fail (or not) as it always used to.

Part of the changes intended to address rabbitmq/rabbitmq-server#2234
@dumbbell
Copy link
Collaborator

I looked at the patches from @pjk25 and they look good to me.

That said, I believe we still have a deeper issue with that file. And if users and us will have to go through some kind of migration plan, I would prefer we fix the actual problem.

To me, this file has two incompatible uses:

  • a configuration file
  • a state record

Nobody should consider it a configuration file, but its location on Unix says the opposite and there is no other way for configuration management tools to configure RabbitMQ plugins. I'm sure this leads to some surprises like "I disabled that plugin using rabbitmq-plugins(8) but the plugin is back! WTF!" if the configuration management tool "fixed" the configuration.

I think we should first split the two uses:

  • Move enabled_plugins to the data directory (i.e. this set of patches).
  • Make enabled plugins configurable from the regular RabbitMQ configuration file.

Now that we (almost) support many small files (#2277), we should think about a solution to:

  • provide a good compromise between the configuration from rabbitmq-plugins(8) and the possibly conflicting one from a configuration management tool.
  • show informatique/helpful messages to the user when there is a conflict.

What do you think?

@michaelklishin
Copy link
Collaborator

Assuming that configuration file would only need schema files from plugins, that could work. If we'd depend on schema files from enabled plugins, there's a chicken-and-egg problem.

@michaelklishin
Copy link
Collaborator

Conflicts between explicit operator actions and what deployment tools do are not specific to this file or RabbitMQ in general. I think this is more or less obvious to operators.

@HoloRin
Copy link
Contributor

HoloRin commented Apr 21, 2020

If plugins can be enabled/disabled with the rabbitmq-plugins cli, management ui, or a config file, then there is always potential for discrepancies of state that surprise the user. But, I believe @michaelklishin is correct in that operators are not too surprised by it (or at least they haven't been complaining in this case).

Configuring plugins via file rather than additional cli commands will be the most common path in the k8s world in my opinion, so I think it deserves first class support of some kind.

What currently feels most "right" to me as a goal would be that rabbitmq.conf exposes a plugins.default_enabled option, to signify that the list of plugins may be overridden later at runtime, or may have no effect if plugins have already been enabled via other means.

The currently enabled plugins would continue to be written to /var/lib/rabbitmq/enabled_plugins and persist across restarts of the broker. I'd prefer that we stop exposing the list of currently enabled plugins through the filesystem at all, and instead force users to query the broker, but I also prefer to avoid any breaking changes whenever possible.

@dumbbell
Copy link
Collaborator

Assuming that configuration file would only need schema files from plugins, that could work. If we'd depend on schema files from enabled plugins, there's a chicken-and-egg problem.

Plugins are in the code path, enabled or not, and rabbitmq_prelaunch_conf passes all schemas to Cuttlefish, regardless of the plugins' status. So I believe we don't have this issue.

@dumbbell
Copy link
Collaborator

I'd prefer that we stop exposing the list of currently enabled plugins through the filesystem at all, and instead force users to query the broker (...)

We have to persist that list somewhere, but I see what you mean. That's exactly my point: we should decouple the state recording from the configuration part.

@michaelklishin
Copy link
Collaborator

I suggest that we adopt a solution like this and consider changing the way plugin state is stored and preconfigured for 4.0.

@lukebakken
Copy link
Collaborator Author

Before we get too fancy, just remember the original problem is that /etc is not writable all the time. That's the only problem we need to solve right now 😄

@HoloRin
Copy link
Contributor

HoloRin commented Apr 23, 2020

I just discovered that the management plugin expects to be able to determine the location from the env, and crashes with the current code. Looks like I have more to do before this is ready to go.

@HoloRin
Copy link
Contributor

HoloRin commented Apr 23, 2020

My comment above can be disregarded. I was running mismatched branches in my build of the broker.

@HoloRin
Copy link
Contributor

HoloRin commented May 25, 2020

Given docker-library/rabbitmq#410 (comment), I'm now tempted to say that the file should stay put for now.

Basically, in the docker context, we should not assume /etc writable, nor should we assume that initial config can be kept in /var/lib/rabbitmq. Since the enabled_plugins file is both config and state, neither place is an appropriate home for it. At least it's configurable with RABBITMQ_ENABLED_PLUGINS_FILE so while it might not work out of the box, a user has options.

What does everyone think about me closing this issue and related PRs? I think the rabbitmq-plugins cli is basically incompatible with a read-only config scenario. What I might propose instead is a new flag for that tool, maybe rabbitmq-plugins enable_plugin rabbitmq_tracing --transient, where the enabling is not written to disk, and will revert on restart. It might address the debugging scenario from the mailing list that brought this up more directly.

@lukebakken
Copy link
Collaborator Author

Why not take these changes "all the way" at this point in time and separate config vs state correctly? This is the correct fix since we can't assume /etc is writable.

The VOLUME issue shouldn't affect our decisions as the solution is already outlined in docker-library/rabbitmq#410 (comment) - drop VOLUME altogther.

@HoloRin
Copy link
Contributor

HoloRin commented Jun 4, 2020

My impression was that VOLUME wouldn't get dropped. It servers as a hint to docker compose to retain data. So I think we have to account for it. Maybe not after moby/moby#3465 allows UNVOLUME or something similar.

To fix this properly, I think we have to assume that the config files aren't writable, or at least offer a mode of behavior that supports it. That's the real k8s pattern as I see it - config files get injected read only. kubernetes/kubernetes#62099 (comment) & https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#add-configmap-data-to-a-specific-path-in-the-volume

I don't know if that would be too big a change for 3.8.5. But, what if we

  1. allow plugins to be specified in the .conf/.config files
  2. if they are specified there, rabbitmq-plugins --offline will fail, as the file won't be modified
  3. it they are specified there, and the broker is running, rabbitmq-plugins will also fail, unless invoked with --transient, where the plugin will activate, but only until the next restart
  4. if not specified in .conf, honor the enabled_plugins file with the current behavior, in the current location

Once the public docker takes the new approach (it relies on --offline), these users should not run into this problem, but everyone else can opt-in to the new option on their own time.

@lukebakken
Copy link
Collaborator Author

Let's think of this from an operator's point of view.

  • "I want to configure the set of plugins that are enabled the first time RabbitMQ starts"

Ensure that /etc/rabbitmq/enabled_plugins is available when RabbitMQ starts. Supported today and with the changes in these PRs.

Or, create /var/lib/rabbitmq/enabled_plugins, which is supported by these PRs.

  • "I want to use rabbitmq-plugins at some point to enable or disable plugins."

Use the CLI command. Changes to the running set of plugins are saved in /var/lib/rabbitmq, which will be writable. RabbitMQ will always consult this file first when it starts to determine the list of plugins to start.

  • "I want to modify a file to change the current set of plugins after the first time RabbitMQ starts"

I suspect this is a very rare scenario. Modify /var/lib/rabbitmq/enabled_plugins. If the user modifies the file in /etc/rabbitmq the changes won't be applied BUT I expect that RabbitMQ will log loudly that files exist in /etc/rabbitmq and /var/lib/rabbitmq and that the latter file is used.

So having said all that I think the current set of changes suffice.

@michaelklishin
Copy link
Collaborator

The latter scenario is how most CI scripts work. They start a node and then use rabbitmq-plugins enable or rabbitmq-plugins set. As long as those are affective immediately and in case of a node restart, whether /var/lib/rabbitmq/enabled_plugins is used won't matter to almost anyone.

@michaelklishin michaelklishin modified the milestones: 3.8.5, 3.8.6 Jun 9, 2020
@HoloRin
Copy link
Contributor

HoloRin commented Jun 10, 2020

It's not yet clear to me where we landed on this. If we apply the change, it breaks the management variants of the public docker image. They would have to make some changes to the Dockerfiles, to support it. Maybe we are okay with that, but it wouldn't be my first choice, especially because there would still be some limitations for child docker files.

If not for the above, I'd be in favor of merging these changes now.

Maybe a suitable option is to contribute the necessary fix in advance to https://github.com/docker-library/rabbitmq. I'll take a look at what that would entail, and open an appropriate PR.

@HoloRin
Copy link
Contributor

HoloRin commented Jun 10, 2020

I have opened docker-library/rabbitmq#416

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants