Description
When #65905 was implemented to automatically set the heap size, the idea was that explicit settings provided by the user would be respected:
In all cases we short-circuit if a user provides explicit heap options
so we only ever auto-determine heap if no existing heap options exist.
This doesn't always happen. In particular, it only happens if the user-provided setting is an exact multiple of 4MB. Otherwise, the JVM rounds the user's value up to the next multiple of 4MB, and the setting's origin is then no longer reported as purely "command line". For example:
bash-3.2$ java -Xmx408m -XX:+PrintFlagsFinal | grep MaxHeapSize
size_t MaxHeapSize = 427819008 {product} {command line}
bash-3.2$ java -Xmx409m -XX:+PrintFlagsFinal | grep MaxHeapSize
size_t MaxHeapSize = 429916160 {product} {command line, ergonomic}
So, on my laptop, if I start Elasticsearch with -Xmx408m I get a 408MB heap. If I start Elasticsearch with -Xmx409m I get a 31GB heap. (If the -Xmx409m wasn't replaced then I'd get a 412MB heap, i.e. 409 rounded up to the next multiple of 4.)
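The rounding described above can be sketched as a plain align-up calculation. This is only an illustration: `HeapAlign` and `alignUp` are hypothetical names, and the 4MB alignment is taken from the description rather than from JVM source. The real alignment depends on the JVM build and garbage collector in use; the `java` transcript above, for instance, reports 429916160 bytes, which is 410MB.

```java
// Hypothetical helper, not part of Elasticsearch or the JVM.
final class HeapAlign {
    static final long MB = 1024L * 1024L;

    // Round bytes up to the next multiple of alignment (alignment must be > 0).
    static long alignUp(long bytes, long alignment) {
        return ((bytes + alignment - 1) / alignment) * alignment;
    }

    public static void main(String[] args) {
        // Assumed 4MB alignment, as stated in the description.
        System.out.println(alignUp(408 * MB, 4 * MB) / MB); // 408: already aligned
        System.out.println(alignUp(409 * MB, 4 * MB) / MB); // 412: rounded up
        // A 2MB alignment reproduces the byte count in the transcript.
        System.out.println(alignUp(409 * MB, 2 * MB));      // 429916160
    }
}
```

Whatever the exact alignment, the point is the same: a value that is already aligned passes through unchanged, while any other value is adjusted by the JVM and so picks up the extra "ergonomic" origin.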
I doubt this is a problem at all for human end users, as they'll generally choose nice round sizes. But the place where it makes a difference is where the -Xmx argument for Elasticsearch is provided by a machine orchestrator. For example, the Elastic Cloud orchestrator used to set the heap size on 1GB ML nodes to 409MB prior to 8.2. We thought that was taking precedence over the Elasticsearch auto-sizing but actually these 1GB ML nodes have been using Elasticsearch's auto-sizing since 7.11. It doesn't matter since the calculations were deliberately designed to be the same to start off with.
If we want to make the auto-sized heap respect all user overrides then I think this line should be changed to check if origin contains command line instead of being equal to it:
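As a rough illustration of the difference between the two checks (not the actual Elasticsearch code), the sketch below parses a `-XX:+PrintFlagsFinal` line and compares an exact-match test on the origin with a contains test. The class `HeapOriginCheck`, its method names, and the regular expression are all hypothetical and only assume the line format shown in the transcript above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and regex are assumptions, not Elasticsearch code.
final class HeapOriginCheck {
    // Matches lines such as:
    //   size_t MaxHeapSize = 429916160 {product} {command line, ergonomic}
    private static final Pattern FLAG_LINE = Pattern.compile(
        "\\s*\\S+\\s+(?<flag>\\S+)\\s+:?=\\s+(?<value>\\S+)\\s+\\{[^}]*\\}\\s+\\{(?<origin>[^}]*)\\}\\s*");

    // Extract the text inside the final {...} group, e.g. "command line, ergonomic".
    static String origin(String line) {
        Matcher m = FLAG_LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised flag line: " + line);
        }
        return m.group("origin");
    }

    // Exact-match check: misses values that the JVM has adjusted.
    static boolean originEqualsCommandLine(String line) {
        return "command line".equals(origin(line));
    }

    // Contains check: still true when the JVM appends ", ergonomic".
    static boolean originContainsCommandLine(String line) {
        return origin(line).contains("command line");
    }

    public static void main(String[] args) {
        String aligned = "size_t MaxHeapSize = 427819008 {product} {command line}";
        String rounded = "size_t MaxHeapSize = 429916160 {product} {command line, ergonomic}";
        System.out.println(originEqualsCommandLine(aligned));   // true
        System.out.println(originEqualsCommandLine(rounded));   // false
        System.out.println(originContainsCommandLine(rounded)); // true
    }
}
```

With the contains check, both -Xmx408m and -Xmx409m would be treated as user-provided, so auto-sizing would be skipped in both cases.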