[reconfigurator] Call clickhouse-admin API from SMF services #6533
Conversation
f.write_all(config.to_xml().as_bytes())?;
f.flush()?;
rename(f.path(), self.config_dir.join("replica-server-config.xml"))?;
Sadly, this rename produces an error when running via the clickhouse-admin API inside an SMF service:
{
  "request_id": "a529336d-e113-429f-b90e-5bb80f7912fc",
  "error_code": "Internal",
  "message": "clickward XML generation failure: Cross-device link (os error 18)"
}
We still want to create a temporary file and rename, otherwise we can end up with partially written configurations that do not get corrected without support and/or a config change. There are ways to work around this, but atomic rename is typically the simplest. And it will help once we have generation numbers and need to write those to disk as well.
My guess is that this is failing because NamedUtf8TempFile is putting the temp file on a tmpfs filesystem (e.g. /tmp) and you can't rename it onto the ZFS filesystem on an actual U.2 device. You should be able to fix this by not using NamedUtf8TempFile and instead just creating a file named replica-server-config.xml.tmp in self.config_dir() and then doing the atomic rename.
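The suggested fix can be sketched roughly as below. The function name `write_config_atomically` and the demo paths are illustrative, not omicron's actual API; the key point is that the temporary file lives in the same directory (and therefore on the same filesystem) as the destination, so the rename is atomic and cannot fail with EXDEV.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

// Write the full contents to a `.tmp` sibling of the target, then rename.
// Readers see either the old config or the new one, never a partial write.
fn write_config_atomically(config_dir: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp_path = config_dir.join("replica-server-config.xml.tmp");
    let final_path = config_dir.join("replica-server-config.xml");

    let mut f = File::create(&tmp_path)?;
    f.write_all(contents)?;
    f.sync_all()?; // ensure the bytes are durable before the rename

    // Same directory => same filesystem => atomic, no cross-device link error.
    fs::rename(&tmp_path, &final_path)?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("clickhouse-admin-demo");
    fs::create_dir_all(&dir)?;
    write_config_atomically(&dir, b"<clickhouse/>")?;
    println!("{}", fs::read_to_string(dir.join("replica-server-config.xml"))?);
    Ok(())
}
```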
To emphasize what can go wrong. If we partially write the file, yes we will get an error, and yes we can retry. But the config will automatically get loaded by clickhouse and may crash it or cause other issues. It's better to have a stable old config and only load valid configs.
There's also the atomicwrites crate which is pretty good; e.g., omicron/dev-tools/openapi-manager/src/spec.rs (line 482 in 3d3f6d7): /// Overwrite a file with new contents, if the contents are different.
> There's also the atomicwrites crate which is pretty good; e.g., omicron/dev-tools/openapi-manager/src/spec.rs (line 482 in 3d3f6d7): /// Overwrite a file with new contents, if the contents are different.
Oh nice. TIL
Ohhh that's a very useful crate! I've updated the code
Love to see it :)
Overview
This commit replaces the old replicated ClickHouse server and keeper configuration templates with calls to the clickhouse-admin API that generate said configuration files.
Purpose
While the end goal is to have Nexus make the API calls to generate the configuration files, we'd like to have a working implementation of the clickhouse-admin API via the SMF services. Using curl is not what the finished work will look like, but rather it is the simplest way to have a working implementation in the meantime.
Testing
Deployed this branch on a Helios machine with the following results
Replica 1
Replica 2
Keeper 1
Keeper 2
Keeper 3
Related: #5999
Closes: #3824