[reconfigurator] Call clickhouse-admin API from SMF services #6533

Merged 4 commits on Sep 9, 2024

@karencfv karencfv commented Sep 6, 2024

Overview

This commit replaces the old replicated ClickHouse server and keeper configuration templates with calls to the clickhouse-admin API that generate said configuration files.

Purpose

While the end goal is to have Nexus make the API calls that generate the configuration files, we'd first like to have a working implementation of the clickhouse-admin API via the SMF services. Calling the API with curl is not what the finished work will look like; it is simply the easiest way to have a working implementation in the meantime.

Testing

Deployed this branch on a Helios machine with the following results:

Replica 1

root@oxz_clickhouse_server_a9d02cd3:~# /opt/oxide/clickhouse_server/clickhouse client --host fd00:1122:3344:101::f
ClickHouse client version 23.8.7.1.
Connecting to fd00:1122:3344:101::f:9000 as user default.
Connected to ClickHouse server version 23.8.7 revision 54465.

oximeter_cluster_1 :) SHOW TABLES FROM oximeter

SHOW TABLES FROM oximeter

Query id: 06867649-f49e-451f-b9f1-5a574e12ce5b

┌─name─────────────────────────────┐
│ fields_bool                      │
│ fields_bool_local                │
│ fields_i16                       │
│ fields_i16_local                 │
│ fields_i32                       │
│ fields_i32_local                 │
│ fields_i64                       │
│ fields_i64_local                 │
│ fields_i8                        │
│ fields_i8_local                  │
│ fields_ipaddr                    │
│ fields_ipaddr_local              │
│ fields_string                    │
│ fields_string_local              │
│ fields_u16                       │
│ fields_u16_local                 │
│ fields_u32                       │
│ fields_u32_local                 │
│ fields_u64                       │
│ fields_u64_local                 │
│ fields_u8                        │
│ fields_u8_local                  │
│ fields_uuid                      │
│ fields_uuid_local                │
│ measurements_bool                │
│ measurements_bool_local          │
│ measurements_bytes               │
│ measurements_bytes_local         │
│ measurements_cumulativef32       │
│ measurements_cumulativef32_local │
│ measurements_cumulativef64       │
│ measurements_cumulativef64_local │
│ measurements_cumulativei64       │
│ measurements_cumulativei64_local │
│ measurements_cumulativeu64       │
│ measurements_cumulativeu64_local │
│ measurements_f32                 │
│ measurements_f32_local           │
│ measurements_f64                 │
│ measurements_f64_local           │
│ measurements_histogramf32        │
│ measurements_histogramf32_local  │
│ measurements_histogramf64        │
│ measurements_histogramf64_local  │
│ measurements_histogrami16        │
│ measurements_histogrami16_local  │
│ measurements_histogrami32        │
│ measurements_histogrami32_local  │
│ measurements_histogrami64        │
│ measurements_histogrami64_local  │
│ measurements_histogrami8         │
│ measurements_histogrami8_local   │
│ measurements_histogramu16        │
│ measurements_histogramu16_local  │
│ measurements_histogramu32        │
│ measurements_histogramu32_local  │
│ measurements_histogramu64        │
│ measurements_histogramu64_local  │
│ measurements_histogramu8         │
│ measurements_histogramu8_local   │
│ measurements_i16                 │
│ measurements_i16_local           │
│ measurements_i32                 │
│ measurements_i32_local           │
│ measurements_i64                 │
│ measurements_i64_local           │
│ measurements_i8                  │
│ measurements_i8_local            │
│ measurements_string              │
│ measurements_string_local        │
│ measurements_u16                 │
│ measurements_u16_local           │
│ measurements_u32                 │
│ measurements_u32_local           │
│ measurements_u64                 │
│ measurements_u64_local           │
│ measurements_u8                  │
│ measurements_u8_local            │
│ timeseries_schema                │
│ timeseries_schema_local          │
│ version                          │
└──────────────────────────────────┘

81 rows in set. Elapsed: 0.005 sec. 

oximeter_cluster_1 :) SELECT * FROM oximeter.measurements_u64

SELECT *
FROM oximeter.measurements_u64

Query id: 2e13d330-8f0b-4346-afc0-ba3c21ea7674

┌─timeseries_name─────────────────────────┬───────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ ddm_router:originated_tunnel_endpoints  │  2085026407707057203 │ 2024-09-09 07:16:47.241835734 │     0 │
│ ddm_router:originated_tunnel_endpoints  │  2085026407707057203 │ 2024-09-09 07:16:48.241091831 │     0 │
│ ddm_router:originated_tunnel_endpoints  │  2085026407707057203 │ 2024-09-09 07:16:49.241294398 │     0 │
<...>

Replica 2

root@oxz_clickhouse_server_ba1601d3:~# /opt/oxide/clickhouse_server/clickhouse client --host fd00:1122:3344:101::e
ClickHouse client version 23.8.7.1.
Connecting to fd00:1122:3344:101::e:9000 as user default.
Connected to ClickHouse server version 23.8.7 revision 54465.

oximeter_cluster_2 :) SHOW TABLES FROM oximeter

SHOW TABLES FROM oximeter

Query id: 33dd1d4d-1596-44e3-90ea-c755a1e3ae24

┌─name─────────────────────────────┐
│ fields_bool                      │
│ fields_bool_local                │
│ fields_i16                       │
│ fields_i16_local                 │
│ fields_i32                       │
│ fields_i32_local                 │
│ fields_i64                       │
│ fields_i64_local                 │
│ fields_i8                        │
│ fields_i8_local                  │
│ fields_ipaddr                    │
│ fields_ipaddr_local              │
│ fields_string                    │
│ fields_string_local              │
│ fields_u16                       │
│ fields_u16_local                 │
│ fields_u32                       │
│ fields_u32_local                 │
│ fields_u64                       │
│ fields_u64_local                 │
│ fields_u8                        │
│ fields_u8_local                  │
│ fields_uuid                      │
│ fields_uuid_local                │
│ measurements_bool                │
│ measurements_bool_local          │
│ measurements_bytes               │
│ measurements_bytes_local         │
│ measurements_cumulativef32       │
│ measurements_cumulativef32_local │
│ measurements_cumulativef64       │
│ measurements_cumulativef64_local │
│ measurements_cumulativei64       │
│ measurements_cumulativei64_local │
│ measurements_cumulativeu64       │
│ measurements_cumulativeu64_local │
│ measurements_f32                 │
│ measurements_f32_local           │
│ measurements_f64                 │
│ measurements_f64_local           │
│ measurements_histogramf32        │
│ measurements_histogramf32_local  │
│ measurements_histogramf64        │
│ measurements_histogramf64_local  │
│ measurements_histogrami16        │
│ measurements_histogrami16_local  │
│ measurements_histogrami32        │
│ measurements_histogrami32_local  │
│ measurements_histogrami64        │
│ measurements_histogrami64_local  │
│ measurements_histogrami8         │
│ measurements_histogrami8_local   │
│ measurements_histogramu16        │
│ measurements_histogramu16_local  │
│ measurements_histogramu32        │
│ measurements_histogramu32_local  │
│ measurements_histogramu64        │
│ measurements_histogramu64_local  │
│ measurements_histogramu8         │
│ measurements_histogramu8_local   │
│ measurements_i16                 │
│ measurements_i16_local           │
│ measurements_i32                 │
│ measurements_i32_local           │
│ measurements_i64                 │
│ measurements_i64_local           │
│ measurements_i8                  │
│ measurements_i8_local            │
│ measurements_string              │
│ measurements_string_local        │
│ measurements_u16                 │
│ measurements_u16_local           │
│ measurements_u32                 │
│ measurements_u32_local           │
│ measurements_u64                 │
│ measurements_u64_local           │
│ measurements_u8                  │
│ measurements_u8_local            │
│ timeseries_schema                │
│ timeseries_schema_local          │
│ version                          │
└──────────────────────────────────┘

81 rows in set. Elapsed: 0.010 sec. 

oximeter_cluster_2 :) SELECT * FROM oximeter.measurements_u64

SELECT *
FROM oximeter.measurements_u64

Query id: 06da0f16-3055-47cb-9984-94dc78f99afc

┌─timeseries_name─────────────────────────┬───────timeseries_key─┬─────────────────────timestamp─┬─datum─┐
│ ddm_router:originated_tunnel_endpoints  │  2085026407707057203 │ 2024-09-09 07:22:02.443983562 │     0 │
│ ddm_router:originated_tunnel_endpoints  │  2085026407707057203 │ 2024-09-09 07:22:03.444346219 │     0 │
│ ddm_router:originated_tunnel_endpoints  │  2085026407707057203 │ 2024-09-09 07:22:04.444356384 │     0 │
<...>

Keeper 1

root@oxz_clickhouse_keeper_8cb0de91:~# echo mntr | nc fd00:1122:3344:101::12 9181
zk_version      v23.8.7.1-lts-077df679bed122ad45c8b105d8916ccfec85ae64
zk_avg_latency  4
zk_max_latency  103
zk_min_latency  0
zk_packets_received     27769
zk_packets_sent 29290
zk_num_alive_connections        1
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count  6535
zk_watch_count  83
zk_ephemerals_count     82
zk_approximate_data_size        2330794
zk_key_arena_size       1044480
zk_latest_snapshot_size 0
zk_followers    2
zk_synced_followers     2

Keeper 2

root@oxz_clickhouse_keeper_a6c18bd2:~# echo mntr | nc fd00:1122:3344:101::10 9181
zk_version      v23.8.7.1-lts-077df679bed122ad45c8b105d8916ccfec85ae64
zk_avg_latency  10
zk_max_latency  139
zk_min_latency  0
zk_packets_received     22278
zk_packets_sent 23922
zk_num_alive_connections        1
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count  7015
zk_watch_count  83
zk_ephemerals_count     82
zk_approximate_data_size        2512980
zk_key_arena_size       1044480
zk_latest_snapshot_size 0

Keeper 3

root@oxz_clickhouse_keeper_45d3e6ef:~# echo mntr | nc fd00:1122:3344:101::11 9181
zk_version      v23.8.7.1-lts-077df679bed122ad45c8b105d8916ccfec85ae64
zk_avg_latency  0
zk_max_latency  0
zk_min_latency  0
zk_packets_received     0
zk_packets_sent 0
zk_num_alive_connections        0
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count  7188
zk_watch_count  0
zk_ephemerals_count     82
zk_approximate_data_size        2575631
zk_key_arena_size       1044480
zk_latest_snapshot_size 0

Related: #5999
Closes: #3824

f.write_all(config.to_xml().as_bytes())?;
f.flush()?;
rename(f.path(), self.config_dir.join("replica-server-config.xml"))?;
Contributor Author

Sadly, this rename produces an error when running via the clickhouse-admin API inside an SMF service:

{
  "request_id": "a529336d-e113-429f-b90e-5bb80f7912fc",
  "error_code": "Internal",
  "message": "clickward XML generation failure: Cross-device link (os error 18)"
}

Contributor

We still want to create a temporary file and rename it; otherwise we can end up with partially written configurations that do not get corrected without support and/or a config change. There are ways to work around this, but an atomic rename is typically the simplest. It will also help once we have generation numbers and need to write those to disk as well.

My guess is that this fails because NamedUtf8TempFile puts the temp file on a tmpfs filesystem, and a file cannot be renamed from there onto the ZFS filesystem on an actual U.2 device: rename only works within a single filesystem, hence the "Cross-device link" error. You should be able to fix this by not using NamedUtf8TempFile and instead creating a file named replica-server-config.xml.tmp in self.config_dir, then doing the atomic rename.
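
A minimal sketch of that suggestion, using only the standard library. The function name `write_config_atomically` and its parameters are hypothetical, chosen for illustration; this is not the code that was merged. The point is that the temp file is created next to its destination, so both paths are on the same filesystem and the rename cannot fail with os error 18.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to `<config_dir>/<file_name>` by way of a sibling
/// `<file_name>.tmp` file. (Hypothetical helper, for illustration only.)
fn write_config_atomically(
    config_dir: &Path,
    file_name: &str,
    contents: &str,
) -> io::Result<()> {
    let final_path = config_dir.join(file_name);
    // The temp file lives in the same directory as the final file, and
    // therefore on the same filesystem, so the rename below stays within
    // one device.
    let tmp_path = config_dir.join(format!("{file_name}.tmp"));

    let mut f = File::create(&tmp_path)?;
    f.write_all(contents.as_bytes())?;
    // Push the data to disk before the rename makes it visible under the
    // final name.
    f.sync_all()?;
    drop(f);

    // On POSIX systems, rename replaces the destination atomically: a
    // reader sees either the old config or the complete new one.
    fs::rename(&tmp_path, &final_path)
}
```

Called as `write_config_atomically(&config_dir, "replica-server-config.xml", &xml)`, a failure at any step before the rename leaves the previous config untouched; at worst a stale `.tmp` file remains and is overwritten on the next attempt.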

Contributor

To emphasize what can go wrong: if we partially write the file, yes, we will get an error, and yes, we can retry. But ClickHouse loads the config automatically, and a partial one may crash it or cause other issues. It's better to keep a stable old config and only ever load valid configs.

Collaborator

There's also the atomicwrites crate which is pretty good; e.g.,

/// Overwrite a file with new contents, if the contents are different.
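
For readers without the linked snippet at hand, here is a rough standard-library sketch of the idea in that doc comment: skip the write when the file already has the desired contents, otherwise replace it through a sibling temp file. It is neither the atomicwrites API nor the helper linked above; the name `write_if_changed` and the `.tmp` suffix are assumptions made for illustration.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Overwrite `path` with `contents` only if they differ.
/// Returns Ok(true) if the file was written, Ok(false) if it was left alone.
/// (Hypothetical helper, for illustration only.)
fn write_if_changed(path: &Path, contents: &str) -> io::Result<bool> {
    match fs::read(path) {
        // Same bytes already on disk: nothing to do.
        Ok(existing) if existing.as_slice() == contents.as_bytes() => return Ok(false),
        // Different contents, or no file yet: fall through and write.
        Ok(_) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    // Build "<path>.tmp" next to the target so the rename stays on one
    // filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut f = File::create(&tmp_path)?;
    f.write_all(contents.as_bytes())?;
    f.sync_all()?;
    drop(f);

    fs::rename(&tmp_path, path)?;
    Ok(true)
}
```

Skipping identical writes avoids touching a file that ClickHouse watches for changes; a maintained crate is still the better choice in real code, since it handles details this sketch ignores.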

Contributor

> There's also the atomicwrites crate which is pretty good; e.g.,
>
> /// Overwrite a file with new contents, if the contents are different.

Oh nice. TIL

Contributor Author

Ohhh that's a very useful crate! I've updated the code

@karencfv karencfv marked this pull request as ready for review September 9, 2024 07:44
@karencfv karencfv changed the title [reconfigurator] WIP: Call clickhouse-admin API from SMF services [reconfigurator] Call clickhouse-admin API from SMF services Sep 9, 2024
Contributor

@andrewjstone andrewjstone left a comment

Love to see it :)

@karencfv karencfv merged commit fef8616 into oxidecomputer:main Sep 9, 2024
18 checks passed
@karencfv karencfv deleted the generate-ch-xml branch September 9, 2024 20:36
Successfully merging this pull request may close these issues.

[ClickHouse] Generate the configuration files instead of hardcoding them