Commit a70e6f9

Update thing-flinger to track changes to omicron (#1001)
Summary of changes:

* Improve error messages in omicron-package
* Teach `thing-flinger sync` about `config-rss.toml`
* Add `-y` option to `install_prerequisites.sh` to skip confirm prompts
* Add `-p` option to `install_prerequisites.sh` to skip PATH check
* Add `thing-flinger install-prereqs` subcommand
1 parent d16d77b commit a70e6f9
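
The command workflow documented in the updated `deploy/README.adoc` runs a fixed sequence of `thing-flinger` subcommands. A minimal dry-run sketch of that sequence follows; it only prints the commands rather than executing them. The function name `flinger_plan` and the config path `deployment.toml` are hypothetical, chosen for illustration only.

```shell
#!/bin/sh
# Dry-run sketch: print the thing-flinger commands in the order the README
# lists them. Nothing is executed; each line is only echoed.
# `flinger_plan` is a hypothetical helper name, not part of thing-flinger.
flinger_plan() {
    config="$1"   # path to the deployment TOML file (assumed to exist)
    # "build check" is marked optional in the README; it is listed here
    # for completeness.
    for step in "sync" "install-prereqs" "build check" "build package" \
                "overlay" "deploy install"; do
        echo "cargo run --bin thing-flinger -- -c $config $step"
    done
}

flinger_plan deployment.toml
```

Printing the plan first makes it easy to compare against the README before running any step by hand; `deploy uninstall` is deliberately left out because it reverses the install.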

6 files changed, +382 -106 lines changed

deploy/README.adoc

Lines changed: 96 additions & 16 deletions
@@ -83,42 +83,58 @@ all the dependencies for Omicron installed. Following the *prerequisites* in the
 https://github.com/oxidecomputer/omicron/#build-and-run[Build and run] section of the main Omicron
 README is probably a good idea.
 
-=== Command Based Workflow
+==== Update `config-rss.toml`
 
-==== Build thing-flinger on client
-`thing-flinger` is part of the `omicron-package` crate.
+Currently rack setup is driven by a configuration file that lives at
+`smf/sled-agent/config-rss.toml` in the root of this repository. The committed
+configuration of that file contains a single `[[requests]]` entry (with many
+services inside it), which means it will start services on only one sled. To
+start services (e.g., nexus) on multiple sleds, add additional entries to that
+configuration file before proceeding.
 
-`cargo build -p omicron-package`
+=== Command Based Workflow
 
 ==== sync
-Copy your source code to the builder. Note that this copies over your `.git` subdirectory on purpose so
-that a branch can be configured for building with the `git_treeish` field in the toml `builder`
-table.
+Copy your source code to the builder.
+
+`cargo run --bin thing-flinger -- -c <CONFIG> sync`
 
-`./target/debug/thing-flinger -c <CONFIG.toml> sync`
+==== Install Prerequisites
+Install necessary build and runtime dependencies (including downloading prebuilt
+binaries like Clickhouse and CockroachDB) on the builder and all deployment
+targets. This step only needs to be performed once, absent any changes to the
+dependencies, but is idempotent so may be run multiple times.
 
-==== build-minimal
-Build necessary parts of omicron on the builder, required for future use by thing-flinger.
+`cargo run --bin thing-flinger -- -c <CONFIG> install-prereqs`
 
-`./target/debug/thing-flinger -c <CONFIG> build-minimal`
+==== check (optional)
+Run `cargo check` on the builder against the copy of `omicron` that was sync'd
+to it in the previous step.
 
-==== package
+`cargo run --bin thing-flinger -- -c <CONFIG> build check`
+
+==== package
 Build and package omicron using `omicron-package` on the builder.
 
-`./target/debug/thing-flinger -c <CONFIG> package`
+`cargo run --bin thing-flinger -- -c <CONFIG> build package`
 
 ==== overlay
 Create files that are unique to each deployment server.
 
-`./target/debug/thing-flinger -c <CONFIG> overlay`
+`cargo run --bin thing-flinger -- -c <CONFIG> overlay`
 
-==== install
+==== install
 Install omicron to all machines, in parallel. This consists of copying the packaged omicron tarballs
 along with overlay files, and omicron-package and its manifest to a `staging` directory on each
 deployment server, and then running omicron-package, installing overlay files, and restarting
 services.
 
-`./target/debug/thing-flinger -c <CONFIG> install`
+`cargo run --bin thing-flinger -- -c <CONFIG> deploy install`
+
+==== uninstall
+Uninstall omicron from all machines.
+
+`cargo run --bin thing-flinger -- -c <CONFIG> deploy uninstall`
 
 === Current Limitations
 
@@ -140,3 +156,67 @@ effort to use securely. This particular implementation wraps the openssh ssh cli
 `std::process::Command`, rather than using the `ssh2` crate, because ssh2, as a wrapper around
 `libssh`, does not support agent-forwarding.
 
+== Notes on Using VMs as Deployed Servers on a Linux Host
+
+TODO: This section should be fleshed out more and potentially lifted to its own
+document; for now this is a collection of rough notes.
+
+---
+
+It's possible to use a Linux libvirt host running multiple helios VMs as the
+builder/deployment server targets, but it requires some additional setup beyond
+[`helios-engvm`](https://github.com/oxidecomputer/helios-engvm).
+
+`thing-flinger` does not have any support for running the
+`tools/create_virtual_hardware.sh` script; this will need to be done by hand on
+each VM.
+
+---
+
+To enable communication between the VMs over their IPv6 bootstrap networks:
+
+1. Enable IPv6 and DHCP on the virtual network libvirt uses for the VMs; e.g.,
+
+```xml
+<ip family="ipv6" address="fdb0:5254::1" prefix="96">
+<dhcp>
+<range start="fdb0:5254::100" end="fdb0:5254::1ff"/>
+</dhcp>
+</ip>
+```
+
+After booting the VMs with this enabled, they should be able to ping each other
+over their acquired IPv6 addresses, but connecting to each other over the
+`bootstrap6` interface that sled-agent creates will fail.
+
+2. Explicitly add routes in the Linux host for the `bootstrap6` addresses,
+specifying the virtual interface libvirt created that is used by the VMs.
+
+```
+bash% sudo ip -6 route add fdb0:5254:13:7331::1/64 dev virbr1
+bash% sudo ip -6 route add fdb0:5254:f0:acfd::1/64 dev virbr1
+```
+
+3. Once the sled-agents advance sufficiently to set up `sled6` interfaces,
+routes need to be added for them both in the Linux host and in the Helios VMs.
+Assuming two sleds with these interfaces:
+
+```
+# VM 1
+vioif0/sled6 static ok fd00:1122:3344:1::1/64
+# VM 2
+vioif0/sled6 static ok fd00:1122:3344:2::1/64
+```
+
+The Linux host needs to be told to route that subnet to the appropriate virtual
+interface:
+
+```
+bash% ip -6 route add fd00:1122:3344::1/48 dev virbr1
+```
+
+and each Helios VM needs to be told to route that subnet to the host gateway:
+
+```
+vm% pfexec route add -inet6 fd00:1122:3344::/48 $IPV6_HOST_GATEWAY_ADDR
+```
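
The routing steps in the notes above follow a regular pattern, so the commands can be generated rather than typed per VM. The sketch below only prints the commands for steps 2 and 3. The function names are hypothetical, and prefixing the step 3 host route with `sudo` is an assumption made for consistency with step 2 (the README shows that command without it).

```shell
#!/bin/sh
# Print (not run) the route commands described in the README notes.
# All function names here are hypothetical helpers for illustration.

# Step 2: one Linux host route per bootstrap6 address.
# $1 = libvirt bridge device, remaining args = bootstrap6 addresses.
bootstrap_route_cmds() {
    bridge="$1"
    shift
    for addr in "$@"; do
        echo "sudo ip -6 route add $addr/64 dev $bridge"
    done
}

# Step 3, Linux host side: route the sled subnet to the bridge.
# $1 = subnet as written in the README, $2 = bridge device.
host_sled_route_cmd() {
    echo "sudo ip -6 route add $1 dev $2"
}

# Step 3, Helios VM side: route the sled subnet to the host gateway.
# $1 = subnet, $2 = host gateway address (left as a placeholder below).
vm_sled_route_cmd() {
    echo "pfexec route add -inet6 $1 $2"
}

bootstrap_route_cmds virbr1 fdb0:5254:13:7331::1 fdb0:5254:f0:acfd::1
host_sled_route_cmd "fd00:1122:3344::1/48" virbr1
vm_sled_route_cmd "fd00:1122:3344::/48" '$IPV6_HOST_GATEWAY_ADDR'
```

The addresses and the `virbr1` device are the example values from the README; real values depend on the libvirt network and on the addresses sled-agent assigns.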

deploy/src/bin/deployment-example.toml

Lines changed: 3 additions & 1 deletion
@@ -15,7 +15,9 @@ server = "foo"
 omicron_path = "/remote/path/to/omicron"
 
 [deployment]
-servers = ["foo", "bar"]
+# which server is responsible for running the rack setup service; must
+# refer to one of the `servers` in the servers table
+rss_server = "foo"
 rack_secret_threshold = 2
 # Location where files to install will be placed before running
 # `omicron-package install`
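
The new comment says `rss_server` must name an entry in the servers table. A rough pre-flight check for that constraint might look like the sketch below. It assumes servers are declared as `[servers.NAME]` table headers, which this hunk does not show, and it uses plain text matching rather than a real TOML parser. The function name `check_rss_server` is hypothetical.

```shell
#!/bin/sh
# Sketch: verify that rss_server names a declared server.
# ASSUMPTION: servers appear as "[servers.NAME]" table headers, each alone
# on its own line; this layout is not confirmed by the diff above.
# `check_rss_server` is a hypothetical helper, not part of thing-flinger.
check_rss_server() {
    config="$1"
    # Pull the quoted value out of a line such as: rss_server = "foo"
    rss=$(sed -n 's/^rss_server *= *"\([^"]*\)".*/\1/p' "$config")
    if [ -z "$rss" ]; then
        echo "error: rss_server is not set in $config" >&2
        return 1
    fi
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if grep -qxF "[servers.$rss]" "$config"; then
        echo "ok: rss_server \"$rss\" is a declared server"
    else
        echo "error: rss_server \"$rss\" is not in the servers table" >&2
        return 1
    fi
}
```

Calling `check_rss_server deployment.toml` before `sync` would catch a mistyped server name early; a stricter check would parse the TOML properly.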
