Closed
Current situation
- A GitHub Actions workflow is triggered manually.
- The binary is built.
- The binary is sent over SSH to a VPS on Vultr.
- We use systemctl to stop atomic-server.
- We create an export.
- We use systemctl to start atomic-server.
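The steps above could be sketched as a manually triggered workflow like this. This is a rough sketch only: the hostname, user, paths, SSH setup, and the export invocation are placeholders, not the actual workflow.

```yaml
# Hypothetical workflow mirroring the steps above.
# Hostname, user, paths, and the export command are assumptions.
name: deploy
on:
  workflow_dispatch: {} # triggered manually from the Actions tab

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build binary
        run: cargo build --release
      - name: Copy binary to the VPS
        run: scp target/release/atomic-server deploy@vps.example.com:/tmp/atomic-server
      - name: Swap the binary between stop and start
        run: |
          ssh deploy@vps.example.com '
            sudo systemctl stop atomic-server
            atomic-server export > /var/backups/atomic-export.json
            sudo mv /tmp/atomic-server /usr/local/bin/atomic-server
            sudo systemctl start atomic-server
          '
```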
What I like about this approach
- It's pretty simple to run: just two clicks from GitHub.
- It gives me status updates and error notifications.
- It's pretty standard, so it resembles what many other devs do. That means I catch problems that others may encounter, which is a good thing.
- No vendor lock-in: I don't rely on any AWS / Azure / Google services.
- Lots of control over hardware: I can move to a local machine if needed with few changes.
What went wrong
AtomicData.dev was just down for longer than I'd like to admit. Let's evaluate what went wrong, and how to tackle the problems.
- I replaced the binary on my VPS, which made it harder to revert to a backup. I've since fixed that by creating backups in the CI.
- A change upstream updated OpenSSL in Rust, but not on my VPS. I'm still not sure where this came from. Maybe I should pin versions for GitHub Actions and Ubuntu images.
- I don't have a staging machine / environment. I should have one. It should resemble production as closely as possible (although it could be more resource-constrained).
- My built binary wasn't tested before it was deployed. I should have used a pre-tested Docker image, designed to run on the same OS. Ideally I'd run at least some tests on staging.
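One way to make the binary swap reversible is to keep a timestamped copy of the current binary before overwriting it. A minimal sketch; the function and paths are hypothetical, and the demo uses a temp directory instead of the real install path:

```shell
#!/usr/bin/env bash
# Sketch of a rollback-friendly deploy step: copy the currently installed
# binary aside before the new one is moved into place.
set -u

backup_binary() {
  local binary="$1" backup_dir="$2"
  mkdir -p "$backup_dir"
  if [ -f "$binary" ]; then
    # Timestamped copy, so a bad deploy can be rolled back with a plain cp.
    cp "$binary" "$backup_dir/$(basename "$binary")-$(date +%Y%m%d%H%M%S)"
  fi
}

# Demo with temporary paths; on a real server this would be e.g.
# /usr/local/bin/atomic-server and a persistent backup directory.
demo_dir=$(mktemp -d)
printf 'old-binary\n' > "$demo_dir/atomic-server"
backup_binary "$demo_dir/atomic-server" "$demo_dir/backups"
ls "$demo_dir/backups"
```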
Things that can be improved
- Use (tested) images instead of binaries, to prevent issues like the OpenSSL mismatch above.
- Use tools that improve observability: think Grafana / Prometheus / Jaeger (Add metrics / Prometheus support #420). I'd like to run these on the same machine, to save costs.
- Cattle vs. pets: in the future, I'd like to not be dependent on single machines. For now, though, I'm focusing on a cost-effective single-node setup. Performance is currently excellent, so I don't expect to need multi-node scaling for performance reasons anytime soon.
- Performance regression tests.
- Set up staging (Staging environment #588).
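Testing the image before it ships could be as simple as a smoke-test step in CI. A sketch, assuming the published image name and a default port of 9883 (both should be verified against the actual image and docs):

```yaml
# Hypothetical CI step: boot the freshly built image and poll it before deploying.
- name: Smoke-test the image
  run: |
    docker run -d --name smoke -p 9883:9883 joepmeneer/atomic-server:latest
    # Fail the job if the server doesn't answer within ~30 seconds.
    for i in $(seq 1 30); do
      curl -fsS http://localhost:9883/ > /dev/null && exit 0
      sleep 1
    done
    docker logs smoke
    exit 1
```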
What tech to use for deployments
How do I approach these different goals? What tools could help me?
- Docker. I'm pretty sure the answer will involve running images instead of running directly on Ubuntu.
- Docker Compose. I'm familiar with it, and it seems like a decent pick for a single-node setup, but I suppose it doesn't really scale or offer much flexibility. I'm not sure how easy it is to deploy.
- Kubernetes. Definitely powerful, but I'm not sure I need it. As of now, everything runs on one node.
- Terraform / Pulumi. Allows for a lot of configuration and can deploy to pretty much anything, but Pulumi will probably require Kubernetes.
- Earthly: a build tool that uses Docker.
- sup: runs a command on multiple machines.
- monit: monitors a single Unix system; mmonit handles multiple.
- SeaweedFS: a multi-node filesystem.
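For the single-node Docker Compose direction with observability on the same machine, a minimal sketch might look like this. Image names, tags, ports, and volume paths are assumptions; pin tested tags rather than `latest` in production:

```yaml
# Hypothetical docker-compose.yml; verify image names and paths before use.
services:
  atomic-server:
    image: joepmeneer/atomic-server:latest # pin a tested tag in production
    restart: unless-stopped
    ports:
      - "80:80"
    volumes:
      - atomic-data:/atomic-storage

  prometheus:
    image: prom/prometheus:latest
    restart: unless-stopped
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro

  grafana:
    image: grafana/grafana:latest
    restart: unless-stopped
    ports:
      - "3000:3000"

volumes:
  atomic-data:
```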