CASSANDRA_SEEDS in Swarm Mode #94
According to the documentation, seeds have to be IP addresses. But you only need to give each new node one or more IPs of a node already in the cluster. What I would do to scale is start three seed nodes that point to each other (maybe a bash script that resolves their DNS names to the set of IP addresses on startup and fills the seeds environment variable). Then to scale the rest I would point them at those IP addresses within that overlay network (or use the same script) and schedule the rest to be constrained to nodes that aren't running the seed nodes. I would not recommend using a load-balanced IP for the internal connections between Cassandra nodes; the seeds list should be the IP addresses that the nodes advertise via broadcast address. For more info: http://stackoverflow.com/a/32183684
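A minimal sketch of the kind of startup script described above, assuming the three seed services are reachable as seed1, seed2, seed3 on the overlay network and that the stock image entrypoint lives at /docker-entrypoint.sh (all of these names are assumptions, not something from the thread):

```bash
#!/bin/bash
# Sketch: resolve known seed-node hostnames to IPs and fill CASSANDRA_SEEDS
# before handing off to the stock entrypoint.
SEED_IPS=""
for host in seed1 seed2 seed3; do
  # getent prints "ADDRESS  NAME"; keep only the address of the first answer
  ip="$(getent hosts "$host" | awk '{print $1}' | head -n 1)"
  [ -n "$ip" ] && SEED_IPS="${SEED_IPS:+$SEED_IPS,}$ip"
done
export CASSANDRA_SEEDS="$SEED_IPS"
exec /docker-entrypoint.sh cassandra -f
```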
InetAddress.getByName() returns the IP address for a hostname. If an IP is passed in, "only the validity of the address format is checked". So a Docker Swarm Mode service name like cassandra should work. I really feel like this would be the correct way to approach this problem. Using a service name like cassandra that resolves to a load-balancing IP would make for an elegant seed. As long as we retry if the seed is pointing to itself, this would be great!
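A rough sketch of that retry, assuming the service is named cassandra and uses endpoint_mode: dnsrr so the name resolves to task IPs rather than a single virtual IP (both assumptions):

```bash
#!/bin/bash
# Wait until the service name resolves to an address other than our own;
# if nothing else ever appears, assume we are the first node and seed ourselves.
MY_IP="$(hostname -i | awk '{print $1}')"
SEED=""
for attempt in $(seq 1 30); do
  SEED="$(getent hosts cassandra | awk '{print $1}' | grep -v "^${MY_IP}$" | head -n 1)"
  [ -n "$SEED" ] && break
  sleep 5
done
export CASSANDRA_SEEDS="${SEED:-$MY_IP}"   # first node falls back to itself
```

The first-node fallback here is exactly the race condition the next comment worries about.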
I guess all their documentation about seeds needing to be IP addresses is wrong (that same function is used even back in 1.2). As nice as using a load-balanced hostname for seeds would be, it is too prone to race conditions. How does the first node know when to stop resolving that hostname while trying to get an IP address that is not itself? What if you start 3 at once on 3 different machines and they all decide that they are the first node and thus their own seed? This would either require a change upstream in Cassandra itself or something in the entrypoint script. If we implement something for this in the entrypoint script, would we resolve the hostname and only pass an IP to Cassandra? Having an IP that is not us does not even guarantee that the other node is a valid seed node. This really seems better suited for something with full service discovery, so that each node can register whether it has just started or is already part of the cluster. I know Consul is often mentioned for this purpose.
You can get a list of nodes from tasks.<service_name>. You would then need to remove your own IP from that list. I have attempted to add to the scripting so that you can do that here https://github.com/amey-sam/cassandra in the same fashion that wurstmeister/kafka does this...
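The general shape of that scripting, assuming the service is called cassandra (the service name and paths are assumptions; see the linked repo for the actual version):

```bash
#!/bin/bash
# Build the seed list from the other tasks of the service.
# tasks.cassandra resolves to the IPs of every container in the service;
# we drop our own address so a joining node does not seed itself.
MY_IP="$(hostname -i | awk '{print $1}')"
OTHERS="$(getent hosts tasks.cassandra | awk '{print $1}' | grep -v "^${MY_IP}$" | paste -sd, -)"
# Very first container: no peers yet, so we have to be our own seed.
export CASSANDRA_SEEDS="${OTHERS:-$MY_IP}"
```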
The problem with the above is that self.ip isn't removed from seeds, so auto-bootstrapping does not work for added nodes (i.e. failover and scaling :( ). Also, I am not sure what happens if you remove 'hostname -i' (or whatever is appropriate) from the seed IP list on nodes when bootstrapping a cluster; presumably there wouldn't be any seeds? At the end of the day I think the Cassandra model for doing this is a bit broken. Why have special seed nodes which become a point of failure? That is, failover and scaling of those nodes is not automatic, which is a bit of a let-down given the hype about how fantastic and easy to use Cassandra is supposed to be... (read: it's great, except for dev ops; buy our support... please).
@Richard-Mathie thx for your brilliant input.
A sample compose file for docker swarm would be:
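The original file didn't survive in this thread; a minimal sketch of what such a stack file could look like (the image name, network name, and cluster name are assumptions, and the seed resolution is assumed to live in a wrapper entrypoint like the shell sketches above):

```yaml
version: "3.7"
networks:
  cassandra-net:
    driver: overlay
services:
  cassandra:
    # hypothetical image built FROM the official cassandra image, with an
    # entrypoint wrapper that resolves tasks.cassandra into CASSANDRA_SEEDS
    image: my-registry/cassandra-swarm:latest
    networks:
      - cassandra-net
    environment:
      CASSANDRA_CLUSTER_NAME: swarm-cluster
    deploy:
      replicas: 1
      endpoint_mode: dnsrr   # service name resolves to task IPs, not a VIP
```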
I just start my cluster with one replica:
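The exact command isn't shown in the comment; presumably something along these lines (file and stack names are guesses):

```bash
# Deploy the stack; the sketch above starts with replicas: 1 so the
# first node can bootstrap alone.
docker stack deploy --compose-file docker-compose.yml cassandra
```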
Then you can scale:
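Again the command itself is missing; a plausible version, assuming the swarm's usual <stack>_<service> naming:

```bash
# Add nodes one at a time, checking nodetool status between steps.
docker service scale cassandra_cassandra=2
docker exec -it "$(docker ps -q -f name=cassandra_cassandra | head -n 1)" nodetool status
docker service scale cassandra_cassandra=3
```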
I did not test heavy scaling all at once, just one node at a time after checking nodetool status. I hope the official image will include swarm support, even with downscaling.
@flybyray Thx for pointing out "tasks.service_name" to get the container IPs inside a service, I wasn't aware of that.
Here's how I dealt with dynamic swarm Cassandra seeds: I have a custom boot-node.sh which is used when Cassandra boots.
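The script itself isn't included in the thread; a plausible reconstruction of such a boot-node.sh, reusing the tasks.<service> trick from earlier (service name, paths, and fallback behaviour are assumptions):

```bash
#!/bin/bash
# boot-node.sh (sketch): compute seeds and broadcast address at container
# start, then hand off to the stock image entrypoint.
set -e
MY_IP="$(hostname -i | awk '{print $1}')"
PEERS="$(getent hosts tasks.cassandra | awk '{print $1}' | grep -v "^${MY_IP}$" | paste -sd, -)"
export CASSANDRA_SEEDS="${PEERS:-$MY_IP}"        # first node seeds itself
export CASSANDRA_BROADCAST_ADDRESS="$MY_IP"
exec /docker-entrypoint.sh cassandra -f
```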
Also, you will have to deal with containers scaling down and up: when they scale up with the same IP as before, they exit with an error, because they need -Dcassandra.replace_address while booting to take back their seat inside the cluster. I don't know why Cassandra would force a manual intervention here. However, I found a workaround by adding some code at the end of cassandra-env.sh.
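The exact code isn't shown; one way such a workaround could look when appended to cassandra-env.sh (the empty-data-directory check and the choice of the replace_address_first_boot property are assumptions, not necessarily what the commenter did):

```bash
# Appended to cassandra-env.sh (sketch): if this container starts with an
# empty data directory but its IP is already known to the cluster, ask
# Cassandra to replace that old entry instead of refusing to join.
if [ ! -d /var/lib/cassandra/data/system ]; then
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=$(hostname -i | awk '{print $1}')"
fi
```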
I still have one scenario to cover: when containers fail and rejoin the cluster with another IP, they leave a dead row inside the database. I need something to clean up the old dead container IPs from system.peers. Has anyone written something for this? If not, I'll probably try some stuff soon. Cheers
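Nothing official, but a rough sketch of that cleanup driven by nodetool status (the awk column index and the idea of running it from a live node are assumptions; verify against your nodetool output before trusting it):

```bash
#!/bin/bash
# Remove gossip/system.peers entries for nodes that nodetool reports as
# Down ("DN"). Run from any live node.
nodetool status | awk '/^DN/ {print $7}' | while read -r host_id; do
  echo "removing dead node ${host_id}"
  nodetool removenode "${host_id}"
done
```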
@JnMik credits go to @Richard-Mathie; he uses tasks.<service_name>.
It's documented in version 1.13 of the Docker docs: https://docs.docker.com/v1.13/engine/swarm/networking/. I don't know why that documentation was removed in later versions; maybe a merge result from https://docs.docker.com/hackathon/.
Put this all together here: https://github.com/amey-sam/cassandra/tree/auto_scale. Run:
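The exact commands were lost from this comment; presumably something like (stack and service names are guesses):

```bash
docker stack deploy --compose-file docker-compose.yml cassandra
docker service scale cassandra_cassandra=2
docker service scale cassandra_cassandra=3
```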
etc. It doesn't seem to like it if you scale more than one node at a time (Cassandra complains if you try to join while other nodes are bootstrapping). Though you will eventually get to a stable state, there will be a lot of containers failing and restarting. Also, the cluster does not seem to like it if you lose the first seed node. Thanks @flybyray and @JnMik for the help. If you want to be added to that repo, give us a shout.
FYI, when downscaling the cluster you may have to call removenode (http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRemoveNode.html); it's pretty manual. I guess when it matters, you should just pay DataStax or Instaclustr to manage your DB as a service and forget about all this pain.
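For reference, the manual steps look roughly like this (the force variant is only needed if the removal gets stuck):

```bash
# Find the Host ID of the departed node (listed with status "DN"),
# then remove it from the ring.
nodetool status
nodetool removenode <host-id-of-the-downed-node>
# If the removal hangs:
nodetool removenode force
```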
You got a typo in https://github.com/amey-sam/cassandra/blob/auto_scale/docker-entrypoint.sh
Also, not sure this is right:
What should CASSANDRA_NAME contain?
I tried your image using a simple stack like this:
and deploying it inside a swarm. Something's not working properly: it keeps respawning new containers and the cluster never comes up properly. Did you try that?
What about the case when I reboot the Docker host and swarm recreates the container with a different IP address? nodetool shows the new Cassandra node, and I remove the old Cassandra node manually. I tried to automate it with -Dcassandra.replace_address=OLD_IP_ADDRESS, but Cassandra says: "Cannot replace address with a node that is already bootstrapped". In this case I mean that the container was recreated on a new host, the previous data volume no longer exists, and Cassandra generates a new hostId. If the data volume exists and the hostId has not changed, changing the IP address is not a problem without replace_address. Please help!
You would have to set that somewhere; maybe you can set the hostname of the service with docker-compose? I don't know, as I stopped using it.
@flybyray Hi, I tried your yml file, and it successfully created two containers on different instances. However, I couldn't manage to let them see each other when I check with nodetool status.
There is an issue with CASSANDRA_LISTEN_ADDRESS: port 7000 cannot bind to 0.0.0.0, so I am having issues connecting to Cassandra on 7000 from another container in an overlay network. I will open an issue with steps to reproduce.
Closing given that there are now several workarounds documented in this thread, in addition to this being a fundamental issue with Cassandra itself (not an issue with how we're packaging it) -- setting up a cluster automatically is always going to be somewhat fragile, and is out of scope for what this image provides (which is an attempt at providing a faithful "upstream" Cassandra experience -- warts and all). Building something like that on top of this image remains the best solution we can offer. 👍
I would like to bring up a stack by running:
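(The exact command isn't preserved in this thread; presumably something like the following, with the stack name being a guess.)

```bash
docker stack deploy --compose-file docker-compose-stack.yaml cassandra
```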
docker-compose-stack.yaml contents:
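The file contents also didn't survive; based on the question below, it was presumably something along these lines (every detail here is a guess):

```yaml
version: "3"
networks:
  cassandra-net:
    driver: overlay
services:
  cassandra:
    image: cassandra:3.11
    networks:
      - cassandra-net
    environment:
      # the values tried below: a fixed overlay IP, or the service name itself
      CASSANDRA_SEEDS: cassandra
    deploy:
      replicas: 3
```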
However, the different Cassandra nodes don't know about each other unless I manually specify IPs in CASSANDRA_SEEDS that might stop being used, which is not ideal. How would I use a load-balanced IP for CASSANDRA_SEEDS? I tried CASSANDRA_SEEDS=10.0.0.2 and CASSANDRA_SEEDS=cassandra, but neither worked. Also, how should I handle the situation where the load-balanced IP ends up pointing to the same Cassandra node instead of another one?