When using docker-swarm, cassandra is erroneously bound to VIP instead of container IP #150

Closed
dvincelli opened this issue Jul 19, 2018 · 3 comments

@dvincelli

I recently hit an issue with a single-node deployment of Cassandra in a Docker Swarm stack. When using an LWT (INSERT .. IF NOT EXISTS), the statement would block and time out.

This appears in the logs when the error occurs:

DEBUG [MessagingService-Outgoing-/10.0.0.9-Small] 2018-06-22 20:52:02,656 OutboundTcpConnection.java:545 - Unable to connect to /10.0.0.9
java.net.ConnectException: Connection refused
        at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_171]
        at sun.nio.ch.Net.connect(Net.java:454) ~[na:1.8.0_171]
        at sun.nio.ch.Net.connect(Net.java:446) ~[na:1.8.0_171]
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) ~[na:1.8.0_171]
        at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:146) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:132) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:433) [apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:262) [apache-cassandra-3.11.2.jar:3.11.2]

After a bit of diagnosis, I concluded that Cassandra was trying to coordinate the LWT using the wrong IP address: it was using the VIP, which sits on the loopback interface, rather than the container IP. This config is auto-generated by the docker-entrypoint.sh script, which has a comment indicating that _ip_address should pick the container IP.

The config file contained

root@87a29d37bdd9:/etc/cassandra# grep 10.0.0.9 cassandra.yaml
          - seeds: "10.0.0.9"
listen_address: 10.0.0.9
broadcast_address: 10.0.0.9
broadcast_rpc_address: 10.0.0.9

My interface list was

root@87a29d37bdd9:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.0.0.9/32 brd 10.0.0.9 scope global lo
       valid_lft forever preferred_lft forever
12: eth0@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 02:42:0a:00:00:0a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.0.10/24 brd 10.0.0.255 scope global eth0
       valid_lft forever preferred_lft forever
14: eth1@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:12:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth1
       valid_lft forever preferred_lft forever
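
To make the problem concrete, here is a minimal sketch that runs the entrypoint's filter, as I read it from the script, against a trimmed copy of the listing above. The function names `sample_ip_output` and `first_non_localhost_ip` are made up for illustration and are not part of docker-entrypoint.sh.

```shell
#!/bin/sh
# Trimmed copy of the `ip address` output shown above (inet lines only).
sample_ip_output() {
	cat <<'EOF'
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    inet 127.0.0.1/8 scope host lo
    inet 10.0.0.9/32 brd 10.0.0.9 scope global lo
12: eth0@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    inet 10.0.0.10/24 brd 10.0.0.255 scope global eth0
14: eth1@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth1
EOF
}

# Assumed reconstruction of the unpatched filter: print the first inet
# address that is not in 127.0.0.0/8, regardless of which interface owns it.
first_non_localhost_ip() {
	awk '
		$1 == "inet" && $2 !~ /^127[.]/ {
			gsub(/\/.+$/, "", $2)
			print $2
			exit
		}
	'
}

sample_ip_output | first_non_localhost_ip
```

On this input the filter stops at the 10.0.0.9 entry on lo, which matches the addresses that ended up in cassandra.yaml.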

For this case, I addressed the issue with the following change to docker-entrypoint.sh:

diff --git a/docker-entrypoint.sh b/docker-entrypoint.sh
index 871f7f4..77e6212 100644
--- a/docker-entrypoint.sh
+++ b/docker-entrypoint.sh
@@ -17,7 +17,7 @@ _ip_address() {
        # scrape the first non-localhost IP address of the container
        # in Swarm Mode, we often get two IPs -- the container IP, and the (shared) VIP, and the container IP should always be first
        ip address | awk '
-               $1 == "inet" && $2 !~ /^127[.]/ {
+               $1 == "inet" && $2 !~ /^127[.]/ && $NF != "lo" {
                        gsub(/\/.+$/, "", $2)
                        print $2
                        exit
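
For anyone who wants to try the changed condition without rebuilding the image, here is a rough sketch that applies it to captured `ip address` lines on stdin. `pick_container_ip` is a hypothetical helper name, and the closing of the awk program is my own completion of the hunk above, so treat it as an approximation of the script rather than a copy.

```shell
#!/bin/sh
# Hypothetical helper: same condition as the patched _ip_address, but it
# reads `ip address` output from stdin so it can be run on saved text.
pick_container_ip() {
	awk '
		# inet line, not 127.x.x.x, and the last field (interface name) is not lo
		$1 == "inet" && $2 !~ /^127[.]/ && $NF != "lo" {
			gsub(/\/.+$/, "", $2)  # strip the /prefix-length suffix
			print $2
			exit
		}
	'
}

# inet lines taken from the interface list earlier in this report
printf '%s\n' \
	'    inet 127.0.0.1/8 scope host lo' \
	'    inet 10.0.0.9/32 brd 10.0.0.9 scope global lo' \
	'    inet 10.0.0.10/24 brd 10.0.0.255 scope global eth0' \
	'    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth1' \
	| pick_container_ip
```

With the VIP on lo skipped, the first match is the eth0 address 10.0.0.10. Note that the check relies on the interface name being the last field of each inet line, which holds for the output shown here.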

This makes things look a lot saner:

root@8a3a665ee17f:/# grep _address: /etc/cassandra/cassandra.yaml
listen_address: 10.0.1.6
broadcast_address: 10.0.1.6
# listen_on_broadcast_address: false
rpc_address: 0.0.0.0
broadcast_rpc_address: 10.0.1.6
root@8a3a665ee17f:/# ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.0.1.3/32 brd 10.0.1.3 scope global lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1
    link/ipip 0.0.0.0 brd 0.0.0.0
3: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1
    link/tunnel6 :: brd ::
71: eth0@if72: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 02:42:0a:00:01:06 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.1.6/24 brd 10.0.1.255 scope global eth0
       valid_lft forever preferred_lft forever
93: eth1@if94: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:18:00:12 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet 172.24.0.18/16 brd 172.24.255.255 scope global eth1
       valid_lft forever preferred_lft forever
@tianon
Member

tianon commented Jul 20, 2018

Gah, it's changed again 😞 😕 ❤️

Ignoring lo entirely makes a ton of sense -- do you want to make a PR to update all the versions/variants with that change, or would you prefer I carry your change from here? 🙏

@dvincelli
Author

Sure, I'll submit a PR as soon as possible.

@wglambert

Fixed with #151
