openshift 3.11 storageos networking issue
I've created an openshift 3.11 3 node cluster, 2 of which are compute
nodes. I've installed storageos on this cluster. One of the compute
nodes seems fine with the storageos installation, however the 2nd
compute node can't reach the 1st node. It appears that the error
is routing related.
the 2nd node will not route to the 1st node it appears.
[root@cortado-o1 standard]# oc get pod -n storageos
NAME READY STATUS RESTARTS AGE
storageos-47qgc 1/1 Running 0 6m
storageos-6bqqp 0/1 Running 3 7m
[root@cortado-o2 ~]# netstat -na | grep 5705
tcp6 0 0 :::5705
[root@cortado-o3 ~]# netstat -na | grep 5705
tcp 0 0 192.168.0.101:43588 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43548 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43522 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43458 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43628 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43602 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43562 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43502 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43476 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43412 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43430 192.168.0.101:5705 TIME_WAIT
tcp6 0 0 :::5705 :::* LISTEN
[root@cortado-o3 ~]# !nc
nc 192.168.0.102 5705
Ncat: No route to host.
[root@cortado-o3 ~]# hostname --ip-address
192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="Get http://192.168.0.102:5705/v1/members: dial tcp 192.168.0.102:5705: connect: no route to host" module=cp
time="2018-11-13T04:24:38Z" level=info msg="not first cluster node, joining first node" action=create address=192.168.0.101 category=etcd host=cortado-o3 module=cp target=192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="503 Service Unavailable" module=cp
time="2018-11-13T04:24:38Z" level=info msg="retrying cluster join in 5 seconds..." action=create category=etcd module=cp
any suggestions? many thanks.
openshift
add a comment |
I've created an openshift 3.11 3 node cluster, 2 of which are compute
nodes. I've installed storageos on this cluster. One of the compute
nodes seems fine with the storageos installation, however the 2nd
compute node can't reach the 1st node. It appears that the error
is routing related.
the 2nd node will not route to the 1st node it appears.
[root@cortado-o1 standard]# oc get pod -n storageos
NAME READY STATUS RESTARTS AGE
storageos-47qgc 1/1 Running 0 6m
storageos-6bqqp 0/1 Running 3 7m
[root@cortado-o2 ~]# netstat -na | grep 5705
tcp6 0 0 :::5705
[root@cortado-o3 ~]# netstat -na | grep 5705
tcp 0 0 192.168.0.101:43588 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43548 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43522 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43458 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43628 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43602 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43562 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43502 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43476 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43412 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43430 192.168.0.101:5705 TIME_WAIT
tcp6 0 0 :::5705 :::* LISTEN
[root@cortado-o3 ~]# !nc
nc 192.168.0.102 5705
Ncat: No route to host.
[root@cortado-o3 ~]# hostname --ip-address
192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="Get http://192.168.0.102:5705/v1/members: dial tcp 192.168.0.102:5705: connect: no route to host" module=cp
time="2018-11-13T04:24:38Z" level=info msg="not first cluster node, joining first node" action=create address=192.168.0.101 category=etcd host=cortado-o3 module=cp target=192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="503 Service Unavailable" module=cp
time="2018-11-13T04:24:38Z" level=info msg="retrying cluster join in 5 seconds..." action=create category=etcd module=cp
any suggestions? many thanks.
openshift
add a comment |
I've created an openshift 3.11 3 node cluster, 2 of which are compute
nodes. I've installed storageos on this cluster. One of the compute
nodes seems fine with the storageos installation, however the 2nd
compute node can't reach the 1st node. It appears that the error
is routing related.
the 2nd node will not route to the 1st node it appears.
[root@cortado-o1 standard]# oc get pod -n storageos
NAME READY STATUS RESTARTS AGE
storageos-47qgc 1/1 Running 0 6m
storageos-6bqqp 0/1 Running 3 7m
[root@cortado-o2 ~]# netstat -na | grep 5705
tcp6 0 0 :::5705
[root@cortado-o3 ~]# netstat -na | grep 5705
tcp 0 0 192.168.0.101:43588 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43548 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43522 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43458 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43628 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43602 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43562 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43502 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43476 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43412 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43430 192.168.0.101:5705 TIME_WAIT
tcp6 0 0 :::5705 :::* LISTEN
[root@cortado-o3 ~]# !nc
nc 192.168.0.102 5705
Ncat: No route to host.
[root@cortado-o3 ~]# hostname --ip-address
192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="Get http://192.168.0.102:5705/v1/members: dial tcp 192.168.0.102:5705: connect: no route to host" module=cp
time="2018-11-13T04:24:38Z" level=info msg="not first cluster node, joining first node" action=create address=192.168.0.101 category=etcd host=cortado-o3 module=cp target=192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="503 Service Unavailable" module=cp
time="2018-11-13T04:24:38Z" level=info msg="retrying cluster join in 5 seconds..." action=create category=etcd module=cp
any suggestions? many thanks.
openshift
I've created an openshift 3.11 3 node cluster, 2 of which are compute
nodes. I've installed storageos on this cluster. One of the compute
nodes seems fine with the storageos installation, however the 2nd
compute node can't reach the 1st node. It appears that the error
is routing related.
the 2nd node will not route to the 1st node it appears.
[root@cortado-o1 standard]# oc get pod -n storageos
NAME READY STATUS RESTARTS AGE
storageos-47qgc 1/1 Running 0 6m
storageos-6bqqp 0/1 Running 3 7m
[root@cortado-o2 ~]# netstat -na | grep 5705
tcp6 0 0 :::5705
[root@cortado-o3 ~]# netstat -na | grep 5705
tcp 0 0 192.168.0.101:43588 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43548 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43522 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43458 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43628 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43602 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43562 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43502 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43476 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43412 192.168.0.101:5705 TIME_WAIT
tcp 0 0 192.168.0.101:43430 192.168.0.101:5705 TIME_WAIT
tcp6 0 0 :::5705 :::* LISTEN
[root@cortado-o3 ~]# !nc
nc 192.168.0.102 5705
Ncat: No route to host.
[root@cortado-o3 ~]# hostname --ip-address
192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="Get http://192.168.0.102:5705/v1/members: dial tcp 192.168.0.102:5705: connect: no route to host" module=cp
time="2018-11-13T04:24:38Z" level=info msg="not first cluster node, joining first node" action=create address=192.168.0.101 category=etcd host=cortado-o3 module=cp target=192.168.0.101
time="2018-11-13T04:24:38Z" level=error msg="failed to join existing cluster" action=create category=etcd endpoint="192.168.0.102,192.168.0.101" error="503 Service Unavailable" module=cp
time="2018-11-13T04:24:38Z" level=info msg="retrying cluster join in 5 seconds..." action=create category=etcd module=cp
any suggestions? many thanks.
openshift
openshift
edited Nov 13 at 5:26
Robert
2,14062435
2,14062435
asked Nov 13 at 4:27
jeff mccormick
83
83
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
I can see on your netstat output that StorageOS is bound to the port, not that they can communicate. In fact the Ncat shows that there is no route to host, so they can't connect. StorageOS needs to be able to communicate among its nodes.
The StorageOS docs have a reference about the prerequisites of the ports and how to open them. https://docs.storageos.com/docs/prerequisites/firewalls
It depends on your OpenShift installation if you use ufw, firewalld or straight ip tables.
For ufw try this:
ufw default allow outgoing
ufw allow 5701:5711/tcp
ufw allow 5711/udp
For firewalld try this:
firewall-cmd --permanent --new-service=storageos
firewall-cmd --permanent --service=storageos --add-port=5700-5800/tcp
firewall-cmd --add-service=storageos --zone=public --permanent
firewall-cmd --reload
For straight iptables:
# Inbound traffic
iptables -I INPUT -i lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I INPUT -m state --state ESTABLISHED,RELATED -m comment --comment 'Permit established traffic' -j ACCEPT
iptables -A INPUT -p tcp --dport 5701:5711 -m comment --comment 'StorageOS' -j ACCEPT
iptables -A INPUT -p udp --dport 5711 -m comment --comment 'StorageOS' -j ACCEPT
# Outbound traffic
iptables -I OUTPUT -o lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I OUTPUT -d 0.0.0.0/0 -m comment --comment 'Permit outbound traffic' -j ACCEPT
Check also the troubleshooting page of storageos for this particular issue.
https://docs.storageos.com/docs/platforms/openshift/troubleshoot/install#peer-discovery---networking
In addition, less than 3 node cluster is not supported. You can have 1 node for testing or 3+. But having 2 nodes makes impossible to ensure quorum in a distributed environment unless you use StorageOS pointing the kv store to a external etcd.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53273806%2fopenshift-3-11-storageos-networking-issue%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
I can see on your netstat output that StorageOS is bound to the port, not that they can communicate. In fact the Ncat shows that there is no route to host, so they can't connect. StorageOS needs to be able to communicate among its nodes.
The StorageOS docs have a reference about the prerequisites of the ports and how to open them. https://docs.storageos.com/docs/prerequisites/firewalls
It depends on your OpenShift installation if you use ufw, firewalld or straight ip tables.
For ufw try this:
ufw default allow outgoing
ufw allow 5701:5711/tcp
ufw allow 5711/udp
For firewalld try this:
firewall-cmd --permanent --new-service=storageos
firewall-cmd --permanent --service=storageos --add-port=5700-5800/tcp
firewall-cmd --add-service=storageos --zone=public --permanent
firewall-cmd --reload
For straight iptables:
# Inbound traffic
iptables -I INPUT -i lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I INPUT -m state --state ESTABLISHED,RELATED -m comment --comment 'Permit established traffic' -j ACCEPT
iptables -A INPUT -p tcp --dport 5701:5711 -m comment --comment 'StorageOS' -j ACCEPT
iptables -A INPUT -p udp --dport 5711 -m comment --comment 'StorageOS' -j ACCEPT
# Outbound traffic
iptables -I OUTPUT -o lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I OUTPUT -d 0.0.0.0/0 -m comment --comment 'Permit outbound traffic' -j ACCEPT
Check also the troubleshooting page of storageos for this particular issue.
https://docs.storageos.com/docs/platforms/openshift/troubleshoot/install#peer-discovery---networking
In addition, less than 3 node cluster is not supported. You can have 1 node for testing or 3+. But having 2 nodes makes impossible to ensure quorum in a distributed environment unless you use StorageOS pointing the kv store to a external etcd.
add a comment |
I can see on your netstat output that StorageOS is bound to the port, not that they can communicate. In fact the Ncat shows that there is no route to host, so they can't connect. StorageOS needs to be able to communicate among its nodes.
The StorageOS docs have a reference about the prerequisites of the ports and how to open them. https://docs.storageos.com/docs/prerequisites/firewalls
It depends on your OpenShift installation if you use ufw, firewalld or straight ip tables.
For ufw try this:
ufw default allow outgoing
ufw allow 5701:5711/tcp
ufw allow 5711/udp
For firewalld try this:
firewall-cmd --permanent --new-service=storageos
firewall-cmd --permanent --service=storageos --add-port=5700-5800/tcp
firewall-cmd --add-service=storageos --zone=public --permanent
firewall-cmd --reload
For straight iptables:
# Inbound traffic
iptables -I INPUT -i lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I INPUT -m state --state ESTABLISHED,RELATED -m comment --comment 'Permit established traffic' -j ACCEPT
iptables -A INPUT -p tcp --dport 5701:5711 -m comment --comment 'StorageOS' -j ACCEPT
iptables -A INPUT -p udp --dport 5711 -m comment --comment 'StorageOS' -j ACCEPT
# Outbound traffic
iptables -I OUTPUT -o lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I OUTPUT -d 0.0.0.0/0 -m comment --comment 'Permit outbound traffic' -j ACCEPT
Check also the troubleshooting page of storageos for this particular issue.
https://docs.storageos.com/docs/platforms/openshift/troubleshoot/install#peer-discovery---networking
In addition, less than 3 node cluster is not supported. You can have 1 node for testing or 3+. But having 2 nodes makes impossible to ensure quorum in a distributed environment unless you use StorageOS pointing the kv store to a external etcd.
add a comment |
I can see on your netstat output that StorageOS is bound to the port, not that they can communicate. In fact the Ncat shows that there is no route to host, so they can't connect. StorageOS needs to be able to communicate among its nodes.
The StorageOS docs have a reference about the prerequisites of the ports and how to open them. https://docs.storageos.com/docs/prerequisites/firewalls
It depends on your OpenShift installation if you use ufw, firewalld or straight ip tables.
For ufw try this:
ufw default allow outgoing
ufw allow 5701:5711/tcp
ufw allow 5711/udp
For firewalld try this:
firewall-cmd --permanent --new-service=storageos
firewall-cmd --permanent --service=storageos --add-port=5700-5800/tcp
firewall-cmd --add-service=storageos --zone=public --permanent
firewall-cmd --reload
For straight iptables:
# Inbound traffic
iptables -I INPUT -i lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I INPUT -m state --state ESTABLISHED,RELATED -m comment --comment 'Permit established traffic' -j ACCEPT
iptables -A INPUT -p tcp --dport 5701:5711 -m comment --comment 'StorageOS' -j ACCEPT
iptables -A INPUT -p udp --dport 5711 -m comment --comment 'StorageOS' -j ACCEPT
# Outbound traffic
iptables -I OUTPUT -o lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I OUTPUT -d 0.0.0.0/0 -m comment --comment 'Permit outbound traffic' -j ACCEPT
Check also the troubleshooting page of storageos for this particular issue.
https://docs.storageos.com/docs/platforms/openshift/troubleshoot/install#peer-discovery---networking
In addition, less than 3 node cluster is not supported. You can have 1 node for testing or 3+. But having 2 nodes makes impossible to ensure quorum in a distributed environment unless you use StorageOS pointing the kv store to a external etcd.
I can see on your netstat output that StorageOS is bound to the port, not that they can communicate. In fact the Ncat shows that there is no route to host, so they can't connect. StorageOS needs to be able to communicate among its nodes.
The StorageOS docs have a reference about the prerequisites of the ports and how to open them. https://docs.storageos.com/docs/prerequisites/firewalls
It depends on your OpenShift installation if you use ufw, firewalld or straight ip tables.
For ufw try this:
ufw default allow outgoing
ufw allow 5701:5711/tcp
ufw allow 5711/udp
For firewalld try this:
firewall-cmd --permanent --new-service=storageos
firewall-cmd --permanent --service=storageos --add-port=5700-5800/tcp
firewall-cmd --add-service=storageos --zone=public --permanent
firewall-cmd --reload
For straight iptables:
# Inbound traffic
iptables -I INPUT -i lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I INPUT -m state --state ESTABLISHED,RELATED -m comment --comment 'Permit established traffic' -j ACCEPT
iptables -A INPUT -p tcp --dport 5701:5711 -m comment --comment 'StorageOS' -j ACCEPT
iptables -A INPUT -p udp --dport 5711 -m comment --comment 'StorageOS' -j ACCEPT
# Outbound traffic
iptables -I OUTPUT -o lo -m comment --comment 'Permit loopback traffic' -j ACCEPT
iptables -I OUTPUT -d 0.0.0.0/0 -m comment --comment 'Permit outbound traffic' -j ACCEPT
Check also the troubleshooting page of storageos for this particular issue.
https://docs.storageos.com/docs/platforms/openshift/troubleshoot/install#peer-discovery---networking
In addition, less than 3 node cluster is not supported. You can have 1 node for testing or 3+. But having 2 nodes makes impossible to ensure quorum in a distributed environment unless you use StorageOS pointing the kv store to a external etcd.
answered Nov 23 at 14:58
Ferran Arau Castell
1013
1013
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53273806%2fopenshift-3-11-storageos-networking-issue%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown