net_adm:ping failure very strange
Hi all,
I have a problem with an Erlang cluster. After the cluster had been running for a long time, one day I could no longer make any new connection to one specific node in it (e.g. SickNode@X.X.X.X): net_adm:ping('SickNode@X.X.X.X') returns pang. Even running
erlang -name abc@X.X.X.X -setcookie MYCOOKIE -remsh SickNode@X.X.X.X
fails as well.
The strange part is that SickNode@X.X.X.X still works fine with the other nodes already in the cluster. The problem only happens when a new node joins the cluster and pings SickNode.
There is no firewall involved, since all nodes work fine within the cluster. Has anybody run into this situation? Is Erlang not stable for clustering?
PS: I am using Erlang/OTP 20 on CentOS 6.8.
Many thanks!
erlang cluster-computing erlang-shell erlang-ports
Two questions: 1. Can you ping other nodes from your new node (i.e. nodes that are not SickNode@X.X.X.X)? / 2. Can you ping your new node from SickNode@X.X.X.X?
– Brujo Benavides
Nov 20 '18 at 10:40
1. From the new node I can ping all the other nodes except SickNode; they return pong. / 2. From SickNode, I can't ping any new node. On SickNode, netstat shows that the old connection(s) are still being maintained. Thanks for your help!
– Duke
Nov 20 '18 at 17:03
So… I know you said "SickNode@X.X.X.X is working well to other nodes in the cluster." but still… 3. from SickNode, can you ping other nodes (i.e. not new ones, just some other healthy ones)? / 4. from other healthy nodes, can you ping SickNode? / 5. have you checked that the cookie in SickNode is still 'MYCOOKIE'? Maybe it changed after that node was connected to the cluster…
– Brujo Benavides
Nov 21 '18 at 18:06
I can ping healthy nodes from the sick node and vice versa, and I get pong. The problem only happens with new nodes. I think you may be right that the cookie has been changed. But I wonder what would cause the cookie to change while the sick node has been running for quite a long time? Thanks for your help!
– Duke
Nov 22 '18 at 2:00
asked Nov 19 '18 at 8:42 by Duke
edited Nov 19 '18 at 9:49
1 Answer
Not a straight up answer, but a theory and a way to reproduce your issue.
It's complicated because it involves multiple nodes, but let's see if you can follow me.
TL;DR: SickNode@X.X.X.X changed its cookie after it was connected to the cluster.
So, this is what I did…
First, on a terminal I started node1 with cookie x…
$ erl -name node1 -setcookie x
(node1@my.computer)1>
Then, on another terminal I started node2 with cookie x, connected it to node1 and changed its cookie to y…
$ erl -name node2 -setcookie x
(node2@my.computer)1> net_adm:ping('node1@my.computer').
pong
(node2@my.computer)2> erlang:set_cookie(node(), 'y').
true
(node2@my.computer)3>
Then, in yet another terminal I started node3 with cookie x and pinged node1 (which resulted in a connection attempt to node2 as well, as you will see below) and then explicitly tried to connect to node2…
$ erl -name node3 -setcookie x
(node3@my.computer)1> net_adm:ping('node1@my.computer').
pong
(node3@my.computer)2>
=WARNING REPORT==== 21-Nov-2018::15:09:07 ===
global: 'node3@my.computer' failed to connect to 'node2@my.computer'
=ERROR REPORT==== 21-Nov-2018::15:09:26 ===
** Connection attempt from disallowed node 'node2@my.computer' **
(node3@my.computer)2> net_adm:ping('node2@my.computer').
pang
What happened so far? Well, since node1's cookie was x and node3's cookie was x as well, they could connect. node2 was still connected to node1 but, since the cookie there was y, node3 could not connect to it.
Erlang tries to establish a fully connected mesh of nodes, so when you connect to one of them, it automatically tries to connect you to all the others.
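One way to see this mesh behaviour is to look at the list of connected peers from the shell of the node that just joined. This is only a sketch: it uses the node names from this reproduction, and the result depends on which connections actually succeeded.

```erlang
%% Paste into the Erlang shell of the node that just joined (node3 here).
%% nodes/0 returns the visible nodes this node is currently connected to.
Connected = nodes().
%% Check whether the automatic connection to node2 went through.
%% 'node2@my.computer' is the name used in this reproduction.
Node2Reached = lists:member('node2@my.computer', Connected).
```

In the scenario described here, node1 would be expected to appear in the list and node2 would not. The automatic mesh itself can be switched off by starting erl with -connect_all false, in which case global stops maintaining the fully connected network.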
But I wanted to be thorough so I pinged node2 from node3 and, as expected, I got a pang. Also, these messages popped up on node2:
(node2@my.computer)3>
=ERROR REPORT==== 21-Nov-2018::15:09:07 ===
** Connection attempt from disallowed node 'node3@my.computer' **
=WARNING REPORT==== 21-Nov-2018::15:09:07 ===
global: 'node2@my.computer' failed to connect to 'node3@my.computer'
And, of course, when I tried to ping node3 from node2…
(node2@my.computer)3> net_adm:ping('node3@my.computer').
pang
But… if I try to ping node1…
(node2@my.computer)4> net_adm:ping('node1@my.computer').
pong
That's because they're already connected and Erlang only validates the sharing of the cookie on the initial handshake.
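Because existing connections survive a cookie change, a node that is still connected to the suspect one can ask it for its current cookie over that connection. The following is a minimal sketch of such a check, not a definitive tool: the module name cookie_check is hypothetical, and it assumes every node is expected to share the local node's cookie.

```erlang
-module(cookie_check).
-export([mismatched/0, mismatched/1]).

%% Ask every currently connected node for its cookie and return the
%% ones whose answer differs from the local cookie.
mismatched() ->
    mismatched(nodes()).

%% Nodes that cannot be reached show up with a {badrpc, Reason} tuple
%% in place of a cookie, because that tuple also differs from the local one.
mismatched(Nodes) ->
    Local = erlang:get_cookie(),
    [{Node, Remote}
     || Node <- Nodes,
        Remote <- [rpc:call(Node, erlang, get_cookie, [])],
        Remote =/= Local].
```

Calling cookie_check:mismatched() on node1 in this reproduction should report node2 together with the cookie y, while node3 should not be listed.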
Finally, if I try to ping nodes from node1, I get the expected results…
(node1@my.computer)1> net_adm:ping('node2@my.computer').
pong
(node1@my.computer)2> net_adm:ping('node3@my.computer').
pong
(node1@my.computer)3>
Hope this helps.
Thanks for your help again! Your scenario looks the same as my issue. But as I commented above, I don't know what the root cause of the cookie change could be. Could you suggest some possible reasons? I have actually restarted the sick node, and it works well now. But this is a serious situation for me. So for now, I can't check whether or not the cookie was changed.
– Duke
Nov 22 '18 at 2:58
Well… the easiest way for a cookie to change is for some process to run erlang:set_cookie/2, and remember: that thing can be run from wherever (i.e. from any node in the cluster): > erlang:set_cookie('SickNode@X.X.X.X', 'the wrong cookie').
– Brujo Benavides
Nov 22 '18 at 10:03
Could someone running the "make" or "make clean" command cause the cookie to change? I declared the cookie in the vm.args file, so I don't think anyone changed it by using that API. Thanks for your help!!!
– Duke
Nov 23 '18 at 6:18
I reproduced the issue again and found that the cookie had not changed, contrary to what we discussed. Is there any other possible reason? After I use observer_cli to inspect SickNode, new nodes can't join it.
– Duke
Nov 23 '18 at 11:09
Even after I set the correct cookie again.
– Duke
Nov 23 '18 at 11:27
answered Nov 21 '18 at 18:21 by Brujo Benavides
Thanks for your help again! Your scenario looks to be same as my issue. But as I commented above, I don't know what is a root cause to make the cookie changed. Could you recommend some reason? Actually I restarted the sick node then it works well. But this situation is a serious one for me. So for now, I can't check whether or not the cookie was changed.
– Duke
Nov 22 '18 at 2:58
Well… the easiest way for a cookie to change is for some process to runerlang:set_cookie/2
and remember: That thing can be run from wherever (i.e. from any node in the cluster):> erlang:set_cookie('SickNode@X.X.X.X', 'the wrong cookie').
– Brujo Benavides
Nov 22 '18 at 10:03
If someone uses "make" command or "make clean" can cause to the cookie change? I declared cookie in vm.args file, so I don't think anyone changed the cookie by sing that api. Thanks for your help!!!
– Duke
Nov 23 '18 at 6:18
I reproduced the issue again and found that the cookie didn't change as we discussed. Is there any other reason here? After using oberver_cli to inspect the SickNode, new nodes can't join to it.
– Duke
Nov 23 '18 at 11:09
Even I set again a correct cookie.
– Duke
Nov 23 '18 at 11:27
|
show 5 more comments
Thanks for your help again! Your scenario looks to be same as my issue. But as I commented above, I don't know what is a root cause to make the cookie changed. Could you recommend some reason? Actually I restarted the sick node then it works well. But this situation is a serious one for me. So for now, I can't check whether or not the cookie was changed.
– Duke
Nov 22 '18 at 2:58
Well… the easiest way for a cookie to change is for some process to runerlang:set_cookie/2
and remember: That thing can be run from wherever (i.e. from any node in the cluster):> erlang:set_cookie('SickNode@X.X.X.X', 'the wrong cookie').
– Brujo Benavides
Nov 22 '18 at 10:03
If someone uses "make" command or "make clean" can cause to the cookie change? I declared cookie in vm.args file, so I don't think anyone changed the cookie by sing that api. Thanks for your help!!!
– Duke
Nov 23 '18 at 6:18
I reproduced the issue again and found that the cookie didn't change as we discussed. Is there any other reason here? After using oberver_cli to inspect the SickNode, new nodes can't join to it.
– Duke
Nov 23 '18 at 11:09
Even I set again a correct cookie.
– Duke
Nov 23 '18 at 11:27
Thanks for your help again! Your scenario looks to be same as my issue. But as I commented above, I don't know what is a root cause to make the cookie changed. Could you recommend some reason? Actually I restarted the sick node then it works well. But this situation is a serious one for me. So for now, I can't check whether or not the cookie was changed.
– Duke
Nov 22 '18 at 2:58
Thanks for your help again! Your scenario looks to be same as my issue. But as I commented above, I don't know what is a root cause to make the cookie changed. Could you recommend some reason? Actually I restarted the sick node then it works well. But this situation is a serious one for me. So for now, I can't check whether or not the cookie was changed.
– Duke
Nov 22 '18 at 2:58
Well… the easiest way for a cookie to change is for some process to run
erlang:set_cookie/2
and remember: That thing can be run from wherever (i.e. from any node in the cluster): > erlang:set_cookie('SickNode@X.X.X.X', 'the wrong cookie').
– Brujo Benavides
Nov 22 '18 at 10:03
Well… the easiest way for a cookie to change is for some process to run
erlang:set_cookie/2
and remember: That thing can be run from wherever (i.e. from any node in the cluster): > erlang:set_cookie('SickNode@X.X.X.X', 'the wrong cookie').
– Brujo Benavides
Nov 22 '18 at 10:03
If someone uses "make" command or "make clean" can cause to the cookie change? I declared cookie in vm.args file, so I don't think anyone changed the cookie by sing that api. Thanks for your help!!!
– Duke
Nov 23 '18 at 6:18
I reproduced the issue again and found that the cookie didn't change as we discussed. Is there any other possible reason? After using observer_cli to inspect the SickNode, new nodes can't join it.
– Duke
Nov 23 '18 at 11:09
Even after I set the correct cookie again.
– Duke
Nov 23 '18 at 11:27
Two questions: 1. Can you ping other nodes from your new node (i.e. nodes that are not SickNode@X.X.X.X)? / 2. Can you ping your new node from SickNode@X.X.X.X?
– Brujo Benavides
Nov 20 '18 at 10:40
1. From the new node I can ping all other nodes except the SickNode; they return pong to the new node. / 2. From the SickNode, I can't ping any new node. On the SickNode, netstat shows that the old connection(s) are still maintained. Thanks for your help!
– Duke
Nov 20 '18 at 17:03
So… I know you said "SickNode@X.X.X.X is working well to other nodes in the cluster." but still… 3. from SickNode, can you ping other nodes (i.e. not new ones, just some other healthy ones)? / 4. from other healthy nodes, can you ping SickNode? / 5. have you checked that the cookie in SickNode is still 'MYCOOKIE'? Maybe it changed after that node was connected to the cluster…
– Brujo Benavides
Nov 21 '18 at 18:06
I can ping healthy nodes from the sick node and vice versa, and get pong both ways. The problem happens only with new nodes. I think you are right that maybe the cookie has been changed. But I wonder what would cause the cookie to change when the sick node has been running for quite a long time? Thanks for your help!
– Duke
Nov 22 '18 at 2:00