nginx-lego and autoscaler don't play well after scaling down
I'm having trouble with nginx-lego (I know it's deprecated) and the node autoscaler. I had to scale up manually by temporarily patching an HPA's minReplicas to a high number. Everything scaled well; new nodes were added as the pod count increased.
After the traffic spike I set the number back to normal (which is quite low), and now I see a lot of 502 Bad Gateway errors. Examining the nginx-lego pod's log, I can see that plenty of requests are still going to pods that no longer exist (connection refused or No route to host).
2018/11/21 17:48:49 [error] 5546#5546: *6908265 connect() failed (113: No route to host) while connecting to upstream, client: 100.112.130.0, server: xxxx.com, request: "GET /public/images/social-instagram.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-instagram.png", host: "xxxx.com", referrer: "https://outlook.live.com/"
2018/11/21 17:48:49 [error] 5409#5409: *6908419 connect() failed (113: No route to host) while connecting to upstream, client: 10.5.143.204, server: xxxx.com, request: "GET /public/images/social-instagram.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-instagram.png", host: "xxxx.com"
2018/11/21 17:48:49 [error] 5546#5546: *6908420 connect() failed (111: Connection refused) while connecting to upstream, client: 10.5.143.204, server: xxxx.com, request: "GET /public/images/social-facebook.png HTTP/1.1", upstream: "http://X.X.X.X:3000/public/images/social-facebook.png", host: "xxxx.com"
Any idea what could be wrong?
I guess patching minReplicas probably isn't the best way to do this, but I knew there would be a spike and I didn't have a better idea for pre-scaling the whole cluster.
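For reference, the temporary pre-scale described above can be done with a one-line patch (the HPA name `my-app` and the replica counts here are placeholders, not values from the cluster in question):

```shell
# Raise the HPA floor ahead of the expected spike
kubectl patch hpa my-app -p '{"spec": {"minReplicas": 20}}'

# ...after the spike, lower it back to the usual floor
kubectl patch hpa my-app -p '{"spec": {"minReplicas": 2}}'
```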
amazon-web-services kubernetes nginx-ingress
edited Nov 21 '18 at 19:02 by Rico
asked Nov 21 '18 at 18:05 by OndrejK
1 Answer
This looks like a problem with your nginx ingress (lego) controller not updating nginx.conf when scaling down. I would examine the nginx.conf and see whether it still points to backends that no longer exist:
$ kubectl cp <nginx-lego-pod>:nginx.conf .
If something looks odd, you may have to delete the pod so that it gets recreated by the ReplicaSet managing your nginx ingress controller pods:
$ kubectl delete pod <nginx-controller-pod>
Then examine the nginx.conf again.
Another possibility is that the Endpoints for your backend Services are not being updated by Kubernetes, although that would not be directly related to scaling your lego HPA up and down. You can check with:
$ kubectl get ep
and see whether any of them still list addresses that no longer exist.
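As a quick way to spot stale endpoints, you can list each Service's endpoint addresses alongside the IPs of the pods that actually exist and compare the two (a sketch; it assumes the current namespace holds both the Services and their backing pods):

```shell
# One line per Service: name, then the IPs registered in its Endpoints
kubectl get endpoints -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.subsets[*].addresses[*].ip}{"\n"}{end}'

# The IPs of pods that are actually running (shown in the IP column)
kubectl get pods -o wide --field-selector=status.phase=Running

# Any endpoint IP that does not belong to a running pod is stale
```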
I'm getting error: unexpected EOF when trying to cp the nginx.conf
– OndrejK Nov 21 '18 at 19:19
You can try shelling into the pod and checking where that file is: kubectl exec -it <pod-id> sh
– Rico Nov 21 '18 at 19:20
Hey Rico, I got this: rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: "sh": executable file not found in $PATH"
– OndrejK Nov 21 '18 at 19:26
I also deleted one of the 3 pods (the one providing me with access logs) and that didn't work either
– OndrejK Nov 21 '18 at 19:27
So in the end I was able to make it work after deleting another nginx pod (I have 3 pods). Thanks a lot for your help!
– OndrejK Nov 21 '18 at 19:37
answered Nov 21 '18 at 19:13 by Rico