Why does dockerd on a node get bad?
After a few days of running dockerd on a Kubernetes host, where pods are scheduled by the kubelet, dockerd goes bad: it consumes a lot of resources (50% of memory, about 4 GB).
Once it is in this state, it cannot act on commands for containers that appear to be running according to docker ps. Checking ps -ef on the host shows that these containers don't map to any underlying host processes.
docker exec yields:
level=error msg="Error running exec in container: rpc error: code = 2 desc = containerd: container not found"
Cannot kill container 6a8d4....8: rpc error: code = 14 desc = grpc: the connection is unavailable"
level=fatal msg="open /var/run/docker/libcontainerd/containerd/7657...4/65...6/process.json: no such file or directory"
Looking through the process tree on the host, there are a lot of defunct processes that list dockerd as their parent process ID. Any pointers on what the issue might be or where to look further?
I have enabled debug logging on dockerd to see whether the issue recurs. A dockerd restart fixes the issue.
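To put a number on the defunct processes described above, a small script can help. The following is only a minimal sketch, assuming the process table is captured with ps -eo pid,ppid,stat,comm and that the daemon's command name is exactly dockerd; the function name find_defunct_children is made up for illustration.

```python
def find_defunct_children(ps_output, parent_name="dockerd"):
    """Return (pid, command) pairs for defunct processes parented by parent_name.

    ps_output is the text printed by `ps -eo pid,ppid,stat,comm`.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) == 4:
            pid, ppid, stat, comm = parts
            rows.append((int(pid), int(ppid), stat, comm.strip()))

    # PIDs of every process whose command name matches the daemon.
    parent_pids = {pid for pid, _, _, comm in rows if comm == parent_name}

    # A STAT value starting with "Z" marks a defunct (zombie) process.
    return [
        (pid, comm)
        for pid, ppid, stat, comm in rows
        if stat.startswith("Z") and ppid in parent_pids
    ]
```

Feeding it the ps output once while the node is healthy and again once dockerd has degraded shows whether the number of unreaped children grows over time.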
docker kubernetes
edited Nov 16 '18 at 21:36
asked Nov 14 '18 at 3:11
user2062360
1 Answer
Sounds like you have a container misbehaving and Docker is not able to reap it. I would take a look at what has been scheduled on the nodes where you see the problem. The errors you are seeing suggest that the Docker daemon is not responding to API requests issued by the docker CLI. Some pointers:
- Has the container exited successfully or with an error?
- Did the containers get killed for some reason?
- Check the kubelet logs.
- Check the kube-scheduler logs.
- Follow the logs of the containers on your node:
docker logs -f <containerid>
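The first two pointers (exit status and whether a container was killed) can be read from the State section of docker inspect output. The sketch below assumes the JSON array printed by docker inspect <containerid> is passed in as a string; the helper names summarize_container_states and needs_attention are hypothetical, and the daemon must still be responsive enough to answer the inspect call.

```python
import json


def summarize_container_states(inspect_json):
    """Summarize the State section of `docker inspect` output.

    inspect_json is the JSON array printed by `docker inspect <id> [<id> ...]`.
    """
    summaries = []
    for container in json.loads(inspect_json):
        state = container.get("State", {})
        summaries.append({
            "id": container.get("Id", "")[:12],
            "status": state.get("Status"),
            "exit_code": state.get("ExitCode"),
            "oom_killed": state.get("OOMKilled", False),
            "error": state.get("Error", ""),
            "pid": state.get("Pid", 0),
        })
    return summaries


def needs_attention(summary):
    """Flag containers that exited non-zero, were OOM-killed, or reported an error."""
    return bool(
        summary["oom_killed"]
        or summary["error"]
        or (summary["status"] == "exited" and summary["exit_code"] != 0)
    )
```

The pid field is kept in the summary so that a container reported as running can be compared against the host process table by hand.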
answered Nov 14 '18 at 7:24
Rico
This does not seem to be limited to one container; more than one is affected.
– user2062360
Nov 16 '18 at 21:37