Mapping out a Kafka+Zookeeper cluster

Background

I inherited a Kafka/Zookeeper installation. I have a passing knowledge of both: I know the general architecture, how clients work, what topics are, and I have been involved in programming Java clients.

But the installation is somewhat dubious. There are three instances each of Kafka and Zookeeper (in separate Docker containers). Supposedly they should work, but all processes spout an immense amount of log output, with loads and loads of (diverse) warnings and errors. I have the impression that some of these are quite normal (or are being self-healed all the time), and I am having a very hard time figuring out whether everything is set up correctly and works as intended.

According to Google, some of these are related to unclean shutdowns of the brokers, corrupted individual topics and such. As this is a test environment, I can easily delete such files.

I know about some commands which help me check topics etc. (basic stuff, like listing them, displaying their individual configuration etc.).

However...

Question

Is there an online resource or documentation that can serve as a systematic walkthrough for checking whether everything is basically set up OK? For example, to clear up these questions:

Do the three Zookeepers and the three Kafka instances correctly talk to each other for high-availability purposes? Do they have a correct "leader" etc.?

Are the servers generally "healthy", i.e., easily able to accept connections etc.?

How are the topics working (what's in there, how many messages, etc.)?

I am aware that one may very quickly dismiss this question as too generic; I am not asking you to solve my problems. I am looking for a resource to systematically walk through such an installation - it may or may not cover the examples I have given, but it should definitely give a systematic way to find out whether things are fundamentally wrong.

asked Nov 8 at 17:51

AnoE

5,1691919

apache-kafka apache-zookeeper


2 Answers

This Packt tutorial/training by Stéphane Maarek is a wonderful resource for setting up Kafka in cluster mode. However, he did that in the AWS cloud on Ubuntu VMs.

I followed the same steps and installed it in Vagrant VMs running CentOS. You can find the code here.

The VM includes Yahoo Kafka Manager for monitoring Kafka's internal details: the list of available brokers, their health, partitions, leaders, etc.

Kafka Manager can help you with high-level monitoring.
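If no UI is at hand, a similar overview of partitions and leaders can be pulled from the output of `kafka-topics.sh --describe`. The following is only a rough sketch: the line format in the comment is an assumption (it varies between Kafka versions), and `find_unhealthy_partitions` is a made-up helper name, not part of any Kafka tooling.

```python
import re
from typing import Dict, List

# Assumed shape of a partition line in `kafka-topics.sh --describe` output
# (this may differ between Kafka versions):
#   Topic: orders  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2,3
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>\S*)\s+Isr:\s*(?P<isr>\S*)"
)


def find_unhealthy_partitions(describe_output: str) -> List[Dict[str, object]]:
    """Return partitions that have no leader or fewer in-sync replicas than replicas."""
    problems = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue  # topic summary lines and blank lines are skipped
        replicas = [r for r in match["replicas"].split(",") if r]
        isr = [r for r in match["isr"].split(",") if r]
        reasons = []
        # Assumption: a missing leader is printed as "-1" or "none".
        if match["leader"] in ("-1", "none"):
            reasons.append("no leader")
        if len(isr) < len(replicas):
            reasons.append("under-replicated")
        if reasons:
            problems.append({
                "topic": match["topic"],
                "partition": int(match["partition"]),
                "reasons": reasons,
            })
    return problems
```

Feed it the captured text of the describe command; an empty result means every partition it recognised has a leader and a full in-sync replica set.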

Please provide your comments.

edited Nov 9 at 18:01

answered Nov 8 at 21:13

Rajkumar Natarajan

9651033

The question doesn't seem to be about setting up, but rather monitoring
– cricket_007
Nov 9 at 15:34

Yeah, I forgot to mention Kafka Manager.
– Rajkumar Natarajan
Nov 9 at 18:02


Rather than looking solely at logs, you might want to familiarize yourself with JMX metrics and how you can gather them across the cluster.
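As a rough illustration of what to do with those metrics once gathered, here is a sketch that applies three commonly watched checks. The MBean names in the comments are the ones Kafka brokers expose; how the values are collected (a JMX exporter, a JMX client, etc.) is left out, and the `metrics_by_broker` dictionary layout and the function name are assumptions made for this example.

```python
from typing import Dict, List

# Broker MBeans the values are assumed to come from:
#   kafka.controller:type=KafkaController,name=ActiveControllerCount
#   kafka.controller:type=KafkaController,name=OfflinePartitionsCount
#   kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions


def evaluate_cluster(metrics_by_broker: Dict[str, Dict[str, int]]) -> List[str]:
    """Return human-readable findings; an empty list means no check failed."""
    findings = []

    # Across the whole cluster exactly one broker should be the active controller.
    controllers = sum(
        m.get("ActiveControllerCount", 0) for m in metrics_by_broker.values()
    )
    if controllers != 1:
        findings.append(f"expected exactly 1 active controller, found {controllers}")

    for broker, m in sorted(metrics_by_broker.items()):
        under_replicated = m.get("UnderReplicatedPartitions", 0)
        if under_replicated > 0:
            findings.append(f"{broker}: {under_replicated} under-replicated partition(s)")
        offline = m.get("OfflinePartitionsCount", 0)
        if offline > 0:
            findings.append(f"{broker}: {offline} offline partition(s)")
    return findings
```

Missing metrics are treated as zero here, which is a simplification; a stricter version would flag brokers that report nothing at all.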

If you want to actually collect and analyze logs, you'll likely need to separately use something like Elasticsearch.
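Before setting up a full log pipeline, a quick tally of which loggers produce the warnings and errors can already separate recurring noise from real problems. This sketch assumes the brokers still use Kafka's default log4j layout, `[%d] %p %m (%c)%n`; with a customised layout the regular expression would need adjusting. `tally_problems` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Iterable

# Matches lines such as (default Kafka log4j layout assumed):
#   [2018-11-08 17:51:02,123] WARN some message (kafka.server.ReplicaFetcherThread)
LOG_LINE = re.compile(
    r"^\[[^\]]+\]\s+(?P<level>[A-Z]+)\s+.*\((?P<logger>[\w.$]+)\)\s*$"
)


def tally_problems(lines: Iterable[str]) -> Counter:
    """Count WARN/ERROR/FATAL log lines per (level, logger) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Stack-trace and continuation lines do not match and are ignored.
        if match and match["level"] in ("WARN", "ERROR", "FATAL"):
            counts[(match["level"], match["logger"])] += 1
    return counts
```

Calling `tally_problems(open(path))` on a broker log and printing `counts.most_common(10)` gives a first ranking of where the noise comes from.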

You won't see "how many messages" are in a topic, and you'll need even more monitoring to know whether a port is actually open, the Kafka process is running, the disks are filling up, etc.
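For the "is the port open" part, and for checking whether the ZooKeeper ensemble has elected a leader, a plain socket probe can serve as a first test. The sketch below uses ZooKeeper's four-letter `srvr` command, whose reply contains a `Mode:` line (`leader`, `follower` or `standalone`). Newer ZooKeeper versions restrict these commands through `4lw.commands.whitelist`, so whether they are reachable in a given setup is an assumption, as are all function names here.

```python
import socket
from typing import Dict, List, Optional


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def zk_four_letter(host: str, port: int, command: str, timeout: float = 2.0) -> str:
    """Send a four-letter command such as 'srvr' and return ZooKeeper's reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)  # ZooKeeper closes the connection after replying
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zk_mode(srvr_reply: str) -> Optional[str]:
    """Extract the value of the 'Mode:' line from a 'srvr' reply, if present."""
    for line in srvr_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def summarize_quorum(modes: Dict[str, Optional[str]]) -> List[str]:
    """Check that exactly one node is leader and all others are followers."""
    findings = []
    leaders = [host for host, mode in modes.items() if mode == "leader"]
    if len(leaders) != 1:
        findings.append(f"expected exactly 1 ZooKeeper leader, found {len(leaders)}")
    for host, mode in sorted(modes.items()):
        if mode not in ("leader", "follower"):
            findings.append(f"{host}: unexpected mode {mode!r}")
    return findings
```

A typical use would be to call `zk_mode(zk_four_letter(host, 2181, "srvr"))` for each ZooKeeper container (2181 being the conventional client port), collect the results in a dictionary and pass it to `summarize_quorum`. `port_is_open(host, 9092)` does the same basic reachability test for a broker on Kafka's conventional port; it says nothing about whether the broker behind the port is actually healthy.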

My point is that Kafka needs to be fed and watered: if you plan to run it in production, you can't just set up a small cluster and forget about it. Even if you think it's set up correctly at the beginning, increasing the load on it will eventually cause it to fall into a bad state.

For a full look at cluster health in your dev environment, Confluent Control Center can help; it is available as a limited trial.

To solve the "what's in there" problem, I suggest you set up a Schema Registry and convince Kafka producers to use it.

edited Nov 10 at 5:17

answered Nov 9 at 15:41

cricket_007

76.4k1042106


