I have one group ID, "my-consumer-group-id", and 3 topics, each with its own consumer:
"my-consumer-1" subscribed to "topic1" with groupId "my-consumer-group-id"
"my-consumer-2" subscribed to "topic2" with groupId "my-consumer-group-id"
"my-consumer-3" subscribed to "topic3" with groupId "my-consumer-group-id"
I observed that "my-consumer-1" has 2 million records with a consumer lag of 600k. Will this lag keep the other consumers from processing their own messages? In terms of performance, is it better to have a separate consumer groupId for each consumer?
CodePudding user response:
It is best practice to use a different group for each consumer.
While the lag on one won't affect the other consumers, using the same group means that if there is a rebalance on one topic, it will cause a rebalance of all your consumers, including those consuming from different topics.
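A minimal sketch of the separate-group setup, assuming dictionary-style configuration as used by the Python confluent-kafka client. The broker address localhost:9092, the "-group" naming suffix, and the build_config helper are illustrative assumptions, not part of the question.

```python
# Assumed broker address; replace with your own cluster.
BOOTSTRAP = "localhost:9092"

# Consumer name -> topic, as described in the question.
CONSUMERS = {
    "my-consumer-1": "topic1",
    "my-consumer-2": "topic2",
    "my-consumer-3": "topic3",
}

def build_config(consumer_name):
    """Hypothetical helper: one config per consumer, each with its own group.id."""
    return {
        "bootstrap.servers": BOOTSTRAP,
        # A distinct group per consumer keeps a rebalance in one group
        # from touching the consumers of the other topics.
        "group.id": f"{consumer_name}-group",
        "client.id": consumer_name,
    }

configs = {name: build_config(name) for name in CONSUMERS}

for name, topic in CONSUMERS.items():
    print(name, "->", topic, "in group", configs[name]["group.id"])
```

With confluent-kafka installed, each dictionary could be passed to confluent_kafka.Consumer(config), followed by consumer.subscribe([topic]).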
CodePudding user response:
A consumer can join a consumer group (let us say group_1) by setting its group.id to group_1. Consumer groups are also a way of supporting parallel consumption of the data, i.e. different consumers of the same consumer group consume data in parallel from different partitions.
In addition to the group ID, each consumer also identifies itself to the Kafka broker with a consumer ID (consumer.id in the legacy consumer; with the current Java client, the group coordinator assigns a member ID when the consumer joins). Kafka uses this to identify the currently ACTIVE consumers of a particular consumer group.
Reference: "difference between groupid and consumerid in Kafka consumer"
In your case, if you add another consumer (with a different ID) to the group, Kafka will try to assign a partition to the new consumer. As you mentioned, my-consumer-1 has a lag of 600k. Adding a new consumer under the same group only helps if there is a partition available for it. With only three partitions, a 4th consumer won't get any partition as long as the 3 existing consumers are working fine.
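The partition-count limit described above can be illustrated with a small simulation. This is a simplified round-robin stand-in, not Kafka's actual assignor; the function name assign_round_robin and the consumer names c1 to c4 are hypothetical.

```python
def assign_round_robin(partitions, consumers):
    """Simplified stand-in for a group assignor: deal partitions out in turn."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment

partitions = [0, 1, 2]  # a topic with three partitions

three = assign_round_robin(partitions, ["c1", "c2", "c3"])
four = assign_round_robin(partitions, ["c1", "c2", "c3", "c4"])

print(three)  # {'c1': [0], 'c2': [1], 'c3': [2]}
print(four)   # {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```

The fourth consumer ends up with an empty assignment, so it sits idle and does nothing to reduce the lag. Extra consumers in a group only add throughput while there are more partitions than consumers.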