Design Pattern - Spring KafkaListener processing 1 million records in 1 hour


My Spring Boot application is going to listen to 1 million records an hour from a Kafka broker. The entire processing logic for each message takes 1-1.5 seconds, including a database insert. The topic has 64 partitions, which is also the concurrency of my @KafkaListener.

My current code is only able to process about 90 records a minute in a lower environment, where I am listening to around 50k records an hour. Below is the code; all other config parameters such as max.poll.records are left at their default values:

@KafkaListener(id="xyz-listener", concurrency="64", topics="my-topic")
public void listener(String record) {

// processing logic 

}

I also get "it is likely that the consumer was kicked out of the group" 7-8 times an hour. I think both of these issues could be solved by isolating the listener method and multithreading the processing of each message, but I am not sure how to do that.

CodePudding user response:

There are a few points to consider here. First, 64 consumers is quite a lot for a single application instance to handle consistently.

Considering that each poll fetches up to 500 records per consumer by default (max.poll.records), your app might be getting overloaded, causing consumers to be kicked out of the group whenever a single batch takes longer than the 5-minute default of max.poll.interval.ms to be processed.
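As an illustration only, here is a minimal consumer-factory sketch that tunes those two properties; the bootstrap server, group id, and the specific values are assumptions you'd adjust for your setup:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // adjust to your broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "xyz-listener");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        // fetch fewer records per poll so one batch fits comfortably inside max.poll.interval.ms
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100);         // default is 500
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000);  // default is 300000 (5 min)
        return new DefaultKafkaConsumerFactory<>(props);
    }
}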

So first, I'd consider scaling the application horizontally so that each instance handles a smaller number of partitions / threads.

A second way to increase throughput would be to use a batch listener and handle processing and DB insertions in batches, as you can see in this answer.
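A rough sketch of how that could look with Spring Kafka, where setBatchListener(true) makes the listener receive the whole poll as a list; the factory bean name, topic, and the commented-out repository call are assumptions:

import java.util.List;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.stereotype.Component;

@Configuration
class BatchListenerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> batchFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        factory.setBatchListener(true); // deliver the whole poll result as a List
        return factory;
    }
}

@Component
class BatchListener {

    @KafkaListener(id = "xyz-batch-listener", topics = "my-topic",
                   concurrency = "8", containerFactory = "batchFactory")
    public void listen(List<String> records) {
        // process the whole batch, then persist it in one round trip, e.g.:
        // List<MyEntity> entities = records.stream().map(this::toEntity).toList();
        // myRecordRepository.saveAll(entities);  // hypothetical Spring Data repository
    }
}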

Using both, you should be processing a sensible amount of work in parallel per app, and should be able to achieve your desired throughput.

Of course, you should load test each approach with different figures to have proper metrics.

EDIT: Addressing your comment, if you want to achieve this throughput I wouldn't give up on batch processing just yet. If you do the DB operations row by row you'll need a lot more resources for the same performance.

If your rule engine doesn't do any I/O you can iterate each record from the batch through it without losing performance.

About data consistency, you can try some strategies. For example, you can hold a lock to ensure that, even through a rebalance, only one instance processes a given batch of records at a given time - or perhaps there's a more idiomatic way of handling that in Kafka using the rebalance hooks.
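For the rebalance hooks, Spring Kafka lets you register a ConsumerAwareRebalanceListener on the container properties. A minimal sketch, with the lock handling left as hypothetical placeholders:

import java.util.Collection;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.listener.ConsumerAwareRebalanceListener;

public class LockReleasingRebalanceListener implements ConsumerAwareRebalanceListener {

    @Override
    public void onPartitionsRevokedBeforeCommit(Consumer<?, ?> consumer,
                                                Collection<TopicPartition> partitions) {
        // Partitions are about to move to another instance: release any locks held
        // for them so the new owner is not blocked. releaseLocksFor(...) is hypothetical.
        // partitions.forEach(this::releaseLocksFor);
    }

    @Override
    public void onPartitionsAssigned(Consumer<?, ?> consumer,
                                     Collection<TopicPartition> partitions) {
        // Newly assigned partitions: acquire locks (or check for stale ones)
        // before the listener starts consuming from them.
    }
}

It would be registered on the listener container factory, e.g. factory.getContainerProperties().setConsumerRebalanceListener(new LockReleasingRebalanceListener());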

With that in place, you can batch-load all the information you need to filter out duplicated / outdated records when you receive them, iterate each record through the rule engine in memory, then batch-persist all results and release the lock.
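Putting that flow together, a rough per-batch sketch; the lookup, rule-engine, and persistence methods are placeholders standing in for your own components, and the in-process lock stands in for whatever distributed lock you choose:

import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;
import java.util.stream.Collectors;

public class BatchProcessor {

    // In-process lock as a stand-in; across instances you'd likely use a
    // distributed lock (e.g. a DB row lock) instead.
    private final ReentrantLock lock = new ReentrantLock();

    public void processBatch(List<String> records) {
        lock.lock();
        try {
            // 1. Batch-load what is needed to drop duplicated / outdated records
            Set<String> existing = findExistingKeys(records);
            List<String> fresh = records.stream()
                    .filter(r -> !existing.contains(r))
                    .collect(Collectors.toList());

            // 2. Run each remaining record through the in-memory rule engine
            List<String> results = fresh.stream()
                    .map(this::applyRules)
                    .collect(Collectors.toList());

            // 3. Persist all results in a single batch write
            saveAll(results);
        } finally {
            lock.unlock();
        }
    }

    private Set<String> findExistingKeys(List<String> records) { return new HashSet<>(); } // placeholder: one bulk query
    private String applyRules(String record) { return record; }                            // placeholder: rule engine call
    private void saveAll(List<String> results) { /* batch insert here */ }                 // placeholder: bulk persistence
}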

Of course, it's hard to come up with an ideal strategy without knowing more details about the process. The point is by doing that you should be able to handle around 10x more records within each instance, so I'd definitely give it a shot.
