I am trying to achieve exactly-once consumption with a Kafka consumer.
My requirements are:
- Read data from a topic
- Process the data (which involves calling another API)
- Write the response back to Kafka
Is exactly-once possible in this scenario?
I know that the Kafka Streams API covers this use case, but I would like to know whether it can be done with the plain Producer/Consumer API. Also, suppose the consumer fails for some reason after the data has been processed (the processing should be done only once). What would be the best way to handle such cases? Is there any continuation/checkpoint mechanism for them?
I understand that the Kafka Streams API is transactional across consume-process-produce. Even there, if the consumer crashes after calling the API, the flow would start again from the very beginning, right?
Answer:
Yes; Spring for Apache Kafka supports exactly once semantics in the same way as Kafka Streams.
See
https://docs.spring.io/spring-kafka/docs/current/reference/html/#exactly-once
and
https://docs.spring.io/spring-kafka/docs/current/reference/html/#transactions
Bear in mind that "exactly once" means that the entire successful consume -> process -> produce sequence is performed once. But if the produce step fails (rolling back the transaction), the record is redelivered, so the consume -> process part is "at least once". Therefore, you need to make the process part idempotent.
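One way to make the process part idempotent is to remember the result for each input record and skip the external call when the same record is delivered again. The following is only a minimal sketch under stated assumptions: IdempotentProcessor, recordId and externalCall are hypothetical names that do not come from Kafka or Spring, and the in-memory map stands in for a durable store that would survive a restart.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: makes the "process" step safe to repeat after a redelivery. */
class IdempotentProcessor {
    // Stand-in for a durable store (e.g. a database table); in-memory only for illustration.
    private final Map<String, String> resultsByRecordId = new HashMap<>();

    // Stand-in for the call to the other API (assumption: payload in, response out).
    private final Function<String, String> externalCall;

    IdempotentProcessor(Function<String, String> externalCall) {
        this.externalCall = externalCall;
    }

    /**
     * recordId must uniquely identify the input record, for example
     * "topic-partition-offset" or a business key carried in the message.
     */
    String process(String recordId, String payload) {
        String cached = resultsByRecordId.get(recordId);
        if (cached != null) {
            // Redelivery after a rolled-back transaction: reuse the earlier result.
            return cached;
        }
        String result = externalCall.apply(payload);
        resultsByRecordId.put(recordId, result);
        return result;
    }
}
```

With this in place, a second call such as process("orders-0-42", payload) for the same record returns the stored response without calling the API again. A crash between the external call and the store update would still repeat the call, so where the other API accepts an idempotency key, passing recordId to it is probably the more robust option.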