How do I set partition key with batched messages in Azure Event Hub?

Time:10-08

I read rows from a file (let's call it customers) and publish them to an Event Hub. Each row contains a customer_id and some data. The same customer may appear several times in the file. So I want to use customer_id as the partition key to guarantee that ordering is preserved for the same customer, while still sending messages in batches for performance reasons. Seems easy enough...

You can set the partition key when sending a set of messages or a batch using

   // SendEventOptions applies one partition key to every event in this call.
   var opts = new SendEventOptions
   {
       PartitionKey = customer.CustomerId
   };
   await client.SendAsync(messages, opts);

But this sets the partition key on the batch itself, so all messages in the batch end up with the same key.

Is it even possible to set a partition key on each message and still use batches in a sane way? Preferably, I would like to set the key on each message, then just add the messages to a batch and send it.

I'm using the Azure.Messaging.EventHubs namespace and C#.

CodePudding user response:

Unfortunately, what you're looking to do is not possible. The Event Hubs service requires that all events in a batch be assigned the same partition key.

Your best bet using partition keys would be to build a batch for each partition key that you're using and add to those batches as you process, sending each one when it is full or when some flush threshold (for example, a time interval) has elapsed.
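
As a rough illustration of the batch-per-key idea, here is a minimal sketch. It assumes the whole file fits in memory so the rows can be grouped up front, and it leaves out the time-based flush. The Customer record and the CustomerPublisher class are hypothetical names made up for this example; the producer is assumed to be an already configured EventHubProducerClient.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;

public static class CustomerPublisher
{
    // Hypothetical row type; adjust to match the real file layout.
    public record Customer(string CustomerId, string Data);

    public static async Task PublishAsync(
        EventHubProducerClient producer, IEnumerable<Customer> customers)
    {
        // GroupBy keeps rows for the same customer in their original order.
        foreach (var group in customers.GroupBy(c => c.CustomerId))
        {
            // Every batch created with these options carries this one key.
            var options = new CreateBatchOptions { PartitionKey = group.Key };
            EventDataBatch batch = await producer.CreateBatchAsync(options);
            try
            {
                foreach (var customer in group)
                {
                    var eventData = new EventData(BinaryData.FromString(customer.Data));
                    if (batch.TryAdd(eventData))
                    {
                        continue;
                    }

                    // Batch is full: send it and start a new one with the same key.
                    await producer.SendAsync(batch);
                    batch.Dispose();
                    batch = await producer.CreateBatchAsync(options);

                    if (!batch.TryAdd(eventData))
                    {
                        throw new InvalidOperationException(
                            "A single event is too large to fit in an empty batch.");
                    }
                }

                // Flush whatever is left for this customer.
                if (batch.Count > 0)
                {
                    await producer.SendAsync(batch);
                }
            }
            finally
            {
                batch.Dispose();
            }
        }
    }
}
```

With many distinct customers and few rows per customer, this sends a lot of small batches, which is the cost of keying by customer.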

Alternatively, you could assign each customer identifier to a specific partition yourself and then build a batch per partition, rather than per key, with the approach described above. This would be more efficient, as you'd have a smaller number of batches to manage and would likely fill each batch more quickly.
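
A sketch of the batch-per-partition variant follows, under a few assumptions: the rows fit in memory, the partition count does not change between runs, and a simple FNV-1a hash is an acceptable way to map a customer to a partition. The hash choice, the Customer record, and the PartitionedPublisher class are illustrative and not part of the SDK. A hand-rolled hash is used because string.GetHashCode is not stable across processes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;

public static class PartitionedPublisher
{
    // Hypothetical row type; adjust to match the real file layout.
    public record Customer(string CustomerId, string Data);

    // FNV-1a (32-bit): gives the same value for the same id on every run.
    private static uint StableHash(string value)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(value))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    public static async Task PublishAsync(
        EventHubProducerClient producer, IEnumerable<Customer> customers)
    {
        string[] partitionIds = await producer.GetPartitionIdsAsync();

        // The same customer always maps to the same partition; order within
        // each group follows the order of the input rows.
        var byPartition = customers.GroupBy(c =>
            partitionIds[(int)(StableHash(c.CustomerId) % (uint)partitionIds.Length)]);

        foreach (var group in byPartition)
        {
            // PartitionId targets a partition directly (no PartitionKey here).
            var options = new CreateBatchOptions { PartitionId = group.Key };
            EventDataBatch batch = await producer.CreateBatchAsync(options);
            try
            {
                foreach (var customer in group)
                {
                    var eventData = new EventData(BinaryData.FromString(customer.Data));
                    if (batch.TryAdd(eventData))
                    {
                        continue;
                    }

                    // Batch is full: send it and start a new one for this partition.
                    await producer.SendAsync(batch);
                    batch.Dispose();
                    batch = await producer.CreateBatchAsync(options);

                    if (!batch.TryAdd(eventData))
                    {
                        throw new InvalidOperationException(
                            "A single event is too large to fit in an empty batch.");
                    }
                }

                if (batch.Count > 0)
                {
                    await producer.SendAsync(batch);
                }
            }
            finally
            {
                batch.Dispose();
            }
        }
    }
}
```

Here the number of open batches is bounded by the number of partitions rather than the number of customers, which is why the batches tend to fill up faster.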
