What is the cheapest way to iterate through all cosmos documents?

Time:11-05

I have a Cosmos DB resource with a 'products' collection. The collection holds enough documents that I often see a lot of 429s during busy periods, despite a fair amount of provisioned throughput.

What is the cheapest way (with regards to RUs) to iterate through every single product?

Assumptions:

  • Assume I must ultimately touch every single document, but I can do this in batches; it doesn't need to be one single operation. For example, if skip/take were the right approach, that could work even though some docs would be processed before others.

CodePudding user response:

Using Change Feed should be the cheapest. You can do it with a myriad of options (Spark Connector, Azure Functions, etc.), but if you are already working in .NET, the most straightforward would be the Change Feed pull model:

using System;
using System.Net;
using Microsoft.Azure.Cosmos;

// Start from the beginning of the change feed so every document in the container is read.
FeedIterator<YourType> iteratorForTheEntireContainer = container.GetChangeFeedIterator<YourType>(ChangeFeedStartFrom.Beginning(), ChangeFeedMode.LatestVersion);

while (iteratorForTheEntireContainer.HasMoreResults)
{
    FeedResponse<YourType> response = await iteratorForTheEntireContainer.ReadNextAsync();

    // 304 Not Modified means there are no further changes to read.
    if (response.StatusCode == HttpStatusCode.NotModified)
    {
        Console.WriteLine("Completed full read of entire container");
        break;
    }
    else
    {
        foreach (YourType item in response)
        {
            // process item
        }
    }
}
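
Since the question allows processing in batches, the same pull model can be split across several runs by saving the continuation token between them. The following is only a rough sketch under some assumptions: it targets a v3 Microsoft.Azure.Cosmos SDK version that exposes ChangeFeedMode.LatestVersion (as the snippet above does), and the names ChangeFeedBatchReader, ReadBatchAsync, savedToken, maxPages and processItemAsync are hypothetical, as is the page size of 100. It also adds up response.RequestCharge so you can see what each batch actually costs in RUs.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical helper, not part of the SDK.
public static class ChangeFeedBatchReader
{
    // Reads at most maxPages pages of the change feed. Starts from the beginning
    // when savedToken is null, otherwise resumes where the previous batch stopped.
    // Returns the continuation token to persist for the next batch.
    public static async Task<string> ReadBatchAsync<T>(
        Container container,
        string savedToken,
        int maxPages,
        Func<T, Task> processItemAsync)
    {
        ChangeFeedStartFrom startFrom = savedToken == null
            ? ChangeFeedStartFrom.Beginning()
            : ChangeFeedStartFrom.ContinuationToken(savedToken);

        // PageSizeHint limits the number of items per page; 100 is an arbitrary value.
        var options = new ChangeFeedRequestOptions { PageSizeHint = 100 };

        using FeedIterator<T> iterator = container.GetChangeFeedIterator<T>(
            startFrom, ChangeFeedMode.LatestVersion, options);

        string token = savedToken;
        double totalCharge = 0;

        for (int page = 0; page < maxPages && iterator.HasMoreResults; page++)
        {
            FeedResponse<T> response = await iterator.ReadNextAsync();
            totalCharge += response.RequestCharge;

            if (response.StatusCode == HttpStatusCode.NotModified)
            {
                // Caught up with the feed: keep the latest token and stop.
                token = response.ContinuationToken;
                break;
            }

            foreach (T item in response)
            {
                await processItemAsync(item);
            }

            // Advance the token only after the whole page has been processed.
            token = response.ContinuationToken;
        }

        Console.WriteLine($"Batch consumed {totalCharge:F2} RUs");
        return token;
    }
}
```

You would store the returned token somewhere durable (a file, a table, another container) and pass it back in on the next run, for example during a quieter period, so the full scan does not have to compete with peak traffic in one go.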

CodePudding user response:

"What is the cheapest way (with regards to RUs) to iterate through every single product?"

You can't get any cheaper than zero RUs, so the analytical store is definitely worth mentioning here.

If you have many documents and need to process all of them, this sounds like exactly the kind of workload the analytical store is suited to.

Until recently, the analytical store was not available on accounts using the continuous backup mode, but it is now possible to enable the Synapse Link feature on such accounts.

In terms of actual monetary cost, the analytical store itself is not free. But in the accounts where I have it enabled, its cost is extremely minor compared to the transactional store cost. If it lets you lower your provisioned RUs by moving this kind of expensive operation out of the transactional store, it can more than break even.
