I'm developing a Python application that uses Azure Cosmos DB as its main database. At some point in the app, I need to insert bulk data (a batch of items) into Cosmos DB. So far, I've been using the Azure Cosmos DB Python SDK for the SQL API to communicate with Cosmos DB; however, it doesn't provide a method for bulk data insertion.
As far as I understand, these are the insertion methods the SDK provides. Both insert only a single item, which can be very slow when called in a for loop:
.upsert_item()
.create_item()
Is there another way to use this SDK to insert bulk data instead of calling the methods above in a for loop? If not, is there an Azure REST API that can handle bulk data insertion?
CodePudding user response:
The Cosmos DB service does not expose bulk insertion through its REST API. Bulk mode is implemented at the SDK layer, and unfortunately the Python SDK does not yet support it. It does, however, support asynchronous I/O. Here's an example that may help you.
import asyncio
import os

from azure.cosmos.aio import CosmosClient

URL = os.environ['ACCOUNT_URI']
KEY = os.environ['ACCOUNT_KEY']
DATABASE_NAME = 'myDatabase'
CONTAINER_NAME = 'myContainer'


async def create_products():
    # "async with" closes the async client when the block exits.
    async with CosmosClient(URL, credential=KEY) as client:
        database = client.get_database_client(DATABASE_NAME)
        container = database.get_container_client(CONTAINER_NAME)
        # Each upsert is awaited in turn, so the requests still run one at a time.
        for i in range(10):
            await container.upsert_item({
                'id': 'item{0}'.format(i),
                'productName': 'Widget',
                'productModel': 'Model {0}'.format(i)
            })


if __name__ == '__main__':
    asyncio.run(create_products())
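Awaiting each upsert inside the loop still sends one request at a time. To benefit from asynchronous I/O, the upserts can be started together and awaited as a group. The following is a minimal sketch, not a true bulk API: it still issues one request per item. The helper name upsert_concurrently and the max_concurrency parameter are made up for illustration, and container is assumed to be the async container client from the example above (any object with an awaitable upsert_item method will do).

```python
import asyncio


async def upsert_concurrently(container, items, max_concurrency=10):
    """Upsert items with at most max_concurrency requests in flight.

    Hypothetical helper: 'container' is assumed to expose an awaitable
    upsert_item(item), like the async Cosmos DB container client.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def upsert_one(item):
        # The semaphore caps how many upserts run at the same time.
        async with semaphore:
            return await container.upsert_item(item)

    # return_exceptions=True reports a failed item as an exception object
    # in the results list instead of raising on the first failure.
    return await asyncio.gather(
        *(upsert_one(item) for item in items),
        return_exceptions=True,
    )
```

Inside create_products() this could be called as results = await upsert_concurrently(container, items), then each entry of results checked for exceptions. Keeping max_concurrency modest is probably wise, since every request still consumes provisioned throughput and the service may throttle a burst that is too large.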