I have a DynamoDB table and I want to output items from it to a client using pagination. I thought I'd use DynamoDB.Paginator.Scan and supply StartingToken, but I don't see NextToken in the output of either the page or the iterator itself. So how do I get it?
My goal is a REST API where the client requests the next X items from a table, supplying StartingToken to iterate. Initially there is no token, but with each response the server returns a NextToken, which the client supplies as StartingToken to get the next X items.
import boto3
import json

table = "TableName"
client = boto3.client("dynamodb")
# The scan paginator, matching DynamoDB.Paginator.Scan described above
paginator = client.get_paginator("scan")
token = None
size = 1

# Seed the table with a few test items
for i in range(1, 10):
    client.put_item(TableName=table, Item={"PK": {"S": str(i)}, "SK": {"S": str(i)}})

it = paginator.paginate(
    TableName=table,
    ProjectionExpression="PK,SK",
    PaginationConfig={"MaxItems": 100, "PageSize": size, "StartingToken": token}
)

# Print only the first page
for page in it:
    print(json.dumps(page, indent=2))
    break
As a side note: how do I get one page from the paginator without using break/for? I tried using next(it) but it does not work.
Here's the it object:
{
    '_input_token': ['ExclusiveStartKey'],
    '_limit_key': 'Limit',
    '_max_items': 100,
    '_method': <bound method ClientCreator._create_api_method.<locals>._api_call of <botocore.client.DynamoDB object at 0x000001CBA5806AA0>>,
    '_more_results': None,
    '_non_aggregate_key_exprs': [{'type': 'field', 'children': [], 'value': 'ConsumedCapacity'}],
    '_non_aggregate_part': {'ConsumedCapacity': None},
    '_op_kwargs': {'Limit': 1,
                   'ProjectionExpression': 'PK,SK',
                   'TableName': 'TableName'},
    '_output_token': [{'type': 'field', 'children': [], 'value': 'LastEvaluatedKey'}],
    '_page_size': 1,
    '_result_keys': [{'type': 'field', 'children': [], 'value': 'Items'},
                     {'type': 'field', 'children': [], 'value': 'Count'},
                     {'type': 'field', 'children': [], 'value': 'ScannedCount'}],
    '_resume_token': None,
    '_starting_token': None,
    '_token_decoder': <botocore.paginate.TokenDecoder object at 0x000001CBA5D81960>,
    '_token_encoder': <botocore.paginate.TokenEncoder object at 0x000001CBA5D82290>
}
And the page:
{
  "Items": [
    {
      "PK": {
        "S": "2"
      },
      "SK": {
        "S": "2"
      }
    }
  ],
  "Count": 1,
  "ScannedCount": 1,
  "LastEvaluatedKey": {
    "PK": {
      "S": "2"
    },
    "SK": {
      "S": "2"
    }
  },
  "ResponseMetadata": {
    "RequestId": "DBE4ON8SI0GOTS2RRO2OG43QJVVV4KQNSO5AEMVJF66Q9ASUAAJG",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "server": "Server",
      "date": "Fri, 30 Dec 2022 11:37:52 GMT",
      "content-type": "application/x-amz-json-1.0",
      "content-length": "121",
      "connection": "keep-alive",
      "x-amzn-requestid": "DBE4ON8SI0GOTS2RRO2OG43QJVVV4KQNSO5AEMVJF66Q9ASUAAJG",
      "x-amz-crc32": "973385738"
    },
    "RetryAttempts": 0
  }
}
I thought I could use LastEvaluatedKey, but that throws an error. I also tried to get a token like this, but it did not work:
it._token_encoder.encode(page["LastEvaluatedKey"])
I also thought about using just scan without the iterator, but I'm actually outputting a heavily filtered result set. I need to set Limit to a very large value to get results, and I don't want too many results at the same time. Is there a way to scan up to 1000 items but stop as soon as 10 items are found?
CodePudding user response:
I would suggest not using the paginator but rather just using the lower-level Query. The reason is the confusion between NextToken and LastEvaluatedKey. These are not interchangeable.
LastEvaluatedKey is passed to ExclusiveStartKey
NextToken is passed to StartingToken
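To illustrate the distinction: as far as I know, the paginator only hands out its encoded token once MaxItems has cut the result short, and then it appears as NextToken in the dict returned by build_full_result() rather than in the individual pages. A minimal sketch under that assumption; fetch_page and its parameter names are hypothetical, and the paginator argument is assumed to be client.get_paginator("scan"):

```python
def fetch_page(paginator, table_name, max_items, scan_page_size, starting_token=None):
    """Return (items, next_token) for one REST response.

    `paginator` is assumed to be a botocore paginator, e.g.
    boto3.client("dynamodb").get_paginator("scan").
    """
    page_iterator = paginator.paginate(
        TableName=table_name,
        PaginationConfig={
            "MaxItems": max_items,        # total items handed back to the caller
            "PageSize": scan_page_size,   # sent to DynamoDB as Limit on each request
            "StartingToken": starting_token,
        },
    )
    # Aggregates all fetched pages into one dict; NextToken is only
    # present when MaxItems truncated the results.
    result = page_iterator.build_full_result()
    return result.get("Items", []), result.get("NextToken")
```

The returned token would go back to the client and be supplied as starting_token on the next request. On the side question: the object returned by paginate() is iterable but, as far as I can tell, not itself an iterator, which would explain why next(it) fails; next(iter(it)) should give just the first page.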
It's preferable to use the Resource client, which I believe causes no confusion about how to paginate:
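import boto3
from boto3.dynamodb.conditions import Key

region = 'us-east-1'  # assumption: replace with your table's region
dynamodb = boto3.resource('dynamodb', region_name=region)
table = dynamodb.Table('my-table')
# Query requires a key condition; the partition key value here is a placeholder
key_condition = Key('PK').eq('1')
response = table.query(KeyConditionExpression=key_condition)
data = response['Items']
# LastEvaluatedKey indicates that there are more results
while 'LastEvaluatedKey' in response:
    response = table.query(
        KeyConditionExpression=key_condition,
        ExclusiveStartKey=response['LastEvaluatedKey'],
    )
    # Items is a list, so extend it with the next page
    data.extend(response['Items'])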
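For the "scan up to 1000 items but stop as soon as 10 are found" part of the question, the same loop can be bounded. The sketch below is one possible approach, not a built-in feature: scan_until, want, scan_budget and key_names are hypothetical names, and it assumes a base-table scan where the key attributes (here PK and SK) are included in the returned items, so that the key of the last returned item can serve as the next ExclusiveStartKey.

```python
def scan_until(scan, key_names, want, scan_budget, start_key=None, **scan_kwargs):
    """Collect up to `want` (>= 1) matching items while evaluating at most
    `scan_budget` (>= 1) items. Returns (items, resume_key); resume_key is
    None once the end of the table has been reached.

    `scan` is any callable with the shape of client.scan / table.scan.
    """
    items, scanned, next_key = [], 0, start_key
    while True:
        # Limit caps the number of items evaluated, not the number returned
        kwargs = dict(scan_kwargs, Limit=scan_budget - scanned)
        if next_key is not None:
            kwargs["ExclusiveStartKey"] = next_key
        response = scan(**kwargs)
        scanned += response["ScannedCount"]
        page_items = response["Items"]
        room = want - len(items)
        if len(page_items) > room:
            items.extend(page_items[:room])
            # Resume right after the last item we actually hand back
            next_key = {name: items[-1][name] for name in key_names}
            return items, next_key
        items.extend(page_items)
        next_key = response.get("LastEvaluatedKey")
        if next_key is None or len(items) >= want or scanned >= scan_budget:
            return items, next_key
```

A call might look like scan_until(client.scan, ("PK", "SK"), want=10, scan_budget=1000, TableName="TableName", FilterExpression=...), with the returned key passed back as start_key on the following request.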
CodePudding user response:
The LastEvaluatedKey is in the response object and can be set as the ExclusiveStartKey in the scan.
Sample code showing this can be found in the AWS DynamoDB Sample code (here, for example)
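For the REST API in the question, the LastEvaluatedKey itself can be turned into an opaque string for the client to send back. A minimal sketch, assuming the low-level client format with string or number keys (so the key is JSON-serializable; binary keys would need extra handling); encode_token and decode_token are hypothetical helper names, not part of boto3:

```python
import base64
import json


def encode_token(last_evaluated_key):
    """Turn a LastEvaluatedKey dict into a URL-safe string, or None if absent."""
    if not last_evaluated_key:
        return None
    raw = json.dumps(last_evaluated_key, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Turn a client-supplied token back into an ExclusiveStartKey dict, or None."""
    if not token:
        return None
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
```

The server would return encode_token(response.get("LastEvaluatedKey")) with each page and pass decode_token(token) as ExclusiveStartKey on the next scan when it is not None. The token is not signed, so it should be validated like any other client input.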