Only get new data from DynamoDB using Python

Time:09-02

I'm trying to export data from a DynamoDB transaction table using Python. So far I have been able to get all the data from the table, but I would like to add a filter so that I only get the data from a certain date until today.

There is a field called CreatedAt that indicates when the transaction was made. I was thinking of using this field to filter the new data.

This is the code I've been using to scan the table. It would be really helpful if anyone could tell me how to apply this filter in this script.

import boto3
import pandas as pd
from boto3.dynamodb.conditions import Key, Attr

aws_access_key_id = '*****'
aws_secret_access_key = '*****'
region='****'

dynamodb = boto3.resource(
    'dynamodb',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region
    )

transactions_table = dynamodb.Table('transactions_table')

result = transactions_table.scan()

items = result['Items']

df_transactions_table = pd.json_normalize(items)

print(df_transactions_table)

Thanks!

CodePudding user response:

Boto3 accepts a FilterExpression as part of a DynamoDB scan or query, which lets you filter on that field. See the Boto3 documentation for details.

Note that using a FilterExpression still consumes the same number of read capacity units, because the filter is applied after the items have been read.
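
To get "from a certain date until today", the value passed to the filter can be computed rather than typed in. Here is a minimal sketch, assuming CreatedAt is stored as an ISO 8601 string (for example 2020-08-10), so that string comparison matches chronological order. The question does not confirm the format, and cutoff_date is a hypothetical helper name, not part of Boto3.

```python
from datetime import datetime, timedelta, timezone


def cutoff_date(days_back, today=None):
    """Return the date `days_back` days before `today` as 'YYYY-MM-DD'.

    `today` defaults to the current UTC date; pass a date explicitly
    to get a reproducible result.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()
    return (today - timedelta(days=days_back)).isoformat()
```

The result could then be used as Attr('CreatedAt').gte(cutoff_date(7)) to keep items from the last seven days. If CreatedAt is stored as a number (an epoch timestamp, say), the comparison value would have to be a number in the same unit instead.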

CodePudding user response:

You need to use FilterExpression which would look like the following:

import boto3
import pandas as pd
from boto3.dynamodb.conditions import Attr


aws_access_key_id = '*****'
aws_secret_access_key = '*****'
region='****'

dynamodb = boto3.resource(
    'dynamodb',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region
    )

transactions_table = dynamodb.Table('transactions_table')

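# Return only items whose CreatedAt is greater than the given date.
# The filter runs after the read, so the whole table is still scanned.
result = transactions_table.scan(
    FilterExpression=Attr('CreatedAt').gt('2020-08-10'),
)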

items = result['Items']

df_transactions_table = pd.json_normalize(items)

print(df_transactions_table)

You can learn more from the docs on Boto3 Scan and FilterExpression.
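
One thing the script above does not handle: a single scan call returns at most 1 MB of data, and when more remains the response includes a LastEvaluatedKey. Below is a minimal sketch of looping over the pages. scan_all is a hypothetical helper name, not a Boto3 function, and it only assumes that table behaves like a Boto3 Table resource.

```python
def scan_all(table, **scan_kwargs):
    """Collect the items from every page of a DynamoDB scan.

    Any keyword arguments (for example FilterExpression) are passed
    through to table.scan() unchanged on every call.
    """
    items = []
    while True:
        page = table.scan(**scan_kwargs)
        items.extend(page.get('Items', []))
        last_key = page.get('LastEvaluatedKey')
        if last_key is None:
            # No more pages to read.
            break
        # Resume the next scan where this page stopped.
        scan_kwargs['ExclusiveStartKey'] = last_key
    return items
```

In the script above this would replace the single scan call, for example items = scan_all(transactions_table, FilterExpression=Attr('CreatedAt').gt('2020-08-10')). A page can come back with few or no items and still carry a LastEvaluatedKey, because the filter is applied after each page is read, so the loop checks the key, not the item count.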

Some advice: please do not hard-code your keys the way this code does; use an IAM role instead. If you are testing locally, configure the AWS CLI, which stores credentials that Boto3 picks up automatically. That way you won't accidentally share keys on GitHub or elsewhere.
