I'm trying to query DynamoDB and get a result similar to `select distinct(address) from ...` in SQL.
I know DynamoDB is a document-oriented DB and maybe I need to change the data structure.
I'm trying to avoid getting all the data first and filtering later.
My data looks like this:

| Attribute | Datatype |
|---|---|
| ID | String |
| Var1 | Map |
| VarN | Map |
| Address | String |

So I want to get the distinct addresses in the entire table.
What is the best way to do it?
CodePudding user response:
Unfortunately, no: there is no way to avoid reading all the data. You'll need to `Scan` the entire table. You can use the `ProjectionExpression` option (or the legacy `AttributesToGet` option) to request only the "Address" attribute, but you'll still pay for scanning the entire contents of the table.
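As a rough illustration of the scan-and-deduplicate approach, here is a minimal Python sketch. It assumes `table` is a boto3 DynamoDB `Table` resource (or anything with a compatible `scan` method); the function name `distinct_addresses` and the table name `MyTable` in the usage comment are made up for this example. The deduplication happens on the client, so every item is still read and billed.

```python
def distinct_addresses(table, attribute="Address"):
    """Scan the whole table and return the set of distinct attribute values.

    `table` is assumed to behave like a boto3 DynamoDB Table resource.
    """
    seen = set()
    # Use a placeholder name so the attribute never clashes with a reserved word.
    kwargs = {
        "ProjectionExpression": "#a",
        "ExpressionAttributeNames": {"#a": attribute},
    }
    while True:
        page = table.scan(**kwargs)
        for item in page.get("Items", []):
            # Items that lack the attribute come back without it.
            if attribute in item:
                seen.add(item[attribute])
        # Scan results are paginated; keep going until there is no next key.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        kwargs["ExclusiveStartKey"] = last_key
    return seen


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# table = boto3.resource("dynamodb").Table("MyTable")
# print(distinct_addresses(table))
```

The projection only reduces the amount of data sent back to the client; the read capacity consumed is still based on the full size of the items scanned.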
If you need to do this scan often, you can add a secondary index that projects only the keys and the "Address" attribute, which makes the scan cheaper. Unfortunately, even a GSI whose partition key is "Address" does not let you eliminate duplicates: each partition still contains every item that shares that address, and there is no way to list only the distinct partition keys of an index. `Scan`ning the index returns the same partition key multiple times, once for each item in that partition.
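To make the duplicate behavior concrete, here is a small Python sketch that scans such an index and counts how often each address comes back. It assumes `table` is a boto3 DynamoDB `Table` resource and that a GSI with "Address" as its partition key already exists; the index name `AddressIndex` and the function name are assumptions for illustration only.

```python
from collections import Counter


def address_counts_from_index(table, index_name="AddressIndex", attribute="Address"):
    """Scan a secondary index and count how many items share each address.

    `table` is assumed to behave like a boto3 DynamoDB Table resource, and
    `index_name` is a hypothetical GSI keyed on the address attribute.
    """
    counts = Counter()
    kwargs = {
        "IndexName": index_name,
        "ProjectionExpression": "#a",
        "ExpressionAttributeNames": {"#a": attribute},
    }
    while True:
        page = table.scan(**kwargs)
        # The index returns one entry per item, not one per partition key.
        counts.update(
            item[attribute] for item in page.get("Items", []) if attribute in item
        )
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        kwargs["ExclusiveStartKey"] = last_key
    return counts
```

The keys of the returned `Counter` are the distinct addresses, and any count above 1 shows an address that the index scan returned more than once.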