I need to be able to run some range-based queries on my DynamoDB table, such as int_attribute > 5, or begins_with(string_attribute, "foo"). These can all be answered by creating a global or local secondary index and then submitting a Query to that index. However, running a Query requires that you also provide a single value of the partition key to restrict the query set. Neither of these queries has a strict equality condition, so I am considering giving all the items in my DynamoDB table the same partition key and distinguishing them only by the sort key. My dataset is well within the 10 GB partition size limit.
Are there any catastrophic issues that might occur if I do this?
CodePudding user response:
Yes, you can create a GSI where every item goes under the same partition key. The thing to be aware of is that you'll generally be putting all those writes into the same physical partition, and a single partition has a max write rate of 1,000 WCU.
If your update rate is below that, proceed. If your update rate is above that, you'll want to follow a pattern of sharding the GSI partition key value so it spreads across more partitions.
Say you require 10,000 WCU for the GSI. You can set each item's GSI PK to a random value of the form value-{x}, where x is 0 to 9. Then yes, at query time you do 10 queries and put the results back together yourself. This approach can scale as large as you need.
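Here is a rough sketch of that sharding pattern in Python, in case it helps. The names shard_count, sharded_pk, and query_all_shards are made up for illustration, as are the "value" prefix and the "sk" sort key attribute. query_shard is a hypothetical callable that you would implement yourself, for example by wrapping a Query against the GSI for one partition key value and following pagination. The sketch assumes each shard's results come back ordered by sort key, which is how Query returns items within a single partition key.

```python
import heapq
import math
import random
from typing import Callable, Dict, Iterable, List

# Per-partition write limit mentioned above.
PARTITION_WCU_LIMIT = 1000


def shard_count(required_wcu: int) -> int:
    """Number of GSI partition key values needed to spread the write load."""
    return max(1, math.ceil(required_wcu / PARTITION_WCU_LIMIT))


def sharded_pk(num_shards: int, prefix: str = "value") -> str:
    """Pick a random shard key such as 'value-7' to store on an item at write time."""
    return f"{prefix}-{random.randrange(num_shards)}"


def query_all_shards(
    query_shard: Callable[[str], Iterable[Dict]],
    num_shards: int,
    sort_key: str = "sk",
    prefix: str = "value",
) -> List[Dict]:
    """Run the same range query once per shard and merge the sorted results.

    query_shard(pk) is a hypothetical function that returns the matching items
    for one partition key value, already ordered by the sort key.
    """
    per_shard = [query_shard(f"{prefix}-{x}") for x in range(num_shards)]
    # heapq.merge preserves overall sort order given individually sorted inputs.
    return list(heapq.merge(*per_shard, key=lambda item: item[sort_key]))
```

With 10,000 WCU required, shard_count(10000) gives 10, so writes use sharded_pk(10) and reads call query_all_shards(my_query_fn, 10). The per-shard queries are independent, so you could also issue them in parallel if latency matters.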