use case: Need to retrieve data from Dynamodb for multiple pk(PartitionKey) sk(SortKey) combination. So leveraging the IN operator in partiql on both pk and sk.
I did small poc on how much RCU is getting consumed for 2 types of partiql pattern.
- Pattern 1: RCU consumed = 5, Returned item count = 4. For the query:
SELECT * from "test" where pk IN ['12185','05405','14222'] and sk IN ['CELL_ID:101','CAM_ID:12185','CAM_ID:05405','CAM_ID:14222','CAM_ID:14223'];
- Pattern 2: RCU consumed = 1.5 , Returned item count = 30. For the query:
SELECT * from "test" where pk IN ['12185','05405','14222'];
Question -
- How RCU is getting calculated for these two partiQL patterns?
The answer to this question will help me up in designing the query. If IN operator on both pk and sk is bad idea, i will stick with In operator only on pk and do the sk filtering at the application side.
CodePudding user response:
When using PartiQL you need to understand how it converts to the lower level API.
SELECT * from "test"
where pk IN ['12185','05405','14222']
and sk IN ['CELL_ID:101','CAM_ID:12185','CAM_ID:05405','CAM_ID:14222','CAM_ID:14223'];
Here you are using 3 partition keys, but 5 sort keys which will result in 10 operations being executed. PartiQL backend will only allow 10 internal GetItem
requests before you need to paginate, and as your requests are eventually consistent each of the consumes 0.5 RCU:
10 * 0.5 = 5RCU consumed
SELECT * from "test" where pk IN ['12185','05405','14222'];
This will result in 3 Query
operations, each of the queries will be rounded up to the nearest 4KB.
3 * 0.5 = 1.5 RCU consumed