Home > OS >  Fixed Sized Bucket - Nested JSON Array
Fixed Sized Bucket - Nested JSON Array

Time:01-04

I have stream of events coming from a particular user. I am using Cosmos DB to store my User profile. Take following example JSON object. Here I just want to store limited set of events, say only 2. As soon as the 3rd event comes in I want to remove older one and add that 3rd event, not exceeding my bucket size. Easy way is for each update I pull the record for that User, and modify and update. Was wondering if there is any other efficient way to achieve the same.

{
   "id":"4717cd3c-78d9-4a0e-bf5d-4645c97bd55c",
   "email":"[email protected]",
   "events":[
      {
         "event":"USER_INSTALL",
         "time":1641232180,
         "data":{
            "app":"com.abc"
         }
      },
      {
         "time":1641232181,
         "event":"USER_POST_INSTALL",
         "data":{
            "app":"com.xyz"
         }
      }
   ]
}

CodePudding user response:

There are no options to limit an array size, within a document. You would need to modify this on your own, as you're currently doing. Even if you stored each array item as a separate document, you would still need to periodically purge older documents on your own. At least, with independent documents per event, you could consider purging older events via ttl, but you still wouldn't be able to specify an exact number of documents to keep.

CodePudding user response:

If you are really looking to optimize the speed of such business logic, then I would recommend:

  1. create a separate table for storing events.
  2. Implement ONLY adding events in the application code.
  3. in the database itself, create a trigger that truncates the number of records by condition.
CREATE TRIGGER tai_events AFTER INSERT ON events FOR EACH ROW EXECUTE PROCEDURE fai_events_clean_count();
CREATE OR REPLACE FUNCTION fai_events_clean_count()
RETURNS trigger AS $$
DECLARE
  events_count int DEFAULT 0;
BEGIN
  SELECT count(*) INTO events_count
  FROM events r
    WHERE
      r.user_id = ...
    IF events_count > 2 THEN
      DELETE
      FROM events tr
      ...

This will stop wasting CPU time on: encoding / decoding json data, formation of various queries and data transformations sql -> java, java -> sql.

After receiving data from the database, it must be translated into the processing language (you have java), then processing is done on the java side, and only then the delete request is formed, and then the delete request is executed, then the insert request is formed. --- All this can be done on the database side, virtually invisible to the application and just as secure.. in a transaction ...

  •  Tags:  
  • Related