I have a couple of apps that my users use and I want to track their usage for API limiting and weekly emails. I'm using the Firebase Realtime NoSQL DB.
I'm struggling to find best practices for NoSQL Database setup for usage tracking when one user could use the app 100s of times one day, and others not at all. I know I need to save timestamps, but I'm not sure the actual setup that's recommended.
Option 1:
<user_id>/<app_name>/<day_integer>
and save a timestamp
array. To get usage over an entire month tho I think I'd have to make 30 requests.
Option 2:
<user_id>/usage/<day_integer>
and save or update key app_name
with the usage count (1, 2, 100, etc). To get usage over an entire month I think I need to make 30 requests.
Option 3:
<user_id>/usage
and save objects with app_name
and timestamp
values. I would have a ton of objects which means I'd have to do a lot of data transfer and filtering.
I only need timestamps to build usage charts. I can't find any article on best practices for database architecture for usage tracking. Does anyone have any insights on what is the best practice?
CodePudding user response:
Honestly the difference between the options is small, and the best option depends on your use-case more than anything else.
In all three scenarios you can get the usage for a user for an app for a range of days with a single operation. In the first two cases, that'd be firebase.database().ref(uid).child(appNameOrUsage).orderByKey().startAt(firstKeyToReturn).limitToFirst(30)
. In the third scenario that'd take a firebase.database().ref(uid).child("usage").orderByChild("timestamp").startAt(firstTimeStampOfMonth).endAfter(lastTimestampOfMonth)
.
Even if you needed multiple requests, that's not nearly as slow as you may think because Firebase pipelines the requests over a single socket connection as explained here: Speed up fetching posts for my social network app by using query instead of observing a single event repeatedly
In NoSQL databases it is typically best to store the data as you show it on the screen. Since you seem to want to show usage per month, I'd actually recommend to (also) store the aggregated usage per month, and per app per month.
Storing duplicate data may be counter-intuitive if you come from a SQL background, but is actually quite common in NoSQL land - and often a primary reason why these database scale so well when it comes to read operations.
So here I'd probably store all aggregates you might want, so:
- per user across all time
- per user for each month
- per user per all across all time
- per user per app for each month
- across all users across all time
- across all users for each month
- etc
The write logic becomes more complex this way, but reading the data for a chart becomes really simple. That's another common pattern in NoSQL databases.
Finally: Firebase Realtime Database, like most NoSQL databases, are not well suited for executing ad-hoc queries. If that is what you need, consider a better solution for the use-case, such as BigQuery if the datasets may become arbitrarily large.