I have a collection of about 2,000 documents, and I want to stream a List of just their IDs, as follows:
Stream<QuerySnapshot> stream = FirebaseFirestore.instance.collection('someCollection').snapshots();
return stream.map((querySnapshot) => querySnapshot.docs.map((doc) => doc.id).toList());
Will this be a significant performance issue on the client side? Is this an unrealistic approach?
CodePudding user response:
When you query for documents in Firestore using the client SDKs, you are pulling down the entire document every time. There is no way to reduce the amount of data - all of the matching documents are sent across the wire in their entirety.
As such, your use of map() to extract only the document ID has no real effect on performance here, since it runs after the query is complete and all of that data is already in a snapshot on the client. All you are doing is trimming each document down to a string; you are not saving the cost of transferring the entire document.
If you want to make this faster, you should make the query on a backend (such as Cloud Functions), ideally in the same region as your Firestore instance, and trim the documents in your backend code before you send the data to the frontend. That will save you the cost of unnecessarily transferring document contents you don't need.
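As a rough sketch of the client half of that approach, the Flutter app could call a callable Cloud Function and receive only the IDs. The function name `listDocumentIds` and the response shape `{'ids': [...]}` are assumptions made for illustration; the backend itself (not shown) would run the query with a server SDK and return only the document IDs.

```dart
import 'package:cloud_functions/cloud_functions.dart';

/// Extracts the ID list from the callable's response payload.
/// Assumes the (hypothetical) backend responds with `{'ids': [...]}`.
List<String> parseIds(Object? data) {
  final ids = (data as Map)['ids'] as List;
  return ids.cast<String>().toList();
}

/// Asks the backend for the IDs instead of downloading every document.
/// Assumes Firebase.initializeApp() has already completed.
Future<List<String>> fetchDocumentIds() async {
  // 'listDocumentIds' is a hypothetical function name, not an existing API.
  final callable = FirebaseFunctions.instance.httpsCallable('listDocumentIds');
  final result = await callable.call();
  return parseIds(result.data);
}
```

Note that a callable returns a single response rather than a live stream, so this trades the realtime updates of snapshots() for a smaller transfer.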
Read: Should I query my database directly or use Cloud Functions?
CodePudding user response:
Performance implications will mostly come from:
- The bandwidth consumed for transferring the documents.
- The memory used for keeping the DocumentSnapshot objects.
Since you're throwing away the document.data() of each snapshot, that is where the quickest gains will be. If your documents have few fields, the gains will be small. If they have many fields, the gains will be larger.
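To get a feel for the scale, here is a back-of-the-envelope estimate. The average document sizes (200 and 5,000 bytes) are hypothetical, and the 20-byte ID assumes Firestore auto-generated IDs; real payloads also carry metadata and protocol overhead, so treat the numbers as rough.

```dart
// Rough, illustrative estimate only; the per-document sizes are hypothetical.
int fullTransferBytes(int docCount, int avgDocBytes) => docCount * avgDocBytes;

int idOnlyTransferBytes(int docCount, int idBytes) => docCount * idBytes;

void main() {
  const docCount = 2000; // collection size from the question
  const idBytes = 20; // assumes auto-generated IDs (20 characters)

  // Hypothetical average sizes: documents with few fields vs. many fields.
  for (final avgDocBytes in [200, 5000]) {
    final full = fullTransferBytes(docCount, avgDocBytes);
    final idsOnly = idOnlyTransferBytes(docCount, idBytes);
    print('avg $avgDocBytes B/doc: full=$full B, ids only=$idsOnly B');
  }
}
```

Under these assumptions the IDs alone come to about 40 KB, against roughly 0.4 MB for small documents and 10 MB for large ones.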
If you have a lot to gain by not transferring the fields and keeping them in memory, the main options you have are:
- Use a server-side SDK to get only the document IDs, and transfer only that back to the client.
- Create secondary/proxy documents that contain only the ID, and no data.
While the first approach is tempting because it reduces data duplication, the second one is typically a lot simpler to implement (as you're only going to be impacting the code that handles data writes).
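A minimal sketch of the second option, assuming a hypothetical proxy collection named `someCollectionIds` whose documents share the IDs of the real ones and hold no fields. The batched write keeps the two collections in step, and the client streams the lightweight collection instead of the full one.

```dart
import 'package:cloud_firestore/cloud_firestore.dart';

// Assumes Firebase.initializeApp() has completed before first use.
final _db = FirebaseFirestore.instance;

/// Writes the full document and an empty proxy document with the same ID
/// in a single atomic batch.
/// 'someCollectionIds' is a hypothetical collection name.
Future<void> createWithProxy(String id, Map<String, dynamic> data) {
  final batch = _db.batch();
  batch.set(_db.collection('someCollection').doc(id), data);
  batch.set(_db.collection('someCollectionIds').doc(id), <String, dynamic>{});
  return batch.commit();
}

/// Streams only the IDs by listening to the proxy collection,
/// so no field data is transferred or kept in memory.
Stream<List<String>> streamIds() {
  return _db
      .collection('someCollectionIds')
      .snapshots()
      .map((snapshot) => snapshot.docs.map((doc) => doc.id).toList());
}
```

Deletes need the same treatment: whatever code removes a document from someCollection should remove its proxy in the same batch, otherwise the ID list drifts out of date.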