Index Structure and Index Search in Firestore

Time:12-29

I have just read the Firestore indexing documentation and have the following questions to check that I understood the concept correctly:

Assume that I have the following data structure:

 {
    "user_collection": {
        "user1_document":{
            "name": "Joe",
            "age": 21
        },
        "user2_document":{
            "name": "Sarah",
            "age": 29
        },
        "user3_document":{
            "name": "Sarah",
            "age": 24
        }
    }
 }

If I now perform a query that returns every document with the name Sarah, Firestore looks through every index record of the field name and returns every document where the name value equals "Sarah". Did I understand that correctly?

My next question is a little more specific: indexes are sorted (in ascending and descending order). Now, when a query is looking for every document where the user's age is smaller than 20, would Firestore start with the age 21, notice that the smallest age in the user collection is 21, and therefore stop checking any further documents, OR would Firestore still go through all the remaining documents? Generally, is there any information about what algorithm Firestore uses to search indexes, like binary search?

I know this information is irrelevant in terms of working with Firebase, but it just interests me.

CodePudding user response:

If I now perform a query that returns every document with the name Sarah, Firestore looks through every index record of the field name and returns every document where the name value equals "Sarah". Did I understand that correctly?

Yes, and you'll pay one document read for each document the query returns. If your query yields no results, the official Firestore pricing documentation says:

Minimum charge for queries

There is a minimum charge of one document read for each query that you perform, even if the query returns no results.

So if, for example, you filter all users and get no results, you're still charged for 1 read.
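The two billing rules above can be put into a short Python sketch. The function name billed_reads is made up for illustration, and the sketch models only the two rules mentioned in this answer; it ignores every other pricing detail.

```python
def billed_reads(documents_returned: int) -> int:
    """Document reads billed for a single query.

    Assumption: only two rules are modelled, one read per returned
    document and a minimum of one read per query.
    """
    return max(1, documents_returned)


if __name__ == "__main__":
    print(billed_reads(2))  # a query returning both "Sarah" documents
    print(billed_reads(0))  # a query returning nothing
```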

When a query is looking for every document where the user's age is smaller than 20, would Firestore start with the age 21, notice that the smallest age in the user collection is 21, and therefore stop checking any further document OR would Firestore still go through all the remaining documents?

No. When you're looking for every document where the user's age is less than 20, Firestore will return all documents where the age field holds a value that is less than 20. It would have returned documents where the field age holds a value of 20 if you were looking for every document where the user's age is less than or equal to 20.

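That said, Firestore does not check every document in the collection against the value. According to the documentation, every query is served from an index, so query performance depends on the size of the result set rather than on the number of documents in the collection.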

Generally, is there any information about what algorithm Firestore uses to search indexes, like binary search?

I'm not aware of anything public about the Firestore algorithm, but if I find something I will update my answer.

Please also note that in Firestore, we are not only charged based on the number of reads/writes/deletes we perform but also based on space. So we have to pay for what we consume, including storage overhead. What does that mean? It means that we have to pay for the metadata, automatic indexes, and composite indexes.

CodePudding user response:

In short, a single-field index can be considered a value -> docId mapping. For your database structure, an index on the field 'name' would look like this:

"Joe": "user1_id",
"Sarah": "user2_id",
"Sarah": "user3_id",

For an index on field age, the index structure would be:

"21": "user1_id",
"24": "user3_id",
"29": "user2_id",

When you run a query and an index that supports it exists, Firestore only has to read the matching index entries.
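To make the value -> docId idea concrete, here is a minimal Python sketch that builds such a mapping as a sorted list and answers an equality query from it. It only illustrates the concept: build_index and equal_to are made-up helper names, and nothing here is Firestore's actual implementation.

```python
from bisect import bisect_left, bisect_right

# Sample documents from the question, keyed by a hypothetical document ID.
users = {
    "user1_id": {"name": "Joe", "age": 21},
    "user2_id": {"name": "Sarah", "age": 29},
    "user3_id": {"name": "Sarah", "age": 24},
}


def build_index(docs, field):
    """Return (value, doc_id) entries for one field, sorted ascending."""
    return sorted((doc[field], doc_id) for doc_id, doc in docs.items())


def equal_to(index, value):
    """Return the IDs of all entries whose indexed value equals `value`."""
    # The key list is rebuilt here only to keep the sketch short.
    keys = [key for key, _ in index]
    start = bisect_left(keys, value)
    end = bisect_right(keys, value)
    return [doc_id for _, doc_id in index[start:end]]


name_index = build_index(users, "name")

if __name__ == "__main__":
    print(equal_to(name_index, "Sarah"))
```

Because the entries are sorted, all "Sarah" entries sit next to each other, so the lookup reads one contiguous slice instead of every entry.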

Every document where the user's age is smaller than 20, would Firestore start with the age 21,

In the case of where("age", "<", 20), when no document matches the query, there are no matching index entries, so no data is returned and no other entries are read. However, it'll still cost you a read, as Alex mentioned.
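A sorted index makes this kind of early stop easy to see. The Python sketch below runs a "less than" query against a sorted age index with a binary search. It is an assumption-laden illustration: less_than is a made-up helper, and Firestore's real search algorithm is not public as far as this thread knows.

```python
from bisect import bisect_left

# Sorted (age, doc_id) entries built from the sample data in the question.
age_index = [(21, "user1_id"), (24, "user3_id"), (29, "user2_id")]


def less_than(index, limit):
    """Return the IDs of entries whose age is strictly below `limit`."""
    # The key list is rebuilt here only to keep the sketch short.
    keys = [age for age, _ in index]
    # Binary search for the first entry that is >= limit; everything
    # before that position matches, everything after it is never read.
    end = bisect_left(keys, limit)
    return [doc_id for _, doc_id in index[:end]]


if __name__ == "__main__":
    print(less_than(age_index, 20))
    print(less_than(age_index, 25))
```

For a limit of 20 the search lands on position 0, because the smallest age is 21, so the result is empty without looking at the remaining entries.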


Additionally, if you want to query based on both the fields, you would need a composite index e.g. { name: ASC, age: ASC }:

{"Joe", 21}: "user1_id",
{"Sarah", 24}: "user3_id",
{"Sarah", 29}: "user2_id"
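As a rough illustration of why the composite index helps, the Python sketch below stores (name, age, doc_id) tuples in sorted order and answers "name equals X and age is below Y" by slicing one contiguous range. The helper name name_equals_age_below is made up, and the sketch is not a description of Firestore internals.

```python
from bisect import bisect_left

# Sorted (name, age, doc_id) entries built from the sample data in the question.
composite_index = [
    ("Joe", 21, "user1_id"),
    ("Sarah", 24, "user3_id"),
    ("Sarah", 29, "user2_id"),
]


def name_equals_age_below(index, name, max_age):
    """Return the IDs of entries with the given name and age < max_age."""
    # A shorter tuple sorts before any longer tuple sharing its prefix,
    # so (name,) marks the start of the block of entries for that name.
    start = bisect_left(index, (name,))
    # (name, max_age) sorts before every entry with age >= max_age.
    end = bisect_left(index, (name, max_age))
    return [doc_id for _, _, doc_id in index[start:end]]


if __name__ == "__main__":
    print(name_equals_age_below(composite_index, "Sarah", 25))
```

Sorting by name first and age second is what keeps all matches adjacent; with the fields in the opposite order, the same query could not be answered from a single slice.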

Whenever you create a new document, every related index is updated, so creating many indexes may slow down write operations. Databases such as MongoDB generally use B-trees for their indexes. If you are curious about how Firestore implements its indexes, it might be a good idea to contact Firebase.
