{
field_1: ["a", "b", "c"],
field_2: ["x", "y", "z"]
}
Can I create a compound index with field_1 and field_2? If not, what is my alternative to get around this?
CodePudding user response:
Experimentally it is pretty easy to confirm that the answer to the first question is "no". As demonstrated in this playground example, attempting to do so will generate an error similar to the following (with the behavior dependent on whether the index or the data was created first):
cannot index parallel arrays [field_2] [field_1]
This limitation is documented here and a request to lift this restriction seems to be tracked here.
The response to the second question is a little more open-ended. How to workaround this depends on your environment and your goals.
The most straightforward thing to do would be to create indexes that only include the more selective of these two fields. If field_1
is reasonably selective (potentially along with other indexed non-array fields), then the filtering the database will have to do for field_2
may be perfectly acceptable from a performance and efficiency perspective.
If you really wanted, though I don't necessarily recommend this, you could modify the schema to change these "parallel arrays" into nested ones. This might look something like the following:
{
arr: [
{ field_1: "a", field_2: ["x", "y", "z"] },
{ field_1: "b", field_2: ["x", "y", "z"] },
{ field_1: "c", field_2: ["x", "y", "z"] }
]
}
You can see in this playground example that the following index can be successfully built with that schema:
{ "arr.field_1": 1, "arr.field_2": 1 }
This is most feasible if the size of one of the arrays is bounded and pretty small.
If neither of these are appropriate then you may need to consider more drastic changes to your schema and overall data model.