Postgres array intersection queries performance

Time:02-11

I have an int[] column, category_ids, with a cardinality of 5.

I have users which look for products that match certain category_ids, and I'm currently returning results based on intersection between the ids they request and ids that match in category_ids. Currently we have only 14 categories, but soon will be adding a lot more (~40). So, if a user wants to find 38/40 categories, that seems like it will be hairy.

I'm trying to learn more about arrays, but I'm completely lost on indexing for this type of querying. How can I improve performance for this kind of query?

A basic high-level example: find me products that match category_ids [1..35], matching them via an array overlap, user_requested_category_ids && my_table.category_ids.
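For clarity, a minimal sketch of the query as currently written; the table name my_table and the literal id list are assumptions, not your actual schema:

```sql
-- Overlap test: return rows whose category_ids shares at least one
-- element with the user's requested ids (placeholder list shown).
SELECT *
FROM my_table
WHERE category_ids && ARRAY[1, 2, 3, 4, 5]::int[];
```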

CodePudding user response:

There is no simple and efficient way to do this. You'll have to unnest both arrays and intersect the results.
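A sketch of that unnest-and-intersect approach, assuming a products table with the category_ids column and a placeholder requested-id array:

```sql
-- For each product, unnest its category_ids and the requested ids,
-- keep the row if the set intersection is non-empty.
SELECT p.*
FROM products AS p
WHERE EXISTS (
    SELECT unnest(p.category_ids)
    INTERSECT
    SELECT unnest(ARRAY[1, 2, 3]::int[])  -- the user's requested ids
);
```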

My recommendation is to avoid data models that store references to other rows in arrays. Instead, use a “junction table” to model m:n relationships. That is the natural way to do it, and your queries will become simpler and probably more efficient. You can also define foreign key constraints that way, so that you cannot end up with inconsistent data.
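A sketch of that junction-table remodeling; the table and column names (product, category, product_category) are illustrative, not taken from your schema:

```sql
-- m:n relationship via a junction table, with foreign keys
-- guaranteeing that every referenced category actually exists.
CREATE TABLE category (
    id   integer PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE product_category (
    product_id  integer NOT NULL REFERENCES product (id),
    category_id integer NOT NULL REFERENCES category (id),
    PRIMARY KEY (product_id, category_id)
);

-- Products matching any of the requested categories; the index
-- backing the primary key makes this lookup efficient.
SELECT DISTINCT p.*
FROM product AS p
JOIN product_category AS pc ON pc.product_id = p.id
WHERE pc.category_id = ANY (ARRAY[1, 2, 3]);  -- requested ids
```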
