Getting entry from many to many mapping table-CodePudding

I have two tables that are mapped together with a mapping table; the tables are as follows:

data
metadata
matadata_map

metadata_map is mapping data together with metadata. In the metadata_map table, I need to map some IDs together. I need to find out which data ids have X, Y .. N metadata ids. The metadata_map table could look like the following:

--------------------
| data_id| meta_id |
--------------------
| 1      | 2       |
| 1      | 3       |
| 2      | 1       |
| 2      | 2       |
| 3      | 1       |
| 3      | 2       |
| 3      | 3       |
| 3      | 4       |
--------------------

In this example, I will try to find out which data_ids has both meta_ids 2 and 3. When looking up in the metadata_map table, I should get the result: (1, and 3)

----------
| data   |
----------
| 1      |
| 3      |
----------

since both data_id 1 and 3 has meta_id 2 and 3.

My question is, how can I best match them together in PostgreSQL to get the result? This also requires that it's highly performant because the table has, at the moment of typing, about ~700mil entries.

CodePudding user response：

Alternative - using an array:

 select data_id
 from metadata_map 
 group by data_id
 having array_agg(meta_id) @> array[2,3];

This query retrieves data_ids that have both meta_id 2 and meta_id 3 but may have others too.

having array_agg(meta_id order by meta_id) = array[2,3];

would retrieve data_ids that have exactly meta_id 2 and meta_id 3 and no others.
The pattern can be reused with other array operators too.

CodePudding user response：

You can use grouping

 select data_id
 from metadata_map 
 where meta_id in (2,3)
 group by data_id
 having count(*) = 2

The query assumes metadata_map contains no dups. Alternatively use count(distinct meta_id).