With an ~ 18 years old application users file "cases" and each case creates a row in a "journal" table a data base (on SQL 2000). This cases can be tagged with "descriptors" where a somewhere hard coded limit of 50 is set. The descriptors/tags are stored in a lookup table and the key for the descriptors is a number from the power of two sequence (2^n).
This table looks like this:
key | descriptor |
---|---|
1 | D 1 |
2 | D 2 |
4 | D 3 |
8 | D 4 |
16 | D 5 |
There are 50 rows, which means the biggest key is 562.949.953.421.312. Each case can have up to 8 descriptors, which are unfortunately stored in a single column in the case journal table. They keys are stored as a summary of all descriptors on that case.
- A case with the descriptor D2 has 2 in the journal
- A case with the descriptors D2 and D4 has 10
- A case with the descriptors D1, D3 and D5 has 21
The Journal has 100 million records. Now the first time since years there is the requirement to analyze the journal by descriptors. What would be a smart (mathematical) way to query the journal and get the results for one descriptor?
Edit: in answer to the comment of @Squirrel:
jkey | jvalue | descriptors |
---|---|---|
1 | V 1 | 0 |
2 | V 2 | 24 |
3 | V 3 | 3 |
4 | V 4 | 12 |
5 | V 5 | 6 |
CodePudding user response:
You need to use bitwise operators.
Assuming the column is bigint
then
where yourcolumn & 16 > 0
will find the ones matching D5 for example
If you are trying this query for literal values larger than fit into a signed 32 bit int make sure you cast them to BIGINT
by the way as they will be interpreted as numeric
datatype by default which cannot be used with bitwise operators.
WHERE yourcolumn & CAST(562949953421312 AS BIGINT) > 0
You may also similarly need to cast yourcolumn
if it is in fact numeric
rather than bigint