I have data such as these:
id year month day
1 2020 11 1
2 2020 11 1
3 2020 12 31
4 2020 12 31
5 2021 1 1
6 2021 1 1
7 2021 1 31
8 2021 1 31
9 2021 2 1
10 2021 2 12
I wish to SELECT all rows between 2 dates.
I realise that I could convert the year/month/day to a timestamp which would make this very easy. Actually the timestamp is already there, however, the underlying table is huge (tens of billions of rows) and the DB administrator has set up indexing/clustering on the year/month/day columns in order provide performant queries. Queries that directly use the actual timestamp in a WHERE clause take far too long to run (hours) whereas queries that use year/month/day run within seconds.
Here is a db<>fiddle
CodePudding user response:
Just to clarify, this is a way to do it, but your problem is your DBA, your arquitechture, etc. etc. You won't solve this by this way, neither the time or resources wasted. Maybe you need to ask a proper way to do this with non SQL database in DBA stack exchange site.
Anyway, for the to-know way:
- Convert data to datestamp ISO: yyyyMMdd with CONCAT and LPAD
- Compare the data as normal integer
Example: (of course you can change the values of your search)
SELECT * from dt WHERE CONCAT(year,LPAD(month, 2, '0'),LPAD(day, 2, '0')) BETWEEN 20201231 AND 20210101