Home > Software design >  How can I filter rows according to a date range when only year, month and day are available?
How can I filter rows according to a date range when only year, month and day are available?

Time:09-29

I have data such as these:

id  year    month   day
1   2020    11  1
2   2020    11  1
3   2020    12  31
4   2020    12  31
5   2021    1   1
6   2021    1   1
7   2021    1   31
8   2021    1   31
9   2021    2   1
10  2021    2   12

I wish to SELECT all rows between 2 dates.

I realise that I could convert the year/month/day to a timestamp which would make this very easy. Actually the timestamp is already there, however, the underlying table is huge (tens of billions of rows) and the DB administrator has set up indexing/clustering on the year/month/day columns in order provide performant queries. Queries that directly use the actual timestamp in a WHERE clause take far too long to run (hours) whereas queries that use year/month/day run within seconds.

Here is a db<>fiddle

CodePudding user response:

Just to clarify, this is a way to do it, but your problem is your DBA, your arquitechture, etc. etc. You won't solve this by this way, neither the time or resources wasted. Maybe you need to ask a proper way to do this with non SQL database in DBA stack exchange site.

Anyway, for the to-know way:

  1. Convert data to datestamp ISO: yyyyMMdd with CONCAT and LPAD
  2. Compare the data as normal integer

Example: (of course you can change the values of your search)

SELECT * from dt WHERE CONCAT(year,LPAD(month, 2, '0'),LPAD(day, 2, '0')) BETWEEN 20201231 AND 20210101
  • Related