Home > Software engineering >  Presto SQL date_format extract week of year
Presto SQL date_format extract week of year

Time:11-30

Documentation : https://prestodb.io/docs/current/functions/datetime.html

I have epoch timestamps from which I want to extract week of the year like 2021-32, 2020-50, 2021-02 and so on.

SELECT concat(date_format(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000), '%Y'), 
            '-' ,
            date_format(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000), '%v')) 
AS week

However I am getting up some wrong values like : week = 2021-53 for Epoch-Time corresponding to Jan 1, 2021 or Jan 2, 2021. I understand that there is a sync issue happening here but that is definitely not something I want - How do I offset so that first day of week 1 starts from the beginning of the year.

CodePudding user response:

In that case you should just get the count of days and calculate the week. Something like this:

SELECT concat(date_format(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000), '%Y'), 
            '-' ,
            ceiling(date_format(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000), '%j')/7)) 
AS week

CodePudding user response:

There is no sync issue here - it expected behaviour for date_format and it's MySQL counterpart.

Note, that you can use full format string on the date:

select date_format(timestamp '2021-01-01', '%Y-%v')
_col0
2021-53

You can try calculating week number by dividing day_of_year by 7:

select ceiling(day_of_year(timestamp '2021-01-01') / 7.0)

Related:

CodePudding user response:

I was able to solve it using week_of_year and year_of_week methods.
Docs : https://prestodb.io/docs/current/functions/datetime.html#week

Query :

SELECT concat(CAST(year_of_week(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000)) AS varchar(15)),
              '-',
              CAST(week_of_year(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000)) AS varchar(15))) as week

Had to introduce some extra casts to varchar since concat doesn't support multiple datatypes.

Edit : Another issue with above solution is that week_of_year returns single-digit weeks like 1, 2 instead of 01, 02 - thus causing an issue while sorting with these week numbers.

I ended up using a combination of year_of_week to get the correct year, and the general date_format(date, '%v') for week-numbers.

Updated Query :

SELECT concat(CAST(year_of_week(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000)) AS varchar(15)),
              '-',
              date_format(from_unixtime((CAST(my_timestamp AS BIGINT)   19800000)/1000), '%v'))
  • Related