Home > Back-end >  daylight saving time issue on converting date through epoch time in sqlite
daylight saving time issue on converting date through epoch time in sqlite

Time:09-22

I am working with a very big data where the precision is very important and am observing human behavior at different hours of the day. I use the following code to extract the date from the epoch date (using rsqlite - basically sqlite only enables use within R) queries are built the same) :

dbGetQuery(db,"SELECT datetime(started_at, 'unixepoch', 'utc','-4 hours') FROM biketransactions limit 3")

The data I work with is recorded in New York, but one problem is New York is not always 4 hours behind the UTC time zone. Without daylight savings, it is a -5 hours difference. Using this method I will get the correct hour stamps for during "summer" but the data in "winter-time" will always be assumed an hour later. Which would greatly affect my end results. What are some ways I can convert the epoch time taking day light savings into account?

CodePudding user response:

Here's 3 ways,

  • the first uses a table with the daylight savings start and end date and times.

  • the second uses a CTE (Common Table Expression) to dynamically build a single row table just for the current year (possibly not suitable)

  • the third takes the basis of the second a little further in that it incorporates the CTE (dynamic determination of the daylight savings period) based upon the year from the started_at date. So it can handle any date (assuming the determination of the daylights savings remains as from the 2 am on the 2nd Sunday of March until 01:59:59 on the 1st Sunday in November).

  • You may wish to check out the 3rd solution first.


First solution


One way would be to have a table (dst in the demo) with the start and end date and times of daylight savings.

You could then adjust the -5 hour offset according to by -1 (true) or -0 (false) if the started_at is BEWTEEN the two dates for the respective year.

e.g. datetime(started_at,'unixepoch','-'||(5 - (started_at BETWEEN dst_start AND dst_end))||' hours')

Here's a demo:-

DROP TABLE IF EXISTS dst;
DROP TABLE IF EXISTS bt;
/* Create and populate DST table  with start of ds and end of ds*/
CREATE TABLE IF NOT EXISTS dst (dst_start INTEGER, dst_end INTEGER);
INSERT INTO dst VALUES 
    (strftime('%s','2022-03-13 02:00:00'),strftime('%s','2022-11-06 01:59:59')),
    (strftime('%s','2023-03-12 02:00:00'),strftime('%s','2023-11-05 01:59:59')),
    (strftime('%s','2024-03-10 02:00:00'),strftime('%s','2024-11-03 01:59:59')),
    (strftime('%s','2025-03-09 02:00:00'),strftime('%s','2025-11-02 01:59:59'))
;
/* Create the biketransactions table and load with some test data */
CREATE TABLE IF NOT EXISTS bt (started_at INTEGER);
with cte(started_at) AS (
    SELECT strftime('%s','now') 
    UNION ALL SELECT strftime('%s',started_at,'unixepoch','10 days') FROM cte LIMIT 100
    )
INSERT INTO bt SELECT * FROM cte;



/* select 3 values based upon the bt started_at 
    1. The actual stored date
    2. Whether (1) or not (0) daylight savings is applicable
    3. the adjusted datetime    
*/
SELECT datetime(started_at,'unixepoch') AS UTC,
    started_at BETWEEN dst_start AND dst_end AS dstflag,
    datetime(started_at,'unixepoch','-'||(5 - (started_at BETWEEN dst_start AND dst_end))||' hours') AS adjusted
    
FROM bt 
/* join gets the applicable row from the dst table i.e. the row for the year */
JOIN dst ON strftime('%Y',started_at,'unixepoch') = strftime('%Y',dst_start,'unixepoch')
;

And the result including (dst datetimes highlighted):-

enter image description here

enter image description here


Second solution


An alternative would be to determine just the current daylight savings start and end based upon the rule that it starts on the 2nd Sunday of March and ends on the 1st Sunday in November. However this, as shown, is limited to dates for the current year:-

Here's a demo:-

DROP TABLE IF EXISTS bt;
/* Create the biketransactions table and load with some test data */
CREATE TABLE IF NOT EXISTS bt (started_at INTEGER);
with cte(started_at) AS (
    SELECT strftime('%s','now') 
    UNION ALL SELECT strftime('%s',started_at,'unixepoch','10 days') FROM cte LIMIT 200
    )
INSERT INTO bt SELECT * FROM cte;

/* 
    as a CTE (Commaon Tabel Expression) to determine the start and end dates and times of daylight savings for the current year
    note that this is restricted to only working for dates in the current year
*/
WITH ds AS (SELECT 
    strftime('%Y','now') AS year, /* YEAR */
    strftime('%s',strftime('%Y','now')||'-03-01 02:00:00',' '||(14 - strftime('%w',strftime('%Y','now')||'-03-01'))||' days') AS dst_start,
    strftime('%s',strftime('%Y','now')||'-11-01 01:59:59',' '||(7 - strftime('%w',strftime('%Y','now')||'-11-01'))||' days') AS dst_end
    )
SELECT 
    datetime(started_at,'unixepoch'), 
    started_at BETWEEN (SELECT dst_start FROM ds) AND (SELECT dst_end FROM ds),
    datetime(started_at,'unixepoch','-'||(5-(started_at BETWEEN (SELECT dst_start FROM ds) AND (SELECT dst_end FROM ds)))||' hours') AS adjusted 
    FROM bt 
;

The results being :-

enter image description here

  • as can be seen the above, as it is, will not have the correct time when daylight savings apply for any other year than the current year.

3rd (perhaps the best solution)


The first solution requires an additional table to be maintained, the second solution, does not cater well for dates other than in the current year. This third option takes the second option further by using the year fro the started_at date as the basis of the adjustment. Therefore this is perhaps the preferable solution:-

SELECT 
    datetime(started_at,'unixepoch') AS unadjusted,
    CASE 
        WHEN
            started_at 
            BETWEEN 
                strftime(
                    '%s',strftime('%Y',started_at,'unixepoch') /* Year according to the started_at date */
                        ||'-03-01 02:00:00',' ' /* 1st March */
                        ||(14 - strftime('%w',strftime('%Y',started_at,'unixepoch')||'-03-01'))||' days' /* 2nd Sunday */
                        )
            AND
                strftime(
                    '%s',strftime('%Y',started_at,'unixepoch') /* Year according to the started_at date */
                        ||'-11-01 01:59:59',' ' /* 1st Nov */
                        ||(7 - strftime('%w',strftime('%Y',started_at,'unixepoch')||'-11-01'))||' days' /* 1st Sunday */
                        )
        THEN /* i.e. if started_at is between daylight savings date thus 1 */ datetime(started_at,'unixepoch','-4 hours')
        ELSE /* if not between the daylight savings dates thus 0*/ datetime(started_at,'unixepoch','-5 hours')
    END as adjusted
FROM bt;

This, using the same means of loading data into the biketransactions (bt for brevity) table, results in :-

enter image description here

and later on :-

enter image description here

  • Just the actual and the adjusted date and times.
  • dates within the daylight savings periods are in the highlighted blocks
  • Related