Home > database >  Count day difference between two dates ( in YYYY-DD-MM HH:MM:SS format) from two different columns i
Count day difference between two dates ( in YYYY-DD-MM HH:MM:SS format) from two different columns i

Time:12-24

I have tsv file with two column contains dates in YYYY-DD-MM HH:MM:SS format.

I want to count the day difference between these dates (as Truncated Integer)

for example:

input:

   column1                             column2    
2013-04-26 12:38:33              2013-04-27 12:38:33.757    
2019-06-07 17:46:02.543          2019-06-27 22:46:02.543   
2013-02-06 12:38:33              2013-02-18 22:50:50  

Output :
1
20
12


My dates in column 1 and 2 I tried

gawk -F "\t" '
  {print$0 "\t" 
    split($1,a,"-" || ":" ); 
    split($2,b,"-" || ":" ); 
    t1 = mktime(sprintf("%d %d %d 0 0 0 0",a[1],a[2],a[3])); 
    t2 = mktime(sprintf("%d %d %d 0 0 0 0",b[1],b[2],b[3])); 
    print (t2-t1)/86400
  }'
  

But I got strange number which not reflecting the accurate day difference.

CodePudding user response:

When you need to do it multiple times, it's cleaner/easier to define your own function for converting your date format to epoch time.

printf '%s\t%s\n' \
    '2013-04-26 12:38:33'     '2013-04-27 12:38:33.757' \
    '2019-06-07 17:46:02.543' '2019-06-27 22:46:02.543' \
    '2013-02-06 12:38:33'     '2013-02-18 22:50:50' |

awk -F '\t' '

    function epoch(str,    i) {
        gsub(/[-:]/," ",str)
        return mktime(str) ((i = index(str,".")) ? substr(str,i) : "")
    }

    { print int((epoch($2) - epoch($1)) / 86400) }
1
20
12

notes:

  • mktime doesn't support floating point precision so I re-append the sub-zero part.

  • print int(float) truncates the sub-zero part of float; if your prefer a rounded output then use printf "%d%s", float, ORS

Merry Christmas!

CodePudding user response:

I'm guessing you mean a .csv file (comma-separated value)? I would use R to do this. If you install R (it is a free as in beer download), you should get the Rscript command with it, then you can can use the as.Date() method and do this from the command line or a script

Rscript -e 'as.Date("2013-04-27 12:38:33.757") - as.Date("2013-04-26 12:38:33")'
Time difference of 1 days

Dealing with dates and time differences is devilishly complicated.

  • Related