create a list of date and time in shell script

Time:09-11

I am trying to find the dates missing from a file, "date_metar". To do that, I want to generate the complete list of dates with a shell script, write it to the file "date_correct", and diff the two files. The format is %d%H%M, in increments of 30 minutes. I get this error: line 9: [[: 2022-01-01T00: value too great for base (error token is "01T00")

The script:

#!/bin/sh 
strdate='2022-01-01T00:00'
enddate='2022-01-31T23:30'
while [[ ${strdate} -le ${enddate} ]] ; do 
echo $strdate>>date_correct 
strdate=$(date -d "$strdate 30 minute" +%d%H%M) 
done 
diff date_metar date_correct >output

CodePudding user response:

Your best bet for generating the range of valid dates/times will probably come from combining two ideas:

  • use epoch seconds for comparisons and math
  • use awk (or comparable program) to replace the time-consuming bash/while loop
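The first bullet matters because [[ a -le b ]] evaluates both operands as integer arithmetic, so a string like 2022-01-01T00:00 cannot be parsed (hence the "value too great for base" error). A minimal sketch of the epoch-seconds comparison, assuming bash and GNU date; the helper names to_epoch and is_not_after are made up for illustration:

```shell
#!/usr/bin/env bash
# to_epoch is a hypothetical helper: timestamp -> seconds since the epoch.
# Assumes GNU date (the -d option).
to_epoch() {
    date -d "$1" +%s
}

# is_not_after A B: succeeds when timestamp A is at or before timestamp B.
is_not_after() {
    local a b
    a=$(to_epoch "$1") || return 2
    b=$(to_epoch "$2") || return 2
    [[ $a -le $b ]]     # both operands are plain integers now
}

if is_not_after '2022-01-01T00:00' '2022-01-31T23:30'; then
    echo "start is not after end"
fi
```

Once both values are integers, the same -le test that failed on the raw strings works as intended.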

One epoch(secs) / awk idea:

strdate='2022-01-01T00:00'
enddate='2022-01-31T23:30'

strdate_s=$(date -d "${strdate}" +%s)
enddate_s=$(date -d "${enddate}" +%s)

inc_m=30
((inc_s = inc_m * 60))

awk -v ss="${strdate_s}" -v es="${enddate_s}" -v inc="${inc_s}" '
BEGIN { while ( ss <= es ) {
              print strftime("%d%H%M", ss)
              ss += inc
        }
      }
' > date_correct

NOTE: as Fravadona mentioned in the comments, strftime() is a GNU awk (aka gawk) extension and is not available in every awk
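If it is unclear whether a given awk provides strftime(), one way to find out is simply to try it. A small sketch; awk_has_strftime is a hypothetical helper name, and the list of binaries probed is only an example:

```shell
# awk_has_strftime is a hypothetical helper: it succeeds only if the named
# awk binary can run a program that calls strftime().
awk_has_strftime() {
    "$1" 'BEGIN { x = strftime("%Y", 0) }' >/dev/null 2>&1
}

for candidate in gawk mawk awk; do
    if awk_has_strftime "$candidate"; then
        echo "$candidate: strftime() available"
    else
        echo "$candidate: strftime() missing, or $candidate is not installed"
    fi
done
```

An awk without strftime() should reject the program and exit non-zero, which is what the helper relies on.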

To show the performance improvement of using awk instead of the bash/while loop we'll modify OP's current code to use the epoch(secs) approach:

strdate='2022-01-01T00:00'
enddate='2022-01-31T23:30'

strdate_s=$(date -d "${strdate}" +%s)
enddate_s=$(date -d "${enddate}" +%s)

inc_m=30
((inc_s = inc_m * 60))

while [[ "${strdate_s}" -le "${enddate_s}" ]] ; do
    date -d "@${strdate_s}" +%d%H%M >> date_correct2
    ((strdate_s += inc_s))
done

A diff of the outputs shows that both sets of code generate the same output:

$ diff date_correct date_correct2
               <<<=== no output

Results of running both processes under time:

# awk

real    0m0.042s
user    0m0.015s
sys     0m0.015s

# bash/while

real    0m46.412s
user    0m6.727s
sys     0m27.314s

So awk is about 1100x faster than a comparable bash/while loop.

If the sole purpose of this date/time-generating code is simply to find the missing dates/times in the date_metar file then OP may want to consider using a single awk script to eliminate the need for the date_correct file and still determine what dates/times are missing from date_metar ... but that's for another Q&A ...
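For what it's worth, here is a rough sketch of that single-pass idea. It is written in bash rather than awk, so it is not the awk script alluded to above; it assumes bash 4.2+ (for associative arrays and printf '%(...)T'), GNU date, and a date_metar file holding one %d%H%M stamp per line. The function name find_missing is made up for illustration:

```shell
#!/usr/bin/env bash
# find_missing START END FILE   (hypothetical helper, not from the answer)
# Prints every %d%H%M stamp between START and END, in 30-minute steps,
# that does not appear as a line in FILE.
find_missing() {
    local start_s end_s t stamp line
    local -A seen=()
    start_s=$(date -d "$1" +%s) || return 1
    end_s=$(date -d "$2" +%s) || return 1

    # Remember every stamp already present in the file.
    while IFS= read -r line; do
        [[ -n $line ]] && seen["$line"]=1
    done < "$3"

    # Walk the expected range and report the stamps that were never seen.
    for ((t = start_s; t <= end_s; t += 1800)); do
        printf -v stamp '%(%d%H%M)T' "$t"
        [[ -n ${seen[$stamp]:-} ]] || echo "$stamp"
    done
}
```

Usage would be along the lines of: find_missing '2022-01-01T00:00' '2022-01-31T23:30' date_metar > output. Since %d%H%M carries no month or year, this only makes sense for ranges that stay within one month.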


Looking a bit more into the performance issues of the bash/while loop ...

Replacing the date call with a comparable printf -v call:

while [[ "${strdate_s}" -le "${enddate_s}" ]] ; do
    printf -v new_date '%(%d%H%M)T' "${strdate_s}"
    echo "${new_date}" >> date_correct2
    ((strdate_s += inc_s))
done

We see overall time is reduced from 46 secs to 10 secs:

real    0m10.127s
user    0m0.141s
sys     0m0.312s

We should be able to get a further improvement by moving the >> date_correct2 to after the done, thus replacing roughly 1500 file open/close operations (echo ... >> date_correct2, once per loop iteration) with a single file open/close operation (done > date_correct2):

while [[ "${strdate_s}" -le "${enddate_s}" ]] ; do
    printf -v new_date '%(%d%H%M)T' "${strdate_s}"
    echo "${new_date}"
    ((strdate_s += inc_s))
done  > date_correct2

This speeds up the process by ~50x (10 secs down to 0.2 secs):

real    0m0.198s
user    0m0.141s
sys     0m0.000s

This reduces the bash/while loop's slowdown relative to awk from roughly 1100x to about 5x.

CodePudding user response:

strftime() does not require GNU awk (gawk); mawk provides it as well:

mawk1 'BEGIN { fmt = "%Y-%m-%d %H:%M:%S %Z ( epochs %s | %Y-%j )"

print ORS, systime(), ORS ORS, strftime(fmt, systime()), ORS }'
1662840559 

2022-09-10 16:09:19 EDT ( epochs 1662840559 | 2022-253 ) 

You'll get a very tiny, almost statistically insignificant, speed gain by using mawk-1:

{m,g}awk -v  __='2022 01 01 00 00 00' \
         -v ___='2022 01 31 23 30 00' '

BEGIN {  _*= (_ +=(_ +=(_^=_<_) +_)^_) +_   # _ ends up as 1800 (30 minutes in seconds)
        __ = mktime(__)
       ___ = mktime(___)

      ____ = "%Y-%m-%d %H:%M:%S %Z ( %s | %Y-%j )"
do {
     print __, strftime(____,__) } while ((__ +=_)<=___) }'
out9: 78.5KiB 0:00:00 [64.4MiB/s] [64.4MiB/s] [<=> ]
( mawk -v __='2022 01 01 00 00 00' -v ___='2022 01 31 23 30 00' -- ; )
0.01s user 0.00s system 89% cpu 0.019 total

  1484  1643682600 2022-01-31 21:30:00 EST ( 1643682600 | 2022-031 )
  1485  1643684400 2022-01-31 22:00:00 EST ( 1643684400 | 2022-031 )
  1486  1643686200 2022-01-31 22:30:00 EST ( 1643686200 | 2022-031 )
  1487  1643688000 2022-01-31 23:00:00 EST ( 1643688000 | 2022-031 )
  1488  1643689800 2022-01-31 23:30:00 EST ( 1643689800 | 2022-031 )

|

 out9: 78.5KiB 0:00:00 [17.0MiB/s] [17.0MiB/s] [<=> ]
 ( gawk -v __='2022 01 01 00 00 00' -v ___='2022 01 31 23 30 00' -be ; )
 0.02s user 0.01s system 85% cpu 0.026 total

  1484  1643682600 2022-01-31 21:30:00 EST ( 1643682600 | 2022-031 )
  1485  1643684400 2022-01-31 22:00:00 EST ( 1643684400 | 2022-031 )
  1486  1643686200 2022-01-31 22:30:00 EST ( 1643686200 | 2022-031 )
  1487  1643688000 2022-01-31 23:00:00 EST ( 1643688000 | 2022-031 )
  1488  1643689800 2022-01-31 23:30:00 EST ( 1643689800 | 2022-031 )