Home > Blockchain >  bash sed replace datetime with timestamp format
bash sed replace datetime with timestamp format

Time:10-26

I'm not prof in bash and Linux but I need to preprocess some finance data (OHLC data) in JSON syntax like below:

$ data='
[
{ "t": "2022-09-01T00:00:00", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": "2022-09-01T00:00:15", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": "2022-09-01T00:00:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": "2022-09-01T00:02:45", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": "2022-09-01T00:03:00", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": "2022-09-01T00:03:45", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": "2022-09-01T00:04:45", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": "2022-09-01T00:05:00", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": "2022-09-01T00:05:15", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": "2022-09-01T00:05:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]'

I want to use DATE command to replace "time" field from current format into timestamp format as bellow:

new_data=
[
{ "t": 1661974200, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": 1661974215, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": 1661974245, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": 1661974365, "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": 1661974380, "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": 1661974435, "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": 1661974495, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": 1661974510, "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": 1661974525, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": 1661974555, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]

With help from google I tried to run this command

$ echo "$data" | sed "s/\"([0-9] -[0-9] -[0-9] T[0-9] :[0-9] :[0-9] (\.[0-9]*Z)?)\"/$(date --date=\1  '%s')/g"

but output result has same timestamp for all records!

output=
[
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": 1666733400, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": 1666733400, "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": 1666733400, "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": 1666733400, "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": 1666733400, "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": 1666733400, "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": 1666733400, "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]

finally after much and much failed tries I found that the replace part of SED doesn't pass the matched sub-string and just pass string "\1" instead.

please guide me anyone how to fix the problem. tanks guys

CodePudding user response:

I would suggest to process the file line by line. Also, keep in mind that it's not advised to parse or process JSON in the shell. A simple script such as the following can do the trick for you:

#!/bin/bash

while read -r line; do
 if [[ $line == {* ]]; then
  timestamp=$(echo "$line" | awk '/t/ {t=$3; gsub(/[,"]/, "", t); print t }')
  epoch=$(date --date="$timestamp"  '%s')
  echo "$line" | sed "s/$timestamp/$epoch/"
 else
  echo "$line"
 fi
done < data

It is a basic (and inelegant) solution but you'd avoid having to deal with regexes. Note that in the example above data is a text file and not a env var.

So, for instance, if the data file contains:

[
{ "t": "2022-09-01T00:00:00", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  }, 
{ "t": "2022-09-01T00:00:15", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  }, 
{ "t": "2022-09-01T00:00:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  }, 
{ "t": "2022-09-01T00:02:45", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  }, 
{ "t": "2022-09-01T00:03:00", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  }, 
{ "t": "2022-09-01T00:03:45", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  }, 
{ "t": "2022-09-01T00:04:45", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  }, 
{ "t": "2022-09-01T00:05:00", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  }, 
{ "t": "2022-09-01T00:05:15", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 }, 
{ "t": "2022-09-01T00:05:45", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  }, 
...
]

The output when running the script will be:

[
{ "t": "1661986800", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  },
{ "t": "1661986815", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  },
{ "t": "1661986845", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  },
{ "t": "1661986965", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  },
{ "t": "1661986980", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  },
{ "t": "1661987025", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  },
{ "t": "1661987085", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  },
{ "t": "1661987100", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  },
{ "t": "1661987115", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 },
{ "t": "1661987145", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  },
...
]

CodePudding user response:

Using GNU sed and GNU date

$ new_data=$(sed -E 's/"/#/g;s/([^:]*[^#]*#)([^#]*)(.*)/echo "\1$(date -d \2  '%s')\3"/e;s/#/"/g' <<< "$data")
$ echo "$new_data"

[
{ "t": "1661986800", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":   46.900, "n": 1  },
{ "t": "1661986815", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  299.100, "n": 1  },
{ "t": "1661986845", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":    2.900, "n": 1  },
{ "t": "1661986965", "o": 1.3700, "c": 1.3735, "h": 1.3735, "l": 1.3700, "v":  450.443, "n": 7  },
{ "t": "1661986980", "o": 1.3743, "c": 1.3744, "h": 1.3744, "l": 1.3743, "v":   15.128, "n": 2  },
{ "t": "1661987025", "o": 1.3773, "c": 1.3776, "h": 1.3776, "l": 1.3773, "v":   32.078, "n": 3  },
{ "t": "1661987085", "o": 1.3700, "c": 1.3700, "h": 1.3700, "l": 1.3700, "v":  380.000, "n": 1  },
{ "t": "1661987100", "o": 1.3783, "c": 1.3783, "h": 1.3783, "l": 1.3783, "v":    8.191, "n": 1  },
{ "t": "1661987115", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v": 5654.400, "n": 14 },
{ "t": "1661987145", "o": 1.3800, "c": 1.3800, "h": 1.3800, "l": 1.3800, "v":  427.100, "n": 2  },
...
]
  • Related