Here is the date format used in a bunch of files I have:
$ cat ./file.log
20220405T130001 message1
20220405T130002 message2
20220405T130003 message3
20220405T130004 message4
20220405T130005 message5
I am able to convert it to a usable date format by doing this:
$ cat ./file.log | sed 's/^\(.\{4\}\)/\1-/' | sed 's/^\(.\{7\}\)/\1-/' | sed 's/\(.\{10\}\)./\1 /' | sed 's/^\(.\{13\}\)/\1:/' | sed 's/^\(.\{16\}\)/\1:/'
2022-04-05 13:00:01 message1
2022-04-05 13:00:02 message2
2022-04-05 13:00:03 message3
2022-04-05 13:00:04 message4
2022-04-05 13:00:05 message5
This seems very inefficient. Is there an easier / better way to accomplish this in bash?
The rules for the change would be the following:
- insert - after the first 4 characters
- insert - after the next 2 characters after the previous rule
- replace the T that follows the next 2 characters with a space
- insert : after the next 2 characters after the previous rule
- insert : after the next 2 characters after the previous rule
CodePudding user response:
Assuming you always get the same format in the input file, a single sed command can handle this with multiple capture groups:
sed -E 's/^(.{4})(..)(..)T(..)(..)/\1-\2-\3 \4:\5:/' file
2022-04-05 13:00:01 message1
2022-04-05 13:00:02 message2
2022-04-05 13:00:03 message3
2022-04-05 13:00:04 message4
2022-04-05 13:00:05 message5
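If some lines might not begin with a timestamp, a stricter pattern can help. The following is only a sketch: it assumes the stamp is always 8 digits, a literal T, then 6 digits, and the function name reformat_stamps is a hypothetical helper, not something from the question. Lines that do not match are passed through unchanged.

```shell
# Hypothetical helper: rewrite a leading YYYYMMDDTHHMMSS stamp as
# "YYYY-MM-DD HH:MM:SS". Reads the given files, or stdin if none are given.
reformat_stamps() {
    # Digit classes instead of "." so that non-timestamp lines do not match.
    sed -E 's/^([0-9]{4})([0-9]{2})([0-9]{2})T([0-9]{2})([0-9]{2})([0-9]{2})/\1-\2-\3 \4:\5:\6/' "$@"
}

printf '%s\n' '20220405T130001 message1' 'not a timestamp' | reformat_stamps
# 2022-04-05 13:00:01 message1
# not a timestamp
```

It can be called on a file as well, for example reformat_stamps ./file.log.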
CodePudding user response:
With your shown samples, please try the following awk code. In brief: the field separator is set to T or a space, and the main block uses awk's substr function to print the required substrings of the 1st and 2nd fields (the date and the time), followed by the 3rd field (the message).
awk -F'T| ' '
{
print substr($1,1,4)"-"substr($1,5,2)"-"substr($1,7,2),substr($2,1,2)":"substr($2,3,2)":"substr($2,5,2),$3
}
' Input_file
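Because the field separator above is T or a space, $3 holds only the text up to the next space or capital T, so a longer message would be cut short. A possible variant, sketched under the assumption that every line starts with the stamp, keeps the default field splitting and rewrites just the first field. The name reformat_stamps_awk is a hypothetical helper.

```shell
# Hypothetical helper: rebuild the stamp from the first whitespace-separated
# field and leave the rest of the line exactly as it was.
reformat_stamps_awk() {
    awk '{
        ts = $1    # e.g. 20220405T130001 (character 9 is the T)
        new = substr(ts, 1, 4) "-" substr(ts, 5, 2) "-" substr(ts, 7, 2) " " \
              substr(ts, 10, 2) ":" substr(ts, 12, 2) ":" substr(ts, 14, 2)
        # Replace only the leading run of non-space characters in the line.
        sub(/^[^ ]+/, new)
        print
    }' "$@"
}

printf '%s\n' '20220405T130001 Task started at T0' | reformat_stamps_awk
# 2022-04-05 13:00:01 Task started at T0
```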