How to use sed to group date/time?

Time:11-13

I have a text file with lines like this:

7304628626|duluth/superior|18490|2016|volvo|gas|49230|automatic|sedan|white|mn|46.815216|-92.178109|2021-04-10T08:46:33-0500

I want to change the text 2021-04-10T08:46:33-0500 to 10/04/2021 08:46:33.

I tried this command:

sed -n "s/|\([0-2][0-9][0-9][0-9]\)-\([0-1][0-9]\)-\([1-3][0-9]\)\(T\)\([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\)\(-[0-1][0-9][0][0]\)/|\3\/\2\/\1 \5 /p" filename 

but some lines are not changed.

CodePudding user response:

Using sed

$ sed 's/\(.*|\)\([^-]*\)-\([^-]*\)-\([^T]*\)T\([^-]*\).*/\1\4\/\3\/\2 \5/' input_file
7304628626|duluth/superior|18490|2016|volvo|gas|49230|automatic|sedan|white|mn|46.815216|-92.178109|10/04/2021 08:46:33

\(.*|\) - Match up to the last occurrence of the | pipe symbol. Stores the leading fields, which can be returned with the \1 back reference

\([^-]*\) - Match up to the next occurrence of the - hyphen. Used twice, it stores 2021 and 04, which can be returned with the \2 and \3 back references

\([^T]*\) - Match up to the next occurrence of the capital T. Stores 10, which can be returned with the \4 back reference

T - Match the T without capturing it, so it is dropped

\([^-]*\) - Match up to the next occurrence of the - hyphen. Stores 08:46:33, which can be returned with the \5 back reference

.* - Match everything else without capturing it, so it is dropped

If your intent is to return only the date and time, you can remove the first back reference from the replacement:

$ sed 's/\(.*|\)\([^-]*\)-\([^-]*\)-\([^T]*\)T\([^-]*\).*/\4\/\3\/\2 \5/' input_file
10/04/2021 08:46:33
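One likely reason the command in the question leaves some lines untouched is its day class [1-3][0-9], which cannot match days 01 to 09 (its offset class -[0-1][0-9][0][0] would likewise skip offsets such as +0100 or -0430). The patterns above avoid this because [^T]* accepts any day. A minimal sketch of the difference, assuming a made-up sample line with day 05; the function names op_convert and fixed_convert are hypothetical and exist only for this illustration:

```shell
# Hypothetical helper: the question's pattern, unchanged apart from single quotes.
# Its day class is [1-3][0-9].
op_convert() {
  sed -n 's/|\([0-2][0-9][0-9][0-9]\)-\([0-1][0-9]\)-\([1-3][0-9]\)\(T\)\([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\)\(-[0-1][0-9][0][0]\)/|\3\/\2\/\1 \5 /p'
}

# Hypothetical helper: same pattern with the day class widened to [0-3][0-9].
fixed_convert() {
  sed -n 's/|\([0-2][0-9][0-9][0-9]\)-\([0-1][0-9]\)-\([0-3][0-9]\)\(T\)\([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\)\(-[0-1][0-9][0][0]\)/|\3\/\2\/\1 \5 /p'
}

# Assumed sample line (not from the question's data): the day is 05.
line='mn|46.815216|-92.178109|2021-04-05T08:46:33-0500'

printf '%s\n' "$line" | op_convert     # prints nothing: no match, and -n suppresses the line
printf '%s\n' "$line" | fixed_convert  # prints the line with the date converted
```

Note that the question's replacement ends with a space, so the converted field carries a trailing blank; drop that space from the replacement if it is unwanted.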

CodePudding user response:

With your shown samples, please try the following sed program.

sed -E 's/(.*\|)([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}:[0-9]{2}:[0-9]{2})-.*/\1\4\/\3\/\2 \5/' Input_file

Explanation: This uses sed's back references to store the matched values and reuse them in the substitution. The -E option enables ERE (extended regular expressions), and the s command performs the substitution. The pattern creates 5 capturing groups: the 1st matches 7304628626|duluth/superior|18490|2016|volvo|gas|49230|automatic|sedan|white|mn|46.815216|-92.178109| (everything up to and including the last pipe), the 2nd matches 2021, the 3rd 04, the 4th 10 and the 5th 08:46:33. The replacement then emits the groups in the order the OP needs, so that 2021-04-10T08:46:33-0500 becomes 10/04/2021 08:46:33.
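To see which part of the line lands in which group, the replacement can label groups 2 to 5 instead of reordering them. This is only a sketch for inspection, assuming a shortened sample line; the function name show_groups is hypothetical:

```shell
# Hypothetical helper: same ERE pattern, but the replacement labels each group.
# Group 1 (everything up to the last pipe) is deliberately left out of the output.
show_groups() {
  sed -E 's/(.*\|)([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}:[0-9]{2}:[0-9]{2})-.*/year=\2 month=\3 day=\4 time=\5/'
}

# Shortened sample line, assumed for illustration.
printf '%s\n' 'mn|46.815216|-92.178109|2021-04-10T08:46:33-0500' | show_groups
# prints: year=2021 month=04 day=10 time=08:46:33
```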

CodePudding user response:

This might work for you (GNU sed):

sed -E 's#\|(....)-(..)-(..)T(..:..:..)-....$#|\3/\2/\1 \4#' file

Match the pattern and use back references to format it as required.

N.B. The | and $ anchor the pattern to the last field on the line, and the dashes, colons and capital T make it most unlikely that any other string will match, so a dot can be used to match the digits; if you prefer, replace the .'s with [0-9]'s. Also, # is used as an alternative delimiter to the usual / in the substitution command s#...#...#, because / appears in the replacement string.
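The stricter variant mentioned in the note, with each dot replaced by a digit class, might look like the sketch below. The function name strict_convert and the shortened sample lines are assumptions for illustration:

```shell
# Hypothetical helper: the substitution above with dots replaced by [0-9] classes.
# The # delimiter lets / appear unescaped in the replacement.
strict_convert() {
  sed -E 's#\|([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}:[0-9]{2}:[0-9]{2})-[0-9]{4}$#|\3/\2/\1 \4#'
}

# Shortened sample lines, assumed for illustration.
printf '%s\n' '7304628626|duluth/superior|mn|2021-04-10T08:46:33-0500' | strict_convert
# prints: 7304628626|duluth/superior|mn|10/04/2021 08:46:33

printf '%s\n' '7304628626|duluth/superior|mn|not-a-date' | strict_convert
# prints the line unchanged, because the last field does not match
```

Because the pattern still requires a - before the four offset digits, a line whose offset starts with + would be left unchanged as well.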
