How to prepend each line with an instance of a pattern found with awk

I have a file with many lines. Some lines contain a date and time, e.g. 2022-03-16-08:00.

I want every line that follows such a date line to have that date and time prepended.

Because the file contains many date lines with different times, the prefix has to change at each new date line: the lines following a given date line should be prepended with that line's date and time.

For example, I have the following file (example.txt):

Date1: 2022-03-16-08:00
Something happened
Something else happened
Date2: 2022-03-16-08:10
Something happened
Something else happened
Something else happened

And the result I want is:

Date1: 2022-03-16-08:00
2022-03-16-08:00 Something happened
2022-03-16-08:00 Something else happened
Date2: 2022-03-16-08:10
2022-03-16-08:10 Something happened
2022-03-16-08:10 Something else happened
2022-03-16-08:10 Something else happened

I tried with sed to prepend the pattern found to each other line, but the sed variable doesn't seem to work:

sed -e '/2022-/s/$.*$/\1/' -e 's/^/$1/' example.txt

Result:

$1Date1: 2022-03-16-08:00
$1Something happened
$1Something else happened
$1Date2: 2022-03-16-08:10
$1Something happened
$1Something else happened
$1Something else happened

I thought it might be feasible with awk: capture the pattern with awk -F: '/2022/{var=$2}' and then prepend it to the following lines. But I don't know how to update the variable when the next Date line appears.

Any help is appreciated and very welcome

Thank you very much in advance!

CodePudding user response：

I would use GNU AWK for this task in the following way. Let file.txt content be

Date1: 2022-03-16-08:00
Something happened
Something else happened
Date2: 2022-03-16-08:10
Something happened
Something else happened
Something else happened

then

awk 'BEGIN{FPAT="[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}:[0-9]{2}"}NF{when=$1;print}!NF{print when,$0}' file.txt

output

Date1: 2022-03-16-08:00
2022-03-16-08:00 Something happened
2022-03-16-08:00 Something else happened
Date2: 2022-03-16-08:10
2022-03-16-08:10 Something happened
2022-03-16-08:10 Something else happened
2022-03-16-08:10 Something else happened

Explanation: Inside BEGIN, FPAT (Field PATtern) tells GNU AWK what a field looks like: 4 digits, then -, then 2 digits, then -, then 2 digits, then -, then 2 digits, then :, then 2 digits, i.e. a timestamp in the format you are using. For each line that contains such a field (i.e. the number of fields, NF, is non-zero), set the variable when to the first such field ($1) and print the line as is. If there is no such field (!NF, the negation of NF), print the value of when followed by the whole current line ($0).

Warning: my code assumes that if a single line contains more than one timestamp you want to use the first one, and that the first line always contains a timestamp.
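To see what FPAT does to NF and $1, here is a small sketch that prints both for a timestamp line and a plain line. It assumes GNU AWK is installed under the name gawk; the two input lines are taken from the sample file.

```shell
# FPAT is a gawk extension, so this calls gawk explicitly (assumed to be installed).
out=$(printf '%s\n' 'Date1: 2022-03-16-08:00' 'Something happened' |
  gawk 'BEGIN { FPAT = "[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}:[0-9]{2}" }
        # Print the field count and the first field (empty when NF is 0).
        { print NF, "[" $1 "]" }')
printf '%s\n' "$out"
```

The timestamp line should report one field holding the timestamp, and the plain line should report zero fields, which is what the NF and !NF patterns rely on.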

(tested in gawk 4.2.1)
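If the first line might not contain a timestamp, one possible variation is to print lines unchanged until the first timestamp has been seen. The sketch below is an assumption-laden alternative, not the code above: it uses match() instead of FPAT, spells out the digit classes so it does not depend on interval expressions, and wraps the awk call in a helper named prepend_ts, which is a hypothetical name chosen for illustration. The 'Preamble' input line is made up.

```shell
# prepend_ts (hypothetical helper name) reads lines from stdin.
prepend_ts() {
  awk '
    # Timestamp line: remember the first timestamp on it and print the line unchanged.
    match($0, /[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]:[0-9][0-9]/) {
      when = substr($0, RSTART, RLENGTH)
      print
      next
    }
    # No timestamp seen yet: print the line as is instead of adding an empty prefix.
    when == "" { print; next }
    # Otherwise prepend the most recent timestamp.
    { print when, $0 }
  '
}

out=$(printf '%s\n' 'Preamble' 'Date1: 2022-03-16-08:00' 'Something happened' | prepend_ts)
printf '%s\n' "$out"
```

Usage on a real file would be prepend_ts < file.txt.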

CodePudding user response：

This solution should work in any POSIX-compliant awk (some older awk implementations do not support the {2} interval expression):

awk -F ': ' 'NF == 2 && $2 ~ /^20[0-9]{2}/ {
   dt = $2; print; next} {print dt, $0}' file.log

Date1: 2022-03-16-08:00
2022-03-16-08:00 Something happened
2022-03-16-08:00 Something else happened
Date2: 2022-03-16-08:10
2022-03-16-08:10 Something happened
2022-03-16-08:10 Something else happened
2022-03-16-08:10 Something else happened
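The condition in the command above accepts any two-field line whose second field starts with 20 followed by two digits, so a line such as "Note: 2022 was busy" would also be treated as a date line. A stricter sketch, assuming the timestamp always has the exact form YYYY-MM-DD-HH:MM and fills the whole second field, could look like this (the "Note" line is a made-up input for illustration):

```shell
out=$(printf '%s\n' 'Date1: 2022-03-16-08:00' 'Note: 2022 was busy' |
  awk -F ': ' '
    # Require the entire second field to look like YYYY-MM-DD-HH:MM.
    NF == 2 && $2 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]:[0-9][0-9]$/ {
      dt = $2; print; next
    }
    # Every other line gets the most recent timestamp prepended.
    { print dt, $0 }
  ')
printf '%s\n' "$out"
```

Writing out each digit class instead of using {2} also avoids relying on interval-expression support.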

CodePudding user response：

With your shown samples, please try the following awk code.

awk '
match($0,/[0-9]{4}(-[0-9]{2}){3}:[0-9]{2}/){
  value=substr($0,RSTART,RLENGTH)
  print
  next
}
{
  print value,$0
}
'  Input_file
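The code above relies on match() setting RSTART (the position where the match begins) and RLENGTH (its length), which substr() then uses to cut out the timestamp. A small sketch that prints those values for one sample line, with the digit classes written out rather than using {n} intervals:

```shell
out=$(printf '%s\n' 'Date1: 2022-03-16-08:00' |
  awk 'match($0, /[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-[0-9][0-9]:[0-9][0-9]/) {
         # Start position, match length, and the matched text itself.
         print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
       }')
printf '%s\n' "$out"   # 8 16 2022-03-16-08:00
```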