Linux commands - how to treat every two consecutive newlines as the start of a new record?

I have a log file that has the following structure:

Each record begins with a timestamp of the form YYYY-MM-DD-hh.mm.ss (plus fractional seconds)
The record can then continue over one or more lines, including empty lines
A new record begins after exactly two newline characters (\n\n) followed by a timestamp at the beginning of a line

Sample:

2022-05-05-14.06.15.041968 some data
that can spread to one line

2022-05-05-14.06.16.036412 some data
that can spread to
two lines

2022-05-05-14.06.17.234123 some data
that can spread to
two lines

or multiple lines with empty new lines

I would like to get:

2022-05-05-14.06.15.041968 some data that can spread to one line
2022-05-05-14.06.16.036412 some data that can spread to two lines
2022-05-05-14.06.17.234123 some data that can spread to two lines or multiple lines with empty new lines

How can I solve this with Linux commands such as sed, awk, tr, or similar?

CodePudding user response：

This should get the job done, with one limitation: the data that follows each record's timestamp must not itself contain a string in the timestamp format.

sed '/^$/d' your_file_name | tr '\n' ' ' | sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}\.[0-9]{2}\.[0-9]{2}\.[0-9]{6})/\n\1/g' | sed 1d
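
To make the limitation concrete, here is a small sketch that runs the same pipeline on a made-up sample in which the second record mentions a timestamp inside its body. The sample text and the temporary file are assumptions for illustration only, and GNU sed is assumed (for \n in the replacement).

```shell
# Hypothetical sample: two records, the second one quotes a timestamp in its body.
sample=$(mktemp)
printf '%s\n' \
  '2022-05-05-14.06.15.041968 first record' \
  'continued' \
  '' \
  '2022-05-05-14.06.16.036412 retry of 2022-05-05-14.06.15.041968 failed' \
  > "$sample"

# Same pipeline as in the answer, reading the sample file instead of your_file_name.
out=$(sed '/^$/d' "$sample" | tr '\n' ' ' |
  sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}\.[0-9]{2}\.[0-9]{2}\.[0-9]{6})/\n\1/g' |
  sed 1d)

printf '%s\n' "$out"
rm -f "$sample"
```

With GNU sed this should print three lines rather than two, because the second record is split at the embedded timestamp.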

CodePudding user response：

Suggesting a gawk script (gawk is the default awk on many Linux distributions):

gawk '{gsub("\n+"," ");gsub("\n","",RT);print $0 "\n"RT}' RS="\n[[:digit:]]{4}-[[:digit:]]{2}-" ORS="" input.txt

Results:

2022-05-05-14.06.15.041968 some data that can spread to one line 
2022-05-05-14.06.16.036412 some data that can spread to two lines 
2022-05-05-14.06.17.234123 some data that can spread to two lines or multiple lines with empty new lines

CodePudding user response：

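awk '
        {
                # A line whose first field starts with a date (YYYY-MM-DD-) begins a new record
                if ( $1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}-/ ) {
                        printf "\n%s", $0
                } else {
                        # Append non-empty continuation lines, separated by a space
                        if ($1 != "") printf " %s", $0
                }
        }
        END{
                printf "\n"
        }' input_file | sed 1d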

2022-05-05-14.06.15.041968 some data that can spread to one line
2022-05-05-14.06.16.036412 some data that can spread to two lines
2022-05-05-14.06.17.234123 some data that can spread to two lines or multiple lines with empty new lines
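
Some awk implementations (older mawk releases, for example) may not support interval expressions such as {4}. As a hedged variation on the approach above, the sketch below spells the digits out and tracks whether a record has already been opened, so the trailing sed 1d is not needed. The function name join_records is hypothetical, and the sketch assumes the file starts with a timestamped line.

```shell
# join_records is a hypothetical helper name: it prints one record per line.
join_records() {
  awk '
    # A line starting with a date such as 2022-05-05- opens a new record.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-/ {
      if (started) printf "\n"   # terminate the previous record first
      printf "%s", $0
      started = 1
      next
    }
    # Non-blank continuation lines are appended after a single space.
    NF { printf " %s", $0 }
    # Terminate the last record, if there was one.
    END { if (started) printf "\n" }
  ' "$@"
}
```

It can be called as join_records input_file. Blank lines inside a record are simply skipped, which matches the desired output in the question.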