How to move lines of text that don't begin with "," to the end of the line above?

Time:11-20

I've exported all of my text messages into a text file, and they are formatted like this:

, NAME  18001112222, RECV, Text message contents.
, NAME  18001112222, RECV, Text message contents that are run over to 
the line below it.
, NAME  18001112222, SENT, Text message contents that have

multiple lines and empty lines!
, NAME  18001112222, SENT, Text Message contents  

I know how to remove the empty lines. How would I use awk, sed, or grep to move every line that doesn't begin with "," to the end of the line above it?

Or, how would you reformat this so that each text message has all of its contents on a single line?

I haven't tried anything yet because I'm unsure where to even begin, which is why I'm here asking more practiced hands for some practical examples of how to solve this. Thanks in advance.

CodePudding user response:

Same idea as https://stackoverflow.com/a/73030681/10971581 :

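awk -v ORS= '
    # ORS is empty, so print appends nothing and lines are joined by default
    NR>1 && /^, / { print "\n" }   # a line starting with ", " begins a new record
    1;                             # print the current line as-is
    END { print "\n" }             # terminate the last record
' inputfile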

The input seems to be malformed CSV. One would normally expect fields that could contain newlines or the field delimiter (, ) to be quoted.

Note that it is impossible in general to determine whether a line that starts with , is a continuation of the previous message or the start of a new one. The code above assumes it is always the latter.
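Because ORS is empty, the one-liner above joins lines with nothing in between, so a message split around an empty line loses the space between its parts ("have" and "multiple" become "havemultiple"). The following is a minimal sketch of a variant that buffers each message, skips empty lines, and joins continuation lines with a single space. The function name join_continuations is made up for illustration, and the sketch makes the same assumption as above: every line starting with ", " begins a new message.

```shell
#!/bin/sh
# join_continuations is a hypothetical helper name used for illustration.
# Assumption: every message starts with ", " at the beginning of a line.
join_continuations() {
    awk '
        { sub(/[ \t\r]+$/, "") }        # trim trailing blanks and any CR
        $0 == "" { next }               # skip lines that are now empty
        /^, / {                         # start of a new message
            if (have) print buf         # flush the previous message
            buf = $0; have = 1; next
        }
        {                               # continuation line: append with one space
            buf = (have ? buf " " : "") $0
            have = 1
        }
        END { if (have) print buf }     # flush the last message
    ' "$@"
}
```

It could be called as join_continuations inputfile > joined.txt; with no file argument it reads standard input. Trailing whitespace is trimmed before joining, which is a design choice rather than something the question asked for.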

CodePudding user response:

I would use GNU AWK for this task in the following way. Let file.txt contain

, NAME  18001112222, RECV, Text message contents.
, NAME  18001112222, RECV, Text message contents that are run over to 
the line below it.
, NAME  18001112222, SENT, Text message contents that have

multiple lines and empty lines!
, NAME  18001112222, SENT, Text Message contents  

then

awk 'BEGIN{RS="\n,"}{ORS=RT;gsub(/\n/," ");print}' file.txt

gives output

, NAME  18001112222, RECV, Text message contents.
, NAME  18001112222, RECV, Text message contents that are run over to  the line below it.
, NAME  18001112222, SENT, Text message contents that have  multiple lines and empty lines!
, NAME  18001112222, SENT, Text Message contents    

Explanation: I tell GNU AWK that the record separator (RS) is a newline (\n) followed by a comma (,). For each record, I set the output record separator (ORS) to the current record terminator (RT), which is the text that actually matched RS; RT is a GNU AWK extension. I then replace every newline (\n) inside the record with a space (depending on your requirements, you might need to change that to an empty string) and print the record, which is then followed by its terminator.

(tested in GNU Awk 5.0.1)
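The output above keeps doubled spaces where a line break followed a trailing blank or an empty line, and the last record appears to end in spaces rather than a newline, because RT is empty at end of file. Below is a sketch of a variant that collapses each run of line breaks into one space and always ends the output with a newline. It assumes GNU AWK is installed under the name gawk, and join_records_gawk is a hypothetical function name, not part of the answer.

```shell
#!/bin/sh
# join_records_gawk is a hypothetical helper name; it needs GNU AWK because of RT.
join_records_gawk() {
    gawk '
        BEGIN { RS = "\n," }                 # a record ends at newline + comma
        {
            sub(/[ \t\n]+$/, "")             # trim whitespace at the end of the record
            gsub(/[ \t]*\n+[ \t]*/, " ")     # each run of line breaks becomes one space
            # RT is empty for the last record, so end it with a plain newline
            printf "%s%s", $0, (RT == "" ? "\n" : RT)
        }
    ' "$@"
}
```

Usage would be join_records_gawk file.txt. As with the original command, a continuation line that itself starts with a comma is treated as a new record.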
