I've exported all of my text messages into a text file, and they are formatted like this:
, NAME 18001112222, RECV, Text message contents.
, NAME 18001112222, RECV, Text message contents that are run over to
the line below it.
, NAME 18001112222, SENT, Text message contents that have
multiple lines and empty lines!
, NAME 18001112222, SENT, Text Message contents
I know how to remove the empty lines. How would I use awk, sed, or grep to move every line that doesn't begin with a , to the end of the line above it?
Or, put differently, how would you reformat this so that each text message has all of its contents on a single line?
I haven't tried anything yet because I'm unsure where to even begin, which is why I'm here asking more practiced hands for some practical examples of how to solve this. Thanks in advance.
CodePudding user response:
Same idea as https://stackoverflow.com/a/73030681/10971581 :
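awk -v ORS= '
# A line starting with ", " begins a new message: terminate the previous one.
NR>1 && /^, / { print "\n" }
# Print every line as-is; ORS is empty, so no newline is appended.
1
# Terminate the last message.
END { print "\n" }
' inputfile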
The input seems to be malformed CSV. One would normally expect fields that could contain newlines or the field delimiter (,) to be quoted.
Note that it is impossible in general to determine whether a line that starts with , is a continuation or is intended to start a new record. The code above assumes it is always the latter.
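If the export really always has the shape shown in the question, one way to reduce that ambiguity is to treat a line as the start of a message only when it matches the full header, not just a leading comma. The following is a minimal sketch under that assumption: the header pattern (", NAME NUMBER, RECV|SENT,") is inferred from the sample, the names input, joined, msg and have are made up for illustration, and the line ", ok, sure" is an invented continuation line added to show the difference. It also skips empty lines and joins continuations with a single space.

```shell
# Sample data modeled on the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
, NAME 18001112222, RECV, Text message contents.
, NAME 18001112222, RECV, Text message contents that are run over to
the line below it.
, NAME 18001112222, SENT, Text message contents that have

multiple lines and empty lines!
, ok, sure
, NAME 18001112222, SENT, Text Message contents
EOF

joined=$(awk '
  # Drop empty or whitespace-only lines.
  /^[ \t]*$/ { next }
  # Assumed header shape: ", NAME NUMBER, RECV|SENT," starts a new message.
  /^, .* [0-9]+, (RECV|SENT),/ {
    if (have) print msg      # flush the previous message
    msg = $0; have = 1
    next
  }
  # Anything else is a continuation: append it after a single space.
  { msg = (have ? msg " " : "") $0; have = 1 }
  # Flush the last message.
  END { if (have) print msg }
' "$input")

printf '%s\n' "$joined"
rm -f "$input"
```

With this sample the invented ", ok, sure" line stays attached to the third message instead of starting a new one. The pattern is only as reliable as the assumption behind it: a message body that itself looks like a header would still be split.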
CodePudding user response:
I would use GNU AWK for this task in the following way. Let file.txt content be
, NAME 18001112222, RECV, Text message contents.
, NAME 18001112222, RECV, Text message contents that are run over to
the line below it.
, NAME 18001112222, SENT, Text message contents that have
multiple lines and empty lines!
, NAME 18001112222, SENT, Text Message contents
then
awk 'BEGIN{RS="\n,"}{ORS=RT;gsub(/\n/," ");print}' file.txt
gives output
, NAME 18001112222, RECV, Text message contents.
, NAME 18001112222, RECV, Text message contents that are run over to the line below it.
, NAME 18001112222, SENT, Text message contents that have multiple lines and empty lines!
, NAME 18001112222, SENT, Text Message contents
Explanation: I tell GNU AWK that the row separator (RS) is a newline (\n) followed by a comma (,). Then, for each row, I set the output row separator (ORS) to the current row terminator (RT), replace all newlines (\n) in the row with a space (depending on your requirements you might need to change that to an empty string), and print the row, which is therefore suffixed by its row terminator.
(tested in GNU Awk 5.0.1)