How to add header text next to each content line in an unformatted data set, separated by a delimiter

Time:04-06

I have a long list of unformatted data, say data.txt, where each set starts with a header and ends with a blank line, like:

TypeA/Price:20$
alexmob
moblexto
unkntom

TypeB/Price:25$
moblexto2
unkntom0
alexmob3
poptop9
tyloret

TypeC/Price:30$
rtyuoper0
kunlohpe6
mobryhox

Now I want to put the header of each set next to each of its content lines, separated by a comma, like:

alexmob,TypeA/Price:20$
moblexto,TypeA/Price:20$
unkntom,TypeA/Price:20$

moblexto2,TypeB/Price:25$
unkntom0,TypeB/Price:25$
alexmob3,TypeB/Price:25$
poptop9,TypeB/Price:25$
tyloret,TypeB/Price:25$

rtyuoper0,TypeC/Price:30$
kunlohpe6,TypeC/Price:30$
mobryhox,TypeC/Price:30$

so that whenever I grep for a keyword, each matching line comes together with its header, like:

$ grep mob data.txt
alexmob,TypeA/Price:20$
moblexto,TypeA/Price:20$
moblexto2,TypeB/Price:25$
alexmob3,TypeB/Price:25$
mobryhox,TypeC/Price:30$

I am new to bash scripting as well as Python and have only recently started learning them, so I would really appreciate a simple solution in bash (using sed/awk) or Python.

CodePudding user response:

Using sed

$ sed '/Type/{h;d;};/[a-z]/{G;s/\n/,/}' input_file
alexmob,TypeA/Price:20$
moblexto,TypeA/Price:20$
unkntom,TypeA/Price:20$

moblexto2,TypeB/Price:25$
unkntom0,TypeB/Price:25$
alexmob3,TypeB/Price:25$
poptop9,TypeB/Price:25$
tyloret,TypeB/Price:25$

rtyuoper0,TypeC/Price:30$
kunlohpe6,TypeC/Price:30$
mobryhox,TypeC/Price:30$

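Lines containing Type are copied to the hold space (h) and then deleted (d), so the header itself is not printed.

For lines containing a lowercase letter, G appends a newline followed by the hold space contents to the line, and s/\n/,/ then replaces that newline with a comma. Blank lines match neither address and are printed unchanged.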
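Since the question also mentions Python, the same hold-and-append idea could be sketched roughly as below. This is only an illustration, not part of the sed answer: the function name attach_headers and the sample data are made up here, and it assumes headers always start with "Type" (slightly stricter than the sed pattern, which matches Type anywhere in the line) and that every non-blank line is data.

```python
def attach_headers(lines, sep=","):
    """Append the most recent header to each data line, joined by sep."""
    header = ""  # plays the role of sed's hold space
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("Type"):  # assumption: headers start with "Type"
            header = line            # remember the header and do not print it
            continue
        if line.strip():
            out.append(f"{line}{sep}{header}")  # data line: item, separator, header
        else:
            out.append(line)         # blank separator lines pass through unchanged
    return out


# Shortened sample in the same shape as data.txt from the question
sample = """TypeA/Price:20$
alexmob
moblexto

TypeB/Price:25$
moblexto2
""".splitlines()

for row in attach_headers(sample):
    print(row)
```

For a real file, something like attach_headers(open("data.txt")) should work the same way, since trailing newlines are stripped inside the loop.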

CodePudding user response:

I would use GNU AWK for this task in the following way. Let the content of file.txt be

TypeA/Price:20$
alexmob
moblexto
unkntom

TypeB/Price:25$
moblexto2
unkntom0
alexmob3
poptop9
tyloret

TypeC/Price:30$
rtyuoper0
kunlohpe6
mobryhox

then

awk '/^Type/{header=$0;next}{print /./?$0 ";" header:$0}' file.txt

output

alexmob;TypeA/Price:20$
moblexto;TypeA/Price:20$
unkntom;TypeA/Price:20$

moblexto2;TypeB/Price:25$
unkntom0;TypeB/Price:25$
alexmob3;TypeB/Price:25$
poptop9;TypeB/Price:25$
tyloret;TypeB/Price:25$

rtyuoper0;TypeC/Price:30$
kunlohpe6;TypeC/Price:30$
mobryhox;TypeC/Price:30$

Explanation: if a line starts with (^) Type, store that line ($0) in header and skip to the next line. For every other line, if it contains at least one character (/./), print the line ($0) concatenated with ; and header; otherwise print the line ($0) as is. To get the comma-separated output asked for in the question, replace ";" with "," in the command.

(tested in GNU Awk 5.0.1)
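Both commands above key on the literal prefix Type. If the headers are not guaranteed to start with it, one option is to rely on the block structure described in the question instead (first line of each blank-line-separated block is the header). Below is a rough Python sketch of that variant; the names tag_blocks and grep and the sample data are hypothetical, and unlike the awk output it drops the blank lines between groups.

```python
import re


def tag_blocks(text, sep=","):
    """Return 'item<sep>header' rows, taking the first line of each block as its header."""
    rows = []
    # Blocks are separated by one or more blank (or whitespace-only) lines
    for block in re.split(r"\n\s*\n", text):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue  # skip empty blocks, e.g. from leading blank lines
        header, items = lines[0], lines[1:]
        rows.extend(f"{item}{sep}{header}" for item in items)
    return rows


def grep(rows, keyword):
    """Keep rows containing keyword, similar to a plain grep on the result."""
    return [row for row in rows if keyword in row]


# Shortened sample in the same shape as file.txt above
data = """TypeA/Price:20$
alexmob
unkntom

TypeB/Price:25$
moblexto2
tyloret
"""

for row in grep(tag_blocks(data), "mob"):
    print(row)
```

Note that, just like grep on the converted file, the filter looks at the whole row, so a keyword that appears in a header would match every line of that set.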
