I have a CSV file that looks like this:
English, "Hawkin, Jason" ,[email protected]
English, "McDonald, Matt" ,[email protected]
English, "Campbell, Josh" ,[email protected]
I want to write a Bash script that creates a second CSV file in the following format:
ID, Email, FNAME, LNAME
jhawkin110, [email protected], Jason, Hawkin
I have already used "cut" to remove the first column, which I do not need, but I am at a loss as to how to create the "ID" from the part of the "Email" before the "@".
CodePudding user response:
Using awk
$ awk 'BEGIN {FS=" *, *"; OFS=", "; print "ID, Email, FNAME, LNAME"} {gsub(/"/,""); split($NF,a,"@"); print tolower(a[1]), $NF, $3, $2}' input_file
ID, Email, FNAME, LNAME
jhawkin110, [email protected], Jason, Hawkin
mmcdonald114, [email protected], Matt, McDonald
jcampbell111, [email protected], Josh, Campbell
Using sed (GNU sed is needed for \L and for the one-line form of 1i)
$ sed -E 's/.*"([^,]*),([^"]*)..,(([^@]*)@.*)/\3,\2, \1,\L\4/;s/(.*),(.*)/\2, \1/;1iID, Email, FNAME, LNAME' input_file
ID, Email, FNAME, LNAME
jhawkin110, [email protected], Jason, Hawkin
mmcdonald114, [email protected], Matt, McDonald
jcampbell111, [email protected], Josh, Campbell
CodePudding user response:
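# Print the header, then let awk split on commas, treating the quotes and spaces around them as part of the separator.
echo 'ID, Email, FNAME, LNAME'
awk -v FS=', *"?|"? *,' \
    -v OFS=', ' \
    '{split($4,a,"@"); print tolower(a[1]),$4,$3,$2}' file.csv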
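For the specific step the question asks about (building the ID from the email), plain shell parameter expansion may be enough: ${email%%@*} removes everything from the first "@" onward. Below is a minimal sketch of the whole conversion without awk or sed regex captures. The function names make_id, trim and convert_csv are made up for illustration, and the sketch assumes every row has exactly the shape shown in the question (one quoted "Last, First" field followed by the email).

```shell
# make_id EMAIL: print the part of EMAIL before the "@", lowercased.
make_id() {
    printf '%s\n' "${1%%@*}" | tr '[:upper:]' '[:lower:]'
}

# trim TEXT: drop double quotes, then leading and trailing whitespace.
trim() {
    printf '%s\n' "$1" | tr -d '"' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# convert_csv FILE: print the new CSV for rows shaped like
#   English, "Last, First" ,user@host
# (assumption: the quoted name always contains exactly one comma)
convert_csv() {
    echo 'ID, Email, FNAME, LNAME'
    # Splitting on "," yields: language, "Last, First", email.
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while IFS=, read -r _lang last first email || [ -n "$_lang" ]; do
        email=$(trim "$email")
        [ -n "$email" ] || continue   # skip blank or malformed rows
        printf '%s, %s, %s, %s\n' \
            "$(make_id "$email")" "$email" "$(trim "$first")" "$(trim "$last")"
    done < "$1"
}
```

With these functions defined, something like convert_csv input_file > output.csv would write the second file. This spawns a few processes per row, so for large files the awk one-liners above are likely the better choice.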