Parsing and modifying csv with bash

Time:11-04

I have a CSV file with a large number of rows. A small example:

id,location_id,name,title,email,directorate 
1,1, Amy lee,Singer,, 
2,2,brad Pitt,Actor,,Production 
3,5,Steven Spielberg,Producer,[email protected],Production

I need to:

  • capitalize the first letter of the first and last name, for example Brad Pitt, Amy Lee.
  • create an email from the first letter of the first name plus the last name, all in lowercase, followed by the value of location_id and @google.com, for example alee1@google.com, bpitt2@google.com
  • save the result to a new file.csv with the same structure, for example:
id,location_id,name,title,email,directorate 
1,1, Amy Lee,Singer,alee1@google.com, 
2,2,Brad Pitt,Actor,bpitt2@google.com,Production 
3,5,Steven Spielberg,Producer,sspielberg5@google.com,Production

I started by reading each line into an array and iterating over it with a bunch of sed and awk calls, but it gives me random results. Please advise how to solve this task.

while read -ra array; do
    for i in ${array[@]};
    do
        awk -F ',' '{print tolower(substr($3,1,1))$2$3"@google.com"}'
    done

    for i in ${array[@]};
    do
        awk -F "\"*,\"*" '{print $3}' | sed -e "s/\b\(.\)/\u\1/g"
    done

done < file.csv

awk -F ',' '{print tolower(substr($3,1,1))$2$3"@google.com"}' does not work correctly.

CodePudding user response:

Using GNU sed

$ sed -E 's/([^,]*,([^,]*),) ?(([[:alpha:]])[^ ]* )(([^,]*),[^,]*,)[^,]*/\1\u\3\u\5\L\4\6\2@google.com/' input_file
id,location_id,name,title,email,directorate
1,1,Amy Lee,Singer,alee1@google.com,
2,2,Brad Pitt,Actor,bpitt2@google.com,Production
3,5,Steven Spielberg,Producer,sspielberg5@google.com,Production
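
For readers who find the one-liner dense, the same transformation can be spelled out field by field in plain bash. This is only a sketch under some assumptions: bash 4 or newer (for the ${var^} and ${var,,} expansions), no quoted commas inside fields, names of the form "first last", and the file names file.csv and new_file.csv. The address already present on row 3 is a made-up placeholder, since it gets overwritten anyway.

```shell
#!/usr/bin/env bash
# Sample input modelled on the question; old@example.com is a placeholder.
printf '%s\n' \
  'id,location_id,name,title,email,directorate' \
  '1,1, Amy lee,Singer,,' \
  '2,2,brad Pitt,Actor,,Production' \
  '3,5,Steven Spielberg,Producer,old@example.com,Production' > file.csv

{
  IFS= read -r header                  # copy the header row through unchanged
  printf '%s\n' "$header"
  while IFS=, read -r id loc name title email directorate; do
    read -r first last <<< "$name"     # split on spaces; drops the leading space
    first=${first^}                    # capitalize the first letter (bash 4+)
    last=${last^}
    local_part=${first:0:1}${last}${loc}
    email=${local_part,,}@google.com   # lowercase initial + last name + location_id
    printf '%s,%s,%s %s,%s,%s,%s\n' \
      "$id" "$loc" "$first" "$last" "$title" "$email" "$directorate"
  done
} < file.csv > new_file.csv
```

Like the sed command, this drops the leading space in front of the name and overwrites any address that is already present. For a file with tons of rows a single sed or awk pass will be noticeably faster than a shell loop.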

CodePudding user response:

With your shown samples, please try the following awk.

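awk '
BEGIN{ FS=OFS="," }
NR>1{                                  # leave the header row untouched
  n=split($3,arr," ")                  # arr[1] = first name, arr[n] = last name
  # first initial + last name, lowercased, then location_id and the domain
  $5=tolower(substr(arr[1],1,1) arr[n]) $2 "@google.com"
}
1
' Input_file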
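
The awk command above builds only the address. A possible extension of the same field-splitting idea, which also capitalizes each word of the name and writes the result to a new file, might look like the sketch below. It assumes plain comma-separated fields with no quoted commas; the output name new_file.csv and the old@example.com address on row 3 are placeholders chosen for illustration.

```shell
# Sample input modelled on the question; old@example.com is a placeholder.
printf '%s\n' \
  'id,location_id,name,title,email,directorate' \
  '1,1, Amy lee,Singer,,' \
  '2,2,brad Pitt,Actor,,Production' \
  '3,5,Steven Spielberg,Producer,old@example.com,Production' > Input_file

awk '
BEGIN { FS = OFS = "," }
NR > 1 {                                   # skip the header row
  n = split($3, part, " ")                 # part[1] = first name, part[n] = last name
  for (i = 1; i <= n; i++)                 # capitalize the first letter of each word
    part[i] = toupper(substr(part[i], 1, 1)) substr(part[i], 2)
  # first initial + last name, lowercased, then location_id and the domain
  $5 = tolower(substr(part[1], 1, 1) part[n]) $2 "@google.com"
  $3 = part[1]                             # rebuild the name without the leading space
  for (i = 2; i <= n; i++) $3 = $3 " " part[i]
}
1                                          # print every row, modified or not
' Input_file > new_file.csv
```

Note that this removes the leading space before "Amy Lee", as the sed answer does; if that space has to be kept, the name would need to be rebuilt differently.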