Use sed to replace values in a csv column if a condition is met in another column

Time:04-10

I have a CSV file composed of several fields split by commas.

id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,

I have to convert the values in the "name" column from lowercase to uppercase when the "sport" column is shooting or judo. I can only use sed. I am using this command:

sed 's/\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\)/\1,\U\2\E,\3,\4,\5,\6,\7,\shooting|judo,\9,\10,\11,\12/' athletesv2.csv

But it is not working, as it's just showing "shooting|judo" in all the rows.

How can I make these replacements?

Note that the solution must be a .sed script file, which has to be run using sed -f script.sed athletes.csv

In the output I need to keep the header.

I am using Ubuntu Linux.

CodePudding user response:

If you can use GNU sed, you can use

rx='^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$'
repl='\1,\U\2\E,\3'
sed -E "s/$rx/$repl/" athletes.csv

See the online demo:

#!/bin/bash
rx='^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$'
repl='\1,\U\2\E,\3'

s='id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,'

sed -E "s/$rx/$repl/" <<< "$s"

Output:

id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A JESUS GARCIA,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,

Notes:

  • ^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$ is a pattern that matches a whole line (^ anchors the start of the line and $ the end). It captures fields 1 and 2 into separate groups and the rest of the line into Group 3. The field 8 pattern is hard-coded: (shooting|judo) matches either shooting or judo.
  • \U\2\E in the replacement puts the Group 2 value back in uppercase; \E ends the case conversion so that Group 3 is left unchanged.
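To illustrate the role of \E, here is a small sketch, assuming GNU sed (\U and \E are GNU extensions). The input line id,name,rest is made up for the demonstration:

```shell
# With \E, uppercasing stops after group 2.
printf 'id,name,rest\n' | sed -E 's/^([^,]*),([^,]*),(.*)$/\1,\U\2\E,\3/'
# prints: id,NAME,rest

# Without \E, \U stays in effect for the rest of the replacement.
printf 'id,name,rest\n' | sed -E 's/^([^,]*),([^,]*),(.*)$/\1,\U\2,\3/'
# prints: id,NAME,REST
```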

Note that sed only supports the backreferences \1 through \9, so you need to reduce the number of capture groups by merging the fields you do not need to reference individually into a single group.
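A quick way to see this limit, assuming GNU sed: in the replacement, \10 is read as \1 followed by a literal 0, not as a tenth group. The input string is arbitrary:

```shell
# Ten capture groups, but \10 expands to group 1 ("a") plus the character "0".
printf 'abcdefghij\n' | sed -E 's/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)/\10/'
# prints: a0
```

This is why the \10, \11 and \12 in the question's command cannot refer to fields 10 to 12.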

CodePudding user response:

Using sed

$ sed '/^[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,\(shooting\|judo\),/s/,[^,]*/\U&/' input_file
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A JESUS GARCIA,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
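
Since the question asks for a script file run with sed -f script.sed athletes.csv, here is a minimal sketch of the same address-plus-substitution idea as a script. It assumes GNU sed (\| and \U are GNU extensions) and uses basic regex syntax because sed -f is called without -E. The judo row in the sample file is made up for illustration:

```shell
# Write the sed script. The quoted 'EOF' keeps the backslashes intact.
cat > script.sed <<'EOF'
# On rows whose 8th field is shooting or judo,
# uppercase the 2nd field (name). The header row has "sport"
# in field 8, so it does not match and is printed unchanged.
/^\([^,]*,\)\{7\}\(shooting\|judo\),/s/^\([^,]*,\)\([^,]*\)/\1\U\2/
EOF

# Sample input: header and first two rows follow the question's data,
# the last row is a hypothetical judo entry.
cat > athletes.csv <<'EOF'
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
999999999,B Jane Doe,USA,female,1990-01-01,1.65,57,judo,0,0,0,
EOF

sed -f script.sed athletes.csv
```

Only the shooting and judo rows should come out with the name in uppercase; the header and the athletics row stay as they are.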