sed - replace specific group with uppercase

Time:04-11

I am trying to learn bash and recently made my first incursions into regex and the associated commands (grep, sed and awk). I got stuck on this problem:

Let's assume I have a file (called test.txt) with the following content:

id,name,sport  
100,Pau Gasol,basketball  
101,Tiger Woods,golf  
102,Yao Ming,basketball  
103,Nadal,tennis  
104,LeBron,basketball  

I am looking for a sed command that will uppercase the name field of those athletes whose sport field is basketball (note that I do not want any changes made to the header line). The output should therefore be:

id,name,sport  
100,PAU GASOL,basketball  
101,Tiger Woods,golf  
102,YAO MING,basketball  
103,Nadal,tennis  
104,LEBRON,basketball  

After much trial and error I have arrived at this command:

sed -r 's/(.*),([[:alpha:]]*[[:space:]]*[[:alpha:]]*),(basketball)/\1,\U\2,\3/' test.txt

However, this command changes the header line to uppercase and it also changes the sport field. It seems to me that \U is acting on everything to its right and not just on the second group as I intend.

Any help is very much appreciated.
Thank you in advance.

CodePudding user response:

You need to add \E after \2. It ends the case conversion started by \U, so the rest of the replacement to its right is left unchanged:

sed -r 's/(.*),([[:alpha:]]*[[:space:]]*[[:alpha:]]*),(basketball)/\1,\U\2\E,\3/'

Demo script:

#!/bin/bash
s='id,name,sport
100,Pau Gasol,basketball
101,Tiger Woods,golf
102,Yao Ming,basketball
103,Nadal,tennis
104,LeBron,basketball'
sed -r 's/(.*),([[:alpha:]]*[[:space:]]*[[:alpha:]]*),(basketball)/\1,\U\2\E,\3/' <<< "$s"

Output:

id,name,sport
100,PAU GASOL,basketball
101,Tiger Woods,golf
102,YAO MING,basketball
103,Nadal,tennis
104,LEBRON,basketball
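
To see the scope of \U in isolation, here is a minimal sketch that runs the same substitution on a single line with and without \E. It assumes GNU sed (both \U and \E are GNU extensions), and it uses [^,]* for the fields, which is my simplification rather than the pattern from the question. The variable names are only for illustration.

```shell
# Assumes GNU sed: \U and \E are GNU extensions.
line='100,Pau Gasol,basketball'

# Without \E, \U stays in effect for the rest of the replacement,
# so the third group is uppercased as well.
without_e=$(printf '%s\n' "$line" |
  sed -E 's/^([^,]*),([^,]*),(basketball)/\1,\U\2,\3/')

# With \E, the conversion stops right after the second group.
with_e=$(printf '%s\n' "$line" |
  sed -E 's/^([^,]*),([^,]*),(basketball)/\1,\U\2\E,\3/')

printf '%s\n' "$without_e" "$with_e"
# prints:
# 100,PAU GASOL,BASKETBALL
# 100,PAU GASOL,basketball
```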

CodePudding user response:

Using sed

$ sed '/^[^,]*,[^,]*,basketball/{s/,[^,]*,/\U&/}' input_file
id,name,sport
100,PAU GASOL,basketball
101,Tiger Woods,golf
102,YAO MING,basketball
103,Nadal,tennis
104,LEBRON,basketball
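
A short sketch of why this form is a bit more forgiving, assuming GNU sed: the address selects only lines whose third field starts with basketball, and & stands for the whole match (commas included, which \U leaves alone). Because the field is matched with [^,]*, names that are not just one or two alphabetic words are handled too. The row with id 105 is made-up test data, not part of the question.

```shell
# Assumes GNU sed for \U. Row 105 is hypothetical test data.
input='id,name,sport
105,Kareem Abdul-Jabbar,basketball
101,Tiger Woods,golf'

# The address picks basketball rows; & is the matched ",name," text.
result=$(printf '%s\n' "$input" |
  sed '/^[^,]*,[^,]*,basketball/{s/,[^,]*,/\U&/}')

printf '%s\n' "$result"
# prints:
# id,name,sport
# 105,KAREEM ABDUL-JABBAR,basketball
# 101,Tiger Woods,golf
```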

CodePudding user response:

Note that \U is a GNU sed extension, so it won't necessarily work with, say, a *BSD sed. Done portably (and more clearly) in awk:

$ awk -F, '$3 ~ /^basketball/  { $2 = toupper($2) } 1' OFS=, test.txt
id,name,sport
100,PAU GASOL,basketball
101,Tiger Woods,golf
102,YAO MING,basketball
103,Nadal,tennis
104,LEBRON,basketball

Note the use of a regular expression match instead of plain string equality: your sample data has trailing spaces, so a test like $3 == "basketball" would not match those lines.
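
The effect of the trailing spaces can be checked on one row. This is only a sketch: the variable names are mine, and the stricter pattern in the third command is an optional variant I am adding, not something the answer above requires.

```shell
# One row with two trailing spaces, as in the question's sample file.
row='100,Pau Gasol,basketball  '

# String equality fails: $3 is "basketball  ", not "basketball".
exact=$(printf '%s\n' "$row" |
  awk -F, -v OFS=, '$3 == "basketball" { $2 = toupper($2) } 1')

# The anchored regex only looks at how the field starts.
regex=$(printf '%s\n' "$row" |
  awk -F, -v OFS=, '$3 ~ /^basketball/ { $2 = toupper($2) } 1')

# Optional stricter variant: the word followed by spaces only.
strict=$(printf '%s\n' "$row" |
  awk -F, -v OFS=, '$3 ~ /^basketball *$/ { $2 = toupper($2) } 1')

printf '[%s]\n' "$exact" "$regex" "$strict"
# prints:
# [100,Pau Gasol,basketball  ]
# [100,PAU GASOL,basketball  ]
# [100,PAU GASOL,basketball  ]
```

The stricter pattern would avoid uppercasing a row whose sport merely begins with "basketball", if such values can occur in the data.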

CodePudding user response:

This might work for you (GNU sed):

sed '/,basketball\s*$/s/[^,]*/\U&/2' file

If the last field is basketball then uppercase the second field.
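
The trailing 2 is the numeric flag of the s command: it applies the substitution to the second match on the line only. A small sketch, assuming GNU sed and non-empty fields, that marks the chosen match with angle brackets instead of uppercasing it so the selection is easy to see:

```shell
# Assumes GNU sed and non-empty fields; <&> just marks the chosen match.
line='104,LeBron,basketball'

first=$(printf '%s\n' "$line" | sed 's/[^,]*/<&>/')    # no flag: first match
second=$(printf '%s\n' "$line" | sed 's/[^,]*/<&>/2')  # flag 2: second match
third=$(printf '%s\n' "$line" | sed 's/[^,]*/<&>/3')   # flag 3: third match

printf '%s\n' "$first" "$second" "$third"
```

With the flag set to 2, the marked text should be the name field, which is what the command above then uppercases with \U&.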
