I need to create a new file with an awk script that modifies the "name" column by deleting the surnames. It must be done with a while or for loop.
Original csv:
id,name,date,manner_of_death,armed,age,gender,race,city,state,signs_of_mental_illness,threat_level,flee,body_camera,longitude,latitude,is_geocoding_exact
3,Tim Elliot,2015-01-02,shot,gun,53,M,A,Shelton,WA,True,attack,Not fleeing,False,-123.122,47.247,True
4,Lewis Lee Lembke,2015-01-02,shot,gun,47,M,W,Aloha,OR,False,attack,Not fleeing,False,-122.892,45.487,True
8,Matthew Hoffman,2015-01-04,shot,toy weapon,32,M,W,San Francisco,CA,True,attack,Not fleeing,False,-122.422,37.763,True
The expected output:
id,name,date,manner_of_death,armed,age,gender,race,city,state,signs_of_mental_illness,threat_level,flee,body_camera,longitude,latitude,is_geocoding_exact
3,Tim,2015-01-02,shot,gun,53,M,A,Shelton,WA,True,attack,Not fleeing,False,-123.122,47.247,True
4,Lewis,2015-01-02,shot,gun,47,M,W,Aloha,OR,False,attack,Not fleeing,False,-122.892,45.487,True
8,Matthew,2015-01-04,shot,toy weapon,32,M,W,San Francisco,CA,True,attack,Not fleeing,False,-122.422,37.763,True
CodePudding user response:
I would use GNU AWK for this task as follows, in order to comply with the "must necessarily be made with a while" requirement. Let file.txt content be
id,name,date,manner_of_death,armed,age,gender,race,city,state,signs_of_mental_illness,threat_level,flee,body_camera,longitude,latitude,is_geocoding_exact
3,Tim Elliot,2015-01-02,shot,gun,53,M,A,Shelton,WA,True,attack,Not fleeing,False,-123.122,47.247,True
4,Lewis Lee Lembke,2015-01-02,shot,gun,47,M,W,Aloha,OR,False,attack,Not fleeing,False,-122.892,45.487,True
8,Matthew Hoffman,2015-01-04,shot,toy weapon,32,M,W,San Francisco,CA,True,attack,Not fleeing,False,-122.422,37.763,True
then
awk 'BEGIN{FS=OFS=","}{while(sub(/ [[:alpha:]]*$/,"",$2)){}}{print}' file.txt
output
id,name,date,manner_of_death,armed,age,gender,race,city,state,signs_of_mental_illness,threat_level,flee,body_camera,longitude,latitude,is_geocoding_exact
3,Tim,2015-01-02,shot,gun,53,M,A,Shelton,WA,True,attack,Not fleeing,False,-123.122,47.247,True
4,Lewis,2015-01-02,shot,gun,47,M,W,Aloha,OR,False,attack,Not fleeing,False,-122.892,45.487,True
8,Matthew,2015-01-04,shot,toy weapon,32,M,W,San Francisco,CA,True,attack,Not fleeing,False,-122.422,37.763,True
Explanation: first I inform GNU AWK that both the field separator (FS) and the output field separator (OFS) are a comma (,). Then I use a while statement to remove a space followed by zero or more (*) letters ([[:alpha:]]) that sit immediately before the end of the string in the 2nd field, replacing them with an empty string. The sub string function alters the variable it is given, in this case the 2nd field ($2), and returns 1 if a change was made and 0 otherwise, so the while loop terminates once no further change is possible. After the while loop ends I print the changed line.
(tested in gawk 4.2.1)
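To make the loop's behaviour visible, here is a small sketch that counts how many times sub succeeds for each name. The function name count_passes and the counter n are made up for illustration and are not part of the answer above; the sample lines are shortened to three fields.

```shell
# Hypothetical helper: strips trailing words from the name field and
# prints the remaining name together with the number of successful passes.
count_passes() {
  awk 'BEGIN { FS = OFS = "," }
  {
    n = 0
    # sub() returns 1 when it removed a trailing " word", 0 when nothing matched
    while (sub(/ [[:alpha:]]*$/, "", $2)) n++
    print $2, n
  }'
}

printf '%s\n' '4,Lewis Lee Lembke,2015-01-02' '3,Tim Elliot,2015-01-02' | count_passes
```

A three-word name needs two passes and a two-word name needs one, which is why a loop is used rather than a single sub call. Note that the pattern only matches letters, so a surname containing a hyphen or apostrophe would presumably be left in place.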
CodePudding user response:
Here we can print all the fields except $2 as they are. We split $2 on spaces and print only the first element.
echo "3,Tim Elliot,2015-01-02,shot,gun,53,M,A,Shelton,WA,True,attack,Not fleeing,False,-123.122,47.247,True" | awk -F, '{for(i=1; i<=NF; i++) {v=$i; if(i == 2) {split($i,a," "); v=a[1]} printf "%s%s", v, (i<NF ? "," : "\n")}}'
output
3,Tim,2015-01-02,shot,gun,53,M,A,Shelton,WA,True,attack,Not fleeing,False,-123.122,47.247,True
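Since the question asks for a new file, the for-loop idea can also be applied to a whole file and redirected. The following is only a sketch: the function name first_names and the file names file.txt and new.csv are placeholders, and it assumes no field contains a quoted comma, as in the sample data.

```shell
# Hypothetical helper: keeps only the first word of field 2 on every line.
first_names() {
  awk 'BEGIN { FS = OFS = "," }
  {
    for (i = 1; i <= NF; i++)
      if (i == 2) { split($i, parts, " "); $i = parts[1] }
    print   # $0 is rebuilt with OFS, so no trailing comma appears
  }' "$@"
}

# Demo with a throwaway input file (paths are placeholders)
tmp=$(mktemp -d)
printf '%s\n' 'id,name,date' '3,Tim Elliot,2015-01-02' '4,Lewis Lee Lembke,2015-01-02' > "$tmp/file.txt"
first_names "$tmp/file.txt" > "$tmp/new.csv"
cat "$tmp/new.csv"
```

Assigning to $i lets awk rejoin the fields itself, so the header line passes through unchanged and each record ends with a newline instead of a comma.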