I have a .csv where each row corresponds to a person (first column) and attributes with values that are available for that person. I want to extract the names and values a particular attribute for persons where the attribute is available. The doc is structured as follows:
name,attribute1,value1,attribute2,value2,attribute3,value3
joe,height,5.2,weight,178,hair,
james,,,,,,
jesse,weight,165,height,5.3,hair,brown
jerome,hair,black,breakfast,donuts,height,6.8
I want a file that looks like this:
name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8
Using this earlier post, I've tried a few different awk
methods but am still having trouble getting both the first column and then whatever column has the desired value for the attribute (say height). For example the following returns everything.
awk -F "height," '{print $1 "," FS$2}' file.csv
I could grep
only the rows with height in them, but I'd prefer to do everything in a single line if I can.
CodePudding user response:
You may use this awk
:
cat attrib.awk
BEGIN {
FS=OFS=","
print "name,attribute,value"
}
NR > 1 && match($0, k "[^,] ") {
print $1, substr($0, RSTART 1, RLENGTH-1)
}
# then run it as
awk -v k=',height,' -f attrib.awk file
name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8
# or this one
awk -v k=',weight,' -f attrib.awk file
name,attribute,value
joe,weight,178
jesse,weight,165
CodePudding user response:
I'd suggest a sed
one-liner:
sed -n 's/^\([^,]*\).*\(,height,[^,]*\).*/\1\2/p' file.csv
CodePudding user response:
One awk
idea:
awk -v attr="height" '
BEGIN { FS=OFS="," }
FNR==1 { print "name", "attribute", "value"; next }
{ for (i=2;i<=NF;i =2) # loop through even-numbered fields
if ($i == attr) { # if field value is an exact match to the "attr" variable then ...
print $1,$i,$(i 1) # print current name, current field and next field to stdout
next # no need to check rest of current line; skip to next input line
}
}
' file.csv
NOTE: this assumes the input value (height
in this example) will match exactly (including same capitalization) with a field in the file
This generates:
name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8
CodePudding user response:
With a perl one-liner:
$ perl -lne '
print "name,attribute,value" if $.==1;
print "$1,$2" if /^(\w ).*(height,\d \.\d )/
' file
output
name,attribute,value
joe,height,5.2
jesse,height,5.3
jerome,height,6.8