I have an awk script and need to fix its output.
#!/usr/bin/awk -f
BEGIN {
print "Identifier,prob_id,score,comments"
FS = ","
}
{
if (NR>1)
print "Participant",$1,$4
}
The result of this script is:
Identifier,prob_id,score,comments
Participant 748737 code compiles
Participant 748737 input 101
Participant 748737 input 10011
Participant 748737 empty input
Participant 748737 bad input
Participant 748708 code compiles
Participant 748708 input 101
Participant 748708 input 10011
Participant 748708 empty input
Participant 748708 bad input
Participant 748701 code compiles
Participant 748701 input 101
Participant 748701 input 10011
Participant 748701 empty input
Participant 748701 bad input
The original CSV file data is:
Identifier,prob_id,score,prob_desc
748737,1,0,code compiles
748737,2,0,input 101
748737,3,0,input 10011
748737,4,0,empty input
748737,5,0,bad input
748708,1,0,code compiles
748708,2,0,input 101
748708,3,1,input 10011
748708,4,0,empty input
748708,5,1,bad input
748701,1,0,code compiles
748701,2,0,input 101
748701,3,0,input 10011
748701,4,0,empty input
748701,5,1,bad input
The required output is:
Identifier,prob_id,score,comments
Participant 748737,3_a,10,code compiles
Participant 748737,3_b,5,input 101
Participant 748737,3_c,5,input 10011
Participant 748737,3_d,5,empty input
Participant 748737,3_e,5,bad input
Participant 748708,3_a,10,code compiles
Participant 748708,3_b,5,input 101
Participant 748708,3_c,0,input 10011
Participant 748708,3_d,5,empty input
Participant 748708,3_e,0,bad input
Participant 748701,3_a,10,code compiles
Participant 748701,3_b,5,input 101
Participant 748701,3_c,5,input 10011
Participant 748701,3_d,5,empty input
Participant 748701,3_e,0,bad input
Note
• prob_id values in the second field should be renamed from 1-5 to 3_a, 3_b, …, 3_e
• If the input score value is 1, the transformed output value should be 0; if the input score value is 0, the transformed output values should be 10, 5, 5, 5, 5 for problem ids 1 through 5, respectively.
CodePudding user response:
You are overthinking what you need to do. All you really need to do is output the lines of the original.csv file unchanged. The only caveat is that for every record after the first, you output "Participant " as a prefix. You can do that with a ternary that controls whether "Participant " prints based on the record number (line number) NR.
For example, all you really need is:
awk '{ print (NR>1 ? "Participant " : "") $0 }' original.csv
Example Use/Output
With your sample data in original.csv you get:
$ awk '{ print (NR>1 ? "Participant " : "") $0 }' original.csv
Identifier,prob_id,score,prob_desc
Participant 748737,1,0,code compiles
Participant 748737,2,0,input 101
Participant 748737,3,0,input 10011
Participant 748737,4,0,empty input
Participant 748737,5,0,bad input
Participant 748708,1,0,code compiles
Participant 748708,2,0,input 101
Participant 748708,3,1,input 10011
Participant 748708,4,0,empty input
Participant 748708,5,1,bad input
Participant 748701,1,0,code compiles
Participant 748701,2,0,input 101
Participant 748701,3,0,input 10011
Participant 748701,4,0,empty input
Participant 748701,5,1,bad input
If you want to write the command in script form (which, from your question, it appears you do), the long-form way without a ternary is to use the pattern NR == 1 to handle the first record differently and output it without a prefix:
#!/usr/bin/awk -f
NR == 1 {
print $0
next
}
{
print "Participant " $0
}
(same output)
CodePudding user response:
Not a complete answer, but something to get you going. Basically, use awk arrays as lookup tables to perform the mapping.
This is verbose; much more compact code is possible once you figure out the basics.
awk -F, '
BEGIN {
    OFS = ","
    # Lookup table for prob_id
    probid_code[1] = "3_a"
    probid_code[2] = "3_b"
    # ... extend as needed
    probid_code[5] = "3_e"
    # Lookup table for score
    probid_score[1] = 10
    probid_score[2] = 5
    # ... extend as needed
    probid_score[5] = 5
}
NR == 1 {
    print "Identifier", "prob_id", "score", "comments"
}
NR > 1 {
    participant = "Participant " $1
    prob_id = probid_code[$2]
    # Score 1 maps to 0; score 0 maps to the per-problem value; anything else is left empty
    score = ($3 == 1) ? 0 : ($3 == 0) ? probid_score[$2] : ""
    comments = $4
    print participant, prob_id, score, comments
}
' original.csv
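As a rough sketch of the "more compact" form mentioned above, both lookup tables could be built with split() instead of one assignment per entry. The entries for problems 3 and 4 (3_c, 3_d, and 5 points each) are taken from the required output in the question. The function name remap_scores and the three-row sample file are assumptions made only for this illustration.

```shell
# remap_scores is a hypothetical helper name; it reads the CSV on stdin.
remap_scores() {
    awk -F, '
    BEGIN {
        OFS = ","
        # split() fills the arrays with indexes 1..5
        split("3_a 3_b 3_c 3_d 3_e", code, " ")
        split("10 5 5 5 5", points, " ")
    }
    # Replace the header line with the required one
    NR == 1 { print "Identifier", "prob_id", "score", "comments"; next }
    {
        # Score 1 becomes 0, score 0 becomes the per-problem value, anything else is left empty
        print "Participant " $1, code[$2], ($3 == 1 ? 0 : ($3 == 0 ? points[$2] : "")), $4
    }
    '
}

# A small subset of the sample data from the question, for illustration only
cat > original.csv <<'EOF'
Identifier,prob_id,score,prob_desc
748737,1,0,code compiles
748708,3,1,input 10011
748701,5,1,bad input
EOF

remap_scores < original.csv
# Identifier,prob_id,score,comments
# Participant 748737,3_a,10,code compiles
# Participant 748708,3_c,0,input 10011
# Participant 748701,3_e,0,bad input
```

Keeping the tables as space-separated strings means adding a sixth problem only requires appending one value to each string.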