I have 1 data file 'file1.csv' with two columns data and need to read file content and find out the between rows based on first column string values 'finance & total' and similarly 'marketing & total'. Script should find 'finance' and 'finance total' then get the between (A,B,D,H) records And script should find 'marketing' and 'marketing total' then get the between (C,E,G) records. I don't want to get records before 'Finance' and after 'Marketing Total' string.
Also need to skip lines where 2nd or 3rd fields have nan string.
I tried with below code but not getting expected output.
while read line
do
if [[ $line == "finance" ]]; then
echo $line >> output.csv
else
echo ""
fi
done < file1.csv
file1.csv:
Departments Accounts ##Header
Monthwise data ##Sub Header
IT,Amount
P,20
q,30
IT Total,50
Finance,Amount
A,20
B,30
D,60
H,50
Finance Total,160
<Empty space>
Marketing,Amount
C,40
E,10
G,60
Marketing Total,110
HR,amount
X,20
Y,50
Z,10
Expected Output:
department,name,amount
Finance,A,20
Finance,B,30
Finance,D,60
Finance,H,50
Marketing,C,40
Marketing,E,10
Marketing,G,60
Expected output should be stored in output file.
CodePudding user response:
With your shown samples, please try following awk
program. Simple explanation would be, setting comma as FS, OFS for all lines and print heading in BEGIN section itself. Then in main program, check different conditions and print the values accordingly. NOTE this program will exit once it finds Marketing Total
as per OP's mention.
awk '
BEGIN{
FS=OFS=","
print "department,name,amount"
}
/Marketing Total/ { exit }
/Finance Total/ || ($2=="nan" || $3=="nan"){ next }
/Finance,Amount/ || /Marketing,Amount/ { found=1;first=$1;next }
found { print first,$0 }
' Input_file
OR in case nan value needs to be compared as case in-sensitive then try following code.
awk '
BEGIN{
FS=OFS=","
print "department,name,amount"
}
/Marketing Total/ { exit }
/Finance Total/ || (tolower($2)=="nan" || tolower($3)=="nan"){ next }
/Finance,Amount/ || /Marketing,Amount/ { found=1;first=$1;next }
found { print first,$0 }
' Input_file