I have a file like this:
chr1 13369510 13369602 PRAMEF18 0
chr1 13369510 13369602 PRAMEF19 0
I want to compare the first three columns of every row, and if they match, I want output like this:
chr1 13369510 13369602 PRAMEF18,PRAMEF19 0
CodePudding user response:
This should work for a tab-separated file:
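awk -F'\t' '{
    key = $1 SUBSEP $2 SUBSEP $3          # separator avoids collisions such as "chr1"+"13" vs "chr11"+"3"
    if (!(key in lines)) order[++n] = key # remember first-seen order of keys
    last_fields[key] = $5                 # the input has five columns, so only $5 trails
    lines[key] = (key in lines) ? lines[key] "," $4 : $1 "\t" $2 "\t" $3 "\t" $4
}
END {
    for (i = 1; i <= n; i++) print lines[order[i]] "\t" last_fields[order[i]]
}' your_file.tsv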
- First, use columns 1, 2 and 3 together as the key.
- Save the trailing column(s) in an associative array indexed by that key, so they can be appended when printing.
- Build a second associative array holding one output line per key. If the key already exists, append the 4th column (the one you want to merge) to that line.