Merge three columns in one (linux, python, or perl)

Time:10-19

I have one file (.tsv) that contains variant calls for all the samples. I would like to merge the first three columns into one column.

Example input (file name: variants.tsv). The first three columns, which I want to merge, are:

lane sampleID Barcode

B31 00-00-NNA-0000 0000

Desired output:

ID

B31_00-00-NNA-0000_0000

What are the recommended methods?

CodePudding user response:

One way, with a perl one-liner:

perl -F'\t' -lane '
    if ($. == 1) {
        print join("\t", "ID", @F[3..$#F])
    } else {
        print join("\t", join("_", @F[0,1,2]), @F[3..$#F])
    }' variants.tsv

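This splits each line on tabs into an array (@F). For the header (line 1, where $. == 1) it prints ID in place of the first three column names; for every later line it joins the first three fields with underscores. In both cases the remaining fields (@F[3..$#F]) are passed through unchanged, and everything is joined back into a tab-delimited line.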
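Since the question also mentions Python, here is a minimal sketch of the same split/slice/join idea. It is not part of the original answer; the function name merge_first_three and the extra "gene" column in the demo are made up for illustration, and it assumes a plain tab-separated file with a header on the first line.

```python
import io

def merge_first_three(lines, sep="_"):
    """Yield each line with its first three tab-separated fields merged into one."""
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if lineno == 1:
            # Header: replace the first three column names with "ID".
            yield "\t".join(["ID"] + fields[3:])
        else:
            # Data row: join the first three fields, keep the rest unchanged.
            yield "\t".join([sep.join(fields[:3])] + fields[3:])

# Demo on in-memory data; the "gene" column is hypothetical.
sample = io.StringIO(
    "lane\tsampleID\tBarcode\tgene\n"
    "B31\t00-00-NNA-0000\t0000\tBRCA1\n"
)
for out_line in merge_first_three(sample):
    print(out_line)
```

To run it on the real file, pass an open file object instead of the StringIO, e.g. open("variants.tsv"), and write the yielded lines wherever you need them.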

CodePudding user response:

Starting from this

lane    sampleID    Barcode
B31 00-00-NNA-0000  0000

and using Miller, you can run

mlr --tsv put -S '$ID=$lane."_".$sampleID."_".$Barcode' input.tsv >output.tsv

to get

+------+----------------+---------+-------------------------+
| lane | sampleID       | Barcode | ID                      |
+------+----------------+---------+-------------------------+
| B31  | 00-00-NNA-0000 | 0000    | B31_00-00-NNA-0000_0000 |
+------+----------------+---------+-------------------------+

If you want only the ID field, the command is

mlr --tsv put -S '$ID=$lane."_".$sampleID."_".$Barcode' then cut -f ID input.tsv >output.tsv
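If Miller is not available, a name-based version of the ID-only output can be sketched in Python with the standard csv module. This is only an illustration, not part of the original answer: the helper name id_column is invented, and it assumes the header contains exactly the names lane, sampleID and Barcode.

```python
import csv
import io

def id_column(src, names=("lane", "sampleID", "Barcode"), sep="_"):
    """Return one merged ID string per data row, looking columns up by name."""
    reader = csv.DictReader(src, delimiter="\t")
    # Values are read as strings, so a barcode like "0000" keeps its zeros.
    return [sep.join(row[name] for name in names) for row in reader]

# Demo on the sample row from the question.
sample = io.StringIO("lane\tsampleID\tBarcode\nB31\t00-00-NNA-0000\t0000\n")
print("ID")
for value in id_column(sample):
    print(value)
```

Looking fields up by name, as the Miller command does, keeps working if the three columns are not the first three in the file.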