Sorting a table in R using only a part of a string in a column

Time:09-12

Data:

chr10:10003219_A_G  LoF  
chr14:983281_T_C    Missense
chr1:1283721_A_G    Splice
chr21:198727614_T_C Missense
chrX:123212_T_CA    LOF
chr1:12309123_GG_C  Missense

Desired output:

chr1:1283721_A_G    Splice 
chr1:12309123_GG_C  Missense   
chr10:10003219_A_G  LoF  
chr14:983281_T_C    Missense
chr21:198727614_T_C Missense
chrX:123212_T_CA    LOF

I have tried: df[order(df$V1),]

But this puts chr10 before any of the others.

I would like them ordered by chromosome (chr1, chr2, ..., chrX) and then, within each chromosome, by the position that follows the colon, in ascending order. I have to keep the table in the same format for downstream analysis, so I can't split the string into separate columns.

Any help would be appreciated.

CodePudding user response:

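Use str_order() from the stringr package with numeric = TRUE, which compares runs of digits as numbers instead of character by character: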

library(stringr)
df[str_order(df$V1, numeric = TRUE), ]
                   V1       V2
3    chr1:1283721_A_G   Splice
6  chr1:12309123_GG_C Missense
1  chr10:10003219_A_G      LoF
2    chr14:983281_T_C Missense
4 chr21:198727614_T_C Missense
5    chrX:123212_T_CA      LOF
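
If adding stringr is not an option, or the chromosome order needs to be controlled explicitly, a base R version might look like the sketch below. It is not part of the answer above. It assumes every V1 value has the form chr<name>:<position>_<ref>_<alt>, and the ranking of 1 to 22 followed by X, Y, M is an assumption to adjust as needed. The helper vectors chrom, pos and chrom_rank are illustrative names; df itself is left unchanged.

```r
# Rebuild the example table (column names V1/V2 as in the printed output)
df <- data.frame(
  V1 = c("chr10:10003219_A_G", "chr14:983281_T_C", "chr1:1283721_A_G",
         "chr21:198727614_T_C", "chrX:123212_T_CA", "chr1:12309123_GG_C"),
  V2 = c("LoF", "Missense", "Splice", "Missense", "LOF", "Missense"),
  stringsAsFactors = FALSE
)

# Extract the chromosome label and the position into temporary vectors;
# the table keeps its original single-column format
chrom <- sub("^chr([^:]+):.*$", "\\1", df$V1)
pos   <- as.numeric(sub("^[^:]+:([0-9]+)_.*$", "\\1", df$V1))

# Assumed chromosome order: 1..22, then X, Y, M.
# Labels not in this list get NA and are placed last by order()
chrom_rank <- match(chrom, c(as.character(1:22), "X", "Y", "M"))

# Sort by chromosome rank first, then by numeric position
sorted <- df[order(chrom_rank, pos), ]
print(sorted)
```

On the example data this should give the same row order as the str_order() call (rows 3, 6, 1, 2, 4, 5).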