Reorder a string (identically repeated in all the rows) in a specific column

Time:09-16

a) I have the same string for all the rows in column 4.

b) I am trying to reorder this string.

c) Each row is split into 8 groups by strsplit.

d) I would like to reorder those groups in every row of the column (the same reordering for each row).

I tried the script below, but I can't work out what to put inside the [[ ? ]]. I tried the row numbers and the column name, but neither changes the order in the column. What should go in place of the [[ ? ]] so that the string is reordered in every row of the column? Any advice?

df

dput(df[1:4, ])
structure(list(V1 = c("chr1", "chr1", "chr1", "chr1"), V2 = 3003641:3003644, 
    V3 = 3003650:3003653, V4 = c("Class=C2H2.zinc.finger.factors;Family=More.than.3.adjacent.zinc.finger.factors;strand=-;id=ZNF449.Me9548.1.YY2017.HT-SE2;seq=CCCCCCCCCC;score=10.4571;pval=8.34e-05;Averageconservationscore=NA", 
    "Class=C2H2.zinc.finger.factors;Family=More.than.3.adjacent.zinc.finger.factors;strand=-;id=ZNF449.Me9548.1.YY2017.HT-SE2;seq=CCCCCCCCCC;score=10.4571;pval=8.34e-05;Averageconservationscore=NA", 
    "Class=C2H2.zinc.finger.factors;Family=More.than.3.adjacent.zinc.finger.factors;strand=-;id=ZNF449.Me9548.1.YY2017.HT-SE2;seq=CCCCCCCCCC;score=10.4571;pval=8.34e-05;Averageconservationscore=NA", 
    "Class=C2H2.zinc.finger.factors;Family=More.than.3.adjacent.zinc.finger.factors;strand=-;id=ZNF449.Me9548.1.YY2017.HT-SE2;seq=ACCCCCCCCC;score=10.8429;pval=6.74e-05;Averageconservationscore=NA"
    ), V5 = c(0L, 0L, 0L, 0L)), row.names = c(NA, 4L), class = "data.frame")
script

df$V4 <- strsplit(df$V4, '[,]')
df$V4 <- df$V4[[  ?  ]][c(1,2,4,3,5,6,7,8)]
order col4 before (id after strand) 

Class=C2H2.zinc.finger.factors;Family=More.than.3.adjacent.zinc.finger.factors;strand=-;id=YY1.Ca9487.2.YY2017.HT-SE2;seq=AGCCATCTTGTCTCACGAGTCCA;score=5.57576;pval=8.28e-05;Averageconservationscore=NA
order col4 after (id before strand)

Class=C2H2.zinc.finger.factors;Family=More.than.3.adjacent.zinc.finger.factors;id=YY1.Ca9487.2.YY2017.HT-SE2;strand=-;seq=AGCCATCTTGTCTCACGAGTCCA;score=5.57576;pval=8.28e-05;Averageconservationscore=NA

CodePudding user response:

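Use strsplit to split the string on ';' (the fields are separated by semicolons, not commas), then sapply to reorder each row's pieces and paste them back into one string. strsplit returns a list with one element per row, so no single [[ ]] index can reorder every row at once.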

df$V4 <- sapply(strsplit(df$V4, ';', fixed = TRUE), function(x) 
                paste0(x[c(1,2,4,3,5,6,7,8)], collapse = ';'))
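
The positional index c(1,2,4,3,5,6,7,8) relies on every row listing its fields in exactly the same order. As a rough alternative sketch, the fields could be reordered by key name instead. The helper reorder_fields, the vector key_order and the two-row data frame below are hypothetical names and data made up for illustration, and the sketch assumes every row contains all eight keys.

```r
# Hypothetical two-row example mirroring the V4 column from the question
df <- data.frame(
  V4 = c("Class=A;Family=B;strand=-;id=X1;seq=CCC;score=10.4;pval=8e-05;Averageconservationscore=NA",
         "Class=A;Family=B;strand=+;id=X2;seq=ACC;score=10.8;pval=6e-05;Averageconservationscore=NA"),
  stringsAsFactors = FALSE
)

# Desired field order, given by key name
# (assumption: every row contains all of these keys)
key_order <- c("Class", "Family", "id", "strand", "seq", "score", "pval",
               "Averageconservationscore")

# Hypothetical helper: split one string, name each piece by its key,
# then select the pieces in the requested order and glue them back together
reorder_fields <- function(s, keys) {
  parts <- strsplit(s, ";", fixed = TRUE)[[1]]
  # The key is everything before the first "="
  names(parts) <- sub("=.*$", "", parts)
  paste(parts[keys], collapse = ";")
}

# Apply the helper to every row; vapply guarantees one string per row
df$V4 <- vapply(df$V4, reorder_fields, character(1), keys = key_order,
                USE.NAMES = FALSE)
```

If a key were missing from a row, the corresponding piece would come back as NA, so this version is only a reasonable choice when the set of keys is known in advance.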