Convert data.frame column from character to numeric

Time:03-30

I am working on a data frame that holds information like this:



df<- as.data.frame(read.table("headen.bed",header = FALSE, sep="\t",stringsAsFactors=FALSE, quote=""))
C1  C2    C3 
33  12249  0,300,3900,400,4500,400,4200
83  9213   0,49,66,75,158,160,170,183,218
146 680    0,3,13,129,274,278,383,481,482,496

I want to add C1 to each element of C3, so the result would look like this:

C1  C2    C3 
33  12249  33,333,3933,433,4533,433,4233
83  9213   83,132,149,158,241,243,253,266,301
146 680    146,149,159,275,420,424,529,627,628,642

However, C3 is read in as character class. I tried several ways to convert it to numeric (as.numeric, type.convert, and character to factor and then numeric), but none of them worked. Can anyone suggest the best way to do this?

User response:

You can try,

mapply(function(x, y) paste(x + as.numeric(y), collapse = ','), df$C1, strsplit(df$C3, ','))
[1] "33,333,3933,433,4533,433,4233"  "83,132,149,158,241,243,253,266,301"  "146,149,159,275,420,424,529,627,628,642"
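
To see why calling as.numeric on C3 directly does not work, and what the mapply call does for each row, the steps can be unpacked for a single row. This is only an illustrative sketch; the names c1, c3, parts and shifted are made up for the example.

```r
# One row of the data: C1 is a number, C3 is one comma-separated string
c1 <- 33
c3 <- "0,300,3900,400,4500,400,4200"

# Converting the whole string at once gives NA (with a warning),
# because "0,300,..." as a whole is not a valid number
suppressWarnings(as.numeric(c3))

# Split on commas first; strsplit returns a list, so take its first element
parts <- strsplit(c3, ",", fixed = TRUE)[[1]]

# Each piece now converts cleanly, and the addition is vectorised
shifted <- c1 + as.numeric(parts)

# Collapse back into a single comma-separated string
paste(shifted, collapse = ",")
```

mapply simply repeats these steps for every pair of C1 value and split C3 vector.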

DATA

df <- data.frame(C1 = c(33, 83, 146), 
                 C2 = c(1, 2, 3), 
                 C3 = c('0,300,3900,400,4500,400,4200', '0,49,66,75,158,160,170,183,218', '0,3,13,129,274,278,383,481,482,496'), 
                 stringsAsFactors = FALSE)

EDIT: To make C3 numeric, you will have to split it into multiple columns. There are a bunch of ways to do it, as shown here. I like the splitstackshape approach, i.e.

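library(splitstackshape)
# Store the summed values back in C3 first; the output below reflects this
df$C3 <- mapply(function(x, y) paste(x + as.numeric(y), collapse = ','), df$C1, strsplit(df$C3, ','))
df1 <- cSplit(df, 'C3', sep = ',')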

#    C1 C2 C3_01 C3_02 C3_03 C3_04 C3_05 C3_06 C3_07 C3_08 C3_09 C3_10
#1:  33  1    33   333  3933   433  4533   433  4233    NA    NA    NA
#2:  83  2    83   132   149   158   241   243   253   266   301    NA
#3: 146  3   146   149   159   275   420   424   529   627   628   642

str(df1)
Classes ‘data.table’ and 'data.frame':  3 obs. of  12 variables:
 $ C1   : num  33 83 146
 $ C2   : num  1 2 3
 $ C3_01: int  33 83 146
 $ C3_02: int  333 132 149
 $ C3_03: int  3933 149 159
 $ C3_04: int  433 158 275
 $ C3_05: int  4533 241 420
 $ C3_06: int  433 243 424
 $ C3_07: int  4233 253 529
 $ C3_08: int  NA 266 627
 $ C3_09: int  NA 301 628
 $ C3_10: int  NA NA 642
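
If adding a package is not an option, roughly the same wide layout can probably be built in base R. The following is a minimal sketch, assuming a shortened copy of the sample data; vals, wide and df_wide are names chosen for this example only, and the result is a plain data.frame with numeric columns rather than a data.table with integer ones.

```r
# Shortened sample data (an assumption for illustration, not the original file)
df <- data.frame(C1 = c(33, 83, 146),
                 C3 = c("0,300,3900", "0,49", "0,3,13,129"),
                 stringsAsFactors = FALSE)

# Split each string and add the matching C1 value to its pieces
vals <- Map(function(x, y) x + as.numeric(y),
            df$C1, strsplit(df$C3, ",", fixed = TRUE))

# Pad every vector to the longest length; indexing past the end yields NA
n <- max(lengths(vals))
wide <- t(vapply(vals, function(v) v[seq_len(n)], numeric(n)))
colnames(wide) <- sprintf("C3_%02d", seq_len(n))

# Replace the character column with the numeric ones
df_wide <- cbind(df["C1"], wide)
str(df_wide)
```

Rows with fewer values than the longest row are filled with NA, just as in the cSplit output.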