Why doesn't str_split_fixed() create new column?

Time:09-21

I have the following data set:

>data_short

Symbol_ID      GFP_Mean  GFP_SD Cells
   <chr>             <dbl>   <dbl> <dbl>
 1 Control_0        0.0303 0.00657 7071.
 2 XRCC4_7518       0.0396 0.00768 5022 
 3 XRCC5_7520       0.0305 0.00629 5781.
 4 BRCA1_672        0.0178 0.00833 1822.
 5 DDX48_9775       0.109  0.0201   239 
 6 HMGN1_3150       0.0997 0.00875 1173 
 7 PRDM13_59336     0.0789 0.00794  980 
 8 UBOX5_22888      0.0734 0.00653 1378 
 9 HIST1H2AE_3012   0.0719 0.00592 1906 
10 HMGN2_3151       0.0691 0.00934  738 

I tried to split the first column into two separate columns, and at first glance it seems to work:

data_short<-data_short %>% mutate(Symbol_ID=str_split_fixed(data_short$Symbol_ID, "_", 2))

Symbol_ID[,1] [,2]  GFP_Mean  GFP_SD Cells
   <chr>         <chr>    <dbl>   <dbl> <dbl>
 1 Control       0       0.0303 0.00657 7071.
 2 XRCC4         7518    0.0396 0.00768 5022 
 3 XRCC5         7520    0.0305 0.00629 5781.
 4 BRCA1         672     0.0178 0.00833 1822.
 5 DDX48         9775    0.109  0.0201   239 
 6 HMGN1         3150    0.0997 0.00875 1173 
 7 PRDM13        59336   0.0789 0.00794  980 
 8 UBOX5         22888   0.0734 0.00653 1378 
 9 HIST1H2AE     3012    0.0719 0.00592 1906 
10 HMGN2         3151    0.0691 0.00934  738 

But when I check str(data_short), Symbol_ID is still a single column, now holding a two-column character matrix:

> str(data_short)
tibble [1,177 × 4] (S3: tbl_df/tbl/data.frame)
 $ Symbol_ID: chr [1:1177, 1:2] "Control" "XRCC4" "XRCC5" "BRCA1" ...
 $ GFP_Mean : num [1:1177] 0.0303 0.0396 0.0305 0.0178 0.1088 ...
 $ GFP_SD   : num [1:1177] 0.00657 0.00768 0.00629 0.00833 0.02014 ...
 $ Cells    : num [1:1177] 7071 5022 5781 1822 239 ...

Why is that, and how can I fix it? Thanks in advance!

Answer:

str_split_fixed() returns a character matrix, so assigning its result inside mutate() stores the whole two-column matrix in the single column Symbol_ID instead of creating two columns. tidyr::separate() is better suited to this case, e.g.:

data_short %>%
  tidyr::separate(Symbol_ID, into = c("SymbolID1", "SymbolID2"), sep = "_")
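
To make the difference concrete, here is a minimal sketch on a three-row stand-in for data_short (data_demo and the column names Symbol and ID are illustrative, not from the question). It shows separate() producing ordinary character columns, and, as an alternative, keeping str_split_fixed() but assigning each matrix column on its own:

```r
library(tibble)
library(tidyr)
library(stringr)

# Small stand-in for data_short (first three rows of the question's data)
data_demo <- tibble(
  Symbol_ID = c("Control_0", "XRCC4_7518", "XRCC5_7520"),
  GFP_Mean  = c(0.0303, 0.0396, 0.0305)
)

# Option 1: separate() replaces Symbol_ID with two plain character columns.
# "Symbol" and "ID" are illustrative names.
by_separate <- separate(data_demo, Symbol_ID,
                        into = c("Symbol", "ID"), sep = "_")

# Option 2: keep str_split_fixed(), but pull each matrix column out
# into its own data frame column instead of storing the whole matrix.
parts <- str_split_fixed(data_demo$Symbol_ID, "_", 2)
by_matrix <- data_demo
by_matrix$Symbol <- parts[, 1]
by_matrix$ID <- parts[, 2]
by_matrix$Symbol_ID <- NULL

# Both results now have one-dimensional chr columns
str(by_separate)
str(by_matrix)
```

If the ID part should be numeric rather than character, separate() also accepts convert = TRUE, which runs type conversion on the new columns.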