My data set consists of many rows but only one column; I want to separate the values from that one column

Time:07-14

I have a dataset with thousands of rows and only one column. It was supposed to have 108 columns. All the values are separated by tabs, and I want to rewrite this data frame with separate columns in R. An example of one row is "A_23_P149050\t-0.78007\t-0.43862\t0.26336\t-0.02076\t-0.11873\t0.30805\t-0.70170\t0.18403\t1.42516\t0.77827\t0.49341\t-0.07636\t0.00152\t0.55901"

The example row above should become 15 separate columns. With strsplit, I get a list, and length still returns 1.
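
For context, strsplit returns a list with one character vector per input string, so a single row gives a list of length 1 even though it holds all the fields; that is probably why length reports 1. A minimal base R sketch of turning that list into columns follows. The data frame `df`, its column `V1`, and the two sample rows are made up for illustration, and it assumes every row has the same number of fields.

```r
# Hypothetical one-column data frame standing in for the real data
df <- data.frame(V1 = c("probe_a\t-0.78007\t0.26336",
                        "probe_b\t0.30805\t1.42516"),
                 stringsAsFactors = FALSE)

# One list element per row; each element is a character vector of fields
parts <- strsplit(df$V1, "\t", fixed = TRUE)
length(parts)   # number of rows, not number of fields
lengths(parts)  # number of fields in each row

# Bind the per-row vectors into a character matrix, then a data frame
wide <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)

# Convert columns that look numeric from character to numeric
wide[] <- lapply(wide, type.convert, as.is = TRUE)
str(wide)
```

If rows can have differing numbers of fields, rbind would recycle values with a warning, so it may be worth checking lengths(parts) first.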

CodePudding user response:

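# One-row example: a single column holding 15 tab-separated values
df <- data.frame(a = "A_23_P149050\t-0.78007\t-0.43862\t0.26336\t-0.02076\t-0.11873\t0.30805\t-0.70170\t0.18403\t1.42516\t0.77827\t0.49341\t-0.07636\t0.00152\t0.55901")

library(dplyr)  # provides the %>% pipe
library(tidyr)  # provides separate()

# Split column a on tabs into 15 new columns named col1 ... col15
df1 <- df %>% 
  separate(col = a, into = paste0("col", 1:15), sep = "\\t")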

> df1
          col1     col2     col3    col4     col5
1 A_23_P149050 -0.78007 -0.43862 0.26336 -0.02076
      col6    col7     col8    col9   col10
1 -0.11873 0.30805 -0.70170 0.18403 1.42516
    col11   col12    col13   col14   col15
1 0.77827 0.49341 -0.07636 0.00152 0.55901

CodePudding user response:

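Using scan, which splits on whitespace (including tabs) by default and here returns a character vector:

scan(what = character(), quiet = TRUE, text = "A_23_P149050\t-0.78007\t-0.43862\t0.26336\t-0.02076\t-0.11873\t0.30805\t-0.70170\t0.18403\t1.42516\t0.77827\t0.49341\t-0.07636\t0.00152\t0.55901")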
# [1] "A_23_P149050" "-0.78007"     "-0.43862"     "0.26336"      "-0.02076"     "-0.11873"     "0.30805"      "-0.70170"    
# [9] "0.18403"      "1.42516"      "0.77827"      "0.49341"      "-0.07636"     "0.00152"      "0.55901"     
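
Because scan returns one flat vector, applying it to many rows would still need a reshaping step. One base R alternative, sketched here under the assumption that every row has the same number of tab-separated fields, is to pass the whole column to read.table through its text argument. The data frame `df`, its column `V1`, and the sample values are hypothetical.

```r
# Hypothetical one-column data frame standing in for the real data
df <- data.frame(V1 = c("probe_a\t-0.78007\t0.26336",
                        "probe_b\t0.30805\t1.42516"),
                 stringsAsFactors = FALSE)

# Treat each element of V1 as one line of a tab-delimited file.
# quote = "" and comment.char = "" stop quotes or '#' inside IDs
# from being interpreted specially.
wide <- read.table(text = df$V1, sep = "\t", header = FALSE,
                   quote = "", comment.char = "",
                   stringsAsFactors = FALSE)
str(wide)
```

A side effect of this route is that read.table guesses column types, so the numeric fields come back as numeric rather than character.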

CodePudding user response:

data.table option:

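df <- data.frame(V1 = "A_23_P149050\t-0.78007\t-0.43862\t0.26336\t-0.02076\t-0.11873\t0.30805\t-0.70170\t0.18403\t1.42516\t0.77827\t0.49341\t-0.07636\t0.00152\t0.55901")
library(data.table)
# Convert df to a data.table by reference, then split V1 on tabs into V1 ... V15
# (V1 itself is overwritten with the first field)
setDT(df)[, paste0("V", 1:15) := tstrsplit(V1, "\\t")]
df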
#>              V1       V2       V3      V4       V5       V6      V7       V8
#> 1: A_23_P149050 -0.78007 -0.43862 0.26336 -0.02076 -0.11873 0.30805 -0.70170
#>         V9     V10     V11     V12      V13     V14     V15
#> 1: 0.18403 1.42516 0.77827 0.49341 -0.07636 0.00152 0.55901

Created on 2022-07-13 by the reprex package (v2.0.1)
