Creating a Data Frame with Commas

Time:03-04

Is it possible to make a data frame containing a column with "multiple elements"?

For instance - given the following data:

a = sample(c(1,-1), size = 2, replace = TRUE, prob = c(0.5, 0.5))
b = sample(c(1,-1), size = 3, replace = TRUE, prob = c(0.5, 0.5))
c = sample(c(1,-1), size = 4, replace = TRUE, prob = c(0.5, 0.5))

#some random numbers
d = rexp(3,5)

#some random letters
e = "g"

#id column
n_id = 1:3

Can all this be combined into a single data frame (4 columns, 3 rows)? I tried to do this the regular way:

answer = data.frame(a, b, c, d, e, n_id)

But I get this error:

Error in data.frame(a, b, c, d, e, n_id) : 
  arguments imply differing number of rows: 2, 3, 4, 1

Is it possible to do this in R? I am trying to get something like this:

[image: desired output table]

Thank you!

CodePudding user response:

library(data.table)

data.table(n_id = n_id, a = list(a, b, c), d = d, e = e)

    n_id           a          d      e
   <int>      <list>      <num> <char>
1:     1       -1, 1 0.01357525      g
2:     2    -1,-1, 1 0.34263042      g
3:     3  1, 1,-1, 1 0.08830073      g

You can also do this with the tidyverse:

library(tibble)

tibble(n_id = n_id, a = list(a, b, c), d = d, e = e)

   n_id a              d e    
  <int> <list>     <dbl> <chr>
1     1 <dbl [2]> 0.0136 g    
2     2 <dbl [3]> 0.343  g    
3     3 <dbl [4]> 0.0883 g   

Notice that under both approaches a is a list-column: each row holds a whole vector (here of length 2, 3 and 4) rather than a single value.
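The same idea can be sketched in base R without extra packages. This is only an illustration, not part of the original answer: it assumes the inputs from the question, and the names df_list and df_chr are made up here. I() keeps the list as a single column, and paste() with collapse is one way to get literal comma-separated strings instead.

```r
# Same inputs as in the question (values are random, so printed output varies)
a <- sample(c(1, -1), size = 2, replace = TRUE)
b <- sample(c(1, -1), size = 3, replace = TRUE)
c <- sample(c(1, -1), size = 4, replace = TRUE)
d <- rexp(3, 5)
e <- "g"
n_id <- 1:3

# I() stops data.frame() from trying to split the list into separate columns
df_list <- data.frame(n_id = n_id, a = I(list(a, b, c)), d = d, e = e)

# Alternative: collapse each vector into one comma-separated string per row
df_chr <- data.frame(
  n_id = n_id,
  a = sapply(list(a, b, c), paste, collapse = ","),
  d = d,
  e = e
)

lengths(df_list$a)  # 2 3 4
```

The list-column keeps the values numeric, so they can still be used in calculations; the string version is easier to print or export but has to be split again before use.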

CodePudding user response:

You can use this code:

a = sample(c(1,-1), size = 2, replace = TRUE, prob = c(0.5, 0.5))
b = sample(c(1,-1), size = 3, replace = TRUE, prob = c(0.5, 0.5))
c = sample(c(1,-1), size = 4, replace = TRUE, prob = c(0.5, 0.5))

#some random numbers
d = rexp(3,5)

#some random letters
e = "g"

#build a list, then set the data frame attributes by hand;
#the row names run up to the length of the longest column
df = list(a=a, b=b, c=c, d=d, e=e)
attributes(df) = list(names = names(df),
                      row.names = 1:max(lengths(df)), class = 'data.frame')

With output:

     a    b  c          d    e
1   -1    1 -1 0.05939183    g
2    1    1 -1 0.01683215 <NA>
3 <NA>   -1  1 0.59068018 <NA>
4 <NA> <NA>  1       <NA> <NA>
Warning message:
In format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,  :
  corrupt data frame: columns will be truncated or padded with NAs

It gives a warning because a data frame is supposed to have columns of equal length. The object built here is a corrupt data frame, and printing it pads the shorter columns with NA.
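If a padded layout like the one above is really what you want, a cleaner route is to pad each vector with NA yourself before building the data frame, so no warning is raised. The following is a minimal sketch under that assumption; cols, n and padded are names chosen for the example, not part of the answer above.

```r
# Same inputs as in the question (values are random)
a <- sample(c(1, -1), size = 2, replace = TRUE)
b <- sample(c(1, -1), size = 3, replace = TRUE)
c <- sample(c(1, -1), size = 4, replace = TRUE)
d <- rexp(3, 5)
e <- "g"

cols <- list(a = a, b = b, c = c, d = d, e = e)
n <- max(lengths(cols))  # length of the longest vector

# Assigning a larger length to a vector pads it with NA
padded <- lapply(cols, function(x) {
  length(x) <- n
  x
})

# All columns now have n elements, so this is a valid data frame
df <- as.data.frame(padded)
```

Unlike the list-column approach, this spreads each vector down the rows, so the rows no longer correspond to the three ids from the question.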
