Tibble equivalent of data frame creation in R

Time:09-27

In R, when I use the following code to generate a table with the data.frame() command:

nn <- 10
df <- data.frame(Completers = rep(c(1, 0), each = nn),
                 Gender = c(1, 0))

I get this result:

   Completers Gender
1           1      1
2           1      0
3           1      1
4           1      0
5           1      1
6           0      0
7           0      1
8           0      0
9           0      1
10          0      0

However, when I try to do the same with tibble::tibble():

tb <- tibble::tibble(Completers = rep(c(1, 0), each = nn),
                     Gender = c(1, 0))

I get the following error:

Error:
! Tibble columns must have compatible sizes.
• Size 10: Existing data.
• Size 2: Column `Gender`.
ℹ Only values of size one are recycled.
Run `rlang::last_error()` to see where the error occurred.

Needless to say, running rlang::last_error() does not help (me, at least).

Of course, I could simply run tb <- tibble::as_tibble(df) and get on with my life, but still...

Therefore, my question is:

  • What tibble() call is equivalent to the data.frame() call above, so that I get the same result?

sessioninfo::session_info() extract:

 setting  value
 version  R version 4.2.1 (2022-06-23)
 os       macOS Monterey 12.6
 system   x86_64, darwin17.0
 rstudio  2022.07.1 554 Spotted Wakerobin (desktop)
-------------------------------------------------------
package              * version    date (UTC) lib source
tibble                 3.1.8      2022-07-22 [1] CRAN (R 4.2.0)

Answer:

This is covered in the ?tibble documentation:

tibble() builds columns sequentially. When defining a column, you can refer to columns created earlier in the call. Only columns of length one are recycled.

> tibble::tibble(Completers = rep(c(1, 0), each = nn), Gender = 1)
# A tibble: 20 × 2
   Completers Gender
        <dbl>  <dbl>
 1          1      1
 2          1      1
 3          1      1
 4          1      1
 5          1      1
 6          1      1
 7          1      1
 8          1      1
 9          1      1
10          1      1
11          0      1
12          0      1
13          0      1
14          0      1
15          0      1
16          0      1
17          0      1
18          0      1
19          0      1
20          0      1

If we want the same output, use rep() with length.out:

tibble::tibble(Completers = rep(c(1, 0), each = nn),
               Gender = rep(c(1, 0), length.out = length(Completers)))
# A tibble: 20 × 2
   Completers Gender
        <dbl>  <dbl>
 1          1      1
 2          1      0
 3          1      1
 4          1      0
 5          1      1
 6          1      0
 7          1      1
 8          1      0
 9          1      1
10          1      0
11          0      1
12          0      0
13          0      1
14          0      0
15          0      1
16          0      0
17          0      1
18          0      0
19          0      1
20          0      0
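
An equivalent spelling, offered as a sketch rather than the recommended way: base R's rep_len() takes the target length as its second argument. Because tibble() builds columns sequentially, Completers is already available when Gender is defined. This assumes the tibble package is installed and sets nn to 10 as in the question's code.

```r
library(tibble)

nn <- 10
tb <- tibble(
  Completers = rep(c(1, 0), each = nn),
  # Recycle c(1, 0) explicitly to the length of the column defined above.
  Gender = rep_len(c(1, 0), length(Completers))
)

# Compare with the data.frame() version, which recycles implicitly.
df <- data.frame(Completers = rep(c(1, 0), each = nn),
                 Gender = c(1, 0))
identical(tb$Gender, df$Gender)
```

Spelling out the recycling keeps the intent visible in the code, which is the reason tibble() refuses to do it silently.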