After my data frame is converted to long, manipulated, then converted back to wide, the column types seem to have changed.
library(dplyr)
library(tidyr)
x = data.frame(A = rnorm(100),
               b = rnorm(100)) %>%
  mutate(id = row_number())
typeof(x[,'A'])
# produces "double"
x3 = x %>% pivot_longer(-id) %>%
  pivot_wider(names_from = name, values_from = value)
typeof(x3[,'A'])
# produces "list"
typeof(x3[,'A'] %>% unlist())
# produces "double"
This is a problem because I want to loop through an array and assign parts of the df to parts of the array. For example:
arr = array(dim = c(2,100))
# arr has type 'logical'
arr[,1] = x3[,'A']
# arr now has type 'list'
arr[,2] = x3[,'A']
#last line gives me: 'Error in arr[, 2] = x3[, "A"] : incorrect number of subscripts on matrix'
Assigning a list to a slice of the array seems to convert the whole array to a list. I believe I can get round this by replacing the last line with arr[,2] = x3[,'A'] %>% unlist(), but it's such strange behaviour that I want to know what's going on.
Answer:
The reason is that x3 is a tibble, and x3[,'A'] is still a tibble, which you can check via class(x3[,'A']). Because a tibble, like a data.frame, is basically a list of columns, typeof returns "list" for it (compare typeof(mtcars)). This is one of the differences between a data.frame and a tibble: for a data.frame, x3[,'A'] would be simplified to a vector by default, which is not the case for a tibble.
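To make the difference concrete, here is a minimal sketch that compares single-column subsetting on a base data.frame and on a tibble. The objects df and tb and their values are made up for illustration and are not part of the original question.

```r
library(tibble)

# Hypothetical toy data, only used to illustrate the subsetting rules
df <- data.frame(A = c(1.5, 2.5, 3.5), b = c(4, 5, 6))
tb <- as_tibble(df)

# data.frame: selecting one column with [, j] drops to a plain vector
is.data.frame(df[, "A"])
typeof(df[, "A"])

# tibble: selecting one column with [, j] keeps a one-column tibble,
# which is a list underneath
is.data.frame(tb[, "A"])
typeof(tb[, "A"])

# A data.frame can be told to keep its structure with drop = FALSE
typeof(df[, "A", drop = FALSE])
```

The first pair of calls should report FALSE and "double", the second pair TRUE and "list", and the last call "list" again, since drop = FALSE makes the data.frame behave like the tibble here.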
If you want a vector, you have to be more explicit when slicing from a tibble, by using x3[,'A', drop = TRUE], x3[["A"]] or x3$A:
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
set.seed(123)
x = data.frame(A = rnorm(100),
               b = rnorm(100)) %>%
  mutate(id = row_number())
x3 <- x %>%
  pivot_longer(-id) %>%
  pivot_wider(names_from = name, values_from = value)
class(x3[,'A'])
#> [1] "tbl_df" "tbl" "data.frame"
typeof(x3$A)
#> [1] "double"
typeof(x3[,'A', drop = TRUE])
#> [1] "double"
typeof(x3[['A']])
#> [1] "double"
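Applied to the array from the question, one possible way to fill it is sketched below. It assumes the array is meant to hold one row per observation and one column per variable (dimensions c(100, 2) rather than the c(2, 100) in the question), so that each column slice has the same length as a column of x3. That layout is an assumption, not something the question states.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

set.seed(123)
x3 <- data.frame(A = rnorm(100), b = rnorm(100)) %>%
  mutate(id = row_number()) %>%
  pivot_longer(-id) %>%
  pivot_wider(names_from = name, values_from = value)

# Assumed layout: 100 observations in rows, 2 variables in columns
arr <- array(dim = c(100, 2))

# Extract plain vectors, so the array is coerced to double, not list
arr[, 1] <- x3[["A"]]
arr[, 2] <- x3$b

typeof(arr)
dim(arr)
```

Because both right-hand sides are double vectors rather than one-column tibbles, arr should end up with type "double" and keep its dimensions, and the second assignment no longer fails.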