After I updated RStudio today, I tried to compute z-scores for a data frame using mutate() and scale(). It now returns a matrix column along with a 'New names' message:
df <- df %>% group_by(participants) %>% mutate(zscore=scale(answer))
New names:
NA -> ...8
class(df$zscore)
[1] "matrix" "array"
The z-score column should have been named 'zscore', so why is it now named '...8'? I never had any problems with this code before. Is it because of the update?
CodePudding user response:
I think you just added another column without a header, or read in data that had a column without a header. There is no issue with your classes.
library(tidyverse)
test <- mtcars |>
  group_by(cyl) |>
  mutate(zscore = scale(mpg))
# class of test
class(test)
#> [1] "grouped_df" "tbl_df" "tbl" "data.frame"
# class of column
class(test$zscore)
#> [1] "matrix" "array"
# recreate warning
test <- test |>
  bind_cols("")
#> New names:
#> * `` -> `...13`
The message at the bottom means that I added a column without a name in the 13th position, so it was automatically renamed to `...13`.
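As a rough illustration of where a name like `...13` comes from: as far as I know, bind_cols() relies on tibble-style name repair, which replaces an empty name with `...` followed by the column position. Here is a minimal sketch, assuming the vctrs package is available (it is installed alongside dplyr). The names vector is made up for illustration and is not taken from your data:

```r
library(vctrs)

# Made-up column names; the second one is empty, like a column without a header
nms <- c("zscore", "")

# "unique" repair fills in empty names using their position
repaired <- vec_as_names(nms, repair = "unique")
repaired
#> [1] "zscore" "...2"
```

The same kind of "New names" message is printed when the repair happens, which is why it shows up even though nothing is actually wrong with the column classes.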
CodePudding user response:
Part of the issue is that scale() returns a matrix. You can fix this by wrapping it in as.double():
library(dplyr)
starwars2 <- starwars %>%
  select(height, gender) %>%
  group_by(gender) %>%
  mutate(zscore = as.double(scale(height)))
Output:
# A tibble: 87 × 3
# Groups:   gender [3]
   height gender    zscore
    <int> <chr>      <dbl>
 1    172 masculine -0.120
 2    167 masculine -0.253
 3     96 masculine -2.14
 4    202 masculine  0.677
 5    150 feminine  -0.624
 6    178 masculine  0.0394
 7    165 feminine   0.0133
 8     97 masculine -2.11
 9    183 masculine  0.172
10    182 masculine  0.146
# … with 77 more rows
But I'm not sure this explains your NA -> ...8 issue. If not, please update your question to include your data (using dput(df)) or a subset (using dput(head(df))).
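To see why the as.double() wrapper helps, here is a minimal base R sketch. The vector x is made up for illustration and is not your data. scale() returns a one-column matrix that also carries centering and scaling attributes, and as.double() drops both:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up values, for illustration only

z_mat <- scale(x)          # one-column matrix with "scaled:center" / "scaled:scale" attributes
z_vec <- as.double(z_mat)  # plain numeric vector, no dim and no extra attributes

is.matrix(z_mat)           # TRUE
dim(z_mat)                 # 8 1
is.null(dim(z_vec))        # TRUE

# Same values as computing the z-score by hand
all.equal(z_vec, (x - mean(x)) / sd(x))  # TRUE
```

Writing (x - mean(x)) / sd(x) directly inside mutate() is an alternative that never creates a matrix in the first place.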