I'm baffled. This code fails on my dataset but works fine with dummy data. As far as I can tell, there are no important differences in the structure of the two datasets. Why might I be getting this error about undefined columns?
> packageVersion('tidyr')
[1] ‘1.2.0’
> str(test)
'data.frame': 229 obs. of 9 variables:
$ Response : chr "presence" "presence" "presence" "presence" ...
$ Predictor : chr "tussock_gram" "wet_sedge" "nontussock_gram" "dry_gram_dwarf_shrub" ...
$ Estimate : num 1.03 2.77 2.02 13.73 -6.69 ...
$ Std.Error : chr "1.6469" "1.7951" "8.5393" "14.6206" ...
$ DF : num 844 844 844 844 844 844 844 844 844 844 ...
$ Crit.Value : num 0.628 1.542 0.236 0.939 -0.761 ...
$ P.Value : num 0.53 0.123 0.813 0.348 0.447 ...
$ Std.Estimate: num 0.0233 0.0536 0.0177 0.1019 -0.1441 ...
$ : chr "" "" "" "" ...
> dput(head(test))
structure(list(Response = c("presence", "presence", "presence",
"presence", "presence", "presence"), Predictor = c("tussock_gram",
"wet_sedge", "nontussock_gram", "dry_gram_dwarf_shrub", "low_shrub",
"high_shrub"), Estimate = c(1.035, 2.7687, 2.0189, 13.7295, -6.6858,
12.4353), Std.Error = c("1.6469", "1.7951", "8.5393", "14.6206",
"8.7873", "3.5288"), DF = c(844, 844, 844, 844, 844, 844), Crit.Value = c(0.6285,
1.5424, 0.2364, 0.9391, -0.7608, 3.524), P.Value = c(0.5297,
0.123, 0.8131, 0.3477, 0.4467, 0.0004), Std.Estimate = c(0.0233,
0.0536, 0.0177, 0.1019, -0.1441, 0.1436), c("", "", "", "", "",
"***")), row.names = c(NA, 6L), class = "data.frame")
> test <- test %>%
unite("Relationship", c(Response, Predictor), sep = "~")
Error in `[.data.frame`(out, setdiff(names(out), names(from_vars))) :
undefined columns selected
> df <- as.data.frame(expand_grid(Response = c("a", NA), Predictor = c("b", NA)))
> str(df)
'data.frame': 4 obs. of 2 variables:
$ Response : chr "a" "a" NA NA
$ Predictor: chr "b" NA "b" NA
> df <- df %>%
unite("Relationship", c(Response, Predictor), sep = "~")
# works fine
CodePudding user response:
The updated dput output contains a column whose name is blank (""). We need to remove it:
library(dplyr)
library(tidyr)
test %>%
select(-"") %>%
unite(Relationship, Response, Predictor, sep = "~")
Relationship Estimate Std.Error DF Crit.Value P.Value Std.Estimate
1 presence~tussock_gram 1.0350 1.6469 844 0.6285 0.5297 0.0233
2 presence~wet_sedge 2.7687 1.7951 844 1.5424 0.1230 0.0536
3 presence~nontussock_gram 2.0189 8.5393 844 0.2364 0.8131 0.0177
4 presence~dry_gram_dwarf_shrub 13.7295 14.6206 844 0.9391 0.3477 0.1019
5 presence~low_shrub -6.6858 8.7873 844 -0.7608 0.4467 -0.1441
6 presence~high_shrub 12.4353 3.5288 844 3.5240 0.0004 0.1436
The error comes from this line in the unite source code, which selects the remaining columns by name:
...
out <- out[setdiff(names(out), names(from_vars))]
...
That line fails because selecting a column whose name is blank ("") with [ returns an error:
> names(test)
[1] "Response" "Predictor" "Estimate" "Std.Error" "DF" "Crit.Value" "P.Value" "Std.Estimate" ""
> test[""]
Error in `[.data.frame`(test, "") : undefined columns selected
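A quick way to check for this before calling unite is to look for missing or empty names. The following is only a sketch on a made-up data frame: toy, flag, blank_cols, by_name and by_pos are hypothetical names used for illustration, not part of the original data.

```r
# Hypothetical toy data frame; the third column is given a blank name
toy <- data.frame(Response = c("a", "b"), Predictor = c("c", "d"))
toy$flag <- c("", "***")
names(toy)[3] <- ""

# Positions of columns whose name is missing or empty
blank_cols <- which(is.na(names(toy)) | names(toy) == "")
blank_cols

# Selecting by the blank name fails; capture the message instead of stopping
by_name <- tryCatch(toy[""], error = function(e) conditionMessage(e))
by_name

# Selecting the same column by position still works
by_pos <- toy[blank_cols]
```

If blank_cols is non-empty, the data frame needs its names fixed (or the column dropped by position) before unite can subset it by name.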
If there are unusual column names, either run make.names (from base R):
> make.names(names(test))
[1] "Response" "Predictor" "Estimate" "Std.Error" "DF" "Crit.Value" "P.Value" "Std.Estimate" "X"
Or use clean_names from janitor:
> janitor::clean_names(test)
response predictor estimate std_error df crit_value p_value std_estimate x
1 presence tussock_gram 1.0350 1.6469 844 0.6285 0.5297 0.0233
2 presence wet_sedge 2.7687 1.7951 844 1.5424 0.1230 0.0536
3 presence nontussock_gram 2.0189 8.5393 844 0.2364 0.8131 0.0177
4 presence dry_gram_dwarf_shrub 13.7295 14.6206 844 0.9391 0.3477 0.1019
5 presence low_shrub -6.6858 8.7873 844 -0.7608 0.4467 -0.1441
6 presence high_shrub 12.4353 3.5288 844 3.5240 0.0004 0.1436 ***
Thus, updating the column names lets unite run without having to remove the blank-named column:
names(test) <- make.names(names(test))
test %>%
unite(Relationship, Response, Predictor, sep = "~")
Relationship Estimate Std.Error DF Crit.Value P.Value Std.Estimate X
1 presence~tussock_gram 1.0350 1.6469 844 0.6285 0.5297 0.0233
2 presence~wet_sedge 2.7687 1.7951 844 1.5424 0.1230 0.0536
3 presence~nontussock_gram 2.0189 8.5393 844 0.2364 0.8131 0.0177
4 presence~dry_gram_dwarf_shrub 13.7295 14.6206 844 0.9391 0.3477 0.1019
5 presence~low_shrub -6.6858 8.7873 844 -0.7608 0.4467 -0.1441
6 presence~high_shrub 12.4353 3.5288 844 3.5240 0.0004 0.1436 ***
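One caveat, in case a data frame has more than one blank-named column: make.names on its own maps every blank name to the same value, so the names would still clash. A minimal sketch, assuming a made-up data frame (toy, flag, note and fixed are hypothetical names), using the unique argument of make.names:

```r
# Hypothetical toy data frame with two blank-named columns
toy <- data.frame(Response = c("a", "b"), Predictor = c("c", "d"))
toy$flag <- c("", "***")
toy$note <- c("x", "y")
names(toy)[3:4] <- ""

# Without unique = TRUE both blank names become "X", i.e. duplicates
plain <- make.names(names(toy))

# unique = TRUE makes the repaired names distinct
fixed <- make.names(names(toy), unique = TRUE)
names(toy) <- fixed
names(toy)
```

After this, every column can be selected by name, so the subsetting step inside unite should no longer hit a blank name.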