My data:
structure(list(x = c(-32.2473803623946, -10.3430055552535,
-10.4110625155105, -30.6804086316593),
y = c(-1.04361388101641, 24.6971017038231,
24.6303839929497, 35.7958624586036),
z = c(202.270724194289, 228.921139241279,
226.240853533147, 232.326865994258),
...4 = c(0, 0, 0, 0), ...5 = c(0, 0, 0, 0),
...6 = c(0, 0, 0, 0), ...7 = c(1, 1, 1, 1),
...8 = c(1, 1, 1, 1), ...9 = c(1, 1, 1, 1),
...10 = c(1, 1, 1, 1),
...11 = c("Point # 1 in 1-LV_TC_EDIT",
"Point # 2 in 1-LV_TC_EDIT",
"Point # 3 in 1-LV_TC_EDIT",
"Point # 5 in 1-LV_TC_EDIT"),
...12 = c("Bipolar 7.827 / Unipolar 16.911 / LAT -9.0",
"Bipolar 2.34 / Unipolar 9.09 / LAT -10.0",
"Bipolar 1.974 / Unipolar 9.219 / LAT -11.0",
"Bipolar 1.938 / Unipolar 10.572 / LAT -9.0")),
row.names = c(NA, -4L),
class = c("tbl_df", "tbl", "data.frame"))
I'm trying to only keep a column if it contains certain text.
This code flags the correct column as TRUE but also gives this atomic vector warning:
bipol %>% stringr::str_detect(., "Bipolar")
Warning: argument is not an atomic vector; coercing
[1] FALSE ... TRUE
That warning led me here: using select and stringr together, but I'm not sure how to incorporate this into my code.
But when I use it with select(where(...)) logic, it returns: where() must be used with functions that return TRUE or FALSE.
bipol %>% select(where(~ stringr::str_detect(., "Bipolar")))
Error in `select()`:
! `where()` must be used with functions that return `TRUE` or `FALSE`.
Backtrace:
1. bipol %>% ...
3. dplyr:::select.data.frame(., where(~stringr::str_detect(., "Bipolar")))
6. tidyselect::eval_select(expr(c(...)), .data)
7. tidyselect:::eval_select_impl(...)
16. tidyselect:::vars_select_eval(...)
...
19. tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
20. tidyselect:::walk_data_tree(new, data_mask, context_mask)
21. tidyselect:::as_indices_sel_impl(...)
23. purrr::map_lgl(data, predicate)
24. tidyselect (local) .f(.x[[i]], ...)
I will then cbind() just this column to the first 3 columns and extract the relevant text using stringr::str_extract.
Thanks!
Answer:
select with where needs a single TRUE/FALSE per column to decide whether to keep that column. str_detect is applied to each column to check for 'Bipolar', but it returns a logical vector with one value per element, so its length equals the length of the column. Wrapping it in any collapses that vector to a single TRUE if any element matches and FALSE if nothing matches.
library(dplyr)
library(stringr)
bipol %>%
select(where(~ any(stringr::str_detect(.x, "Bipolar"))))
-output
# A tibble: 4 × 1
...12
<chr>
1 Bipolar 7.827 / Unipolar 16.911 / LAT -9.0
2 Bipolar 2.34 / Unipolar 9.09 / LAT -10.0
3 Bipolar 1.974 / Unipolar 9.219 / LAT -11.0
4 Bipolar 1.938 / Unipolar 10.572 / LAT -9.0
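To see why any() is needed, it can help to look at what the predicate returns for a single column. The following is a minimal sketch using a small made-up vector (col is a hypothetical stand-in for one column of bipol, not the real data):

```r
library(stringr)

# Hypothetical stand-in for one character column of bipol
col <- c("Bipolar 7.827 / Unipolar 16.911", "Bipolar 2.34 / Unipolar 9.09")

# One logical value per element: too long to be a where() predicate result
per_element <- str_detect(col, "Bipolar")
length(per_element)

# A single logical value for the whole column: the shape where() expects
per_column <- any(per_element)
length(per_column)
```

The first length is the number of rows, the second is 1, which is the difference the error message is complaining about.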
We may also add a short circuit so that 'Bipolar' is only checked on character columns:
bipol %>%
select(where(~ is.character(.x) && any(stringr::str_detect(.x, "Bipolar"))))
# A tibble: 4 × 1
...12
<chr>
1 Bipolar 7.827 / Unipolar 16.911 / LAT -9.0
2 Bipolar 2.34 / Unipolar 9.09 / LAT -10.0
3 Bipolar 1.974 / Unipolar 9.219 / LAT -11.0
4 Bipolar 1.938 / Unipolar 10.572 / LAT -9.0
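The follow-up step from the question (keeping the first three columns next to the matched one, then extracting the value) could probably be done in the same pipeline without cbind(). This is only a sketch under some assumptions: bipol_demo is a made-up two-row stand-in for bipol, the column names flag and label are hypothetical, and the regular expression is a guess at what "the relevant text" means (the number after "Bipolar ").

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for bipol with the same kinds of columns
bipol_demo <- tibble(
  x = c(-32.2, -10.3),
  y = c(-1.0, 24.7),
  z = c(202.3, 228.9),
  flag = c(0, 0),
  label = c("Bipolar 7.827 / Unipolar 16.911 / LAT -9.0",
            "Bipolar 2.34 / Unipolar 9.09 / LAT -10.0")
)

result <- bipol_demo %>%
  # keep columns 1 to 3 plus any character column that mentions "Bipolar"
  select(1:3, where(~ is.character(.x) && any(str_detect(.x, "Bipolar")))) %>%
  # extract the number that follows "Bipolar " and store it as numeric
  mutate(bipolar = as.numeric(str_extract(label, "(?<=Bipolar )[0-9.]+")))

result
```

With the real data the matched column is named ...12, so it would need backticks inside mutate().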
The error is a bit misleading, but in essence
> bipol %>% select(where(~ stringr::str_detect(., "Bipolar")))
Error in `select()`:
! `where()` must be used with functions that return `TRUE` or `FALSE`.
It means the function must return a single TRUE/FALSE per column, not a vector of TRUE/FALSE of length > 1.
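One related edge case, not present in the posted data: if a column contains NA and no element matches, any() returns NA rather than FALSE, which is also not a plain TRUE or FALSE, so the same error may reappear. A small sketch with a hypothetical vector col_na shows the base R behaviour and the na.rm = TRUE workaround:

```r
library(stringr)

# Hypothetical column with a missing value and no match
col_na <- c("Unipolar 9.09", NA)

# str_detect() gives FALSE and NA, so any() cannot decide and returns NA
without_na_rm <- any(str_detect(col_na, "Bipolar"))

# Dropping missing values first gives a definite FALSE
with_na_rm <- any(str_detect(col_na, "Bipolar"), na.rm = TRUE)
```

If the real columns can contain NA, using any(..., na.rm = TRUE) inside where() seems the safer choice.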