Similar questions don't quite apply to what I am trying to accomplish, and at least one of the answers to the most similar question doesn't actually work.
I have a data frame that is, let's say, similar to the following.
sn <- 1:6
pn <- letters[1:6]
issue1_note <- c("issue","# - #",NA,"sue","# - #","ISSUE")
issue2_note <- c("# - #","ISS","# - #",NA,"Issue","Tissue")
df <- data.frame(sn,pn,issue1_note,issue2_note)
Here is what I want to do. I want to be able to visually inspect each _note column quickly and easily. I know I can do this on each column by using select() and filter(), as in
library(dplyr)
df %>% select(issue1_note) %>%
  filter(!is.na(issue1_note) & issue1_note != "# - #")
However, I have around 30 columns and 300 rows in my real data and don’t want to do this each time.
I’d like to write a for loop that will do this across all of the columns. I also want each of the columns printed individually. I tried the below to remove just the NAs, but it merely selects and prints the columns. It’s as if it skips over the filtering completely.
col_notes <- df %>% select(ends_with("note")) %>% colnames()
for(col in col_notes){
df %>% select(col) %>% filter(!is.na(col)) %>% print()
}
Any ideas on how I can get this to also filter?
CodePudding user response:
I was able to figure out a solution through more research, though it doesn’t involve a for loop. I created a custom function and then used lapply. In case anybody is wondering, here is my solution.
my_fn <- function(column){
  # all_of() makes it explicit that column holds a column name as a string
  tmp <- df %>% select(all_of(column))
  tmp %>% filter(!is.na(.data[[column]]) & .data[[column]] != "# - #")
}
lapply(col_notes, my_fn)
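For anyone who still wants the for loop, the same idea seems to carry over. In the original attempt, filter(!is.na(col)) tests the loop variable itself, which is a one-element character string such as "issue1_note" and is never NA, so every row passes. Looking the column up by name with .data[[col]] avoids that. Below is a minimal sketch on the sample data from the question; kept is a hypothetical name I introduced only to hold the results.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  sn = 1:6,
  pn = letters[1:6],
  issue1_note = c("issue", "# - #", NA, "sue", "# - #", "ISSUE"),
  issue2_note = c("# - #", "ISS", "# - #", NA, "Issue", "Tissue")
)
col_notes <- df %>% select(ends_with("note")) %>% colnames()

# `kept` is a hypothetical container so the filtered columns can be reused later
kept <- list()
for (col in col_notes) {
  kept[[col]] <- df %>%
    select(all_of(col)) %>%   # col is a string naming the column
    filter(!is.na(.data[[col]]) & .data[[col]] != "# - #")  # look the column up by name
  print(kept[[col]])          # each column is printed on its own
}
```

Each pass prints one filtered column, and kept[["issue1_note"]] gives the same data frame back afterwards.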
Thanks for the consideration.
CodePudding user response:
This can be done all at once with filter/across, filter/if_any, or filter/if_all, depending on the outcome desired.
library(dplyr)
df %>%
  filter(across(ends_with('note'), ~ !is.na(.) & . != "# - #"))
This returns only the rows in which none of the columns whose names end in "note" contain NA or "# - #". To instead keep the rows where at least one of those columns holds a real note, use if_any
df %>%
  filter(if_any(ends_with("note"), ~ !is.na(.) & . != "# - #"))
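The difference between requiring all columns and requiring any column is easiest to see on the sample data from the question. The following is a minimal sketch, assuming a dplyr version that provides if_all() and if_any() (1.0.4 or later, as far as I know); recent releases also warn that across() inside filter() is deprecated in favour of if_all(). The helper is_real_note is a hypothetical name, not part of dplyr.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  sn = 1:6,
  pn = letters[1:6],
  issue1_note = c("issue", "# - #", NA, "sue", "# - #", "ISSUE"),
  issue2_note = c("# - #", "ISS", "# - #", NA, "Issue", "Tissue")
)

# Hypothetical helper: TRUE where a cell is neither NA nor the "# - #" placeholder
is_real_note <- function(x) !is.na(x) & x != "# - #"

# Rows where every *_note column holds a real note
all_real <- df %>% filter(if_all(ends_with("note"), is_real_note))

# Rows where at least one *_note column holds a real note
any_real <- df %>% filter(if_any(ends_with("note"), is_real_note))

all_real$sn  # 6
any_real$sn  # 1 2 4 5 6
```

Only row 6 has real notes in both columns, while row 3 is the only one dropped by if_any, since it has NA in one column and "# - #" in the other.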