R for loop to filter and print columns of a data frame

Time:09-30

Similar questions to mine don’t quite apply to what I am trying to accomplish, and at least one of the answers to the most similar question didn’t actually work.

So I have a data frame that, let’s say, is similar to the following.

sn <- 1:6
pn <- letters[1:6]
issue1_note <- c("issue","# - #",NA,"sue","# - #","ISSUE")
issue2_note <- c("# - #","ISS","# - #",NA,"Issue","Tissue")
df <- data.frame(sn,pn,issue1_note,issue2_note)

Here is what I want to do. I want to be able to visually inspect each _note column quickly and easily. I know I can do this on each column by using select() and filter() as in

df %>% select(issue1_note) %>%
  filter(!is.na(issue1_note) & issue1_note != "# - #")

However, I have around 30 columns and 300 rows in my real data and don’t want to do this each time.

I’d like to write a for loop that will do this across all of the columns. I also want each of the columns printed individually. I tried the below to remove just the NAs, but it merely selects and prints the columns. It’s as if it skips over the filtering completely.

col_notes <- df %>% select(ends_with("note")) %>% colnames()

for(col in col_notes){
df %>% select(col) %>% filter(!is.na(col)) %>% print()
}

Any ideas on how I can get this to also filter?

CodePudding user response:

I was able to figure out a solution through more research, though it doesn’t involve a for loop. I created a custom function and then used lapply. In case anybody is wondering, here is my solution.

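my_fn <- function(column){
  # all_of() selects the column named by the string in `column`
  tmp <- df %>% select(all_of(column))
  # .data[[column]] looks the column up by name inside filter()
  tmp %>% filter(!is.na(.data[[column]]) & .data[[column]] != "# - #")
}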

lapply(col_notes, my_fn)

Thanks for the consideration.
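For anyone who still wants the for loop: the original attempt keeps every row because, inside filter(), `col` is not a column of the data, so it resolves to the loop variable, which is the string "issue1_note". is.na() of a non-missing string is FALSE, so the condition is TRUE for every row. Below is a minimal sketch of a loop that does filter, using the same .data[[ ]] pronoun as above. The `results` list is only an illustrative name so the filtered columns can be inspected afterwards; it is not part of the original code.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  sn = 1:6,
  pn = letters[1:6],
  issue1_note = c("issue", "# - #", NA, "sue", "# - #", "ISSUE"),
  issue2_note = c("# - #", "ISS", "# - #", NA, "Issue", "Tissue"),
  stringsAsFactors = FALSE
)

col_notes <- df %>% select(ends_with("note")) %>% colnames()

# `results` is a hypothetical holder for the filtered columns
results <- list()

for (col in col_notes) {
  results[[col]] <- df %>%
    select(all_of(col)) %>%                      # select by the name stored in `col`
    filter(!is.na(.data[[col]]),                 # drop missing notes
           .data[[col]] != "# - #")              # drop placeholder notes
  print(results[[col]])                          # print each column on its own
}
```

Passing the two conditions as separate arguments to filter() combines them with a logical AND, the same as joining them with `&`.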

CodePudding user response:

This can be done all at once with filter() combined with across(), if_all(), or if_any(), depending on the desired outcome.

library(dplyr)
df %>% 
   filter(across(ends_with('note'), ~ !is.na(.) & . != "# - #"))

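This returns only the rows where every column whose name ends in "note" is neither NA nor "# - #". In newer dplyr releases, if_all() is the recommended replacement for across() inside filter() and gives the same result here. If we instead want the rows where at least one of those columns holds a real value, use if_any: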

df %>%
    filter(if_any(ends_with("note"), ~ !is.na(.) & . != "# - #"))
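To make the difference concrete, here is a small sketch that runs both variants on the sample data from the question. The helper `is_real_note` and the result names `all_valid` and `any_valid` are made up for this illustration; if_all() and if_any() assume dplyr 1.0.4 or later.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  sn = 1:6,
  pn = letters[1:6],
  issue1_note = c("issue", "# - #", NA, "sue", "# - #", "ISSUE"),
  issue2_note = c("# - #", "ISS", "# - #", NA, "Issue", "Tissue"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a note is neither NA nor the placeholder
is_real_note <- function(x) !is.na(x) & x != "# - #"

# Rows where every *_note column holds a real note
all_valid <- df %>% filter(if_all(ends_with("note"), is_real_note))

# Rows where at least one *_note column holds a real note
any_valid <- df %>% filter(if_any(ends_with("note"), is_real_note))

all_valid$sn  # 6
any_valid$sn  # 1 2 4 5 6
```

Only row 6 has real notes in both columns, while row 3 is the only one dropped by if_any() because it has NA in one column and "# - #" in the other.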