Coloring rows in a data frame that differ from another (R)

Time:08-13

I have two data frames and would like to highlight or color rows in the larger data frame that are not in the smaller data frame. The data frames are like so:

response <- c("Best overall", "Best overall", "Diease control", "Diease control", "Observed response", "Observed respone")
value <- c("pd", "sd", "yes", "no", "yes", "no")
drug <- c(10, 2, 0, 12, 0, 12)

ex_df <- data.frame(response, value)
ex_df <- data.frame(ex_df, drug)

response1 <- c("Best overall", "Best overall", "Best overall", "Best overall", "Best overall", "Best overall", "Diease control", "Diease control", "Observed response", "Observed respone")
value1 <- c("cr", "pr", "sd", "pd", "NE", "NA", "yes", "no", "yes", "no")
drug1 <- c(0, 0, 2, 10, 0, 0, 0, 12, 2, 10)

ex1_df <- data.frame(response1, value1)
ex1_df <- data.frame(ex1_df, drug1)


What I want is for rows 1, 2, 5, and 6 of the larger data frame to be highlighted or colored light red to show the differences between the two data frames. Note that the columns in my actual data have the same names, so if need be please rename the columns here to match. I have a hard-coded solution, but that will not be acceptable for my project. My latest attempt is below, but it returns an empty data frame instead of a data frame with the missing rows.

library(diffdf)
library(dplyr)   # provides %>%

deltas <- diffdf(base=ex1_df, compare=ex_df)
var_list <- colnames(ex1_df)

delta_list <- list()

for(i in var_list) {
  dat <- data.frame(eval(parse(text = paste0("deltas$VarDiff_", i)))) %>%
    janitor::clean_names()
  delta_list[[i]] <- dat
}

delta_data <- do.call(rbind, delta_list)

Any help is much appreciated!

CodePudding user response:

An efficient way to do this is with an anti-join, which answers the question: which rows in B have no match in A?
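
To make the idea concrete before applying it to your data, here is a minimal sketch of an anti-join on toy data. The data frames `big` and `small` and their columns `key` and `n` are made up for illustration and are not part of the question.

```r
library(dplyr)

# Hypothetical toy data, not the data frames from the question
big   <- data.frame(key = c("a", "b", "c", "d"), n = c(1, 2, 3, 4))
small <- data.frame(key = c("b", "d"),           n = c(2, 4))

# Keep only the rows of `big` that have no match in `small`,
# comparing on both columns
only_in_big <- anti_join(big, small, by = c("key", "n"))
only_in_big   # the rows with key "a" and "c"
```

Naming the `by` columns explicitly avoids the message dplyr prints when it has to guess the join keys.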

This will be simpler if we make the column names align. I've also added row numbers to ex1_df:

ex_df <- data.frame(response, value, drug)

ex1_df <- data.frame(row = seq_along(response1),
                     response = response1, 
                     value = value1, 
                     drug = drug1)

Then we can use dplyr to do the anti-join and gt to render a formatted HTML table:

library(dplyr); library(gt)
missing <- ex1_df %>%      # "missing" is the subset of ex1_df not in ex_df
  anti_join(ex_df, by = c("response", "value", "drug")) %>%
  mutate(missing = TRUE)   # flag these rows so they can be styled later

ex1_df %>%                 # let's pull in the "missing" column
  left_join(missing, by = c("row", "response", "value", "drug")) %>%
  gt() %>%                 # render as html
  tab_style(style = list(cell_fill(color = "lightpink")),
            locations = cells_body(columns = everything(), rows = missing)) %>%
  cols_hide(missing)       # hide the helper column in the rendered table


(The results differ from your description because I used the provided data, which is a little different in rows 9 and 10.)
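
If you want to check which rows get flagged without rendering a table, the following base R sketch may help. It rebuilds the two data frames from the question with matching column names and marks every row of `ex1_df` whose combination of `response`, `value` and `drug` does not occur in `ex_df`. The helper `row_key()` is a name I made up for this example, and pasting the columns together is a simple stand-in for the join, assuming none of the values contain the separator character.

```r
# Data from the question, with column names aligned
ex_df <- data.frame(
  response = c("Best overall", "Best overall", "Diease control",
               "Diease control", "Observed response", "Observed respone"),
  value    = c("pd", "sd", "yes", "no", "yes", "no"),
  drug     = c(10, 2, 0, 12, 0, 12)
)
ex1_df <- data.frame(
  response = c(rep("Best overall", 6), "Diease control", "Diease control",
               "Observed response", "Observed respone"),
  value    = c("cr", "pr", "sd", "pd", "NE", "NA", "yes", "no", "yes", "no"),
  drug     = c(0, 0, 2, 10, 0, 0, 0, 12, 2, 10)
)

# Hypothetical helper: build one string key per row from the compared columns
row_key <- function(df) {
  do.call(paste, c(df[c("response", "value", "drug")], sep = "\r"))
}

# TRUE where a row of ex1_df has no identical row in ex_df
ex1_df$missing <- !(row_key(ex1_df) %in% row_key(ex_df))

which(ex1_df$missing)   # rows 1, 2, 5, 6, 9 and 10
```

Rows 9 and 10 are flagged because their `drug` values (2 and 10) differ from the ones in `ex_df` (0 and 12).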
