I have two data frames and would like to highlight or color rows in the larger data frame that are not in the smaller data frame. The data frames are like so:
response <- c("Best overall", "Best overall", "Diease control", "Diease control", "Observed response", "Observed respone")
value <- c("pd", "sd", "yes", "no", "yes", "no")
drug <- c(10, 2, 0, 12, 0, 12)
ex_df <- data.frame(response, value)
ex_df <- data.frame(ex_df, drug)
response1 <- c("Best overall", "Best overall", "Best overall", "Best overall", "Best overall", "Best overall", "Diease control", "Diease control", "Observed response", "Observed respone")
value1 <- c("cr", "pr", "sd", "pd", "NE", "NA", "yes", "no", "yes", "no")
drug1 <- c(0, 0, 2, 10, 0, 0, 0, 12, 2, 10)
ex1_df <- data.frame(response1, value1)
ex1_df <- data.frame(ex1_df, drug1)
What I want is for rows 1, 2, 5, and 6 in the larger data frame to be highlighted or colored light red to show the differences between the two data frames. Note that the columns in my actual data have the same names, so if need be, please change the column names to match. I have a hard-coded solution, but that will not be acceptable for my project. My latest attempt is below, but it returns an empty data frame instead of a data frame with the missing rows.
deltas <- diffdf(base=ex1_df, compare=ex_df)
var_list <- colnames(ex1_df)
delta_list <- list()
for(i in var_list) {
dat <- data.frame(eval(parse(text = paste0("deltas$VarDiff_", i)))) %>%
janitor::clean_names()
delta_list[[i]] <- dat
}
delta_data <- do.call(rbind, delta_list)
Any help is much appreciated!
CodePudding user response:
An efficient way to do this is with an anti-join: which rows in B aren't in A?
This will be simpler if we make the column names align. I've also added row numbers to ex1_df:
ex_df <- data.frame(response, value, drug)
ex1_df <- data.frame(row = seq_along(response1),
                     response = response1,
                     value = value1,
                     drug = drug1)
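If renaming the columns is not convenient, dplyr joins also accept a named `by` vector that maps differently named columns onto each other. The following is a minimal sketch of that alternative, not part of the approach used in this answer; `small`, `large`, and `only_in_large` are hypothetical names, and the data is a shortened version of the question's:

```r
library(dplyr)

# Shortened toy data that keeps the question's unaligned column names
small <- data.frame(response = c("Best overall", "Best overall"),
                    value = c("pd", "sd"),
                    drug = c(10, 2))
large <- data.frame(response1 = c("Best overall", "Best overall", "Best overall"),
                    value1 = c("cr", "sd", "pd"),
                    drug1 = c(0, 2, 10))

# Names on the left refer to columns of `large`, values on the right to `small`
only_in_large <- anti_join(large, small,
                           by = c("response1" = "response",
                                  "value1" = "value",
                                  "drug1" = "drug"))
only_in_large  # the rows of `large` with no match in `small`
```

Aligning the names up front, as done above, keeps the later joins shorter, which is why the rest of this answer takes that route.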
Then we can use dplyr to do an anti-join and gt to render a data table with formatting in HTML:
library(dplyr); library(gt)
# "missing" is the subset of ex1_df not in ex_df
missing <- ex1_df %>%
  anti_join(ex_df, by = c("response", "value", "drug")) %>%
  mutate(missing = TRUE)
ex1_df %>%
  # pull in the "missing" column (NA for rows present in both data frames)
  left_join(missing, by = c("row", "response", "value", "drug")) %>%
  gt() %>% # render as HTML
  tab_style(style = list(cell_fill(color = "lightpink")),
            locations = cells_body(columns = everything(), rows = missing)) %>%
  cols_hide(missing)
(Results vary from your description since I used the provided data, which is a little different in rows 9 and 10, so those two rows are highlighted as well.)
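To check which rows will be flagged before rendering anything, the same comparison can be sketched in base R by pasting the columns into a key and testing membership. This is only an illustrative alternative, assuming the column names have been aligned as above and that the separator (a tab here) never occurs in the data; `key_cols` and `make_key` are hypothetical helpers:

```r
# Data from the question, with aligned column names
ex_df <- data.frame(
  response = c("Best overall", "Best overall", "Diease control",
               "Diease control", "Observed response", "Observed respone"),
  value = c("pd", "sd", "yes", "no", "yes", "no"),
  drug = c(10, 2, 0, 12, 0, 12)
)
ex1_df <- data.frame(
  response = c(rep("Best overall", 6), "Diease control", "Diease control",
               "Observed response", "Observed respone"),
  value = c("cr", "pr", "sd", "pd", "NE", "NA", "yes", "no", "yes", "no"),
  drug = c(0, 0, 2, 10, 0, 0, 0, 12, 2, 10)
)

key_cols <- c("response", "value", "drug")

# Build one string key per row from the compared columns
make_key <- function(df) do.call(paste, c(df[key_cols], sep = "\t"))

# TRUE where a row of ex1_df has no identical row in ex_df
ex1_df$missing <- !(make_key(ex1_df) %in% make_key(ex_df))

which(ex1_df$missing)  # 1 2 5 6 9 10
```

Unlike the left join, this gives a `missing` column that is TRUE or FALSE rather than TRUE or NA.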