I’m wondering if there is a way to do the following in R:
- Produce separate .xlsx workbooks from a single dataset based on a column value
- Apply conditional formatting to rows in each .xlsx file based on a column value
I can do each of these separately, but efforts to combine them haven't been successful and I can't find an exact use-case match online. Any help would be greatly appreciated.
I can't share my specific data, but here is a sample that replicates the data I have.
df <- data.frame (
assign = c("YES", "NO", "NO", "YES", "NO", "YES", "YES", "NO"),
dept = c("HIST","HIST", "PSYC", "PSYC", "PSYC", "ENGL", "ENGL", "ENGL"),
class = c(1009, 1330, 1001, 1015, 2190, 1001, 3001, 4390))
I can successfully create separate workbooks by generating a list of the dept variable and then using lapply(), but attempts to incorporate conditional formatting are unsuccessful:
# create a list of dept values to split into separate workbooks
li <- split(df, with(df, df$dept), drop = FALSE)
# using lapply to generate .xlsx docs
lapply(names(li), function(x){write.xlsx(li[[x]], "report", file = paste0("report_", x, ".xlsx"), row.names = FALSE)})
With the following code, I can generate a .xlsx file with conditional formatting, but can only produce a single file with all rows rather than multiple files:
# create style for classes that haven’t finished the assignment
noadmin <- createStyle(fontColour = "#FF0000", fontSize = 10)
# create style for top row
Heading <- createStyle(textDecoration = "bold", fgFill = "#FFFFCC", border = "TopBottomLeftRight")
# workbook call begins here
assign_all <- createWorkbook()
addWorksheet(assign_all, 1, gridLines = TRUE)
writeData(assign_all, 1, df, withFilter = TRUE)
# identify which rows didn’t complete (e.g., need to be formatted)
noRows = data.frame(which(df$assign == "NO", arr.ind=FALSE))
# freeze top row
freezePane(assign_all, 1, firstActiveRow = 2, firstActiveCol = 1)
# add style to header
addStyle(assign_all, 1, cols = 1:ncol(df), rows = 1, style = Heading)
# add style to "NO" rows
addStyle(assign_all, 1, cols = 1:ncol(df), rows = noRows[,1] 1, style = noadmin, gridExpand = TRUE)
saveWorkbook(assign_all, paste0("report.xlsx"), overwrite = TRUE)
This produces the output I want, but with all rows in one file:
Thanks in advance for any guidance you can provide. I've been working on this problem for a few weeks and have run out of ideas.
CodePudding user response:
You could put your code to create the workbook inside a function, then loop over the list of splitted dataframes to create your xlsx files. Instead of lapply
I use mapply
to loop over both the list and the names:
li <- split(df, df$dept)
library(openxlsx)
# create style for classes that haven’t finished the assignment
noadmin <- createStyle(fontColour = "#FF0000", fontSize = 10)
# create style for top row
Heading <- createStyle(textDecoration = "bold", fgFill = "#FFFFCC", border = "TopBottomLeftRight")
make_xl <- function(x, y) {
assign_all <- createWorkbook()
addWorksheet(assign_all, 1, gridLines = TRUE)
writeData(assign_all, 1, x, withFilter = TRUE)
# identify which rows didn’t complete (e.g., need to be formatted)
noRows = data.frame(which(x$assign == "NO", arr.ind=FALSE))
# freeze top row
freezePane(assign_all, 1, firstActiveRow = 2, firstActiveCol = 1)
# add style to header
addStyle(assign_all, 1, cols = 1:ncol(x), rows = 1, style = Heading)
# add style to "NO" rows
addStyle(assign_all, 1, cols = 1:ncol(x), rows = noRows[,1] 1, style = noadmin, gridExpand = TRUE)
saveWorkbook(assign_all, paste0("report_", y, ".xlsx"), overwrite = TRUE)
}
mapply(make_xl, li, names(li))
#> ENGL HIST PSYC
#> 1 1 1
list.files(pattern = "^report")
#> [1] "report_ENGL.xlsx" "report_HIST.xlsx" "report_PSYC.xlsx"