Home > Software design >  How to export data frame rows based on group data with specific formatting in R
How to export data frame rows based on group data with specific formatting in R

Time:05-25

I have a data frame that includes all employee data (1000 employees, nested in 50 groups). For each group, I want R to export each group data participants in a seperate excel file (or docx if possible). The formatting should be as follows: 1- The first rows should have the group informtino (i.e. id_group, country, region, section), then 2- the other rows should have the participants daa (i.e. sequesnce, EmployeeName, id_emplyee)

The results should include 50 different files, each file includes all employees data including their group information (attached is an example of each file).

# Here is an example of the full data frame:

EmployeeName =  randomNames::randomNames(1000, ethnicity = 5 ,which.names = "last") 
id_emplyee =  sample(1:1000, 1000, replace = F)
id_group =  sample(1:50, 1000, replace = T)

cou = c("USA", "UK", "AUS", "ARM", "BEL")
country = sample(cou, 1000, replace = T)
region_ = c("n", "e", "w", "s", "m")
region = sample(region_, 1000, replace = T)

sect = c("1", "2", "3")
section = sample(sect, 1000, replace = T)

df = data.frame(EmployeeName, id_emplyee, id_group, country,
                       region, section)

enter image description here

CodePudding user response:

I have not tested it, but the approach below should work:

library(randomNames)
library(dplyr)
library(openxlsx)

df %>% 
  group_by(id_group, country, region, section) %>%
  mutate(grp_id = cur_group_id()) %>%
  # iterate over every group with walk which will have side effects only
  group_walk(~ {
    
    # create a dynamic filename: adjust to a name (and folder) of your liking
    file_name  <- paste0("filename_", first(.x$grp_id), ".xlsx")
    
    # create the group header
    header <- .x %>% 
      select(id_group, country, region, section) %>%
      slice_head(n = 1)
    
    # write the excel file containing only the group header at this point
    write.xlsx(header,
               file_name,
               startCol = 2,
               startRow = 3,
               overwrite = TRUE)
    
    # create the body
    body <- .x %>% 
      select(EmployeeName, id_emplyee) %>% 
      mutate(n = row_number(),
             .before = 1)
    
    # open the workbook that the header has just been saved to
    wb <- loadWorkbook(file_name)
    
    # write the body to the workbook (same sheet but start row is 7)
    writeData(wb,
              sheet = 1,
              x = body,
              startRow = 7)
    
    # save the combined data (header and body)
    saveWorkbook(wb,
                 file =file_name,
                 overwrite = TRUE)
    
  }, .keep = TRUE)
  •  Tags:  
  • r
  • Related