how to loop a series of nrow calculations for multiple lists of variable and save in a data frame-CodePudding

I am trying to calculate the number of appointments conducted by each staff member, in each office pulled from an excel csv. We have a variety of staff members, offices, and appointment types.

Here is a sample of what I have so far:

staff = c("smith", "jones", "carter")
office = c("toronto", "oakville", "ottawa")
appointment.type = c("initial", "initial2", "followup")

smith_initial_to = (nrow(appointmentdata[staff == "smith" & office == "toronto" 
   & appointment.type == "initial",])

smith_initial_oak = (nrow(appointmentdata[staff == "smith" & office == "oakville" 
   & appointment.type == "initial",])

smith_initial_ott = (nrow(appointmentdata[staff == "smith" & office == "ottawa" 
   & appointment.type == "initial",])

jones_initial_to = (nrow(appointmentdata[staff == "jones" & office == "toronto" 
   & appointment.type == "initial",])

jones_initial_oak = (nrow(appointmentdata[staff == "jones" & office == "oakville" 
   & appointment.type == "initial",])

jones_initial_ott = (nrow(appointmentdata[staff == "jones" & office == "ottawa" 
   & appointment.type == "initial",])

etc.

df = data.frame(Name = c("smith", "jones", "carter"), TorontoInitial = 
   c(smith_intitial_to, jones_intitial_to, carter_initial_to), 
   OakvilleInitial = c(smith_initial_oak, jones_initial_oak, 
   carter_initial_oak), OttawaInitial = c(smith_initial_ott, 
   jones_initial_ott, carter_initial_ott))

I want to repeat this for a list of staff members, a list of offices, and a list of appoint types, save each iteration to its own variable name, and then display all the data in one large data frame. The looping that i've tried to far hasn't worked.

I have done this all by hand, by rewriting the code out over and over, and assembling a data frame by entering each variable individually, but I want to figure out how to do it as a loop.

Thanks for your help!

Edit: sample data-

structure(list(Client.Code = 1:20, Office = c("TORONTO", "TORONTO", 
"TORONTO", "OAKVILLE", "OAKVILLE", "TORONTO", "TORONTO", "TORONTO", 
"TORONTO", "TORONTO", "TORONTO", "TORONTO", "TORONTO", "TORONTO", 
"TORONTO", "OTTAWA", "OTTAWA", "OTTAWA", "OAKVILLE", "OAKVILLE"
), Staff = c("SMITH", "SMITH", "SMITH", "SMITH", "SMITH", "JONES", 
"JONES", "JONES", "JONES", "JONES", "JONES", "JONES", "CARTER", 
"CARTER", "CARTER", "CARTER", "CARTER", "CARTER", "CARTER", "CARTER"
), Appointment.Type = c("INITIAL", "INITIAL", "INITIAL2", "INITIAL", 
"INTIAL2", "INTIAL", "FOLLOWUP", "FOLLOWUP", "INITIAL", "FOLLOWUP", 
"FOLLOWUP", "INITIAL", "INITIAL2", "INITIAL2", "INITIAL", "INITIAL", 
"INITIAL", "FOLLOWUP", "INITIAL", "INITIAL")), row.names = c(NA, 
20L), class = "data.frame")

CodePudding user response：

I think you're looking for dplyr::count:

library(dplyr)
appointmentdata %>% 
  count(Office, Staff, Appointment.Type)
#      Office  Staff Appointment.Type n
# 1  OAKVILLE CARTER          INITIAL 2
# 2  OAKVILLE  SMITH          INITIAL 1
# 3  OAKVILLE  SMITH          INTIAL2 1
# 4    OTTAWA CARTER         FOLLOWUP 1
# 5    OTTAWA CARTER          INITIAL 2
# 6   TORONTO CARTER          INITIAL 1
# 7   TORONTO CARTER         INITIAL2 2
# 8   TORONTO  JONES         FOLLOWUP 4
# 9   TORONTO  JONES          INITIAL 2
# 10  TORONTO  JONES           INTIAL 1
# 11  TORONTO  SMITH          INITIAL 2
# 12  TORONTO  SMITH         INITIAL2 1

The above is a data frame, which you could assign a name to, df <- appointmentdata %>% count(...).

If there are values you don't want to count, you could subset your data first, e.g.

office_vec = c("toronto", "oakville", "ottawa")
staff_vec = c("smith", "jones", "carter")
appointmentdata %>%
  filter(tolower(Office) %in% office_vec & tolower(Staff) %in% staff_vec)) %>%
  count(Office, Staff, Appointment.Type)