I am trying to calculate the number of appointments conducted by each staff member, in each office pulled from an excel csv. We have a variety of staff members, offices, and appointment types.
Here is a sample of what I have so far:
staff = c("smith", "jones", "carter")
office = c("toronto", "oakville", "ottawa")
appointment.type = c("initial", "initial2", "followup")
smith_initial_to = (nrow(appointmentdata[staff == "smith" & office == "toronto"
& appointment.type == "initial",])
smith_initial_oak = (nrow(appointmentdata[staff == "smith" & office == "oakville"
& appointment.type == "initial",])
smith_initial_ott = (nrow(appointmentdata[staff == "smith" & office == "ottawa"
& appointment.type == "initial",])
jones_initial_to = (nrow(appointmentdata[staff == "jones" & office == "toronto"
& appointment.type == "initial",])
jones_initial_oak = (nrow(appointmentdata[staff == "jones" & office == "oakville"
& appointment.type == "initial",])
jones_initial_ott = (nrow(appointmentdata[staff == "jones" & office == "ottawa"
& appointment.type == "initial",])
etc.
df = data.frame(Name = c("smith", "jones", "carter"), TorontoInitial =
c(smith_intitial_to, jones_intitial_to, carter_initial_to),
OakvilleInitial = c(smith_initial_oak, jones_initial_oak,
carter_initial_oak), OttawaInitial = c(smith_initial_ott,
jones_initial_ott, carter_initial_ott))
I want to repeat this for a list of staff members, a list of offices, and a list of appoint types, save each iteration to its own variable name, and then display all the data in one large data frame. The looping that i've tried to far hasn't worked.
I have done this all by hand, by rewriting the code out over and over, and assembling a data frame by entering each variable individually, but I want to figure out how to do it as a loop.
Thanks for your help!
Edit: sample data-
structure(list(Client.Code = 1:20, Office = c("TORONTO", "TORONTO",
"TORONTO", "OAKVILLE", "OAKVILLE", "TORONTO", "TORONTO", "TORONTO",
"TORONTO", "TORONTO", "TORONTO", "TORONTO", "TORONTO", "TORONTO",
"TORONTO", "OTTAWA", "OTTAWA", "OTTAWA", "OAKVILLE", "OAKVILLE"
), Staff = c("SMITH", "SMITH", "SMITH", "SMITH", "SMITH", "JONES",
"JONES", "JONES", "JONES", "JONES", "JONES", "JONES", "CARTER",
"CARTER", "CARTER", "CARTER", "CARTER", "CARTER", "CARTER", "CARTER"
), Appointment.Type = c("INITIAL", "INITIAL", "INITIAL2", "INITIAL",
"INTIAL2", "INTIAL", "FOLLOWUP", "FOLLOWUP", "INITIAL", "FOLLOWUP",
"FOLLOWUP", "INITIAL", "INITIAL2", "INITIAL2", "INITIAL", "INITIAL",
"INITIAL", "FOLLOWUP", "INITIAL", "INITIAL")), row.names = c(NA,
20L), class = "data.frame")
CodePudding user response:
I think you're looking for dplyr::count
:
library(dplyr)
appointmentdata %>%
count(Office, Staff, Appointment.Type)
# Office Staff Appointment.Type n
# 1 OAKVILLE CARTER INITIAL 2
# 2 OAKVILLE SMITH INITIAL 1
# 3 OAKVILLE SMITH INTIAL2 1
# 4 OTTAWA CARTER FOLLOWUP 1
# 5 OTTAWA CARTER INITIAL 2
# 6 TORONTO CARTER INITIAL 1
# 7 TORONTO CARTER INITIAL2 2
# 8 TORONTO JONES FOLLOWUP 4
# 9 TORONTO JONES INITIAL 2
# 10 TORONTO JONES INTIAL 1
# 11 TORONTO SMITH INITIAL 2
# 12 TORONTO SMITH INITIAL2 1
The above is a data frame, which you could assign a name to, df <- appointmentdata %>% count(...)
.
If there are values you don't want to count, you could subset your data first, e.g.
office_vec = c("toronto", "oakville", "ottawa")
staff_vec = c("smith", "jones", "carter")
appointmentdata %>%
filter(tolower(Office) %in% office_vec & tolower(Staff) %in% staff_vec)) %>%
count(Office, Staff, Appointment.Type)