I have a dataframe named 2022_Rev that looks sort of like this:
Name Vendor Sales
Steve 6 80,000
Annie 4 95,000
Bill 6 45,000
Steve 3 25,000
Bill 2 40,000
Sam 5 5,000
... ... ...
I also have a list of each sales person:
Employees ['Steve', 'Annie', 'Bill', 'Sam', ...]
I want to apply mean() to column sales for each item in the list "Employee". I am supposed to use base R to create a loop that goes through each value in "Employees" and then creates a vector showing the mean for each employee. So far I have:
avgSales = rep(NA, 10)
for (i in length(Employees)){
if(Employees[i] == 2022_Rev$Name){
avgSales[i] = mean(2022_Rev$Sales[i])
}
}
This is erroring apparently because if can only check one value? I'm not sure how to fix it.
CodePudding user response:
This is not normally the approach we would take in R (i.e. there are better ways to get the mean of a column by group). However, if you want an example of a for loop over the names of the Employees in your list, here is one base R approach. First preallocated a named vector of length as long as your Employees, and then fill it use a for loop:
sales_means = setNames(vector("numeric", length = length(Employees)), Employees)
for(e in Employees) {
sales_means[e] = mean(`2022_Rev`[`2022_Rev`$Name==e, "Sales"],na.rm=T)
}
Output:
Steve Annie Bill Sam
52500 95000 42500 5000
Input:
`2022_Rev` = structure(list(Name = c("Steve", "Annie", "Bill", "Steve", "Bill",
"Sam"), Vendor = c(6L, 4L, 6L, 3L, 2L, 5L), Sales = c(80000L,
95000L, 45000L, 25000L, 40000L, 5000L)), row.names = c(NA, -6L
), class = "data.frame")
Employees = list('Steve', 'Annie', 'Bill', 'Sam')
CodePudding user response:
We can use aggregate
to calculate the mean of Sales
with respect to Name
, then transform your list Employees
to data.frame then merge it with the aggregate
result to get the values in the list
aggregate(Sales ~ Name , `2022_Rev` , mean) |>
merge(do.call(rbind , Employees) |>
data.frame(Name = _) , by.y = "Name")
- Output
Name Sales
1 Annie 95000
2 Bill 42500
3 Sam 5000
4 Steve 52500