I have the following data frame that I am trying to make a function for:
df<- structure(list(BLG = c(37.037037037037, 12.0603015075377, 93.5593220338983,
3.96563119629874, 77.634011090573, 71.608040201005, 3.96563119629874,
119.775421085465, 44.8765893792072), GSF = c(0, 0, 0, 0, 11.090573012939,
0, 0, 0, 0), LMB = c(66.6666666666667, 24.1206030150754, 40.6779661016949,
31.7250495703899, 73.9371534195933, 67.8391959798995, 31.7250495703899,
22.4578914535246, 31.413612565445), YLB = c(0, 0, 0, 0, 14.7874306839187,
0, 0, 0, 0), BLC = c(3.7037037037037, 0, 4.06779661016949, 7.93126239259749,
7.39371534195933, 11.3065326633166, 7.93126239259749, 3.74298190892077,
22.4382946896036), WHC = c(7.40740740740741, 0, 0, 0, 0, 0, 0,
7.48596381784155, 4.48765893792072), RSF = c(0, 0, 0, 0, 0, 0,
0, 0, 4.48765893792072), CCF = c(3.7037037037037, 0, 8.13559322033898,
0, 0, 0, 0, 0, 0), BLB = c(0, 0, 0, 0, 0, 0, 0, 0, 0), group = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L)), row.names = c(NA, -9L), class = c("data.table",
"data.frame"))
Function
p_true <- c(83, 10, 47, 8, 9, 6, 12, 5, 8)  # true value for each column
estimate2 <- function(df) {
  y_est2 <- df
  sqrt(mean((y_est2 - p_true)^2)) / p_true * 100
}
final <- df %>%
  group_by(group) %>%
  group_modify(~ as.data.frame.list(estimate2(.)))
The final output should be a 3x9 data frame: one value for each column per group. I can get the intended output format with plyr::ddply(df, .(group), estimate2).
Even when I skip the grouping, drop the group column, and call estimate2(df) directly, I still get the warning "argument is not numeric or logical: returning NA".
I'm not sure why, because I've run very similar functions that differ only in the equation inside, and they work fine.
Does anyone know where I'm going wrong?
CodePudding user response:
The problem is the call to mean(). Its help page (?mean) describes the argument x as:
x: An R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for trim = 0, only.
Inside estimate2, however, mean() receives a data frame (the three rows of one group across all columns) rather than a numeric vector, so it warns and returns NA.
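To see this in isolation, here is a minimal sketch on a made-up two-column data frame (toy is hypothetical, not the data from the question). It shows mean() returning NA on a data frame, and two ways to get a number out of it: unlist() for one pooled mean, or colMeans() for one mean per column.

```r
# Toy data frame (made-up values, not the asker's data)
toy <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# mean() has no data-frame method, so mean.default() warns
# "argument is not numeric or logical: returning NA" and returns NA
res <- suppressWarnings(mean(toy))
print(is.na(res))

# Flattening to a numeric vector first gives one pooled mean over all cells
pooled <- mean(unlist(toy))

# colMeans() gives one mean per column instead
per_col <- colMeans(toy)
print(pooled)
print(per_col)
```

Which of the two you want depends on whether the statistic is meant to be pooled over the whole group or computed column by column.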
I'm not entirely sure if the following is what you want, but you can unlist your data frame so that it becomes a vector. Note that this pools every cell of the group into a single RMSE. Dividing that single value by p_true then recycles it to one value per column. You can then combine the result into a data frame again:
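library(dplyr)
p_true <- c(83, 10, 47, 8, 9, 6, 12, 5, 8)  # true value for each column
estimate2 <- function(df) {
  # unlist() flattens the data frame so mean() gets a numeric vector;
  # the result is one pooled RMSE for the whole group
  rmse <- sqrt(mean(unlist((df - p_true)^2)))
  # Express that single RMSE as a percentage of each column's true value
  return_df <- as.data.frame(t(rmse / p_true * 100))
  names(return_df) <- names(df)
  return_df
}
final <- df %>%
  group_by(group) %>%
  group_modify(~ estimate2(.x))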
This returns:
# A tibble: 3 x 10
# Groups: group [3]
group BLG GSF LMB YLB BLC WHC RSF CCF BLB
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 38.7 321. 68.3 401. 357. 535. 268. 642. 401.
2 2 45.9 381. 81.1 477. 424. 635. 318. 763. 477.
3 3 45.6 378. 80.4 473. 420. 630. 315. 756. 473.
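If what you actually need is one RMSE per column, each measured against that column's own entry in p_true, be aware that subtracting p_true from the data frame does not line the vector up with the columns: R recycles it down the rows, column by column. Below is a minimal sketch of a per-column version. The data are made up, rmse_pct() is a hypothetical helper, and the grouping is done in base R so that the example runs without dplyr.

```r
# Toy data (made-up values): two measured columns and a grouping column
toy <- data.frame(a = c(1, 3, 2, 2), b = c(4, 4, 6, 2), group = c(1, 1, 2, 2))
truth <- c(a = 2, b = 4)  # hypothetical true value for each column

# Data frame minus vector recycles the vector down the rows, column by column,
# so `truth` is NOT matched to columns here: column a is compared with 2, 4, 2, 4
misaligned <- toy[c("a", "b")] - truth

# Hypothetical helper: RMSE of each column against its own true value,
# expressed as a percentage of that true value
rmse_pct <- function(df, truth) {
  stopifnot(length(truth) == ncol(df))
  out <- mapply(function(col, tv) sqrt(mean((col - tv)^2)) / tv * 100, df, truth)
  as.data.frame(as.list(out))
}

# Base-R stand-in for group_by() + group_modify(): split, apply, bind
pieces <- split(toy[c("a", "b")], toy$group)
per_group <- do.call(rbind, lapply(pieces, rmse_pct, truth = truth))
print(per_group)
```

With dplyr, the same helper should slot into the original pipeline as group_modify(~ rmse_pct(.x, p_true)), since .x holds the rows of one group without the group column.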