I have the following formula for relative estimation error that I am trying to turn into a function to measure the precision of measurements.
Where:
Y_estimated = the observed values in the test df
y_true = the true value that we are trying to estimate (denoted as p_true in code)
R = the number of observations in each iteration (n=3)
My data has the following format:
# dataframe
test<- iris[1:3,1:4]
# make a vector that shows the true population value for each column in the dataframe
p_true<- c(5, 3, 1, 0.3)
# function
estimate = function(df, y_true) {
((sqrt(sum((df - y_true) ^ 2)) / 3) / y_true) * 100
}
y_true <- p_true
library(dplyr) # provides %>% and group_modify()
final2 <- test %>%
group_modify( ~ as.data.frame(estimate(., p_true)))
The goal is an output of 1 row with 4 precision estimates (one for each variable). Currently the output comes out as 1 column and 4 rows instead of 4 columns and 1 row. Beyond that, I'm not sure the function is set up correctly, as the values are much more extreme than I expect.
If anyone can confirm if my function is set up correctly and/or how to get the output to be in the correct format I would really appreciate the help.
CodePudding user response:
No, it does not do what you think it is doing. Look at the result of test - y_true. The values are not what you expect because you are subtracting a vector (y_true) from a data frame (test). R recycles the vector down each column rather than matching its elements to the columns, so the first column becomes 5.1 - 5 = 0.1, 4.9 - 3 = 1.9, 4.7 - 1 = 3.7:
test - y_true
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1 0.1 3.2 0.4 -2.8
# 2 1.9 -2.0 1.1 -0.8
# 3 3.7 0.2 -3.7 -0.1
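The recycling order is easier to see on a toy example. The sketch below uses a made-up 3 x 4 matrix of zeros (m) and a made-up vector (v), neither of which is from the question, so that the result shows exactly where each element of the vector lands:

```r
# Illustration only: m and v are hypothetical values chosen for readability
m <- matrix(0, nrow = 3, ncol = 4)
v <- c(1, 2, 3, 4)

# v is recycled to length 12 and applied in column-major order,
# so it runs down column 1, then continues into column 2, and so on
m + v
#      [,1] [,2] [,3] [,4]
# [1,]    1    4    3    2
# [2,]    2    1    4    3
# [3,]    3    2    1    4
```

If the vector were matched to the columns, every row would read 1 2 3 4. It does not, which is the same effect seen in test - y_true.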
There are several ways to handle this. Here are three:
test - matrix(y_true, 3, 4, byrow=TRUE)
t(t(test) - y_true)
sweep(test, 2, y_true, "-")
# These will produce what you are expecting:
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 1 0.1 0.5 0.4 -0.1
# 2 -0.1 0.0 0.4 -0.1
# 3 -0.3 0.2 0.3 -0.1
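You will also need to use colSums() instead of sum(), so that you get one sum per column rather than a single total. With those two changes the calculation works. Note that the line below divides by 3 inside the square root, whereas the function in the question divides after taking it; keep whichever placement matches your formula: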
sqrt(colSums((test - matrix(y_true, 3, 4, byrow=TRUE))^2) / 3) / y_true * 100
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 3.829708 10.363755 36.968455 33.333333
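To get the 1-row, 4-column output the question asks for, the calculation can be wrapped in a function. This is only a sketch under stated assumptions: rel_error is a hypothetical name, the divisor is taken to be the number of rows in df, and the division sits inside the square root as in the line above.

```r
# Hypothetical helper: relative estimation error (%) for each column of df.
# Assumes the divisor R is nrow(df) and belongs inside the square root.
rel_error <- function(df, y_true) {
  stopifnot(ncol(df) == length(y_true))
  # Subtract each column's true value from that column (MARGIN = 2)
  dev <- sweep(as.matrix(df), 2, y_true, "-")
  err <- sqrt(colSums(dev^2) / nrow(df)) / y_true * 100
  # Transpose the named vector so the result has 1 row, one column per variable
  as.data.frame(t(err))
}

test <- iris[1:3, 1:4]
p_true <- c(5, 3, 1, 0.3)
res <- rel_error(test, p_true)
res
```

Called on the test data, this should reproduce the four values shown above, laid out as one row with the original column names. No dplyr is needed for this step.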