I'm trying to calculate the area below a certain point, and unsure how to do that. I've seen this question, but it's not exactly answering what I'm looking for.
Here is some example data...
library(tidyverse)  # provides %>%, as_tibble() and ggplot2
test_df <- structure(list(time = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23), balance = c(27,
-45, -118, -190, -263, -343, -424, -1024, -434, -533, -613, -694,
-775, -355, -436, -516, -597, -77, -158, -239, -319, -400, -472,
-545)), row.names = c(NA, -24L), class = c("tbl_df", "tbl", "data.frame"
)) %>% as_tibble()
ggplot(test_df, aes(time, balance)) +
  geom_smooth(se = FALSE) +
  geom_hline(yintercept = -400)
I'd like to calculate the AUC for the trend line, but only for when it is below a certain threshold (-400, for example).
So I can extract the values for the smoothed line...
test_plot <- ggplot(test_df, aes(time, balance)) +
  geom_smooth(se = FALSE) +
  geom_hline(yintercept = -400)
ggp_data <- ggplot_build(test_plot)$data[[1]]
and use something like this to get an AUC value
MESS::auc(ggp_data$x, ggp_data$y)
My questions are:
- How to only calculate below -400?
- How to interpret the value?
- What units would it be in?
- If my x axis is in hours, is there a way to turn the value into an hour value?
Thanks!
Answer:
To measure the area relative to a threshold instead of relative to 0, shift the y-values so that the threshold becomes the new zero line, i.e. subtract the threshold from y. For a threshold of -400 that means adding 400:
MESS::auc(ggp_data$x, ggp_data$y + 400)
However, this calculates the AUC over the whole range from 0 to 23, so it also includes the parts that are above -400. To get the AUC only for the parts below your threshold, you have to find the x-values where your smoothed line intersects the horizontal line at -400. Inspecting your values by eye gives the following approximate x-values for these intersections:
x1 <- 4.45
x2 <- 15.45
x3 <- 21.35
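If you would rather not read the intersections off by eye, they can be estimated from the extracted curve. The following is only a minimal sketch: `find_crossings` is a hypothetical helper (not part of MESS or ggplot2) that looks for sign changes of `y - threshold` between neighbouring grid points and interpolates linearly between them. The small vectors at the bottom are toy values for illustration, not your data.

```r
# Hypothetical helper: x positions where the curve y(x) crosses `threshold`,
# estimated by linear interpolation between adjacent grid points.
# Assumes x is sorted in increasing order. Points that sit exactly on the
# threshold are not reported by this simple version.
find_crossings <- function(x, y, threshold) {
  d <- y - threshold
  # indices i where the sign of d changes between point i and point i + 1
  i <- which(d[-length(d)] * d[-1] < 0)
  # zero of the straight line through (x[i], d[i]) and (x[i + 1], d[i + 1])
  x[i] - d[i] * (x[i + 1] - x[i]) / (d[i + 1] - d[i])
}

# Toy example: the curve dips below -1 between x = 0.5 and x = 2.5
toy_x <- c(0, 1, 2, 3)
toy_y <- c(0, -2, -2, 0)
toy_crossings <- find_crossings(toy_x, toy_y, threshold = -1)
```

With your data the call would be `find_crossings(ggp_data$x, ggp_data$y, -400)`, which should return values close to the ones estimated by eye.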
The curve is below -400 between x1 and x2 and again between x3 and max(x), so calculate the AUC over each of these intervals and add the two values together:
AUC1 <- MESS::auc(ggp_data$x, ggp_data$y + 400, from = x1, to = x2)
AUC2 <- MESS::auc(ggp_data$x, ggp_data$y + 400, from = x3, to = max(ggp_data$x))
AUC.total <- AUC1 + AUC2
> AUC.total
[1] -1747.352
Note that the value is negative because the curve lies below the shifted zero line. There are no "negative areas", so you can take the absolute value, AUC.total = 1747.352, to proceed. However, without information on the units of your y-axis, this value cannot be clearly interpreted.
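As an alternative to picking the integration limits by hand, the whole calculation can be done in one step. This is a rough sketch, not the MESS implementation: `area_below` is a hypothetical helper that treats the curve as piecewise linear, adds the interpolated crossing points to the grid, sets everything above the threshold to 0 and then applies the trapezoidal rule. The vectors at the bottom are toy values for illustration only.

```r
# Hypothetical helper: signed area between the curve and `threshold`,
# counting only the parts of the curve that lie below the threshold.
# Assumes x is sorted in increasing order.
area_below <- function(x, y, threshold) {
  d <- y - threshold
  # locate sign changes and interpolate the crossing positions linearly
  i <- which(d[-length(d)] * d[-1] < 0)
  xc <- x[i] - d[i] * (x[i + 1] - x[i]) / (d[i + 1] - d[i])
  # add the crossings (height 0) and clip everything above the threshold to 0
  xs <- c(x, xc)
  ds <- c(pmin(d, 0), rep(0, length(xc)))
  o <- order(xs)
  xs <- xs[o]
  ds <- ds[o]
  # trapezoidal rule on the clipped curve
  sum(diff(xs) * (ds[-1] + ds[-length(ds)]) / 2)
}

# Toy example: a dip of depth 1 below the threshold -1
toy_x <- c(0, 1, 2, 3)
toy_y <- c(0, -2, -2, 0)
toy_area <- area_below(toy_x, toy_y, threshold = -1)
abs(toy_area)
```

For your data the call would be `abs(area_below(ggp_data$x, ggp_data$y, -400))`. The result carries the units of the y-axis multiplied by the units of the x-axis, so with x in hours it is "balance units times hours", not a number of hours.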