Calculate area under a curve below a certain threshold in R

Time:08-12

I'm trying to calculate the area under a curve below a certain threshold, and I'm unsure how to do that. I've seen this question, but it doesn't quite answer what I'm looking for.

Here is some example data...

test_df <- structure(list(time = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23), balance = c(27, 
-45, -118, -190, -263, -343, -424, -1024, -434, -533, -613, -694, 
-775, -355, -436, -516, -597, -77, -158, -239, -319, -400, -472, 
-545)), row.names = c(NA, -24L), class = c("tbl_df", "tbl", "data.frame"
)) %>% as_tibble()

ggplot(test_df, aes(time, balance)) +
  geom_smooth(se = FALSE) +
  geom_hline(yintercept = -400)

I'd like to calculate the AUC for the trend line, but only for when it is below a certain threshold (-400, for example).

So I can extract the values for the smoothed line...

test_plot <- ggplot(test_df, aes(time, balance)) +
  geom_smooth(se = FALSE) +
  geom_hline(yintercept = -400)

ggp_data <- ggplot_build(test_plot)$data[[1]]

and use something like this to get an AUC value

MESS::auc(ggp_data$x, ggp_data$y)

My questions are:

  1. How to only calculate below -400?
  2. How to interpret the value?
  3. What units would it be in?
  4. If my x axis is in hours, is there a way to turn the value into an hour value?

Thanks!

Answer:

To measure the area relative to a threshold rather than relative to 0, shift the curve so that the threshold sits at 0, i.e. subtract the threshold from your y-values. For a threshold of -400 that means adding 400:

MESS::auc(ggp_data$x, ggp_data$y + 400)

However, this integrates over the whole range from 0 to 23 and therefore also includes the parts where the curve is above -400. To get the AUC only for the part below your threshold, you have to find the x-values where your smoothed line intersects the horizontal line at -400. Inspecting your values by eye gives the following approximate intersection points:

 x1 <- 4.45 
 x2 <- 15.45 
 x3 <- 21.35
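
Instead of reading the intersections off by eye, they can be located numerically. The following is a minimal sketch, not part of the original answer: `find_crossings` is a hypothetical helper that interpolates linearly between neighbouring points, and the base-R `loess` fit is only a stand-in for `ggp_data` so the example runs without ggplot2. The stand-in may not reproduce the `geom_smooth()` line exactly, so the crossings can differ slightly from the values above.

```r
# Hypothetical helper: x-positions where the curve (x, y) crosses `threshold`,
# found by linear interpolation inside each segment where the sign changes.
# Points lying exactly on the threshold are not reported by this simple version.
find_crossings <- function(x, y, threshold) {
  d <- y - threshold
  idx <- which(d[-length(d)] * d[-1] < 0)  # sign change between i and i + 1
  x[idx] - d[idx] * (x[idx + 1] - x[idx]) / (d[idx + 1] - d[idx])
}

# Raw data from the question
time <- 0:23
balance <- c(27, -45, -118, -190, -263, -343, -424, -1024, -434, -533, -613,
             -694, -775, -355, -436, -516, -597, -77, -158, -239, -319, -400,
             -472, -545)

# Assumption: a default loess fit on an 80-point grid approximates ggp_data
fit <- loess(balance ~ time)
grid <- seq(min(time), max(time), length.out = 80)
smooth_y <- predict(fit, newdata = data.frame(time = grid))

crossings <- find_crossings(grid, smooth_y, -400)
print(crossings)
```

With the real plot data the call would be `find_crossings(ggp_data$x, ggp_data$y, -400)`.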

The curve is below -400 between x1 and x2, and again between x3 and max(x). Calculate the AUC for each of these intervals and add the two values together:

AUC1 <- MESS::auc(ggp_data$x, ggp_data$y + 400, from = x1, to = x2)
AUC2 <- MESS::auc(ggp_data$x, ggp_data$y + 400, from = x3, to = max(ggp_data$x))

AUC.total <- AUC1 + AUC2

> AUC.total
[1] -1747.352

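Note that the value is negative because the shifted curve lies below 0 in these intervals. There are no "negative areas", so you can take the absolute value, 1747.352, to proceed. The units are the y-axis units multiplied by the x-axis units (for example, balance units × hours if your x-axis is in hours), so without knowing what your y-axis measures the value cannot be interpreted any further.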
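
An alternative that avoids choosing the intersection points at all is to clip the shifted curve at 0 and integrate what is left. This is a rough sketch under stated assumptions rather than the method used above: `area_below` is a hypothetical helper, it uses base R's `approx()` and a plain trapezoid rule instead of `MESS::auc`, and the time below the threshold is only approximated from the share of grid points below it.

```r
# Hypothetical helper: area between the curve and `threshold`, counting only
# the stretches where the curve is below the threshold.
area_below <- function(x, y, threshold, n = 10001) {
  fine <- approx(x, y, n = n)            # linear interpolation on a fine grid
  depth <- pmin(fine$y - threshold, 0)   # 0 wherever the curve is above the threshold
  dx <- diff(fine$x)
  area <- sum(dx * (head(depth, -1) + tail(depth, -1)) / 2)  # trapezoid rule
  # approximate length of the x-range spent below the threshold
  time_below <- mean(fine$y < threshold) * diff(range(fine$x))
  list(area = abs(area), time_below = time_below)
}

# Toy check: a V-shaped curve dipping to -2 with the threshold at -1.
# The region below the threshold is a triangle of base 1 and height 1.
res <- area_below(c(0, 1, 2), c(0, -2, 0), threshold = -1)
print(res)  # area close to 0.5, time_below close to 1

# With the plot data from the question the call would be:
# area_below(ggp_data$x, ggp_data$y, -400)
```

If x is in hours, `time_below` is in hours, and `area / time_below` gives the average distance below the threshold in y-axis units, which is often easier to interpret than the raw area.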
