Return the 90th percentile values in R

Time:05-08

For example, I have a dataset of 30 years of air temperature for a city. The dataset looks like this:

Year  Julian_date  temperature
1991    1             2.1
1991    2             2.2
...     ...           ...
1991    365           2.3
1992    1             2.1
...     ...           ...
1992    365           2.5
...     ...           ...
2020    366           2.5

I would like to calculate the 90th percentile value of each Julian date (from different years), and return the results, like:

Julian_date        value(the 90th percentile)
1                  2.4
2                  2.6
...                ...
365                2.5

How should I write the code in R?

CodePudding user response:

You can first group by Julian_date, then call the quantile function with probs = 0.9 inside summarise.

library(tidyverse)

df %>% 
  group_by(Julian_date) %>% 
  summarise("value (the 90th percentile)" = quantile(temperature, probs=0.9, na.rm=TRUE))

Output

  Julian_date `value (the 90th percentile)`
        <int>                         <dbl>
1           1                           2.1
2           2                           2.2
3         365                           2.5

Data

df <- structure(list(Year = c(1991L, 1991L, 1991L, 1992L, 1992L, 2020L
), Julian_date = c(1L, 2L, 365L, 1L, 365L, 365L), temperature = c(2.1, 
2.2, 2.3, 2.1, 2.5, 2.5)), class = "data.frame", row.names = c(NA, 
-6L))
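
For readers who prefer not to load the tidyverse, the same group-then-quantile idea can be sketched in base R with aggregate(). This is only an illustrative alternative, not part of the answer above; the column name value_90th_percentile is chosen here for convenience, and the toy data is rebuilt so the block runs on its own.

```r
# Toy data rebuilt from the "Data" section above so this block is self-contained.
df <- data.frame(
  Year        = c(1991L, 1991L, 1991L, 1992L, 1992L, 2020L),
  Julian_date = c(1L, 2L, 365L, 1L, 365L, 365L),
  temperature = c(2.1, 2.2, 2.3, 2.1, 2.5, 2.5)
)

# Pool all years for each Julian_date and take the 90th percentile.
# The formula interface of aggregate() drops rows with NA by default
# (na.action = na.omit), so na.rm is not passed to quantile() here.
res <- aggregate(
  temperature ~ Julian_date,
  data = df,
  FUN  = function(x) unname(quantile(x, probs = 0.9))
)

# Illustrative column name (an assumption, not from the original answer).
names(res)[2] <- "value_90th_percentile"

print(res)
```

On this toy data the result should agree with the tibble shown under "Output".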

CodePudding user response:

You can use the quantile() function. If "(from different years)" in your question means each year should be calculated separately, then you need to group the data frame by Year and Julian_date. If instead it means the different years are combined, you need to group the data frame only by Julian_date, as @AndrewGB and @benson23 showed.

library(dplyr)

yourdf %>%
  group_by(Year, Julian_date) %>%
  summarise(value_90th_percentile = quantile(temperature, 0.9, na.rm = TRUE))
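
To see how the two grouping choices differ, it may help to count how many readings fall into each group. The sketch below uses base R and the toy data from the first answer, and it assumes one reading per day per year, as in the question's sample. The object names n_by_year_day and n_by_day are made up for illustration.

```r
# Toy data from the first answer; assumes one reading per Year/Julian_date pair.
df <- data.frame(
  Year        = c(1991L, 1991L, 1991L, 1992L, 1992L, 2020L),
  Julian_date = c(1L, 2L, 365L, 1L, 365L, 365L),
  temperature = c(2.1, 2.2, 2.3, 2.1, 2.5, 2.5)
)

# Number of readings per group when grouping by Year and Julian_date.
n_by_year_day <- aggregate(temperature ~ Year + Julian_date, data = df, FUN = length)

# Number of readings per group when grouping by Julian_date only.
n_by_day <- aggregate(temperature ~ Julian_date, data = df, FUN = length)

print(n_by_year_day)  # the temperature column holds the group size
print(n_by_day)
```

If every Year and Julian_date pair holds a single reading, the quantile of that group is just the reading itself, so grouping only by Julian_date is probably the choice that yields a percentile across years.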