I have a density distribution. The x column represents the values on the x axis, while the y column represents the corresponding density.
I would like to find the 2.5th and 97.5th percentiles and their corresponding x values.
The end dataframe should look like cri with the x column filled in.
library(tidyverse)
x = seq(-10,10, 0.1)
y = dnorm(x, mean = 0, sd = 2)
df = tibble(x,y)
cri <- df %>%
  summarise(lwr = quantile(y, probs = 0.025),
            upr = quantile(y, probs = 0.975),
            mean = mean(y)) %>%
  pivot_longer(cols = everything()) %>%
  mutate(x = NA)
cri
#> # A tibble: 3 x 3
#>   name       value x
#>   <chr>      <dbl> <lgl>
#> 1 lwr   0.00000122 NA
#> 2 upr   0.197      NA
#> 3 mean  0.0498     NA
Created on 2022-09-13 by the reprex package (v2.0.1)
CodePudding user response:
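Assuming there is an exact match, you can use match. Where there is none (as for mean below, which gets NA), which.min on the absolute difference returns the x whose density is closest to the value: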
cri <- df %>%
  summarise(lwr = quantile(y, probs = 0.025),
            upr = quantile(y, probs = 0.975),
            mean = mean(y)) %>%
  pivot_longer(cols = everything()) %>%
  # exact_x: first x whose density equals value exactly (NA if none)
  mutate(exact_x = df$x[match(value, df$y)]) %>%
  rowwise() %>%
  # closest_x: x whose density is nearest to value
  mutate(closest_x = df$x[which.min(abs(value - df$y))]) %>%
  ungroup()
cri
# # A tibble: 3 × 4
#   name       value exact_x closest_x
#   <chr>      <dbl>   <dbl>     <dbl>
# 1 lwr   0.00000122  -9.8      -9.8
# 2 upr   0.197       -0.300    -0.300
# 3 mean  0.0498      NA         3.3
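Exact equality on doubles can be fragile, and match only returns the first hit even though a symmetric density has the same value on both sides of the mean. As a rough sketch (not part of the answer above), dplyr::near() can be used to collect every x whose density lies within a small tolerance of the target. The names targets and matches are made up for illustration, and the default tolerance of near() is assumed to be small enough for this grid.

```r
library(dplyr)

x <- seq(-10, 10, 0.1)
y <- dnorm(x, mean = 0, sd = 2)
df <- tibble(x, y)

# Target density values (hypothetical name `targets`)
targets <- c(lwr = unname(quantile(y, probs = 0.025)),
             upr = unname(quantile(y, probs = 0.975)))

# For each target, keep every x whose density is within near()'s tolerance
matches <- lapply(targets, function(v) df$x[near(df$y, v)])
matches
```

Each list element can hold zero, one, or several x values, so the result is a list rather than a single column.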
CodePudding user response:
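Another way to find the values is to use left_join. It returns every row of df whose y equals value exactly, so lwr appears twice in the output (x = -9.8 and x = 9.8), while mean, which has no exact match, gets NA.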
library(dplyr)
library(tidyr)
x = seq(-10,10, 0.1)
y = dnorm(x, mean = 0, sd = 2)
df = tibble(x,y)
cri <- df %>%
  summarise(lwr = quantile(y, probs = 0.025),
            upr = quantile(y, probs = 0.975),
            mean = mean(y)) %>%
  pivot_longer(cols = everything()) %>%
  # join back to df to pick up the x for each matching density value
  left_join(df, by = c("value" = "y"))
cri
# # A tibble: 4 x 3
#   name       value      x
#   <chr>      <dbl>  <dbl>
# 1 lwr   0.00000122 -9.8
# 2 lwr   0.00000122  9.8
# 3 upr   0.197      -0.300
# 4 mean  0.0498     NA
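Note that quantile(y, ...) in all of the code above takes quantiles of the density values themselves. If what is wanted is instead the x values that cut off 2.5% and 97.5% of the probability mass, one possible approach is to accumulate the density into an approximate CDF and look up where it crosses each probability. The following is only a sketch under assumptions: the grid is evenly spaced and covers essentially all of the mass, and cdf, x_at_prob and cri_x are hypothetical names, not from the posts above.

```r
library(dplyr)

x <- seq(-10, 10, 0.1)
y <- dnorm(x, mean = 0, sd = 2)
df <- tibble(x, y)

# Approximate CDF: running sum of the density, normalised to end at 1
# (assumes an evenly spaced grid, so the step width cancels out)
cdf <- cumsum(df$y) / sum(df$y)

# Hypothetical helper: first grid point whose cumulative probability reaches p
x_at_prob <- function(p) df$x[which(cdf >= p)[1]]

probs <- c(0.025, 0.975)
cri_x <- tibble(
  name = c("lwr", "upr"),
  prob = probs,
  x    = vapply(probs, x_at_prob, numeric(1))
)
cri_x
```

The result is limited to grid resolution; it can be compared against qnorm(c(0.025, 0.975), mean = 0, sd = 2) for this example, and a finer grid or interpolation with approx() would narrow any gap.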