Here is a sample of my data:
dat<-read.table (text=" W1D1 W1D2 W1D3 MN1 MN2 MN3
11 13 12 A A B
11 14 17 B B B
12 17 10 C C B
10 10 19 C C B
19 12 18 D C B
19 18 10 A D C
18 10 17 B D D
16 16 18 C D D
19 17 20 C D A
16 14 17 D D A
", header=TRUE)
W1D1 corresponds to MN1, W1D2 corresponds to MN2 and W1D3 corresponds to MN3.
I want to calculate the Median for the values of C and D. For example, MN1, has the values of 12,10,19,16,19,16 in W1D1. So I get a median of 16. So W1D1 goes to MN1, W1D2 goes to MN2 and W1D3 goes to MN3.
The outcome is :
Median1 16
Median2 15
Median3 17
Is it possible to do it using R-based, if not what is the simplest option?
Considering I have 10 Medians but to simplify the matter I have provided 3 Medians
CodePudding user response:
I'm not sure I fully understood what you are looking for, but here is one attempt (although not very elegant).
df1 <- df %>%
pivot_longer(cols = starts_with("M")) %>%
filter(value %in% c("C", "D")) %>%
group_by(name) %>%
summarise(across(starts_with("W1"), median)) %>%
select(-name)
df2 <- tibble(median = df1 %>%
as.matrix() %>%
diag(),
variable = names(df1))
-Result
> df2
# A tibble: 3 x 2
median variable
<dbl> <chr>
1 16 W1D1
2 15 W1D2
3 17 W1D3
CodePudding user response:
Here is an approach using pmap_dfc
and then pivoting the result.
library(tidyverse)
n <- ncol(dat) / 2
pmap_dfc(list(str_c("Median", 1:n), str_c('W1D', 1:n), str_c('MN', 1:n)),
~ filter(dat, .data[[..3]] %in% c('C', 'D')) %>%
summarise('{..1}' := median(.data[[..2]]))) %>%
pivot_longer(everything())
#> # A tibble: 3 × 2
#> name value
#> <chr> <dbl>
#> 1 Median1 16
#> 2 Median2 15
#> 3 Median3 17
Created on 2021-12-30 by the reprex package (v2.0.1)