A have a hypothetical dataframe that looks like this
df<-data.frame(Group = rep(c("A","A","B","B", "C","C","D","D","E","E")),
Sex = rep(c("M","F"),5),
Rate= rnorm(10))
df
Group Sex Rate
1 A M -1.0678237
2 A F -0.2179749
3 B M -1.0260044
4 B F -0.7288912
5 C M -0.6250393
6 C F -1.6866933
7 D M 0.8377870
8 D F 0.1533731
9 E M -1.1381369
10 E F 1.2538149
BUT the rate of "F" is actually rate of M rate of F, within their own group. Within the dataframe, taking into account the different groups, can I change the rate of F to include the rate of M? Potentially in a different column (df$Rate2)
Hypothetically....
Group Sex Rate Rate2
1 A M -1.0678237 Same
2 A F -0.2179749 M F (Group A)
3 B M -1.0260044 Same
4 B F -0.7288912 M F (Group B)
5 C M -0.6250393 Same
6 C F -1.6866933 M F (Group C)
7 D M 0.8377870 Same
8 D F 0.1533731 M F (Group D)
9 E M -1.1381369 Same
10 E F 1.2538149 M F (Group E)
CodePudding user response:
Assuming that there is always a M and F in every group, you could do this:
library(tidyverse)
df<-tibble(Group = rep(c("A","A","B","B", "C","C","D","D","E","E")),
Sex = rep(c("M","F"),5),
Rate= rnorm(10))
df %>%
group_by(Group) %>%
mutate(Rate2 = case_when(Sex == "M" ~ Rate,
Sex == "F" ~ sum(Rate)))
#> # A tibble: 10 x 4
#> # Groups: Group [5]
#> Group Sex Rate Rate2
#> <chr> <chr> <dbl> <dbl>
#> 1 A M 1.54 1.54
#> 2 A F 1.82 3.35
#> 3 B M -0.778 -0.778
#> 4 B F 0.507 -0.270
#> 5 C M -0.624 -0.624
#> 6 C F 1.31 0.690
#> 7 D M 0.760 0.760
#> 8 D F 0.666 1.43
#> 9 E M 1.46 1.46
#> 10 E F 0.218 1.68
CodePudding user response:
We can use pivot_wider
, which would output a different format, but a tidy one.
data from @AndS. (but this time reproducible, with set.seed)
set.seed(1)
df<-tibble(Group = rep(c("A","A","B","B", "C","C","D","D","E","E")),
Sex = rep(c("M","F"),5),
Rate= rnorm(10))
Answer
library(dplyr)
library(tidyr)
df %>% pivot_wider(values_from = Rate, names_from = Sex) %>%
mutate(F = F M)
# A tibble: 5 × 3
Group M F
<chr> <dbl> <dbl>
1 A -0.626 -0.443
2 B -0.836 0.760
3 C 0.330 -0.491
4 D 0.487 1.23
5 E 0.576 0.270
In case we eant the original layouot, we can pivot_longer
the output, as in:
df %>% pivot_wider(values_from = Rate, names_from = Sex)%>%
mutate(F = F M)%>%
pivot_longer(cols = c(M, F))
# A tibble: 10 × 3
Group name value
<chr> <chr> <dbl>
1 A M -0.626
2 A F -0.443
3 B M -0.836
4 B F 0.760
5 C M 0.330
6 C F -0.491
7 D M 0.487
8 D F 1.23
9 E M 0.576
10 E F 0.270