With this code:
datanew <- df_bsp %>%
  group_by(id_mother) %>%
  dplyr::mutate(Family = cur_group_id())
I got this output:
datanew <- data.frame(id_pers=c(1, 2, 3, 4, 5, 6),
                      id_mother=c(11, 11, 11, 12, 12, 12),
                      Family=c(1,1,1,2,2,2))
Now the problem:
There are also some NAs in the id_mother variable.
It looks like this:
datanew_1 <- data.frame(id_pers=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                        id_mother=c(11, 11, 11, 12, 12, 12, NA, NA, NA, NA))
How can I get this result, where every NA row becomes its own family?
datanew <- data.frame(id_pers=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                      id_mother=c(11, 11, 11, 12, 12, 12, NA, NA, NA, NA),
                      Family=c(1,1,1,2,2,2,3,4,5,6))
Thanks
CodePudding user response:
If you want each NA value treated as its own group, give each one a unique value:
library(dplyr)
datanew_1 %>%
  mutate(
    # Known mothers share a key; each NA gets its own running-number key
    id_mother_na = ifelse(
      is.na(id_mother),
      paste("g", "na", cumsum(is.na(id_mother))),
      paste("g", id_mother)
    )
  ) %>%
  group_by(id_mother_na) %>%
  mutate(Family = cur_group_id()) %>%
  ungroup()
# # A tibble: 10 × 4
#    id_pers id_mother id_mother_na Family
#      <dbl>     <dbl> <chr>         <int>
#  1       1        11 g 11              1
#  2       2        11 g 11              1
#  3       3        11 g 11              1
#  4       4        12 g 12              2
#  5       5        12 g 12              2
#  6       6        12 g 12              2
#  7       7        NA g na 1            3
#  8       8        NA g na 2            4
#  9       9        NA g na 3            5
# 10      10        NA g na 4            6
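The same idea (one shared key per known mother, one unique key per NA row) can also be sketched in base R without dplyr. This is only an illustration under the assumption that numbering families in order of first appearance is acceptable; the variable name key and the "na_" prefix are made up for this example.

```r
datanew_1 <- data.frame(
  id_pers   = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  id_mother = c(11, 11, 11, 12, 12, 12, NA, NA, NA, NA)
)

# Illustrative key: the mother id as text, or "na_<row>" for rows without one.
key <- ifelse(
  is.na(datanew_1$id_mother),
  paste0("na_", seq_along(datanew_1$id_mother)),
  as.character(datanew_1$id_mother)
)

# match() against unique() numbers the keys in order of first appearance.
datanew_1$Family <- match(key, unique(key))
print(datanew_1$Family)
```

Unlike cur_group_id(), which numbers groups by their sorted key, this numbers them by first appearance, so the two can differ when the data are not ordered by id_mother.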
CodePudding user response:
Along the same lines as the other answer, you need to make a unique group for each NA:
library(tidyverse)
make_grp <- function(x){
  # Non-NA values are kept; the k-th NA becomes max(x) + k, so it cannot clash with a real id
  coalesce(x, cumsum(is.na(x))) + (max(x, na.rm = TRUE) * is.na(x))
}
datanew_1 |>
  group_by(grp = make_grp(id_mother)) |>
  mutate(Family = cur_group_id()) |>
  ungroup() |>
  select(-grp)
#> # A tibble: 10 x 3
#>    id_pers id_mother Family
#>      <dbl>     <dbl>  <int>
#>  1       1        11      1
#>  2       2        11      1
#>  3       3        11      1
#>  4       4        12      2
#>  5       5        12      2
#>  6       6        12      2
#>  7       7        NA      3
#>  8       8        NA      4
#>  9       9        NA      5
#> 10      10        NA      6
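To see why the offset in make_grp keeps the NA groups from colliding with real ids, here is a rough base R walk-through of the same arithmetic on the sample vector. It swaps coalesce() for ifelse(), and the intermediate names (is_na, na_counter, offset, grp) are illustrative only.

```r
id_mother <- c(11, 11, 11, 12, 12, 12, NA, NA, NA, NA)

is_na <- is.na(id_mother)
na_counter <- cumsum(is_na)                     # 0 0 0 0 0 0 1 2 3 4
offset <- max(id_mother, na.rm = TRUE) * is_na  # 0 0 0 0 0 0 12 12 12 12

# Real ids pass through unchanged; the k-th NA becomes max + k.
grp <- ifelse(is_na, na_counter + offset, id_mother)
print(grp)
# [1] 11 11 11 12 12 12 13 14 15 16

# Rank of each grouping value among the sorted unique values.
family <- match(grp, sort(unique(grp)))
print(family)
# [1] 1 1 1 2 2 2 3 4 5 6
```

This relies on at least one id being non-NA: with only NA values, max(..., na.rm = TRUE) returns -Inf with a warning and the offset is no longer usable.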