I have a large data frame that looks like this. Each player is assigned to a group.
library(tidyverse)
df <- tibble(player=c(1,2,3,4,5),groups=c("group1","group2","group2",NA,NA))
df
#> # A tibble: 5 × 2
#> player groups
#> <dbl> <chr>
#> 1 1 group1
#> 2 2 group2
#> 3 3 group2
#> 4 4 <NA>
#> 5 5 <NA>
Created on 2022-04-12 by the reprex package (v2.0.1)
Some players are not assigned to a group, and I want to fill them in serially with new groups, like this:
#> # A tibble: 5 × 2
#> player groups
#> <dbl> <chr>
#> 1 1 group1
#> 2 2 group2
#> 3 3 group2
#> 4 4 group3
#> 5 5 group4
CodePudding user response:
dplyr
library(dplyr)
df %>%
  mutate(
    # highest group number already in use
    maxgrp = max(as.integer(gsub("[^0-9]", "", groups)), na.rm = TRUE),
    # number the missing rows upward from that maximum
    groups = if_else(is.na(groups), paste0("group", maxgrp + cumsum(is.na(groups))), groups)
  ) %>%
  select(-maxgrp)
# # A tibble: 5 x 2
# player groups
# <dbl> <chr>
# 1 1 group1
# 2 2 group2
# 3 3 group2
# 4 4 group3
# 5 5 group4
data.table
library(data.table)
DT <- as.data.table(df)
DT[, groups := fifelse(
  is.na(groups),
  paste0("group", cumsum(is.na(groups)) + max(as.integer(gsub("[^0-9]", "", groups)), na.rm = TRUE)),
  groups) ]
CodePudding user response:
This was tricky, but I think we could do it this way:
library(dplyr)
df %>%
  mutate(x = cumsum(groups %in% NA) + 1) %>%
  # x + 1 is 2 plus the running count of NAs, so this relies on the highest existing group being 2
  mutate(groups = ifelse(is.na(groups), paste0("group", x + 1), groups), .keep="unused")
player groups
<dbl> <chr>
1 1 group1
2 2 group2
3 3 group2
4 4 group3
5 5 group4
CodePudding user response:
You could do:
library(tidyverse)  # parse_number() comes from readr
df |>
  mutate(new_group = max(parse_number(groups), na.rm = TRUE) + cumsum(is.na(groups)),
         groups = if_else(is.na(groups), paste0("group", new_group), groups)) |>
  select(-new_group)
With a slightly different example, where another group appears after the missing values, the new groups are numbered from the overall maximum, so this would give you:
Input:
library(tidyverse)
df <- tibble(player=c(1,2,3,4,5,6),groups=c("group1","group2","group2",NA,NA, "group3"))
# A tibble: 6 x 2
player groups
<dbl> <chr>
1 1 group1
2 2 group2
3 3 group2
4 4 NA
5 5 NA
6 6 group3
Output:
# A tibble: 6 x 2
player groups
<dbl> <chr>
1 1 group1
2 2 group2
3 3 group2
4 4 group4
5 5 group5
6 6 group3
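For reference, the same idea (highest existing group number plus a running count of the missing rows) also works without any packages. Below is a minimal base R sketch. `fill_groups` and its `prefix` argument are hypothetical names that do not come from the answers above, and the sketch assumes every label is a prefix followed by a number.

```r
# Hypothetical helper (not from the answers above): replace NA group labels
# with new, sequentially numbered groups.
# Assumes labels look like "<prefix><number>", e.g. "group2".
fill_groups <- function(groups, prefix = "group") {
  # Numeric part of each existing label; NA entries stay NA
  ids <- as.integer(gsub("[^0-9]", "", groups))
  # Highest number already in use, or 0 if every label is missing
  start <- if (all(is.na(ids))) 0L else max(ids, na.rm = TRUE)
  missing <- is.na(groups)
  # Running count of missing entries, offset by the highest existing number
  groups[missing] <- paste0(prefix, start + cumsum(missing)[missing])
  groups
}

fill_groups(c("group1", "group2", "group2", NA, NA))
#> [1] "group1" "group2" "group2" "group3" "group4"
fill_groups(c("group1", "group2", "group2", NA, NA, "group3"))
#> [1] "group1" "group2" "group2" "group4" "group5" "group3"
```

On the data frame from the question it could be applied as `df$groups <- fill_groups(df$groups)`. Falling back to 0 when every label is missing avoids the `-Inf` (and warning) that `max(..., na.rm = TRUE)` would otherwise return for an all-NA column.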