My data looks like this:
data <- data.frame(grupoaih = c("09081997", "13122006", "09081997", "22031969"),
NMM_PROC_BR = c(1, 1, 0, 1),
NMM_CID = c(0, 1, 1, 0),
CPAV_PROC_BR = c(0, 0, 0, 1),
CPAV_CID = c(1, 1, 0, 1))
  grupoaih NMM_PROC_BR NMM_CID CPAV_PROC_BR CPAV_CID
1 09081997           1       0            0        1
2 13122006           1       1            0        1
3 09081997           0       1            0        0
4 22031969           1       0            1        1
When "grupoaih" is duplicated, how can I assign the value 1 to every row of that group in each of the other 4 variables where any of those rows already has a 1, like this:
data2 <- data.frame(grupoaih = c("09081997", "13122006", "09081997", "22031969"),
NMM_PROC_BR = c(1, 1, 1, 1),
NMM_CID = c(1, 1, 1, 0),
CPAV_PROC_BR = c(0, 0, 0, 1),
CPAV_CID = c(1, 1, 1, 1))
  grupoaih NMM_PROC_BR NMM_CID CPAV_PROC_BR CPAV_CID
1 09081997           1       1            0        1
2 13122006           1       1            0        1
3 09081997           1       1            0        1
4 22031969           1       0            1        1
This only applies if grupoaih is duplicated and at least one of the duplicated rows has a 1 in the variable. If all of the duplicated rows have 0 in a variable, it stays 0.
CodePudding user response:
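You can use group_by() and then n() to check whether a group has duplicates, together with any() so that a column is only filled when at least one row in the group already holds a 1. Inside the formula, . stands for the original column and ~ marks the formula (shorthand for an anonymous function).
library(dplyr)
data %>%
  group_by(grupoaih) %>%
  # across(everything()) skips the grouping column, so all 4 variables are covered
  mutate(across(everything(), ~ if (n() > 1 && any(. == 1)) 1 else .)) %>%
  ungroup()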
# A tibble: 4 × 5
#   grupoaih NMM_PROC_BR NMM_CID CPAV_PROC_BR CPAV_CID
#   <chr>          <dbl>   <dbl>        <dbl>    <dbl>
# 1 09081997           1       1            0        1
# 2 13122006           1       1            0        1
# 3 09081997           1       1            0        1
# 4 22031969           1       0            1        1
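To see which rows the n() > 1 check picks out, here is a small base R sketch that flags every row belonging to a duplicated grupoaih. The names is_dup and group_size are just illustrative, not part of the answer above.

```r
data <- data.frame(grupoaih = c("09081997", "13122006", "09081997", "22031969"),
                   NMM_PROC_BR = c(1, 1, 0, 1),
                   NMM_CID = c(0, 1, 1, 0),
                   CPAV_PROC_BR = c(0, 0, 0, 1),
                   CPAV_CID = c(1, 1, 0, 1))

# TRUE for every row whose grupoaih occurs more than once
# (duplicated() alone would miss the first occurrence, hence fromLast = TRUE too)
is_dup <- duplicated(data$grupoaih) | duplicated(data$grupoaih, fromLast = TRUE)

# Number of rows per grupoaih, repeated on each row (what n() gives inside a group)
group_size <- ave(seq_along(data$grupoaih), data$grupoaih, FUN = length)

print(is_dup)
print(group_size)
```

Rows 1 and 3 share the same grupoaih, so they are the only ones flagged.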
CodePudding user response:
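It could also work with max() after grouping: since the variables are 0/1, the group maximum is 1 whenever any row in the group has a 1, and 0 otherwise.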
library(dplyr)
data %>%
group_by(grupoaih) %>%
mutate(across(everything(), max)) %>%
  ungroup()
Output:
# A tibble: 4 × 5
  grupoaih NMM_PROC_BR NMM_CID CPAV_PROC_BR CPAV_CID
  <chr>          <dbl>   <dbl>        <dbl>    <dbl>
1 09081997           1       1            0        1
2 13122006           1       1            0        1
3 09081997           1       1            0        1
4 22031969           1       0            1        1
Or use fmax() from the collapse package:
library(collapse)
data[-1] <- fmax(data[-1], data$grupoaih, TRA = 1)
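If you would rather not load a package, the same per-group maximum can probably be done in base R with ave(). This is only a sketch under the assumption that the variables are strictly 0/1 with no NA; filled and value_cols are illustrative names.

```r
data <- data.frame(grupoaih = c("09081997", "13122006", "09081997", "22031969"),
                   NMM_PROC_BR = c(1, 1, 0, 1),
                   NMM_CID = c(0, 1, 1, 0),
                   CPAV_PROC_BR = c(0, 0, 0, 1),
                   CPAV_CID = c(1, 1, 0, 1))

filled <- data
# Every column except the grouping key
value_cols <- setdiff(names(filled), "grupoaih")

# ave() applies max within each grupoaih and returns a vector of the original length
filled[value_cols] <- lapply(filled[value_cols],
                             function(col) ave(col, filled$grupoaih, FUN = max))

print(filled)
```

Groups with a single row are left unchanged, because the maximum of one value is that value.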