Say I have a dataset that looks like this
ID| Street | street type| Crime|
1 Main ST N
2 Main ST Y
3 Pleasant AVE Y
4 Pleasant AVE Y
5 Harris BLVD N
6 Lincoln Road Y
7 Lincoln Road Y
8 Lincoln Road Y
9 Breezy Ave Y
10 Breezy Ave N
11 Rose ST N
12 Rose ST N
13 Rose ST N
I would want something like this note that a group is set up between column "Street" and "Crime". If at least one "Y" appears in "Crime" Assign a Y for all part of the same group.
ID| Street | street type| Crime|Flag
1 Main ST N Y
2 Main ST Y Y
3 Pleasant AVE Y Y
4 Pleasant AVE Y Y
5 Harris BLVD N N
6 Lincoln Road Y Y
7 Lincoln Road Y Y
8 Lincoln Road Y Y
9 Breezy Ave Y Y
10 Breezy Ave N Y
11 Rose ST N N
12 Rose ST N N
13 Rose ST N N
CodePudding user response:
library(dplyr)
df %>%
group_by(Street) %>%
mutate(Crime = if_else(any(Crime == "Y"),"Y","N"))
# A tibble: 13 x 4
# Groups: Street [6]
ID Street street_type Crime
<int> <chr> <chr> <chr>
1 1 Main ST Y
2 2 Main ST Y
3 3 Pleasant AVE Y
4 4 Pleasant AVE Y
5 5 Harris BLVD N
6 6 Lincoln Road Y
7 7 Lincoln Road Y
8 8 Lincoln Road Y
9 9 Breezy Ave Y
10 10 Breezy Ave Y
11 11 Rose ST N
12 12 Rose ST N
13 13 Rose ST N
-- data
structure(list(ID = 1:13, Street = c("Main", "Main", "Pleasant",
"Pleasant", "Harris", "Lincoln", "Lincoln", "Lincoln", "Breezy",
"Breezy", "Rose", "Rose", "Rose"), street_type = c("ST", "ST",
"AVE", "AVE", "BLVD", "Road", "Road", "Road", "Ave", "Ave", "ST",
"ST", "ST"), Crime = c("N", "Y", "Y", "Y", "N", "Y", "Y", "Y",
"Y", "N", "N", "N", "N")), class = "data.frame", row.names = c(NA,
-13L))
CodePudding user response:
try any
df %>%
group_by(Street) %>%
mutate(Flag = as.character(any(Crime == "Y"))) %>%
mutate(Flag = recode(Flag, "TRUE" = "Y", "FALSE" = "N"))
Street Crime Flag
<chr> <chr> <chr>
1 Main N Y
2 Main Y Y
3 Pleasant Y Y
4 Pleasant Y Y
5 Harris N N
CodePudding user response:
Or use max
since Y
is bigger than N
:
df %>%
group_by(Street) %>%
mutate(Flag = max(Crime))
This is the shortest one here, and keeps your original data structure:
ID Street street_type Crime Flag
<int> <chr> <chr> <chr> <chr>
1 1 Main ST N Y
2 2 Main ST Y Y
3 3 Pleasant AVE Y Y
4 4 Pleasant AVE Y Y
5 5 Harris BLVD N N
6 6 Lincoln Road Y Y
7 7 Lincoln Road Y Y
8 8 Lincoln Road Y Y
9 9 Breezy Ave Y Y
10 10 Breezy Ave N Y
11 11 Rose ST N N
12 12 Rose ST N N
13 13 Rose ST N N
CodePudding user response:
Here is a base R option -
df$Flag <- 'N'
df$Flag[df$Street %in% unique(df$Street[df$Crime == 'Y'])] <- 'Y'
df
# ID Street street_type Crime Flag
#1 1 Main ST N Y
#2 2 Main ST Y Y
#3 3 Pleasant AVE Y Y
#4 4 Pleasant AVE Y Y
#5 5 Harris BLVD N N
#6 6 Lincoln Road Y Y
#7 7 Lincoln Road Y Y
#8 8 Lincoln Road Y Y
#9 9 Breezy Ave Y Y
#10 10 Breezy Ave N Y
#11 11 Rose ST N N
#12 12 Rose ST N N
#13 13 Rose ST N N
Initialise the Flag
column to 'N'
. Change the Flag
column to 'Y'
which has Street
value that as any 'Y'
value.