Home > front end >  Create a flag value in a new column if a combination between two columns has at least one "Y&qu
Create a flag value in a new column if a combination between two columns has at least one "Y&qu

Time:09-28

Say I have a dataset that looks like this

ID| Street   | street type| Crime|
 1  Main       ST            N
 2  Main       ST            Y
 3  Pleasant   AVE           Y
 4  Pleasant   AVE           Y
 5  Harris     BLVD          N
 6  Lincoln    Road          Y
 7  Lincoln    Road          Y
 8  Lincoln    Road          Y
 9  Breezy     Ave           Y 
10  Breezy     Ave           N
11  Rose       ST            N    
12  Rose       ST            N    
13  Rose       ST            N    

I would want something like this note that a group is set up between column "Street" and "Crime". If at least one "Y" appears in "Crime" Assign a Y for all part of the same group.

ID| Street   | street type| Crime|Flag
 1  Main       ST            N    Y
 2  Main       ST            Y    Y
 3  Pleasant   AVE           Y    Y
 4  Pleasant   AVE           Y    Y
 5  Harris     BLVD          N    N
 6  Lincoln    Road          Y    Y
 7  Lincoln    Road          Y    Y
 8  Lincoln    Road          Y    Y
 9  Breezy     Ave           Y    Y
10  Breezy     Ave           N    Y
11  Rose       ST            N    N
12  Rose       ST            N    N
13  Rose       ST            N    N

CodePudding user response:

library(dplyr)

df %>% 
  group_by(Street) %>% 
  mutate(Crime = if_else(any(Crime == "Y"),"Y","N"))

# A tibble: 13 x 4
# Groups:   Street [6]
      ID Street   street_type Crime
   <int> <chr>    <chr>       <chr>
 1     1 Main     ST          Y    
 2     2 Main     ST          Y    
 3     3 Pleasant AVE         Y    
 4     4 Pleasant AVE         Y    
 5     5 Harris   BLVD        N    
 6     6 Lincoln  Road        Y    
 7     7 Lincoln  Road        Y    
 8     8 Lincoln  Road        Y    
 9     9 Breezy   Ave         Y    
10    10 Breezy   Ave         Y    
11    11 Rose     ST          N    
12    12 Rose     ST          N    
13    13 Rose     ST          N 

-- data

structure(list(ID = 1:13, Street = c("Main", "Main", "Pleasant", 
"Pleasant", "Harris", "Lincoln", "Lincoln", "Lincoln", "Breezy", 
"Breezy", "Rose", "Rose", "Rose"), street_type = c("ST", "ST", 
"AVE", "AVE", "BLVD", "Road", "Road", "Road", "Ave", "Ave", "ST", 
"ST", "ST"), Crime = c("N", "Y", "Y", "Y", "N", "Y", "Y", "Y", 
"Y", "N", "N", "N", "N")), class = "data.frame", row.names = c(NA, 
-13L))

CodePudding user response:

try any

df %>%
  group_by(Street) %>%
  mutate(Flag = as.character(any(Crime == "Y"))) %>%
  mutate(Flag = recode(Flag, "TRUE" = "Y", "FALSE" = "N"))

  Street   Crime Flag 
  <chr>    <chr> <chr>
1 Main     N     Y    
2 Main     Y     Y    
3 Pleasant Y     Y    
4 Pleasant Y     Y    
5 Harris   N     N 

CodePudding user response:

Or use max since Y is bigger than N:

df %>%
   group_by(Street) %>%
   mutate(Flag = max(Crime))

This is the shortest one here, and keeps your original data structure:

      ID Street   street_type Crime Flag 
   <int> <chr>    <chr>       <chr> <chr>
 1     1 Main     ST          N     Y    
 2     2 Main     ST          Y     Y    
 3     3 Pleasant AVE         Y     Y    
 4     4 Pleasant AVE         Y     Y    
 5     5 Harris   BLVD        N     N    
 6     6 Lincoln  Road        Y     Y    
 7     7 Lincoln  Road        Y     Y    
 8     8 Lincoln  Road        Y     Y    
 9     9 Breezy   Ave         Y     Y    
10    10 Breezy   Ave         N     Y    
11    11 Rose     ST          N     N    
12    12 Rose     ST          N     N    
13    13 Rose     ST          N     N   

CodePudding user response:

Here is a base R option -

df$Flag <- 'N'
df$Flag[df$Street %in% unique(df$Street[df$Crime == 'Y'])] <- 'Y'
df

#   ID   Street street_type Crime Flag
#1   1     Main          ST     N    Y
#2   2     Main          ST     Y    Y
#3   3 Pleasant         AVE     Y    Y
#4   4 Pleasant         AVE     Y    Y
#5   5   Harris        BLVD     N    N
#6   6  Lincoln        Road     Y    Y
#7   7  Lincoln        Road     Y    Y
#8   8  Lincoln        Road     Y    Y
#9   9   Breezy         Ave     Y    Y
#10 10   Breezy         Ave     N    Y
#11 11     Rose          ST     N    N
#12 12     Rose          ST     N    N
#13 13     Rose          ST     N    N

Initialise the Flag column to 'N'. Change the Flag column to 'Y' which has Street value that as any 'Y' value.

  • Related