Check if comma delimited column contains a value

Time:06-11

I have an R data frame in which one column holds comma-delimited strings. I want to add a new column that shows whether each string contains a particular value.

For example:

> data <- data.frame(a = 1:5, b = c("123", "6475,320", "475", "905,1204,543", "567,475"))
> data
  a            b
1 1          123
2 2     6475,320
3 3          475
4 4 905,1204,543
5 5      567,475

I want to create a new column indicating whether b contains 475 as one of its comma-separated values (so 6475 should not count), which would leave me with:

  a            b has_475
1 1          123   FALSE
2 2     6475,320   FALSE
3 3          475    TRUE
4 4 905,1204,543   FALSE
5 5      567,475    TRUE

CodePudding user response:

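You can wrap the number in word boundaries, '\b' (written '\\b' inside an R string). This ensures that longer numbers containing 475, such as 1475 or 24756, are not matched. The output below includes an extra sixth row (b = 1475) to show this.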

data$has_475 <- grepl('\\b475\\b', data$b)
data
  a            b has_475
1 1          123   FALSE
2 2     6475,320   FALSE
3 3          475    TRUE
4 4 905,1204,543   FALSE
5 5      567,475    TRUE
6 6         1475   FALSE
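
To see the boundary behaviour in isolation, here is a small check. The vector x is made up for illustration and is not part of the question's data. It also shows one caveat: '\b' treats any non-word character as a boundary, not only the comma, so a value like 4.475 would match as well.

```r
# Hypothetical test values, chosen to probe the pattern
x <- c("475", "1475", "24756", "567,475", "475,12", "4.475")

# \\b matches between a word character (letter, digit, _) and anything else
res <- grepl("\\b475\\b", x)
res
# [1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE
```

If the column only ever holds digits and commas, as in the question, this caveat does not apply.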

CodePudding user response:

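You can use this regular expression, which matches 475 only when it is preceded by the start of the string or a comma and followed by a comma or the end of the string: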

data$has_475 <- grepl("(^|,)475(,|$)", data$b)

Output:

  a            b has_475
1 1          123   FALSE
2 2     6475,320   FALSE
3 3          475    TRUE
4 4 905,1204,543   FALSE
5 5      567,475    TRUE
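
As an alternative to a regular expression (a different technique from the two answers above), the strings can be split on commas and each piece compared exactly. This is only a sketch: it assumes the values contain no spaces around the commas, and as.character() is there in case b was stored as a factor.

```r
data <- data.frame(a = 1:5,
                   b = c("123", "6475,320", "475", "905,1204,543", "567,475"))

# Split each string into its comma-separated pieces
pieces <- strsplit(as.character(data$b), ",", fixed = TRUE)

# TRUE when "475" is exactly one of the pieces in that row
data$has_475 <- vapply(pieces, function(p) "475" %in% p, logical(1))
data$has_475
# [1] FALSE FALSE  TRUE FALSE  TRUE
```

Because the comparison is on whole pieces, 6475 and 1475 can never match, and the value being searched for needs no regex escaping.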