I have a data set like the one below:
ID | col1 | col2 |
---|---|---|
1 | 042 | 10 |
2 | 353 | 13 |
3 | 403 | 03 |
4 | 642 | 22 |
I want to remove the rows whose value in col1 starts with 4, ignoring leading zeros. This removes row 1 (042) and row 3 (403), but not row 4 (642). col1 is a character column.
The final data set should look like this:
ID | col1 | col2 |
---|---|---|
2 | 353 | 13 |
4 | 642 | 22 |
Thanks!
CodePudding user response:
You may try
library(dplyr)
df %>%
filter(substring(as.numeric(col1),1,1) != "4")
ID col1 col2
1 2 353 13
2 4 642 22
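To see why this works, here is a minimal base R sketch of the intermediate steps, assuming every value in col1 is a string of digits (anything else would become NA in as.numeric). The vector name col1 and the helper names first_digit and keep are only for illustration.

```r
# Sample values from the question, stored as character strings
col1 <- c("042", "353", "403", "642")

# Converting to numeric drops any leading zeros: 42 353 403 642
nums <- as.numeric(col1)

# substring() coerces the numbers back to text and takes the first character
first_digit <- substring(nums, 1, 1)   # "4" "3" "4" "6"

# Rows to keep are those whose first significant digit is not 4
keep <- first_digit != "4"             # FALSE TRUE FALSE TRUE
```

Because the comparison happens after the numeric conversion, "042" and "403" are treated the same way, which is what the question asks for.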
CodePudding user response:
We can combine str_detect with filter, using the regex '^0+4|^4', which matches values that start with one or more 0s followed by a 4, or that start with 4 directly.
code:
library(tidyverse)
df <- read_table("ID col1 col2
1 042 10
2 353 13
3 403 03
4 642 22")
df %>%
  filter(!str_detect(col1, '^0+4|^4'))
#> # A tibble: 2 × 3
#> ID col1 col2
#> <dbl> <chr> <chr>
#> 1 2 353 13
#> 2 4 642 22
Created on 2021-11-24 by the reprex package (v2.0.1)
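If you would rather not load any packages, the same idea can be sketched in base R with grepl. This is only an illustration, not part of the answer above: the data frame is rebuilt by hand from the question, and the pattern '^0*4' (zero or more 0s followed by a 4) is used as a shorter equivalent of "leading zeros, then a 4".

```r
# Rebuild the example data with col1 and col2 as character columns
df <- data.frame(
  ID   = 1:4,
  col1 = c("042", "353", "403", "642"),
  col2 = c("10", "13", "03", "22"),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE where col1 starts with optional zeros followed by 4;
# negating it keeps only the rows that do not match
result <- df[!grepl("^0*4", df$col1), ]

print(result)
```

This keeps the rows with ID 2 and 4, the same rows as in the desired output.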