I have a data set that looks like this:
library(dplyr)
vec = c(989, 987, 145, 315, 8449, 9999999999000)
char = c("a","b","c","d","e","f")
df2 = tibble(vec, char); df2
# A tibble: 6 × 2
            vec char
          <dbl> <chr>
1           989 a
2           987 b
3           145 c
4           315 d
5          8449 e
6 9999999999000 f
I want to remove the rows where the value in the column vec has 5 or more digits. Ideally, the result should look like this:
1 989 a
2 987 b
3 145 c
4 315 d
5 8449 e
How can I do this in R using dplyr?
CodePudding user response:
Use nchar in base R, keeping the rows whose value has fewer than 5 characters:
subset(df2, nchar(vec) < 5)
Or use filter from dplyr:
library(dplyr)
filter(df2, nchar(vec) < 5)
# A tibble: 5 × 2
    vec char
  <dbl> <chr>
1   989 a
2   987 b
3   145 c
4   315 d
5  8449 e
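If there are decimals, drop the fractional part before counting. Note that as.integer() returns NA (with a warning) for values beyond the integer range of about 2.1e9, such as 9999999999000, so trunc() is the safer choice:
filter(df2, nchar(trunc(vec)) < 5)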
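As a possible alternative that avoids converting numbers to text, the rule can be written as a numeric comparison: the integer part of a number has fewer than 5 digits when its absolute value is below 10000. This is only a minimal sketch and not part of the answer above. The helper name has_fewer_digits and the data frame df_demo are made up for illustration, and it assumes that only the integer part should be counted and that a minus sign is not a digit. It uses base R only, so it runs without any packages.

```r
# Hypothetical helper: TRUE when the integer part of x has fewer than n digits.
# trunc() drops any fractional part and abs() ignores the sign.
has_fewer_digits <- function(x, n = 5) {
  abs(trunc(x)) < 10^(n - 1)
}

# Illustrative data: some of the question's values plus a few edge cases
df_demo <- data.frame(
  vec  = c(989, 8449, 9999.9, 10000, -12345, 9999999999000),
  char = c("a", "b", "c", "d", "e", "f")
)

# Keep only the rows whose value passes the check
kept <- subset(df_demo, has_fewer_digits(vec))
print(kept)
```

With dplyr the same predicate should work inside filter, for example filter(df2, has_fewer_digits(vec)).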
CodePudding user response:
You could also try:
library(tidyverse)
# keep the rows that contain no run of 5 or more consecutive digits
df2 %>% filter(str_count(vec, '\\d{5,}') == 0)
# A tibble: 5 × 2
    vec char
  <dbl> <chr>
1   989 a
2   987 b
3   145 c
4   315 d
5  8449 e
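To see what the pattern \\d{5,} picks up, the sketch below applies it to a few strings with base R's grepl. The strings are illustrative and not taken from the question's data, and grepl with perl = TRUE stands in here for the stringr call used above. The last value is included to show one caveat: a run of 5 digits after a decimal point also counts as a match.

```r
# Illustrative strings, not the question's data
x <- c("989", "8449", "12345", "9999999999000", "1234.56789")

# \\d{5,} matches a run of at least five consecutive digits
has_long_run <- grepl("\\d{5,}", x, perl = TRUE)
print(has_long_run)

# Keep only the values without such a run
short_values <- x[!has_long_run]
print(short_values)
```

If values with a fractional part can occur, it may be safer to truncate them before matching.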
CodePudding user response:
Please check:
library(dplyr)
tibble(vec, char) %>% filter(nchar(vec) <= 4)
Created on 2023-01-25 with reprex v2.0.2
# A tibble: 5 × 2
    vec char
  <dbl> <chr>
1   989 a
2   987 b
3   145 c
4   315 d
5  8449 e