Countries <- c("AAAAAAA", sample(c("India","USA","UK","SSS"),20,replace=TRUE))
df1 <- data.frame(Countries)
df1
I want to check if a string contains same character
library(stringi)
df1$All_S<-stri_count_fixed(df1$Countries,"S")==nchar(df1$Countries)
df1
   Countries All_S
1    AAAAAAA FALSE
2         UK FALSE
3      India FALSE
4      India FALSE
5        SSS  TRUE
6         UK FALSE
7        SSS  TRUE
8        SSS  TRUE
9      India FALSE
10       SSS  TRUE
11        UK FALSE
12     India FALSE
13       SSS  TRUE
14       USA FALSE
15        UK FALSE
16        UK FALSE
17       SSS  TRUE
18       SSS  TRUE
19     India FALSE
20       USA FALSE
21       USA FALSE
However, this only checks whether the string consists entirely of "S". How can I change it to work for any repeated character? In the example above, that means the first entry, AAAAAAA, should also be TRUE.
CodePudding user response:
Another possibility:
sapply(strsplit(df1$Countries, ""), function(x) all(x == x[1]))
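A slightly more defensive version of the same idea, as a rough sketch. The helper name `all_same_char` is made up for illustration. It uses `vapply` so the result is always a logical vector, converts the input with `as.character` in case `Countries` is stored as a factor (which `strsplit` would reject), and treats the empty string as FALSE, whereas the one-liner above returns TRUE for "":

```r
# Hypothetical helper, not part of the original answer.
all_same_char <- function(x) {
  # Split each string into its individual characters
  chars <- strsplit(as.character(x), "")
  # TRUE only when there is at least one character and all match the first
  vapply(chars, function(ch) length(ch) > 0 && all(ch == ch[1]), logical(1))
}

res <- all_same_char(c("AAAAAAA", "SSS", "USA", "A", ""))
print(res)
# [1]  TRUE  TRUE FALSE  TRUE FALSE
```

On the question's data this would be used as df1$All_same <- all_same_char(df1$Countries).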
CodePudding user response:
You can try this
transform(
  df1,
  All_same = grepl("^(.)(\\1)+$", Countries)
)
which gives something like
   Countries All_same
1    AAAAAAA     TRUE
2         UK    FALSE
3         UK    FALSE
4         UK    FALSE
5        USA    FALSE
6         UK    FALSE
7        USA    FALSE
8        USA    FALSE
9        USA    FALSE
10        UK    FALSE
11     India    FALSE
12       SSS     TRUE
13       USA    FALSE
14       USA    FALSE
15     India    FALSE
16       USA    FALSE
17        UK    FALSE
18       SSS     TRUE
19     India    FALSE
20        UK    FALSE
21        UK    FALSE
CodePudding user response:
I would use grepl here:
df1$All_S <- grepl("^S+$", df1$Countries)
To do the same for any letter, use:
df1$All_S <- grepl("^(\\w)\\1*$", df1$Countries)
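Two details of that pattern are worth checking against your data: `\\w` only matches word characters (letters, digits, underscore), and `*` also accepts a string that is just one character long. Here is a small sketch on a made-up test vector (the vector and the variable names are only for illustration), comparing it with `.` and with `+`:

```r
# Made-up test strings, not from the question's data
x <- c("AAAAAAA", "SSS", "A", "--", "USA", "")

word_star <- grepl("^(\\w)\\1*$", x)  # one word character, repeated zero or more times
any_star  <- grepl("^(.)\\1*$", x)    # any character, repeated zero or more times
any_plus  <- grepl("^(.)\\1+$", x)    # any character, repeated at least once more

print(data.frame(x, word_star, any_star, any_plus))
# word_star: TRUE TRUE TRUE  FALSE FALSE FALSE
# any_star:  TRUE TRUE TRUE  TRUE  FALSE FALSE
# any_plus:  TRUE TRUE FALSE TRUE  FALSE FALSE
```

So "--" only counts with `.`, and a single "A" only counts with `*`. Which one is right depends on what you want for those edge cases.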
CodePudding user response:
"I want to check if a string contains same character" -- this leaves open the possibility that
- (i) the string contains other characters as well (e.g., "AAB")
- (ii) the same characters can, but need not, be adjacent (e.g., "ABA")
- (iii) the string contains only the same character (e.g., "AAA")
The three conditions call for slightly different regex solutions:
solution (i):
library(dplyr)
library(stringr)
df1 %>%
  mutate(Same = str_detect(Countries, "(.)\\1+"))
solution (ii):
df1 %>%
  mutate(Same = str_detect(Countries, "(.).*\\1+"))
solution (iii):
df1 %>%
  mutate(Same = str_detect(Countries, "^(.)\\1+$"))
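To see how the three readings differ, here is a quick sketch on four made-up strings. It uses base R `grepl` with `perl = TRUE` instead of `str_detect`, purely so it runs without extra packages; that substitution and the variable names are my own choices for illustration:

```r
# Made-up test strings, one per case discussed above plus a negative control
x <- c("AAB", "ABA", "AAA", "ABC")

adjacent  <- grepl("(.)\\1+", x, perl = TRUE)    # (i) a character repeated side by side
anywhere  <- grepl("(.).*\\1+", x, perl = TRUE)  # (ii) a character occurring again later
only_same <- grepl("^(.)\\1+$", x, perl = TRUE)  # (iii) nothing but one repeated character

print(data.frame(x, adjacent, anywhere, only_same))
# adjacent:  TRUE  FALSE TRUE FALSE
# anywhere:  TRUE  TRUE  TRUE FALSE
# only_same: FALSE FALSE TRUE FALSE
```

Only pattern (iii) matches what the question's expected output describes, where "AAAAAAA" and "SSS" are TRUE and everything else is FALSE.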