Hi, I am trying to count the number of unique characters in each string of a column. Here is what my dataframe looks like:
Text      unique char count
banana    3
banana12  5
Ace@343   6
Upper/lower case doesn't matter. What I am trying to get in the output is the number of unique characters (numbers, letters) in each string.
I have tried unique, distinct, etc. However, they give the result for the entire column, and I need it for each corresponding cell, as shown above.
CodePudding user response:
In base R you can do:
df$char_count <- sapply(strsplit(df$Text, ""), function(x) length(unique(x)))
df
#> Text char_count
#> 1 banana 3
#> 2 banana12 5
#> 3 Ace@343 6
Data
df <- data.frame(Text = c("banana", "banana12", "Ace@343"))
Created on 2021-11-12 by the reprex package (v2.0.0)
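If "case doesn't matter" is meant as "A and a should count as the same character" (my reading of the question, not something the answer above handles), a minimal sketch is to lowercase the text before splitting. The name count_unique_chars is a hypothetical helper made up for illustration, not a function from any package.

```r
# Sample data copied from the question
df <- data.frame(Text = c("banana", "banana12", "Ace@343"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: count distinct characters per string, ignoring case
count_unique_chars <- function(x) {
  # Lowercase first so "A" and "a" collapse into one character
  chars <- strsplit(tolower(x), "")
  # vapply guarantees one integer per input string
  vapply(chars, function(ch) length(unique(ch)), integer(1))
}

df$char_count <- count_unique_chars(df$Text)
df$char_count            # 3 5 6

# With case ignored, "Aa" has one unique character instead of two
count_unique_chars("Aa") # 1
```

For the sample data the counts are the same as above, because none of the strings contains the same letter in both cases.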
CodePudding user response:
You could directly use regex to do the count:
library(dplyr)
library(stringr)
df %>%
  mutate(char_count = str_count(Text, "(.)(?!.*\\1)"))
Text char_count
1 banana 3
2 banana12 5
3 Ace@343 6
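To see why this pattern works: (.) captures one character, and the negative lookahead (?!.*\\1) rejects it if the same character appears again later in the string, so only the last occurrence of each character is matched. Here is a rough sketch that makes the matches visible. Note that I have swapped in base R's gregexpr with perl = TRUE instead of stringr, only so the example runs without extra packages. The variable names x and kept are just for illustration.

```r
x <- c("banana", "banana12", "Ace@343")

# Each match is a character that has no later duplicate in the string,
# i.e. the last occurrence of every distinct character
kept <- regmatches(x, gregexpr("(.)(?!.*\\1)", x, perl = TRUE))

kept[[1]]     # "b" "n" "a"
lengths(kept) # 3 5 6
```

The pattern is case-sensitive as written, so "A" and "a" are counted separately. Wrapping the input in tolower() first is one way to ignore case.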