Trying to extract/count the unique characters in a string (of class character)

Time:11-13

Hi, what I am trying to do is count the number of unique characters in each string. Here is what my data frame looks like, with the count I want in the second column:

Text            unique char count
banana              3
banana12            5
Ace@343             6

Upper/lower case doesn't matter; what I want in the output is the number of unique characters (numbers, letters) in each string.

I have tried functions such as unique and distinct, but they return the result for the entire column, whereas I need it for each corresponding cell, as shown above.

CodePudding user response:

In base R you can do:

df$char_count <- sapply(strsplit(df$Text, ""), function(x) length(unique(x)))

df
#>       Text char_count
#> 1   banana          3
#> 2 banana12          5
#> 3  Ace@343          6

Data

df <- data.frame(Text = c("banana", "banana12", "Ace@343"))

Created on 2021-11-12 by the reprex package (v2.0.0)
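
If "case doesn't matter" in the question means that "A" and "a" should count as the same character, the base R approach above can be adapted by lower-casing the text before splitting. This is only a sketch under that assumption; `count_unique_chars` and its `ignore_case` argument are hypothetical names, not part of the original answer.

```r
# Hypothetical helper: count distinct characters in each element of x.
# Assumption: with ignore_case = TRUE, "A" and "a" are treated as one character.
count_unique_chars <- function(x, ignore_case = TRUE) {
  x <- as.character(x)               # also covers factor columns
  if (ignore_case) x <- tolower(x)   # fold case before counting
  # Split each string into single characters and count the distinct ones
  vapply(strsplit(x, ""), function(ch) length(unique(ch)), integer(1))
}

df <- data.frame(Text = c("banana", "banana12", "Ace@343"))
df$char_count <- count_unique_chars(df$Text)
```

For the three strings in the question the counts come out the same either way (3, 5, 6), since none of them mixes the upper and lower case of the same letter; the difference only shows on input such as "AaBb".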

CodePudding user response:

You could use a regex directly to do the count:

library(dplyr)
library(stringr)

df %>%
   mutate(char_count = str_count(Text, "(.)(?!.*\\1)"))

      Text char_count
1   banana          3
2 banana12          5
3  Ace@343          6
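
How the pattern works: `(.)` captures a single character, and the negative lookahead `(?!.*\\1)` only lets the match succeed if that same character does not appear again later in the string. So exactly one occurrence (the last) of each distinct character is matched, and the number of matches is the number of unique characters. As a rough sketch, the same idea should also work in base R with `perl = TRUE`; the function name `count_unique_regex` is made up for illustration.

```r
# Hypothetical helper: count unique characters via the lookahead pattern,
# using base R only (perl = TRUE is needed for lookahead support).
count_unique_regex <- function(x) {
  x <- as.character(x)
  # Find every character that is NOT followed later by a copy of itself
  m <- gregexpr("(.)(?!.*\\1)", x, perl = TRUE)
  # regmatches() returns one character vector of matches per input string
  lengths(regmatches(x, m))
}

counts <- count_unique_regex(c("banana", "banana12", "Ace@343"))
counts
#> [1] 3 5 6
```

Note that the match is case-sensitive as written; wrapping the input in `tolower()` first would be one way to ignore case.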