Home > Back-end >  Get a match when there are duplicate letters in a string
Get a match when there are duplicate letters in a string

Time:10-18

I have a list of inputs in google sheets,

Input Desired Output "To demonstrate only not an input" The repeated letters
Outdoors Match o
dog No Match
step No Match
bee Match e
Chessboard Match s
Cookbooks Match o, k

How do I verify if all letters are unique in a string without splitting it?

My process so far

I tried enter image description here


update

create a query heat map, filter it and vlookup back row position

=INDEX(LAMBDA(a, IF(""<>IFNA(VLOOKUP(ROW(a), 
 SPLIT(QUERY(QUERY(FLATTEN(ROW(a)&"​"&REGEXEXTRACT(a, REPT("(.)", LEN(a)))), 
 "select Col1,count(Col1) where Col1 matches '.*\w $' group by Col1"), 
 "select Col1 where Col2 > 1", ), "​"), 2, )), "Match", "No Match"))
 (A2:INDEX(A:A, MAX((A:A<>"")*ROW(A:A)))))

enter image description here

case insensitive would be:

=INDEX(LAMBDA(a, IF(""<>IFNA(VLOOKUP(ROW(a), 
 SPLIT(QUERY(QUERY(FLATTEN(ROW(a)&"​"&LOWER(REGEXEXTRACT(a, REPT("(.)", LEN(a))))), 
 "select Col1,count(Col1) where Col1 matches '.*\w $' group by Col1"), 
 "select Col1 where Col2 > 1", ), "​"), 2, )), "Match", "No Match"))
 (A2:INDEX(A:A, MAX((A:A<>"")*ROW(A:A)))))

enter image description here

CodePudding user response:

Uses , but it's efficient. Loop through each row and every character with MAP and REDUCE. REPLACE each character in the word and find the difference in length. If more than 1, don't check length anymore and return Match

=MAP(
  A2:INDEX(A2:A,COUNTA(A2:A)),
  LAMBDA(_,
    REDUCE(
      "No Match",
      SEQUENCE(LEN(_)),
      LAMBDA(a,c,
        IF(a="Match",a,
          IF(
            LEN(_)-LEN(
              REGEXREPLACE(_,"(?i)"&MID(_,c,1),)
            )>1,
            "Match",a
          )
        )
      )
    )
  )
)

If you do run into lambda limitations, remove the MAP and drag fill the REDUCE formula.

=REDUCE("No Match",SEQUENCE(LEN(A2)),LAMBDA(a,c,IF(a="Match",a,IF(LEN(A2)-LEN(REGEXREPLACE(A2, "(?i)"&MID(A2,c,1),))>1,"Match",a))))

The latter is preferred for conditional formatting as well.

CodePudding user response:

You are explicitly asking for an answer using a single regular expression. Unfortunately there is no such thing as a backreference to a former capture group using RE2. So if you'd spell out the answer to your problem it would look like:

=INDEX(IF(A2:A="","",REGEXMATCH(A2:A,"(?i)(?:a.*a|b.*b|c.*c|d.*d|e.*e|f.*f|g.*g|h.*h|i.*i|j.*j|k.*k|l.*l|m.*m|n.*n|o.*o|p.*p|q.*q|r.*r|s.*s|t.*t|u.*u|v.*v|w.*w|x.*x|y.*y|z.*z)")))

Since you are looking for case-insensitive matching (?i) modifier will help to cut down the options to just the 26 letters of the alphabet. I suppose the above can be written a bit neater like:

=INDEX(IF(A2:A="","",REGEXMATCH(A2:A,"(?i)(?:"&TEXTJOIN("|",1,REPLACE(REPT(CHAR(SEQUENCE(26,1,65)),2),2,0,".*"))&")")))
  • Related