Home > Back-end >  Diacritic normalization in excel formula, without vba
Diacritic normalization in excel formula, without vba

Time:03-11

How to replace diacritics (accented characters) with corresponding alphabetical character with regular formula: =SUBSTITUTE(A1,"é","e"), without VBA, for large range of uppercase and lowercase diacritics like this:

áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâ =a

ÁÀȦÄǍĀÃÅĄȺẤẦẮẰǠǺǞẪẴẢȀȂẨẲẠḀẬẶĂÂ = A

ḃƀɓḅḇ =b

ḂɃƁḄḆ = B

ćċĉčçȼḉƈɔ =c

ĆĊĈČÇȻḈƇƆ = C

ḋďḑđɗḍḓḏðɖ =d

ḊĎḐĐƊḌḒḎÐƉ =D

etc.

CodePudding user response:

With Excel 2019 (the version you are using) this is quite a stretch. I'd suggest something along the following lines:

enter image description here

Formula in C1:

=CONCAT(IFERROR(IF(EXACT(UPPER(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),UPPER(MID(A2,SEARCH(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),A1),1)),MID(A2,FIND(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),A1),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)))

Understandig the limit of SEARCH() (255 chars) I opted to only include the lowercase variants and add the extra needed intelligence inside the formulae through functions like IF(), EXACT() and UPPER(). Again, this is an CSE-entered array formula.


EDIT: Without helper:

=CONCAT(IFERROR(IF(EXACT(UPPER(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),UPPER(MID("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccdddddddddd",SEARCH(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),"áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖ"),1)),MID("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccdddddddddd",FIND(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),"áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖ"),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)))
  • Related