How to uppercase only Latin characters, leaving others as they are?
I want to sort objects. Here is my code, which sorts case-insensitively:
objects = ['факультет', 'Worm', 'Фонарь', 'word', 'Фонтан']
objects.sort(key=lambda x: x.upper(), reverse=False)
print(objects)
But I need Latin strings sorted case-insensitively and non-Latin strings sorted case-sensitively.
This is what I get:
['word', 'Worm', 'факультет', 'Фонарь', 'Фонтан']
Latin and non-Latin sorted case-insensitive
This is what I'm trying to match:
['word', 'Worm', 'Фонарь', 'Фонтан', 'факультет']
Latin sorted case-insensitive, non-Latin sorted case-sensitive
I know this is exactly the behavior of Python 2, but I use Python 3.
CodePudding user response:
Treat the strings differently in your key function. If the first character (I assume that's enough to check) is an ASCII letter, apply upper(); otherwise take the string as it is.
import string
objects = ['факультет', 'Worm', 'Фонарь', 'word', 'Фонтан']
objects.sort(key=lambda x: x.upper() if x[0] in string.ascii_letters else x, reverse=False)
print(objects)
This will give you ['word', 'Worm', 'Фонарь', 'Фонтан', 'факультет']
If you want to check whether all characters are ASCII letters, use all():
objects.sort(key=lambda x: x.upper() if all(c in string.ascii_letters for c in x) else x, reverse=False)
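One caveat with the first-character check: x[0] raises IndexError on an empty string, whereas all() over an empty string simply returns True. A minimal sketch of the same idea as a named key function with a guard for that case; the helper name latin_insensitive_key and the extra empty string in the list are my own additions for illustration, not part of the question:

```python
import string

def latin_insensitive_key(s):
    """Hypothetical helper: uppercase s only if it starts with an ASCII letter."""
    # The `s and ...` guard avoids IndexError on empty strings.
    if s and s[0] in string.ascii_letters:
        return s.upper()
    # Non-Latin (or empty) strings are compared as they are, i.e. case-sensitively.
    return s

# The empty string is added only to exercise the guard.
objects = ['факультет', 'Worm', 'Фонарь', 'word', 'Фонтан', '']
result = sorted(objects, key=latin_insensitive_key)
print(result)
# ['', 'word', 'Worm', 'Фонарь', 'Фонтан', 'факультет']
```

The empty string sorts first because its key is the empty string, which compares lower than any non-empty string.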
CodePudding user response:
One way:
from string import ascii_lowercase as abc, ascii_uppercase as ABC
table = str.maketrans(abc, ABC)
objects.sort(key=lambda s: s.translate(table))
Another:
import re
objects.sort(key=lambda s: re.sub('[a-z]+', lambda m: m[0].upper(), s))
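Unlike a check on the first character, both of these approaches work per character, so strings that mix scripts get only their ASCII letters uppercased. A small sketch of the translate variant applied to the list from the question; the names ASCII_UPPER and ascii_upper, and the mixed-script sample string, are assumptions made for illustration:

```python
from string import ascii_lowercase, ascii_uppercase

# Translation table mapping a-z to A-Z; every other character is left untouched.
ASCII_UPPER = str.maketrans(ascii_lowercase, ascii_uppercase)

def ascii_upper(s):
    """Hypothetical helper: uppercase only the ASCII letters in s."""
    return s.translate(ASCII_UPPER)

objects = ['факультет', 'Worm', 'Фонарь', 'word', 'Фонтан']
result = sorted(objects, key=ascii_upper)
print(result)
# ['word', 'Worm', 'Фонарь', 'Фонтан', 'факультет']

# Mixed-script input: only the Latin part changes.
mixed = ascii_upper('wordфонарь')
print(mixed)
# WORDфонарь
```

Building the table once outside the key function means it is not recreated for every element being sorted.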