I have a list of car brands in upper case ('MERCEDES-BENZ', 'BMW', 'CHEVROLET', 'MG', 'FORD', etc.). What is the best way to get the formal names, like:
MERCEDES-BENZ ===> Mercedes-Benz
BMW ===> BMW
CHEVROLET ===> Chevrolet
MG ===> MG
JOSS ===> JOSS
I am thinking of something using spaCy, but couldn't find a proper way.
Edit: for people suggesting a basic solution (a loop and an if statement), I obviously wouldn't ask if I wanted that. The original list is quite big and contains many brands from all over the world, each with its own formal name. I know this is not an easy task, but I was thinking maybe someone has done the same thing with spaCy, for example, or with another library that I don't know. Thank you.
CodePudding user response:
From the sample list of brands, this list comprehension might work:
lis = ['MERCEDES-BENZ', 'BMW', 'CHEVROLET', 'MG', 'JOSS']
print([e.title() if len(e)>4 else e for e in lis])
Output
['Mercedes-Benz', 'BMW', 'Chevrolet', 'MG', 'JOSS']
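Note that the length cutoff is a heuristic fitted to this sample only. As a quick check, here is the same rule wrapped in a helper (`by_length` is a hypothetical name, not part of the original answer) and applied to 'FORD', which the question also mentions:

```python
def by_length(name, cutoff=4):
    # Same rule as the comprehension above: title-case only names longer than cutoff.
    return name.title() if len(name) > cutoff else name

# 'FORD' has exactly 4 characters, so the rule leaves it upper-cased.
print(by_length('FORD'))       # FORD
print(by_length('CHEVROLET'))  # Chevrolet
```

So a short name that is not an acronym stays in upper case, and the cutoff would need adjusting for each new brand.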
CodePudding user response:
Simple, easy peasy:
from pprint import pprint
brands = {'CHEVROLET', 'MG', 'JOSS', 'MERCEDES-BENZ', 'BMW'}
pprint({b: b.title() if len(b) > 5 else b for b in brands})
CodePudding user response:
The examples you've provided have very different casing rules that can't be automatically determined just based on the manufacturer name (e.g., if "CHEVROLET" becomes "Chevrolet", shouldn't "JOSS" become "Joss"?).
This means you'll need to maintain some additional information for each manufacturer in order to format it correctly. If this is true, then the simplest way to do it is to just have an explicit mapping from the uppercased version to the "formal name", as you call it:
brands = {
    'MERCEDES-BENZ': 'Mercedes-Benz',
    'BMW': 'BMW',
    'CHEVROLET': 'Chevrolet',
    'MG': 'MG',
    'JOSS': 'JOSS'
}
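A minimal sketch of how such a mapping might be used for lookups. The helper name `formal_name` and the title-case fallback for unknown brands are assumptions added for illustration, not part of the original answer, and the fallback can still guess wrong for acronyms:

```python
# Explicit mapping from the upper-cased brand to its formal name.
BRANDS = {
    'MERCEDES-BENZ': 'Mercedes-Benz',
    'BMW': 'BMW',
    'CHEVROLET': 'Chevrolet',
    'MG': 'MG',
    'JOSS': 'JOSS',
}

def formal_name(raw, mapping=BRANDS):
    """Return the formal name for raw, or a title-cased guess if it is unknown."""
    key = raw.strip().upper()  # normalise whitespace and case before the lookup
    # Assumed fallback: title-case brands missing from the mapping.
    return mapping.get(key, key.title())

print([formal_name(b) for b in ['MERCEDES-BENZ', 'BMW', 'CHEVROLET', 'MG', 'JOSS', 'FORD']])
# ['Mercedes-Benz', 'BMW', 'Chevrolet', 'MG', 'JOSS', 'Ford']
```

With this layout, only the exceptions have to be listed by hand; every brand missing from the mapping falls through to the title-case guess and can be added later if the guess turns out wrong.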