Pandas replace string values in a column which has multiple variations

Time:06-03

I am working with this CSV file, a small dataset of laptop information.

import pandas as pd

laptops = pd.read_csv('laptops.csv', encoding="Latin-1")
laptops["Operating System"].value_counts()

Which gives:

Windows      1125
No OS          66
Linux          62
Chrome OS      27
macOS          13
Mac OS          8
Android         2
Name: Operating System, dtype: int64

I want to merge the variations of macOS and Mac OS under a single value "macOS".

I have tried this, which works.

mapping_dict = {
    'Android': 'Android',
    'Chrome OS': 'Chrome OS',
    'Linux': 'Linux',
    'Mac OS': 'macOS',
    'No OS': 'No OS',
    'Windows': 'Windows',
    'macOS': 'macOS'
}

laptops["Operating System"] = laptops["Operating System"].map(mapping_dict)

laptops["Operating System"].value_counts()

Windows      1125
No OS          66
Linux          62
Chrome OS      27
macOS          21
Android         2
Name: Operating System, dtype: int64

Is this the only way, or the best way, of doing it? Assume the same requirement might arise for multiple values (not just macOS).

CodePudding user response:

You can simply use replace, which matches whole cell values and leaves everything else untouched:

laptops['Operating System'] = laptops['Operating System'].replace('Mac OS', 'macOS')
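If several variants need merging, replace also accepts a dict, so only the values that change have to be listed. Here is a minimal sketch on a toy Series standing in for the real column. The "MAC OS" variant and the names os_col and variants are made up for illustration. It also shows why the map approach needs every value listed: map returns NaN for any value missing from the dict.

```python
import pandas as pd

# Toy stand-in for laptops["Operating System"] (not the real laptops.csv data).
os_col = pd.Series(["Windows", "macOS", "Mac OS", "Linux", "MAC OS", "No OS"])

# Hypothetical variant -> canonical mapping; unlisted values are left alone by replace.
variants = {"Mac OS": "macOS", "MAC OS": "macOS"}

# replace with a dict matches whole cell values only.
cleaned = os_col.replace(variants)

# map with the same partial dict turns every unlisted value into NaN...
via_map = os_col.map(variants)

# ...unless the gaps are filled back in from the original column.
via_map_filled = os_col.map(variants).fillna(os_col)

print(cleaned.value_counts())
```

So for this kind of cleanup, replace with a partial dict is probably the least typing, while map only makes sense when the dict is meant to cover every value.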

CodePudding user response:

laptops['Operating System'] = laptops['Operating System'].str.replace(r'(Mac OS|MAC OS|MC OS)', 'macOS', regex=True)
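Note that str.replace works on substrings, so the pattern above would also rewrite "Mac OS" inside a longer string. A possible variation, sketched on toy data, anchors the pattern to the whole value and ignores case instead of enumerating spellings. The sample values and the name os_col are assumptions for illustration, and this pattern does not cover a misspelling like "MC OS".

```python
import pandas as pd

# Toy stand-in for laptops["Operating System"] (not the real laptops.csv data).
os_col = pd.Series(["Windows", "Mac OS", "MAC OS", "mac os", "macOS", "Chrome OS"])

# ^...$ anchors the match to the whole value; \s* allows "macos", "mac os", "mac  os".
# case=False makes the match case-insensitive, so no spelling has to be listed twice.
cleaned = os_col.str.replace(r"^mac\s*os$", "macOS", case=False, regex=True)

print(cleaned.value_counts())
```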