Iterate over single column inside Pandas DataFrame and mutate data

Time:04-20

I received a Pandas DataFrame that I have to work with and optimize. It has a bunch of columns, and my goal is to iterate over one specific column across all rows.

For example, I have:

Association                      Year
National Basketball Association  1991
Major League Baseball            2001

I now want to go through the Association column and change every "National Basketball Association" to "NBA", every "Major League Baseball" to "MLB", and so on.

What would be the most efficient approach for this? I tried using if statements, which did not feel very efficient.

Thank you guys in advance!

CodePudding user response:

You could use a regex to automatically convert your strings to acronyms. The following regex removes the lowercase letters (and any trailing whitespace) that follow each word's leading capital letter:

df['Acronym'] = df['Association'].str.replace(r'(?<=\b[A-Z])([a-z]+\s*)',
                                              '', regex=True)

output:

                       Association  Year Acronym
0  National Basketball Association  1991     NBA
1            Major League Baseball  2001     MLB
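
For anyone who wants to try this without an existing DataFrame, here is a minimal self-contained sketch. The sample data is copied from the question; the capture group is dropped because the replacement string does not reference it.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    "Association": ["National Basketball Association", "Major League Baseball"],
    "Year": [1991, 2001],
})

# Remove the run of lowercase letters (plus any trailing whitespace) that
# follows each word-initial capital letter, leaving only the initials.
df["Acronym"] = df["Association"].str.replace(
    r"(?<=\b[A-Z])[a-z]+\s*", "", regex=True
)

print(df["Acronym"].tolist())  # ['NBA', 'MLB']
```

Note that this only touches words starting with a capital letter, so a lowercase word such as "of" would be left in place. If your data contains names like that, the explicit dictionary approach is probably safer.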


CodePudding user response:

You can use pandas.Series.replace with a dictionary, but this requires you to define every mapping manually.

df['Association'] = df['Association'].replace({"National Basketball Association": "NBA",
                                               "Major League Baseball": "MLB"})
print(df)

  Association  Year
0         NBA  1991
1         MLB  2001
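
Here is a self-contained sketch of the dictionary approach that also shows what happens to values with no entry in the mapping. The third row and the names `abbreviations`, `replaced` and `mapped` are made up for illustration. `Series.map` is included as an alternative that may be faster on large columns, though that is worth timing on your own data.

```python
import pandas as pd

df = pd.DataFrame({
    "Association": [
        "National Basketball Association",
        "Major League Baseball",
        "Some Other League",  # hypothetical row with no entry in the mapping
    ],
    "Year": [1991, 2001, 2010],
})

# Hypothetical lookup table; extend it with every name you need to shorten.
abbreviations = {
    "National Basketball Association": "NBA",
    "Major League Baseball": "MLB",
}

# Series.replace swaps exact matches and leaves all other values unchanged.
replaced = df["Association"].replace(abbreviations)

# Series.map yields NaN for values missing from the dict, so fall back to
# the original value for those rows.
mapped = df["Association"].map(abbreviations).fillna(df["Association"])

print(replaced.tolist())
print(mapped.tolist())
```

Both variants are vectorized, so there is no need for an explicit loop or chain of if statements over the rows.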