Replace multiple strings in a DataFrame (Python / pandas)

Time:06-24

I have the code below in a script, and I'm sure there must be a better way to write it, perhaps by referring to two lists to reduce the number of lines. These are two different data frames, but the replaced text is the same across both, so could both data frames also be handled in the same line?

df_Final_hours = df_Final_hours.replace('Monday', 'Mon', regex=True)
df_Final_hours = df_Final_hours.replace('Tuesday', 'Tue', regex=True)
df_Final_hours = df_Final_hours.replace('Wednesday', 'Wed', regex=True)
df_Final_hours = df_Final_hours.replace('Thursday', 'Thu', regex=True)
df_Final_hours = df_Final_hours.replace('Friday', 'Fri', regex=True)
df_Final_hours = df_Final_hours.replace('Saturday', 'Sat', regex=True)
df_Final_hours = df_Final_hours.replace('Sunday', 'Sun', regex=True)

df_Final_traffic = df_Final_traffic.replace('Monday', 'Mon', regex=True)
df_Final_traffic = df_Final_traffic.replace('Tuesday', 'Tue', regex=True)
df_Final_traffic = df_Final_traffic.replace('Wednesday', 'Wed', regex=True)
df_Final_traffic = df_Final_traffic.replace('Thursday', 'Thu', regex=True)
df_Final_traffic = df_Final_traffic.replace('Friday', 'Fri', regex=True)
df_Final_traffic = df_Final_traffic.replace('Saturday', 'Sat', regex=True)
df_Final_traffic = df_Final_traffic.replace('Sunday', 'Sun', regex=True)

Matt

CodePudding user response:

Make use of map.

If you post the data, it will be easier to present a solution, but here is the idea. It works when each cell in the column holds exactly one day name:

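# COLUMN_NAME is a placeholder for the column that holds the day names.
# map returns a new Series, so assign the result back to the column.
df_Final_hours[COLUMN_NAME] = df_Final_hours[COLUMN_NAME].map(
    {'Monday': 'Mon',
     'Tuesday': 'Tue',
     'Wednesday': 'Wed',
     'Thursday': 'Thu',
     'Friday': 'Fri',
     'Saturday': 'Sat',
     'Sunday': 'Sun'
    })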
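
One difference from the question's code is worth seeing side by side: map looks up whole cell values, while replace with regex=True also rewrites day names that appear inside longer strings. The snippet below is a small sketch of that difference; the Series contents and the name day_map are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
day_map = {'Monday': 'Mon', 'Tuesday': 'Tue'}
s = pd.Series(['Monday', 'Tuesday 9am', 'Holiday'])

# map: exact lookup per cell; anything not found in the dict becomes NaN.
mapped = s.map(day_map)

# replace with regex=True: substring substitution; other text is left alone.
replaced = s.replace(day_map, regex=True)

print(mapped.tolist())
print(replaced.tolist())
```

If the column can contain anything other than a bare day name, replace is probably the safer choice of the two.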

CodePudding user response:

You can create a dict containing the changes and then use the replace method you already used:

changes = {'Monday' : 'Mon',
           'Tuesday': 'Tue',
           'Wednesday': 'Wed',
           'Thursday': 'Thu',
           'Friday': 'Fri',
           'Saturday': 'Sat',
           'Sunday': 'Sun'}

df_Final_hours = df_Final_hours.replace(changes, regex=True)
df_Final_traffic = df_Final_traffic.replace(changes, regex=True)
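
To handle both data frames in a single statement, as the question asks, one option is tuple unpacking over a generator expression. This is only a sketch: the two small frames and their column names (day, visits) are invented stand-ins for the real data.

```python
import pandas as pd

changes = {'Monday': 'Mon', 'Tuesday': 'Tue', 'Wednesday': 'Wed',
           'Thursday': 'Thu', 'Friday': 'Fri', 'Saturday': 'Sat',
           'Sunday': 'Sun'}

# Hypothetical stand-ins for the question's data frames.
df_Final_hours = pd.DataFrame({'day': ['Monday 9-5', 'Sunday closed']})
df_Final_traffic = pd.DataFrame({'day': ['Friday', 'Saturday'],
                                 'visits': [120, 340]})

# Apply the same replacement to each frame and rebind both names at once.
df_Final_hours, df_Final_traffic = (
    df.replace(changes, regex=True)
    for df in (df_Final_hours, df_Final_traffic)
)

print(df_Final_hours)
print(df_Final_traffic)
```

Numeric columns such as visits are left untouched, because a regex replacement only applies to string values.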

CodePudding user response:

You can create a dictionary like this:

weekAbrvs = {
  "Monday": "Mon",
  "Tuesday": "Tue",
  "Wednesday": "Wed",
  "Thursday": "Thu",
  "Friday": "Fri",
  "Saturday": "Sat",
  "Sunday": "Sun"
}

Then a single for loop can update both df_Final_hours and df_Final_traffic, something like this:

for key,value in weekAbrvs.items():
  df_Final_hours = df_Final_hours.replace(key, value, regex=True)
  df_Final_traffic = df_Final_traffic.replace(key, value, regex=True)

See also: Iterating over dictionaries using 'for' loops

CodePudding user response:

You can import calendar, build a dictionary from its day names and abbreviations, and then use replace with regex=True:

import calendar
d = dict(zip(calendar.day_name, calendar.day_abbr))
df = df.replace(d, regex=True)  # replace returns a new DataFrame, so assign it back
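
Because regex=True matches substrings, a plain 'Sunday' pattern would also turn 'Sundays' into 'Suns'. If only whole words should be shortened (an assumption, since the question does not say), the keys can be wrapped in \b word boundaries. The sketch below uses an invented note column, and it relies on calendar.day_name returning English names, which holds in the default locale.

```python
import calendar
import re

import pandas as pd

# Wrap each day name in word boundaries so only whole words are matched.
d = {rf'\b{re.escape(name)}\b': abbr
     for name, abbr in zip(calendar.day_name, calendar.day_abbr)}

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({'note': ['Monday', 'Closed Sundays', 'Open Friday-Saturday']})

out = df.replace(d, regex=True)
print(out['note'].tolist())
```

Here 'Closed Sundays' stays as it is, while 'Monday' and 'Friday-Saturday' are abbreviated.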