Update pandas column names with replace chain not working properly

Time:11-24

We have the following column names:

our_df.columns
Index(['Rank', 'Album', 'Artist', 'Label', 'Label Description',
       'Peak Position', 'Last Week Rank', 'Last 2 Week Rank', 'Weeks On Chart',
       'TW Total Activity', '% CHG', 'LW Total Activity', 'TW Album Sales',
       'TW Song Sales', 'TW TEA', 'TW Audio Streaming Activity',
       'TW Video Streaming Activity', 'TW Total SEA (Audio + Video)', 'ATD'],
      dtype='object')

And we'd like to do something like this:

our_df.columns.str.replace(' ','_').replace('.', '').replace('+/-','plus_minus').lower()

...where we first make all of the replacements, and then convert everything to lowercase. However, this is failing with the error 'Index' object has no attribute 'replace'. We've updated this to

our_df.columns.str.replace(' ','_').str.replace('.', '')

# and get the warning
FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.

Lastly, when we try

our_df.columns.str.replace(' ','_').str.replace('.', '').str.replace('+/-','plus_minus').str.lower()

...we get the error nothing to repeat at position 0.

  1. Is it necessary to repeat str. between each .replace() call?
  2. How can we resolve the FutureWarning?
  3. The last error is caused by .replace('+/-','plus_minus') because there is no +/- in the columns. How can we handle this so that rather than throwing an error, it simply makes no replacements?
  4. Am I doing this right?

CodePudding user response:

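If you want to use str.replace, you need to repeat .str before each call. The + and . characters are regex special characters, so they must be escaped with a backslash, and passing regex=True explicitly removes the FutureWarning. Note that the "nothing to repeat" error comes from the unescaped + at the start of the pattern, not from the absence of a match; a valid pattern that matches nothing simply leaves the names unchanged: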

(our_df.columns.str.replace(' ', '_', regex=True)
               .str.replace(r'\.', '', regex=True)             # escaped literal dot
               .str.replace(r'\+/-', 'plus_minus', regex=True) # escaped literal plus
               .str.lower())
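When every replacement is literal text rather than a pattern, another option is to pass regex=False, which avoids both the escaping and the FutureWarning. The following is only a minimal sketch under that assumption: cols is a made-up subset of the column names, and the last entry is hypothetical, added only so that the '+/-' replacement has something to match.

```python
import pandas as pd

# Made-up subset of the column names; 'Approx. +/- Units' is hypothetical.
cols = pd.Index(['Label Description', '% CHG',
                 'TW Total SEA (Audio + Video)', 'Approx. +/- Units'])

# regex=False treats each search string as literal text, so '.' and '+'
# need no escaping; names without a match are simply left unchanged.
cleaned = (cols.str.replace(' ', '_', regex=False)
               .str.replace('.', '', regex=False)
               .str.replace('+/-', 'plus_minus', regex=False)
               .str.lower())

print(list(cleaned))
```

The trade-off is that literal mode cannot collapse runs of whitespace the way a pattern such as \s+ can, so it fits best when the names are already regular.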

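Alternatively, pass a dictionary of replacements to replace. The Index must first be converted to a Series with to_series(), because calling replace directly on an Index raises: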

AttributeError: 'Index' object has no attribute 'replace'


import pandas as pd

c = ['Rank', 'Album', 'Artist', 'Label', 'Label Description',
       'Peak Position', 'Last Week Rank', 'Last 2 Week Rank', 'Weeks On Chart',
       'TW Total Activity', '% CHG', 'LW Total Activity', 'TW Album Sales',
       'TW Song Sales', 'TW TEA', 'TW Audio Streaming Activity',
       'TW Video Streaming Activity', 'TW Total SEA (Audio +/- Video)', 'ATD']
our_df = pd.DataFrame(columns=c)

our_df.columns = our_df.columns.to_series().replace({r'\s+': '_', r'\.': '', r'\+/-': 'plus_minus'}, regex=True).str.lower()
print (our_df)
Empty DataFrame
Columns: [rank, album, artist, label, label_description, peak_position, last_week_rank, last_2_week_rank, weeks_on_chart, tw_total_activity, %_chg, lw_total_activity, tw_album_sales, tw_song_sales, tw_tea, tw_audio_streaming_activity, tw_video_streaming_activity, tw_total_sea_(audio_plus_minus_video), atd]
Index: []
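
On question 3: the "nothing to repeat at position 0" error is raised by Python's re module when a pattern begins with a quantifier such as +, so it appears whether or not any column contains the text. The sketch below illustrates this and uses re.escape to build a safe pattern from the literal string. The column names in cols are hypothetical and chosen only for illustration.

```python
import re
import pandas as pd

# Hypothetical column names, for illustration only.
cols = pd.Index(['Rank', 'Peak Position', 'Net +/- Change'])

# A leading '+' is a quantifier with nothing before it to repeat,
# so compiling the pattern fails before any column is examined.
try:
    re.compile('+/-')
    pattern_error = None
except re.error as exc:
    pattern_error = str(exc)
print(pattern_error)

# re.escape turns the literal text into a valid pattern.
safe = re.escape('+/-')
renamed = cols.str.replace(safe, 'plus_minus', regex=True).str.lower()

# Names without '+/-' pass through with no replacement and no error.
print(list(renamed))
```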