Home > Back-end >  python pandas remove text in column names after specific characters
python pandas remove text in column names after specific characters

Time:10-28

I have a Dataframe with hundreds of columns and very long column-names. I want to remove any text in the column-names after the ":" or "." This is basically splitting on multiple delimiters.

I tried to extract column-names in a list and use the split method at ":" and "." and then keep only the portion of the text before ":" or "." but the split did not work as I wanted. I do not know why. any idea how to fix it and achieve my goal.

data = {'Name of the injured. Bla bla bla': ['Bill', 'John'],
           'Age of the injured: bla bla': [50,40],
          }
df_data = pd.DataFrame.from_dict(data)
print(df_data)
cols = df_data.columns.values
new_cols = [( x.split(':') or x.split('.') ) for x in cols]
print(new_cols)

This is the outcome that I need:

enter image description here

Thanks, GR

CodePudding user response:

You could use regular expression re in order to split the column name strings, and then reassing them to the dataframe :

import pandas as pd
import re
data = {'Name of the injured. Bla bla bla': ['Bill', 'John'],
           'Age of the injured: bla bla': [50,40],
          }
df_data = pd.DataFrame.from_dict(data)

then, following your intuition, you could do:

cols = df_data.columns.values
new_cols= [re.split(r'[.:]',item)[0] for item in cols]
df_data.columns = new_cols

If you have some other charcaters to split your text, just add it between the brackets, and here we choose to keep only the first part, hence the [0].

The result for df_data is as expected:

  Name of the injured  Age of the injured
0                Bill                  50
1                John                  40

CodePudding user response:

import re

# using re.split, split the col on multiple delimiters
# list comprehension to form a list of columns
# and assign back to the DF

df.columns=[re.split(r'[:|.]', col)[0] for col in df.columns]
df
    Name of the injured     Age of the injured
0   Bill                    50
1   John                    40

  • Related