How to change name of columns if it contains specific text

Time:07-08

How do I go about changing the text in column names? For example, instead of MAC000306_Std_E_Affluent, the column would be named Affluent. I would like to change every column name to its last part, so ideally I'd end up with columns named Affluent, Comfortable, and Adversity.

CodePudding user response:

Suppose your dataframe is called df.

  1. We can retrieve the column names with df.columns.
  2. We call str.split on each name to split it on _. This returns a list, from which we select the last item. We store each old name together with its new name in a dictionary.
  3. The dictionary can then be passed to df.rename to rename the columns.

I used a Dict Comprehension to wrap most of the logic in one line.

column_name_mapping = {old_name: old_name.split('_')[-1] for old_name in df.columns}
df = df.rename(columns=column_name_mapping)
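
For testing, here is a minimal self-contained sketch of the same approach. The sample DataFrame is an assumption: only MAC000306_Std_E_Affluent appears in the question, so the other two column names and all the values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the first column name comes from the question.
df = pd.DataFrame({
    "MAC000306_Std_E_Affluent": [0.1, 0.2],
    "MAC000307_Std_E_Comfortable": [0.3, 0.4],
    "MAC000308_Std_E_Adversity": [0.5, 0.6],
})

# Map each old name to the text after its last underscore.
column_name_mapping = {old_name: old_name.split("_")[-1] for old_name in df.columns}

# rename returns a new DataFrame, so the result is assigned back to df.
df = df.rename(columns=column_name_mapping)

print(list(df.columns))  # ['Affluent', 'Comfortable', 'Adversity']
```

Note that if two columns end in the same word, this would probably leave you with duplicate column names, so it is worth checking the mapping before applying it.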

Please update your question so that it provides a minimal code example that can be copied and used for testing. This will also ensure that your question is helpful to others in the future. Linked images are prone to "disappear" ;-)

Best regards!

CodePudding user response:

You can use split("_") and take the last element with [-1] as the new name of your column:

>>> string = "MAC0999_STD_F_ColumnName"
>>> string.split("_")
['MAC0999', 'STD', 'F', 'ColumnName']
>>> NewColumnName = string.split("_")[-1]
>>> NewColumnName
'ColumnName'
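
Since the title asks about renaming only when a name contains specific text, the same split can be wrapped in a small check. This is only a sketch: the helper name shorten_if_contains and the default marker "_Std_" are assumptions for illustration, not something from the question.

```python
def shorten_if_contains(name: str, marker: str = "_Std_") -> str:
    """Return the last underscore-separated part of name, but only if marker occurs in it."""
    if marker in name:
        return name.split("_")[-1]
    # Names without the marker are returned unchanged.
    return name


print(shorten_if_contains("MAC000306_Std_E_Affluent"))  # Affluent
print(shorten_if_contains("day_of_week"))               # day_of_week
```

A function like this can be passed directly to df.rename(columns=shorten_if_contains), because rename accepts a callable that is applied to each column name.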

CodePudding user response:

Here is another solution, although it is similar (one way or another) to the one provided by ffrosch:

for col in df.columns:
    df.rename(columns={col:col.split("_")[-1]},inplace=True)
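
If you would rather avoid the explicit loop, the string methods on df.columns can do the same thing in one assignment. This is a sketch of an alternative, not the answer's own method, and the sample DataFrame below is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"MAC0999_STD_F_ColumnName": [1, 2], "MAC1000_STD_F_Other": [3, 4]})

# rsplit with n=1 splits each name once, starting from the right;
# .str[-1] then picks the last piece of each split result.
df.columns = df.columns.str.rsplit("_", n=1).str[-1]

print(list(df.columns))  # ['ColumnName', 'Other']
```

Assigning to df.columns replaces all the names at once, so there is no need for inplace=True here.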

CodePudding user response:

You could get that done using rename with a lambda function:

import pandas as pd

df = pd.DataFrame({'C_A':list('abc'),'C_B':list('abc'),'C_C':list('abc')})
print(df)

  C_A C_B C_C
0   a   a   a
1   b   b   b
2   c   c   c

df.rename(columns=lambda x: x.split('_')[-1], inplace=True)
print(df)

   A  B  C
0  a  a  a
1  b  b  b
2  c  c  c