How to rename all columns at once-CodePudding

I have a dataframe looks like this

index brand_spend_kitchen brand_spend_car brand_spend_home
1      3                      5           7
2      2                      5           8

The df is more complicated than this, but this example is suffice. I am hoping to change all column names to '% spend kitchen', '% spend car', and '% spend home'.

Instead of using df.rename(columns = {}) which requires manual input. Is there a more efficient way to change the col names

CodePudding user response：

You could simply map() them to a different name using .replace():

data.columns = list(map(lambda c: '% '   c.replace('_', ' '), data.columns))

CodePudding user response：

df.columns = df.columns.str.replace('.*(spend)_(.*)', r'% \1 \2', regex = True)

df

   index  % spend kitchen  % spend car  % spend home
0      1                3            5             7
1      2                2            5             8

CodePudding user response：

the answer depends on if you have a typo or not. I'm going to assume you do not.

you have a list containing a single string ['value1, value2, value3'] (see how this is different from ['value1', 'value2', 'value3'] ). We're going to separate this into a new list, full of individual strings.

# grab the first item in the list (the long string)
myString = data.columns[0]
#split it into a list
myList = list(myString.split(', '))
print(myList)
>>>> ['spend_in_home', 'spend_in_kitchen', 'spend_in_tools' ... etc]

okay so if that was a typo, no need to do this bit of code, either way, once you have a list that contains several strings, we need to actually manipulate it. It looks like what you want to do is to add '% ' to the beginning of each string, and replace every '_' with a ' '. first lets do the '% '

for i in range(len(myList)):
    #current item
    current = myList[i]
    #add our prefix
    current = '% '   current

    #paste it onto the list
    myList[i] = current

print(myList)
>>>> ['% spend_in_home', '% spend_in_kitchen', '% spend_in_tools']

finally, let's add an action in that same for loop to turn the underscores into spaces:

for i in range(len(myList)):
    #current item
    current = myList[i]
    #add our prefix
    current = '% '   current
    #replace all instances of '_' with ' '
    current = current.replace('_', ' ')

    #paste it onto the list
    myList[i] = current

print(myList)
>>>> ['% spend in home', '% spend in kitchen', '% spend in tools']

Hope that helps, feel free to clarify if I missed something!

CodePudding user response：

You can use functions in pd.DataFrame.rename:

df = df.set_index('index') # Be sure index is in the index
df_out = df.rename(columns=lambda x: '% '  ' '.join(x.split('_')[1:]))
df_out = df_out.reset_index() #move index back to columns if desired.

Output:

   index  % spend kitchen  % spend car  % spend home
0      1                3            5             7
1      2                2            5             8

Here we just created a lambda function to do string manipulation.