I have a dataframe looks like this
index brand_spend_kitchen brand_spend_car brand_spend_home
1 3 5 7
2 2 5 8
The df is more complicated than this, but this example is suffice.
I am hoping to change all column names to '% spend kitchen', '% spend car', and '% spend home'
.
Instead of using df.rename(columns = {})
which requires manual input. Is there a more efficient way to change the col names
CodePudding user response:
You could simply map()
them to a different name using .replace()
:
data.columns = list(map(lambda c: '% ' c.replace('_', ' '), data.columns))
CodePudding user response:
df.columns = df.columns.str.replace('.*(spend)_(.*)', r'% \1 \2', regex = True)
df
index % spend kitchen % spend car % spend home
0 1 3 5 7
1 2 2 5 8
CodePudding user response:
the answer depends on if you have a typo or not. I'm going to assume you do not.
you have a list containing a single string ['value1, value2, value3']
(see how this is different from ['value1', 'value2', 'value3']
). We're going to separate this into a new list, full of individual strings.
# grab the first item in the list (the long string)
myString = data.columns[0]
#split it into a list
myList = list(myString.split(', '))
print(myList)
>>>> ['spend_in_home', 'spend_in_kitchen', 'spend_in_tools' ... etc]
okay so if that was a typo, no need to do this bit of code, either way, once you have a list that contains several strings, we need to actually manipulate it. It looks like what you want to do is to add '% ' to the beginning of each string, and replace every '_' with a ' '. first lets do the '% '
for i in range(len(myList)):
#current item
current = myList[i]
#add our prefix
current = '% ' current
#paste it onto the list
myList[i] = current
print(myList)
>>>> ['% spend_in_home', '% spend_in_kitchen', '% spend_in_tools']
finally, let's add an action in that same for loop to turn the underscores into spaces:
for i in range(len(myList)):
#current item
current = myList[i]
#add our prefix
current = '% ' current
#replace all instances of '_' with ' '
current = current.replace('_', ' ')
#paste it onto the list
myList[i] = current
print(myList)
>>>> ['% spend in home', '% spend in kitchen', '% spend in tools']
Hope that helps, feel free to clarify if I missed something!
CodePudding user response:
You can use functions in pd.DataFrame.rename:
df = df.set_index('index') # Be sure index is in the index
df_out = df.rename(columns=lambda x: '% ' ' '.join(x.split('_')[1:]))
df_out = df_out.reset_index() #move index back to columns if desired.
Output:
index % spend kitchen % spend car % spend home
0 1 3 5 7
1 2 2 5 8
Here we just created a lambda function to do string manipulation.