How to group same columns (different by suffix) using sum in pandas?

Time:10-14

I have a dataframe that looks like this. Note that there are 500 total_customer columns, differentiated by a date suffix.

total_customer      total_customer_2021-03-31  total_customer_date X  
1                   10                          4                  
3                   14                          3                  

Now, I want to sum row-wise the columns that are the same once the date suffix is removed; this cannot be done manually. I.e., the expected output is:

total_customer         
15                                             
20                                             

I cannot do this manually because I have 500 column pairs, so I need an efficient way to do it. Also, the order of the columns is not predictable. What do you recommend? Thanks!

CodePudding user response:

If there are different groups of columns, use DataFrame.groupby with an aggregate sum:

df1 = df.groupby(df.columns.str.replace(r'[0-9_-]+$', '', regex=True), axis=1).sum()
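Newer pandas releases (2.1 and later) deprecate axis=1 in groupby, so a transpose-based variant of the same grouping idea may be more portable. The following is a minimal sketch, not the answerer's exact code: the sample frame and the second date suffix (2021-04-30) are hypothetical stand-ins for the question's data, and the regex assumes base column names do not themselves end in digits, hyphens, or underscores.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout in the question.
df = pd.DataFrame({
    "total_customer": [1, 3],
    "total_customer_2021-03-31": [10, 14],
    "total_customer_2021-04-30": [4, 3],  # assumed suffix, for illustration only
})

# Strip a trailing run of digits, hyphens, and underscores to get the base name.
base_names = df.columns.str.replace(r"[0-9_-]+$", "", regex=True)

# Transpose so columns become rows, group those rows by base name,
# sum within each group, then transpose back to the original orientation.
result = df.T.groupby(base_names).sum().T
print(result)
```

Because the grouping key is derived from the column names themselves, the order of the columns in the dataframe does not matter.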

Or, if you only need to sum all columns, use sum as suggested in the comments:

df1 = df.sum(axis=1).to_frame(name='total_customer')