How to group same columns (different by suffix) using sum in pandas?

I have a dataframe that looks like this. Note that there are 500 total_customer columns, differentiated by a date suffix.

total_customer      total_customer_2021-03-31  total_customer_date X  
1                   10                          4                  
3                   14                          3

Now I want to sum, row-wise, all columns that share the same name once the date suffix is removed. I.e. the expected output is:

total_customer         
15                                             
20

I cannot do this manually because I have 500 such columns, so I need an efficient way to do it. The order of the columns is not predictable either. What do you recommend? Thanks!

CodePudding user response:

If there are different groups, use DataFrame.groupby with an aggregate sum, grouping the columns by their names with the trailing date suffix stripped:

df1 = df.groupby(df.columns.str.replace(r'[0-9-_]+$', '', regex=True), axis=1).sum()
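
As a rough illustration of the grouping idea, here is a minimal sketch on made-up data. The total_revenue columns and the 2021-04-30 suffix are assumptions added only to show more than one group; they are not from the question. It groups the transposed frame instead of passing axis=1, since newer pandas versions deprecate groupby(axis=1).

```python
import pandas as pd

# Hypothetical sample data: two base names, each with date-suffixed copies.
df = pd.DataFrame({
    "total_customer": [1, 3],
    "total_customer_2021-03-31": [10, 14],
    "total_customer_2021-04-30": [4, 3],
    "total_revenue": [100, 200],
    "total_revenue_2021-03-31": [5, 5],
})

# Strip a trailing run of digits, hyphens, and underscores to get the base name.
base_names = df.columns.str.replace(r"[-0-9_]+$", "", regex=True)

# Group the transposed frame by base name, sum, then transpose back,
# so each base name becomes one column of row-wise sums.
df1 = df.T.groupby(base_names).sum().T

print(df1)
```

Column order does not matter here because the grouping key is derived from each column's name. One caveat: a base name that itself ends in digits (say, a hypothetical sales_2) would also be truncated by this pattern, so the regex may need tightening for such names.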

Or, if you only need to sum all columns, use sum as suggested in the comments:

df1 = df.sum(axis=1).to_frame(name='total_customer')
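
For a quick check against the numbers in the question, a small sketch follows. The third column name is a hypothetical stand-in for the "total_customer_date X" placeholder.

```python
import pandas as pd

# Rows from the question; the last column name is an assumed example date.
df = pd.DataFrame({
    "total_customer": [1, 3],
    "total_customer_2021-03-31": [10, 14],
    "total_customer_2021-04-30": [4, 3],
})

# Sum every column row-wise and keep the result as a one-column DataFrame.
total = df.sum(axis=1).to_frame(name="total_customer")

print(total)
```

This only fits when every column belongs to the same group; with several base names in the frame, the groupby approach is the one to use.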