Python: sum column for every dataframe in a list

Time:11-12

I have a list of identically structured dataframes, and I am trying to sum one column in each dataframe in the list. My thought is something like total = [df['A'].sum for df in dfs], but this returns a list of length len(dfs) containing only bound method objects rather than the sums. My desired output is a list with the column sum for each dataframe. What is the fastest way to achieve this? I have to repeat this sum thousands of times per list, on thousands of different lists.

CodePudding user response:

Perhaps you are missing the () after sum:

 total = [df['A'].sum() for df in dfs]

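You want to call the method sum, not just reference it. Without the parentheses, df['A'].sum evaluates to the method object itself, which is why your list contains methods instead of numbers.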
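To make the difference concrete, here is a minimal sketch. The two small dataframes and the names dfs, methods and total are made up for illustration; only the column name 'A' comes from the question.

```python
import pandas as pd

# Hypothetical sample data: two dataframes with the same columns.
dfs = [
    pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]}),
    pd.DataFrame({"A": [4, 5, 6], "B": [40, 50, 60]}),
]

# Without parentheses: each element is the bound method, not a number.
methods = [df["A"].sum for df in dfs]

# With parentheses: the method is called and each element is the column sum.
total = [df["A"].sum() for df in dfs]

print(methods[0])  # a bound method object
print(total)       # one sum per dataframe
```

Both lists have the same length as dfs, but only the second one holds the sums.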

Python's built-in sum is pretty quick (see "Python built-in sum function vs. for loop performance"), and I assume that pandas sum should be comparable. See also "Difference between sum, 'sum' and np.sum *under the hood* (Python / Pandas / Numpy)".
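If the list comprehension turns out to be a bottleneck, one option worth timing is to stack the column values into a single NumPy array and sum along one axis. This is only a sketch under assumptions the question does not fully confirm: every dataframe has the same number of rows, and column 'A' is numeric with no missing values (np.sum does not skip NaN the way pandas sum does by default). The helper name column_sums is hypothetical, and whether this is actually faster depends on your data, so measure it with timeit before adopting it.

```python
import numpy as np
import pandas as pd


def column_sums(dfs, column="A"):
    """Return one sum per dataframe for the given column.

    Assumes all dataframes have the same length and the column is numeric.
    """
    # Build a 2-D array with one row per dataframe.
    stacked = np.stack([df[column].to_numpy() for df in dfs])
    # Sum across each row, giving one value per dataframe.
    return stacked.sum(axis=1)


# Hypothetical sample data: three dataframes with five rows each.
dfs = [pd.DataFrame({"A": np.arange(5) + i}) for i in range(3)]
sums = column_sums(dfs)
print(sums)
```

The result is a NumPy array rather than a list; call sums.tolist() if a plain list is needed.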
