Pandas new dataframe that has sum of columns from another

Time:10-15

I'm struggling to figure out how to do a couple of transformations with pandas. I want a new dataframe that holds the sum of each column of the original. I also want to be able to merge two of these 'summed' dataframes.

Example #1: Summing the columns

Before:

A    B    C    D
1    4    7    0
2    5    8    1
3    6    9    2

After:

A    B    C    D
6    15   24   3

Right now I'm getting the sums of the columns I'm interested in, storing them in a dictionary, and creating a dataframe from the dictionary. I feel like there is a better way to do this with pandas that I'm not seeing.

Example #2: merging 'summed' dataframes

Before:

 A    B    C    D   F
 6    15   24   3   1

 A    B    C    D   E
 1    2    3    4   2

After:

 A    B    C    D   E    F
 7    17   27   7   2    1

CodePudding user response:

For the first part, use DataFrame.sum() to sum the columns, convert the resulting Series to a DataFrame with .to_frame(), and finally transpose it:

df_sum = df.sum().to_frame().T

Result:

print(df_sum)


   A   B   C  D
0  6  15  24  3
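
The question mentions summing only "the columns I'm interested in". A minimal sketch of that variation, using the sample data from the question; the column list `cols` is a hypothetical choice for illustration:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6],
                   'C': [7, 8, 9], 'D': [0, 1, 2]})

# Hypothetical subset of columns to sum (not from the original post).
cols = ['A', 'C']

# Select the columns first, then sum and reshape into a one-row DataFrame.
df_sum = df[cols].sum().to_frame().T
print(df_sum)
```

Selecting before summing avoids the intermediate dictionary described in the question.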

For the second part, use DataFrame.add() with the fill_value parameter, so that a column missing from one dataframe is treated as 0 instead of producing NaN:

df_sum2 = df1.add(df2, fill_value=0)

Result:

print(df_sum2)


   A   B   C  D    E    F
0  7  17  27  7  2.0  1.0
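
The E and F columns come back as floats because the missing values are filled during alignment. A self-contained sketch, assuming the two 'summed' frames from the question and that every value is a whole number, which casts the result back to integers with .astype(int):

```python
import pandas as pd

# The two 'summed' frames from Example #2 in the question.
df1 = pd.DataFrame({'A': [6], 'B': [15], 'C': [24], 'D': [3], 'F': [1]})
df2 = pd.DataFrame({'A': [1], 'B': [2], 'C': [3], 'D': [4], 'E': [2]})

# Add the frames, treating a column missing from one side as 0,
# then cast back to integers (assumes no fractional values).
df_sum2 = df1.add(df2, fill_value=0).astype(int)
print(df_sum2)
```

The cast is only safe when the data is known to be integral; otherwise it would truncate fractional sums.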

CodePudding user response:

First question:

Summing the columns

Use sum, then convert the Series to a DataFrame and transpose:

import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6],
                    'C': [7, 8, 9], 'D': [0, 1, 2]})

# Sum each column (gives a Series), then reshape into a one-row DataFrame
df1 = df1.sum().to_frame().T
print(df1)

# Output:
   A   B   C  D
0  6  15  24  3

Second question:

Merging 'summed' dataframes

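Use combine. Note that passing the built-in sum works here only because each dataframe has a single row: sum(s1, s2) adds every value of s1 to s2, so for multi-row dataframes pass a function such as lambda s1, s2: s1 + s2 instead.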

df2 = pd.DataFrame({'A': [1], 'B': [2], 'C': [3], 'D': [4], 'E': [2]})

out = df1.combine(df2, sum, fill_value=0)
print(out)

# Output:
   A   B   C  D  E
0  7  17  27  7  2
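
When more than two 'summed' dataframes need merging, one alternative (not taken from the answers above) is to stack them with pd.concat and sum again. A minimal sketch, assuming the two frames from the question and whole-number values; the names `frames` and `total` are illustrative:

```python
import pandas as pd

# The 'summed' frames from Example #2; more frames could be appended to this list.
frames = [
    pd.DataFrame({'A': [6], 'B': [15], 'C': [24], 'D': [3], 'F': [1]}),
    pd.DataFrame({'A': [1], 'B': [2], 'C': [3], 'D': [4], 'E': [2]}),
]

# Stack the rows (missing columns become NaN), sum each column
# (NaN is skipped by default), and reshape into a one-row DataFrame.
total = pd.concat(frames, ignore_index=True).sum().to_frame().T

# Sort the columns alphabetically and cast back to integers
# (assumes no fractional values).
total = total.sort_index(axis=1).astype(int)
print(total)
```

This scales to any number of frames in a single step, whereas add and combine each merge two frames at a time.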