I'm struggling to figure out how to do a couple of transformation with pandas. I want a new dataframe with the sum of the values from the columns in the original. I also want to be able to merge two of these 'summed' dataframes.
Example #1: Summing the columns
Before:
A B C D
1 4 7 0
2 5 8 1
3 6 9 2
After:
A B C D
6 15 24 3
Right now I'm getting the sums of the columns I'm interested in, storing them in a dictionary, and creating a dataframe from the dictionary. I feel like there is a better way to do this with pandas that I'm not seeing.
Example #2: merging 'summed' dataframes
Before:
A B C D F
6 15 24 3 1
A B C D E
1 2 3 4 2
After:
A B C D E F
7 17 27 7 2 1
CodePudding user response:
First part, use DataFrame.sum()
to sum the columns then convert Series to dataframe by .to_frame()
and finally transpose
:
df_sum = df.sum().to_frame().T
Result:
print(df_sum)
A B C D
0 6 15 24 3
Second part, use DataFrame.add()
with parameter fill_value
, as follows:
df_sum2 = df1.add(df2, fill_value=0)
Result:
print(df_sum2)
A B C D E F
0 7 17 27 7 2.0 1.0
CodePudding user response:
First question:
Summing the columns
Use sum
then convert Series
to DataFrame
and transpose
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6],
'C': [7, 8, 9], 'D': [0, 1, 2]})
df1 = df1.sum().to_frame().T
print(df1)
# Output:
A B C D
0 6 15 24 3
Second question:
Merging 'summed' dataframes
Use combine
df2 = pd.DataFrame({'A': [1], 'B': [2], 'C': [3], 'D': [4], 'E': [2]})
out = df1.combine(df2, sum, fill_value=0)
print(out)
# Output:
A B C D E
0 7 17 27 7 2