Pandas Sum() issues

Time:06-21

My data is organised into two data frames of the same shape. I multiply them element-wise and would then like to sum all columns of the result into a single column.

I am doing this using:

    df = df1_kwh.multiply(df2np).sum(axis=1)

However, when I check df.shape I get "(347,)", meaning there are no columns, and I am then unable to add further columns next to the "sum" values using df.insert.

What can I do to make the output of df.sum able to be manipulated by other functions?

CodePudding user response:

sum(axis=1) returns a Series, which is why the shape is "(347,)". If you want a DataFrame, you can convert the Series using to_frame:

    df = df1_kwh.multiply(df2np).sum(axis=1).to_frame()

By default the column name is 0. To change it (for example to "sum"), pass the name to to_frame:

    df = df1_kwh.multiply(df2np).sum(axis=1).to_frame('sum')

Example:

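    import numpy as np
    import pandas as pd

    np.random.seed(0)
    df1_kwh = pd.DataFrame(np.random.random(size=(5,5)))
    df2np = 2  # a scalar multiplier, to keep the example short

    df = df1_kwh.multiply(df2np).sum(axis=1).to_frame('sum')
    df.insert(0, 'new', 'x')  # works now that df is a DataFrame
    print(df)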

output:

      new       sum
    0   x  5.670608
    1   x  6.644717
    2   x  5.770594
    3   x  5.176273
    4   x  6.276120
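To see why insert was unavailable on the original result, it may help to check the type and shape before and after to_frame. This is only a minimal sketch: the two small frames and their values are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's frames (values are made up).
df1_kwh = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
df2np = pd.DataFrame({"a": [2.0, 2.0, 2.0], "b": [0.5, 0.5, 0.5]})

# Summing across columns collapses each row to one value: a 1-D Series.
row_totals = df1_kwh.multiply(df2np).sum(axis=1)
print(type(row_totals).__name__, row_totals.shape)  # Series (3,)

# to_frame() turns the Series into a one-column DataFrame.
as_frame = row_totals.to_frame("sum")
print(type(as_frame).__name__, as_frame.shape)      # DataFrame (3, 1)

# Series has no insert() method; the DataFrame does.
as_frame.insert(0, "new", "x")
print(list(as_frame.columns))                       # ['new', 'sum']
```

The one-element shape tuple "(3,)" here corresponds to the "(347,)" in the question: it describes a one-dimensional Series rather than a table with zero columns.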

CodePudding user response:

UPDATED:

If I understand your question correctly, df1_kwh and df2np are "two data frames of the same shape".

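Assuming they have identical column labels (and identical index labels), your code for multiply() should work, because multiply() aligns two DataFrames on their labels. If the column labels differ but the columns correspond by position, the axis argument will not help, since it only affects Series and list-like operands; multiplying by the underlying array instead, as in df1_kwh.multiply(df2np.to_numpy()).sum(axis=1), should work.

Since the code in your question multiplies the two frames directly, I'll assume your dataframes have identical column labels.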
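Whether the labels line up matters here because, as far as I know, multiply() aligns two DataFrames on both index and column labels regardless of the axis argument. The following sketch uses two hypothetical frames (the names left and right and their values are illustrative only) to compare label-based and position-based multiplication:

```python
import pandas as pd

# Hypothetical frames: same shape and index, but different column labels.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"x": [10, 20], "y": [30, 40]})

# Label-based multiply: no column label is shared, so every cell is NaN,
# and sum() skips NaN by default, giving 0.0 for every row.
by_label = left.multiply(right).sum(axis=1)
print(by_label.tolist())       # [0.0, 0.0]

# Passing axis=0 does not change this when the other operand is a DataFrame.
by_label_axis0 = left.multiply(right, axis=0).sum(axis=1)
print(by_label_axis0.tolist())  # [0.0, 0.0]

# Position-based multiply: use the raw values of the second frame.
by_position = left.multiply(right.to_numpy()).sum(axis=1)
print(by_position.tolist())    # [100, 200]
```

A row sum that is unexpectedly 0.0 everywhere is therefore a hint that the two frames did not share labels.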

Here's a way to create a DataFrame using the result of your sum() and then use insert() to add a column:

    import pandas as pd

    df1_kwh = pd.DataFrame({'a':range(5), 'b':range(5, 10)})
    df2np = pd.DataFrame({'a':range(10, 15), 'b':range(15, 20)})

    # sum(axis=1) returns a Series; to_frame() makes it a one-column DataFrame
    df = df1_kwh.multiply(df2np).sum(axis=1).to_frame()
    print(df)

    # insert() is available now that df is a DataFrame
    df.insert(loc=1, column='additional_column', value='test')
    print(df)

Input:

    df1_kwh
       a  b
    0  0  5
    1  1  6
    2  2  7
    3  3  8
    4  4  9
    df2np
        a   b
    0  10  15
    1  11  16
    2  12  17
    3  13  18
    4  14  19

Output:

         0 additional_column
    0   75              test
    1  107              test
    2  143              test
    3  183              test
    4  227              test