Build a dataframe based on new dataframe

Time:04-30

I have a data frame with 9 columns: an id column plus 8 value columns (my real data is much bigger). I want to take the value columns in groups of 4 and build a new dataframe with 2 columns, each holding the row-wise sum of one group of 4 columns. I also want to keep the id column. Here is a simple example:

import pandas as pd
df = pd.DataFrame()
df['id'] = [1, 2, 3, 4]
df['a'] = [10, 0, 1, 3]
df['b'] = [-10, 0, 2, 2]
df['c'] = [0, 1, 3, 3]
df['d'] = [0, 0, 4, 4]
df['e'] = [10, 0, 1, 3]
df['f'] = [10, 0, 2, 2]
df['g'] = [0, -1, 0, 0]
df['h'] = [0, 0, 0, 0]
df


CodePudding user response:

You can reshape the underlying numpy array so that each group of 4 columns gets its own axis, then sum over that axis:

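# Work on the value columns only; `a` has shape (n_rows, 8)
a = df.drop(columns='id').to_numpy()
# Reshape to (n_rows, 2 groups, 4 columns per group) and sum over the last axis.
# The last dimension is the group size (4), not the number of rows.
df2 = pd.DataFrame(a.reshape((len(df), 2, 4)).sum(axis=2),
                   columns=['value1', 'value2'],
                   index=df['id']).reset_index()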

output:

   id  value1  value2
0   1       0      20
1   2       1      -1
2   3      10       3
3   4      12       5
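
Since the real data is bigger, it may help to wrap the reshape in a small function that works for any number of rows and any number of column groups. The following is only a sketch: the function name `sum_column_groups` and its parameters are made up for illustration, and it assumes the value columns are numeric, stored in group order, and that their count is a multiple of the group size.

```python
import pandas as pd


def sum_column_groups(df, group_size, id_col="id"):
    """Sum consecutive groups of `group_size` value columns, keeping `id_col`.

    Hypothetical helper, not part of pandas.
    """
    values = df.drop(columns=id_col).to_numpy()
    n_rows, n_cols = values.shape
    if n_cols % group_size != 0:
        raise ValueError("number of value columns must be a multiple of group_size")
    n_groups = n_cols // group_size
    # Shape (rows, groups, group_size): the last axis holds one group's columns
    sums = values.reshape(n_rows, n_groups, group_size).sum(axis=2)
    names = [f"value{i + 1}" for i in range(n_groups)]
    return pd.DataFrame(sums, columns=names, index=df[id_col]).reset_index()


# Same example data as in the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "a": [10, 0, 1, 3],
    "b": [-10, 0, 2, 2],
    "c": [0, 1, 3, 3],
    "d": [0, 0, 4, 4],
    "e": [10, 0, 1, 3],
    "f": [10, 0, 2, 2],
    "g": [0, -1, 0, 0],
    "h": [0, 0, 0, 0],
})

result = sum_column_groups(df, group_size=4)
print(result)
```

Because the row count and the group size are passed separately, the same call also works on a subset such as `df.head(3)`, where the number of rows no longer happens to equal the group size.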