I'd like to create some new columns from calculations on the values in each row.
For example, given this input:
import numpy as np
import pandas as pd

data = {"c1": [10], "c2": [20], "c3": [30], "c4": [40], "c5": [50], "c6": [10]}
df = pd.DataFrame(data=data)
Let us say we take the values from columns c2 to c6 of a row, so series = [20, 30, 40, 50, 10]:
new_column1 = np.mean(series[0:2])  # np.mean([20, 30]) = 25
new_column2 = np.mean(series[2:4])  # np.mean([40, 50]) = 45
new_column3 = new_column1 + new_column2  # 70
Expected output:
   c1  c2  c3  c4  c5  c6  new_column1  new_column2  new_column3
0  10  20  30  40  50  10           25           45           70
I am looking for an efficient way to do this (a list comprehension or apply?) instead of using iterrows.
CodePudding user response:
Looks like you want:
df['new_column1'] = df.loc[:, 'c2':'c3'].mean(axis=1)
df['new_column2'] = df.loc[:, 'c4':'c5'].mean(axis=1)
df['new_column3'] = df[['new_column1', 'new_column2']].sum(axis=1)
print(df)
Output:
   c1  c2  c3  c4  c5  c6  new_column1  new_column2  new_column3
0  10  20  30  40  50  10         25.0         45.0         70.0
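If you would rather slice by position, as the question does with series[0:2] and series[2:4], something along these lines should give the same numbers. Keep in mind that .loc label slices include both endpoints, while .iloc slices exclude the stop position. This is only a sketch: the names series and result are illustrative, and it assumes c2 to c6 sit at column positions 1 to 5.

```python
import pandas as pd

data = {"c1": [10], "c2": [20], "c3": [30], "c4": [40], "c5": [50], "c6": [10]}
df = pd.DataFrame(data=data)

# Positional view of c2..c6 (assumed to be column positions 1 through 5),
# mirroring the `series` used in the question.
series = df.iloc[:, 1:6]

# Row-wise means over positional slices; no iterrows or apply needed.
result = df.assign(
    new_column1=series.iloc[:, 0:2].mean(axis=1),  # mean of c2, c3
    new_column2=series.iloc[:, 2:4].mean(axis=1),  # mean of c4, c5
)

# Plain column addition is vectorized across all rows.
result["new_column3"] = result["new_column1"] + result["new_column2"]
print(result)
```

Both versions work on whole columns at once, so they extend to any number of rows without an explicit loop.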