I have a dataframe like:
| x  | y  | z  | values |
| x0 | y0 | z0 | areal  |
| x1 | y1 | z1 | aimag  |
| x2 | y2 | z2 | breal  |
| x3 | y3 | z3 | bimag  |
I want to produce:
| x | y | z | values |
| sqrt(x0^2 + x1^2) | sqrt(y0^2 + y1^2) | sqrt(z0^2 + z1^2) | a |
| sqrt(x2^2 + x3^2) | sqrt(y2^2 + y3^2) | sqrt(z2^2 + z3^2) | b |
Is there a better way to do this than nested for loops?
CodePudding user response:
You could use groupby + agg:
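import numpy as np
# Group key: the label with its trailing 'real'/'imag' suffix removed
group = df['values'].str.extract(r'(.*)(?:real|imag)', expand=False)
# 0 a
# 1 a
# 2 b
# 3 b
# Per group, take the square root of the sum of squares of each numeric column
(df.select_dtypes('number')
   .groupby(group)
   .agg(lambda x: np.sqrt((x**2).sum()))
   .reset_index()
)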
example input:
x y z values
0 1 2 3 areal
1 4 5 6 aimag
2 7 8 9 breal
3 10 11 12 bimag
output:
values x y z
0 a 4.123106 5.385165 6.708204
1 b 12.206556 13.601471 15.000000
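For anyone who wants to reproduce the numbers above, here is a minimal end-to-end sketch. It assumes the example input is built by hand with pd.DataFrame (the construction and the name `out` are not part of the answer, only an illustration):

```python
import numpy as np
import pandas as pd

# Example input from the answer, built by hand (assumed construction)
df = pd.DataFrame({
    'x': [1, 4, 7, 10],
    'y': [2, 5, 8, 11],
    'z': [3, 6, 9, 12],
    'values': ['areal', 'aimag', 'breal', 'bimag'],
})

# Group key: label without the 'real'/'imag' suffix
group = df['values'].str.extract(r'(.*)(?:real|imag)', expand=False)

# Square root of the sum of squares, per group and per numeric column
out = (df.select_dtypes('number')
         .groupby(group)
         .agg(lambda x: np.sqrt((x**2).sum()))
         .reset_index())
print(out)
```

Because the rows are grouped by label rather than by position, this should also work when the real and imag rows of a pair are not adjacent.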
CodePudding user response:
You can use the shift method to pair each row with the one before it.
import io
import numpy as np
import pandas as pd
# Create a sample dataframe
s='''x,y,z,values
1,5,9,areal
2,6,10,aimag
3,7,11,breal
4,8,12,bimag'''
df = pd.read_csv(io.StringIO(s))
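# Calculate values: combine each row with the one above it, sqrt(previous^2 + current^2)
df_calc = np.sqrt(df[['x', 'y', 'z']].shift() ** 2 + df[['x', 'y', 'z']] ** 2)
# Keep only the odd rows; this assumes the rows strictly alternate real, imag
df_calc = df_calc[df_calc.index % 2 == 1].reset_index(drop=True)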
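The snippet above leaves out the values column that the question asks for. Below is a rough sketch of the same shift idea that also re-attaches the label. The names `cols`, `paired`, `odd`, and `result`, and the suffix-stripping step, are my own additions for illustration. It still assumes the rows strictly alternate real, imag.

```python
import io
import numpy as np
import pandas as pd

s = '''x,y,z,values
1,5,9,areal
2,6,10,aimag
3,7,11,breal
4,8,12,bimag'''
df = pd.read_csv(io.StringIO(s))

cols = ['x', 'y', 'z']
# Row i combined with row i-1; the first row has no predecessor and becomes NaN
paired = np.sqrt(df[cols].shift() ** 2 + df[cols] ** 2)

# Odd positions hold the combined (real, imag) pairs
odd = df.index % 2 == 1
result = paired[odd].reset_index(drop=True)

# Assumed extra step: recover the label by dropping the 'real'/'imag' suffix
result['values'] = (df.loc[odd, 'values']
                      .str.replace(r'(real|imag)$', '', regex=True)
                      .reset_index(drop=True))
print(result)
```

If the row order is not guaranteed, grouping by the label is the safer choice, since the shift approach relies purely on position.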