Perform operations on a dataframe from groupings by ID-CodePudding

I have the following dataframe in Python:

ID	maths	value
0	add	12
1	sub	30
0	add	10
2	mult	3
0	sub	10
1	add	11
3	sub	40
2	add	21

My idea is to perform the following operations to get the result I want:

First step: Group the rows of the dataframe by ID. The order of the groups shall be indicated by the order of the original dataframe.

ID	maths	value
0	add	12
0	add	10
0	sub	10
1	sub	30
1	add	11
2	mult	3
2	add	21
3	sub	40

Second step: For each group created: Create a value for a new column 'result' where a mathematical operation indicated by the previous column of 'maths' is performed. If there is no previous row for the group, this column would have the value NaN.

ID	maths	value	result
0	add	12	NaN
0	add	10	22
0	sub	10	20
1	sub	30	NaN
1	add	11	19
2	mult	3	NaN
2	add	21	63
3	sub	40	NaN

Third step: Return the resulting dataframe.

I have tried to realise this code by making use of the pandas groupby method. But I have problems to iterate with conditions for each row and each group, and I don't know how to create the new column 'result' on a groupby object.

grouped_df = testing.groupby('ID')
  for key, item in grouped_df:
    print(grouped_df.get_group(key))

I don't know whether to use orderby or groupby or some other method that works for what I want to do. If you can help me with a better idea, I'd appreciate it.

CodePudding user response：

ID = list("00011223")
maths = ["add","add","sub","sub","add","mult","add","sub"]
value = [12,10,10,30,11,3,21,40]
import pandas as pd
df = pd.DataFrame(list(zip(ID,maths,value)),columns = ["ID","Maths","Value"])
df["Maths"] = df.groupby(["ID"]).pipe(lambda df:df.Maths.shift(1)).fillna("add")
df["Value1"] = df.groupby(["ID"]).pipe(lambda df:df.Value.shift(1))   
df["result"] = df.groupby(["Maths"]).pipe(lambda x:(x.get_group("add")["Value1"]   x.get_group("add")["Value"]).append(
    x.get_group("sub")["Value1"] - x.get_group("sub")["Value"]).append(
        x.get_group("mult")["Value1"] * x.get_group("mult")["Value"])).sort_index()

Here is the Output:

df
Out[168]: 
  ID Maths  Value  Value1  result
0  0   add     12     NaN     NaN
1  0   add     10    12.0    22.0
2  0   add     10    10.0    20.0
3  1   add     30     NaN     NaN
4  1   sub     11    30.0    19.0
5  2   add      3     NaN     NaN
6  2  mult     21     3.0    63.0
7  3   add     40     NaN     NaN