Calculate the average of two consecutive rows based on multiple conditions

Time:11-08

Say I have the following df:

Location Class Value Type
1 A1 2 WD
1 A2 4 WD
1 B 10 WD
1 A1 4 WE
1 A2 6 WE
1 B 20 WE
2 A1 4 WD
2 A2 6 WD
2 B 15 WD
2 A1 6 WE
2 A2 8 WE
2 B 25 WE

I want to calculate the average of A1 and A2 at each location and for each type, as follows:

Location Class Value Type
1 A12 3 WD
1 B 10 WD
1 A12 5 WE
1 B 20 WE
2 A12 5 WD
2 B 15 WD
2 A12 7 WE
2 B 25 WE

I have solved this by making two different dfs and concatenating them, but I was wondering whether there is a more straightforward solution:

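import pandas as pd

a_list = ['A1', 'A2']
# Average (not sum) the A1/A2 rows per location and type, then label them A12
df1 = df.loc[df['Class'].isin(a_list)]
df1 = df1.groupby(['Location', 'Type'], as_index=False).agg({'Value': 'mean'})
df1['Class'] = 'A12'

# Keep the B rows unchanged
df2 = df.loc[df['Class'] == 'B']

df4 = pd.concat([df1, df2], ignore_index=True)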

CodePudding user response:

This gives you what you are looking for:

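import pandas as pd

df = pd.read_clipboard()  # copy the table from the question first
# Relabel A1 and A2 as A12 in place, then average within each group
df.loc[df["Class"].isin(['A1', 'A2']), "Class"] = 'A12'
df.groupby(["Location", "Class", "Type"]).mean().reset_index()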
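For anyone who wants to try this without copying the table to the clipboard, here is a minimal self-contained sketch of the same relabel-then-group idea. It rebuilds the sample data from the question by hand and uses `Series.replace` on a copy instead of assigning through `.loc`; the names `merged` and `result` are just illustrative.

```python
import pandas as pd

# Rebuild the question's sample data so the example runs without a clipboard.
df = pd.DataFrame({
    "Location": [1] * 6 + [2] * 6,
    "Class": ["A1", "A2", "B"] * 4,
    "Value": [2, 4, 10, 4, 6, 20, 4, 6, 15, 6, 8, 25],
    "Type": ["WD"] * 3 + ["WE"] * 3 + ["WD"] * 3 + ["WE"] * 3,
})

# Relabel A1 and A2 as A12 on a copy (df itself is left untouched),
# then average Value within each Location/Class/Type group.
merged = df.assign(Class=df["Class"].replace({"A1": "A12", "A2": "A12"}))
result = (
    merged.groupby(["Location", "Class", "Type"], as_index=False)["Value"]
    .mean()
)
print(result)
```

Note that `groupby` sorts by the group keys, so the rows come out ordered by Location, then Class, then Type, which differs slightly from the row order shown in the question.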

CodePudding user response:

Where column Class contains 'A', make it 'A12', then group by and aggregate as required:

df = df.assign(Class=np.where(df['Class'].str.contains('A'), 'A12', df['Class']))
df = df.groupby(['Location', 'Class', 'Type']).agg('mean').reset_index()

    Location Class Type  Value
0         1   A12   WD      3
1         1   A12   WE      5
2         1     B   WD     10
3         1     B   WE     20
4         2   A12   WD      5
5         2   A12   WE      7
6         2     B   WD     15
7         2     B   WE     25