Calculate the average of two consecutive rows based on multiple conditions

Say I have the following df:

Location	Class	Value	Type
1	A1	2	WD
1	A2	4	WD
1	B	10	WD
1	A1	4	WE
1	A2	6	WE
1	B	20	WE
2	A1	4	WD
2	A2	6	WD
2	B	15	WD
2	A1	6	WE
2	A2	8	WE
2	B	25	WE

and I want to calculate the average of A1 and A2 at each location and for each type, as follows:

Location	Class	Value	Type
1	A12	3	WD
1	B	10	WD
1	A12	5	WE
1	B	20	WE
2	A12	5	WD
2	B	15	WD
2	A12	7	WE
2	B	25	WE

I have solved this by building two separate DataFrames and concatenating them, but I was wondering if there is a more straightforward solution:

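import pandas as pd

a_list = ['A1', 'A2']
# average (not sum) A1 and A2 per location and type, then label the combined rows
df1 = df.loc[df['Class'].isin(a_list)]
df1 = df1.groupby(['Location', 'Type'], as_index=False).agg({'Value': 'mean'})
df1['Class'] = 'A12'

df2 = df.loc[df['Class'] == 'B']

df4 = pd.concat([df1, df2])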

CodePudding user response：

This gives you what you are looking for:

df = pd.read_clipboard()
df.loc[df["Class"].isin(['A1', 'A2']), "Class"] = 'A12'
df.groupby(["Location", "Class", "Type"]).mean().reset_index()
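
For a version that runs without the clipboard, here is a minimal sketch that builds the sample frame inline and relabels with Series.replace instead of modifying df in place. The names merged and result are my own, not from the answer above.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from the clipboard
df = pd.DataFrame({
    "Location": [1] * 6 + [2] * 6,
    "Class": ["A1", "A2", "B"] * 4,
    "Value": [2, 4, 10, 4, 6, 20, 4, 6, 15, 6, 8, 25],
    "Type": (["WD"] * 3 + ["WE"] * 3) * 2,
})

# Relabel A1 and A2 as A12 without touching the original frame
merged = df["Class"].replace({"A1": "A12", "A2": "A12"})

# Average Value within each Location / Class / Type group
result = (
    df.assign(Class=merged)
      .groupby(["Location", "Class", "Type"], as_index=False)["Value"]
      .mean()
)
print(result)
```

Because the mean of an integer column is a float, Value comes back as 3.0, 5.0 and so on; cast with astype(int) only if you are sure the averages are whole numbers.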

CodePudding user response：

Where the Class column contains 'A', set it to 'A12', then group by and aggregate as required:

df = (
    df.assign(Class=np.where(df['Class'].str.contains('A'), 'A12', df['Class']))
      .groupby(['Location', 'Class', 'Type'])
      .agg('mean')
      .reset_index()
)

    Location Class Type  Value
0         1   A12   WD      3
1         1   A12   WE      5
2         1     B   WD     10
3         1     B   WE     20
4         2   A12   WD      5
5         2   A12   WE      7
6         2     B   WD     15
7         2     B   WE     25
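
If the row and column order of the desired output matters, a sketch along the following lines may help. It assumes that only the exact labels A1 and A2 should be combined (isin rather than str.contains, which would also catch any other label containing an A), and it passes sort=False so the groups stay in order of first appearance. The names new_class and out are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Location": [1] * 6 + [2] * 6,
    "Class": ["A1", "A2", "B"] * 4,
    "Value": [2, 4, 10, 4, 6, 20, 4, 6, 15, 6, 8, 25],
    "Type": (["WD"] * 3 + ["WE"] * 3) * 2,
})

# Exact match on the two labels; every other class keeps its name
new_class = np.where(df["Class"].isin(["A1", "A2"]), "A12", df["Class"])

out = (
    df.assign(Class=new_class)
      # sort=False keeps groups in the order they first appear in df
      .groupby(["Location", "Type", "Class"], sort=False, as_index=False)["Value"]
      .mean()
      # restore the column order used in the question
      .loc[:, ["Location", "Class", "Value", "Type"]]
)
print(out)
```

With the sample data this should give the rows in the same order as the desired table in the question (A12 then B within each Location and Type), with Value as floats.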
