Pandas: finding mean in the dataframe based on condition included in another-CodePudding

I have two dataframes, just like below.

Dataframe1:

country	type	start_week	end_week
1	a	12	13
2	b	13	14

Dataframe2:

country	type	week	value
1	a	12	1000
1	a	13	900
1	a	14	800
2	b	12	1000
2	b	13	900
2	b	14	800

I want to add to the first dataframe column with the mean value from the second dataframe for key (country type) and between start_week and end_week.

I want desired output to look like the below:

country	type	start_week	end_week	avg
1	a	12	13	950
2	b	13	14	850

CodePudding user response：

here is one way :

combined = df1.merge(df2 , on =['country','type'])
combined = combined.loc[(combined.start_week <= combined.week) & (combined.week <= combined.end_week)]
output = combined.groupby(['country','type','start_week','end_week'])['value'].mean().reset_index()

output:

>>
   country type  start_week  end_week  value
0        1    a          12        13  950.0
1        2    b          13        14  850.0

CodePudding user response：

You can use pd.melt and comparison of numpy arrays.

# melt df1
melted_df1 = df1.melt(id_vars=['country','type'],value_name='week')[['country','type','week']]

# for loop to compare two dataframe arrays
result = []
for i in df2.values:
    for j in melted_df1.values:
        if (j == i[:3]).all():
            result.append(i)
            break

# Computing mean of the result dataframe
result_df = pd.DataFrame(result,columns=df2.columns).groupby('type').mean().reset_index()['value']

# Assigning result_df to df1
df1['avg'] = result_df

   country type  start_week  end_week    avg
0        1    a          12        13  950.0
1        2    b          13        14  850.0