Home > OS >  Calculate average temperature/humidity between 2 dates pandas data frames
Calculate average temperature/humidity between 2 dates pandas data frames

Time:12-01

I have the following data frames: df3

Harvest_date Starting_date
2022-10-06 2022-08-06
2022-02-22 2021-12-22

df (I have all temp and humid starting from 2021-01-01 till the present)

date temp humid
2022-10-06 00:30:00 2 30
2022-10-06 00:01:00 1 30
2022-10-06 00:01:30 0 30
2022-10-06 00:02:00 0 30
2022-10-06 00:02:30 -2 30

I would like to calculate the avg temperature and humidity between the starting_date and harvest_date. I tried this:

import pandas as pd
df = pd.read_csv (r'C:\climate.csv')
df3 = pd.read_csv (r'C:\Flower_weight_Seson.csv')
df['date'] = pd.to_datetime(df.date)
df3['Harvest_date'] = pd.to_datetime(df3.Harvest_date)
df3['Starting_date'] = pd.to_datetime(df3.Starting_date)
df.style.format({"date": lambda t: t.strftime("%Y-%m-%d")})
df3.style.format({"Harvest_date": lambda t: t.strftime("%Y-%m-%d")})
df3.style.format({"Starting_date": lambda t: t.strftime("%Y-%m-%d")})
for harvest_date,starting_date in zip(df3['Harvest_date'],df3['Starting_date']):
    df3["Season avg temp"]= df[df["date"].between(starting_date,harvest_date)]["temp"].mean()
    df3["Season avg humid"]= df[df["date"].between(starting_date,harvest_date)]["humid"].mean()

I get the same value for all dates. Can someone point out what I did wrong, please?

CodePudding user response:

Use DataFrame.loc with match indices by means of another DataFrame:

#changed data for match with df3
print (df)
                  date  temp  humid
0  2022-10-06 00:30:00     2     30
1  2022-09-06 00:01:00     1     33
2  2022-09-06 00:01:30     0     23
3  2022-10-06 00:02:00     0     30
4  2022-01-06 00:02:30    -2     25


for i,harvest_date,starting_date in zip(df3.index,df3['Harvest_date'],df3['Starting_date']):
    mask = df["date"].between(starting_date,harvest_date)
    avg = df.loc[mask, ["temp",'humid']].mean()
    df3.loc[i, ["Season avg temp",'Season avg humid']] = avg.to_numpy()
print (df3)
  Harvest_date Starting_date  Season avg temp  Season avg humid
0   2022-10-06    2022-08-06              0.5              28.0
1   2022-02-22   2021-12-220             -2.0              25.0
  • Related