I have 2 dataframes that look like this:
monthly_abscount = pd.DataFrame(df.groupby(df["Month"])["Abscount"].sum())
monthly_abscount["Abscount"] = monthly_abscount["Abscount"].astype(float)
monthly_abscount.head()
Abscount
Month
1 264.0
2 163.0
3 283.0
4 226.0
5 66.0
The month column is the index here.
workdays = workdays2 #creating new DF for working
workdays2["Month"]= workdays2["Month"].replace(['Jan 2022','Feb 2022','Mar 2022','Apr 2022','May 2022','Jun 2022','Jul 2022','Aug 2022','Sep 2022','Oct 2022','Nov 2022','Dec 2022'],['1','2','3','4','5','6','7','8','9','10','11','12'])
workdays2["Month"] = workdays2["Month"].astype(str)
workdays2["Workdays"] = workdays2["Workdays"].astype(float)
workdays2.head()
Month Workdays
0 1.0 21.0
1 2.0 20.0
2 3.0 23.0
3 4.0 21.0
4 5.0 22.0
I want to either merge them to eventually calculate Abscount/Workdays to get an average of the Abscount in every month, or I want to directly create a dataframe which would have a column Abscount/Workdays.
Here are the infos of both the dataframes:
workdays2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Month 5 non-null object
1 Workdays 5 non-null float64
dtypes: float64(1), object(1)
memory usage: 208.0 bytes
monthly_abscount.info()
<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, 1 to 5
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Abscount 5 non-null float64
dtypes: float64(1)
memory usage: 80.0 bytes
Here is the code that I have tried so far:
mergedpd = pd.merge(workdays2, monthly_abscount, on = 'Month', how = "inner")
print(mergedpd)
Empty DataFrame
Columns: [Month, Workdays, Abscount]
Index: []
The above code gives an empty dataframe.
What am I doing wrong here? Please help.
CodePudding user response:
Use:
#The reason of empty result.
df1=df1.reset_index()
#######
df2['Month']=df2['Month'].astype(float).astype(int)
df1['Month']=df1['Month'].astype(int)
out = df1.merge(df2, on='Month')
out['Abscount']=out['Abscount'].astype(float)
out['Workdays']=out['Workdays'].astype(float)
out['needed'] = out['Abscount']/out['Workdays']
Output:
index Month Abscount Workdays needed
0 0 1 264.0 21.0 12.571429
1 1 2 163.0 20.0 8.150000
2 2 3 283.0 23.0 12.304348
3 3 4 226.0 21.0 10.761905
4 4 5 66.0 22.0 3.000000