Home > Back-end >  Creating new dataframe from existing dataframes using a mathematical calculation
Creating new dataframe from existing dataframes using a mathematical calculation

Time:05-22

I have 2 dataframes that look like this:

monthly_abscount = pd.DataFrame(df.groupby(df["Month"])["Abscount"].sum())
monthly_abscount["Abscount"] = monthly_abscount["Abscount"].astype(float) 
monthly_abscount.head()

    Abscount
Month   
1   264.0
2   163.0
3   283.0
4   226.0
5   66.0

The month column is the index here.

workdays = workdays2 #creating new DF for working
workdays2["Month"]= workdays2["Month"].replace(['Jan 2022','Feb 2022','Mar 2022','Apr 2022','May 2022','Jun 2022','Jul 2022','Aug 2022','Sep 2022','Oct 2022','Nov 2022','Dec 2022'],['1','2','3','4','5','6','7','8','9','10','11','12'])
workdays2["Month"] = workdays2["Month"].astype(str)
workdays2["Workdays"] = workdays2["Workdays"].astype(float)
workdays2.head()

    Month   Workdays
0   1.0 21.0
1   2.0 20.0
2   3.0 23.0
3   4.0 21.0
4   5.0 22.0

I want to either merge them to eventually calculate Abscount/Workdays to get an average of the Abscount in every month, or I want to directly create a dataframe which would have a column Abscount/Workdays.

Here are the infos of both the dataframes:

workdays2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Month     5 non-null      object 
 1   Workdays  5 non-null      float64
dtypes: float64(1), object(1)
memory usage: 208.0  bytes

monthly_abscount.info()

<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, 1 to 5
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Abscount  5 non-null      float64
dtypes: float64(1)
memory usage: 80.0  bytes

Here is the code that I have tried so far:

mergedpd = pd.merge(workdays2, monthly_abscount, on = 'Month', how = "inner")
print(mergedpd)

Empty DataFrame
Columns: [Month, Workdays, Abscount]
Index: []

The above code gives an empty dataframe.

What am I doing wrong here? Please help.

CodePudding user response:

Use:

#The reason of empty result.
df1=df1.reset_index()
#######

df2['Month']=df2['Month'].astype(float).astype(int)
df1['Month']=df1['Month'].astype(int)
out = df1.merge(df2, on='Month')
out['Abscount']=out['Abscount'].astype(float)
out['Workdays']=out['Workdays'].astype(float)
out['needed'] = out['Abscount']/out['Workdays']

Output:

index   Month   Abscount    Workdays    needed
0   0   1   264.0   21.0    12.571429
1   1   2   163.0   20.0    8.150000
2   2   3   283.0   23.0    12.304348
3   3   4   226.0   21.0    10.761905
4   4   5   66.0    22.0    3.000000
  • Related