I am trying to convert SQL query into equivalent python pandas. The SQL query is
select count(*),sum(days) into :_cnt_ML_2R, :_pd_QL_1R
from _gm_Qr_bfr_mnt
where x=1 and y=1 and input(code,8.) in (70001:73599)
Now I was trying to convert it into equivalent python pandas.
For SQL select count(*) from _gm_Qr_bfr_mnt
equivalent panda is pd.Series(_gm_Qr_bfr_mnt.shape[0])
and for SQL select sum(days) from _gm_Qr_bfr_mnt
equivalent panda is pd.Series(__gm_Qr_bfr_mnt['days'].sum())
but I am unable to convert the INTO
from SQL to pandas along with WHERE
condition.
Need guidance on that how to covert this to the equivalent pandas.
CodePudding user response:
for a SQL like as follows
select available_date, count(), sum() from DF
where price > 500
group by available_date
the equivalent will be
df.query('Price > 500').groupby('Available_Since_Date')['Available_Quantity'].agg(['size', 'sum'])
Where clause: df.query('Price > 500')
Group by: group by columns for getting count ('size), and sum ('sum')
product_name Price Final_Price Available_Quantity Available_Since_Date
0 Keyboard 500.000 5.0 5.0 11/5/2021
1 Mouse NaN NaN 9.0 4/23/2021
2 Monitor 5000.235 10.0 6.0 08/21/2021
3 CPU NaN NaN NaN 09/18/2021
4 CPU 10000.550 20.0 6.0 09/18/2021
5 Speakers 250.500 8.0 5.0 01/05/2021
6 NaT NaN NaN 8.0 NaT
size sum
Available_Since_Date
08/21/2021 1 6.0
09/18/2021 1 6.0
CodePudding user response:
Check filter the df first
sub_df = df.loc[(df['x'].eq(1) &
df['y'].eq(1) &
df['input(code,8.)'].between(70001, 73599, inclusive="both"))]
_cnt_ML_2R = sub_df.shape[0]
:_pd_QL_1R = sub_df['days'].sum()