Home > Software design >  SQL to equivalent pandas
SQL to equivalent pandas

Time:06-16

I am trying to convert SQL query into equivalent python pandas. The SQL query is

select count(*),sum(days) into :_cnt_ML_2R, :_pd_QL_1R
from _gm_Qr_bfr_mnt
where x=1 and y=1 and input(code,8.) in (70001:73599)

Now I was trying to convert it into equivalent python pandas. For SQL select count(*) from _gm_Qr_bfr_mnt equivalent panda is pd.Series(_gm_Qr_bfr_mnt.shape[0]) and for SQL select sum(days) from _gm_Qr_bfr_mnt equivalent panda is pd.Series(__gm_Qr_bfr_mnt['days'].sum()) but I am unable to convert the INTO from SQL to pandas along with WHERE condition.

Need guidance on that how to covert this to the equivalent pandas.

CodePudding user response:

for a SQL like as follows

select available_date, count(), sum() from DF
where price > 500
group by available_date

the equivalent will be

df.query('Price > 500').groupby('Available_Since_Date')['Available_Quantity'].agg(['size', 'sum'])
Where clause:  df.query('Price > 500')
Group by: group by columns for getting count ('size), and sum ('sum')
product_name    Price   Final_Price Available_Quantity  Available_Since_Date
0   Keyboard    500.000     5.0     5.0                 11/5/2021
1   Mouse       NaN         NaN     9.0                 4/23/2021
2   Monitor     5000.235    10.0    6.0                 08/21/2021
3   CPU         NaN         NaN     NaN                 09/18/2021
4   CPU         10000.550   20.0    6.0                 09/18/2021
5   Speakers    250.500     8.0     5.0                 01/05/2021
6   NaT         NaN         NaN     8.0                 NaT
                      size  sum
Available_Since_Date        
08/21/2021              1   6.0
09/18/2021              1   6.0

CodePudding user response:

Check filter the df first

sub_df = df.loc[(df['x'].eq(1) & 
                df['y'].eq(1) & 
                df['input(code,8.)'].between(70001, 73599, inclusive="both"))]
_cnt_ML_2R = sub_df.shape[0]
:_pd_QL_1R = sub_df['days'].sum()
  • Related