Home > Software design >  Python pandas value error for code conversion
Python pandas value error for code conversion

Time:07-27

I am trying to convert SQL code into equivalent python code.

My SQL code is given below:

select count(*),sum(days) into :_cnt_D_DmRP_1D, :_pd_D_DmRP_1D
 from _fw_bfr_wgt
where pk=1 and type=1 and input(zipcode,8.) not in (700001:735999)
and input(zipcode,8.) not in (&cal_list.); 

I was trying to convert this code into python.

My python code is given below:

cnt_D_DmRP_1D,_pd_D_DmRP_1D=df.loc[(df['pk']==1) & (df['type']==1) & (df['zipcode'] not in(list(range(700001,(735999 1))))),'days'].agg(['size','sum']) & (df['zipcode'] not in(cal_list)),df.loc['days'].agg(['size','sum'])

Here df is the dataframe which is _fw_bfr_wgt in SQL code, that I have already created.

But I am getting an error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Unable to resolve this issue as I am a beginner in python. Need help and suggestion to fix this error.

CodePudding user response:

Problem is in df['zipcode'] not in (list(range(700001,(735999 1)))), you can use Series in in operator

>>> pd.Series(range(3)) in list(range(10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "~/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 1527, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Instead you can use Series.isin

~df['zipcode'].isin(list(range(700001,(735999 1))))

CodePudding user response:

I separated the lines to facilitate the explanation:

cnt_D_DmRP_1D,_pd_D_DmRP_1D=df.loc[(df['pk']==1)
& (df['type']==1) 
& (df['zipcode'] not in(list(range(700001,(735999 1))))),'days'].agg(['size','sum']) 
& (df['zipcode'] not in(cal_list)),df.loc['days'].agg(['size','sum'])

The problem is in the zipcode comparisons.
In binary comparisons and therefore in the df.loc function, we cannot use the 'in' logic. in this case, it will look for the object as a whole within the list and not each of the items.

So to do what you want we use .isin() and in this case '~' for NOT isin()

This way:

cnt_D_DmRP_1D,_pd_D_DmRP_1D=df.loc[(df['pk']==1)
& (df['type']==1) 
& (~df['zipcode'].isin(list(range(700001,(735999 1))))),'days'].agg(['size','sum']) 
& (~df['zipcode'].isin(cal_list)),df.loc['days'].agg(['size','sum'])
  • Related