Assigning pandasql output to new column in DataFrame

Time:05-16

I'm using pandasql to query data from df1. Can I assign the output of the query to a new column in df2? I tried df2['grade'] = ps.sqldf(sqlcode, locals()), but this didn't work, which I expected because the query output is a DataFrame rather than a Series. Is there a way to do it? Thank you in advance!

import pandas as pd
import pandasql as ps

df1 = pd.DataFrame({"min": [10, 10, 21],
                    "max": [20, 20, 30],
                    "grade": ['low', 'medium', 'high'],
                    "class": ['english', 'math', 'english']})
df2 = pd.DataFrame({"score": [15, 16, 25],
                    "class": ['english', 'math', 'english']})


sqlcode = '''
select
df1.grade

from df2 
inner join df1 
on df2.score between df1.min and df1.max and df1.class = df2.class
'''

newdf = ps.sqldf(sqlcode,locals())
newdf

Answer:

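There's no need to assign a new column. You can get the desired output directly by tweaking your SQL query a little: select all of df2's columns alongside df1.grade, and use a left join so that every row of df2 is kept even when no grade range matches: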

select df2.*, df1.grade -- Notice the change
from df2 
left join df1 -- Notice the change
on (df2.score between df1.min and df1.max) and (df1.class = df2.class)

   score    class   grade
0     15  english     low
1     16     math  medium
2     25  english    high
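
If you do want the grade written into df2 itself, one option is to take the grade column out of the query result and assign it by position. The sketch below is only an illustration under a few assumptions: it uses the standard-library sqlite3 module with pandas' to_sql and read_sql_query as a stand-in for pandasql (which runs on SQLite by default), and it adds a hypothetical row_id column so the result order can be pinned with ORDER BY. It also assumes each score falls into at most one range per class, so the result has exactly one row per row of df2.

```python
import sqlite3

import pandas as pd

df1 = pd.DataFrame({"min": [10, 10, 21],
                    "max": [20, 20, 30],
                    "grade": ["low", "medium", "high"],
                    "class": ["english", "math", "english"]})
df2 = pd.DataFrame({"score": [15, 16, 25],
                    "class": ["english", "math", "english"]})

# row_id is a hypothetical helper column, not part of the original data;
# it lets the query return rows in df2's original order.
query = """
select df2.row_id, df1.grade
from df2
left join df1
on (df2.score between df1.min and df1.max) and (df1.class = df2.class)
order by df2.row_id
"""

conn = sqlite3.connect(":memory:")
try:
    # Copy both frames into an in-memory SQLite database.
    df1.to_sql("df1", conn, index=False)
    df2.rename_axis("row_id").reset_index().to_sql("df2", conn, index=False)
    result = pd.read_sql_query(query, conn)
finally:
    conn.close()

# The query returns a DataFrame, so pick the single column and assign
# its values by position. This is only safe when the result has
# exactly one row per row of df2.
if len(result) != len(df2):
    raise ValueError("query result does not line up with df2")
df2["grade"] = result["grade"].to_numpy()

print(df2)
```

The same idea should carry over to the DataFrame returned by ps.sqldf: select the grade column from it rather than assigning the whole frame. Note that the inner join from the question drops unmatched rows, which would break the positional alignment, so the left join is the safer choice here.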