Pass list of dates to SQL WHERE statement in PySpark

Time:01-21

I'm in the process of converting some SAS code to PySpark. The SAS version used a macro variable in the WHERE clause. In PySpark, I'm trying to pass a list of dates to the WHERE clause instead, but I keep getting errors. I want the SQL query to pull all data from those 3 months. Any pointers?

month_list = ['202107', '202108', '202109'] 

sql_query = """ (SELECT *                   
                FROM Table_Blah                  
                WHERE (to_char(DateVariable,'yyyymm') IN '{}')                  
                ) as table1""".format(month_list)

CodePudding user response:

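Pass the list as a tuple so that it renders with valid SQL syntax. Formatting a Python list produces square brackets, ['202107', '202108', '202109'], whereas a tuple produces parentheses, ('202107', '202108', '202109'), which is what IN expects: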

month_list = ['202107', '202108', '202109'] 

sql_query = """ (SELECT *                   
                FROM Table_Blah                  
                WHERE (to_char(DateVariable,'yyyymm') IN {})                  
                ) as table1""".format(tuple(month_list))

You also don’t need the quotes around the {} placeholder in the IN clause; the tuple's string form already quotes each value.
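
One caveat with the tuple approach: a one-element tuple is rendered by Python as ('202107',), and the trailing comma is not valid SQL. If the list might ever hold a single month, building the IN list explicitly avoids that. The following is a minimal sketch, not part of the original answer; build_in_clause is a hypothetical helper name, and it assumes the values are plain strings that you control rather than untrusted user input:

```python
def build_in_clause(values):
    """Render values as a SQL IN list, e.g. ('202107', '202108').

    Hypothetical helper for illustration; assumes trusted string values.
    """
    if not values:
        raise ValueError("values must not be empty")
    # Quote each value and double any embedded single quotes.
    quoted = ", ".join("'{}'".format(str(v).replace("'", "''")) for v in values)
    return "({})".format(quoted)


month_list = ['202107', '202108', '202109']

sql_query = """(SELECT *
                FROM Table_Blah
                WHERE to_char(DateVariable, 'yyyymm') IN {}
               ) as table1""".format(build_in_clause(month_list))

print(sql_query)
```

This yields the same IN list as tuple(month_list) for three months, and still produces a valid list, ('202107'), when only one month is passed.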
