have list in python that is used throughout my code:
pylist = ['A', 'B', 'C', 'D']
I also have have a simple spark.sql()
line that I need to execute:
query = spark.sql(
"""
SELECT col1, col2, col3
FROM database.table
WHERE col3 IN ('A', 'B', 'C', 'D')
"""
)
I want to replace the list of elements in the spark.sql()
statment with the python list so that that last line in the SQL is
...
AND col3 IN pylist
I am aware of using {}
and str.format
but I am struggling to understand if that's the correct option here, and how that workd.
CodePudding user response:
Okay, after some more research, I think I found the solution using .format(tuple(pylist))
The statement now is:
pylist = ['A', 'B', 'C', 'D']
s = """
SELECT col1, col2, col3
FROM database.table
WHERE col3 IN {}
""".format(tuple(pylist))
query = spark.sql(s)
CodePudding user response:
Convert to tuple to get "()" instead of "[]" while converting to string
sql_str="SELECT col1, col2, col3 FROM database.table WHERE col3 IN " str(tuple(pylist))
query = spark.sql(sql_str)