I have 2 different databases and those are not linked. I need to filter the data from database B using values from database A.
So here is my table Database A, table filter_data
| Filter | Value |
| Commodity | CPU |
| Commodity | GPU |
Database B, table forecast_data
| Commodity | Value |
| CPU | 100 |
| GPU | 200 |
| HDD | 300 |
| SSD | 400 |
So what I wanted to achieve is something like this
SELECT * FROM forecast_data WHERE Commodity IN (SELECT Value FROM filter_data)
But since these are in different database and not possible to link, I need to "export" the commodity list from database A, form SQL statement and then execute it in database B. So above query becomes
SELECT * FROM forecast_data WHERE Commodity IN (%s)
But putting the exported data from database A may expose to SQL injection
My Python code, using psycopg2:
def select_command_postgres_no_argument(conn, sql):
with conn.cursor() as cur:
cur = conn.cursor()
cur.execute(sql)
return cur.fetchall()
def select_command_postgres_with_argument(conn, sql, sql_args = ()):
with conn.cursor() as cur:
cur = conn.cursor()
cur.execute(sql, sql_args)
return cur.fetchall()
# Retrieve the list of commodities
commodity_records = select_command_postgres_no_argument(postgres_conn, "SELECT Value FROM filter_data")
commodity = []
for row in commodity_records:
commodity.append(row[0])
commodity_text = ', '.join(f"'{w}'" for w in commodity)
src_sql = "SELECT * FROM forecast_data WHERE Commodity IN (%s)" % (commodity_text)
I'm afraid this may cause SQL injection
src_sql = "SELECT * FROM forecast_data WHERE Commodity IN (%s)" % (commodity_text)
How to replace this in a safe way?
CodePudding user response:
Since you are using psycopg2, you should use its mechanism for passing parameters to queries; that will take care of any necessary quoting for you. Any other DB library worth its salt should have a similar mechanism.
(In this case, you can either use the =ANY
construct with a list, or convert to a tuple and use IN
.)