I use psycopg2 to call PostgreSQL COPY
command from my programs written in Python, and so far I have used copy_from
function in that package to achieve this. As I understand, in recent versions qualified names for tables, that is, scheman_name.table_name
are no more authorized while using copy_from
for security reasons (SQL injection). According to what I read on their Github page, apparently copy_expert is henceforth the way to go for qualified names. I tried to adapt my program accordingly but I get a syntax error. Here I provide a test case.
Let's say that this is my table (assuming that dev_schema
is an already existing schema):
create table dev_schema.testtab(numval integer not null);
And our data file to import into this table: Data.txt
100
120
200
800
500
To import the above file into the table directly via psql
, all I have to do is running the following command in psql
prompt:
\copy dev_schema.testtab from 'D:/Data.txt' with (format CSV, NULL '', delimiter '|', quote '"');
My purpose was to translate the above into a Python program using psycopg2.copy_expert
.
And here is my program:
import psycopg2
from psycopg2 import sql
def main():
connection = psycopg2.connect(
dbname="testdb",
user="dev_user",
password="Here I write my password",
host="localhost",
port=5432
)
#
# Create a cursor
cursor = connection.cursor()
#
#
# Importing the data file
filepath = "D:/Data.txt"
with open(
file=filepath,
mode="r",
encoding="UTF-8"
) as file_desc:
option_values = [
"format CSV",
"NULL ''",
"delimiter '|'",
"quote '\"'"
]
copy_options = sql.SQL(', ').join(
sql.Identifier(n) for n in option_values
)
cursor.copy_expert(
sql=sql.SQL(
"copy {} from stdin with ({})"
).format(
sql.Identifier("dev_schema", "testtab"),
copy_options
),
file=file_desc
)
cursor.close()
connection.close()
print("done")
if __name__ == "__main__":
main()
But when I run this program I get the following error message:
cursor.copy_expert(
psycopg2.errors.SyntaxError: ERROR: option « format CSV » not recognized
LINE 1: copy "dev_schema"."testtab" from stdin with ("format CSV", "...
But I don't see what's the problem with CSV
option as it works pretty well when I use directly \copy
in psql
prompt.
Could you kindly make some clarification? Thanks in advance
EDIT: Based on Adrian Klaver's comment I changed my code accordingly by including directly as a string all options at the end but now I get a new error message:
psycopg2.errors.SyntaxError: ERROR : Syntax error on or near 'dev_schema'
LINE 1: copy Identifier('dev_schema', 'testtab') from stdin with (fo...
Here is the new version of my code:
import psycopg2
from psycopg2 import sql
def main():
connection = psycopg2.connect(
dbname="testdb",
user="dev_user",
password="Here I write my password",
host="localhost",
port=5432
)
#
# Create a cursor
cursor = connection.cursor()
#
#
# Importing the data file
filepath = "D:/Data.txt"
with open(
file=filepath,
mode="r",
encoding="UTF-8"
) as file_desc:
option_values = "".join(
[
"format CSV, ",
"NULL '', ",
"delimiter '|', ",
"quote '\"',"
]
)
cursor.copy_expert(
sql=sql.SQL("".join(
[
"copy {} from stdin with (",
option_values,
")"
]).format(sql.Identifier("dev_schema", "testtab"))
),
file=file_desc
)
cursor.close()
connection.close()
print("done")
if __name__ == "__main__":
main()
CodePudding user response:
General tip. Assign a composed query to a variable and print it to see what is wrong.
copy_cmd = sql.SQL(
"copy {} from stdin with ({})"
).format(
sql.Identifier("dev_schema", "testtab"),
copy_options
)
print(copy_cmd.as_string(connection))
Of course, copy_options
should be built as plain text, not a list of identifiers:
with open(
file=filepath,
mode="r",
encoding="UTF-8"
) as file_desc:
option_values = [
"format CSV",
"NULL ''",
"delimiter '|'",
"quote '\"'"
]
copy_options = sql.SQL(', '.join(
n for n in option_values)
)
copy_cmd = sql.SQL(
"copy {} from stdin with ({})"
).format(
sql.Identifier("dev_schema", "testtab"),
copy_options
)
# print(copy_cmd.as_string(connection))
cursor.copy_expert(
sql=copy_cmd,
file=file_desc
)
connection.commit() # !!!
connection.close()
Do not forget to commit!