Psycopg2 not auto-generating id when using copy_from to load a CSV file into a Postgres db

I have a csv file that has several columns:

upc date quantity customer

In my physical table, I have an auto generating id column for each row:

id upc date quantity customer

It seems as though the db is interpreting the upc as the actual id when I run my Python script to copy the file into the table. I'm getting this error message:

Error: value "1111111" is out of range for type integer
CONTEXT:  COPY physical, line 1, column id: "1111111"

I've never attempted this before, but I believe this is correct:

def insert_csv(f, table):
    connection = get_postgres_connection()
    cursor = connection.cursor()
    try:
        cursor.copy_from(f, table, sep=',')
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()

Am I doing something wrong here, or do I have to create another script to get the last id from the table?

CodePudding user response：

You need to specify the columns to import. From the psycopg2 copy_from documentation:

columns – iterable with name of the columns to import. The length and types should match the content of the file to read. If not specified, it is assumed that the entire table matches the file structure.

Your code may look like this:

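import psycopg2

def insert_csv(f, table, columns):
    # get_postgres_connection() is the helper from the question
    connection = get_postgres_connection()
    cursor = connection.cursor()
    try:
        # columns names only the fields present in the file, so id is
        # left out and receives its default value
        cursor.copy_from(f, table, sep=',', columns=columns)
        connection.commit()
        return True
    except psycopg2.Error as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()

with open("path_to_my_csv") as file:
    insert_csv(file, "my_table", ("upc", "date", "quantity", "customer"))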

If you need to use copy_expert() instead, modify your function as follows:

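import psycopg2

def insert_csv(f, table, columns):
    # get_postgres_connection() is the helper from the question
    connection = get_postgres_connection()
    cursor = connection.cursor()
    try:
        # table and columns are interpolated as-is, so they must come
        # from trusted code, never from user input
        column_names = ','.join(columns)
        # COPY ... FROM STDIN is the documented form for loading data
        # sent by the client
        copy_cmd = f"copy {table}({column_names}) from stdin (format csv)"
        cursor.copy_expert(copy_cmd, f)
        connection.commit()
        return True
    except psycopg2.Error as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()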
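
Since the COPY command for copy_expert() is assembled as a string, it may help to quote the table and column names before interpolating them. The following is only a minimal sketch of that idea in plain Python: quote_ident and build_copy_sql are hypothetical helper names, not part of psycopg2, and the table name is assumed to be unqualified (no schema prefix). In real code, psycopg2's own psycopg2.sql module (sql.SQL and sql.Identifier) is probably the better tool for composing the statement.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    if "\x00" in name:
        raise ValueError("identifier must not contain NUL bytes")
    return '"' + name.replace('"', '""') + '"'


def build_copy_sql(table: str, columns) -> str:
    """Return a COPY ... FROM STDIN statement restricted to the given columns."""
    column_list = ", ".join(quote_ident(c) for c in columns)
    return f"COPY {quote_ident(table)} ({column_list}) FROM STDIN WITH (FORMAT csv)"


# Table and column names taken from the question.
copy_sql = build_copy_sql("physical", ["upc", "date", "quantity", "customer"])
print(copy_sql)
# COPY "physical" ("upc", "date", "quantity", "customer") FROM STDIN WITH (FORMAT csv)
```

The resulting string could then be passed as the first argument to cursor.copy_expert(). Note that quoted identifiers are case-sensitive in Postgres, so this assumes the names are written exactly as they were created (lowercase here).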

CodePudding user response：

From the PostgreSQL COPY documentation:

If a column list is specified, COPY TO copies only the data in the specified columns to the file. For COPY FROM, each field in the file is inserted, in order, into the specified column. Table columns not specified in the COPY FROM column list will receive their default values.

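So without a column list, the values in the CSV file are assigned to the table's columns left to right, starting with id, which is why the upc value ends up in the id column. With a column list, the columns you leave out get their DEFAULT values instead. For that, from the psycopg2 copy_from documentation: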

columns – iterable with name of the columns to import. The length and types should match the content of the file to read. If not specified, it is assumed that the entire table matches the file structure.

Create a list of columns that matches the file structure, leaving out the id column, which will then be filled with the sequence values.
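
If the CSV file happens to start with a header row (an assumption; the question does not say whether it does), the column list could be taken from the file itself instead of being hard-coded. A rough sketch, where read_header_columns is a hypothetical helper and io.StringIO stands in for the opened file:

```python
import csv
import io


def read_header_columns(f):
    """Consume the first line of f and return its column names as a tuple."""
    header = next(csv.reader([f.readline()]))
    return tuple(name.strip() for name in header)


# Stand-in for an opened CSV file; the header row is an assumption.
sample = io.StringIO("upc,date,quantity,customer\n1111111,2021-01-01,3,acme\n")
columns = read_header_columns(sample)
print(columns)  # ('upc', 'date', 'quantity', 'customer')

# The file object is now positioned after the header, so only data rows remain.
remaining = sample.read()
print(remaining)
```

Because the header line has already been consumed, passing the same file object on to copy_from together with columns=columns should load only the data rows, with id left to its default.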