Home > Blockchain >  COPY with FORCE NULL to all fields
COPY with FORCE NULL to all fields

Time:07-23

I have several CSVs with varying field names that I am copying into a Postgres database from an s3 data source. There are quite a few of them that contain empty strings, "" which I would like to convert to NULLs at import. When I attempt to copy I get an error along the lines of this (same issue for other data types, integer, etc.):

psycopg2.errors.InvalidDatetimeFormat: invalid input syntax for type date: ""

I have tried using FORCE_NULL (field 1, field2, field3) - and this works for me, except I would like to do FORCE_NULL (*) and apply to all of the columns as I have A LOT of fields I am bringing in that I'd like this applied to.

Is this available?

This is an example of my csv: "ABC","tgif","123","","XyZ"

CodePudding user response:

Yes force null is possible. You should try with FORCE_NULL clause. You can refer the syntax and details here:


FORCE_NULL

Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to NULL. In the default case where the null string is empty, this converts a quoted empty string into NULL. This option is allowed only in COPY FROM, and only when using CSV format.


[PostgresPro Doc][1]

[Postgres Doc][2]


  [1]: https://postgrespro.com/docs/postgrespro/14/sql-copy
  [2]: https://www.postgresql.org/docs/current/sql-copy.html

CodePudding user response:

Using psycopg2 Copy functions. In this case copy_expert:

cat empty_str.csv 
1, ,3,07/22/2
2,test,4,
3,dog,,07/23/2022

create table empty_str_test(id integer, str_fld varchar, int_fld integer, date_fld date);

import psycopg2

con = psycopg2.connect("dbname=test user=postgres host=localhost port=5432") 
cur = con.cursor() 

with open("empty_str.csv") as csv_file:
    cur.copy_expert("COPY empty_str_test FROM STDIN WITH csv", csv_file)
con.commit()

select * from empty_str_test ;

 id | str_fld | int_fld |  date_fld  
---- --------- --------- ------------
  1 |         |       3 | 2022-07-22
  2 | test    |       4 | 
  3 | dog     |         | 2022-07-23

From here COPY:

NULL

Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.

copy_expert allows you specify the CSV format. If you use copy_from it will use the text format.

  • Related