Home > Enterprise >  How to Unescape a character in snowflake during data ingestion from CSV to Snowflake table
How to Unescape a character in snowflake during data ingestion from CSV to Snowflake table

Time:12-06

I have a process that automatically ingests CSV data into Snowflake Tables from an external stage. CSV File that has the delimiter as pipe (vertical bar or | )

But one cell has data that contains the Escape character itself (which is backslash - "") and immediately followed by the Column Delimiter (which is |)

Now the ingestion process is assuming that the delimiter ("|") is part of the data itself because it is immediately followed by the "" and due to this, column values are shifting left causing erratic data in the table.

The data in CSV looks like this

Column_1|Column_2|Column_3
ABC ......... |DEF..........\ |EFG

When this gets ingested into the table it looks like this

Column_1 Column_2 Column_3
ABC.......... DEF......|EFG NULL

Ideally, it should have got ingested as

Column_1 Column_2 Column_3
ABC.......... DEF.........\ EFG

Note: I can not alter the data as it has to flow from upstream systems and I need to ingest the data as is (even though it looks like a genuine data issue)

Could someone help me here as to which property of the File Format should I use in order for ingesting this data as is?

Thank you in advance!

CodePudding user response:

Setting the ESCAPE_UNENCLOSED_FIELD = NONE for the file_format configuration might help to achieve the desired result. The entire file format should like:

  create or replace file format my_csv_format
  type = csv
  field_delimiter = '|'
  skip_header = 1
  null_if = ('NULL', 'null')
  ESCAPE_UNENCLOSED_FIELD = NONE
  empty_field_as_null = true
  ERROR_ON_COLUMN_COUNT_MISMATCH = false
  compression = auto;
  • Related