Home > other >  End of record reached while expected to parse column in snowflake data loading from S3
End of record reached while expected to parse column in snowflake data loading from S3

Time:11-02

I am loading data from S3 stage to my snowflake table, but getting the below error :

currently the file format is csv type and delimited by escape character

End of record reached while expected to parse column '"ABC"["T_STMP":29]' If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client.

I have checked the data also, the value of T_STMP is getting at new line and that snowflake is considering as the end of file

 1234562020-11-18 
 07:41:12.336COVIDABC.comhttps://www.covide.com/bright- 
 keno/u-glass/2020/xyz-a-33/chat-pnevery goodcls F- 
 CLASS12345Very-free REl Men Rare CTR.  WATCH  
 ABCDR.  HEYA MOO  https://abc.ABCD.com/v2/abc-xyze- 
chore/sdfgsjhdfgjdhfkjdh/HIIII/HELLO-ABC_EFG_I-123- 
5.jpgCENTCANT
2010-11-15 10:12:30.083663sysadmin0001-01-01 00:00:00.0wwwww33ww- 
2a2a-1a1a-7ff6-3434343trtr36352020-15-15 10:44:05.79

The error is in the timestamp column that is the last third line should be like :

     5.jpgCENTCANT2010-11-15 10:12:30.083663

I know its hard to understand, but data quality is too bad!

The file is csv type and delimitted by escape character

Any help will be appreciated

CodePudding user response:

The error caused due to columns having 100 varying characters length as a result timestamp and other columns was entering into new line.

The solution was to replace new line characters , carriage return characters, backspace characters for all those lengthy columns with empty string in the sqoop import job as :

   REPLACE(REPLACE(REPLACE(length_column,CHAR(8),''),CHAR(10),''),CHAR(13),'') as 
   length_column
  • Related