I am loading data from an S3 stage into my Snowflake table, but I am getting the error below.
The file format is CSV, and it is delimited with an escape character defined.
End of record reached while expected to parse column '"ABC"["T_STMP":29]' If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client.
I have checked the data as well: the value of T_STMP wraps onto a new line, and Snowflake is treating that line break as the end of the record. Here is a sample record as it appears in the file:
1234562020-11-18
07:41:12.336COVIDABC.comhttps://www.covide.com/bright-
keno/u-glass/2020/xyz-a-33/chat-pnevery goodcls F-
CLASS12345Very-free REl Men Rare CTR. WATCH
ABCDR. HEYA MOO https://abc.ABCD.com/v2/abc-xyze-
chore/sdfgsjhdfgjdhfkjdh/HIIII/HELLO-ABC_EFG_I-123-
5.jpgCENTCANT
2010-11-15 10:12:30.083663sysadmin0001-01-01 00:00:00.0wwwww33ww-
2a2a-1a1a-7ff6-3434343trtr36352020-15-15 10:44:05.79
The error is in the timestamp column; the third line from the end should read:
5.jpgCENTCANT2010-11-15 10:12:30.083663
I know it's hard to read, but the data quality really is that bad!
Again, the file is CSV type and delimited with an escape character.
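For reference, a minimal sketch of the stage and file-format setup described above (the object names and the escape character are assumptions, since the actual DDL isn't shown in the question):

```sql
-- Hypothetical file format: CSV with an escape character defined
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  ESCAPE = '\\';

-- Load from the S3 stage; ON_ERROR shown as the error message suggests
COPY INTO my_table
  FROM @my_s3_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  ON_ERROR = 'CONTINUE';
```

Note that ON_ERROR = 'CONTINUE' only skips the bad rows; it does not fix the embedded line breaks in the source data.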
Any help will be appreciated
CodePudding user response:
The error was caused by long varying-character columns whose values contained embedded line breaks, which pushed the timestamp and subsequent columns onto new lines.
The solution was to replace newline, carriage-return, and backspace characters in those long columns with an empty string in the Sqoop import job:
REPLACE(REPLACE(REPLACE(length_column, CHAR(8), ''), CHAR(10), ''), CHAR(13), '') AS length_column
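For context, a sketch of how that expression might sit inside the Sqoop free-form import query (the table and column names here are placeholders, not from the original post):

```sql
-- Strip backspace (CHAR(8)), line feed (CHAR(10)), and carriage return (CHAR(13))
-- from the long text column before it is written to the CSV extract
SELECT
  id,
  REPLACE(REPLACE(REPLACE(long_text_col, CHAR(8), ''), CHAR(10), ''), CHAR(13), '') AS long_text_col,
  t_stmp
FROM source_table
WHERE $CONDITIONS
```

The $CONDITIONS token is required by Sqoop's --query option so it can generate the split predicates.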