Trying to import a CSV file into a table in SQL


I have 4 CSV files, each with 500,000 rows. I am trying to import the CSV data into my Exasol database, but there is an error with the date column, and the files contain an unwanted first column.

Here is an example CSV file:

Unnamed: 0,time,lat,lon,nobs_cloud_day
0,2006-03-30,24.125,-119.375,22.0
1,2006-03-30,24.125,-119.125,25.0

The table I created to import the CSV into is:

CREATE TABLE cloud_coverage_CONUS (
index_cloud DECIMAL(10,0)
,"time" DATE -- PRIMARY KEY 
,lat DECIMAL(10,6)
,lon DECIMAL(10,6)
,nobs_cloud_day DECIMAL(3,1)
);

The import command is:

IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv';

But I get this error:

SQL Error [42636]: java.sql.SQLException: ETL-3050: [Column=0 Row=0] [Transformation of value='Unnamed: 0' failed - invalid character value for cast; Value: 'Unnamed: 0'] (Session: 1750854753345597339) while executing '/* add path to the 4 csv files, that are in the cloud database folder*/ IMPORT INTO cloud_coverage_CONUS FROM CSV AT 'https://27.1.0.10:59205' FILE 'e12a96a6-a98f-4c0a-963a-e5dad7319fd5' ;'; 04509 java.sql.SQLException: java.net.SocketException: Connection reset by peer: socket write error

Alternatively, I use this table (without the first column):

CREATE TABLE cloud_coverage_CONUS (
"time" DATE -- PRIMARY KEY 
,lat DECIMAL(10,6)
,lon DECIMAL(10,6)
,nobs_cloud_day DECIMAL(3,1)
);

And this import statement:

IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv'(2 FORMAT='YYYY-MM-DD', 3 .. 5);

But I still get this error:

SQL Error [42636]: java.sql.SQLException: ETL-3052: [Column=0 Row=0] [Transformation of value='time' failed - invalid value for YYYY format token; Value: 'time' Format: 'YYYY-MM-DD'] (Session: 1750854753345597339) while executing '/* add path to the 4 csv files, that are in the cloud database folder*/ IMPORT INTO cloud_coverage_CONUS FROM CSV AT 'https://27.1.0.10:60350' FILE '22c64219-cd10-4c35-9e81-018d20146222' (2 FORMAT='YYYY-MM-DD', 3 .. 5);'; 04509 java.sql.SQLException: java.net.SocketException: Connection reset by peer: socket write error

(I actually do want to ignore the first column in the files.)

How can I solve this issue?

Solution:

IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv' (2 .. 5) ROW SEPARATOR = 'CRLF' COLUMN SEPARATOR = ',' SKIP = 1;
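
Since all four part files share the same layout, the same statement should extend to the other files as well. This is only a sketch: it assumes Exasol's IMPORT accepts multiple FILE clauses (the documentation indicates it does), and the names cfc_us_part1.csv through cfc_us_part3.csv are my guess based on the first file's name:

IMPORT INTO cloud_coverage_CONUS FROM LOCAL CSV
FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv'
FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part1.csv' -- assumed file name
FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part2.csv' -- assumed file name
FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part3.csv' -- assumed file name
(2 .. 5)
ROW SEPARATOR = 'CRLF'
COLUMN SEPARATOR = ','
SKIP = 1;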

I did not realise that MySQL syntax is different from Exasol's.

CodePudding user response:

Looking at the first error message, a few things stand out. First, we see this:

[Column=0 Row=0]

This tells us the problem is with the very first value in the file. Next, the message even tells us what value was read:

Transformation of value='Unnamed: 0' failed

So it's failing to convert 'Unnamed: 0'. You also provided the table definition, where we see that the first column in the table is a decimal type.

This makes sense: 'Unnamed: 0' is not a decimal. For this to work, the CSV data MUST align with the data types of the columns in the table.
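
(As an aside, separate from the fix below: if the types ever refuse to line up, a common workaround is to stage everything as VARCHAR first and cast afterwards. This is only a sketch; the staging table name is mine, while TO_DATE and CAST are standard Exasol functions:)

-- Hypothetical staging table: load every field as raw text first.
CREATE TABLE cloud_coverage_stage (
index_cloud VARCHAR(50)
,"time" VARCHAR(50)
,lat VARCHAR(50)
,lon VARCHAR(50)
,nobs_cloud_day VARCHAR(50)
);

-- Then cast into the typed target table.
INSERT INTO cloud_coverage_CONUS
SELECT CAST(index_cloud AS DECIMAL(10,0))
, TO_DATE("time", 'YYYY-MM-DD')
, CAST(lat AS DECIMAL(10,6))
, CAST(lon AS DECIMAL(10,6))
, CAST(nobs_cloud_day AS DECIMAL(3,1))
FROM cloud_coverage_stage;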

But we also see that this looks like a header row. Assuming everything else matches, we can fix it by telling the database to skip this first row. I'm not familiar with Exasol, but according to the documentation I believe the correct code will look like this:

IMPORT INTO cloud_coverage_CONUS 
FROM LOCAL CSV FILE 'D:\uni\BI\project 1\AOL_DB_ANALYSIS_TASK1\datasets\cloud\cfc_us_part0.csv'
    (2 FORMAT='YYYY-MM-DD', 3 .. 5)
ROW SEPARATOR = 'CRLF' 
COLUMN SEPARATOR = ','
SKIP = 1;
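
For what it's worth, as I read the Exasol docs the numbers in the parenthesised list refer to columns of the CSV file, not of the table: file column 1 (the unnamed index) is never listed and is therefore ignored, file columns 2 through 5 fill the table's columns in order, and SKIP = 1 discards the header row. As a quick sanity check after the import (a sketch; the aliases are mine), each source file has 500,000 data rows, so the count should match and the dates should fall in a sensible range:

SELECT COUNT(*) AS loaded_rows
, MIN("time") AS first_day
, MAX("time") AS last_day
FROM cloud_coverage_CONUS;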