I have a file in S3 with the following format:
col1,col2
number1,content1
number2,content2
number3,content3
I am creating a Redshift table with the structure below:
CREATE TABLE IF NOT EXISTS general.name.name_test (
col1 VARCHAR(255),
col2 VARCHAR(255),
inserted_timestamp TIMESTAMP DEFAULT GETDATE()
);
After that, I am using Redshift copy command to have the data available in the table I just created:
COPY general.name.name_test
FROM 's3://.../name_test.txt'
ACCESS_KEY_ID '' SECRET_ACCESS_KEY '' SESSION_TOKEN ''
DELIMITER AS ','
IGNOREHEADER AS 1
csv;
The problem is that "inserted_timestamp" is NULL and Redshift is not taking the default value.
Am I missing something? This is what I will get in Redshift:
col1,col2,inserted_timestamp
number1,content1,null
number2,content2,null
number3,content3,null
It only works if I specify the columns, but I wanted to avoid that if possible:
COPY general.name.name_test
(col1,col2)
FROM 's3://.../name_test.txt'
ACCESS_KEY_ID '' SECRET_ACCESS_KEY '' SESSION_TOKEN ''
DELIMITER AS ','
IGNOREHEADER AS 1
csv;
Thank you!
CodePudding user response:
Since a CSV file doesn't name its columns, Redshift doesn't know which column is which. Without a column list, COPY maps the file's fields to the table's columns in positional order, and a column that receives no value is set to NULL rather than its DEFAULT; defaults are only applied to columns that are omitted from an explicit column list. You need to add the column names to the COPY command to clear up the confusion. Alternatively, you could add a trailing comma to every line of your data file to indicate that the missing column is the last one.
Other than these approaches, I don't know of a way to make it clear to Redshift.
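If you want to try the trailing-delimiter route, you could pad the file before uploading it to S3. A minimal sketch (the padded filename is just an example; the extra comma on the header line is harmless since IGNOREHEADER skips it):

```shell
# Append a trailing comma to every line so COPY sees an explicit
# (empty) final field for inserted_timestamp.
sed 's/$/,/' name_test.txt > name_test_padded.txt
```

Note that depending on your COPY options, an empty field may still load as NULL rather than trigger the column default, so the explicit column list remains the more reliable fix.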