Home > OS >  clickhouse import csv dictionary
clickhouse import csv dictionary

Time:04-15

New CH user and I'm trying to setup a dictionary that maps airline 2 character codes to airline names to use in the onTime databases I create using the sample data from here https://clickhouse.com/docs/en/getting-started/example-datasets/ontime/

Then I manually created a csv file with these contents:

id,code,name
1,UA,United Airlines
2,HA,Hawaiian Airlines
3,OO,SkyWest
4,B6,Jetblue Airway
5,QX,Horizon Air
6,YX,Republic Airway
7,G4,Allegiant Air
8,EV,ExpressJet Airlines
9,YV,Mesa Airlines
10,WN,Southwest Airlines
11,OH,PSA Airlines
12,MQ,Envoy Air
13,9E,Endeavor Air
14,NK,Spirit Airlines
15,AA,American Airlines
16,DL,Delta Air Lines
17,AS,Alaska Airlines
18,F9,Frontier Airlines

Created the dictionary

CREATE DICTIONARY airlinecompany
(
    id UInt64, 
    code String,
    company String

)
PRIMARY KEY id 
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSV'))
LAYOUT(FLAT())
LIFETIME(3600)

I can see the dictionary has been created

┌─name───────────┐
│ airlinecompany │
│ ontime         │
└────────────────┘

But when I try and list its contents I get this error:

Received exception from server (version 22.3.3):
Code: 27. DB::Exception: Received from localhost:9000. DB::Exception: Cannot parse input: expected ',' before: 'id,code,name\r\n1,UA,United Airlines\r\n2,HA,Hawaiian Airlines\r\n3,OO,SkyWest\r\n4,B6,Jetblue Airway\r\n5,QX,Horizon Air\r\n6,YX,Republic Airway\r\n7,G4,Allegiant Air\r\n8,EV,':
Row 1:
Column 0,   name: id,      type: UInt64, ERROR: text "id,code,na" is not like UInt64

: While executing CSVRowInputFormat. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)

But I dont think a csv starts with a , before id. Am I missing something from my creation statement or do I need to generate a csv in a certain way?

*** Edit with correct insert:

CREATE DICTIONARY airlinecompany
(
    id UInt64, 
    code String,
    name String

)
PRIMARY KEY id 
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
SETTINGS(input_format_skip_unknown_fields=1)
LAYOUT(FLAT())
LIFETIME(3600)

Though you can probably exclude the SETTINGS row

CodePudding user response:

CSV file contains the header so it needs to use CSVWithNames-format instead of CSV:

CREATE DICTIONARY airlinecompany
(
    id UInt64, 
    code String,
    company String

)
PRIMARY KEY id 
SOURCE(FILE(path '/var/lib/clickhouse/user_files/airlinenames.csv' format 'CSVWithNames'))
LAYOUT(FLAT())
LIFETIME(3600)
  • Related