I am trying to load a BigQuery table with the definition below, where one of the columns (ref_list) is a STRING with mode REPEATED.
[
  {
    "name": "emp",
    "type": "STRING"
  },
  {
    "mode": "REPEATED",
    "name": "ref_list",
    "type": "STRING"
  },
  {
    "name": "update_date",
    "type": "DATE"
  }
]
Here is my input data:
{"emp":"Adam","ref_list":["Roger","Calvin","Andrew","Kohl"],"update_date":"1999-01-01"}
{"emp":"AntiP27","ref_list":["John","Patrick","Nick","Chris"],"update_date":"2020-01-01"}
I am able to load the table by pointing to the .schema file on my local machine, but the same load fails when I provide the schema inline.
Here is my bq load command with the inline schema option. I am not sure how to specify mode = REPEATED in it:
bq load --replace --source_format=NEWLINE_DELIMITED_JSON emp_stage.emp_dtl gs://1324-global-delivery/emp_dtl.json emp:STRING,ref_list:STRING,update_date:DATE
CodePudding user response:
According to the documentation, an inline schema cannot include a RECORD type or a column mode (NULLABLE, REPEATED):
When you specify the schema on the command line, you cannot include a RECORD (STRUCT) type, you cannot include a column description, and you cannot specify the column's mode. All modes default to NULLABLE. To include descriptions, modes, and RECORD types, supply a JSON schema file instead.
bq_manually_specifying_schemas
If you need these options, you have to specify them in a dedicated JSON schema file, as you did in the load that works.
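As a rough sketch of that approach, the commands below write the schema from the question to a local file and pass its path to bq load in place of the inline schema. The file name emp_dtl_schema.json is an assumption chosen for illustration; the dataset, table, and bucket are taken from the question. The load itself only runs if the bq CLI is installed and authenticated.

```shell
# Write the schema from the question to a local JSON file.
# The file name is hypothetical; any local path works.
cat > emp_dtl_schema.json <<'EOF'
[
  {"name": "emp", "type": "STRING"},
  {"name": "ref_list", "type": "STRING", "mode": "REPEATED"},
  {"name": "update_date", "type": "DATE"}
]
EOF

# Pass the schema file path as the last argument instead of the
# inline name:TYPE list, so that mode REPEATED is honoured.
if command -v bq >/dev/null 2>&1; then
  bq load --replace --source_format=NEWLINE_DELIMITED_JSON \
    emp_stage.emp_dtl \
    gs://1324-global-delivery/emp_dtl.json \
    ./emp_dtl_schema.json
fi
```

Columns without an explicit mode, such as emp and update_date here, default to NULLABLE, so only ref_list needs the mode field.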