I am reading impala parquet from HDFS into Clickhouse. Target table in clickhouse has 2 complex types:
target_type1 Array(Tuple( LowCardinality(String),
Int 8,
Int 32,
Int 32,
Int 32
))
target_type2 Array(Tuple( LowCardinality(String),
LowCardinality(String),
LowCardinality(String)
))
Source table's type is String
. How should I construct souce table's String
and transform it into the target type?
For example,
Target table
create table test (
a String,
b Array(String),
c Array(Tuple(Int, String, String))
) engine = MergeTree order by tuple()
Source
with t as (
select '123#def#aaa|456#xxx#aaa|789#bbbb#aaaa|3333#www#aaaa' as x, splitByChar('|', x) as y, splitByChar('#', y[1]) as z
) select * from t;
How should I construct the source string x
and transform it, so as to load into c Array(Tuple(Int, String, String))
?
CodePudding user response:
Something like this?
INSERT INTO test (c) SELECT groupArray(new) AS val
FROM
(
SELECT
'123#def#aaa|456#xxx#aaa|789#bbbb#aaaa|3333#www#aaaa' AS x,
splitByChar('|', x) AS y,
splitByChar('#', arrayJoin(y)) AS s,
(toInt32(s[1]), s[2], s[3]) AS new
)
Output is:
SELECT *
FROM test
Query id: e087e0b8-af15-45e9-a4bb-ce22cc2d6f6a
┌─a─┬─b──┬─c─────────────────────────────────────────────────────────────────────────────┐
│ │ [] │ [(123,'def','aaa'),(456,'xxx','aaa'),(789,'bbbb','aaaa'),(3333,'www','aaaa')] │
└───┴────┴───────────────────────────────────────────────────────────────────────────────┘
1 row in set. Elapsed: 0.002 sec.
I think it could be done without expanding the array with some lambda functions too.
CodePudding user response:
My version
select '123|bbb|ccc;111|abc|deee;123|BB|CC;123|feq|ffdfa;848|ddkz|djkf' as t0,
splitByChar(';', t0) as t,
arrayJoin(t) as t1,
tuple(toInt32(splitByChar('|', t1)[1]), splitByChar('|', t1)[2], splitByChar('|', t1)[3]) as t2,
groupArray(t2) as t3,
toTypeName(t3);