Home > Mobile >  How should I construct source `String` so as to transform and put it into `Array(Tuple(Int, String,
How should I construct source `String` so as to transform and put it into `Array(Tuple(Int, String,

Time:11-03

I am reading impala parquet from HDFS into Clickhouse. Target table in clickhouse has 2 complex types:

target_type1 Array(Tuple(  LowCardinality(String),
                      Int 8,
                      Int 32, 
                      Int 32,
                      Int 32
         ))
target_type2 Array(Tuple(  LowCardinality(String),
                      LowCardinality(String), 
                      LowCardinality(String)
         ))

Source table's type is String. How should I construct souce table's String and transform it into the target type?

For example,

Target table

create table test (
    a String,
    b Array(String),
    c Array(Tuple(Int, String, String))
) engine = MergeTree order by tuple()

Source

with t as (
    select '123#def#aaa|456#xxx#aaa|789#bbbb#aaaa|3333#www#aaaa' as x, splitByChar('|', x) as y, splitByChar('#', y[1]) as z
) select * from t;

How should I construct the source string x and transform it, so as to load into c Array(Tuple(Int, String, String))?

CodePudding user response:

Something like this?

INSERT INTO test (c) SELECT groupArray(new) AS val
FROM
(
    SELECT
        '123#def#aaa|456#xxx#aaa|789#bbbb#aaaa|3333#www#aaaa' AS x,
        splitByChar('|', x) AS y,
        splitByChar('#', arrayJoin(y)) AS s,
        (toInt32(s[1]), s[2], s[3]) AS new
)

Output is:

SELECT *
FROM test

Query id: e087e0b8-af15-45e9-a4bb-ce22cc2d6f6a

┌─a─┬─b──┬─c─────────────────────────────────────────────────────────────────────────────┐
│   │ [] │ [(123,'def','aaa'),(456,'xxx','aaa'),(789,'bbbb','aaaa'),(3333,'www','aaaa')] │
└───┴────┴───────────────────────────────────────────────────────────────────────────────┘

1 row in set. Elapsed: 0.002 sec.

I think it could be done without expanding the array with some lambda functions too.

CodePudding user response:

My version

select '123|bbb|ccc;111|abc|deee;123|BB|CC;123|feq|ffdfa;848|ddkz|djkf'                          as t0,
       splitByChar(';', t0)                                                                      as t,
       arrayJoin(t)                                                                              as t1,
       tuple(toInt32(splitByChar('|', t1)[1]), splitByChar('|', t1)[2], splitByChar('|', t1)[3]) as t2,
       groupArray(t2)                                                                            as t3,
       toTypeName(t3);
  • Related