Home > Net >  Extracting all possible keys from a JSON column in Clickhouse
Extracting all possible keys from a JSON column in Clickhouse

Time:07-02

I have a table in Clickhouse that has a properties column. This column is a JSON type column and stores an object with several keys and values. No arrays, just a single object with multiple keys and values.

What I am trying to do is to build a list of all the unique key names present in the column.

So, if the column has this object:

{ age: '25'}

And also this object:

{ genre: 'pop'}

I'd like to retrieve age and genre as results for this query.

Any way this can be done?

CodePudding user response:

You can retrieve all of the inferred columns/keys from a ClickHouse JSON column using the DESCRIBE command with the setting describe_extend_object_types=1;

For example:

DESCRIBE TABLE json_test
SETTINGS describe_extend_object_types = 1

Query id: bfa406b0-948c-460f-9e05-6981b5f52270

┌─name──┬─type───────────────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
│ key   │ UInt32                         │              │                    │         │                  │                │
│ value │ Tuple(json_key Int8, value String) │              │                    │         │                  │                │
└───────┴────────────────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘

The value column in the json_test table is actually a JSON type, but it currently stores two keys, json_key which is an Int8, and value, which is a String type.

I believe at the moment you'd have to parse the results of the DESCRIBE TABLE query to use those column names/keys in another query.

CodePudding user response:

I ended up realizing that my use case is better suited for a MAP column type.

After changing it to Map, doing what I wanted was very easy:

SELECT DISTINCT arrayJoin(mapKeys(properties)) AS keys
FROM groups
  • Related