I have tried to find the answer in the doc and in other forums but failed to understand. Doc: https://clickhouse.com/docs/en/sql-reference/statements/select/join/ CH doc
My tables are similar to: stack overflow question
I would like to get the order price right before a trade (and also return the order ts and info regarding the trades).
Using the code below, I get the following error: SQL Error [48]: ClickHouse exception, code: 48,DB::Exception: ASOF join over right table Nullable column is not implemented
What I have tried is the condition "IS NOT NULL" but it didn't do anything.
Code:
WITH
orders as(
SELECT order_timestamp, order_price, product_id
FROM order_table
WHERE
( order_timestamp >= toInt32(toDateTime64('2022-01-02 10:00:00.000', 3))*1000
AND order_timestamp <= toInt32(toDateTime64('2022-01-02 12:00:00.000', 3))*1000)
AND product_id = 'SPXFUT'
AND order_timestamp IS NOT NULL),
trades as (
SELECT
trade_timestamp,
price
FROM trades_table
WHERE
trade_timestamp >= toInt32(toDateTime64('2021-12-02 10:00:00.000', 3))*1000 AND trade_timestamp <= toInt32(toDateTime64('2021-12-02 12:00:00.000', 3))*1000
AND product_id = 'SPXFUT'
AND trade_timestamp IS NOT NULL),
results as(
SELECT
tt.product_id,
tt.trade_timestamp,
tt.price,
o.order_timestamp,
o.order_price
FROM trades tt
ASOF LEFT JOIN orders o
ON (tt.product_id = o.product_id ) AND (tt.trade_timestamp >= o.order_timestamp ))
SELECT *
FROM results
CodePudding user response:
Current implementation of ASOF LEFT JOIN
requires that the right side column used in inequality to be a non-nullable type. As the types comes from the table definition (and your order_table must have a definition of order_timestamp
like Nullable(Int64)
), ClickHouse will refuse to run with the exception like ClickHouse exception, code: 48,DB::Exception: ASOF join over right table Nullable column is not implemented
.
As a solution, you can wrap o.order_timestamp
with assumeNotNull
function in the ASOF LEFT JOIN condition:
ON (tt.product_id = o.product_id ) AND (tt.trade_timestamp >= assumeNotNull(o.order_timestamp) ))
But you shall consider the behavior of assumeNotNull
(documentation) - it will give a non-null value for the type (default value), which might give wrong results for the cases where o.order_timestamp
is null (assumeNotNull
is implementation specific and can bring more problems - assumeNotNull and friends).
Another solution is to use ifNull
and provide a suitable replacement for the missing values, avoiding potential problems that assumeNotNull
brings:
ON (tt.product_id = o.product_id ) AND (tt.trade_timestamp >= ifNull(o.order_timestamp,0) ))
And the last suggestion is to change the column datatype to a non-nullable one:
ALTER TABLE order_table MODIFY COLUMN order_timestamp <TypeName>;
But this depends on your use case - if it is a requirement that order_timestamp
has to accept null values, I would recommend to adopt the second solution.