This query lasts about 30 sec:
WITH cte AS(SELECT t.id, to_json (row) json
FROM public.test t, LATERAL (SELECT * FROM public.test2 t2 WHERE t.id = t2.test_id) row)
UPDATE test t
SET json = cte.json
FROM cte
WHERE cte.id = t.id
"Update on test t (cost=55185.61..96669.49 rows=0 width=0) (actual time=26952.797..26952.804 rows=0 loops=1)"
" -> Hash Join (cost=55185.61..96669.49 rows=150572 width=50) (actual time=1415.264..4256.365 rows=150571 loops=1)"
" Hash Cond: (t_1.id = t2.test_id)"
" -> Seq Scan on test t_1 (cost=0.00..28237.36 rows=545736 width=14) (actual time=0.026..349.467 rows=355580 loops=1)"
" -> Hash (cost=50068.46..50068.46 rows=150572 width=150) (actual time=1359.280..1359.285 rows=150571 loops=1)"
" Buckets: 32768 Batches: 8 Memory Usage: 3717kB"
" -> Hash Join (cost=10007.87..50068.46 rows=150572 width=150) (actual time=339.426..1191.612 rows=150571 loops=1)"
" Hash Cond: (t.id = t2.test_id)"
" -> Seq Scan on test t (cost=0.00..28237.36 rows=545736 width=14) (actual time=0.015..492.247 rows=355580 loops=1)"
" -> Hash (cost=5184.72..5184.72 rows=150572 width=136) (actual time=225.551..225.553 rows=150571 loops=1)"
" Buckets: 32768 Batches: 8 Memory Usage: 3420kB"
" -> Seq Scan on test2 t2 (cost=0.00..5184.72 rows=150572 width=136) (actual time=0.017..79.663 rows=150572 loops=1)"
"Planning Time: 0.931 ms"
"Execution Time: 26954.723 ms"
Another query without cte lasts too much time(still waiting to complete):
UPDATE test t
SET json = (SELECT to_json(row) FROM (SELECT * FROM test2 t2 WHERE t.id = t2.test_id LIMIT 1) row)
I want to create json of each row in test2 and update test table with it. Why 1st query is better than 2nd? How to write it without LATERAL?
Schema:
CREATE TABLE test(
id SERIAL PRIMARY KEY,
json TEXT
);
CREATE TABLE test2(
id SERIAL PRIMARY KEY,
test_id INTEGER,
a TEXT,
b TEXT,
c TEXT,
d TEXT,
e TEXT,
f TEXT,
g TEXT,
h TEXT,
i TEXT,
j TEXT,
k TEXT,
l TEXT,
m TEXT
);
CodePudding user response:
You are joining the target table multiple times with the second table.
One time is enough:
UPDATE test t
SET json = to_json(t2)
FROM public.test2 as t2
WHERE t2.test_id = t.id;
If you want to exclude some columns from the JSON, use jsonb
(which is the recommended data type for JSON anyway):
UPDATE test t
SET json = to_jsonb(t2) - 'a' - 'b'
FROM public.test2 as t2
WHERE t2.test_id = t.id;
If you want to rename some columns you can use a derived table:
UPDATE test t
SET json = to_json(t2)
FROM (
select test_id,
a as xyz,
b as abc
from public.test2
) as as t2
WHERE t2.test_id = t.id;