I am working on a SQL query in Azure Databricks Environment, where considering the following dataset:
CREATE OR REPLACE TABLE tb_user_info
(
clientid INT,
visitid STRING,
channel STRING,
conversion INT,
index INT,
value STRING
);
INSERT INTO tb_user_info VALUES
(123, 'abc123', 'google', 1, 11, '1250'),
(123, 'abc123', 'google', 1, 22, '25000'),
(123, 'abc123', 'google', 1, 33, '1K and 3K'),
(456, 'def456', 'facebook', 3, 11, '2860'),
(456, 'def456', 'facebook', 3, 22, '78000'),
(456, 'def456', 'facebook', 3, 33, '3K and 5K');
SELECT * FROM tb_user_info ORDER BY clientid, index
clientid | visitid | channel | conversion | index | value |
---|---|---|---|---|---|
123 | abc123 | 1 | 11 | 1250 | |
123 | abc123 | 1 | 22 | 25000 | |
123 | abc123 | 1 | 33 | 1K and 3k | |
456 | def456 | 3 | 11 | 2860 | |
456 | def456 | 3 | 22 | 78000 | |
456 | def456 | 3 | 33 | 3K and 5k |
I want to get the following output:
clientid | visitid | channel | conversion | salary (index=11) | savings (index=22) | salary range (index=33) |
---|---|---|---|---|---|---|
123 | abc123 | 1 | 1250 | 25000 | 1K and 3k | |
456 | def456 | 1 | 2860 | 78000 | 3K and 5k |
where the columns clientid, visitid, channel and conversion are grouped and the columns index and value are the columns that are pivoted.
I've tried using the Pivot function and I read this Documentation but I haven't been successful.
Could you help me with how can I solve this task?
CodePudding user response:
I am not sure what actual problem you have encounted, I write one query, it seems work normally,
SELECT * FROM (
SELECT clientid, visitid , channel , conversion , ind , value
FROM tb_user_info
) ss
PIVOT (
max(value)
FOR ind in (
[11] ,[22] ,[33]
)
) as a