My original table has: (1) acceptance_rate - string, percentage (2) host_is_superhost - boolean
I wanted to convert (1) acceptance_rate to integer without the %, so I created a CTE as follow:
WITH acceptance_rate_cte AS
(SELECT
CAST(REPLACE(acceptance_rate,'%',"") AS int) AS new_acceptance_rate,
host_is_superhost AS new_superhost
FROM table1
WHERE acceptance_rate NOT IN ("N/A","0%")
ORDER BY new_acceptance_rate DESC)
SELECT new_acceptance_rate, new_superhost
FROM acceptance_rate_cte;
New CTE table looks like:
new_acceptance_rate | new_superhost
100 | true
90 | true
95 | false ...
NEXT, I wanted to create a table to group all the new_acceptance_rate into buckets of 20 and then count how many true or false are within those buckets. So I did this:
SELECT CASE WHEN new_acceptance_rate >0 AND new_acceptance_rate <= 20 then '1-20'
WHEN new_acceptance_rate >20 AND new_acceptance_rate <=40 then '21-40'
WHEN new_acceptance_rate >40 AND new_acceptance_rate<=60 THEN '41-60'
WHEN new_acceptance_rate >60 AND new_acceptance_rate <=80 THEN '61-80'
ELSE 'Above 80'
END acceptance_range,
new_superhost,
count(*) as superhost_count
FROM acceptance_rate_cte
My expectation for the result is to look like this:
acceptance_range | new_superhost | superhost_count
1-20 | true | 15
1-20 | false | 25
...
But instead i received an error msg as follow:
Error running query Table name "acceptance_rate_cte" missing dataset while no default dataset is set in the request.
CodePudding user response:
I ran your query above with some sample data it seems to be mostly correct.
with acceptance_rate_cte AS
( SELECT
CAST(REPLACE(acceptance_rate,'%',"") AS int) AS new_acceptance_rate,
host_is_superhost AS new_superhost
FROM table1
WHERE acceptance_rate NOT IN ("N/A","0%")
ORDER BY new_acceptance_rate DESC
)
SELECT CASE WHEN new_acceptance_rate >0 AND new_acceptance_rate <= 20 then '1-20'
WHEN new_acceptance_rate >20 AND new_acceptance_rate <=40 then '21-40'
WHEN new_acceptance_rate >40 AND new_acceptance_rate<=60 THEN '41-60'
WHEN new_acceptance_rate >60 AND new_acceptance_rate <=80 THEN '61-80'
ELSE 'Above 80'
END acceptance_range,
new_superhost,
count(*) as superhost_count
FROM acceptance_rate_cte
GROUP BY acceptance_range, new_superhost;
I did have to add a group by in order for it to execute properly. Based on your error though my guess is you have not run both the CTE and query together in the same session. When you execute run both at the same time.