I am trying to calculate if a member shops in January, what proportion shop again in February and what proportion shop again within 3 months. Ultimately to create a table similar to the image attached.
I have tried the below code. The first left join works, but when I add the second one to calculate within_3months the error: "FROM keyword not found where expected" is shown (for the separate line). Can I left join twice or must I do separate scripts for columns?
, count(distinct B.members)/count(distinct A.members) *100 as 1month_retention_rate
select
year_month_january21
, count(distinct A.members) as num_of_mems_shopped_january21
, count(distinct B.members)as retained_february21
, count(distinct B.members)/count(distinct A.members) *100 as 1month_retention_rate
, count(distinct C.members)/count(distinct A.members) *100 as within_3months
from
(select
members
, year_month as year_month_january21
from table.members t
join table.date tm on t.dt_key = tm.date_key
and year_month = 202101
group by
members
, year_month) A
left join
(select
members
, year_month as year_month_february21
from table.members t
join table.date tm on t.dt_key = tm.date_key
and year_month = 202102
group by
members
, year_month) B on A.members = B.members
left join
(select
members
, year_month as year_month_3months
from table.members t
join table.date tm on t.dt_key = tm.date_key
and year_month between 202102 and 202104
group by
members
, year_month) C on A.members = C.members
group by
year_month_january21;
I have tried left creating a separate time table and joining to this. It does not work. Doing calculations separately works but I must do this for multiple time frames so will take a long time.
CodePudding user response:
The error isn't coming from the added left join, it's from the as 1month_retention_rate
part, because it's an illegal name.
You can see that more simply with:
select dummy as 1month_retention_rate
from dual;
ORA-00923: FROM keyword not found where expected
You could change the column alias so it follows the naming rules (specifically here, does not start with a digit), or if that specific name is actually required then you could make it a quoted identifier - generally not a good option, but sometimes OK in the final output of a query.
So in your code you would just change your new line
, count(distinct B.members)/count(distinct A.members) *100 as 1month_retention_rate
to something like
, count(distinct B.members)/count(distinct A.members) *100 as one_month_retention_rate
fiddle - which still errors but now with ORA-00942 as I don't have your tables, and that is after changing your obfuscated schema/table names to something legal too.
There may be more efficient ways to perform the calculation, but that's a separate issue...
CodePudding user response:
I could understand that you want to get :
- count of all members who visited in Jan.
- count of all members who visited in Jan and visited again in Feb.
- count of all members who visited in Jan and visited again in Feb, Mars and April.
If my understanding is true then you could simplify your inner query using IF instead of LEFT JOIN . Take a look on the following query. Assuming that table members have an ID field :
SELECT
mem_jan AS num_of_mems_shopped_january21,
mem_feb AS retained_february21,
mem_feb / mem_jan * 100 as 1month_retention_rate
mem_3m / mem_jan * 100 as within_3months
FROM(
SELECT
SUM(IF(mm_jan>0,1,0) AS mem_jan,
SUM(IF(mm_jan>0 AND mm_feb>0,1,0) AS mem_feb,
SUM(IF(mm_jan>0 AND mm_count_3m>0,1,0) AS mem_3m
FROM
(
SELECT
t.Id,
SUM(IF(year_month = 202101, 1,0)) AS mm_jan, /*visit for a member in Jan*/
SUM(IF(year_month = 202102, 1,0)) AS mm_feb, /*visit for a member in Feb*/
SUM(IF(year_month between 202102 and 202104,1,0)) AS mem_3m/*visit for a member in 3 months*/
FROM
table.members t
join table.date tm on t.dt_key = tm.date_key
WHERE
year_month between 202101 and 202104
GROUP BY
t.Id
) AS t1
) AS t2
This is not a final running query but it can explain my idea. According to your engine you may use CASE
or IF THEN ELSE