I'm currently working on data analysis for a game's tutorial. I have a table with the completed steps of the tutorial. Here is a sample of what two day's worth of data currently looks like. As the tutorial increases in complexity, the amount of players that complete the next step, decreases.
date | stepName | completionCount |
---|---|---|
2022-06-08 | loadIn | 25 |
2022-06-08 | receiveSword | 20 |
2022-06-08 | defeatEnemy | 15 |
2022-06-08 | claimReward | 10 |
2022-06-08 | leaveTutorial | 5 |
2022-06-07 | loadIn | 45 |
2022-06-07 | receiveSword | 35 |
2022-06-07 | defeatEnemy | 20 |
2022-06-07 | claimReward | 15 |
2022-06-07 | leaveTutorial | 8 |
My goal is to include a new column with a percentage based on that day's max step count. Below is what I hope to convert the top table into. The first step, "loadIn", should always be 100% since that is the baseline for the rest of the steps.
date | stepName | completionCount | completionPercentage |
---|---|---|---|
2022-06-08 | loadIn | 25 | 100% |
2022-06-08 | receiveSword | 20 | 80% |
2022-06-08 | defeatEnemy | 15 | 60% |
2022-06-08 | claimReward | 10 | 40% |
2022-06-08 | leaveTutorial | 5 | 20% |
2022-06-07 | loadIn | 50 | 100% |
2022-06-07 | receiveSword | 45 | 90% |
2022-06-07 | defeatEnemy | 40 | 80% |
2022-06-07 | claimReward | 20 | 40% |
2022-06-07 | leaveTutorial | 15 | 30% |
I am VERY new to SQL and could really use some help here. Thanks!
CodePudding user response:
try this:
select
*,
max(completitionCount) OVER(partition by date) as dailymax,
completitionCount/dailymax as completionPercentage
from yourtable
Because your loadIn number is consistently the highest number each day, a window function using MAX()
is appropriate in this scenario.
Here's a reference to redshift MAX() window function