How do you translate the below SQL into Python, specifically the count(*) OVER() portion? I need to display the total count of userIDs not as a single value but as the same value for each userID row, so I can do Ct/Total_Ct and display each userID's rate.
select userID, count(*) as Ct, count(*) OVER() AS Total_Ct
from cte
group by userID
CodePudding user response:
That is the way you can translate this query easily to Python:
data['total_users'] = data['userID'].count()
total_rows_for_user = data.groupby('userID').size().reset_index().rename(columns = {'size':'total_rows_for_user'})
data = data.merge(total_rows_for_user, on='userID')
CodePudding user response:
I suspect that this is the version you want:
SELECT userID, COUNT(*) AS ct, SUM(COUNT(*)) OVER () AS Total_Ct
FROM cte
GROUP BY userID;
Keep in mind that in a GROUP BY
query, the window functions operate over the aggregated intermediate table. So COUNT(*) OVER ()
just counts the number of distinct users, but SUM(COUNT(*)) OVER ()
counts the total number of records in the original table.
Regarding running the above SQL from Python, you would need to use a raw query. It would likely not work directly from an ORM layer.