I wish to create a summary table.
For example, the dataset df
looks like this:
type category color
bmw a black
bmw b black
bmw c white
bmw d black
bmw e white
.....
Basically, type
will be the same value throughout the whole dataset, and category
will be all unique values. What I want to do is to figure out the proportion of the colors. So I hope to get an output that looks something like this:
color count prop
black 48 0.34
white 62 0.52
grey 20 ...
.....
Please note that the numbers and proportions above are all made up, not proportional, just for illustration purposes.
In R, all I have to do is use dplyr
and use the summarise
function to output the desired result, but not sure how to do it in SQL.
I've tried doing the following code:
SELECT type, category, color, count(*)
FROM df
However it gave me an error.
CodePudding user response:
You can use aggregation and window functions:
select color, count(*),
count(*) * 1.0 / sum(count(*)) over ()
from df
group by color;