I have a query that I want to add a colum to. It should contain the amount of how many times a value of a specific column appears in that table.
My current attempt
SELECT cleaning.*,
(SELECT COUNT(*)
FROM cleaning
WHERE column_b = cleaning.column_b)
AS multicleanamount
FROM cleaning
is counting every single row as it is just comparing a column with same column - I understand the reason. So the current (wrong) result is
column_a | column_b | multicleanamount |
---|---|---|
12 | 300 | 7 |
13 | 321 | 7 |
14 | 300 | 7 |
15 | 330 | 7 |
16 | 330 | 7 |
17 | 351 | 7 |
18 | 330 | 7 |
What I am missing now is how do I tell the sub-Select to compare with the current value of cleaning.column_b
from the very first line? (which is in the cleaning.*
)
2bh it was pretty easy in my head at first and it also would be if I'd compare to a static value like '300'
but that table has 74 K entries. This has to be dynamic.
My desired output is
column_a | column_b | multicleanamount |
---|---|---|
12 | 300 | 2 |
13 | 321 | 1 |
14 | 300 | 2 |
15 | 330 | 3 |
16 | 330 | 3 |
17 | 351 | 1 |
18 | 330 | 3 |
I hope that is understandable. If not please tell me and I will try to specify even more.
CodePudding user response:
Use table aliases so you can distinguish the cleaning
table in the subquery from the table in the main query. In your query, WHERE column_b = cleaning.column_b
is comparing the column with itself, so the condition is always true and you're counting all the rows.
SELECT c1.*,
(SELECT COUNT(*)
FROM cleaning AS c2
WHERE c1.column_b = c2.column_b)
AS multicleanamount
FROM cleaning AS c1
It may also be better to write this as a JOIN
instead of correlated subquery:
SELECT c1.*, c2.multicleanamount
FROM cleaning AS c1
JOIN (
SELECT column_b, COUNT(*) AS multicleanamount
FROM cleaning
GROUP BY column_b
) AS c2 ON c1.column_b = c2.column_b