Home > Blockchain >  How do I merge a frequency count back onto a table in postgresql?
How do I merge a frequency count back onto a table in postgresql?

Time:12-10

I am attempting to count the number of each category and merge it back onto the table by overwriting the table in postgresql.

This is the main table I have (Named Titanic, containing the columns in question):

PassengerId Group
0001_01 1
0002_01 2
0003_01 3
0003_02 3

I've altered the table by adding a new numeric column "GroupSize" which I want to contain the frequency counts of each group category. So record 1, would be a count of 1, record 2 would be a count of 1 and record 3 and 4 would both be a count of 2. And I want my main "Titanic" table to be retained as opposed to creating a new table or view so ideally using an "Update" statement to impute values into "GroupSize";

I have created a view to contain group the corresponding frequency counts from this code:

CREATE OR REPLACE VIEW "GroupSize"("Group", "GroupSize") AS
    select "Group", count("Group") from "Titanic" GROUP BY "Group";

which outputs this:

Group GroupSize
1 1
2 1
3 2

And I've tried an Update statement to use this view to add data into my "GroupSize" column from "Titanic" like such:

UPDATE "Titanic"
    SET "GroupSize" = (SELECT "GroupSize" from "GroupSize")
    WHERE "Group" IN (SELECT "Group" from "GroupSize");

I have been unsuccessful in getting this UPDATE statement to work mainly because I get an error: "more than one row returned by a subquery used as an expression". I am pretty new to SQL so ny help would be appreciated.

CodePudding user response:

You almost had it right. The value used in SET is dynamic based off the row being modified. All you have to do is add a WHERE clause to it to ensure it picks the right value from the view.

UPDATE "Titanic"
    SET "GroupSize" = (
        SELECT "GroupSize" from "GroupSize"
        where "Titanic"."Group" = "GroupSize"."Group"
        -- (Pedantic safety limit, just in case)
        limit 1
    )

Beware, though, this will modify ever row, setting NULL for values not found in the view. To have it preserve "GroupSize" column for rows without a match in the view, tack on another WHERE clause:

UPDATE "Titanic"
    SET "GroupSize" = (
        SELECT "GroupSize" from "GroupSize"
        where "Titanic"."Group" = "GroupSize"."Group"
        limit 1
    )
    WHERE "Group" IN (SELECT "Group" from "GroupSize");

CodePudding user response:

Do not actually Update you main table, just create the view to hold the group size. This eliminates maintenance headaches when performing DML on the table, image what extra you need to transfer one group to another. With the count only in the view, you do nothing extra. You get the count of id in each group with the window version of count. (see demo)

create or replace view titanic_vw as 
 select passengerid "Passenger Id" 
      , passenger_group "Group"
      , count(*) over (partition by passenger_group) "Group Size"
   from  titanic;
  • Related