Split values from column based on the name of the column-CodePudding

I work with QGIS and PostgreSQL with PostGIS. I need help with dynamic queries for PostgreSQL.

Information is structured in tables that contain votes for parties, and other types of information like geographic area or election date.

Some columns contains values that have to be splitted among several parties. For example, we can have a column with name "PartyA_PartyB" and a value of 10, and it should be splitted 5 votes to PartyA and 5 votes to PartyB. Additionally we will have independent columns for PartyA and PartyB (separated), so we need to compute a column where we allocate the original PartyA PartyA_PartyB/2.

So for example for the given the tables “Election Results” and "Parties":

create table election_results ("Country" text,  "PartyA" text,  "PartyB" text,  "PartyC" text, "PartyA_PartyB" text);
insert into election_results
VALUES
  ('Argentina', 100, 10, 20, 2),
  ('Uruguay', 3, 5, 1, 0),
  ('Chile', 40, 200, 50, 10)
;


create table parties (party text);
insert into parties
VALUES
  ('PartyA'),
  ('PartyB'),
  ('PartyC'),
  ('PartyD'),
  ('PartyE')
;

I need to create a new table with a column where 'new' PartyA = PartyA PartyA_PartyB/2 and 'new' PartyB = PartyB PartyA_PartyB/2

So with previous data desired result is:

Country	PartyA	PartyB	PartyC
Argentina	101	11	20
Uruguay	3	5	1
Chile	45	205	50

In all cases the special characters that separates the names to be splitted is '_'.

We can have n parties in the column names (for example PartyA_PartyB_PartyD_PartyE). Votes have to be splitted among the n parties.

With my limited understanding I think iterate over the columns could be a solution, look for the '_' character and recalculate.

CodePudding user response：

Note: Please store your values not as text but as a numeric type.

demo: db<>fiddle (2 joined colums)
demo: db<>fiddle (additional 3 joined columns)

Create your new table:

CREATE TABLE parties (
     "Country" text,  
     "PartyA" numeric,  
     "PartyB" numeric,  
     "PartyC" numeric
);

Copy values for the "single" columns:

INSERT INTO parties

SELECT "Country", "PartyA", "PartyB", "PartyC"
FROM election_results;

Update the columns with a function

SELECT * FROM split_and_update_parties();

The function could look like this:

CREATE OR REPLACE FUNCTION split_and_update_parties()
  RETURNS void
  LANGUAGE plpgsql AS
$func$
DECLARE
    i record;
    j text;
    n integer;
BEGIN 
  FOR i in 
     
      SELECT                                       
           column_name,                                  -- 1
           string_to_array(column_name, '_')             -- 2
       FROM information_schema.columns
       WHERE table_name   = 'election_results'
           AND column_name ~ 'Party'     
     
   LOOP
       n = cardinality(i.string_to_array);               -- 3
       IF n > 1 THEN
           FOREACH j in array i.string_to_array LOOP
            
               EXECUTE format('
         
                   UPDATE parties p                      -- 4
                   SET %I = p.%I   s.val / %s
                   FROM (
                       SELECT %I as val, "Country"
                       FROM election_results
                   ) s
                   WHERE p."Country" = s."Country"
          
               ', j, j, n, i.column_name);

           END LOOP;
       END IF;
   END LOOP;

END
$func$;

Explanation:

Fetch column names from internal information schema
Immediately split the names and convert them into arrays
Count the elements of the arrays to know the divider needed furtherly in the calculation
Loop through all these multiple-party-arrays/columns (with more than 1 element), fetch the original values from the election_results table and update the single-party-columns in the new table