Home > OS >  Count for each columns in table (not null) PostgreSQL PL/pgSQL
Count for each columns in table (not null) PostgreSQL PL/pgSQL

Time:11-11

I am trying to count the number of rows that do not contain null for each column in the table

There is a simple table actor_new The first 2 columns (actor_id, first_name) contain 203 rows not null Other 2 columns (last_name, last_update) contain 200 rows not null

This is a simple test that outputs the same value for all columns, but if you perform select separately, then everything works correctly, please help me understand the LOOP block

create or replace function new_cnt_test_ho(in_table text, out out_table text, out cnt_rows int) returns setof record AS $$ 
DECLARE i text;
BEGIN 
    FOR i IN 
        select column_name 
        from information_schema."columns" 
        where table_schema = 'public' 
        and table_name = in_table
    LOOP 
        execute '
        select $1, count($1)
        from '|| quote_ident(in_table) ||'
        where $1 is not null '
        INTO out_table, cnt_rows
        using i, quote_literal(i), quote_ident(in_table), quote_literal(in_table) ;
        return next;
    END LOOP;
END;
$$LANGUAGE plpgsql

Result:

select * from new_cnt_test_ho('actor_new')
out_table  |cnt_rows|
----------- -------- 
actor_id   |     203|
first_name |     203|
last_name  |     203|
last_update|     203|

enter image description here

There are 4 parameters specified in using, because I assumed that the error was in quotes, I took turns playing with arguments from 1 to 4

The correct result should be like this

out_table  |cnt_rows|
----------- -------- 
actor_id   |     203|
first_name |     203|
last_name  |     200|
last_update|     200|

CodePudding user response:

based on your title: input is a table name, output is a table one column is column name, another column is return of count(column)
first check the table exists or not. then for loop get each column name, after that for each column name run a query.

a sample query is select 'cola',count(cola) from count_nulls. first occurrence is literal 'cola', so we need quote_literal(cols.column_name), second is the column name, so we need use quote_ident(cols.column_name)

select 'cola',count(cola) from count_nulls will count column cola all not null value. if a column all value is null then return 0.
The following function will return the expected result. Can be simplified, since i use a lot of raise notice.

CREATE OR REPLACE FUNCTION get_all_nulls (_table text)
    RETURNS TABLE (
        column_name_ text,
        numberofnull bigint
    )
    AS $body$
DECLARE
    cols RECORD;
    _sql text;
    _table_exists boolean;
    _table_reg regclass;
BEGIN
    _table_reg := _table::regclass;
    _table_exists := (
        SELECT
            EXISTS (
                SELECT
                FROM
                    pg_tables
                WHERE
                    schemaname = 'public'
                    AND tablename = _table));
    FOR cols IN
    SELECT
        column_name
    FROM
        information_schema.columns
    WHERE
        table_name = _table
        AND table_schema = 'public' LOOP
            _sql := 'select ' || quote_literal(cols.column_name) || ',count(' || quote_ident(cols.column_name) || ') from ' || quote_ident(_table::text);
            RAISE NOTICE '_sql:%', _sql;
            RETURN query EXECUTE _sql;
        END LOOP;
END;
$body$ STRICT
LANGUAGE plpgsql;

setup.

begin;
create table count_nulls(cola int, colb int, colc int);
INSERT into count_nulls values(null,null,null);
INSERT into count_nulls values(1,null,null);
INSERT into count_nulls values(2,3,null);
commit;
  • Related