Home > OS >  Counting nulls for each column in table
Counting nulls for each column in table

Time:09-15

We want to count how many nulls each column in a table has. There are too many columns to do this one by one, so the following PLSQL procedure was created.

In the first part of the procedure, all column names are obtained. This works, as the dbms_output correctly lists them all.

Secondly, a query inserts the count of null values in the variable 'nullscount'. This part does not work, as the output printed for this variable is always 0, even for columns where we know there are nulls.

Does anyone know how to handle the second part correctly?

Many thanks.

CREATE OR REPLACE PROCEDURE COUNTNULLS AS 
nullscount int;

BEGIN

for c in (select column_name from all_tab_columns where table_name = upper('gp'))
loop

select count(*) into nullscount from gp where c.column_name is null;

dbms_output.put_line(c.column_name||' '||nullscount);
end loop;
 
END COUNTNULLS;

CodePudding user response:

You can get it with just one query like this: this query scans table just once:

DBFiddle: https://dbfiddle.uk/asgrCezT

select *
from xmltable(
        '/ROWSET/ROW/*'
        passing
        dbms_xmlgen.getxmltype(
            (
            select 
                'select '
              ||listagg('count(*)-count("'||column_name||'") as "'||column_name||'"',',')
              ||' from '||upper('gp')
            from user_tab_columns 
            where table_name = upper('gp')
            )
        )
        columns
            column_name varchar2(30) path './name()',
            cnt_nulls   int path '.'
        );

Results:

COLUMN_NAME                     CNT_NULLS
------------------------------ ----------
A                                       5
B                                       4
C                                       3

Dynamic sql in this query uses (24 chars column name length) so it should work fine for example for 117 columns with average column name length = 10. If you need more, you can rewrite it a bit, for example:

select *
from xmltable(
        'let $cnt := /ROWSET/ROW/CNT
         for $r in /ROWSET/ROW/*[name() != "CNT"]
           return <R name="{$r/name()}"> {$cnt - $r} </R>'
        passing
        dbms_xmlgen.getxmltype(
            (
            select 
                'select count(*) CNT,'
              ||listagg('count("'||column_name||'") as "'||column_name||'"',',')
              ||' from '||upper('gp')
            from user_tab_columns 
            where table_name = upper('gp')
            )
        )
        columns
            column_name varchar2(30) path '@name',
            cnt_nulls   int path '.'
        );

CodePudding user response:

c.column_name is never null because it's the content of the column "column_name" of the table "all_tab_columns" not the column of which name is the value of c.column_name, in table gp. You have to use dynamic query and EXECUTE IMMEDIATE to achieve what you want.

CodePudding user response:

create table gp (
    id                             number generated by default on null as identity 
                                   constraint gp_pk primary key,
    c1                             number,
    c2                             number,
    c3                             number,
    c4                             number,
    c5                             number
)
;

-- add some data with NULLS and numbers
DECLARE
BEGIN
  FOR r IN 1 .. 20 LOOP
    INSERT INTO gp (c1,c2,c3,c4,c5) VALUES 
    (CASE WHEN mod(r,2) = 0 THEN NULL ELSE mod(r,2) END
    ,CASE WHEN mod(r,3) = 0 THEN NULL ELSE mod(r,3) END
    ,CASE WHEN mod(r,4) = 0 THEN NULL ELSE mod(r,4) END
    ,CASE WHEN mod(r,5) = 0 THEN NULL ELSE mod(r,5) END
    ,5);
  END LOOP;
END;
/

-- check what is in the table
SELECT * FROM gp;

-- do count of each column
DECLARE
  l_colcount NUMBER;
  l_statement VARCHAR2(100) := 'SELECT COUNT(*) FROM $TABLE_NAME$ WHERE $COLUMN_NAME$ IS NULL';
BEGIN
  FOR r IN (SELECT column_name,table_name FROM user_tab_columns WHERE table_name = 'GP') LOOP
  
    EXECUTE IMMEDIATE REPLACE(REPLACE(l_statement,'$TABLE_NAME$',r.table_name),'$COLUMN_NAME$',r.column_name) INTO l_colcount;
    
    dbms_output.put_line('Table: '||r.table_name||', column'||r.column_name||', COUNT: '||l_colcount);
  END LOOP;
END;
/

Table created.

Statement processed.

Result Set 4
ID  C1  C2  C3  C4  C5
1   1   1   1   1   5
2    -  2   2   2   5
3   1    -  3   3   5
4    -  1    -  4   5
5   1   2   1    -  5
6    -   -  2   1   5
7   1   1   3   2   5
8    -  2    -  3   5
9   1    -  1   4   5
10   -  1   2    -  5
11  1   2   3   1   5
12   -   -   -  2   5
13  1   1   1   3   5
14   -  2   2   4   5
15  1    -  3    -  5
16   -  1    -  1   5
17  1   2   1   2   5
18   -   -  2   3   5
19  1   1   3   4   5
20   -  2    -   -  5

20 rows selected.

Statement processed.

Table: GP, columnID, COUNT: 0
Table: GP, columnC1, COUNT: 10
Table: GP, columnC2, COUNT: 6
Table: GP, columnC3, COUNT: 5
Table: GP, columnC4, COUNT: 4
Table: GP, columnC5, COUNT: 0
  • Related