I want to check for duplicate rows . and see there column values . if there were only few columns in my table - 2 for example - I would have done something like: '''
select col1, col2 ,count(*)
from mytable
group by col1,col2
having count(*) > 1.
''' but I have dozens of column in my table .... and using the above syntax is tedious to specify all the columns in the table. trying another approach with select distinct ... will not identify for me the content of duplicated rows . I tried somthing like '''
select * , count (*)
from my table
group by *
''' but that doesn't work.
CodePudding user response:
Write a query which will write a query for you.
For example, "john smith" is a duplicate here:
SQL> select * from my_data order by 1;
FULL_NAME FIRST_NAME LAST_NAME
---------- -------------------- --------------------
h gonzalez h gonzalez
john smith john smith
john smith john smith
rudy chan rudy chan
Query uses user_tab_columns
and aggregates all column names, concatenating them to the rest of a select
statement:
SQL> SELECT 'select '
2 || LISTAGG (column_name, ', ') WITHIN GROUP (ORDER BY column_id)
3 || ', count(*) cnt '
4 || CHR (10)
5 || ' from '
6 || table_name
7 || CHR (10)
8 || ' group by '
9 || LISTAGG (column_name, ', ') WITHIN GROUP (ORDER BY column_id)
10 || CHR (10)
11 || ' having count(*) > 1;' statement_to_run
12 FROM user_tab_columns
13 WHERE table_name = 'MY_DATA'
14 GROUP BY table_name;
STATEMENT_TO_RUN
--------------------------------------------------------------------------------
select FULL_NAME, FIRST_NAME, LAST_NAME, count(*) cnt
from MY_DATA
group by FULL_NAME, FIRST_NAME, LAST_NAME
having count(*) > 1;
Now, copy/paste the above statement_to_run
and get the result:
SQL> select FULL_NAME, FIRST_NAME, LAST_NAME, count(*) cnt
2 from MY_DATA group by
3 FULL_NAME, FIRST_NAME, LAST_NAME having count(*) > 1;
FULL_NAME FIRST_NAME LAST_NAME CNT
---------- -------------------- -------------------- ----------
john smith john smith 2
SQL>
CodePudding user response:
Just write out all the columns.
there are dozens of columns ,about 30 , and there names look like : 'AGtrf-456F_RValue'
Copy-paste.
In SQL*Plus you can use the DESCRIBE
command to describe a table and you can copy the column names from the table description.
Or you can list all the columns using:
SELECT '"' || column_name || '",'
FROM user_tab_columns
WHERE table_name = 'MY_DATA'
ORDER BY column_id;
And then copy-paste the output into your query into the SELECT
and GROUP BY
clauses.
Can you generate the query automatically.
Yes, but it usually is not worth it as it takes longer to write a query to generate the query than it does just to list the columns and copy-paste.
If you have lots of column names that require you to use quoted identifiers (i.e. they are mixed-case or use non-standard characters like -
) then you can use:
SELECT EMPTY_CLOB()
|| 'SELECT '
|| LISTAGG('"' || column_name || '"', ',') WITHIN GROUP (ORDER BY column_id)
|| ', COUNT(1) FROM MY_DATA GROUP BY '
|| LISTAGG('"' || column_name || '"', ',') WITHIN GROUP (ORDER BY column_id)
|| ' HAVING COUNT(1) > 1;'
FROM user_tab_columns
WHERE table_name = 'MY_DATA'
ORDER BY column_id;
Which works unless you have too many columns and LISTAGG
exceeds 4000 characters then you would need to use something like:
WITH columns (col, pos) AS (
SELECT '"' || column_name || '",',
column_id
FROM user_tab_columns
WHERE table_name = 'MY_DATA'
ORDER BY column_id
)
SELECT sql
FROM (
SELECT 'SELECT ' AS sql, 0 FROM DUAL
UNION ALL
SELECT col, pos FROM columns
UNION ALL
SELECT ' COUNT(1) FROM MY_DATA GROUP BY ', 10000 FROM DUAL
UNION ALL
SELECT col, 10000 pos FROM columns
UNION ALL
SELECT '1 HAVING COUNT(1) > 1', 20000 FROM DUAL
ORDER BY 2
)