Home > front end >  Left joining a table to itself with no WHERE clause results in fewer rows than the original
Left joining a table to itself with no WHERE clause results in fewer rows than the original

Time:10-16

I have a table (table_old) with a column (person_id) that sometimes has duplicate values. I want to add a column that has a the maximum value of age_at_midmonth for each value of person_id. Table_old has 27,236,296 rows. Table_new has 27,209,850 rows. The number of rows should not decrease since this is a LEFT JOIN with no WHERE clause. What is happening here? Am I doing this wrong? Is there a better way? The query is running in Oracle.

    create table schema1.table_new as
        select 
            t1.*, 
            t2.current_age
        from 
            schema1.table_old t1
            left join (
                select person_id, max(age_at_midmonth) as current_age
                from schema1.table_old
                group by person_id
            ) t2
                on t1.person_id = t2.person_id;

CodePudding user response:

The issue here was that I was looking at the number of rows using the Analyze Table feature and then looking at the Stats/Size info for the table using Toad for Oracle. It appears that running Analyze Table does not necessarily produce an accurate count of rows. When I did select count(*) from table_new; and the same for table_old, I got the exact same number of rows. Oops.

CodePudding user response:

I know this wasn't your actual question, but you don't actually even need to do a LEFT JOIN for that example. It would be easier to just use analytical functions instead of the aggregate as part of a subquery. For instance, your example query could be turned into:

SELECT t.*, 
       MAX(t.age_at_midmonth) OVER (PARTITION BY t.person_id) AS CURRENT_AGE
FROM schema1.table_old t;
  • Related