Retrieving missing data from within the same table-CodePudding

I'm importing CSV files with missing data into a MariDB table. I need to find all codes that don't have a corresponding place = 2.

Table cityX
| id | code | place | value | description | subcode | 
| 1  | 001x |  1    |  6.00 | unique str  |   A     |
| 2  | 002x |  1    |  2.23 | diff string |   B     |
| 3  | 003x |  1    |  2.23 | another str |   B     |

Every code in the table must have a duplicate row with place = 1 and place = 2

| id | code | place | value | description | subcode |
| 1  | 001x |  1    |  6.00 | unique str  |   A     |
| 2  | 001x |  2    |  6.00 | unique str  |   A     |

I've used variations of select ... except statements to isolate the codes with varying amount of erroneous fields.

Using SELECT [code] FROM cityX WHERE place = '1' EXCEPT SELECT [code] FROM cityX where place = '2', creating a temporary table and joining the remaining place, value, description, and subcode fields to retrieve missing codes. I'm retrieving most of the missing records, but am introducing duplicates as well.

How can I properly select and insert rows missing a place = 2

CodePudding user response：

This solution avoids EXCEPT which doesn't work in every RDBMS (not sure about mysql)

SELECT CODES.code,
       CODE_W_1.place AS place_1,
       CODE_W_2.place AS place_2
FROM   (SELECT code
        FROM   cityx
        GROUP  BY code) AS CODES
       LEFT OUTER JOIN (SELECT code,
                               place
                        FROM   cityx
                        WHERE  place = 1
                        GROUP  BY code) AS CODE_W_1
                    ON CODES.code = CODE_W_1.code
       LEFT OUTER JOIN (SELECT code,
                               place
                        FROM   cityx
                        WHERE  place = 2
                        GROUP  BY code) AS CODE_W_2
                    ON CODES.code = CODE_W_2.code
WHERE  code_w_1 IS NULL
        OR CODE_W_2.code IS NULL

I don't have access to mysql to test this, but I got this from Rasgo which automatically writes SQL.

CodePudding user response：

We can use selects which test whether the other values exists. We can either use the queries alone to check for unmatched values, or in an insert to add the missing values.

create table cityX (
  id int primary key not null auto_increment,
  code char(5),
  place int );
insert into cityX (code, place) values

('001x',1),('001x',2),('002x',1),('003x',2);

select * from cityX
order by code, place;

id | code | place
-: | :--- | ----:
 1 | 001x |     1
 2 | 001x |     2
 3 | 002x |     1
 4 | 003x |     2

insert into cityX (code, place)
select x.code,1 from cityX x
where place = 2
and not exists
(select id from cityX c
where c.code = x.code
and place = 1);

insert into cityX (code, place)
select x.code,2 from cityX x
where place = 1
and not exists
(select id from cityX c
where c.code = x.code
and place = 2);

select * from cityX
order by code, place;

id | code | place
-: | :--- | ----:
 1 | 001x |     1
 2 | 001x |     2
 3 | 002x |     1
 6 | 002x |     2
 5 | 003x |     1
 4 | 003x |     2

db<>fiddle here