In SQLite, my goal is to delete the duplicates from Table 1 (this is a snippet of data from 10,000 rows). After deleting duplicates, I want to keep the new table and name it as "newtable1". The problem I am having is saving the new table after deleting the duplicates. After deleting the duplicates, what other statements should I add for my code (after the third semicolon)? I hope my question is clear as I am new here.
Table 1:
ID | DATE1 | TIME |
---|---|---|
12 | 2021-07-08 | 2:00:34 |
11 | 2020-08-03 | 9:19:39 |
33 | 2020-07-04 | 10:18:00 |
33 | 2020-07-04 | 10:18:00 |
Desired "newtable1" (after deleting duplicates)
ID | DATE1 | TIME |
---|---|---|
12 | 2021-07-08 | 2:00:34 |
11 | 2020-08-03 | 9:19:39 |
33 | 2020-07-04 | 10:18:00 |
Code:
CREATE TABLE table1 AS
SELECT * /*, COUNT(*)*/
FROM table1
WHERE DATE1 >= '2020-07-01'
;
/*check duplicates */
SELECT *, COUNT(*) AS Count
FROM table1
GROUP BY ID, DATE1, TIME
HAVING COUNT(*)>1
ORDER BY ID DESC
;
/*There are 50 duplicates. delete duplicates here*/
delete from table1
where rowid not in (select min(rowid)
from table
group by ID)
;
/*duplicates deleted but how do I save new data?*/
/*I am stuck at this part*/
ALTER TABLE table1
RENAME to newtable1;
CodePudding user response:
Something like
CREATE TABLE newtable1 AS
SELECT id, date1, time
FROM (SELECT id, date1, time,
row_number() OVER (PARTITION BY id, date1, time) AS rn
FROM table1
WHERE date1 >= '2020-07-01')
WHERE rn = 1;
might work; use a window function to easily select only one entry from each group with the same values for all three columns.
Though come to think, it can be even simpler if those are the only columns in your real table:
CREATE TABLE newtable1 AS
SELECT DISTINCT id, date1, time
FROM table1
WHERE date1 >= '2020-07-01';
CodePudding user response:
You can remove all Dupliacts by GROUP BY all Colums and use that to create the table
CREATE TABLE nEwtable1 AS SELECT table1.* FROM table1 WHERE DATE1 >= '2020-07-01' GROUP BY "ID", "DATE1", "TIME"
SELECT * FROM nEwtable1
ID DATE1 TIME 11 2020-08-03 9:19:39 12 2021-07-08 2:00:34 33 2020-07-04 10:18:00
db<>fiddle here