Home > Blockchain >  How to delete Sqlite rows that contain old data for the same field but with different timestamps?
How to delete Sqlite rows that contain old data for the same field but with different timestamps?

Time:06-14

I have a unique situation. I haven't found a similar question yet. Here's the table I have.

    TIMESTAMP               SYMBOL NAME PRICE YEAR-LOW YEAR-HIGH               STATEMENT-DATE
2022-06-12 17:32:37.117340  AOS A. O. Smith Corporation 58.06   56.61   86.74   2021-12-31
2022-06-12 17:32:37.109389  AOS A. O. Smith Corporation 58.06   56.61   86.74   2020-12-31
2022-06-12 17:32:37.101411  AOS A. O. Smith Corporation 58.06   56.61   86.74   2019-12-31
2022-06-12 17:32:37.093402  AOS A. O. Smith Corporation 58.06   56.61   86.74   2018-12-31
2022-06-12 17:32:37.026740  AOS A. O. Smith Corporation 58.06   56.61   86.74   2017-12-31
2022-06-12 17:32:29.742554  MMM 3M Company  137.65  137.58  203.59  2021-12-31
2022-06-12 17:32:29.727191  MMM 3M Company  137.65  137.58  203.59  2019-12-31
2022-06-12 17:32:29.654842  MMM 3M Company  137.65  137.58  203.59  2017-12-31
2022-06-12 17:32:08.582652  AOS A. O. Smith Corporation 58.06   56.61   86.74   2021-12-31
2022-06-12 17:32:08.574681  AOS A. O. Smith Corporation 58.06   56.61   86.74   2020-12-31
2022-06-12 17:32:08.565711  AOS A. O. Smith Corporation 58.06   56.61   86.74   2019-12-31
2022-06-12 17:32:08.558671  AOS A. O. Smith Corporation 58.06   56.61   86.74   2018-12-31
2022-06-12 17:32:07.904663  AOS A. O. Smith Corporation 58.06   56.61   86.74   2017-12-31
2022-06-12 17:32:00.701647  MMM 3M Company  137.65  137.58  203.59  2021-12-31
2022-06-12 17:32:00.685993  MMM 3M Company  137.65  137.58  203.59  2020-12-31
2022-06-12 17:32:00.670601  MMM 3M Company  137.65  137.58  203.59  2018-12-31
2022-06-12 17:32:00.604380  MMM 3M Company  137.65  137.58  203.59  2017-12-31

As you can see, I have the same data replicated many times, each time the Python program is run.

I want to make the statement-date column Unique of course, and the problem would be solved. However, many other companies may have the same statement-date, and since they are part of the same table, rows for other companies don't get written to the database. So, the statement-date column can NOT be unique in my case, which is what presents this problem.

So, everytime I run the program, the data for a company (field) gets reinserted.

How should I fix this? Either;

a. Do NOT insert values if the company's data already exists for the statement-data column. As I covered before, the statement-data column can be the same for other companies and therefore can't be unique. So, how do I achieve this?

OR

b. Write a Delete statement where I delete older data for a company.

Which would be the most efficient? And how do I write the delete statement in this case?

If a is not an option, then I need help writing something like DELETE FROM TABLE-NAME WHERE SYMBOL EXISTS FOR STATEMENT-DATE AND TIMESTAMP IS NOT LATEST? What's the command to do so? Is it possible?

CodePudding user response:

You can create a composite unique index on the columns SYMBOL and STATEMENT-DATE so that their combination is unique:

CREATE UNIQUE INDEX idx_symbol_date 
ON tablename (SYMBOL, `STATEMENT-DATE`);
  • Related