I have a unique situation. I haven't found a similar question yet. Here's the table I have.
TIMESTAMP SYMBOL NAME PRICE YEAR-LOW YEAR-HIGH STATEMENT-DATE
2022-06-12 17:32:37.117340 AOS A. O. Smith Corporation 58.06 56.61 86.74 2021-12-31
2022-06-12 17:32:37.109389 AOS A. O. Smith Corporation 58.06 56.61 86.74 2020-12-31
2022-06-12 17:32:37.101411 AOS A. O. Smith Corporation 58.06 56.61 86.74 2019-12-31
2022-06-12 17:32:37.093402 AOS A. O. Smith Corporation 58.06 56.61 86.74 2018-12-31
2022-06-12 17:32:37.026740 AOS A. O. Smith Corporation 58.06 56.61 86.74 2017-12-31
2022-06-12 17:32:29.742554 MMM 3M Company 137.65 137.58 203.59 2021-12-31
2022-06-12 17:32:29.727191 MMM 3M Company 137.65 137.58 203.59 2019-12-31
2022-06-12 17:32:29.654842 MMM 3M Company 137.65 137.58 203.59 2017-12-31
2022-06-12 17:32:08.582652 AOS A. O. Smith Corporation 58.06 56.61 86.74 2021-12-31
2022-06-12 17:32:08.574681 AOS A. O. Smith Corporation 58.06 56.61 86.74 2020-12-31
2022-06-12 17:32:08.565711 AOS A. O. Smith Corporation 58.06 56.61 86.74 2019-12-31
2022-06-12 17:32:08.558671 AOS A. O. Smith Corporation 58.06 56.61 86.74 2018-12-31
2022-06-12 17:32:07.904663 AOS A. O. Smith Corporation 58.06 56.61 86.74 2017-12-31
2022-06-12 17:32:00.701647 MMM 3M Company 137.65 137.58 203.59 2021-12-31
2022-06-12 17:32:00.685993 MMM 3M Company 137.65 137.58 203.59 2020-12-31
2022-06-12 17:32:00.670601 MMM 3M Company 137.65 137.58 203.59 2018-12-31
2022-06-12 17:32:00.604380 MMM 3M Company 137.65 137.58 203.59 2017-12-31
As you can see, I have the same data replicated many times, each time the Python program is run.
I want to make the statement-date
column Unique of course, and the problem would be solved. However, many other companies may have the same statement-date
, and since they are part of the same table, rows for other companies don't get written to the database. So, the statement-date
column can NOT be unique in my case, which is what presents this problem.
So, everytime I run the program, the data for a company (field) gets reinserted.
How should I fix this? Either;
a. Do NOT insert values if the company's data already exists for the statement-data column. As I covered before, the statement-data column can be the same for other companies and therefore can't be unique. So, how do I achieve this?
OR
b. Write a Delete statement where I delete older data for a company.
Which would be the most efficient? And how do I write the delete statement in this case?
If a
is not an option, then I need help writing something like DELETE FROM TABLE-NAME WHERE SYMBOL EXISTS FOR STATEMENT-DATE AND TIMESTAMP IS NOT LATEST
? What's the command to do so? Is it possible?
CodePudding user response:
You can create a composite unique index on the columns SYMBOL
and STATEMENT-DATE
so that their combination is unique:
CREATE UNIQUE INDEX idx_symbol_date
ON tablename (SYMBOL, `STATEMENT-DATE`);