Home > Net >  How to Create a table in postgreSql which will automatically delete its records after 3 months of cr
How to Create a table in postgreSql which will automatically delete its records after 3 months of cr

Time:11-29

I have a requirement to insert new records and delete this record 3 months after created date. I will create a new table. I want that query will create the table as well as auto delete the records which are more than 3 months of the created date.

CodePudding user response:

This sounds like a maintenance task - already suggested cron job would typically be the most suitable.

  1. If you're free to use extensions - consider pg_cron:

    CREATE EXTENSION pg_cron;
    
    -- Delete old data daily at 3:30am (GMT)
    SELECT cron.schedule(
              'delete outdated records',
              '30 3 * * *', 
              $$ DELETE FROM new_table WHERE created_date < now()-'3 months'::interval $$
       );    
    
  2. Otherwise, save your script to a file and feed it to a regular cron job

    # weekdays at 01:00am
    # min hour mday month wday command-to-run
    0    1    *     *    1-5 psql -h dbhost -p 5432 -U dbuser --dbname dbname < my.sql
    

There's also the topic of how strict your rules are:

  1. If the records can be 3 months old, /- 1 day, the daily cron/pg_cron jobs will suffice. The lower your tolerance, the more often your cron will have to keep repeating the cleanup. At some point it might get impractical if it's effectively running all or most of the time. It could get to a point when next cleanup tasks start before previous ones finish, resulting in a growing queue.

  2. If you absolutely can't afford to be in possession of records older than 3 months but it's tolerated when you already delete stuff that's one day away from the 3 month mark, you can change the cron job to delete records a day earlier

    -('3 months'::interval-'1 day'::interval)

  3. If you are allowed to be in possession of those records, just no longer allowed to show outdated records - hide the table behind a view that's filtering out everything older than exactly 3 months. Demo:

    create table new_table (data text, created_date timestamp default now());
    insert into new_table (data) values ('value1');
    
    alter table new_table rename to new_table_raw;
    
    create view new_table as 
    select * from new_table_raw where created_date>now()-'3 months'::interval;
    
    --insert, update, delete still work even though it's a view now
    insert into new_table (data) values ('value2');
    delete from new_table where data='value2';
    update new_table set data='value3' where data='value1';
    

    This gives you the highest precision at no maintenance expense. It's guaranteed to hide records from 3 months ago and older, as of query time - that's 3 months exactly, down to fractions of a second. You can keep this perceived precision while following the 3 month restriction a bit less strictly, running the clean up task using cron, pg_cron or even triggers, less often.

  4. If you can't have records older than exactly 3 months and you must hold records younger than exactly 3 months, you might want to set up a trigger/rule that will NOTIFY on a channel about incoming records and their scheduled deletion times, and a daemon that can LISTEN on that channel and sets up an at task (schtasks on Windows).

    at now   3 months -f delete_older_than_3_months_using_psql.sh
    

    The script can be a regular delete that's just triggered at the time coinciding with the moment some records become outdated, but still has to search them all and find those, or you can save their primary keys in the cleanup command or let it pop() it off a FIFO queue somewhere.

How long is "3 months"

If you mean 90 days, you need to accept that the deletion dates will shift around based on month lengths within that 90-day difference from creation date.

with
 test_dates(example) as
(   values
    ('2023.01.01'::date),
    ('2023.02.01'::date),
    ('2024.02.01'::date),--leap year
    ('2023.03.01'::date))
select example, (example   '90 days'::interval)::date as "date 90 days later"
from test_dates;
--  example   | date 90 days later
-------------- --------------------
-- 2023-01-01 | 2023-04-01
-- 2023-02-01 | 2023-05-02
-- 2024-02-01 | 2024-05-01 --leap year
-- 2023-03-01 | 2023-05-30

If you mean 3 months, you need to accept that it'll sometimes be longer or shorter:

with 
 test_dates(example) as 
(   values 
    ('2023.01.01'::timestamp),
    ('2023.02.01'::timestamp),
    ('2024.02.01'::timestamp),--leap year
    ('2023.03.01'::timestamp),
    ('2023.04.01'::timestamp),
    ('2023.05.01'::timestamp),
    ('2023.06.01'::timestamp) )
select example, example   '3 months'::interval - example as "3 months length in days"
from test_dates;
--       example       | 3 months length in days
----------------------- -------------------------
-- 2023-01-01 00:00:00 | 90 days
-- 2023-02-01 00:00:00 | 89 days
-- 2024-02-01 00:00:00 | 90 days --leap year
-- 2023-03-01 00:00:00 | 92 days
-- 2023-04-01 00:00:00 | 91 days
-- 2023-05-01 00:00:00 | 92 days
-- 2023-06-01 00:00:00 | 92 days

And that it's interpreted differently by different systems and people: 'N months' interval literal in PostgreSQL:

select '01-31-2023'::timestamp   '1 month';--2023-02-28 00:00:00
select '02-28-2023'::timestamp   '1 month';--2023-03-28 00:00:00
select '02-29-2024'::timestamp   '1 month';--2024-03-29 00:00:00 --leap year
select '03-31-2023'::timestamp   '1 month';--2023-04-30 00:00:00

In bash, at understands it differently:

$ echo "command" | at "00:00 013123"   1 month
job 14 at Fri Mar  3 00:00:00 2023
$ echo "command" | at "00:00 022823"   1 month
job 15 at Tue Mar 28 00:00:00 2023
$ echo "command" | at "00:00 022924"   1 month #leap year
job 16 at Tue Mar 29 00:00:00 2024
$ echo "command" | at "00:00 033123"   1 month
job 17 at Mon May  1 00:00:00 2023

CodePudding user response:

You should set cron jobs on your server this is best approach.

  • Related