Home > Software engineering >  (Alembic, SQLAlchemy) Can I copy data from non partitioned key to a partitioned one in the migration
(Alembic, SQLAlchemy) Can I copy data from non partitioned key to a partitioned one in the migration

Time:01-31

I have a table needs to be partitioned, but since the postgresql_partition_by wasn't added while the creation of the table so am trying to:

  • create a new partitioned table that is similar the origin one.
  • moving the data from the old one to the new one.
  • drop the original one.
  • rename the new one. so what is the best-practice to move the data from the old table to the new one ??

I tried this and it didn't work

COPY partitioned_table 
FROM original_table;

also tried

INSERT INTO partitioned_table (column1, column2, ...)
SELECT column1, column2, ...
FROM original_table;

but both didn't work :( noting that I am using Alembic to generate the migration scripts also am using sqlalchemy from Python

CodePudding user response:

Basically you have two scenarios described below.

- The table is large and you need to split the data in several partitions

- The table gets the first partition and you add new partition for new data

Lets use this setup for the not partitioned table

create table jdbn.non_part 
 (id int not null, name varchar(100));

insert into jdbn.non_part (id,name)
SELECT  id, 'xxxxx'|| id::varchar(20) name
from generate_series(1,1000) id;

The table contains id from 1 to 1000 and for the first case you need to split them in two partition for 500 rows each.

Create the partitioned table

with identical structure and constraints as the original table

create table jdbn.part  
(like jdbn.non_part INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
PARTITION BY RANGE (id);

Add partitions

to cover current data

create table jdbn.part_500 partition of jdbn.part
for values from (1) to (501); /* 1 <= id < 501 */

create table jdbn.part_1000 partition of jdbn.part
for values from (501) to (1001);  

for future data (as required)

create table jdbn.part_1500 partition of jdbn.part
for values from (1001) to (1501); 

Use insert to copy data

Note that this approach copy the data that means you need twice the space and a possible cleanup of the old data.

insert into jdbn.part (id,name)
select id, name from jdbn.non_part;

Check partition pruning

Note that only the partition part_500 is accessed

EXPLAIN SELECT * FROM jdbn.part WHERE id <= 500;

QUERY PLAN                                                      |
---------------------------------------------------------------- 
Seq Scan on part_500 part  (cost=0.00..14.00 rows=107 width=222)|
  Filter: (id <= 500)                                           |

Second Option - MOVE Data to one Partition

If you can live with the one (big) initial partition, you may use the second approach

Create the partitioned table

same as above

Attach the table as a partition

ALTER TABLE jdbn.part ATTACH PARTITION jdbn.non_part
   for values from (1) to (1001);

Now the original table gets the first partition of your partitioned table. I.e. no data duplication is performed.

EXPLAIN SELECT * FROM jdbn.part WHERE id <= 500;
QUERY PLAN                                                     |
--------------------------------------------------------------- 
Seq Scan on non_part part  (cost=0.00..18.50 rows=500 width=12)|
  Filter: (id <= 500)                                          |

Similar answer with some hints to automation of partition creation here

CodePudding user response:

After trying a few things, the solution was:

INSERT INTO new_table(fields ordered as the result of the select statement) SELECT * FROM old_table

dunno if there was an easier way to get the fields ordered, but I tried inserting a row in DBEver from these options Then got names like these steps

  • Related