For full scale increment table design-CodePudding

Table structure (A.B.C, SRC, time)

Constraints are as follows:

Src the source, the time, said the latest time, full scale is 1 billion records, there is an index on a, logically. A, b, c is the only sure a record, how to design
Business requirements:

Daily amount of SRC have different sources about millions of data, using the data in the daily increment in the table, and then to update the inventory, then insert, if the same source under the SRC A.B.C just update the same time, the operation time can be long, seek expert advice on what to do

CodePudding user response:

Logically. A, b, c only determine a record

PK is built on the three fields,

Use the merge updates;

CodePudding user response:

Given the amount of data is too large, can consider to use partition table

CodePudding user response:

A, b, c three establishing uniqueness constraints, or PK.
If need to establish a range SRC number less than 4000 partition table;

CodePudding user response:

The time interval partition

CodePudding user response:

If it's not Oracle with what big data form, redis?
It should be a, b, c, the only sure a record under the same source

CodePudding user response:

reference 4 floor sych888 response:

time interval partition
"insert time by then, if the same source under the SRC A.B.C just update the same time",
Partition in accordance with the time, no matter you are a LIST, or reaching, you have to scan a full table to determine a, b, c exists, try to ask what's the point?
CodePudding user response:
refer to 6th floor POM_24 response:

Quote: refer to 4th floor sych888 response:

Time interval partition

"insert time by then, if the same source under the SRC A.B.C just update the same time",
Partition in accordance with the time, no matter you are a LIST, or reaching, you have to scan a full table to determine a, b, c exists, try to ask what's the point?

I'll worry about some of the SRC below data volume is too much, to the data to a full table scan, too
CodePudding user response:
One, the index is a tree structure, find does not need a full table scan,
2, a, b, c if there is a string field, move the strings to other dictionaries in the table, only the integer id here, to speed up the processing speed,
Three, with daily increment table (dates, a, b, c) is done advocate key ( date does not include the time), every day the traversal sequence and full scale to check/update order is consistent,
Four, daily increment table to 1000 batches, each batch of a transaction, million-dollar only thousands of times,
CodePudding user response:
Increment table put new data every day, only the original data before receiving daily incremental data empty (other table can be backup to prevent late data error),
For incremental sheet and full scale of the A, B, C three fields to establish uniqueness index, time, said the latest time, show time field will be accurate to seconds or milliseconds, suggested that incremental and full scale table is adding A date date fields, such as tradingdate, on the field to establish the range partitioning index,
Attach the stored procedure:
The MERGE INTO full scale T USING
(SELECT A, B, C, SRC, TIME
The FROM incremental table) S
ON (T.A=S.A AND T.B=S.B AND tc=S.C)
The WHEN MATCHED THEN
UPDATE the SET
(T.T IME=SELECT. TIME)
WHEN NOT MATCHED THEN
INSERT T
(T.A.
T.B.
Tc,
T.S RC,
T.T IME)
VALUES (
S.A,
S.B.
S.C,
S.S RC,
S.T IME)
;
PS: I was a rookie, opinions are for reference only, at the actual execution efficiency in the statement,