Merging the small files Hive generates after each INSERT

Time:09-30

Scenario:
Daily business data is archived incrementally, that is, with INSERT INTO operations on the table. In the current Hive implementation, each INSERT does not append to the existing file but writes a new one, for example:
Files before the insert: 000000_0
Files after the insert: 000000_0
000000_0_copy_1
After many inserts, the small files pile up.
The officially documented configuration for merging these small files is as follows:
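<property>
  <name>hive.merge.mapfiles</name>
  <value>true</value>
</property>
<property>
  <name>hive.merge.mapredfiles</name>
  <value>true</value>
</property>
<property>
  <name>hive.merge.smallfiles.avgsize</name>
  <value>134217728</value>
</property>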
But this has no effect: the number of files keeps growing.
Is there any other configuration I should look at?
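
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the hive.merge.* settings apply to the files written by the job that is currently running, not to files already sitting in the table directory. Each small INSERT INTO therefore still adds at least one file. Under that assumption, one way to compact the files is to rewrite the table periodically with the merge settings enabled for the session. The table name `daily_archive` is hypothetical.

```sql
-- Session-level equivalents of the XML properties above.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
SET hive.merge.smallfiles.avgsize=134217728;
-- Target size of the merged files; 256 MB here is an illustrative choice.
SET hive.merge.size.per.task=268435456;
-- If the query runs on Tez rather than MapReduce, the Tez flag is the relevant one.
SET hive.merge.tezfiles=true;

-- Rewrite the (hypothetical) table onto itself so that its existing
-- small files pass through a job whose output can be merged.
INSERT OVERWRITE TABLE daily_archive
SELECT * FROM daily_archive;
```

This rewrites the whole table, so for large tables it is usually run on a schedule or per partition rather than after every insert.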

CodePudding user response:

Create a temporary table as an intermediary and then do the incremental load with a join. I'm not sure whether that will solve the problem.
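
A minimal sketch of the intermediary-table idea, assuming the day's new rows are first loaded into a staging table with the same schema as the target. The names `daily_archive` and `daily_archive_staging` are hypothetical, and the sketch uses UNION ALL rather than a join because it only appends rows.

```sql
-- Rewrite the target as "existing rows + today's rows" in a single job,
-- so the result is written as fresh output files instead of one more
-- _copy_N file next to the old ones.
INSERT OVERWRITE TABLE daily_archive
SELECT * FROM (
  SELECT * FROM daily_archive
  UNION ALL
  SELECT * FROM daily_archive_staging
) merged;
```

Whether the output ends up as a few large files still depends on the merge settings and the execution engine, so this is worth checking against the table directory after a run.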

CodePudding user response:

Was this ever solved?

CodePudding user response:

Build a table with a primary key to hold each day's new data. Before each insert, delete the old rows by primary key, then insert the new rows.
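
Hive's DELETE statement needs a transactional (ACID) table, so on a plain table the "delete by key, then insert" step is often approximated with an anti-join rewrite. The following is a sketch under that assumption. The tables `daily_archive` and `daily_archive_staging` and the key column `id` are hypothetical, and both tables are assumed to share one schema.

```sql
INSERT OVERWRITE TABLE daily_archive
SELECT * FROM (
  -- Keep existing rows whose key does not appear in today's data.
  SELECT a.*
  FROM daily_archive a
  LEFT OUTER JOIN daily_archive_staging s ON a.id = s.id
  WHERE s.id IS NULL
  UNION ALL
  -- Add today's rows, which replace any old rows with the same key.
  SELECT s2.* FROM daily_archive_staging s2
) merged;
```

Because the whole table is rewritten on each run, the file count stays bounded, at the cost of reprocessing old data every day.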