Optimization problem of Redshift

Time:09-17

Our company is a heavy user of Redshift. We run a four-node cluster that supports 10 database tables. One large table, named "click_track", records users' click actions in the mobile app. On the first day of each month the table is trimmed so that only 3 months of data are kept. Because the business is growing rapidly, the table's data volume now surges during the month, reaching up to 3 billion rows, and disk space usage has reached 95%.

Based on the above situation, I have two questions:
1. Disk utilization is very uneven: on two of the nodes it is 92%, but on the other two it is 45%. How can I balance disk usage across the four nodes?

2. We run VACUUM on this large table every week to free disk space, but as the table grows the operation gets slower. A single run sometimes takes about 12 hours and noticeably degrades write performance during that time. How can I optimize this?

CodePudding user response:

Uneven disk usage is usually the result of a poorly chosen distribution style. Unfortunately, once a table has been created, changing its distribution afterward is almost impossible. If the data volume is very large, the only option is to redesign the distribution and then rebuild the table.
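To make the point about distribution design concrete, here is a minimal Python sketch of how the choice of distribution key affects how rows spread over four nodes. Everything in it is an assumption for illustration: the key values (`app_a`, `app_b`, `user_0` and so on) are made up, and MD5 is only a stand-in for whatever hash Redshift uses internally. It is not a model of the real cluster, just a way to see why a key with few distinct values piles rows onto one or two nodes.

```python
import hashlib
from collections import Counter

NUM_NODES = 4  # the question describes a four-node cluster


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Map a distribution-key value to a node using a stable hash.

    MD5 is only a stand-in; Redshift's internal hash function differs.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def rows_per_node(keys, num_nodes: int = NUM_NODES):
    """Count how many rows land on each node for the given key values."""
    counts = Counter(node_for(k, num_nodes) for k in keys)
    return [counts.get(n, 0) for n in range(num_nodes)]


def skew_ratio(per_node):
    """Busiest node divided by the average node; 1.0 means perfectly even."""
    average = sum(per_node) / len(per_node)
    return max(per_node) / average


# Hypothetical low-cardinality key: only two distinct values, one dominant.
skewed_keys = ["app_a"] * 90_000 + ["app_b"] * 10_000

# Hypothetical high-cardinality key: one distinct value per row.
even_keys = [f"user_{i}" for i in range(100_000)]

if __name__ == "__main__":
    for label, keys in (("skewed key", skewed_keys), ("even key", even_keys)):
        per_node = rows_per_node(keys)
        print(label, per_node, f"skew ratio = {skew_ratio(per_node):.2f}")
```

With the two-value key, at most two nodes ever receive rows, so the busiest node holds several times the average. With one distinct value per row, the four counts come out close to equal. The same reasoning suggests picking a high-cardinality, evenly distributed column as the distribution key when the table is rebuilt.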