Notes on fixing intermittent blocking in a SparkStreaming + Kafka + Redis project

The project processes data in batches. At first it was stable: one batch per second, each finishing in roughly 200 ms.
Then the data volume doubled, and the job started blocking from time to time.
Troubleshooting steps:
1. My first thought was that computing resources were insufficient. After checking the machines, I found no problem there.
2. Next I looked at the communication speed between the servers. We use Alibaba Cloud servers, so I moved all of them into the same network segment. Still no improvement.
3. With nothing else left, I turned to Redis. Redis is shared with the back end, so I did not want to jump to conclusions. I added timing points in the business logic and printed the duration of every Redis operation, which showed that Redis was indeed the problem.
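
The timing points from step 3 could look roughly like the sketch below. This is only an illustration, not the project's actual code: the names timed, slow_operations, and fake_redis_get are made up for this example, the 50 ms threshold is an arbitrary assumption, and the Redis call is replaced by a sleep so the snippet runs without a server.

```python
import time
from contextlib import contextmanager

# Hypothetical threshold; the post only says a healthy batch took about 200 ms.
SLOW_THRESHOLD_MS = 50.0


@contextmanager
def timed(label, timings):
    """Record how long the wrapped block takes, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        timings.append((label, elapsed_ms))


def slow_operations(timings, threshold_ms=SLOW_THRESHOLD_MS):
    """Return only the recorded operations that exceeded the threshold."""
    return [(label, ms) for label, ms in timings if ms > threshold_ms]


def fake_redis_get(delay_s):
    """Stand-in for a real Redis call; sleeps to simulate latency."""
    time.sleep(delay_s)
    return "value"


timings = []
with timed("GET fast-key", timings):
    fake_redis_get(0.0)
with timed("GET slow-key", timings):
    fake_redis_get(0.1)

# Print only the operations that were slower than the threshold.
for label, ms in slow_operations(timings):
    print(f"slow Redis operation: {label} took {ms:.1f} ms")
```

In a real job, the body of each with block would be the actual Redis client call, and the collected timings would go to the job's log rather than to print.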

Narrowing it down: why was Redis struggling? My first guess was machine memory, so I moved Redis to a server with 32 GB of RAM. No change, so memory was not the cause. Then what was going on? I logged in with the CLI, looked up all the slow operations, and changed the code accordingly. Still no improvement.
Finally I noticed that every so often the Redis service took 100% of the CPU for about two seconds. That brought me back to basics: Redis keeps its data in memory, but to prevent data loss it has a backup (persistence) mechanism with several trigger conditions. Once any one of them is met, Redis flushes the in-memory data to disk as an RDB file (RDB is the default backup mechanism). After I changed the trigger conditions in the conf configuration file, the stalls went away.
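
To make the trigger conditions concrete, here is a small sketch that models how rules of the form "save <seconds> <changes>" in redis.conf decide when an RDB snapshot is due. It is a simplified model, not Redis's own code. The three rules below are the ones shipped in the stock redis.conf of many older Redis releases and are an assumption here, since the post does not say which values were in use or what they were changed to. The function name snapshot_due and the relaxed rule in the last call are hypothetical.

```python
# Save points as (seconds, changes), mirroring "save <seconds> <changes>" lines.
# Assumed values: check your own redis.conf, newer releases ship different ones.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def snapshot_due(seconds_since_last_save, changes_since_last_save,
                 save_points=SAVE_POINTS):
    """Simplified model: a snapshot is due when any single save point has
    both enough elapsed time and enough changed keys."""
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


# Heavy write load: 20000 changes within a minute matches "save 60 10000".
print(snapshot_due(60, 20000))

# Light load: 5 changes in a minute matches no rule yet.
print(snapshot_due(60, 5))

# Hypothetical relaxed configuration keeping only "save 900 1":
# the same heavy load no longer triggers a snapshot every minute.
print(snapshot_due(60, 20000, save_points=[(900, 1)]))
```

Under a write-heavy streaming job, the shortest rule is the one that fires most often, which fits the periodic stalls described above. Relaxing or removing it trades fewer snapshots for a larger window of possible data loss.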

You do need a backup mechanism. What matters is choosing the mechanism and the backup strategy that best fit your servers, your data volume, and how important the data is.

CodePudding user response:

OP, what did you change the trigger condition to? From what you describe, the original setting should have been appendfsync always. Did you change it to no, or to everysec?