How to update existing MySQL data from Spark

Time:09-24

Dataset.write().mode(SaveMode.Append).jdbc()
When writing data to a database, the only save modes available are Append, Overwrite, ErrorIfExists, and Ignore. How can I update rows that already exist, and how is this usually handled?

CodePudding user response:

I wrote an insertOrUpdate. The logic is that the Dataset has the same table structure as the target table, and one field is specified as the id. Inside the foreachPartition operator, each record is checked with one query to see whether its primary key already exists. If it does, an UPDATE statement is assembled; if not, the record is simply inserted.
The most brute-force way is delete + insert.
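The check-then-write logic described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the function name `insert_or_update`, its parameters, and the `users` table are all hypothetical. It assumes a Python DB-API connection and at least one non-key column; `sqlite3` is used here only so the sketch runs without a MySQL server, which is why the placeholder style is passed in as `ph` (MySQL drivers typically use `%s`, sqlite3 uses `?`). In Spark, a function like this would be called inside foreachPartition with a connection opened per partition.

```python
import sqlite3

def insert_or_update(conn, table, key, columns, rows, ph="%s"):
    """For each row, query whether its key exists, then UPDATE or INSERT.

    Hypothetical helper: `rows` is an iterable of dicts (column -> value),
    `ph` is the driver's parameter placeholder. Table and column names are
    interpolated directly, so they must come from trusted code, not user input.
    """
    non_key = [c for c in columns if c != key]
    select_sql = f"SELECT 1 FROM {table} WHERE {key} = {ph}"
    update_sql = (
        f"UPDATE {table} SET "
        + ", ".join(f"{c} = {ph}" for c in non_key)
        + f" WHERE {key} = {ph}"
    )
    insert_sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({', '.join([ph] * len(columns))})"
    )
    cur = conn.cursor()
    for row in rows:
        # One lookup per record: this is the per-row cost discussed in the thread.
        cur.execute(select_sql, (row[key],))
        if cur.fetchone():
            cur.execute(update_sql, [row[c] for c in non_key] + [row[key]])
        else:
            cur.execute(insert_sql, [row[c] for c in columns])
    conn.commit()

# Stand-in for the target table (sqlite3 in memory instead of MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'old')")

insert_or_update(
    conn, "users", "id", ["id", "name"],
    [{"id": 1, "name": "new"}, {"id": 2, "name": "added"}],
    ph="?",
)
result = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
```

Each record costs one SELECT plus one write, so the number of round trips to the database is about twice the number of rows.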

CodePudding user response:

I suggest deleting first and then inserting. It is fast, and update errors won't occur.

CodePudding user response:

Inside the foreachPartition operator, each record is checked with one query to see whether its primary key exists. How does that perform? If the target table has tens of millions of rows, won't the performance be very poor?

CodePudding user response:

Quoting SomebodyTOLove's reply on the 3rd floor:
Inside the foreachPartition operator, each record is checked with one query to see whether its primary key exists. How does that perform? If the target table has tens of millions of rows, won't the performance be very poor?

It is certainly slower than a batch write. For me, a full update of about a million rows takes roughly 15 to 20 minutes; that is about the throughput a standalone MySQL instance can deliver. If you want speed, and nothing else reads from the target table while the job runs, you can use delete + insert, which basically finishes in under 2 to 3 minutes.
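A rough sketch of the delete + insert variant, under some assumptions: the thread does not say whether "delete" means removing only the rows being rewritten or clearing the whole table, so this version deletes only the matching keys. The function name `delete_then_insert`, its parameters, and the `users` table are hypothetical, and `sqlite3` again stands in for MySQL so the example can run on its own.

```python
import sqlite3

def delete_then_insert(conn, table, key, columns, rows, ph="%s"):
    """Delete rows whose key appears in `rows`, then insert all of `rows`.

    Hypothetical helper: both steps are sent as batches and committed together,
    so no per-row existence query is needed. Table and column names are
    interpolated directly and must come from trusted code.
    """
    rows = list(rows)  # iterated twice below
    delete_sql = f"DELETE FROM {table} WHERE {key} = {ph}"
    insert_sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({', '.join([ph] * len(columns))})"
    )
    cur = conn.cursor()
    cur.executemany(delete_sql, [(r[key],) for r in rows])
    cur.executemany(insert_sql, [[r[c] for c in columns] for r in rows])
    conn.commit()

# Stand-in for the target table (sqlite3 in memory instead of MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "old"), (3, "untouched")])

delete_then_insert(
    conn, "users", "id", ["id", "name"],
    [{"id": 1, "name": "new"}, {"id": 2, "name": "added"}],
    ph="?",
)
result = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
```

Between the delete and the commit, readers in other transactions may or may not see the old rows depending on isolation level, which matches the caveat above that this approach suits tables nothing else depends on reading during the job.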

CodePudding user response:

Well, it seems it is best to do only insert operations.

CodePudding user response:

I'm sorry, my own ability is limited, so I can't help you!



