Home > Software engineering >  php laravel update or create massive data
php laravel update or create massive data

Time:01-17

I have a case in which i need to sync an external existing table with the website table every few minutes.

I previously had it with a simple foreach which would loop through every record, as the table grows it became slower and slower and now it is taking a long time for around 20.000 records.

I want to make sure it creates a new record or updates an existing one. This is what I got but it doesn't seem to update the existing rows.

        $no_of_data = RemoteUser::count(); // 20.000 (example)

        $webUserData = array();
        for ($i = 0; $i < $no_of_data; $i  ) {
            // I check the external user so i can match it.
            $externalUser = RemoteUser::where('UserID', $i)
                ->first();

            if($externalUser) {
                $webUserData[$i]['username'] = $externalUser->username;
                $webUserData[$i]['user_id'] = $externalUser->UserID;
            }
        }

        $chunk_data = array_chunk($webUserData, 1000);
        if (isset($chunk_data) && !empty($chunk_data)) {
            foreach ($chunk_data as $chunk_data_val) {

                \DB::table('WebUser')->updateOrInsert($chunk_data_val);
            }
        }

Is there something I am missing or is this the wrong approach?

Thanks in advance

CodePudding user response:

I'll try to make a complete all-in-one answer on some possible event driven solutions. The ideal scenario would be to change the current situation of a static check of each and every row to an event-driven solution where each entry notifies a change.

I won't list solutions per database here and use MySQL by default.

I see three possible solutions:

  • using an internal solution if only one and the same database instance is at play using triggers
  • if the creation or modification of the eloquent models are based in one place, eloquent events could be optional
  • alternatively mysql replication could play a role to catch the events if modifications occur outside of the application (multiple applications modify the same database).

Using triggers

If the situation applies syncing data on the same database instance (different databases) or on the same database process (same database) and the data you copy doesn’t need intervention by an external interpreter, you can use SQL or any extension of SQL supported by your database to use triggers or prepared statements.

I assume you’re using MySQL, if not, SQL triggers are quite similar across all known databases supporting SQL.

A trigger structure has a simple layout like:

CREATE TRIGGER trigger_name
AFTER UPDATE ON table_name 
FOR EACH ROW
  body_to_execute

Where AFTER_UPDATE is the event to catch in this example.

So for an update event, we would like to know the data that has been changed AFTER it has been updated, so we’ll use the AFTER UPDATE trigger.

So an AFTER UPDATE for your table, calling both remote_user as original and web_user as the copy, using both user_id and username as fields, would look something like

CREATE TRIGGER user_updated 
AFTER UPDATE ON remote_user 
FOR EACH ROW
  UPDATE web_user
     SET username = NEW.username
   WHERE user_id = NEW.user_id;

The variables NEW and OLD are available in triggers, where NEW owns the data after the update and OLD before the update.

For a new user that has been inserted, we have the same procedure, we just need to create the entry in web_user.

CREATE TRIGGER user_created
AFTER INSERT on remote_user
FOR EACH ROW
  INSERT INTO web_user(user_id, username)
  VALUES(NEW.user_id, NEW.username);

Hope this gives you a clear idea on how to use triggers with SQL. There is a lot of information to be found, guides, tutorials, you name it. SQL might be an old boring language created by old people with long beards, but to know its features gives you a great advantage to solve complicated problems with simple methods.

Using Eloquent events

Laravel has a bunch of Eloquent events that get triggered when models do stuff. If the creation or modification of a model (entry in the database) only occur in one place (e.g. on entry point or application), the use of Eloquent events could be an option.

This means that you have to guarantee that the modification and/or creation takes place using Eloquents model:

Model::create([...]);
Model::find(1)->update([...])
$model->save();
// etc

And not indirectly using DB or similar:

// won't trigger any event
DB::table('remote_users')->update()->where();

Also avoid using saveQuietly() or any method on the model that's been built deliberately to suppress events.


The simplest solution would be to directly register events in the model itself using the protected static boot method.

namespace App\Models;

use bunch\of\classes;

class SomeModel extends Model {

   protected static function booted() {
     static::updated(function($model) {
         // access any database or service
     });

     static::created(function($model) {
       // access any database or service
     });
   }

}

To put the callback on a queue, Laravel 8 and up offer the queueable function to utilize the queue.

static::updated(queueable(function($ThisModel) {
  // access any database or service
}));

From Laravel 7 or lower, it would be wise to create an observer and push everything on queue using jobs.


example based on your comment

If a model is present for both databases, the Eloquent events could be used in such a way, where InternalModel presents the main model which will trigger the events (source) and ExternalModel the model to update its, to be synced, database (sync table or replication).

namespace App\Models;

use App\Models\ExternalModel;

class InternalModel extends Model {

   protected static function booted() {
     static::updated(function($InternalModel) {
       ExternalModel::find($InternalModel->id)->update([
         'whatever-needs' => 'to-be-updated'
       ]);
     });

     static::created(function($InternalModel) {
       ExternalModel::create([
         'whatever-is' => 'required-to-create',
         'the-external' => 'model'
       ]);
     });

     static::deleted(function($InternalModel) {
        // do know we only have the $InternalModel object left, the entry in the database doesn't exist anymore.
        ExternalModal::destroy($InternalModel->id);
     });
   }

}

And remember to use the queueable() to utilize the queue if it might take longer than expected.

If indeed for some reason the InternalModel table get's updated by not using Eloquent, you can trigger each Eloquent event manually if the dispatch event() method is accessible, to keep the sync process functional. e.g.

$modal = InternalModel::find($updated_id);

// trigger the update manually
event('eloquent.updated: ' . $model::class, $model);

All Eloquent events related to the models can be triggered in such a way, so: retrieved, creating, created, updating, updated, saving, saved, deleting and so on.

I would also suggest to create an additional Console command to start the sync process once, before stepping over to the Eloquent model events. Such a command is like the foreach you already used where you check once if all data is synced, something like php artisan users:sync. This could help if sometimes events don't trigger caused by exceptions, this is rare, but it does happen once in a while.


MySQL Replication

If triggers isn't a solution and you can't guarantee the data is modified from one single source, replication would be my final solution.

Someone created a package for Laravel which uses the krowinski/php-mysql-replication or the more up to date fork moln/php-mysql-replication called huangdijia/laravel-trigger.

A few things need to be configured though:

Firstly MySQL should be configured to save all events in a log file to be read.

server-id        = 1
log_bin          = /var/log/mysql/mysql-bin.log
expire_logs_days = 1
max_binlog_size  = 100M
binlog_row_image = full
binlog-format    = row

Secondly, the database user connected with the database should be granted replication privileges:

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'user'@'host';

GRANT SELECT ON `dbName`.* TO 'user'@'host';

The general idea here is to readout a log file MySQL generates about events that occur. Writing this answer took me a while longer because I couldn't get this package up and running within a few minutes. Though I have used it in the past and know it worked flawlessly, I wrote a smaller package which would minimize the traffic and filter out events I didn't use.

I've already opened an issue and I'm going to open several over time to get this thing up and running again.

But to grasp the idea of its use-fullness, I'm going to explain its workings anyway.

To configure an event, listeners are put in a routes file called routes/trigger.php, where you have access to the $trigger instance (manager) to bind your listeners.

If we would put this into context of your tables, a listener would look like

$trigger->on('database_name.remote_users', 'update', function($event) {
  // event will contain a `EventInfo` object with changed entry data.
});

Same would go for create (write) events on the table

$trigger->on('database_name.remote_users', 'write', function($event) {
  // event will contain a `EventInfo` object with changed entry data.
});

To start listening for database events use the

php artisan trigger:start

To get a list of all listeners recognized from the routes/trigger.php use

php artisan trigger:list

To get a status of which bin file has been recognized and its current position use

php artisan trigger:status

In a ideal situation you would use supervisor to start the listener (artisan trigger:start) to be run in the background. If the service needs to boot again due to updates made in your application, you can simply use php artisan trigger:terminate to reboot the service. Supervisor will notice and start again with a fresh booted application.


update on package status

They seem to respond very well and some things have already been fixed. I can definitely say for sure that this package will be up and running the in a few weeks.


Normally I won't put anything in my answers on stuff I didn't used or tested myself, though I know this worked before, I'm giving it a chance it's going to work again in the next several weeks. It's a least something to watch out for or even test it to grasp ideas on how to implement in a real case scenario.

Hope you enjoyed reading.

  • Related