Home > Software engineering >  SQL Laravel 5.7 unique() on collection really slow
SQL Laravel 5.7 unique() on collection really slow

Time:05-23

I'm trying to calculate the number of unique records based on a mobile column that has an index via the Laravel collect and unique method. I have 200,000 rows and have a column called optout_csv_schedule_id that has an index on it along with the mobile. Right now, it's been running over 15 minutes for the query to execute, how can I improve the performance of this as I need to calculate the number of unique numbers out of the 200,000, my current query is:

/**
 * Get valid lead count
 */
protected function getValidLeadCount($schedule_id)
{
    $optoutConnectionLogs = OptoutConnectionLog::where('optout_csv_schedule_id', $schedule_id)
                                               ->get();

    // no leads
    if (!$optoutConnectionLogs) {
        return 0;
    }

    // count total unique leads
    $uniqueLeads = collect($optoutConnectionLogs)->unique('mobile')->count();
    return $uniqueLeads;
}

CodePudding user response:

It seems to be difficult to calculate the number of unique numbers out of the 200,000 in Laravel. Try to change as follows:

protected function getValidLeadCount($schedule_id)
{
    $uniqueLeads = OptoutConnectionLog::where('optout_csv_schedule_id', $schedule_id)
                                                 ->distinct('mobile')
                                                 ->count('mobile');
    return $uniqueLeads;
}

CodePudding user response:

You are not using the database to do the unique, you already got the records with ->get(), and are using PHP/Laravel to do it. That will be much slower than using the database.

Use distinct() to get unique records, eg:

$optoutConnectionLogs = OptoutConnectionLog::where('optout_csv_schedule_id', $schedule_id)
    ->select('mobile')
    ->distinct()
    ->get();

CodePudding user response:

You read all the data into memory, convert it into PHP objects, and then iterate to count the numbers. The database index you created is not used at all.

Your needs should be simplified into the following code

return OptoutConnectionLog::where('optout_csv_schedule_id', $schedule_id)
                           ->distinct('mobile')
                           ->count();
  • Related