Slow data insertion into MySQL database


I am trying to build a CSV upload page for an application I'm working on. It needs to upload thousands of rows of data in seconds, each row containing a first name, last name, and phone number. The data is uploaded to a VM running Ubuntu Server. When I run the script, it takes almost 2 minutes to insert 1500 rows. The script uses PDO; I also wrote a test script in Python to check whether it was a PHP issue, and the Python script is just as slow. I have written practically identical CSV upload scripts in the past that uploaded thousands of rows in seconds. We have narrowed the issue down to the script itself, since the problem persists when we test on other VMs hosted closer to us. Is there an obvious issue with the script, or with PDO, that could be slowing it down? Below is the code for the script.

<?php

$servername = [Redacted];
$username = [Redacted];
$password = [Redacted];

try {
    $conn = new PDO("mysql:host=$servername;dbname=test", $username, $password);
    // set the PDO error mode to exception
    $conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    echo "Connected successfully";
} catch (PDOException $e) {
    // stop here if the connection failed instead of running the rest of the script
    die("Connection failed: " . $e->getMessage());
}

print_r($_FILES); // print_r() outputs directly; echoing its return value just prints "1"

$fileTmpPath = $_FILES['fileToUpload']['tmp_name'];
$fileName = $_FILES['fileToUpload']['name'];
$fileSize = $_FILES['fileToUpload']['size'];
$fileType = $_FILES['fileToUpload']['type'];

$CSVfp = fopen($fileTmpPath, "r");

// fgetcsv() already returns each row as an array, so the rows can be
// collected directly without copying the fields one by one
$final_array = [];
while (($data = fgetcsv($CSVfp)) !== false) {
    $final_array[] = $data;
}
print_r($final_array);

fclose($CSVfp);

$non_compliant_rows = [];

foreach ($final_array as $key => $row) {
    $fname = preg_replace('/[^A-Za-z0-9]/', '', $row[0]);
    $lname = preg_replace('/[^A-Za-z0-9]/', '', $row[1]);
    $mobileNumber = preg_replace('/[^0-9]/', '', $row[2]);
    // write the sanitized values back so they are what actually gets inserted
    $final_array[$key] = array($fname, $lname, $mobileNumber);
    if (strlen($mobileNumber) > 10 || strlen($mobileNumber) < 9) {
        array_push($non_compliant_rows, $row);
        unset($final_array[$key]);
    }
}
$final_array = array_values($final_array);
print_r($final_array);

foreach ($final_array as $item) {
    try {
        // one prepared statement and one auto-committed INSERT per row
        $stmt = $conn->prepare("INSERT INTO bulk_sms_list (fname, lname, pn, message, send) VALUES (?, ?, ?, 'EMPTY', 1);");
        $stmt->execute($item);
    } catch (PDOException $e) {
        echo $e;
    }
}
echo "done";

The phone number column has a UNIQUE constraint to prevent duplicate phone numbers. We have tried batch inserting, but if one row violates the constraint then the entire batch fails.
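For illustration, here is a minimal sketch (hypothetical code, not from the application) of a batched insert against the same table: a plain multi-row INSERT aborts entirely when any row trips the UNIQUE key, but with INSERT IGNORE MySQL skips the offending rows and inserts the rest.

// Hypothetical sketch: one multi-row statement instead of one INSERT per row.
// INSERT IGNORE skips rows that violate the UNIQUE constraint on pn
// rather than aborting the whole statement.
$placeholders = implode(', ', array_fill(0, count($final_array), "(?, ?, ?, 'EMPTY', 1)"));
$sql = "INSERT IGNORE INTO bulk_sms_list (fname, lname, pn, message, send) VALUES $placeholders";

$params = [];
foreach ($final_array as $row) {
    array_push($params, $row[0], $row[1], $row[2]); // fname, lname, pn
}

$stmt = $conn->prepare($sql);
$stmt->execute($params);

In practice the rows would need to be chunked (say, 500 per statement) to stay under max_allowed_packet.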

Below is the schema of the table:

+---------+------+------+-----+---------+----------------+
| Field   | Type | Null | Key | Default | Extra          |
+---------+------+------+-----+---------+----------------+
| id      | int  | NO   | PRI | NULL    | auto_increment |
| fname   | text | NO   |     | NULL    |                |
| lname   | text | NO   |     | NULL    |                |
| pn      | text | NO   | UNI | NULL    |                |
| message | text | YES  |     | NULL    |                |
| send    | int  | NO   |     | 1       |                |
+---------+------+------+-----+---------+----------------+

EDIT: I have timed the clean-up portion of the script at the request of @aynber. The clean-up took 0.24208784103394 seconds, while the SQL portion took 108.2597219944 seconds.
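(For reference, a minimal sketch of one way the two phases could have been timed, assuming the clean-up and insert loops above; microtime(true) returns the current time as a float in seconds.)

$t0 = microtime(true);
// ... clean-up loop ...
$t1 = microtime(true);
// ... insert loop ...
$t2 = microtime(true);
echo "clean up: " . ($t1 - $t0) . " seconds\n";
echo "sql: " . ($t2 - $t1) . " seconds\n";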

CodePudding user response:

The fastest solution should be to use LOAD DATA LOCAL INFILE. Since you answered in a comment that duplicate phone numbers should result in skipping a row, you can use the IGNORE option.

Load directly from the file, instead of processing it with PHP. You can do some of your transformations in the LOAD DATA statement.

For example:

LOAD DATA LOCAL INFILE ? IGNORE INTO TABLE bulk_sms_list
FIELDS TERMINATED BY ','
(@fname, @lname, @mobile)
SET fname = REGEXP_REPLACE(@fname, '[^A-Za-z0-9]', ''),
    lname = REGEXP_REPLACE(@lname, '[^A-Za-z0-9]', ''),
    pn = IF(LENGTH(@mobile) BETWEEN 9 AND 10, @mobile, NULL),
    message = 'EMPTY', 
    send = 1;

Then follow the import with some cleanup to get rid of any rows with an invalid phone number. Note that pn is declared NOT NULL, so with the IGNORE option MySQL coerces the NULL from the IF() into the column's implicit default (an empty string for a text column) rather than rejecting the row:

DELETE FROM bulk_sms_list WHERE pn IS NULL OR pn = '';

Read https://dev.mysql.com/doc/refman/8.0/en/load-data.html for more information.
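If you drive the load from PHP, the LOCAL variant has to be enabled on the PDO connection when it is created (and local_infile must be allowed on the server). A minimal sketch, with the uploaded file's path inlined via quote() since LOAD DATA cannot be prepared as a server-side statement:

// Sketch: run LOAD DATA LOCAL INFILE through PDO.
$conn = new PDO(
    "mysql:host=$servername;dbname=test",
    $username,
    $password,
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true] // must be set at connect time
);

$sql = "LOAD DATA LOCAL INFILE " . $conn->quote($fileTmpPath) . "
    IGNORE INTO TABLE bulk_sms_list
    FIELDS TERMINATED BY ','
    (@fname, @lname, @mobile)
    SET fname = REGEXP_REPLACE(@fname, '[^A-Za-z0-9]', ''),
        lname = REGEXP_REPLACE(@lname, '[^A-Za-z0-9]', ''),
        pn = IF(LENGTH(@mobile) BETWEEN 9 AND 10, @mobile, NULL),
        message = 'EMPTY',
        send = 1";
$rows = $conn->exec($sql); // returns the number of rows loaded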

CodePudding user response:

How to optimize slow INSERT queries in MySQL:

  • Remove existing indexes - inserts slow down as you add more and more indexes to a table, because every index has to be updated on each insert.
  • Insert data in bulk - to optimize insert speed, combine many small operations into a single large operation, as sketched below.
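Even without switching to multi-row statements, the cheapest version of "one large operation" is a single transaction around the existing loop, with the statement prepared once. A minimal sketch against the question's table (INSERT IGNORE added so duplicate phone numbers are skipped instead of aborting):

$stmt = $conn->prepare(
    "INSERT IGNORE INTO bulk_sms_list (fname, lname, pn, message, send)
     VALUES (?, ?, ?, 'EMPTY', 1)"
);

$conn->beginTransaction();
foreach ($final_array as $item) {
    $stmt->execute($item); // reuse the prepared statement for every row
}
$conn->commit();

Each auto-committed INSERT normally forces its own flush to disk; one surrounding transaction amortizes that cost across all 1500 rows.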
