Multiple external scripts with output buffering


Trying to create a batch job that calls 4 external PHP scripts. I can do this using curl_exec, but then it waits for all 4 scripts to complete before sending text to screen. I am sure this can be optimized, but I can't really get it working. Any suggestions on how to do this with curl_multi_getcontent output buffering so it displays the data as it gets it from the external sites?
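For reference, the sequential version being described is presumably something like this sketch ("url1" through "url4" are placeholders for the 4 external scripts):

<?php
// hypothetical sketch of the sequential approach: each curl_exec()
// blocks until that download completes, so the scripts run one after
// another and nothing reaches the screen until the loop is done
$urls = array("url1", "url2", "url3", "url4"); // placeholder urls
$output = '';
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $output .= curl_exec($ch); // waits for the full response
    curl_close($ch);
}
echo $output; // everything appears at once, after all 4 have finished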

CodePudding user response:

curl_multi indeed.

Any suggestions on how to do this with curl_multi_getcontent output buffering so it displays the data as it gets it from the external sites?

well there's no need to buffer it really. printing the response to stdout is curl's default behavior anyway, so if you just want it printed to the terminal, you don't need to add any code for printing at all :)

<?php
$mh = curl_multi_init();
$urls = array(
    "url1",
    "url2",
    "url3",
    "url4"
);
$curls = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    // no CURLOPT_RETURNTRANSFER: each response goes straight to stdout as it arrives
    curl_multi_add_handle($mh, $ch);
    $curls[] = $ch;
}
for (;;) {
    curl_multi_exec($mh, $active);
    if ($active < 1) {
        // all downloads finished
        break;
    }
    // sleep until there is activity on at least one of the transfers
    curl_multi_select($mh);
}
// cleanup
foreach ($curls as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

note that this approach is only safe while you have a small-ish number of urls. once you get to ~100 urls or more, you should probably look into queuing, so you don't get auto-banned by a firewall for what looks like a DDoS. for an example, check the function curl_fetch_multi_2(array $urls_unique, int $max_connections = 100, array $additional_curlopts = null) from https://gist.github.com/divinity76/79efd7b8c0d7849b956cd194659c98e5 - that function will never fetch more than $max_connections (default 100, configurable) urls at the same time :)
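a minimal sketch of that queuing idea (not the gist's actual code, just the shape of it, with the same $max_connections cap):

<?php
// sketch: never keep more than $max_connections transfers registered
// at once; responses go straight to stdout (curl's default)
function fetch_queued(array $urls, int $max_connections = 100): void
{
    $mh = curl_multi_init();
    $queue = $urls;
    $active_handles = 0;
    do {
        // top up: add handles until we hit the cap or run out of urls
        while ($active_handles < $max_connections && !empty($queue)) {
            curl_multi_add_handle($mh, curl_init(array_shift($queue)));
            ++$active_handles;
        }
        curl_multi_exec($mh, $running);
        curl_multi_select($mh);
        // reap finished transfers to free up slots for the queue
        while (($info = curl_multi_info_read($mh))) {
            if ($info['msg'] === CURLMSG_DONE) {
                curl_multi_remove_handle($mh, $info['handle']);
                curl_close($info['handle']);
                --$active_handles;
            }
        }
    } while ($active_handles > 0 || !empty($queue));
    curl_multi_close($mh);
}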

  • edit: per the comments, if you want to buffer the outputs for some reason, it gets more complicated, but you can still do it. for example:
<?php
$mh = curl_multi_init();
$urls = array(
    "https://example.com",
    "https://example.net",
    "https://example.org",
);
$curls = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_ENCODING => '', // accept any encoding curl supports (gzip etc.)
        CURLOPT_RETURNTRANSFER => 1 // buffer the response instead of printing it
    ));
    curl_multi_add_handle($mh, $ch);
    // key by handle id so finished handles are easy to remove
    // (spl_object_id() for PHP 8's CurlHandle objects; on PHP 7, where
    // curl handles are resources, use (int)$ch instead)
    $curls[spl_object_id($ch)] = $ch;
}
while (!empty($curls)) {
    for (;;) {
        curl_multi_exec($mh, $active);
        if ($active < count($curls)) {
            // at least 1 download has finished, process it
            break;
        }
        curl_multi_select($mh);
    }
    while (($info = curl_multi_info_read($mh))) {
        if ($info['msg'] !== CURLMSG_DONE) {
            continue;
        }
        $ch = $info['handle'];
        $url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
        // fetch the buffered response for this finished handle
        $response = curl_multi_getcontent($ch);
        $cap_response = true; // cap at 50 bytes to keep the demo output short
        if ($cap_response) {
            $response = substr($response, 0, 50);
        }
        if ($info['result'] !== CURLE_OK) {
            echo "Error: {$url}: " . curl_error($ch) . "\n";
        }
        echo "download finished: " . var_export(["url" => $url, "response" => $response], true) . "\n";
        // cleanup
        unset($curls[spl_object_id($ch)]);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
}
// cleanup
curl_multi_close($mh);

this code prints

$ php fuk.php
download finished: array (
  'url' => 'https://example.com/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)
download finished: array (
  'url' => 'https://example.net/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)
download finished: array (
  'url' => 'https://example.org/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)

downloading example.com, example.net, and example.org in parallel and buffering the output :)
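and if what you really want is the data on screen the moment each chunk arrives (rather than once per finished download), a CURLOPT_WRITEFUNCTION callback per handle can do that too. a sketch, with a [url] prefix added only because chunks from different transfers interleave:

<?php
// sketch: stream every chunk to stdout the moment curl receives it,
// prefixed with its url since chunks from different transfers interleave
$mh = curl_multi_init();
$handles = array();
foreach (array("https://example.com", "https://example.net") as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use ($url) {
        echo "[{$url}] {$chunk}";
        return strlen($chunk); // must return the bytes handled, or curl aborts the transfer
    });
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}
do {
    curl_multi_exec($mh, $active);
    curl_multi_select($mh); // wait for network activity instead of busy-looping
} while ($active > 0);
foreach ($handles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);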
