Hello, I need some help with a dynamic sitemap in PHP. This is my code:
<?php
header('Content-type: application/xml; charset="ISO-8859-1"', true);
$dataAll1 = scandir('cache-data');
unset($dataAll1[0]);
unset($dataAll1[1]);
unset($dataAll1[2]);
$sitemap = '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="'.PROTOCOL.'://www.sitemaps.org/schemas/sitemap/0.9">';
$sitemap .= '<url>
<loc>' . SITE_HOST . '</loc>
<priority>1.0</priority>
</url>';
foreach($dataAll1 as $val){
$data = json_decode(file_get_contents('cache-data/'.$val),1);
if($val=='index.php'){
continue;
}
$sitemap .= '<url>
<loc>'.SITE_HOST . '/job-detail/' . $data['jk'].'jktk'.$data['tk'] . '-' . $service->slugify($data['title']).'</loc>
<priority>0.9</priority>
<changefreq>daily</changefreq>
</url>';
}
$sitemap.='</urlset>';
echo $sitemap;
This code successfully builds the sitemap, but it does not start a new sitemap file after 50,000 URLs. How can I make it create a new sitemap file every 50,000 URLs?
I would really appreciate any help.
CodePudding user response:
I would recommend switching from a web page that builds the sitemap on every request to a script that generates static XML files, which you can optionally run via crontab.
Step 1: Define directory where sitemaps are saved
This directory will hold one static XML file for each sitemap, plus the sitemap index. It needs to be writable by the user running the script and accessible via the web.
define('PATH_TO_SITEMAP', __DIR__.'/www/');
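Because fopen only emits a warning when the target directory is missing or read-only, a small guard before writing may save some debugging. This is a minimal sketch; the helper name ensureSitemapDir is hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper: make sure the sitemap directory exists and is writable.
function ensureSitemapDir(string $dir): void
{
    // Create the directory (and missing parents) if it is not there yet.
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }
    if (!is_writable($dir)) {
        throw new RuntimeException("Directory is not writable: $dir");
    }
}
```

It could be called once at the top of the script, for example as ensureSitemapDir(PATH_TO_SITEMAP).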
Step 2: Define the public URL where the sitemaps will be reachable
This is the Web URL of the folder you defined above.
define('HTTP_TO_SITEMAP', 'https://yourwebsite.com/');
Step 3: Write sitemaps to file
Instead of echoing the sitemap, write it to static XML files in the folder defined above.
Furthermore, keep a counter ($count_urls in my example below) of the number of URLs; when it reaches 50k, reset the counter, close the sitemap and open a new one.
Also keep a counter of how many sitemaps you're creating ($count_sitemaps in my example below).
Here's a working example that keeps your structure but writes multiple files. It's not ideal: you should use a class or a function to write the sitemaps and avoid repeated code, but I thought this would be easier to understand because it keeps the same approach as your code.
$count_sitemaps = 1; // number of the sitemap file currently being written
$h = fopen(PATH_TO_SITEMAP.'sitemap_'.$count_sitemaps.'.xml', 'w+');
// The xmlns value is a fixed identifier and must stay http://, even on an https site
fwrite($h, '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>'.SITE_HOST.'</loc>
<priority>1.0</priority>
</url>
');
$count_urls = 0;
foreach ($dataAll1 as $val) {
// Skip index.php before counting or reading it
if ($val == 'index.php') {
continue;
}
$count_urls++;
$data = json_decode(file_get_contents('cache-data/'.$val), 1);
// When you reach 50k, close the sitemap, increase the counter and open a new sitemap
if ($count_urls >= 50000) {
fwrite($h, '</urlset>');
fclose($h);
$count_sitemaps++;
$h = fopen(PATH_TO_SITEMAP.'sitemap_'.$count_sitemaps.'.xml', 'w+');
fwrite($h, '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">');
$count_urls = 0;
}
fwrite($h, '
<url>
<loc>'.SITE_HOST.'/job-detail/'.$data['jk'].'jktk'.$data['tk'].'-'.$service->slugify($data['title']).'</loc>
<priority>0.9</priority>
<changefreq>daily</changefreq>
</url>
');
}
// Important! Close the final sitemap and save it.
fwrite($h, '</urlset>');
fclose($h);
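The repeated open/close logic above could also be folded into a single function, as the answer suggests. The following is only a sketch of that idea, assuming the full URLs have already been collected into an array; the function name writeSitemaps and its parameters are hypothetical:

```php
<?php
// Hypothetical helper: writes one sitemap file per chunk of at most $limit URLs
// into $dir (which must end with a slash) and returns the number of files written.
function writeSitemaps(array $urls, string $dir, int $limit = 50000): int
{
    $count = 0;
    foreach (array_chunk($urls, $limit) as $chunk) {
        $count++;
        $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
             . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
        foreach ($chunk as $url) {
            // Escape &, <, > and quotes so the XML stays well-formed.
            $loc = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
            $xml .= '  <url><loc>' . $loc . '</loc></url>' . "\n";
        }
        $xml .= '</urlset>' . "\n";
        file_put_contents($dir . 'sitemap_' . $count . '.xml', $xml);
    }
    return $count;
}
```

The return value plays the same role as $count_sitemaps. Note that this variant holds one chunk of XML in memory at a time, which trades a little memory for simpler code compared with streaming through fwrite.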
Step 4: Write SitemapIndex
Finally, write a sitemap index file that points to all the sitemaps you generated, using the same $count_sitemaps counter.
// Build the SitemapIndex
$h = fopen(PATH_TO_SITEMAP.'sitemap.xml', 'w+');
fwrite($h, '<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">');
for ($i = 1; $i <= $count_sitemaps; $i++) {
fwrite($h, '
<sitemap>
<loc>'.HTTP_TO_SITEMAP.'sitemap_'.$i.'.xml</loc>
<lastmod>'.date('c').'</lastmod>
</sitemap>
');
}
fwrite($h, '
</sitemapindex>');
fclose($h);
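If you prefer to build the index as a string first (for example to inspect it before saving), the same loop can be wrapped in a small function. This is a sketch only; buildSitemapIndex is a hypothetical name, and it assumes the files are named sitemap_1.xml to sitemap_N.xml as above:

```php
<?php
// Hypothetical helper: build the sitemap index XML for sitemap_1.xml .. sitemap_N.xml.
function buildSitemapIndex(string $baseUrl, int $count): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    for ($i = 1; $i <= $count; $i++) {
        // Escape the URL so characters such as & do not break the XML.
        $loc = htmlspecialchars($baseUrl . 'sitemap_' . $i . '.xml', ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <sitemap><loc>' . $loc . '</loc><lastmod>' . date('c') . '</lastmod></sitemap>' . "\n";
    }
    return $xml . '</sitemapindex>' . "\n";
}
```

The result could then be saved with file_put_contents(PATH_TO_SITEMAP.'sitemap.xml', buildSitemapIndex(HTTP_TO_SITEMAP, $count_sitemaps)).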
Step 5 (optional): ping Google
file_get_contents('https://www.google.com/webmasters/tools/ping?sitemap='.urlencode(HTTP_TO_SITEMAP.'sitemap.xml'));
Conclusions
As mentioned, this approach is a bit messy and doesn't follow best practices on code reuse, but it should make clear how to loop, store multiple sitemaps and finally build the sitemap index file.
For a better approach, I suggest you look at this open source Sitemap class.
CodePudding user response:
For the second sitemap, you should start reading the array returned by scandir at item 50001. It would be better, though, to split the work across several scripts run from crontab so you don't exhaust the server with excessive runtime; otherwise, increase the maximum execution time of PHP scripts.
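The offset idea can be sketched with array_slice, so that each run handles one batch of files. The function name sitemapBatch, the batch argument and the script name in the comment are hypothetical:

```php
<?php
// Hypothetical helper: return the file names that belong to batch $batch (1-based).
function sitemapBatch(array $files, int $batch, int $size = 50000): array
{
    // Drop ".", ".." and index.php, then reindex so offsets are contiguous.
    $files = array_values(array_diff($files, ['.', '..', 'index.php']));
    return array_slice($files, ($batch - 1) * $size, $size);
}

// Possible usage from the command line (script name is an assumption):
//   php build-sitemap.php 2    handles files 50001 to 100000
// $batch = (int) ($argv[1] ?? 1);
// set_time_limit(0); // lift the execution time limit for this run
// $files = sitemapBatch(scandir('cache-data'), $batch);
```

Filtering by name rather than unsetting fixed indexes avoids dropping a real data file when the directory listing order changes.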