PHP efficient XML library suitable for batch processing

Time:10-05

I'm looking for an XML library suited to batch processing. I'd like to limit how many records are read from an XML file and set an offset from which to start reading. I couldn't find anything relevant on GitHub, though my search terms may not have been accurate enough, and some libraries have no documentation at all, so it's hard to tell right off the bat. Maybe some of you have already used a library that does this and could share your findings.

Any help is appreciated

CodePudding user response:

Take a look at XMLReader, which works more like a cursor reading through the XML, and which you can terminate and close whenever you want.

For example, given this XML:

<building_data>
  <building address="some address" lat="28.902914" lng="-71.007235" />
  <building address="some address" lat="48.892342" lng="-75.0423423" />
  <building address="some address" lat="58.929753" lng="-79.1236987" />
</building_data>

Then read it like this:

$reader = new XMLReader();

if (!$reader->open("data.xml")) {
    die("Failed to open 'data.xml'");
}

while($reader->read()) {
  if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'building') {
    $address = $reader->getAttribute('address');
    $latitude = $reader->getAttribute('lat');
    $longitude = $reader->getAttribute('lng');

    // abort at the first "building" node
    break;
  }
}

$reader->close();
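
To get the offset and limit asked about in the question, the same loop can keep two counters instead of breaking at the first match. The following is only a sketch: `readBuildings()` is a hypothetical helper name, not part of XMLReader, and it loads the sample XML from a string via `XMLReader::XML()` so that it runs on its own.

```php
<?php
// Hypothetical helper (not an XMLReader method): skip the first $offset
// <building> elements, then collect at most $limit of them.
function readBuildings(string $xml, int $offset, int $limit): array
{
    $reader = new XMLReader();
    if (!$reader->XML($xml)) {
        throw new RuntimeException('Failed to load XML');
    }

    $rows = [];
    $seen = 0; // number of <building> elements encountered so far

    // stop reading as soon as the page is full
    while (count($rows) < $limit && $reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'building') {
            if ($seen++ < $offset) {
                continue; // still before the requested offset
            }
            $rows[] = [
                'address' => $reader->getAttribute('address'),
                'lat'     => $reader->getAttribute('lat'),
                'lng'     => $reader->getAttribute('lng'),
            ];
        }
    }

    $reader->close();
    return $rows;
}

$xml = <<<XML
<building_data>
  <building address="some address" lat="28.902914" lng="-71.007235" />
  <building address="some address" lat="48.892342" lng="-75.0423423" />
  <building address="some address" lat="58.929753" lng="-79.1236987" />
</building_data>
XML;

// skip one record, then read up to two
$page = readBuildings($xml, 1, 2);
```

For a real file you would call `$reader->open("data.xml")` instead of `XML()`. Because the loop stops once the limit is reached, the rest of the file is never parsed.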

CodePudding user response:

XMLReader mainly lets you keep memory consumption low, and it also has XMLReader::next() to iterate over a list of sibling elements. Combined with a few counters, this should be efficient and still maintainable.

Once you have moved to the right element, you can read data using the XMLReader methods or expand it to DOM. The right solution/balance depends on how complex the structure of the "list item" elements is.

$xmlUri = 'books.xml';

$reader = new XMLReader();
$reader->open($xmlUri);

$document = new DOMDocument();
$xpath = new DOMXPath($document);

// look for the first book element
while ($reader->read() && $reader->localName !== 'book') {
  continue;
}

$start = 50;
$end = $start + 10;
$offset = 0;

// while the reader is positioned on a book element
while ($reader->localName === 'book') {
  // stop once the end of the range is reached
  if ($offset >= $end) {
    break;
  }
  // skip records before the start offset
  if ($offset >= $start) {
    // expand to DOM and read data
    $book = $reader->expand($document);
    var_dump(
      $xpath->evaluate('string(title/@isbn)', $book),
      $xpath->evaluate('string(title)', $book)
    );
  }
  // increment offset counter
  $offset++;
  // move to the next book sibling
  $reader->next('book');
}
$reader->close();
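
If the offset/limit logic is needed in more than one place, it could be wrapped in a generator so that callers simply iterate over the selected records. This is a sketch under assumptions: `bookSlice()` is a hypothetical helper name, and the inline sample document (a `books` root with `book/title` children) is made up for illustration.

```php
<?php
// Hypothetical helper: lazily yields up to $limit <book> elements as DOM
// nodes, skipping the first $offset of them.
function bookSlice(XMLReader $reader, DOMDocument $document, int $offset, int $limit): Generator
{
    // advance to the first book element
    while ($reader->read() && $reader->localName !== 'book') {
        continue;
    }

    $index = 0;
    while ($reader->localName === 'book' && $index < $offset + $limit) {
        if ($index >= $offset) {
            // copy the current element into $document so XPath can be used on it
            yield $reader->expand($document);
        }
        $index++;
        // jump to the next book sibling, skipping the current subtree
        $reader->next('book');
    }
}

// Made-up sample data for illustration only
$xml = <<<XML
<books>
  <book><title isbn="111">A</title></book>
  <book><title isbn="222">B</title></book>
  <book><title isbn="333">C</title></book>
  <book><title isbn="444">D</title></book>
</books>
XML;

$reader = new XMLReader();
$reader->XML($xml);

$document = new DOMDocument();
$xpath = new DOMXPath($document);

$titles = [];
foreach (bookSlice($reader, $document, 1, 2) as $book) {
    $titles[] = $xpath->evaluate('string(title)', $book);
}
$reader->close();
```

The generator keeps the counters in one place, and the reader only advances as far as the caller actually iterates.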