I want to iterate over the objects in a bucket. I REALLY need to paginate this - we have hundreds of thousands of objects in the bucket. Our bucket looks like:
bucket/MLS ID/file 1
bucket/MLS ID/file 2
bucket/MLS ID/file 3
... etc
Simplest version of my code follows. I know the value I'm setting into $params['pageToken'] is wrong; I can't figure out how or where to get the right one. $file_objects is a 'Google\Cloud\Storage\ObjectIterator', right?
// temp: pages of 10, out of a total of 100. I really want pages of 100
// out of all (in my test bucket, I have about 700 objects)
$params = [
    'prefix' => $mls_id,
    'maxResults' => 10,
    'resultLimit' => 100,
    'fields' => 'items/id,items/name,items/updated,nextPageToken',
    'pageToken' => NULL
];
while ( $file_objects = $bucket->objects($params) )
{
    foreach ( $file_objects as $object )
    {
        print "NAME: {$object->name()}\n";
    }
    // I think that this might need to be encoded somehow?
    // or how do I get the requested nextPageToken???
    $params['pageToken'] = $file_objects->nextResultToken();
}
So - I don't understand maxResults vs resultLimit. It would seem that resultLimit would be the total that I want to see from my bucket, and maxResults the size of my page. But maxResults doesn't seem to affect anything, while resultLimit does.
maxResults = 100
resultLimit = 10
produces 10 objects.
maxResults = 10
resultLimit = 100
spits out 100 objects.
maxResults = 10
resultLimit = 0
dumps out all 702 in the bucket, with maxResults having no effect at all. And at no point does "$file_objects->nextResultToken();" give me anything.
What am I missing?
CodePudding user response:
The objects method automatically handles pagination for you. It returns an ObjectIterator object.
The resultLimit parameter limits the total number of objects to return across all pages. The maxResults parameter sets the maximum number to return per page.
If you use a foreach over the ObjectIterator object, it'll iterate through all objects, but note that there are also other methods in ObjectIterator, like iterateByPage.
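As a rough illustration of page-wise iteration, the sketch below wraps iterateByPage in a generator. The helper name listObjectNamesByPage is hypothetical, and the sketch assumes that $bucket is a Google\Cloud\Storage\Bucket and that each page handed back by iterateByPage is an iterable of objects exposing name().

```php
<?php
/**
 * Hypothetical helper: yields one array of object names per page.
 * Assumes $bucket behaves like Google\Cloud\Storage\Bucket.
 */
function listObjectNamesByPage($bucket, string $prefix, int $pageSize): \Generator
{
    $objects = $bucket->objects([
        'prefix'     => $prefix,
        'maxResults' => $pageSize, // page size; resultLimit is left unset
        'fields'     => 'items/name,nextPageToken',
    ]);

    // iterateByPage() yields whole pages instead of single objects.
    foreach ($objects->iterateByPage() as $page) {
        $names = [];
        // Guard in case a page comes back as NULL rather than an empty list.
        foreach ($page ?? [] as $object) {
            $names[] = $object->name();
        }
        yield $names;
    }
}
```

With a real bucket this could be consumed as foreach (listObjectNamesByPage($bucket, $mls_id, 100) as $names) { ... }, handling up to 100 names per pass.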
CodePudding user response:
Ok, I think I got it. I found the documentation far too sparse and misleading. The code I came up with:
$params = [
    'prefix' => $mls_id, // your prefix here
    'maxResults' => 100,
    //'resultLimit' => 0,
    'fields' => 'items/id,items/name,items/updated,nextPageToken',
    'pageToken' => NULL
];
// Note: setting 'resultLimit' to 0 does not work, I found the
// docs misleading. If you want all results, don't set it at all
// Fetch one page per pass. A do/while is used so that the last page
// (the one whose next result token is NULL) gets printed too.
do
{
    // Get the set of objects per those parameters
    $object_iterator = $bucket->objects($params);
    // in order to get the next result token, I had to get the current
    // object first. If you don't, nextResultToken() always returns
    // NULL
    $current = $object_iterator->current();
    $object_page_iterator = $object_iterator->iterateByPage();
    // the "?? []" is a guard in case the page comes back as NULL
    foreach ($object_page_iterator->current() ?? [] as $file_object)
    {
        print " -- {$file_object->name()}\n";
    }
    // here is where you pick up the page token - it is passed back in
    // on the next pass to get a new set of objects
    $next_result_token = $object_iterator->nextResultToken();
    print "NEXT RESULT TOKEN: {$next_result_token}\n";
    $params['pageToken'] = $next_result_token;
} while ($next_result_token);
This seems to work for me, so now I can get to the actual problem. Hope this helps someone.
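If the goal is to process one page at a time and resume later (say, across separate runs), the token handling described above can be folded into a small helper. This is only a sketch: fetchObjectPage is a hypothetical name, $bucket is assumed to be a Google\Cloud\Storage\Bucket, and it relies on the behaviour I observed that nextResultToken() is only populated after current() has been called.

```php
<?php
/**
 * Hypothetical helper: fetches a single page of object names plus the
 * token for the following page (NULL when there are no more pages).
 */
function fetchObjectPage($bucket, array $params, ?string $pageToken = null): array
{
    $params['pageToken'] = $pageToken;
    $iterator = $bucket->objects($params);

    // Touch the current object first so the page and token are loaded.
    $iterator->current();
    $page = $iterator->iterateByPage()->current() ?? [];

    $names = [];
    foreach ($page as $object) {
        $names[] = $object->name();
    }

    return [$names, $iterator->nextResultToken()];
}
```

A caller would start with a NULL token, store the returned token somewhere, and pass it back in on the next call until it comes back NULL.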