I have a google spreadsheet with around 3000
rows and I am trying to extract comments from this spreadsheet using the following code:
import requests
from apiclient import errors
from apiclient import discovery
from apiclient.discovery import build
from oauth2client.client import OAuth2WebServerFlow
import httplib2
CLIENT_ID = "xxxxxyyyy"
CLIENT_SECRET = "xxxxxxx"
OAUTH_SCOPE = "https://www.googleapis.com/auth/drive"
REDIRECT_URI = 'urn:ietf:wg:oauth:2.0:oob'
file-id = "zzzzzz"
def retrieve_comments(service, file_id):
"""Retrieve a list of comments.
Args:
service: Drive API service instance.
file_id: ID of the file to retrieve comments for.
Returns:
List of comments.
"""
try:
comments = service.comments().list(fileId=file_id).execute()
return comments.get('items', [])
except errors.HttpError as error:
print(f'An error occurred: {error}')
return None
# ...
flow = OAuth2WebServerFlow(CLIENT_ID,CLIENT_SECRET,OAUTH_SCOPE)
flow.redirect_uri = REDIRECT_URI
authorize_url = flow.step1_get_authorize_url()
print("Go to the following link in your web browser " authorize_url)
code = input("Enter verfication code : ").strip()
credentials = flow.step2_exchange(code)
http = httplib2.Http()
http = credentials.authorize(http)
service = build('drive', 'v2', http=http)
comments = retrieve_comments(service, file-id)
However, the length of the list comments
is only 20
whereas the spreadsheet surely contains more comments. Could someone explain which parameter I would need to tweak to retrieve all the comments in the spreadsheet? Thanks!
CodePudding user response:
In the current stage, the default value of maxResults
(Drive API v2) or pageSize
(Drive API v3) of "Comments: list" of Drive API v3 is 20. I thought that this might be the reason for your current issue of However, the length of the list comments is only 20 whereas the spreadsheet surely contains more comments.
. In this case, how about the following modification?
From:
comments = service.comments().list(fileId=file_id).execute()
return comments.get('items', [])
To:
From your script, when you want to use Drive API v2, please modify it as follows.
file_id = "###" # Please set your file ID.
res = []
page_token = None
while True:
obj = service.comments().list(fileId=file_id, pageToken=page_token, maxResults=100, fields="*").execute()
if len(obj.get("items", [])) > 0:
res = [*res, *obj.get("items", [])]
page_token = obj.get("nextPageToken")
if not page_token:
break
return res
When you want to use Drive API v3, please modify it as follows.
file_id = "###" # Please set your file ID.
res = []
page_token = None
while True:
obj = service.comments().list(fileId=file_id, pageToken=page_token, pageSize=100, fields="*").execute()
if len(obj.get("comments", [])) > 0:
res = [*res, *obj.get("comments", [])]
page_token = obj.get("nextPageToken")
if not page_token:
break
return res
- In this modification, the all comments in the Spreadsheet is returned as an array.