I have this sheet:
I need to update rows with some description and price based on the first column (ASIN)
This is my code:
from socket import timeout
from time import sleep
import asyncio
import os.path
from pyppeteer import launch
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/spreadsheets']
# The ID and range of a sample spreadsheet.
SAMPLE_SPREADSHEET_ID = '1TjOE0AHt7y_X6T3Cecq0oNBY6X5euQfcPru7q9VIH8s'
SAMPLE_RANGE_NAME = 'A:E'
class product:
def __init__(self, name, url, price):
self.name = name
self.url = url
self.price = price
async def main():
"""Shows basic usage of the Sheets API.
Prints values from a sample spreadsheet.
"""
creds = None
# The file token.json stores the user's access and refresh tokens, and is
# created automatically when the authorization flow completes for the first
# time.
if os.path.exists('token.json'):
creds = Credentials.from_authorized_user_file('token.json', SCOPES)
# If there are no (valid) credentials available, let the user log in.
if not creds or not creds.valid:
if creds and creds.expired and creds.refresh_token:
creds.refresh(Request())
else:
flow = InstalledAppFlow.from_client_secrets_file(
'credentials.json', SCOPES)
creds = flow.run_local_server(port=0)
# Save the credentials for the next run
with open('token.json', 'w') as token:
token.write(creds.to_json())
try:
service = build('sheets', 'v4', credentials=creds)
# Call the Sheets API
sheet = service.spreadsheets()
result = sheet.values().get(spreadsheetId=SAMPLE_SPREADSHEET_ID,
range=SAMPLE_RANGE_NAME).execute()
values = result.get('values', [])
if not values:
print('No data found.')
return
for row in values:
if len(row) < 5:
prod = product("test product", "https://www.amazon.es/dp/TESTASINNUMBER", "1,20$")
# update row
print("==================== INSERTING ====================")
print(row)
print(prod.name)
print(prod.price)
print(prod.url)
except HttpError as err:
print(err)
asyncio.get_event_loop().run_until_complete(main())
This is the terminal output:
So I want to loop every row and update some of them depending on some tests that will be made to the ASIN code
How can I update the current row I'm reading in google sheets?
CodePudding user response:
When I saw your script and your log, all values are the same. But from your question, I guessed that the value of prod
might be changed by the values of column "A" of the sheet.
If my guessing is correct, how about the following modification?
From:
SAMPLE_RANGE_NAME = 'A:E'
To:
SAMPLE_RANGE_NAME = 'A2:E'
And,
From:
if not values:
print('No data found.')
return
for row in values:
if len(row) < 5:
prod = product("test product", "https://www.amazon.es/dp/TESTASINNUMBER", "1,20$")
# update row
print("==================== INSERTING ====================")
print(row)
print(prod.name)
print(prod.price)
print(prod.url)
To:
res = []
if not values:
print('No data found.')
return
for row in values:
if row and len(row) < 5:
prod = product("test product", "https://www.amazon.es/dp/TESTASINNUMBER", "1,20$")
res.append([prod.name, prod.url, "", prod.price])
else:
res.append([""] * 4)
sheet.values().update(spreadsheetId=SAMPLE_SPREADSHEET_ID, range="B2", valueInputOption="USER_ENTERED", body={"values": res}).execute()
In the case of
SAMPLE_RANGE_NAME = 'A:E'
, the 1st header row is retrieved. From your showing sample sheet, I modified it toSAMPLE_RANGE_NAME = 'A2:E'
.When you want to use the column "A" of the sheet, in the case of your script,
row[0]
is the value of the column "A". Please be careful this.When I saw your showing sample sheet, the header row is
ASIN, Product name, PRODUCT LINK, Source, source price, Amazon price
. From your script,prod.name, prod.price, prod.url
might be the columns "B", "E" and "C", respectively. So by this my guess, I modified like the above. If you want to change the columns, please modify the above script.And, in this modification, the values are created in the loop. And, the created values are put to the sheet using Sheets API.
Note:
- From your provided information, I proposed the modified script. If this modified script cannot be used for your actual situation, in order to correctly understand your actual situation, it might be required to be more information.
Reference:
CodePudding user response:
Basically what I ended up doing is to create a pandas dataframe with my Google Sheets:
import gspread
from df2gspread import df2gspread as d2g
from gspread_dataframe import get_as_dataframe
# The ID and range of a sample spreadsheet.
SAMPLE_SPREADSHEET_ID = '1TjOE0AHt7y_X6T3Cecq0oNBY6X5euQfcPru7q9VIH8s'
SHEET_NAME = "AmazonPriceScrapper"
# Open an existing spreadsheet
gc = gspread.service_account()
worksheet = gc.open(SHEET_NAME).sheet1
# Get some columns back out
dataframe = get_as_dataframe(worksheet, usecols=[1,2,3,4,5,6],header=None, skiprows=1, include_index=False)
dataframe = dataframe.dropna(how='all')
columns = ["ASIN", "PRODUCT NAME", "PRODUCT LINK", "SOURCE", "SOURCE PRICE", "AMAZON PRICE"]
dataframe.columns = columns
After having a pandas dataframe, I just have to modify my dataframe:
# Make a list of ASIN codes
for index, value in enumerate(dataframe["ASIN"]):
# Modify value
dataframe.at[index,"PRODUCT NAME"]="my new product name"
Then we're good to upload it:
d2g.upload(dataframe, SAMPLE_SPREADSHEET_ID, SHEET_NAME)
I've finally ended up with this solution working as I've been struggling to update row by row and didn't manage to achieve it.
I personally also prefer this way, create the data locally then push.