I have this code that I'm trying to run, but I get an InvalidSchema error:
#for index, row in df.iterrows():
#    print(index, row["Data"])
for offset in df.apply(lambda row: row["Data"], axis=1):
    response = requests.get(df["Data"])
    print('url:', response.url)
This is my dataframe: each row of the Data column holds a group of links per page (10 per page), and there are two index rows, so 20 links in total:

Data
0    [http://www.mercadopublico.cl/Procurement/Modu...
1    [http://www.mercadopublico.cl/Procurement/Modu...
I want this code to run for every group of 10 links, scrape them and get the data, then go on to the next group, with all of the scraped data collected into one table. But I can't make requests get the URLs from inside the dataframe.
I get this message:
InvalidSchema: No connection adapters were found for '0 [http://www.mercadopublico.cl/Procurement/Modu...\n1 [http://www.mercadopublico.cl/Procurement/Modu...\nName: Data, dtype: object'
Do you have any advice for this? Best regards.
I think it would also help to fuse both index rows into one, but I'm not sure how to do it. I searched a lot but couldn't find anything; I tried some references to np.array, but that didn't work.
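For reference, a minimal sketch of the situation (the URLs below are placeholders, not the real procurement links; the real Data column is assumed to hold a list of URL strings per row). Passing the whole column to requests.get() stringifies the Series, which is exactly the text seen in the InvalidSchema message:

```python
import pandas as pd

# Hypothetical reconstruction: each row of "Data" holds a *list* of URLs
page1 = ["http://example.com/page1/link%d" % i for i in range(10)]
page2 = ["http://example.com/page2/link%d" % i for i in range(10)]
df = pd.DataFrame({"Data": [page1, page2]})

# requests.get(df["Data"]) receives the string form of the Series,
# "0    [http://... \n1    [http://... \nName: Data, dtype: object",
# which is not a valid URL, hence "No connection adapters were found"
print(str(df["Data"]))
```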
CodePudding user response:
Just to answer, because I solved it: never store URLs in a dataframe if you are going to scrape them later. Instead of making a dataframe resultsurl, store them as a list, resultsurl = list(), and then iterate over the list, for url in resultsurl: (in this case the list is called resultsurl).

Thanks.