I have defined a class "Scraper" and the method "scraping" contained in it outputs a list with price information ("results"). My objects are several online shops, for which I have defined the respective attributes. Currently I only manage to get a separate DataFrame with the price information for each online shop with separate queries (see code snippet).
def main():
    online_shop_1 = Scraper('Attr_1', 'Attr_2', …)
    online_shop_2 = Scraper('Attr_1', 'Attr_2', …)
    online_shop_1.scraping()
    df_results = pd.DataFrame(results)
    print(df_results)
    online_shop_2.scraping()
    df_results = pd.DataFrame(results)
    print(df_results)
I would like to loop over all online shops and directly get one DataFrame containing the price information of all of them. I guess this question has already been asked and answered in a similar form; however, as a beginner, I have not yet managed to apply those solutions to my problem. I would therefore be very grateful for any support.
CodePudding user response:
The following code did the trick:
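# Presumably at the end of Scraper.scraping(): publish the results as a global DataFrame
global df_results
df_results = pd.DataFrame(results)
def main():
    online_shop_1 = Scraper('Attr_1', 'Attr_2', …)
    online_shop_2 = Scraper('Attr_1', 'Attr_2', …)
    complete_results = []
    shops = [online_shop_1, online_shop_2]
    for shop in shops:
        shop.scraping()  # refreshes the global df_results
        complete_results.append(df_results)
    # Stack the per-shop DataFrames into one, renumbering the rows
    complete_results = pd.concat(complete_results, ignore_index=True)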
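For anyone who would rather avoid the global variable, here is a minimal sketch of the same loop in which scraping() returns its list instead. The Scraper class below is only a hypothetical stand-in with canned data (the real class and its attributes are not shown in the question), and shop_name, collect_prices and the "shop" column are names I made up for illustration.

```python
import pandas as pd


class Scraper:
    """Hypothetical stand-in for the asker's class; the real one scrapes a shop."""

    def __init__(self, shop_name, prices):
        self.shop_name = shop_name
        self._prices = prices  # canned data instead of a real request

    def scraping(self):
        # Return the results instead of writing them to a global variable.
        return [{"product": name, "price": price} for name, price in self._prices]


def collect_prices(shops):
    """Build one DataFrame holding the rows of every shop."""
    frames = []
    for shop in shops:
        df = pd.DataFrame(shop.scraping())
        df["shop"] = shop.shop_name  # remember which shop each row came from
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


shops = [
    Scraper("shop_1", [("widget", 9.99), ("gadget", 24.50)]),
    Scraper("shop_2", [("widget", 10.49)]),
]
all_prices = collect_prices(shops)
print(all_prices)
```

Returning the list keeps each call independent of the others, and the extra "shop" column makes it possible to tell the rows apart after pd.concat.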
CodePudding user response:
I don't fully understand your code. For example, is the results variable the output of the .scraping() method and, if so, is it assigned globally?
But maybe you can try a simple for loop in which you call the scraping method and add the results to a previously created DataFrame.
def main():
    online_shop_1 = Scraper('Attr_1', 'Attr_2', …)
    online_shop_2 = Scraper('Attr_1', 'Attr_2', …)
    df = pd.DataFrame()
    shops = {'shop1': online_shop_1, 'shop2': online_shop_2}
    for shop_name, shop in shops.items():
        shop.scraping()
        df[shop_name] = results
This should give you one dataframe with a column for each shop.
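One caveat: assigning a plain list as a column only works if every shop yields the same number of prices, otherwise pandas raises a ValueError about mismatched lengths. A possible workaround, sketched below with made-up price lists in place of the real scraping output, is to wrap each list in a pd.Series so that shorter columns are padded with NaN.

```python
import pandas as pd

# Hypothetical price lists, standing in for what each shop's scraping() produces.
results_by_shop = {
    "shop1": [9.99, 24.50, 5.00],
    "shop2": [10.49, 23.00],
}

# Series are aligned on their index, so the shorter column is filled with NaN.
df = pd.DataFrame({name: pd.Series(prices) for name, prices in results_by_shop.items()})
print(df)
```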