Webscraping, appending multiple values to a single row in a list

Time:11-24

I'm trying to figure out how to append several values to a list correctly. The page I'm scraping is a food blog. For each recipe I want to retrieve the title and all of the recipe keys (gluten free, vegan, dairy free, vegetarian, etc.) associated with it. I can retrieve the information from the page, but I'm having trouble appending several recipe keys to a single row of a list: if the first recipe on the page is both dairy free and gluten free, I can't append both keys so that they stay on the same row as that recipe. Here is a piece of my code so you can see what I'm working with. I appreciate the help; thanks in advance.

recipe = []
key = []


for page in pages:
    page = requests.get('https://www.skinnytaste.com/page/' + str(page) + '/')
    soup = BeautifulSoup(page.text, 'html.parser')
    recipes = soup.find_all('article', class_='post teaser-post odd')
    recipes.extend(soup.find_all('article', class_='post teaser-post even'))
    sleep(randint(2, 8))

for r in recipes:
    
    titles = r.h2.text
    recipe.append(titles)
    print(titles)
    
    
    post_meta = r.find('div', class_='post-meta')                                             
    icons = post_meta.find('div', class_='icons')
    if not (post_meta.find('div', class_='icons') is None):
        keys = icons.find_all('span')
        for k in keys:
            recipe_key = k.find('a').find('img').get('alt')
            key.append(recipe_key) 
            print(recipe_key)

Answer:

Initialize an empty list called rows. For each recipe, create a dictionary that holds the title, then add one entry per recipe key (key_01, key_02, ...), since some recipes have more keys than others. Append that dictionary to rows. pandas can then build the table from the list of dictionaries, filling NaN wherever a recipe has no value for a given key column.
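To see the pattern in isolation, here is a minimal offline sketch with no scraping involved. The `scraped` list is made-up sample data standing in for what the scraper would collect, and the recipe names in it are placeholders, not real results.

```python
import pandas as pd

# Hypothetical sample data: (title, list of recipe keys) pairs.
# In the real script these values come from the parsed HTML.
scraped = [
    ("Recipe A", ["Gluten Free"]),
    ("Recipe B", []),
    ("Recipe C", ["Dairy Free", "Gluten Free", "Vegetarian Meals"]),
]

rows = []
for title, keys in scraped:
    # One dictionary per recipe; the title is always present.
    row = {"Title": title}
    # Add one numbered entry per key, so rows can differ in length.
    for count, recipe_key in enumerate(keys, start=1):
        row["key_%.2d" % count] = recipe_key
    rows.append(row)

# pandas takes the union of all dictionary keys as columns
# and fills the missing cells with NaN.
results = pd.DataFrame(rows)
print(results.to_string())
```

The full scraping version of the same pattern follows.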

import requests
import pandas as pd
from bs4 import BeautifulSoup
from time import sleep
from random import randint


headers = {'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Mobile Safari/537.36'}

rows = []
pages = range(1,5)
for page in pages:
    response = requests.get('https://www.skinnytaste.com/page/' + str(page) + '/', headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    recipes = soup.find_all('article', class_='post teaser-post odd')
    recipes.extend(soup.find_all('article', class_='post teaser-post even'))
    sleep(randint(2, 8)) 
    
    for r in recipes:
        
        titles = r.h2.text

        print(titles)
        row = {'Title':titles}
        
        
        post_meta = r.find('div', class_='post-meta')                                             
        icons = post_meta.find('div', class_='icons')
        if icons is not None:
            keys = icons.find_all('span')
            for count, k in enumerate(keys, start=1):
                recipe_key = k.find('a').find('img').get('alt')
                row.update({'key_%.2d' %count: recipe_key})
                print(recipe_key)
                
        rows.append(row)
        
results = pd.DataFrame(rows)

Output:

print(results.to_string())
                                                                Title            key_01             key_02            key_03                   key_04               key_05 key_06            key_07
0   Baked Pumpkin Pasta with Pancetta, Gruyere, Kale, and White Beans       Gluten Free                NaN               NaN                      NaN                  NaN    NaN               NaN
1                                        Mom’s Stuffing, Lightened Up               NaN                NaN               NaN                      NaN                  NaN    NaN               NaN
2                         Roasted Green Beans with Caramelized Onions        Dairy Free        Gluten Free  Vegetarian Meals         Whole 30 Recipes                  NaN    NaN               NaN
3                            7 Day Healthy Meal Plan (November 22-28)               NaN                NaN               NaN                      NaN                  NaN    NaN               NaN
4                                             Makeover Spinach Gratin       Gluten Free       Kid Friendly          Low Carb         Vegetarian Meals                  NaN    NaN               NaN
5                            Turkey Pot Pie with Sweet Potato Topping       Gluten Free       Kid Friendly               NaN                      NaN                  NaN    NaN               NaN
6                     Sautéed Shredded Brussels Sprouts with Pancetta        Dairy Free        Gluten Free      Keto Recipes             Kid Friendly             Low Carb  Paleo  Under 30 Minutes
7                    Baked Brie Phyllo Cups with Craisins and Walnuts  Under 30 Minutes   Vegetarian Meals               NaN                      NaN                  NaN    NaN               NaN
8                      Chicken Cassoulet with Sausage and Swiss Chard        Dairy Free      Freezer Meals       Gluten Free                      NaN                  NaN    NaN               NaN
9                                   Drunken Style Noodles with Shrimp        Dairy Free        Gluten Free               NaN                      NaN                  NaN    NaN               NaN
10                              Chicken and Broccoli Noodle Casserole      Kid Friendly                NaN               NaN                      NaN                  NaN    NaN               NaN
11               Arugula Salmon Salad with Capers and Shaved Parmesan       Gluten Free       Keto Recipes          Low Carb         Under 30 Minutes                  NaN    NaN               NaN
12                              Roasted Acorn Squash with Brown Sugar        Dairy Free        Gluten Free  Vegetarian Meals                      NaN                  NaN    NaN               NaN
13                                 Turkey Cutlets with Parmesan Crust      Kid Friendly   Under 30 Minutes               NaN                      NaN                  NaN    NaN               NaN
14                          Butternut Squash Ravioli with Sage Butter  Vegetarian Meals                NaN               NaN                      NaN                  NaN    NaN               NaN
15                Air Fryer Chicken Milanese with Mediterranean Salad         Air Fryer        Gluten Free  Under 30 Minutes                      NaN                  NaN    NaN               NaN
16                                Salisbury Steak with Mushroom Gravy        Dairy Free      Freezer Meals      Kid Friendly                 Low Carb     Under 30 Minutes    NaN               NaN
17                                                   Huevos Rancheros       Gluten Free   Under 30 Minutes  Vegetarian Meals                      NaN                  NaN    NaN               NaN
18                Easy Black Bean Vegetarian Chili with Spiced Yogurt       Gluten Free       Kid Friendly  Under 30 Minutes         Vegetarian Meals                  NaN    NaN               NaN
19                                                      Apple Cobbler  Vegetarian Meals                NaN               NaN                      NaN                  NaN    NaN               NaN
20                Tofu Stir Fry with Vegetables in a Soy Sesame Sauce        Dairy Free        Gluten Free  Under 30 Minutes         Vegetarian Meals                  NaN    NaN               NaN
21                        Autumn Apple and Grape Medley (Fruit Salad)       Gluten Free       Kid Friendly  Under 30 Minutes         Vegetarian Meals                  NaN    NaN               NaN
22                                       Chicken Cutlet Caprese Salad       Gluten Free  Meal Prep Recipes               NaN                      NaN                  NaN    NaN               NaN
23                                             Beef Stew with Pumpkin        Dairy Free      Freezer Meals      Kid Friendly  Pressure Cooker Recipes  Slow Cooker Recipes    NaN               NaN
24                                       Pumpkin Cream Cheese Muffins     Freezer Meals       Kid Friendly  Vegetarian Meals                      NaN                  NaN    NaN               NaN
25                                         Pumpkin Pie Overnight Oats        Dairy Free        Gluten Free      Kid Friendly         Vegetarian Meals                  NaN    NaN               NaN
26                                          Strawberry Cheesecake Dip       Gluten Free       Kid Friendly  Under 30 Minutes                      NaN                  NaN    NaN               NaN
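If a variable number of key_NN columns is awkward to work with, one possible alternative (not what the answer above does) is to keep every key for a recipe in a single field on that recipe's row. The sketch below again uses made-up sample data in `scraped`, and the `Keys` field name is an assumption chosen for illustration.

```python
# Hypothetical sample data: (title, list of recipe keys) pairs.
scraped = [
    ("Recipe A", ["Gluten Free"]),
    ("Recipe B", []),
    ("Recipe C", ["Dairy Free", "Gluten Free", "Vegetarian Meals"]),
]

rows = []
for title, keys in scraped:
    # Join all keys into one comma-separated string so each recipe
    # occupies exactly one row with exactly two fields.
    rows.append({"Title": title, "Keys": ", ".join(keys)})

for row in rows:
    print(row["Title"], "|", row["Keys"])
```

This keeps the table at a fixed width, at the cost of having to split the `Keys` string again when filtering by an individual key.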