rvest: web-scraping URLs from a drop-down menu on a website without pages, in R

Time:08-29


I'm trying to extract the URLs from the drop-down menu of McDonald's products. Any ideas?

CodePudding user response:

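This is how you can do it with rvest: read the page once, then select the category titles and the category links by their CSS classes.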

library(tidyverse)
library(rvest)

# Download and parse the product overview page once
page <- "https://www.mcdonalds.com/de/de-de/produkte/alle-produkte.html" %>%
  read_html()

tibble(
  # Visible label of each category in the menu
  category = page %>%
    html_elements(".category-title") %>%
    html_text2(),
  # The href values are site-relative, so prepend the domain
  links = page %>%
    html_elements(".category-link") %>%
    html_attr("href") %>%
    str_c("https://www.mcdonalds.com", .)
)

# A tibble: 12 x 2
   category                links                                                
   <chr>                   <chr>                                                
 1 Beliebte Produkte       https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 2 Highlights              https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 3 McMenü®                 https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 4 Burger                  https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 5 McNuggets® & Fingerfood https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 6 Vegan, Veggie & Co.     https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 7 Happy Meal®             https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 8 Beilagen & Extras       https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
 9 Frühstück               https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
10 Getränke                https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
11 Desserts                https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
12 McCafé®                 https://www.mcdonalds.com/de/de-de/produkte/alle-pro~
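
If you want to reuse this, a slightly more defensive variant might look like the sketch below. It is only an illustration under some assumptions: the helper name extract_category_links() is made up, the mock HTML and its example paths are invented so the code runs without hitting the site, and the real page markup may differ. It uses xml2::url_absolute() instead of str_c(), so hrefs that are already absolute are left alone, and it stops early if the number of titles and links differ, since tibble() would otherwise fail with a less helpful error.

```r
library(rvest)   # html_elements(), html_attr(), html_text2(), minimal_html()
library(xml2)    # url_absolute()
library(tibble)  # tibble()

# Hypothetical helper: pair category titles with absolute links.
# The CSS classes are the ones used in the answer above.
extract_category_links <- function(page, base_url) {
  titles <- html_text2(html_elements(page, ".category-title"))
  hrefs  <- html_attr(html_elements(page, ".category-link"), "href")

  # Fail early if the two selectors do not line up one-to-one
  if (length(titles) != length(hrefs)) {
    stop("Found ", length(titles), " titles but ", length(hrefs),
         " links; check the selectors.")
  }

  # Resolve relative hrefs against the page URL; absolute ones stay unchanged
  tibble(category = titles, links = url_absolute(hrefs, base_url))
}

# Invented stand-in markup (not the real page) so the sketch runs offline
mock_page <- minimal_html('
  <ul>
    <li><a class="category-link" href="/de/de-de/example/burger.html">
      <span class="category-title">Burger</span></a></li>
    <li><a class="category-link"
           href="https://www.mcdonalds.com/de/de-de/example/desserts.html">
      <span class="category-title">Desserts</span></a></li>
  </ul>
')

result <- extract_category_links(
  mock_page,
  "https://www.mcdonalds.com/de/de-de/produkte/alle-produkte.html"
)
print(result)
```

For the live site you would pass the output of read_html() for the page URL as the first argument and the same URL as base_url.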