Requests triggering Cloudflare security

Time:09-22

The code that has the problem:

import requests
from bs4 import BeautifulSoup

def populate_dictionary(data):
    # Function to populate a dictionary with the info for each manga in the top 10
    stem_link = 'https://www.anime-planet.com'    # Since we have the href we can use this stem link to find the page for each manga
    rank = list(range(1, 11))                     # Sets the lists we will use to populate the dictionary
    name, descr, chp = [], [], []
    current_page = ""
    page = requests.get('https://www.anime-planet.com/manga/top-manga/week')
    soup = BeautifulSoup(page.text, 'html.parser')
    print(soup.text)
    for link in soup.select('td.tableTitle > a.tooltip'):
        # Each row in the table has a link to the manga's info page
        current_page = requests.get(stem_link + str(link.get('href')))
        soup2 = BeautifulSoup(current_page.text, 'html.parser')
        if len(name) < 10:                                             # If statement that cuts out after 10 mangas
            if not soup2.select('a[href^="/manga/tags/webtoons"]'):    # Only enters if it's not under the Webtoon tag
                name.append(soup2.find('h1').get_text())               # Finds the name
                for p in soup2.select('div.synopsisManga p'):          # Finds the description
                    descr.append(p.getText())
                # Finds how many chapters the manga has
                chp.append(soup2.select_one(".entryBar > div").text.split()[-1].strip(" "))
    data['Rank'] = rank                           # Populates the dictionary
    data['Name'] = name
    data['Description'] = descr
    data['Amount of Chapters'] = chp
    return data

Originally I thought the for loop was the problem, because len(soup.select('td.tableTitle > a.tooltip')) was returning 0. Then another user said it was returning 100 for them. That makes sense, because the script was working yesterday when I wrote it. I then printed out the soup object, and it returns:

    Checking if the site connection is secure

    Enable JavaScript and cookies to continue

    www.anime-planet.com needs to review the security of your connection before proceeding.

Why is Cloudflare blocking the requests/soup? Not even cloudscraper works as a fix. It was working yesterday, and it works for other people using my code.
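A quick way to tell this situation apart from a broken selector is to check the response body for the challenge text before parsing it. The following is only a minimal sketch: the helper name looks_like_challenge is hypothetical, and the marker strings are simply the phrases from the printed output above, so they may not match every challenge page.

```python
# Phrases copied from the challenge page printed above (an assumption:
# other challenge variants may use different wording).
CHALLENGE_MARKERS = (
    "Checking if the site connection is secure",
    "Enable JavaScript and cookies to continue",
)


def looks_like_challenge(body):
    """Return True if the page text contains one of the challenge phrases."""
    return any(marker in body for marker in CHALLENGE_MARKERS)
```

Calling looks_like_challenge(page.text) right after requests.get(...) shows whether an empty soup.select(...) result comes from the challenge page or from the selector itself.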

CodePudding user response:

The webpage isn't under Cloudflare protection; it's working fine. I answered so quickly because you deleted your previous, similar question just before I could post my answer to it.

import pandas as pd
from bs4 import BeautifulSoup
import requests

url = 'https://www.anime-planet.com/manga/top-manga/week'

data = []

soup = BeautifulSoup(requests.get(url).text,'lxml')

for u in ['https://www.anime-planet.com' + a.get('href') for a in soup.select('.tooltip')]:
    #print(u)
    soup2 = BeautifulSoup(requests.get(u).text,'lxml')
    d = {
        'Name':soup2.h1.text,
        'Rank':soup2.select_one('.pure-g.entryBar div:nth-of-type(5)').text.strip().replace('Rank #',''),
        'Amount of Chapters':soup2.select_one('.pure-g.entryBar div').text.strip(),
        'Description':soup2.select_one('.synopsisManga p').get_text(strip=True) if soup2.select_one('.synopsisManga p') else None

    }
   
    data.append(d)
df = pd.DataFrame(data)
print(df)

Output:

      Name   Rank    Amount of Chapters                                        Description
0      The Beginning After the End      4              Ch: 160   King Grey has unrivaled strength, wealth, and ...  
1                        One Piece      3  Vol: 103 ; Ch: 1060   Long ago the infamous Gol D. Roger was the str...  
2                Omniscient Reader     18              Ch: 125   Back then, Dokja had no idea. He had no idea h...  
3                Teenage Mercenary     45              Ch: 102   At the age of eight, Ijin Yu lost his parents ...  
4            The Swordmaster's Son    467               Ch: 38   Jin Runcandel was destined to become the head ...  
..                             ...    ...                   ...                                                ...  
95        Infinite Leveling: Murim  1,211              Ch: 129   Killed on the battlefield without glory to his...  
96               The Crow's Prince    839               Ch: 51   Life is about survival of the fittest... But w...  
97          Not-Sew-Wicked Stepmom  1,550               Ch: 76   Fairytale stepmothers are notoriously wicked. ...  
98                   Solitary Lady    613               Ch: 73   Noblewoman Hillis Inoaden has had many lives s...  
99  Demon Slayer: Kimetsu no Yaiba     23      Vol: 23; Ch: 205  The setting is Taisho era Japan. Tanjirou is a...  

[100 rows x 4 columns]

CodePudding user response:

You are trying to access a protected site. Avoid plain, common requests and use less conventional approaches instead:

  • use cookies from your browser in the request
  • try Splash (from Scrapinghub) to evaluate the JavaScript
  • use proxies
  • slow down your for loop with sleep, or randomize the delay
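
Three of these points (browser cookies, proxies, and a randomized pause) can be sketched roughly as below. This is an illustration under assumptions, not a guaranteed way past the challenge: build_request_kwargs and fetch_slowly are hypothetical helper names, the delay bounds are arbitrary, and the only requests-specific part is that requests.get accepts the headers, cookies, proxies and timeout keyword arguments.

```python
import random
import time


def build_request_kwargs(user_agent, cookies=None, proxy=None):
    """Assemble keyword arguments that requests.get() accepts."""
    kwargs = {"headers": {"User-Agent": user_agent}, "timeout": 30}
    if cookies:
        # Name/value pairs copied by hand from a browser session (placeholders).
        kwargs["cookies"] = dict(cookies)
    if proxy:
        # Route both schemes through the same proxy URL.
        kwargs["proxies"] = {"http": proxy, "https": proxy}
    return kwargs


def fetch_slowly(urls, get, min_delay=2.0, max_delay=5.0, sleep=time.sleep, **kwargs):
    """Call get(url, **kwargs) per URL, pausing a random interval between calls."""
    responses = []
    for i, url in enumerate(urls):
        if i:  # no pause before the first request
            sleep(random.uniform(min_delay, max_delay))
        responses.append(get(url, **kwargs))
    return responses
```

With requests imported, a call would look like fetch_slowly(urls, requests.get, **build_request_kwargs(user_agent, cookies=browser_cookies)), where user_agent and browser_cookies are values you copy from your own browser.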

You get the idea. Good luck.

link to Cloudflare antibot solution

CodePudding user response:

There is probably no need for a separate network call for each manga title: the only information needed is rank, title, chapters_count, synopsis, and tags, and all of it is in the original page.

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

url = 'https://www.anime-planet.com/manga/top-manga/week'

big_list = []
r = requests.get(url)
soup = bs(r.text)
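# The attribute filter inside 'table[]' was lost in the source; this assumes the ranking table is the first <table> on the page
mangas = soup.select_one('table').select_one('tbody').select('tr')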

for m in mangas:
    rank = m.select_one('td.tableRank').text.strip()
    title = m.select_one('td.tableTitle').text.strip()
    chapters_count = m.select_one('select[name="chapters"]').get('data-eps')
    synopsis = ' '.join([x.get_text(strip=True) for x in bs(m.select_one('td.tableTitle').select_one('a').get('title')).select_one("div[class='pure-2-3']").select('p')])
    tags = ', '.join([x.get_text(strip=True) for x in bs(m.select_one('td.tableTitle').select_one('a').get('title')).select_one("div[class='tags']").select('li')])
    
    big_list.append((rank, title, chapters_count, synopsis, tags))
df = pd.DataFrame(big_list, columns=['Rank', 'Title', 'Chapters', 'Synopsis', 'Tags'])
nonwebtoonsdf = df[~df["Tags"].str.contains("Webtoons")]
print('total non-webtoons manga: ', nonwebtoonsdf.shape[0])
print(nonwebtoonsdf)

Result printed in terminal:

total non-webtoons manga:  14
Rank    Title   Chapters    Synopsis    Tags
1   2   One Piece   1060    Long ago the infamous Gol D. Roger was the strongest and most powerful pirate on the seas. As he was about to be executed he revealed that he hid all of his wealth, including the legendary treasure known as One Piece, on an island at the end of the Grand Line - a treacherous and truly unpredictable sea. Monkey D. Luffy is a spirited, energetic and somewhat dim-witted young man with a very big dream: to find One Piece and become the Pirate King! However Luffy is no ordinary boy, as when he was younger he ate one of the Devil's Fruits and gained its power to become a Rubber Man. Now in this grand age of pirates Luffy sets out to gather a crew and sail to the most dangerous sea in the world so that he can fulfill his dream... and maybe even his appetite!   Action, Adventure, Comedy, Fantasy, Shounen, Pirates, Superpowers, Adapted to Anime
7   8   Chainsaw Man    104 Denji's life of poverty is changed forever when he merges with his pet chainsaw dog, Pochita! Now he's living in the big city and an official Devil Hunter. But he's got a lot to learn about his new job and chainsaw powers! Source: MANGA Plus   Action, Fantasy, Horror, Shounen, Dark Fantasy, Demons, Emotional Abuse, Explicit Violence, Adapted to Anime
12  13  SPY x FAMILY    68  Master spy Twilight is the best at what he does when it comes to going undercover on dangerous missions in the name of a better world. But when he receives the ultimate impossible assignment—get married and have a kid—he may finally be in over his head! Not one to depend on others, Twilight has his work cut out for him procuring both a wife and a child for his mission to infiltrate an elite private school. What he doesn’t know is that the wife he’s chosen is an assassin and the child he’s adopted is a telepath! Source: VIZ    Action, Comedy, Shounen, Assassins, Espionage, Family Life, Psychic Powers, Secret Identity, Drug Use, Violence, Adapted to Anime
16  17  My Hero Academia    366 Middle school student Izuku Midoriya wants to be a hero more than anything, but he hasn’t got an ounce of power in him. With no chance of ever getting into the prestigious U.A. High School for budding heroes, his life is looking more and more like a dead end. Then an encounter with All Might, the greatest hero of them all, gives him a chance to change his destiny… Source: Viz  Action, Comedy, Fantasy, Sci Fi, Shounen, School Life, Superheroes, Superpowers, Weak to Strong, Emotional Abuse, Explicit Violence, Mature Themes, Physical Abuse, Adapted to Anime, Domestic Abuse
25  26  Black Clover    338 Asta is a young boy who dreams of becoming the greatest mage in the kingdom. Only one problem—he can’t use any magic! Luckily for Asta, he receives the incredibly rare five-leaf clover grimoire that gives him the power of anti-magic. Can someone who can’t use magic really become the Wizard King? One thing’s for sure—Asta will never give up! Source: Viz  Action, Adventure, Comedy, Drama, Fantasy, Shounen, Childhood Promise, Guilds, Hiatus, Magic, Adapted to Anime
26  27  Jujutsu Kaisen  197 Although Yuji Itadori looks like your average teenager, his immense physical strength is something to behold! Every sports club wants him to join, but Itadori would rather hang out with the school outcasts in the Occult Research Club. One day, the club manages to get their hands on a sealed cursed object. Little do they know the terror they’ll unleash when they break the seal… Source: Viz Action, Horror, Shounen, Body Sharing, Curse, Exorcists, Monsters, Non-Human Protagonists, School Life, Supernatural, Explicit Violence, Adapted to Anime
30  31  Tokyo Revengers 270 Watching the news, Takemichi Hanagaki learns that his girlfriend from way back in middle school, Hinata Tachibana, has died. The only girlfriend he ever had was just killed by a villainous group known as the Tokyo Manji Gang. He lives in a crappy apartment with thin walls, and his six-years-younger boss treats him like an idiot. Plus, he’s a complete and total virgin … At the height of his rock-bottom life, he suddenly time-leaps twelve years back to his middle school days!! To save Hinata, and change the life he spent running away, hopeless part-timer Takemichi must aim for the top of Kanto’s most sinister delinquent gang!! Source: Kodansha   Action, Drama, Shounen, Age Transformation, Delinquents, Gangs, Second Chance, Time Travel, Adapted to Anime
43  44  My Dress-Up Darling 81  Traumatized by a childhood incident with a friend who took exception to his love of traditional dolls, doll-artisan hopeful Wakana Gojou passes his days as a loner, finding solace in the home ec room at his high school. To Wakana, people like beautiful Marin Kitagawa, a trendy girl who's always surrounded by a throng of friends, is practically an alien from another world. But when cheerful Marin--never one to be shy--spots Wakana sewing away one day after school, she barges in with the aim of roping her quiet classmante into her secret hobby: cosplay! Will Wakana's wounded heart be able to handle the invasion from this sexy alien?! Source: Square Enix Comedy, Ecchi, Romance, Seinen, Gyaru, Otaku Culture, Panty Shots, School Life, Nudity, Adapted to Anime
50  51  Kaiju No. 8 70  With the highest kaiju-emergence rates in the world, Japan is no stranger to attack by deadly monsters. Enter the Japan Defense Force, a military organization tasked with the neutralization of kaiju. Kafka Hibino, a kaiju-corpse cleanup man, has always dreamed of joining the force. But when he gets another shot at achieving his childhood dream, he undergoes an unexpected transformation. How can he fight kaiju now that he’s become one himself?! Source: VIZ Media   Action, Comedy, Horror, Sci Fi, Shounen, Kaijuu, Military, Monsters, Non-Human Protagonists, Overpowered Main Characters, Secret Identity
55  56  Tales of Demons and Gods    397 Because of the Space-Time Demon Spirit Book, time and space underwent a reversal. Though he was supposed to have died to demon beasts, Nie Li found himself sitting in a classroom when he opened his eyes. He had gone back to the age of thirteen. Now that everything has begun anew... how will he protect his beloved? Source: Bilibili Comics Action, Comedy, Fantasy, Manhua, Age Transformation, Cultivation, Demons, Full Color, Martial Arts, Second Chance, Time Travel, Weak to Strong, Xianxia, Based on a Web Novel
71  72  Berserk 385 Born beneath the gallows tree from which his dead mother hung, Guts has always existed on the boundary between life and death. After enduring a terrible childhood, he spends his adulthood in brutal combat, pitting his strength against others in order to build his own. Life is simple enough for Guts until he meets Griffith, the inspirational, ambitious, and beautiful leader of the mercenaries, the Band of the Hawk. When Guts loses to Griffith in a duel, he is forced to join the Band of the Hawk, and, despite himself, finds a sense of camaraderie and belonging amongst them. However, as Griffith leads his soldiers from victory to victory, the bloody wars and underhanded politics reveal a side to him that nobody quite expected. Very soon, what seems like a straightforward march for conquest becomes a harrowing struggle for humanity and life itself. Can Guts, a simple warrior, defend those who have come to mean the most to him, all the while struggling not to lose to the darkness he has carried with him his entire life?  Action, Fantasy, Seinen, Dark Fantasy, Demons, Hiatus, Medieval, Overpowered Main Characters, Revenge, Swordplay, Explicit Sex, Explicit Violence, Mature Themes, Physical Abuse, Sexual Abuse, Adapted to Anime
74  75  One-Punch Man   168 In a city plagued with thugs, mutants, and supervillains, Saitama decides to become a superhero for fun. He envisions an exciting life where he is constantly challenged with tough opponents, but after three years of intense training, he's become so strong that he defeats every enemy with one punch! His dream of engaging challenging foes has gone up in smoke, and his overpowered life is filled with overpowering boredom. Then a cyborg named Genos learns about Saitama's amazing ability and begs him to make him his disciple. Saitama isn't interested in taking on an apprentice, but Genos isn't giving up. Can he convince the disillusioned hero to teach him the secret of his strength? And will Saitama ever find a worthy adversary to battle? Action, Comedy, Sci Fi, Seinen, Cyborgs, Martial Arts, Monsters, Overpowered Main Characters, Parody, Satire, Superheroes, Superpowers, Adapted to Anime
81  82  Mushoku Tensei: Jobless Reincarnation   84  Just when an unemployed thirty-four-year-old otaku reaches a dead end in life and decides that it’s time to turn over a new leaf—he gets run over by a truck and dies! Shockingly, he finds himself reborn into an infant’s body in a strange, new world of swords and magic. His new identity is Rudeus Grayrat, but he still retains the memories of his previous life. Follow Rudeus from infancy to adulthood, as he struggles to redeem himself in a wondrous yet dangerous world.. Source: Seven Seas Action, Adventure, Drama, Ecchi, Fantasy, Harem, Romance, Seinen, Demons, Isekai, Magic, NEET, Person in a Strange World, Reincarnation, Violence, Adapted to Anime, Based on a Light Novel
91  92  Koisuru Tetsumenpi  24  Kitagawa Ryouno is the company's most popular hottie, and the ace of the Sales Department. But he has a secret he can't tell anyone. He's a coward, whose face turns red from any unexpected turn of events. Therefore, from the day he noticed that he would calm down in front of his strict and awkward senior Natsume, the hope of Development Department, he became attached to him. On the other hand, Natsume has his secret, too. He's gay and has his eyes set on Kitagawa, ever since he joined the company... 4 years ago. Source: MU    BL, Yaoi, Adult Couples, Coworkers

Requests documentation: https://requests.readthedocs.io/en/latest/

Also, BeautifulSoup docs: https://beautiful-soup-4.readthedocs.io/en/latest/index.html

And pandas: https://pandas.pydata.org/pandas-docs/stable/index.html
