How Can I Assign The Result of A Beautiful Soup Request To An Array?

Time:11-25

I want to have an array urls[] so I can check for and remove duplicates. My current code looks like this:

match2 = soup.find_all("a", href=True, target="_blank");
for match2 in match2:
    if match2['href'][0] == ".":
        imageUrl = url.split("/")[2] + "/" + url.split("/")[3] + "/src/" + match2['href'].split("/")[-1];
        urls = [];
        urls.append(imageUrl);
print("array");
for i in urls:
   print(i);

But when I run the code there is only one element in urls[] when there should be more. How can I assign the results of match2 to an array?

CodePudding user response:

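Initialize your urls list before the for loop. Inside the loop, urls = [] runs on every matching link, so the list is reset each time and only the last URL survives.

Your loop header "for match2 in match2" also reuses one name for both the list and the loop variable. The loop still runs, but afterwards match2 refers to the last element instead of the list, which can have unintended consequences. Use a separate name for the loop variable: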

match2 = soup.find_all("a", href=True, target="_blank")
urls = []  # created once, before the loop
for entry in match2:
    if entry['href'][0] == ".":
        imageUrl = url.split("/")[2] + "/" + url.split("/")[3] + "/src/" + entry['href'].split("/")[-1]
        urls.append(imageUrl)
print("array")
for i in urls:
    print(i)
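
To see both points in isolation, here is a minimal sketch that does not need Beautiful Soup at all. The hrefs list and the function names collect_buggy and collect_fixed are made up for illustration; the strings stand in for the href values that find_all would return.

```python
# Hypothetical href values standing in for the links found by soup.find_all(...).
hrefs = ["./src/a.jpg", "/about", "./src/b.jpg", "./src/c.jpg"]

def collect_buggy(items):
    for item in items:
        if item[0] == ".":
            found = []          # reset on every match, so earlier values are lost
            found.append(item)
    return found

def collect_fixed(items):
    found = []                  # created once, before the loop
    for item in items:
        if item[0] == ".":
            found.append(item)
    return found

buggy = collect_buggy(hrefs)
fixed = collect_fixed(hrefs)
print(buggy)   # ['./src/c.jpg']
print(fixed)   # ['./src/a.jpg', './src/b.jpg', './src/c.jpg']

# Reusing the list's name as the loop variable rebinds that name:
names = ["x", "y", "z"]
for names in names:
    pass
print(names)   # z
```

The buggy version returns only the last match, which is the behaviour described in the question.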

CodePudding user response:

You are overwriting urls every time you enter the if block. You need to create it outside the loop:

urls = []
match2 = soup.find_all("a", href=True, target="_blank")
for match2 in match2:
    if match2['href'][0] == ".":
        imageUrl = url.split("/")[2] + "/" + url.split("/")[3] + "/src/" + match2['href'].split("/")[-1]
        urls.append(imageUrl)
print("array")
for i in urls:
    print(i)
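
Since the goal in the question is to check for and remove duplicates, one possible follow-up once urls is filled correctly is sketched below. The sample URLs are invented for illustration. dict.fromkeys keeps the first occurrence of each item and preserves order (Python 3.7+); the explicit loop with a set does the same thing and is handy if you would rather skip duplicates while collecting.

```python
# Hypothetical collected URLs, including one duplicate.
urls = [
    "example.org/b/src/1.jpg",
    "example.org/b/src/2.jpg",
    "example.org/b/src/1.jpg",
]

# Option 1: drop duplicates afterwards, keeping first-seen order.
unique_urls = list(dict.fromkeys(urls))

# Option 2: track what has been seen and skip repeats while iterating.
seen = set()
deduped = []
for u in urls:
    if u not in seen:
        seen.add(u)
        deduped.append(u)

print(unique_urls)
print(deduped)
```

A plain set(urls) would also remove duplicates, but it does not keep the original order.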