Delete a specific tag from main soup in BeautifulSoup4 (python)

Time:04-06

This is what I have tried. Look at soup.div.decompose(); I also tried soup.elements.div.decompose(). The page uses content from DataTables, and this is my first time using it, so if there's a better way to achieve what I'm doing, please tell me! Thanks in advance!

import bs4

with open('MapPage.html', 'r', encoding="utf8") as f:
    txt = f.read()
    soup = bs4.BeautifulSoup(txt,"html5lib")

elements = soup.find_all('tr')
elements.pop(0)

def DeleteData(msgID):
    for div in elements:
        ID = div.find('a').contents[0]
        if int(msgID)==int(ID):
            soup.div.decompose()
            return
    print('Failed to delete data from', msgID)

I'm hoping I'll then be able to just write the soup back to 'MapPage.html'. Instead, the error AttributeError: 'NoneType' object has no attribute 'decompose' is produced. [Image: the output when printing div]

CodePudding user response:

If I understand correctly, you want to decompose() the <tr> that contains a specific value in its <a>.

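The main issue is that you call soup.div.decompose(), which means you are trying to decompose() the first <div> of the soup object, not the row held in your loop variable. Since the error says the object is None, the soup has no <div> at all, so soup.div is None.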
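To see why the error appears, here is a minimal sketch. The tiny table is a hypothetical stand-in for the real page, and it assumes the built-in "html.parser" rather than html5lib. It shows that soup.div is a lookup for a <div> tag and has nothing to do with a Python variable named div:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: a table with no <div> in it.
html = "<table><tr><td><a href='#'>1</a></td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# soup.div is shorthand for soup.find("div"); it returns None when the
# document contains no <div> tag.
print(soup.div)  # None

# Calling decompose() on that None reproduces the error from the question.
try:
    soup.div.decompose()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# The Tag object you already hold is the thing to decompose.
row = soup.find("tr")
row.decompose()
print(soup.find("tr"))  # None, the row is gone from the tree
```

The same applies inside the loop: the loop variable already is the <tr> Tag, so decompose() can be called on it directly.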

Simply use:

div.decompose()

or, even better, rename your loop variable to something that is not a tag name:

e.decompose()

Example

from bs4 import BeautifulSoup

html = '''
<html><body>
    <h2>Welcome to our collection of community made maps!</h2>
    <table id="example"  style="width:100%">
        <thead>
            <tr><th>ID</th><th>Author</th><th>Content</th><th>Thumbnail</th><th>Download</th><th>Rating</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">939257309387980851</a></td>
                <td>Matter</td><td>Cervinia Source</td><td><img src="https://media.discordapp.net/attachments/932881912714895390/939257307290796062/unknown.png" alt="Cervina Thumb" width="300" height="auto"></td><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">Download</a></td><td>5</td>
            </tr>
            <tr><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">939257309387980852</a></td><td>Tea</td><td>Chamonix</td><td><img src="https://media.discordapp.net/attachments/932881912714895390/939257307290796062/unknown.png" alt="Cervina Thumb" width="300" height="auto"></td><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">Download</a></td><td>5</td></tr>
        </tbody>
    </table>
</body></html>
'''
# Name the parser explicitly; omitting it makes bs4 guess one and emit a warning
soup = BeautifulSoup(html, 'html.parser')
elements = soup.select('tr:has(td)')

def DeleteData(msgID):
    for e in elements:
        ID = e.find('a').contents[0]
        if int(msgID)==int(ID):
            e.decompose()
            return
    # Reached only when no row matched, so report the failure once after the loop
    print('Failed to delete data from', msgID)

DeleteData(939257309387980851)
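
Since the goal is to write the soup back to 'MapPage.html' afterwards, here is a rough sketch of the whole round trip. It is only an illustration under some assumptions: delete_row is a hypothetical helper name, the file is a small temporary stand-in for the real page, IDs are compared as strings instead of being converted with int(), and the built-in "html.parser" is used (html5lib, as in the question, would additionally wrap the output in <html>, <head> and <body>).

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup


def delete_row(soup, msg_id):
    """Remove the first data row whose first link text equals msg_id."""
    for row in soup.select("tr:has(td)"):  # skips the header row (only <th>)
        link = row.find("a")
        if link is not None and link.get_text(strip=True) == str(msg_id):
            row.decompose()
            return True
    return False


# Temporary stand-in for MapPage.html with two hypothetical message IDs.
page = Path(tempfile.mkdtemp()) / "MapPage.html"
page.write_text(
    "<table><thead><tr><th>ID</th></tr></thead><tbody>"
    "<tr><td><a href='#'>101</a></td></tr>"
    "<tr><td><a href='#'>102</a></td></tr>"
    "</tbody></table>",
    encoding="utf8",
)

# Parse, delete one row, then write the modified tree back to the same file.
soup = BeautifulSoup(page.read_text(encoding="utf8"), "html.parser")
deleted = delete_row(soup, 101)
page.write_text(str(soup), encoding="utf8")

# Re-read the file to confirm which IDs are still present.
saved = BeautifulSoup(page.read_text(encoding="utf8"), "html.parser")
remaining = [a.get_text() for a in saved.select("tr:has(td) a")]
print(deleted, remaining)
```

Returning True or False instead of printing lets the caller decide how to report a missing ID, and selecting the rows inside the function avoids relying on a module-level elements list that can go stale after a row is removed.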