Beautifulsoup Removing Tags
I'm trying to remove the style tags and their contents from the source, but it's not working: there are no errors, the tags simply don't get decomposed. This is what I have: source = BeautifulSoup(o
Solution 1:
The following code does what you want. Do not use blanket except handling, since it only masks bugs:
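from bs4 import BeautifulSoup
source = BeautifulSoup(open("page.html"), "html.parser")
# find_all() searches every descendant of <body>, not just its direct children
for hidden in source.body.find_all(style='display:none'):
    hidden.decompose()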
Or better still, use a regular expression to cast the net a little wider, so that values with whitespace after the colon (such as 'display: none') also match:
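import re
from bs4 import BeautifulSoup
source = BeautifulSoup(open("page.html"), "html.parser")
# A compiled pattern is matched with search(), so it can hit anywhere in the style value
for hidden in source.body.find_all(style=re.compile(r'display:\s*none')):
    hidden.decompose()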
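To see what the wider net actually catches, here is a small self-contained sketch that compares the two filters. The markup is made up for illustration (it is not the asker's page.html), and the variable names are arbitrary:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this comparison.
html = """
<html><body>
<p style="display:none">a</p>
<p style="display: none">b</p>
<div><span style="color:red; display:none">c</span></div>
<p>visible</p>
</body></html>
"""

# A plain string only matches when the whole attribute value is equal to it.
exact = BeautifulSoup(html, "html.parser")
exact_count = len(exact.body.find_all(style="display:none"))

# A compiled pattern matches anywhere inside the attribute value.
loose = BeautifulSoup(html, "html.parser")
loose_hits = loose.body.find_all(style=re.compile(r"display:\s*none"))
loose_count = len(loose_hits)

for hidden in loose_hits:
    hidden.decompose()

# Text left in <body> after the hidden elements are removed.
remaining = loose.body.get_text(strip=True)
print(exact_count, loose_count, remaining)
```

With this sample the string filter finds only the first paragraph, while the pattern also finds the variant with a space and the one with an extra declaration in front.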
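Note that Tag.children only lists the direct children of the body tag, not all nested descendants, so a loop over it never reaches tags that sit deeper in the tree. find_all() searches the whole subtree by default.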
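The difference is easy to check on a tiny document. This is only a sketch with invented markup, where the hidden paragraph sits one level below the body:

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical markup: the hidden <p> is nested inside a <div>.
html = '<body><div><p style="display:none">secret</p><p>shown</p></div></body>'
soup = BeautifulSoup(html, "html.parser")

# .children yields only the first level below <body>.
direct = [c.name for c in soup.body.children if isinstance(c, Tag)]

# .descendants walks the whole subtree (text nodes are filtered out here).
nested = [d.name for d in soup.body.descendants if isinstance(d, Tag)]

# Filtering the direct children by hand misses the nested hidden tag ...
hidden_direct = [
    c for c in soup.body.children
    if isinstance(c, Tag) and c.get("style") == "display:none"
]

# ... while find_all() is recursive by default and finds it.
hidden_all = soup.body.find_all(style="display:none")
hidden_count = len(hidden_all)

for hidden in hidden_all:
    hidden.decompose()

result = str(soup.body)
print(direct, nested, hidden_count, result)
```

If only the first level should be searched on purpose, find_all() also accepts recursive=False.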