Get An Element Before A String With Beautiful Soup

I'm using Beautiful Soup to search a website for a set of integer values and produce a list of these, matched to names. However, the problem I'm having is that the website uses som

Solution 1:

Iterate over the li elements with the list-item class and check whether their contents include the string you are looking for:

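# find_all is the current name for the older findAll alias
for listItem in soup.find_all('li', class_='list-item'):
    # Keep only the items whose inner HTML mentions the label
    if 'Important Values' in listItem.decode_contents(formatter="html"):
        print(listItem.find('strong').contents)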
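The loop above assumes a soup object already exists. Here is a minimal self-contained sketch of the same idea. The markup in SAMPLE_HTML and the helper name values_for_label are assumptions made for illustration, since the question does not show the real page structure. It compares against the visible text with get_text() rather than the inner HTML, which is a slight variation on the snippet above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page's structure is not shown in the question.
SAMPLE_HTML = """
<ul>
  <li class="list-item"><strong>65</strong> Important Values</li>
  <li class="list-item"><strong>12</strong> Other Values</li>
</ul>
"""


def values_for_label(html, label):
    """Return the <strong> text of every list item whose visible text contains label."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for list_item in soup.find_all("li", class_="list-item"):
        strong = list_item.find("strong")
        # Skip items without a <strong> child instead of raising AttributeError
        if strong is not None and label in list_item.get_text():
            results.append(strong.get_text(strip=True))
    return results


print(values_for_label(SAMPLE_HTML, "Important Values"))
```

With the sample markup this should print a one-element list holding the string '65'. Convert with int() if a number is needed.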

Solution 2:

The question asks: "I would like to be able to set Beautiful Soup/Python to parse for a string like 'Important Values' and get the element directly before it (ignoring any line breaks or white-space), or better yet the value contained within the element."

BeautifulSoup offers several ways to locate elements in the HTML. In this case, we can find the "Important Values" text node and then step back to the preceding strong sibling:

# string= is the current name for the older text= argument
label = soup.find(string=lambda text: text and text.strip() == 'Important Values')
important_values = int(label.find_previous_sibling("strong").get_text())
print(important_values)  # prints 65
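The one-liner raises AttributeError when the label or the strong element is missing. Below is a hedged sketch of the same text-node lookup with explicit None checks. The sample markup, including the br tag and the line breaks around the label, is an assumption chosen to mimic the "ignoring any line breaks or white-space" requirement. The helper name value_before_label is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with a <br> and line breaks between the value and its label.
SAMPLE_HTML = """
<li class="list-item">
  <strong>65</strong><br>
  Important Values
</li>
"""


def value_before_label(soup, label):
    """Return the integer in the <strong> sibling preceding the label text, or None."""
    # Match the text node whose stripped content equals the label exactly
    label_node = soup.find(string=lambda s: s is not None and s.strip() == label)
    if label_node is None:
        return None
    # find_previous_sibling("strong") skips the <br> and any whitespace-only text nodes
    strong = label_node.find_previous_sibling("strong")
    if strong is None:
        return None
    return int(strong.get_text(strip=True))


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(value_before_label(soup, "Important Values"))
```

Because find_previous_sibling is given a tag name, intermediate siblings that are not strong elements are passed over, so no manual whitespace handling is needed.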

Or, we can write a search function that matches a strong element whose next sibling is the text node "Important Values":

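def search_function(tag):
    # Only <strong> tags are candidates
    if tag.name != "strong":
        return False
    # The isinstance check avoids calling .strip() on a sibling that is a Tag or None
    sibling = tag.next_sibling
    return isinstance(sibling, str) and sibling.strip() == 'Important Values'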

important_values = int(soup.find(search_function).get_text())
print(important_values)  # prints 65
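Since the question's goal is a list of values matched to names, the search-function idea can be extended to collect every pair on the page. This is only a sketch under assumptions: the markup is invented for illustration, and the names is_labelled_strong and collect_values are hypothetical. It treats any non-empty text directly after a strong element as that value's name.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup: the real page's structure is not shown in the question.
SAMPLE_HTML = """
<ul>
  <li class="list-item"><strong>65</strong> Important Values</li>
  <li class="list-item"><strong>12</strong> Other Values</li>
</ul>
"""


def is_labelled_strong(tag):
    """Match <strong> tags that are directly followed by a non-empty text node."""
    if tag.name != "strong":
        return False
    sibling = tag.next_sibling
    return isinstance(sibling, NavigableString) and bool(sibling.strip())


def collect_values(html):
    """Map each label (the text after a <strong>) to the integer inside that <strong>."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        tag.next_sibling.strip(): int(tag.get_text(strip=True))
        for tag in soup.find_all(is_labelled_strong)
    }


print(collect_values(SAMPLE_HTML))
```

For the sample markup this should yield a dictionary mapping 'Important Values' to 65 and 'Other Values' to 12. If the real page puts markup between the value and its name, the sibling check would need adjusting.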
