Finding elements by tag name in Beautiful Soup
Extracting a single element by tag name
To extract a single element by tag name, use either find(tag_name) or select_one(tag_name). Both return the first element in the document with the specified tag.
Example
Suppose we have the following HTML document:
my_html = """
<html>
   <p>Alex</p>
   <p>Bob</p>
</html>"""
Using find method
To extract the first element with the p tag:
from bs4 import BeautifulSoup
soup = BeautifulSoup(my_html, "html.parser")
soup.find("p")
<p>Alex</p>
If there is no element with the specified tag, None is returned.
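Because find(~) may return None, it is usually worth checking the result before reading its text or attributes. The following is a minimal sketch of that pattern, assuming the same two-paragraph document; the variable names first_p and missing are illustrative only:

```python
from bs4 import BeautifulSoup

my_html = """
<html>
   <p>Alex</p>
   <p>Bob</p>
</html>"""
soup = BeautifulSoup(my_html, "html.parser")

first_p = soup.find("p")     # first <p> in document order
missing = soup.find("div")   # the document has no <div>, so this is None

# Check for None before touching the element's text or attributes
if first_p is not None:
    print(first_p.get_text())   # Alex
if missing is None:
    print("no <div> element found")
```

Skipping the check and calling missing.get_text() would raise an AttributeError, since None has no such method.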
Using select_one method
To extract the first element with the p tag:
soup.select_one("p")
<p>Alex</p>
If there is no element with the specified tag, None is returned.
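For a bare tag name, select_one(~) and find(~) should agree, because select_one(~) reads its argument as a CSS selector and a tag name on its own is a valid selector. A short sketch comparing the two on the same document (the names by_select and by_find are made up for this example):

```python
from bs4 import BeautifulSoup

my_html = "<html> <p>Alex</p> <p>Bob</p> </html>"
soup = BeautifulSoup(my_html, "html.parser")

by_select = soup.select_one("p")   # "p" is interpreted as a CSS type selector
by_find = soup.find("p")           # "p" is interpreted as a tag name

print(by_select)                   # <p>Alex</p>
print(by_select == by_find)        # True
print(soup.select_one("div"))      # None
```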
Using dot notation
Equivalently, we can access the first element with the p tag using dot notation:
soup.p
<p>Alex</p>
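As a quick check of this equivalence, the sketch below compares dot notation against find(~) on the same document, and shows that a tag that does not appear in the document also yields None:

```python
from bs4 import BeautifulSoup

my_html = "<html> <p>Alex</p> <p>Bob</p> </html>"
soup = BeautifulSoup(my_html, "html.parser")

via_dot = soup.p            # attribute access using the tag name
via_find = soup.find("p")   # explicit method call

print(via_dot == via_find)  # True
print(soup.div)             # None, since there is no <div>
```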
Extracting multiple elements by tag name
To extract multiple elements by tag name, use either find_all(tag_name) or select(tag_name). Both return a list of all elements with the specified tag.
Example
Suppose we have the following HTML document:
my_html = """
<html>
   <p>Alex</p>
   <p>Bob</p>
   <p>Cathy</p>
</html>"""
Using find_all method
To extract all elements with the p tag:
from bs4 import BeautifulSoup
soup = BeautifulSoup(my_html, "html.parser")
for item in soup.find_all("p"):
    print(item)
<p>Alex</p>
<p>Bob</p>
<p>Cathy</p>
Since the find_all(~) method is so commonly used, there is an equivalent shorthand, which is to call the soup object directly:
for item in soup("p"):
    print(item)
<p>Alex</p>
<p>Bob</p>
<p>Cathy</p>
If there is no element with the specified tag, an empty list is returned.
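To illustrate both the shorthand and the empty-list case, here is a small sketch on the three-paragraph document. The names names_full, names_short and missing are hypothetical and used only for this example:

```python
from bs4 import BeautifulSoup

my_html = "<html> <p>Alex</p> <p>Bob</p> <p>Cathy</p> </html>"
soup = BeautifulSoup(my_html, "html.parser")

# find_all and the call shorthand collect the same elements
names_full = [p.get_text() for p in soup.find_all("p")]
names_short = [p.get_text() for p in soup("p")]
print(names_full)                  # ['Alex', 'Bob', 'Cathy']
print(names_full == names_short)   # True

# No <div> in the document, so the result is empty rather than None
missing = soup.find_all("div")
print(len(missing))                # 0
```

Since an empty list is returned instead of None, a for loop over the result simply does not execute, and no None check is needed.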
Using select method
To extract all elements with the p tag using the select(~) method:
for item in soup.select("p"):
    print(item)
<p>Alex</p>
<p>Bob</p>
<p>Cathy</p>
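As with find_all(~), a tag that does not occur in the document should give an empty result from select(~) rather than None. A minimal self-contained sketch, assuming the same three-paragraph document:

```python
from bs4 import BeautifulSoup

my_html = "<html> <p>Alex</p> <p>Bob</p> <p>Cathy</p> </html>"
soup = BeautifulSoup(my_html, "html.parser")

paragraphs = soup.select("p")      # every element matching the selector "p"
print(len(paragraphs))             # 3
print([str(p) for p in paragraphs])

no_match = soup.select("div")      # nothing matches, so the result is empty
print(len(no_match))               # 0
```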