Beautiful Soup Tag | get_text method
Last updated: Mar 5, 2023
Beautiful Soup's Tag.get_text() method returns all the text within the tag, including the text of its descendants, as a single string.
Consider the following HTML document:
from bs4 import BeautifulSoup

my_html = """<div><p>I like tea.</p><p>I like <b>soup</b>.</p>I like soda.</div>"""
soup = BeautifulSoup(my_html, "html.parser")
Extracting raw text
To extract all text:
I like tea.I like soup.I like soda.
Notice that the text is returned exactly as it appears in the document, whitespace included, which can leave the output awkwardly spaced.
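The call that produces the output above can be sketched as follows. The "html.parser" argument is an assumption, since the page does not name a parser, and the variable name raw_text is only illustrative.

```python
from bs4 import BeautifulSoup

my_html = """<div><p>I like tea.</p><p>I like <b>soup</b>.</p>I like soda.</div>"""
soup = BeautifulSoup(my_html, "html.parser")  # parser choice is an assumption

# With no arguments, get_text() concatenates every piece of text as-is
raw_text = soup.get_text()
print(raw_text)  # I like tea.I like soup.I like soda.
```

The space after "I like" survives because it is part of the text inside the second paragraph.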
Extracting stripped text
To solve the problem of awkward spacing, add the strip=True argument:
I like tea.I likesoup.I like soda.
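The leading and trailing whitespace of each piece of text is now removed. Note that the pieces are joined with nothing in between, which is why "I like" and "soup" run together.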
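A minimal sketch of the stripped extraction, again assuming "html.parser" as the parser and using stripped_text as an illustrative variable name:

```python
from bs4 import BeautifulSoup

my_html = """<div><p>I like tea.</p><p>I like <b>soup</b>.</p>I like soda.</div>"""
soup = BeautifulSoup(my_html, "html.parser")  # parser choice is an assumption

# strip=True trims whitespace from both ends of each piece of text
# before the pieces are concatenated
stripped_text = soup.get_text(strip=True)
print(stripped_text)  # I like tea.I likesoup.I like soda.
```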
Specifying a separator
To join the stripped pieces of text using "**" as the separator, pass separator="**" together with strip=True:
I like tea.**I like**soup**.**I like soda.
To explain the output, recall that the second paragraph of our HTML document was as follows:
<p>I like <b>soup</b>.</p>
Every tag boundary splits the text into separate strings, and get_text() joins those strings with your specified separator - that's all.
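The sketch below reproduces the separator output and lists the individual pieces that get joined. It assumes "html.parser" as the parser; the names pieces and joined_text are illustrative. stripped_strings is the Beautiful Soup generator that yields each piece of text with surrounding whitespace removed.

```python
from bs4 import BeautifulSoup

my_html = """<div><p>I like tea.</p><p>I like <b>soup</b>.</p>I like soda.</div>"""
soup = BeautifulSoup(my_html, "html.parser")  # parser choice is an assumption

# The individual pieces of text, split wherever a tag begins or ends
pieces = list(soup.stripped_strings)
print(pieces)  # ['I like tea.', 'I like', 'soup', '.', 'I like soda.']

# get_text() joins those same pieces with the given separator
joined_text = soup.get_text(separator="**", strip=True)
print(joined_text)  # I like tea.**I like**soup**.**I like soda.
```

Passing separator=" " instead of "**" is one way to keep "I like" and "soup" from running together when strip=True is used.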
Published by Isshin Inada
Official Beautiful Soup Documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#get-text