from bs4 import BeautifulSoup

my_html = """
       <div>
              <p>I like tea.</p>
              <p>I like <b>soup</b>.</p>
              I like soda.
       </div>
"""
soup = BeautifulSoup(my_html, "html.parser")

Extracting raw text

To extract all text:

print(soup.get_text())

              I like tea.
              I like soup.
              I like soda.

Notice that the output keeps the indentation and line breaks of the original HTML, which makes it awkward to read.
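To see exactly which whitespace is carried over, it can help to print the `repr()` of the returned string. The following is a minimal sketch that rebuilds the same document; the names `raw_text` and `lines` are only illustrative, and the `"html.parser"` argument is an assumption about which parser to use.

```python
from bs4 import BeautifulSoup

my_html = """
       <div>
              <p>I like tea.</p>
              <p>I like <b>soup</b>.</p>
              I like soda.
       </div>
"""
soup = BeautifulSoup(my_html, "html.parser")

# get_text() returns one string that still contains the newlines
# and indentation found between the tags.
raw_text = soup.get_text()
print(repr(raw_text))

# One manual way to tidy it up: strip each line and drop the empty ones.
lines = [line.strip() for line in raw_text.splitlines() if line.strip()]
print(lines)
```

Cleaning line by line like this works for this small document, but the strip parameter of get_text() is usually the more direct route.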

Extracting stripped text

To remove the awkward spacing, pass strip=True:

print(soup.get_text(strip=True))

I like tea.I likesoup.I like soda.

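This looks much cleaner, but note that the pieces of text are now joined with nothing between them, which is why "I like" and "soup" run together as "I likesoup".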

Specifying a separator

To join the bits and pieces of text using "**" as the separator:

print(soup.get_text("**", strip=True))

I like tea.**I like**soup**.**I like soda.

To explain the output, recall that our HTML document's middle line was as follows:

<p>I like <b>soup</b>.</p>

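The separator is inserted between adjacent pieces of text, and a new piece begins wherever a tag opens or closes. Here the pieces are "I like", "soup" and ".", so each tag boundary shows up as ** in the output.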
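One way to inspect the individual pieces is the `stripped_strings` generator, which yields each text fragment with surrounding whitespace removed. The sketch below assumes the same document as above; `pieces`, `joined` and `spaced` are illustrative names only.

```python
from bs4 import BeautifulSoup

my_html = """
       <div>
              <p>I like tea.</p>
              <p>I like <b>soup</b>.</p>
              I like soda.
       </div>
"""
soup = BeautifulSoup(my_html, "html.parser")

# Each fragment of text between tags, stripped of whitespace.
pieces = list(soup.stripped_strings)
print(pieces)  # ['I like tea.', 'I like', 'soup', '.', 'I like soda.']

# Joining the fragments by hand gives the same result as get_text("**", strip=True).
joined = "**".join(pieces)
print(joined)  # I like tea.**I like**soup**.**I like soda.

# A single space is a common separator choice for readable output.
spaced = soup.get_text(" ", strip=True)
print(spaced)  # I like tea. I like soup . I like soda.
```

Note that a space separator also lands between "soup" and the period, because the period is a separate fragment.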

Extracting all text from an element in Beautiful Soup

To extract all text from an element in Beautiful Soup, use the get_text() method.
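As a rough illustration, get_text() can also be called on a single tag rather than on the whole document. The sketch below assumes the same HTML as above and uses `find` and `find_all` to pick out elements; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

my_html = """
       <div>
              <p>I like tea.</p>
              <p>I like <b>soup</b>.</p>
              I like soda.
       </div>
"""
soup = BeautifulSoup(my_html, "html.parser")

# Text of the first <p> element only.
first_p_text = soup.find("p").get_text()
print(first_p_text)  # I like tea.

# Text of the second <p>, including the text inside its nested <b> tag.
second_p_text = soup.find_all("p")[1].get_text()
print(second_p_text)  # I like soup.

# Text of the <b> element on its own.
bold_text = soup.find("b").get_text()
print(bold_text)  # soup
```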


Published by Isshin Inada


Official Beautiful Soup Documentation

https://www.crummy.com/software/BeautifulSoup/bs4/doc/#get-text

