Beautiful Soup is a Python library designed for quick-turnaround projects like
screen-scraping.
Three features make it powerful:
- Beautiful Soup provides a few simple methods and Pythonic idioms for
navigating, searching, and modifying a parse tree: a toolkit for dissecting
a document and extracting what you need. It doesn't take much code to write
an application.
- Beautiful Soup automatically converts incoming documents to Unicode and
outgoing documents to UTF-8. You don't have to think about encodings unless
the document doesn't specify one and Beautiful Soup can't detect it. In that
case, you only have to specify the original encoding yourself.
- Beautiful Soup sits on top of popular Python parsers like lxml and html5lib,
allowing you to try out different parsing strategies or trade speed for
flexibility.
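
The three features above can be illustrated with a minimal sketch. It assumes
the bs4 package is installed and uses the standard-library "html.parser"
backend so that no extra parser is required. The sample markup, the variable
names, and the Latin-1 input are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used only for this illustration.
html = """<html><head><title>Example</title></head>
<body>
<p class="intro">Hello, <b>world</b>!</p>
<a href="http://example.com/one">One</a>
<a href="http://example.com/two">Two</a>
</body></html>"""

# Feature 3: the second argument selects the underlying parser.
# "lxml" or "html5lib" could be passed instead if those packages are installed.
soup = BeautifulSoup(html, "html.parser")

# Feature 1: navigate, search, and modify the parse tree.
title = soup.title.string                        # navigate by tag name
links = [a["href"] for a in soup.find_all("a")]  # search for all <a> tags
soup.b.string = "reader"                         # replace the text inside <b>
intro_text = soup.find("p", class_="intro").get_text()

# Feature 2: bytes in, Unicode inside, UTF-8 out.
# from_encoding names the original encoding when it cannot be detected.
raw = "<p>café</p>".encode("latin-1")
latin_soup = BeautifulSoup(raw, "html.parser", from_encoding="latin-1")
cafe = str(latin_soup.p.string)                  # a Unicode string
utf8_bytes = latin_soup.encode("utf-8")          # serialized as UTF-8 bytes

print(title, links, intro_text, cafe)
```

Swapping "html.parser" for another installed parser changes how malformed
markup is repaired, so results on broken documents may differ from one parser
to the next.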