html5libvssoup
html5lib is a pure-python library for parsing HTML. It is designed to conform to the WHATWG HTML specification, as is implemented by all major web browsers.
As html5lib is implemented in pure-python it is significantly slower than alternatives powered by lxml (like parsel or beautifulsoup).
However, html5lib implements a more true html5 parsing which can represent HTML tree more correctly than alternatives.
soup is a Go library for parsing and querying HTML documents.
It provides a simple and intuitive interface for extracting information from HTML pages. It's inspired by popular Python web scraping
library BeautifulSoup and shares similar use API implementing functions like Find and FindAll.
soup can also use go's built-in http client to download HTML content.
Note that unlike beautifulsoup, soup does not support CSS selectors or XPath.