extractnet
ExtractNet is an automated web data extraction tool using machine learning to parse HTML and text data.
This tool can be used in web scraping to automatically extract details from scraped HTML documents. While it's not as accurate as structured extraction using HTML parsing tools like CSS selectors or XPath it can still parse a lot of details.
Example Use
import requests
from extractnet import Extractor
raw_html = requests.get('https://currentsapi.services/en/blog/2019/03/27/python-microframework-benchmark/.html').text
results = Extractor().extract(raw_html)
{'phone_number': '555-555-5555', 'email': 'example@example.com'}