feedparser
feedparser is a Python module for downloading and parsing syndicated feeds. It can handle RSS 0.90, Netscape RSS 0.91, Userland RSS 0.91, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, Atom 1.0, and CDF feeds. It also parses several popular extension modules, including Dublin Core and Apple's iTunes extensions.
To use Universal Feed Parser, you will need Python 3.6 or later. Universal Feed Parser is not meant to run standalone; it is a module for you to use as part of a larger Python program.
Because it can both download a feed and parse its XML, feedparser is a convenient tool for scraping data from feeds.
Example Use
import feedparser
# the feed can be loaded from a remote URL
data = feedparser.parse('http://feedparser.org/docs/examples/atom10.xml')
# local path
data = feedparser.parse('/home/user/data.xml')
# or raw string
data = feedparser.parse('<xml>...</xml>')
# the result is a nested Python dictionary containing the feed data:
data['feed']['title']
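Beyond the feed title, the parsed result also carries the individual items under the 'entries' key. The following is a minimal sketch of reading them, not a definitive recipe. The helper names entry_titles and titles_from_source are hypothetical and are not part of feedparser; the only library features assumed are feedparser.parse, the 'entries' list, and the bozo flag that feedparser sets when a feed is not well-formed.

```python
def entry_titles(parsed):
    """Return the title of every entry in a parsed feed result.

    `parsed` is the dictionary-like object returned by feedparser.parse().
    """
    # Feeds may omit elements, so use .get() with a fallback instead of
    # indexing directly and risking a KeyError.
    return [entry.get("title", "(untitled)") for entry in parsed.get("entries", [])]


def titles_from_source(source):
    """Parse a URL, local path, or raw string and return its entry titles."""
    # Imported here so entry_titles() stays usable on plain dictionaries
    # even where feedparser is not installed.
    import feedparser

    parsed = feedparser.parse(source)
    # feedparser sets `bozo` when the feed is not well-formed; the parsed
    # data may still be usable, so only warn here.
    if parsed.get("bozo"):
        print("warning: feed is not well-formed:", parsed.get("bozo_exception"))
    return entry_titles(parsed)
```

For example, titles_from_source('/home/user/data.xml') would return a list with one title per entry in that file. Keeping the extraction step separate from the parsing step means it can be exercised on an ordinary dictionary with the same shape as the parsed result.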