
readability

2,894 5 37 Apache-2.0
0.8.4.1 (3 May 2025) Jun 30 2011 1.6 million (month)

python-readability is a Python package that extracts the main content of a web page, stripping unwanted elements such as ads, navigation, and sidebars.

It is based on the algorithm used by the popular web-based service Readability, and it uses the lxml package to parse the HTML and extract the main content.

Readability is similar to Newspaper in that both extract content from HTML pages.

Example Use


```python
import requests
from readability import Document

# Download the page and hand the raw bytes to readability
response = requests.get('http://example.com')
doc = Document(response.content)

doc.title()
# 'Example Domain'

doc.summary()
# Returns the extracted main content as an HTML string.
# With the markup stripped, the text reads:
#
#   Example Domain
#
#   This domain is established to be used for illustrative examples in documents. You may use this
#   domain in examples without prior coordination or asking for permission.
#
#   More information...
```
