newspaper

13,945 6 502 MIT
0.2.8 (28 Sep 2018) Dec 28 2012 507.8 thousand (month)

newspaper is a Python package that allows developers to easily extract text, images, and videos from articles on the web.

It is designed to be fast, easy to use, and compatible with a wide variety of websites. It extracts the article body along with metadata such as the title, authors, and publication date, and it supports several languages.

newspaper includes an HTTP client, or it can ingest pre-scraped HTML documents.
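
Parsing HTML that was fetched elsewhere might look like the sketch below. It assumes the newspaper3k release of the package, where Article.download() accepts an input_html argument and then skips the network request. The helper name extract_from_html and the sample URL are hypothetical.

```python
def extract_from_html(html, url):
    """Parse already-fetched HTML with newspaper instead of downloading it.

    extract_from_html is a hypothetical helper, not part of the package.
    """
    # Imported inside the function so this module still loads
    # on machines where newspaper is not installed.
    from newspaper import Article

    article = Article(url)
    # Assumption: download() takes input_html and makes no HTTP request.
    article.download(input_html=html)
    article.parse()
    return {
        "title": article.title,
        "authors": article.authors,
        "publish_date": article.publish_date,
        "text": article.text,
    }

# Possible usage, with html obtained by your own scraper:
#   fields = extract_from_html(html, "https://www.example.com/article")
#   print(fields["title"])
```

The URL is still passed so that the article object has a source to attribute the content to, even though nothing is fetched from it.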

Example Use


from newspaper import Article

# Create a new article object
article = Article('https://www.example.com/article')

# Download the article
article.download()

# Parse the article
article.parse()

# Print the article text
print(article.text)

# Print the article title
print(article.title)

# Print the article authors
print(article.authors)

# Print the article publication date
print(article.publish_date)
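
For non-English pages, the language support mentioned earlier can be requested explicitly. This is a minimal sketch, assuming the language keyword that newspaper3k's Article accepts (a two-letter code such as zh). The wrapper name extract_text and the URL are hypothetical.

```python
def extract_text(url, language="en"):
    """Download and parse one article, hinting its language to newspaper.

    extract_text is a hypothetical wrapper, not part of the package.
    """
    # Imported inside the function so this module still loads
    # on machines where newspaper is not installed.
    from newspaper import Article

    # Assumption: Article accepts a language keyword (two-letter code).
    article = Article(url, language=language)
    article.download()
    article.parse()
    return article.text

# Possible usage for a Chinese-language page:
#   text = extract_text("https://www.example.com/article", language="zh")
```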
