
newspaper

13,679 6 521 MIT
0.2.8 (28 Sep 2018) Dec 28 2012 701.2 thousand (month)

newspaper is a Python package that allows developers to easily extract text, images, and videos from articles on the web.

It is designed to be fast, easy to use, and compatible with a wide variety of websites. It extracts the relevant content and metadata from articles, such as the title, authors, and publication date, and it supports several languages.

newspaper includes an HTTP client, or it can ingest pre-scraped HTML documents.

Example Use


from newspaper import Article

# Create a new article object
article = Article('https://www.example.com/article')

# Download the article
article.download()

# Parse the article
article.parse()

# Print the article text
print(article.text)

# Print the article title
print(article.title)

# Print the article authors
print(article.authors)

# Print the article publication date
print(article.publish_date)
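
The description also mentions ingesting pre-scraped HTML. A minimal sketch of that path follows, assuming the `input_html` parameter of `Article.download()` as found in the 0.2.8 release. The helper name `parse_prescraped_html`, the `SAMPLE_HTML` document, and the example URL are hypothetical and used only for illustration.

```python
# Hypothetical sample document standing in for HTML fetched by another tool.
SAMPLE_HTML = """<html>
<head><title>Sample headline</title></head>
<body>
<h1>Sample headline</h1>
<p>This is the first paragraph of a made-up article used for illustration.</p>
</body>
</html>"""


def parse_prescraped_html(html, url="https://www.example.com/article"):
    """Parse HTML that was fetched elsewhere and return the Article object."""
    # Imported inside the function so that defining this helper does not
    # require the package to be installed.
    from newspaper import Article

    # The URL is still passed so the article has a source address to refer to.
    article = Article(url)

    # Supplying input_html makes download() use the given markup
    # instead of requesting the URL over the network.
    article.download(input_html=html)
    article.parse()
    return article
```

Calling `parse_prescraped_html(SAMPLE_HTML)` returns an `Article` whose `title`, `text`, `authors`, and `publish_date` attributes can be read exactly as in the example above. How much text is extracted from a document this small depends on the library's content heuristics.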

Alternatives / Similar


1,595 2024.2.26 (2 months ago) Dec 14 2008
2,911 1.9.0 (9 days ago) Jul 17 2019
2,572 0.8.1 (3 years ago) Jun 30 2011
3,429 0.11.0 (1 year, 6 months ago) Oct 20 2013
823 0.16.0 (10 months ago) Oct 27 2015
172 2.0.7 (1 year, 6 months ago) Dec 11 2020
10,534 1.1.9 (5 years ago) Aug 24 2018

Other Languages

2,400 v1.3.0 (2 months ago) Apr 20 2016