
newspaper

14,022 6 501 MIT
0.2.8 (28 Sep 2018) Dec 28 2012 509.8 thousand (month)

newspaper is a Python package that allows developers to easily extract text, images, and videos from articles on the web.

It is designed to be fast, easy to use, and compatible with a wide variety of websites. Besides the body text, it extracts article metadata such as the title, authors, and publication date, and it supports several languages.

newspaper includes an HTTP client for downloading articles, or it can ingest pre-scraped HTML documents.

Example Use


from newspaper import Article

# Create a new article object
article = Article('https://www.example.com/article')

# Download the article
article.download()

# Parse the article
article.parse()

# Print the article text
print(article.text)

# Print the article title
print(article.title)

# Print the article authors
print(article.authors)

# Print the article publication date
print(article.publish_date)
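
The description above also mentions ingesting pre-scraped HTML. A minimal sketch of that path follows, assuming the installed release accepts the input_html argument to download() (as newspaper 0.2.8 does). The helper name parse_prescraped_html, the SAMPLE_HTML string, and the example URL are hypothetical and exist only for illustration.

```python
def parse_prescraped_html(html, url="https://www.example.com/article"):
    """Parse HTML that was fetched elsewhere, without an HTTP request."""
    # Imported here so newspaper is only required when the helper is called.
    from newspaper import Article

    # The URL is still passed so the article keeps a reference to its source.
    article = Article(url)

    # With input_html set, download() stores the supplied markup
    # instead of fetching the URL.
    article.download(input_html=html)
    article.parse()
    return article


# Hypothetical stand-in for a document saved by an earlier scraping step.
SAMPLE_HTML = """<html>
  <head><title>Sample headline</title></head>
  <body>
    <article>
      <p>This is the first paragraph of a short sample article that was
      saved to disk by some other tool before being parsed.</p>
      <p>This is the second paragraph, which is here only so that the
      extractor has some body text to work with.</p>
    </article>
  </body>
</html>"""
```

Calling parse_prescraped_html(SAMPLE_HTML) returns an Article whose title, text, authors, and publish_date attributes can be read exactly as in the example above. Very short documents like this sample may produce little or no body text, so real pages are a better test of extraction quality.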
