sumy

3,670 4 28 Apache-2.0
0.12.0 (14 Feb 2026) Oct 20 2013 152.5 thousand (month)

sumy is a Python library for automatic extractive summarization of text documents. It reads plain text and HTML, either from local input or fetched from a URL. It supports multiple languages and several summarization algorithms, including Latent Semantic Analysis (LSA), Luhn, Edmundson, TextRank, and SumBasic.
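
To give a rough feel for what summarizers of this kind do, the sketch below scores each sentence by the average frequency of its content words and keeps the highest-scoring ones in their original order. It is only loosely in the spirit of frequency-based methods such as Luhn and SumBasic, and it is not sumy's implementation. The names `STOP_WORDS`, `split_sentences`, `content_words`, and `summarize` are hypothetical, and the tokenization is deliberately naive.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not sumy's own list).
STOP_WORDS = {"a", "an", "and", "are", "is", "it", "of", "on", "the", "to", "in"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, sentences_count):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    frequencies = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        # Average word frequency, so sentences are not favored by length alone.
        return sum(frequencies[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score (ties keep document order), then restore order.
    ranked = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))
    chosen = sorted(ranked[:sentences_count])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = (
        "Cats sleep a lot. Cats chase mice. "
        "Dogs bark loudly. Mice fear cats."
    )
    for sentence in summarize(sample, 2):
        print(sentence)
```

A real library adds proper tokenization, stemming, and per-language stop-word lists on top of this idea, which is why the sketch should be read as an illustration and not as a substitute.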

Example Use


```python
# -*- coding: utf-8 -*-

from __future__ import absolute_import
from __future__ import division, print_function, unicode_literals

from sumy.parsers.html import HtmlParser
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lsa import LsaSummarizer as Summarizer
from sumy.nlp.stemmers import Stemmer
from sumy.utils import get_stop_words


LANGUAGE = "english"
SENTENCES_COUNT = 10


if __name__ == "__main__":
    url = "https://en.wikipedia.org/wiki/Automatic_summarization"
    parser = HtmlParser.from_url(url, Tokenizer(LANGUAGE))
    # or for plain text files
    # parser = PlaintextParser.from_file("document.txt", Tokenizer(LANGUAGE))
    # parser = PlaintextParser.from_string("Check this out.", Tokenizer(LANGUAGE))
    stemmer = Stemmer(LANGUAGE)

    summarizer = Summarizer(stemmer)
    summarizer.stop_words = get_stop_words(LANGUAGE)

    for sentence in summarizer(parser.document, SENTENCES_COUNT):
        print(sentence)
```
