
cheerio vs selectolax

MIT 40 13 30,265
80.4 million (month) Oct 08 2011 1.2.0(2026-02-21 19:30:40 ago)
1,607 1 10 MIT
Mar 01 2018 4.5 million (month) 0.4.7(2026-03-06 09:23:35 ago)

cheerio is a popular JavaScript library that allows you to interact with and manipulate HTML and XML documents in a similar way to how you would with jQuery in a browser. It is a fast, flexible, and lean implementation of core jQuery designed specifically for the server.

One of the main benefits of using cheerio is that it lets you use jQuery-like syntax to navigate and manipulate the Document Object Model (DOM) of an HTML or XML document, which makes such documents easy to work with.

cheerio supports CSS selectors though not XPath.

selectolax is a fast and lightweight library for parsing HTML documents in Python. It is often used as a faster alternative to the popular BeautifulSoup library, although its API is different, so it is not a drop-in replacement.

selectolax uses Cython bindings to HTML engines written in C (Modest and Lexbor) to parse and navigate HTML documents quickly. It provides a simple API for working with the document's structure, built around CSS selectors.

To use selectolax, first install it with pip by running `pip install selectolax`. Once it is installed, you can pass an HTML document to the `HTMLParser` class to parse it into a tree of nodes. For example:

```python
from selectolax.parser import HTMLParser

html_string = "Hello, World!"
root = HTMLParser(html_string).root
print(root.tag)  # html
```

`HTMLParser` accepts the document as a `str` or as `bytes`. It does not read from file paths or file-like objects, so read the file contents first and pass them in. selectolax parses HTML and is not a general-purpose XML parser.

Once you have a parsed tree, you can use the `css()` method to search for elements in the document using CSS selectors. It returns a list of matching nodes. For example:

```python
body = root.css("body")[0]
print(body.text())  # Hello, World!
```

Where BeautifulSoup offers the `find` and `find_all` methods, selectolax offers `css_first()`, which returns the first matching element (or `None` if nothing matches), and `css()`, which returns all matching elements.

Example Use


```javascript
const cheerio = require('cheerio');

const $ = cheerio.load(
  '<html><head><title>My title</title></head>' +
  '<body><h1 class="name">Hello World!</h1></body></html>'
);

// use css selectors
console.log($('title').text()); // My title
console.log($('.name').text()); // Hello World!

// select multiple elements
const $list = cheerio.load('<ul><li>item 1</li><li>item 2</li></ul>');
$list('li').each(function (i, elem) {
  console.log($list(elem).text());
});

// modify elements
const $page = cheerio.load('<h1>Hello World!</h1>');
$page('h1').text('Hello, Cheerio!');
console.log($page.html());
```
```python
from selectolax.parser import HTMLParser

html_string = "Hello, World!"
root = HTMLParser(html_string).root
print(root.tag)  # html

# use css selectors:
body = root.css("body")[0]
print(body.text())  # Hello, World!

# find first matching element:
body = root.css_first("body")
print(body.text())  # Hello, World!

# or all matching elements:
html_string = "<p>paragraph1</p><p>paragraph2</p>"
root = HTMLParser(html_string).root
for el in root.css("p"):
    print(el.text())
# will print:
# paragraph1
# paragraph2
```


