cheerio vs chopper

cheerio: MIT | 40 | 13 | 30,265 | 80.4 million (month) | Oct 08 2011 | 1.2.0 (2026-02-21 19:30:40)
chopper: 23 | 3 | 1 | MIT | Jul 24 2014 | 1.7 thousand (month) | 0.6.0 (2023-04-26 10:16:25)

cheerio is a popular JavaScript library that allows you to interact with and manipulate HTML and XML documents in a similar way to how you would with jQuery in a browser. It is a fast, flexible, and lean implementation of core jQuery designed specifically for the server.

One of the main benefits of using cheerio is that it lets you use jQuery-like syntax to navigate and manipulate the Document Object Model (DOM) of an HTML or XML document, which makes such documents easy to work with.

cheerio supports CSS selectors but not XPath.

Chopper is a tool for extracting elements from HTML while preserving their ancestors and the CSS rules that apply to them.

Unlike other HTML parsers, Chopper is designed to retain the original HTML tree and eliminate only the elements that do not match the parsing rules. This means we can extract HTML elements and keep their structure, which is useful for machine learning or other tasks where the structure of the data is needed as well as its values.
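To make the "keep the tree, drop what does not match" idea concrete, here is a minimal sketch using only Python's standard library. It is not Chopper's implementation: the `prune` function and its parameters are hypothetical names chosen for illustration, and it only handles the limited XPath subset that `xml.etree.ElementTree` supports on well-formed markup.

```python
import xml.etree.ElementTree as ET


def prune(root, keep_xpath, discard_xpath=None):
    """Illustrative only (not Chopper's code): keep matched elements,
    their ancestors and their descendants; remove everything else."""
    kept = set(root.findall(keep_xpath))
    discarded = set(root.findall(discard_xpath)) if discard_xpath else set()

    def visit(node, inside_kept=False):
        # A discarded element is removed even inside a kept subtree.
        if node in discarded:
            return False
        inside_kept = inside_kept or node in kept
        has_kept_child = False
        for child in list(node):  # copy, because we remove while iterating
            if visit(child, inside_kept):
                has_kept_child = True
            else:
                node.remove(child)
        # Stay if we are in a kept subtree or are an ancestor of one.
        return inside_kept or has_kept_child

    visit(root)
    return root


doc = ET.fromstring(
    '<html><head><title>Test</title></head><body>'
    '<div id="main"><div class="iwantthis">HELLO WORLD<a>Do not want</a></div></div>'
    '<div id="footer"/></body></html>'
)
pruned = prune(doc, './/div[@class="iwantthis"]', './/a')
result = ET.tostring(pruned, encoding="unicode")
print(result)
# <html><body><div id="main"><div class="iwantthis">HELLO WORLD</div></div></body></html>
```

The `html`, `body` and `div#main` elements survive only because they are ancestors of the kept element, which is what keeps the structure of the original document intact.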

Example Use


```javascript
const cheerio = require('cheerio');

// load a document and query it with CSS selectors
const $ = cheerio.load(
  '<html><head><title>My title</title></head>' +
  '<body><p class="name">Hello World!</p></body></html>'
);
console.log($('title').text()); // My title
console.log($('.name').text()); // Hello World!

// select multiple elements
const $list = cheerio.load('<ul><li>item 1</li><li>item 2</li></ul>');
$list('li').each(function (i, elem) {
  console.log($list(this).text());
});

// modify elements
const $page = cheerio.load('<h1>Hello World!</h1>');
$page('h1').text('Hello, Cheerio!');
console.log($page.html());
```
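
```python
from chopper.extractor import Extractor

HTML = """
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <div id="main">
      <div class="iwantthis">
        HELLO WORLD
        <a>Do not want</a>
      </div>
    </div>
    <div id="footer"></div>
  </body>
</html>
"""

CSS = """
div { border: 1px solid black; }
div#main { color: blue; }
div.iwantthis { background-color: red; }
a { color: green; }
div#footer { border-top: 2px solid red; }
"""

# keep the wanted div (and its ancestors), drop every link
extractor = Extractor.keep('//div[@class="iwantthis"]').discard('//a')
html, css = extractor.extract(HTML, CSS)

# will result in:
#
# html
# <html>
#   <body>
#     <div id="main">
#       <div class="iwantthis">
#         HELLO WORLD
#       </div>
#     </div>
#   </body>
# </html>
#
# css
# div{border:1px solid black;}
# div#main{color:blue;}
# div.iwantthis{background-color:red;}
```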
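
The CSS half of the result above can be read as "keep only the rules whose selector still matches something in the pruned tree". The following is a rough sketch of that idea using only the standard library. It is an assumption about the behaviour, not Chopper's actual code: `filter_css` and `matches` are hypothetical helpers, and they understand only simple `tag`, `tag#id` and `tag.class` selectors.

```python
import re
import xml.etree.ElementTree as ET

RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")                   # selector { declarations }
SIMPLE = re.compile(r"^(\w+)?(?:#([\w-]+))?(?:\.([\w-]+))?$")  # tag, #id, .class


def matches(element, selector):
    """True if a simple selector (tag, tag#id, tag.class) matches the element."""
    m = SIMPLE.match(selector)
    if not selector or m is None:
        return False
    tag, id_, cls = m.groups()
    if tag and element.tag != tag:
        return False
    if id_ and element.get("id") != id_:
        return False
    if cls and cls not in element.get("class", "").split():
        return False
    return True


def filter_css(css, root):
    """Keep only the rules that match at least one element under root."""
    elements = list(root.iter())
    kept_rules = []
    for selector, body in RULE.findall(css):
        selector = selector.strip()
        if any(matches(el, selector) for el in elements):
            declarations = [d.strip() for d in body.split(";") if d.strip()]
            compact = ";".join(re.sub(r"\s*:\s*", ":", d, count=1) for d in declarations)
            kept_rules.append(f"{selector}{{{compact};}}")
    return "\n".join(kept_rules)


# The already-pruned tree from the example above, written compactly.
pruned = ET.fromstring(
    '<html><body><div id="main"><div class="iwantthis">HELLO WORLD</div></div></body></html>'
)
CSS = """
div { border: 1px solid black; }
div#main { color: blue; }
div.iwantthis { background-color: red; }
a { color: green; }
div#footer { border-top: 2px solid red; }
"""
filtered = filter_css(CSS, pruned)
print(filtered)
```

The rules for `a` and `div#footer` are dropped because no such elements remain in the pruned tree, while the three `div` rules are kept in compact form.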
