Skip to content

cheeriovsrequests-html

MIT 40 13 30,265
80.4 million (month) Oct 08 2011 1.2.0(2026-02-21 19:30:40 ago)
13,863 2 240 MIT
Feb 25 2018 720.8 thousand (month) 0.10.0(2019-02-17 20:14:17 ago)

cheerio is a popular JavaScript library that allows you to interact with and manipulate HTML and XML documents in a similar way to how you would with jQuery in a browser. It is a fast, flexible, and lean implementation of core jQuery designed specifically for the server.

One of the main benefits of using cheerio is that it allows you to use jQuery-like syntax to navigate and m anipulate the Document Object Model (DOM) of an HTML or XML document, making it easy to work with.

cheerio supports CSS selectors though not XPath.

requests-html is a Python package that allows you to easily make HTTP requests and parse the HTML content of web pages. It is built on top of the popular requests package and uses the html parser from the lxml library, which makes it fast and efficient. This package is designed to provide a simple and convenient API for web scraping, and it supports features such as JavaScript rendering, CSS selectors, and form submissions.

It also offers a lot of functionalities such as cookie, session, and proxy support, which makes it an easy-to-use package for web scraping and web automation tasks.

In short requests-html offers:

  • Full JavaScript support!
  • CSS Selectors (a.k.a jQuery-style, thanks to PyQuery).
  • XPath Selectors, for the faint of heart.
  • Mocked user-agent (like a real web browser).
  • Automatic following of redirects.
  • Connection–pooling and cookie persistence.
  • The Requests experience you know and love, with magical parsing abilities.
  • Async Support

Example Use


```javascript const cheerio = require('cheerio'); const $ = cheerio.load('My title

Hello World!

'); // use css selectors console.log($('title').text()); // My title console.log($('.name').text()); // Hello World! // select multiple elements const $ = cheerio.load('
  • item 1
  • item 2
'); $('li').each(function(i, elem) { console.log($(this).text()); }); // modify elements const $ = cheerio.load('

Hello World!

'); $('h1').text('Hello, Cheerio!'); console.log($.html()); ```
```python from requests_html import HTMLSession session = HTMLSession() r = session.get('https://www.example.com') # print the HTML content of the page print(r.html.html) # use CSS selectors to find specific elements on the page title = r.html.find('title', first=True) print(title.text) ```

Alternatives / Similar


Was this page helpful?