rvest

1,517 1 38 MIT

1.0.5 (12 Feb 2024) Nov 22 2014 534.7 thousand (month)

rvest is a popular R package for web scraping and for parsing HTML and XML documents. It is built on top of the xml2 and httr packages and provides a simple and consistent API for working with web pages.

One of the main advantages of rvest is its ease of use: its functions make it straightforward to extract information from web pages, even for those new to web scraping. The html_elements and html_element functions (named html_nodes and html_node before rvest 1.0.0) select elements from an HTML document using CSS selectors, much like document.querySelectorAll and document.querySelector in JavaScript.
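As a minimal sketch of the difference between the two selector functions, the snippet below parses a small inline document. The markup, ids, and class names are made up for illustration, and minimal_html assumes rvest 1.0.0 or newer.

```r
library(rvest)

# Illustrative markup only; the ids, classes, and hrefs are invented for this sketch.
page <- minimal_html('
  <ul id="menu">
    <li class="item"><a href="/tea">Tea</a></li>
    <li class="item"><a href="/coffee">Coffee</a></li>
  </ul>
')

# html_elements() returns every match for the CSS selector
links <- page %>% html_elements("#menu li.item > a")
link_text <- html_text(links)          # text of each matched link
link_urls <- html_attr(links, "href")  # href attribute of each matched link

# html_element() returns only the first match
first_link <- page %>% html_element("li.item > a") %>% html_text()

# when nothing matches, html_element() yields a missing node whose text is NA
missing_text <- page %>% html_element(".does-not-exist") %>% html_text()
```

Using html_elements for lists of results and html_element for single fields keeps the output length predictable, which helps when assembling data frames.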

rvest also provides functions for working with forms: html_form, html_form_set, and html_form_submit (set_values and submit_form in releases before 1.0.0). These make it easy to fill in forms and submit data to the server, which is useful when scraping sites that require authentication or that return content only in response to a form submission.
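A small sketch of the form workflow, assuming rvest 1.0.0 or newer. The login form, its field names, and the credentials are hypothetical, and nothing is sent over the network.

```r
library(rvest)

# Hypothetical login form; field names and action URL are invented for this sketch.
page <- minimal_html('
  <form action="/login" method="post">
    <input type="text" name="username">
    <input type="password" name="password">
    <input type="submit" name="submit" value="Log in">
  </form>
')

# html_form() on a document returns a list with one entry per <form>
login_form <- html_form(page)[[1]]

# html_form_set() returns a copy of the form with the given fields filled in
filled <- html_form_set(login_form, username = "alice", password = "s3cret")

# the stored values can be inspected before anything is submitted
username_value <- filled$fields$username$value
```

On a real site the filled form would then be passed to html_form_submit(), or to session_submit() together with a session so that any login cookies are kept.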

rvest can also be used with XML documents. Parsing itself is handled by xml2 (read_xml); html_elements and html_element then select elements using CSS selectors or XPath (the older xml_nodes and xml_node functions are deprecated), and xml2's xml_attrs and xml_attr functions extract attributes from elements.
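The following sketch shows one way to combine the two packages on an XML document. The catalog markup is invented for illustration; read_xml, xml_attr, xml_attrs, and xml_text come from xml2.

```r
library(xml2)
library(rvest)

# Illustrative XML only; element and attribute names are invented for this sketch.
doc <- read_xml('
  <catalog>
    <book id="b1" lang="en"><title>R Basics</title></book>
    <book id="b2" lang="en"><title>Advanced R</title></book>
  </catalog>
')

# select elements with a CSS selector...
books <- html_elements(doc, "book")
ids <- xml_attr(books, "id")           # one attribute across all matches

# ...or with XPath
titles <- doc %>% html_elements(xpath = "//book/title") %>% xml_text()

# xml_attrs() returns all attributes of a node as a named character vector
first_attrs <- xml_attrs(books[[1]])
```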

Another advantage of rvest is session support: session() keeps cookies between requests, so a logged-in session stays alive while you navigate a website, and redirects are followed by the underlying httr client.
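A rough sketch of session-based navigation is shown below. The helper collect_links and its arguments are hypothetical names introduced for illustration; the function is only defined here, since calling it requires network access.

```r
library(rvest)

# Hypothetical helper: start_url and next_path are placeholders supplied by the caller.
collect_links <- function(start_url, next_path) {
  s <- session(start_url)             # first request; cookies from the server are stored
  s <- session_jump_to(s, next_path)  # later requests reuse the same cookies
  page <- read_html(s)                # parse the page the session currently points at
  page %>% html_elements("a") %>% html_attr("href")
}

# Example call (requires network access):
# collect_links("https://example.com/", "/about")
```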

Example Use

```r
library("rvest")

# rvest can use a basic HTTP client to download remote HTML:
tree <- read_html("http://webscraping.fyi/lib/r/rvest")
# or read from a string:
tree <- read_html('
<div class="products">
  <a>Cat Food</a>
  <a>Dog Food</a>
</div>
')

# to parse HTML trees with rvest we use R pipes (the %>% symbol) and the html_element function
# html_element returns the first match; html_elements returns all matches
# we can use CSS selectors:
print(tree %>% html_element(".products>a") %>% html_text())
# [1] "Cat Food"

# or XPath:
print(tree %>% html_element(xpath="//div[@class='products']/a") %>% html_text())
# [1] "Cat Food"

# Additionally rvest offers many quality of life functions:
# html_text2 - removes trailing and leading spaces and joins values
print(tree %>% html_element("div") %>% html_text2())
# [1] "Cat Food Dog Food"

# html_attr - selects an element's attribute:
print(tree %>% html_element("div") %>% html_attr('class'))
# [1] "products"
```

Alternatives / Similar

ralger

165 2.3.0 (2021-03-18 00:10:00 ago) Dec 22 2019 compare

httr

985 1.4.8 (2023-08-15 11:00:00 ago) May 06 2012 compare

xml2

223 1.5.2 (2025-12-01 15:40:00 ago) Apr 20 2015 compare

crul

107 1.6.0 (2024-07-19 19:50:00 ago) Nov 09 2016 compare

Other Languages

scrapling new

36,206 0.4.5 (2026-04-07 04:22:27 ago) Aug 01 2024 compare

mechanize new

4,440 2.14.0 (2025-01-05 18:30:46 ago) Jul 25 2009 compare

requests

53,883 2.33.1 (2026-03-30 16:09:13 ago) Feb 14 2011 compare

node-fetch

8,860 3.3.2 (2023-11-30 14:10:12 ago) Dec 28 2012 compare

httpx

15,183 0.28.1 (2024-12-06 15:37:21 ago) Jul 26 2019 compare

aiohttp

16,395 3.13.5 (2026-03-31 21:56:30 ago) Jul 26 2019 compare

axios

108,987 1.15.0 (2026-04-08 16:09:38 ago) Aug 29 2014 compare

parse5

3,886 8.0.0 (2026-02-21 19:30:52 ago) Jul 03 2013 compare

sax-js

1,153 1.6.0 (2026-03-17 01:32:31 ago) Feb 09 2011 compare

htmlparser2

4,789 12.0.0 (2026-03-20 23:08:40 ago) Aug 28 2011 compare

lxml

3,010 6.0.3 (2026-04-09 14:33:38 ago) Dec 13 2022 compare

beautifulsoup

- 4.14.3 (2025-11-30 15:08:24 ago) Jul 26 2019 compare

jsdom new

21,552 29.0.2 (2026-04-07 03:38:38 ago) Nov 21 2011 compare

colly

25,231 v2.2.0 (2025-03-27 10:47:28 ago) May 14 2018 compare

katana new

16,499 v1.5.0 (2026-03-10 14:52:47 ago) Nov 07 2022 compare

got

14,897 15.0.1 (2026-04-08 16:15:35 ago) Mar 27 2014 compare

xmltodict

5,734 1.0.4 (2026-02-22 02:21:21 ago) Jul 30 2007 compare

cheerio

30,265 1.2.0 (2026-02-21 19:30:40 ago) Oct 08 2011 compare

pholcus

7,594 v1.4.0 (2026-03-03 03:58:32 ago) Feb 15 2020 compare

curl-impersonate

5,944 v0.6.1 (2024-03-02 18:08:29 ago) Feb 23 2022 compare

needle

1,631 3.5.0 (2026-03-12 22:24:55 ago) Dec 11 2011 compare

superagent

16,610 10.1.1 (2024-10-22 17:26:05 ago) Aug 22 2011 compare

html5lib

1,220 1.1 (2020-06-22 23:32:36 ago) Jul 30 2007 compare

geziyor

2,772 2026-04-11 (2026-04-11 21:30:25 ago) Jun 06 2019 compare

cssselect

309 1.4.0 (2026-01-29 07:00:24 ago) Apr 14 2012 compare

feedparser

2,351 6.0.12 (2025-09-10 13:33:58 ago) Jun 15 2007 compare

dataflowkit

711 2026-03-21 (2026-03-21 09:11:03 ago) Feb 09 2017 compare

primp new

504 1.2.2 (2026-04-03 07:11:15 ago) Jun 01 2024 compare

faraday

5,927 2.14.1 (2026-02-07 15:17:15 ago) Dec 19 2009 compare

nokogiri

6,248 1.19.2 (2026-03-19 21:12:43 ago) Jul 25 2009 compare

pycurl

1,147 7.45.7 (2025-09-24 13:35:56 ago) Feb 25 2003 compare

parsel

1,324 1.11.0 (2026-01-29 07:19:22 ago) Jul 26 2019 compare

selectolax

1,607 0.4.7 (2026-03-06 09:23:35 ago) Mar 01 2018 compare

excon

1,172 1.4.2 (2026-03-20 19:18:25 ago) Oct 31 2009 compare

scrapy

61,276 2.15.0 (2026-04-09 12:02:09 ago) Jul 26 2019 compare

httpclient

707 2.9.0 (2025-02-22 01:13:49 ago) Jul 25 2009 compare

httparty

5,889 0.24.2 (2026-01-14 22:54:36 ago) Jul 25 2009 compare

pyquery

2,381 2.0.1 (2024-08-30 08:12:22 ago) Dec 05 2008 compare

crawl4ai new

63,373 0.8.6 (2026-03-24 15:07:50 ago) May 01 2024 compare

typhoeus

4,131 1.6.0 (2026-03-10 12:58:26 ago) Oct 06 2009 compare

requests-html

13,863 0.10.0 (2019-02-17 20:14:17 ago) Feb 25 2018 compare

curl-cffi

1,751 0.7.1 (2024-07-13 09:07:25 ago) Feb 23 2022 compare

guzzle

23,447 7.10.0 (2025-08-23 22:36:01 ago) Nov 14 2011 compare

untangle

632 1.2.1 (2022-07-02 14:09:28 ago) Jun 09 2011 compare

crawlee new

22,720 3.16.0 (2026-04-09 07:36:53 ago) Apr 22 2022 compare

wreck

378 18.1.0 (2025-07-24 23:01:15 ago) Aug 06 2011 compare

em-http-request

1,219 1.1.7 (2020-08-31 21:38:00 ago) Oct 25 2009 compare

html5-php

1,772 2.10.0 (2025-07-25 09:04:22 ago) Jun 01 2013 compare

symfony-http

2,033 v8.0.8 (2026-03-30 15:14:47 ago) Apr 28 2019 compare

treq

605 25.5.0 (2025-06-03 03:42:30 ago) Dec 28 2012 compare

domcrawler

4,038 v8.0.8 (2026-03-30 15:14:47 ago) Sep 26 2011 compare

http-2

908 1.1.3 (2026-03-05 00:04:35 ago) Sep 25 2013 compare

scrapegraphai new

23,278 1.76.0 (2026-04-09 09:41:03 ago) Jan 15 2024 compare

req

4,781 v3.57.0 (2025-12-16 09:07:40 ago) Nov 20 2023 compare

goquery

14,926 v1.12.0 (2026-03-15 16:28:52 ago) Aug 29 2016 compare

cascadia

754 Start (2018-02-20 18:47:44 ago) Feb 20 2018 compare

resty

11,632 v2.17.2 (2026-02-14 22:43:18 ago) Aug 05 2024 compare

htmlquery

781 v1.3.6 (2026-03-06 04:46:15 ago) Feb 07 2019 compare

ferret

5,964 v2.0.0-alpha.7 (2026-04-07 15:33:51 ago) Oct 28 2020 compare

xpath

739 v1.3.6 (2026-02-23 07:10:29 ago) Jun 08 2019 compare

gocrawl

2,053 (2021-05-19 15:14:49 ago) Nov 20 2016 compare

soup

2,227 v1.2.5 (2022-01-16 14:36:54 ago) Apr 29 2017 compare

nestful

505 1.1.4 (2020-02-07 22:04:51 ago) Apr 20 2010 compare

chompjs

218 1.4.0 (2025-08-04 21:07:54 ago) Jul 30 2007 compare

scrapyd

3,087 1.6.0 (2025-07-22 06:00:53 ago) Sep 04 2013 compare

botasaurus new

4,321 4.0.97 (2026-01-06 07:45:54 ago) Oct 01 2023 compare

hrequests

1,001 0.9.2 (2024-12-01 02:55:27 ago) Feb 23 2022 compare

requests

3,577 v2.0.17 (2025-12-12 17:47:19 ago) Oct 06 2013 compare

html5-parser

700 0.4.12 (2023-11-19 15:09:54 ago) Jun 03 2007 compare

node-crawler

6,790 2.0.2 (2025-05-28 09:36:01 ago) Sep 10 2012 compare

panther

3,062 v2.4.0 (2026-01-08 05:29:21 ago) Jul 17 2018 compare

goutte new

9,215 v4.0.3 (2023-04-01 09:05:33 ago) Dec 02 2012 compare

buzz

1,924 1.3.0 (2024-09-23 13:16:34 ago) Nov 11 2011 compare

gazpacho

768 1.1 (2020-10-09 12:50:18 ago) Dec 28 2012 compare

gracy

248 1.34.0 (2024-11-27 14:57:34 ago) Feb 05 2023 compare

httpful

1,803 1.0.0 (2024-05-01 11:33:16 ago) Apr 14 2012 compare

embed

2,103 v4.4.15 (2025-01-02 16:53:09 ago) Oct 26 2013 compare

spidr

835 0.7.2 (2025-02-03 07:58:27 ago) Jul 25 2009 compare

kimurai new

1,098 2.2.0 (2026-01-27 17:36:19 ago) Aug 23 2018 compare

scrapydweb

3,400 1.6.0 (2025-02-16 13:18:50 ago) Sep 30 2018 compare

chopper

23 0.6.0 (2023-04-26 10:16:25 ago) Jul 24 2014 compare

photon

12,807 1.1.9 (2018-10-21 03:39:17 ago) Aug 24 2018 compare

wombat

1,360 3.3.0 (2026-04-07 16:31:34 ago) Dec 27 2011 compare

autoscraper

7,136 1.1.14 (2022-07-17 17:20:09 ago) Jul 26 2019 compare

simple-html-dom new

- 2.0-RC2 (2019-11-09 15:42:50 ago) Nov 09 2019 compare

roach

1,454 v3.2.1 (2025-03-21 06:53:36 ago) Dec 27 2021 compare

gerapy

3,495 0.9.13 (2023-07-19 18:53:46 ago) Jul 04 2017 compare

ruia

1,743 0.8.5 (2022-09-06 08:54:56 ago) Oct 17 2018 compare

ayakashi

217 1.0.0-beta8.4 (2023-06-29 12:37:12 ago) Apr 18 2019 compare

http.rb

3,104 0.17.0 (2026-03-25 01:20:02 ago) Mar 20 2015 compare

phpscraper

583 3.0.0 (2024-04-09 15:34:59 ago) May 04 2020 compare

dude

425 0.1.3 (2023-08-01 20:28:33 ago) Feb 20 2022 compare

php-spider

1,341 v0.7.6 (2025-12-04 15:08:06 ago) Mar 16 2013 compare

crwlr-crawler

369 v3.5.6 (2026-01-05 11:13:18 ago) Apr 18 2022 compare

firecrawl new

- 0.0.0 (2025-03-15 00:00:00 ago) Apr 01 2024 compare