lxml

3,010 13 14 BSD-3-Clause

6.0.3 (9 Apr 2026) Dec 13 2022 270.5 million (month)

lxml is a low-level XML and HTML tree processor. Many other libraries, such as parsel and beautifulsoup, use it for higher-level HTML parsing.

One of the main features of lxml is its speed and efficiency. It is built on top of the libxml2 and libxslt C libraries, which are known for high performance and a low memory footprint. This makes lxml well suited to processing large and complex XML and HTML documents.
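As a rough illustration of handling large documents, the sketch below streams a document with `etree.iterparse` instead of loading it whole. The `items`/`item`/`id` element names and the generated data are hypothetical stand-ins for a large file on disk.

```python
import io

from lxml import etree

# Hypothetical feed; in practice this would be a large file on disk.
xml_bytes = (
    b"<items>"
    + b"".join(b"<item><id>%d</id></item>" % i for i in range(1000))
    + b"</items>"
)

count = 0
total = 0
# Only "end" events for <item> elements are reported.
for _event, elem in etree.iterparse(io.BytesIO(xml_bytes), events=("end",), tag="item"):
    count += 1
    total += int(elem.findtext("id"))
    # Release the processed element and its earlier siblings
    # so the in-memory tree does not keep growing.
    elem.clear()
    while elem.getprevious() is not None:
        del elem.getparent()[0]
```

Clearing elements as they are consumed is a common pattern for keeping memory use roughly constant, at the cost of not having the full tree available afterwards.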

One of the key components of lxml is the lxml.etree module, which is modeled after the ElementTree API from the Python standard library (xml.etree.ElementTree). This API provides a simple way to access and manipulate the elements and attributes of an XML or HTML document. lxml also provides an XPath engine that lets you select elements based on their names, attributes, and contents.
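A minimal sketch of the ElementTree-style API and XPath selection described above. The `catalog`, `book`, `title`, and `price` names are made up for illustration.

```python
from lxml import etree

# Build a small tree with the ElementTree-style API.
catalog = etree.Element("catalog")
book = etree.SubElement(catalog, "book", id="b1")
title = etree.SubElement(book, "title")
title.text = "Example Title"
price = etree.SubElement(book, "price", currency="USD")
price.text = "10"

# Attributes are read and written with get() and set().
book.set("lang", "en")
book_id = book.get("id")

# XPath can select by element name, attribute value, and text content.
titles = catalog.xpath("//book[@id='b1']/title/text()")
usd_prices = catalog.xpath("//price[@currency='USD' and text()='10']")

# Serialize the tree back to a string.
serialized = etree.tostring(catalog, pretty_print=True, encoding="unicode")
```

Here `titles` is a list of strings, while `usd_prices` is a list of element objects that can be inspected or modified further.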

Another feature of lxml is its support for XSLT. The library provides an easy-to-use interface for applying XSLT stylesheets to XML documents, which can be used to transform them into other formats, such as HTML, plain text, or other XML formats.
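The following sketch shows one way to apply a stylesheet with `etree.XSLT`. The `products` document and the stylesheet are invented for this example and are not taken from the lxml documentation.

```python
from lxml import etree

# Hypothetical input document.
xml_doc = etree.fromstring(
    "<products>"
    "<product><name>Widget</name><price>10</price></product>"
    "<product><name>Gadget</name><price>25</price></product>"
    "</products>"
)

# Hypothetical stylesheet that renders the products as an HTML list.
xslt_doc = etree.fromstring("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/products">
    <ul>
      <xsl:for-each select="product">
        <li><xsl:value-of select="name"/>: $<xsl:value-of select="price"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
""")

# Compile the stylesheet once, then call it on any number of documents.
transform = etree.XSLT(xslt_doc)
result = transform(xml_doc)

# The result is a tree; str() serializes it using the xsl:output settings.
html_out = str(result)
```

Compiling the stylesheet into a reusable `transform` object avoids re-parsing it for every input document.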

For web scraping, it's usually best to use a higher-level library built on lxml, such as parsel or beautifulsoup.

Highlights

low-level, fast

Example Use

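```python
from lxml import etree

# this is our HTML page (tags other than the price span are reconstructed for illustration):
html = """
<html>
<head><title>Hello World!</title></head>
<body>
  <h1>Product Title</h1>
  <p>paragraph 1</p>
  <p>paragraph2</p>
  <span class="price">$10</span>
</body>
</html>
"""

# parse with the HTML parser, which tolerates real-world markup
tree = etree.fromstring(html, parser=etree.HTMLParser())

# for selecting elements, lxml uses XPath selectors:
price = tree.xpath('//span[@class="price"]')[0].text
print(price)  # $10
```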

Alternatives / Similar

beautifulsoup

- 4.14.3 (2025-11-30 15:08:24) Jul 26 2019

xmltodict

5,734 1.0.4 (2026-02-22 02:21:21) Jul 30 2007

html5lib

1,220 1.1 (2020-06-22 23:32:36) Jul 30 2007

cssselect

309 1.4.0 (2026-01-29 07:00:24) Apr 14 2012

feedparser

2,351 6.0.12 (2025-09-10 13:33:58) Jun 15 2007

parsel

1,324 1.11.0 (2026-01-29 07:19:22) Jul 26 2019

selectolax

1,607 0.4.7 (2026-03-06 09:23:35) Mar 01 2018

pyquery

2,381 2.0.1 (2024-08-30 08:12:22) Dec 05 2008

requests-html

13,863 0.10.0 (2019-02-17 20:14:17) Feb 25 2018

untangle

632 1.2.1 (2022-07-02 14:09:28) Jun 09 2011

scrapling

36,206 0.4.5 (2026-04-07 04:22:27) Aug 01 2024

chompjs

218 1.4.0 (2025-08-04 21:07:54) Jul 30 2007

html5-parser

700 0.4.12 (2023-11-19 15:09:54) Jun 03 2007

gazpacho

768 1.1 (2020-10-09 12:50:18) Dec 28 2012

chopper

23 0.6.0 (2023-04-26 10:16:25) Jul 24 2014

Other Languages

parse5

3,886 8.0.0 (2026-02-21 19:30:52) Jul 03 2013

sax-js

1,153 1.6.0 (2026-03-17 01:32:31) Feb 09 2011

htmlparser2

4,789 12.0.0 (2026-03-20 23:08:40) Aug 28 2011

jsdom

21,552 29.0.2 (2026-04-07 03:38:38) Nov 21 2011

cheerio

30,265 1.2.0 (2026-02-21 19:30:40) Oct 08 2011

nokogiri

6,248 1.19.2 (2026-03-19 21:12:43) Jul 25 2009

xml2

223 1.5.2 (2025-12-01 15:40:00) Apr 20 2015

rvest

1,517 1.0.5 (2024-02-12 21:10:00) Nov 22 2014

html5-php

1,772 2.10.0 (2025-07-25 09:04:22) Jun 01 2013

domcrawler

4,038 v8.0.8 (2026-03-30 15:14:47) Sep 26 2011

goquery

14,926 v1.12.0 (2026-03-15 16:28:52) Aug 29 2016

cascadia

754 Start (2018-02-20 18:47:44) Feb 20 2018

htmlquery

781 v1.3.6 (2026-03-06 04:46:15) Feb 07 2019

xpath

739 v1.3.6 (2026-02-23 07:10:29) Jun 08 2019

soup

2,227 v1.2.5 (2022-01-16 14:36:54) Apr 29 2017

embed

2,103 v4.4.15 (2025-01-02 16:53:09) Oct 26 2013

simple-html-dom

- 2.0-RC2 (2019-11-09 15:42:50) Nov 09 2019

ralger

165 2.3.0 (2021-03-18 00:10:00) Dec 22 2019