MIT 8 1 693
58.1 thousand (month) Feb 07 2019 v1.3.0(1 year, 2 months ago)
4,263 4 12 MIT
9.1.0(2 months ago) Aug 28 2011 127.1 million (month)

htmlquery is a Go library for parsing HTML documents and extracting data from them with XPath expressions. It provides a simple API for traversing and querying the HTML tree, and it is built on top of the golang.org/x/net/html parser and the antchfx/xpath package.

htmlparser2 is a Node.js library for parsing HTML and XML documents. At its core it is a callback-based streaming parser that emits events for opening tags, text, and closing tags. It can also build a tree of elements, similar to the Document Object Model (DOM) in web browsers, which lets you traverse and manipulate the structure of the document.

htmlparser2 is a low-level HTML parser, but it can still be useful in web scraping as a tool for restructuring and serializing HTML.
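
For readers coming from the Go side, the event-driven style that htmlparser2 exposes through its `onopentag`, `ontext`, and `onclosetag` callbacks can be sketched with Go's standard library. This is only an analogy, not htmlparser2's API: it uses the `encoding/xml` tokenizer, so it assumes well-formed markup and would reject much real-world HTML. The helper name `collectEvents` is hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// collectEvents reads well-formed markup token by token and returns one
// line per event, mimicking open-tag / text / close-tag callbacks.
func collectEvents(markup string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(markup))
	var events []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return events, nil // end of input reached
		}
		if err != nil {
			return nil, err // malformed markup
		}
		switch t := tok.(type) {
		case xml.StartElement:
			events = append(events, "Opening tag: "+t.Name.Local)
		case xml.CharData:
			// Copy the bytes: they are only valid until the next Token call.
			events = append(events, "Text: "+string(t))
		case xml.EndElement:
			events = append(events, "Closing tag: "+t.Name.Local)
		}
	}
}

func main() {
	events, err := collectEvents("<p>Hello, <b>world</b>!</p>")
	if err != nil {
		panic(err)
	}
	for _, e := range events {
		fmt.Println(e)
	}
}
```

No tree is built here: each token is handled as it arrives and then discarded, which is why streaming parsers suit large documents. A tree-based library such as htmlquery instead keeps the whole document in memory so that it can be queried afterwards.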

Example Use

package main

import (
  "fmt"
  "strings"

  "github.com/antchfx/htmlquery"
)

func main() {
  // Parse the HTML string (htmlquery.Parse expects an io.Reader)
  doc, err := htmlquery.Parse(strings.NewReader(`
    <html>
      <body>
        <h1>Hello, World!</h1>
        <ul>
          <li>Item 1</li>
          <li>Item 2</li>
          <li>Item 3</li>
        </ul>
      </body>
    </html>`))
  if err != nil {
    panic(err)
  }

  // Extract the text of the first <h1> element
  h1 := htmlquery.FindOne(doc, "//h1")
  fmt.Println(htmlquery.InnerText(h1)) // "Hello, World!"

  // Extract the text of all <li> elements
  lis := htmlquery.Find(doc, "//li")
  for _, li := range lis {
    fmt.Println(htmlquery.InnerText(li))
  }
  // "Item 1"
  // "Item 2"
  // "Item 3"
}

const htmlparser = require("htmlparser2");

// Create a streaming parser that logs each event as it occurs
const parser = new htmlparser.Parser({
    onopentag: (name, attribs) => {
        console.log(`Opening tag: ${name}`);
    },
    ontext: (text) => {
        console.log(`Text: ${text}`);
    },
    onclosetag: (name) => {
        console.log(`Closing tag: ${name}`);
    },
}, {decodeEntities: true});

const html = "<p>Hello, <b>world</b>!</p>";

// Feed the HTML to the parser and signal the end of input
parser.write(html);
parser.end();

Alternatives / Similar