
parse5 vs xpath

MIT 33 7 3,698
147.5 million (month) Jul 03 2013 7.2.1(9 months ago)
699 2 13 MIT
Jun 08 2019 58.1 thousand (month) v1.3.3(8 months ago)

parse5 is a Node.js library for parsing and serializing HTML documents. It follows the WHATWG HTML parsing specification, so it handles malformed markup the same way browsers do. It is designed to be fast and flexible, and it is commonly used in web scraping and web development projects.

parse5 is used by popular libraries such as Angular, Lit, Cheerio and many more. Unlike Cheerio, parse5 is a low-level HTML parsing library: it exposes the parsed tree as plain objects and has no selector engine of its own. It can still be used directly in web scraping when a higher-level abstraction is not needed.
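
To show what "low level" means in practice, here is a minimal sketch of the helpers a scraper typically writes on top of a parse5-style tree. To stay dependency-free it walks a hand-built tree that mimics the node shape of parse5's default tree adapter (nodeName, tagName, attrs, childNodes, value) rather than real parser output. The helper names findAll, textOf and getAttr are hypothetical and not part of parse5.

```javascript
// Hand-built stand-in for a parsed document. The field names follow
// parse5's default tree adapter, but this literal is an assumption made
// for illustration, not actual parse5 output.
const tree = {
  nodeName: "#document",
  childNodes: [
    {
      nodeName: "html", tagName: "html", attrs: [],
      childNodes: [
        { nodeName: "head", tagName: "head", attrs: [], childNodes: [] },
        {
          nodeName: "body", tagName: "body", attrs: [],
          childNodes: [
            {
              nodeName: "p", tagName: "p", attrs: [],
              childNodes: [{ nodeName: "#text", value: "Hello, World!" }],
            },
            {
              nodeName: "a", tagName: "a",
              attrs: [{ name: "href", value: "http://example.com" }],
              childNodes: [{ nodeName: "#text", value: "Example" }],
            },
          ],
        },
      ],
    },
  ],
};

// Depth-first search for every element with the given tag name.
function findAll(node, tagName, found = []) {
  if (node.tagName === tagName) found.push(node);
  for (const child of node.childNodes || []) findAll(child, tagName, found);
  return found;
}

// Concatenate the values of all text nodes at or below a node.
function textOf(node) {
  if (node.nodeName === "#text") return node.value;
  return (node.childNodes || []).map(textOf).join("");
}

// Look up an attribute in the list of { name, value } pairs.
function getAttr(node, name) {
  const attr = (node.attrs || []).find((a) => a.name === name);
  return attr ? attr.value : undefined;
}

// Collect the text and target of every link in the tree.
const links = findAll(tree, "a").map((a) => ({
  text: textOf(a),
  href: getAttr(a, "href"),
}));
console.log(links);
```

With the real library, the same helpers should work on the object returned by parse5.parse(), assuming the default tree adapter is in use. Libraries such as Cheerio exist to replace this kind of hand-written traversal with selectors.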

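xpath is a Go library for selecting nodes with XPath expressions. The package itself is document-agnostic: it evaluates expressions against any tree exposed through its NodeNavigator interface. For HTML it is usually paired with htmlquery, which builds on the golang.org/x/net/html parser (an extension package, not part of the Go standard library). XPath can express queries that CSS selectors cannot, such as matching on text content or navigating from a node up to its ancestors.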

Example Use


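const parse5 = require("parse5");

// Parse a string into a document tree
const document = parse5.parse('<html><body>Hello World!</body></html>');
console.log(document);

// The tree is made of plain JavaScript objects and can be traversed directly.
// The parser always creates <html>, <head> and <body>, so the layout is:
// document -> html -> [head, body]
const html = document.childNodes[0];
const body = html.childNodes[1];
console.log(body.childNodes[0].value); // "Hello World!"

// The tree can also be modified. Nodes have no DOM methods such as
// appendChild(), so attach the new node by updating childNodes and parentNode.
const fragment = parse5.parseFragment('<p>New Element</p>');
const newElement = fragment.childNodes[0];
newElement.parentNode = body;
body.childNodes.push(newElement);
console.log(parse5.serialize(document));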
package main

import (
  "fmt"
  "strings"

  "github.com/antchfx/htmlquery"
  "github.com/antchfx/xpath"
)

func main() {
  // Create an HTML string (named "page" so it does not shadow a package name)
  page := `<html>
        <body>
          <div id="content">
            <p>Hello, World!</p>
            <a href="http://example.com">Example</a>
          </div>
        </body>
      </html>`

  // Parse the HTML string into a node tree.
  // htmlquery wraps golang.org/x/net/html and returns an *html.Node.
  doc, err := htmlquery.Parse(strings.NewReader(page))
  if err != nil {
    fmt.Println("Error:", err)
    return
  }

  // Compile the XPath expression
  expr, err := xpath.Compile("//p")
  if err != nil {
    fmt.Println("Error:", err)
    return
  }

  // Select matching nodes. xpath works on a NodeNavigator,
  // which htmlquery provides for HTML node trees.
  iter := expr.Select(htmlquery.CreateXPathNavigator(doc))
  if iter.MoveNext() {
    fmt.Println(iter.Current().Value())
    // > Hello, World!
  }
}
