xpath vs parse5
xpath (github.com/antchfx/xpath) is a Go library for compiling and evaluating XPath expressions. It is not part of the Go standard library: to select elements from an HTML document it is combined with the golang.org/x/net/html parser and a node navigator such as the one provided by the companion htmlquery package. XPath can express queries that CSS selectors cannot, such as matching on text content or navigating from an element to its parent.
parse5 is a Node.js library for parsing and serializing HTML documents according to the WHATWG HTML specification. It is designed to be fast and spec-compliant, and it is commonly used in web scraping and web development projects.
parse5 is used by popular libraries such as Angular, Lit and Cheerio. Unlike Cheerio, parse5 is a low-level HTML parsing library: it returns a plain tree of nodes without a query API, which can be enough for web scraping when no higher-level abstraction is needed.
Example Use
// Go: select elements with an XPath expression

package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xpath"
	"golang.org/x/net/html"
)

func main() {
	// Create an HTML string (named htmlStr so it does not shadow the html package)
	htmlStr := `<html>
<body>
<div id="content">
<p>Hello, World!</p>
<a href="http://example.com">Example</a>
</div>
</body>
</html>`
	// Parse the HTML string into a node tree
	doc, err := html.Parse(strings.NewReader(htmlStr))
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	// Compile the XPath expression
	expr, err := xpath.Compile("//p")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	// Use the Select method to iterate over matching nodes; htmlquery
	// provides the navigator that adapts *html.Node to the xpath package
	nodes := expr.Select(htmlquery.CreateXPathNavigator(doc))
	if nodes.MoveNext() {
		fmt.Println(nodes.Current().Value())
		// > Hello, World!
	}
}
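// Node.js: parse, traverse and modify a document with parse5
const parse5 = require("parse5");
// parse string
const document = parse5.parse('<html><body>Hello World!</body></html>');
console.log(document);
// html tree can be traversed as javascript object:
// document -> <html> -> [<head>, <body>] (the parser inserts the implied <head>)
const html = document.childNodes[0];
const body = html.childNodes[1];
console.log(body.childNodes[0].value); // "Hello World!"
// and modified; nodes are plain objects without DOM methods such as
// appendChild, so update childNodes and parentNode directly
const newElement = parse5.parseFragment('<p>New Element</p>').childNodes[0];
newElement.parentNode = body;
body.childNodes.push(newElement);
console.log(parse5.serialize(document));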
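
Because parse5 returns plain objects rather than a queryable DOM, using it directly for scraping usually means writing a few small traversal helpers. The following is a minimal sketch of that idea, not part of the parse5 API: the helper names findAll, textOf and attr are hypothetical, and it assumes the node shape produced by parse5's default tree adapter (nodeName, childNodes, attrs as a list of name/value pairs, and value on "#text" nodes). The tree constant is a hand-built stand-in for parser output so that the snippet runs without any dependencies.

```javascript
// Depth-first search for all nodes with the given nodeName.
function findAll(node, tagName, found = []) {
  if (node.nodeName === tagName) found.push(node);
  // Text, comment and doctype nodes have no childNodes.
  for (const child of node.childNodes || []) {
    findAll(child, tagName, found);
  }
  return found;
}

// Concatenate the values of all "#text" descendants of a node.
function textOf(node) {
  if (node.nodeName === "#text") return node.value;
  return (node.childNodes || []).map(textOf).join("");
}

// Look up an attribute in an element's attrs list of { name, value } pairs.
function attr(node, name) {
  const match = (node.attrs || []).find((a) => a.name === name);
  return match ? match.value : undefined;
}

// Hand-built stand-in (assumption) mirroring the default tree adapter's shape for:
// <div id="content"><p>Hello, World!</p><a href="http://example.com">Example</a></div>
const tree = {
  nodeName: "div",
  attrs: [{ name: "id", value: "content" }],
  childNodes: [
    {
      nodeName: "p",
      attrs: [],
      childNodes: [{ nodeName: "#text", value: "Hello, World!" }],
    },
    {
      nodeName: "a",
      attrs: [{ name: "href", value: "http://example.com" }],
      childNodes: [{ nodeName: "#text", value: "Example" }],
    },
  ],
};

const paragraphs = findAll(tree, "p");
const links = findAll(tree, "a");
console.log(textOf(paragraphs[0]));
console.log(attr(links[0], "href"));
```

In real use, the object returned by parse5.parse or parse5.parseFragment would be passed in place of tree. Helpers like these cover roughly what the //p and //a/@href XPath queries do in the Go example, which is the trade-off of working at this lower level.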