
xml2 vs sax-js

xml2: MIT license, 624.5 thousand downloads (month), first release Apr 20 2015, latest version 1.3.6 (9 months ago)
sax-js: ISC license, 149.3 million downloads (month), first release Feb 09 2011, latest version 1.3.0 (6 months ago)

The xml2 package is a binding to libxml2, making it easy to work with HTML and XML from R. The API is somewhat inspired by jQuery.

xml2 can be used to parse HTML documents using XPath selectors and is a successor to R's XML package with a few improvements:

  • xml2 takes care of memory management for you. It will automatically free the memory used by an XML document as soon as the last reference to it goes away.
  • xml2 has a very simple class hierarchy, so you don't need to think about exactly what type of object you have; xml2 will just do the right thing.
  • More convenient handling of namespaces in XPath expressions - see xml_ns() and xml_ns_strip() to get started.

sax-js is a streaming, event-based XML parser for Node.js. It is written in pure JavaScript and does not depend on a C library. It is designed to be fast, low-memory, and easy to use. It is commonly used for parsing large XML files because it processes the data incrementally as chunks arrive, rather than loading the entire file into memory at once.

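sax-js is a low-level parser: it emits events such as opentag, text, and closetag rather than building a tree, and it does not provide query capabilities (like CSS selectors or XPath). It can still be useful as a building block for HTML or XML tree parsing and serialization.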
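
As a rough illustration of using sax-js as a building block, the sketch below assembles a small in-memory tree from opentag, text, and closetag events and adds a naive lookup by element name. TreeBuilder, findAll, and parseToTree are hypothetical names introduced here, not part of sax-js, and parseToTree assumes the sax package is installed.

```javascript
// Hypothetical helper: builds a plain-object tree from SAX-style events.
class TreeBuilder {
  constructor() {
    this.root = null; // top-level element once parsing has started
    this.stack = [];  // currently open elements, innermost last
  }

  openTag(node) {
    const el = { name: node.name, attributes: node.attributes || {}, children: [] };
    const parent = this.stack[this.stack.length - 1];
    if (parent) {
      parent.children.push(el);
    } else {
      this.root = el;
    }
    this.stack.push(el);
  }

  text(value) {
    // Whitespace-only text is dropped; other text is stored trimmed.
    const parent = this.stack[this.stack.length - 1];
    const trimmed = value.trim();
    if (parent && trimmed !== "") {
      parent.children.push(trimmed);
    }
  }

  closeTag() {
    this.stack.pop();
  }
}

// Naive depth-first search by element name (a stand-in for a real query layer).
function findAll(node, name, out = []) {
  if (typeof node !== "object" || node === null) return out; // skip text children
  if (node.name === name) out.push(node);
  for (const child of node.children) findAll(child, name, out);
  return out;
}

// Wires the builder to a sax-js parser; requires `npm install sax`.
function parseToTree(xml) {
  const sax = require("sax");
  const parser = sax.parser(true); // true = strict mode
  const builder = new TreeBuilder();
  parser.onopentag = (node) => builder.openTag(node);
  parser.ontext = (text) => builder.text(text);
  parser.onclosetag = () => builder.closeTag();
  parser.write(xml).close();
  return builder.root;
}
```

With sax installed, findAll(parseToTree("<foo> <bar> text <baz/> </bar> </foo>"), "baz") would be one way to locate the baz element, roughly mirroring what xml_find_all() does on the xml2 side. Keeping the whole tree in memory gives up the low-memory benefit of streaming, so this only makes sense for small documents.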

Example Use


# xml2 (R)
library("xml2")
x <- read_xml("<foo> <bar> text <baz/> </bar> </foo>")
x

xml_name(x)
xml_children(x)
xml_text(x)
xml_find_all(x, ".//baz")

h <- read_html("<html><p>Hi <b>!")
h
xml_name(h)

// sax-js (Node.js)
const fs = require("fs");
const sax = require("sax");

const xmlStream = fs.createReadStream("example.xml");
const saxParser = sax.createStream(true, {});

saxParser.on("opentag", function(node) {
    console.log(`<${node.name}>`);
});

saxParser.on("closetag", function(nodeName) {
    console.log(`</${nodeName}>`);
});

saxParser.on("text", function(text) {
    console.log(text);
});

xmlStream.pipe(saxParser);
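
The stream example logs events as they arrive. A common next step is to aggregate something and to react to the end of input or to a parse error. The sketch below is one way to do that under stated assumptions: countElements is a hypothetical helper, and it accepts any emitter that fires sax-style opentag, error, and end events, so the demo uses a plain EventEmitter as a stand-in and runs without sax installed.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper: counts elements by name and resolves once input ends.
function countElements(saxStream) {
  return new Promise((resolve, reject) => {
    const counts = new Map();
    saxStream.on("opentag", (node) => {
      counts.set(node.name, (counts.get(node.name) || 0) + 1);
    });
    saxStream.on("error", reject);               // malformed XML in strict mode
    saxStream.on("end", () => resolve(counts));  // all input has been parsed
  });
}

// Stand-in emitter for demonstration; a real sax stream would take its place.
const fakeStream = new EventEmitter();
const pending = countElements(fakeStream);
fakeStream.emit("opentag", { name: "foo" });
fakeStream.emit("opentag", { name: "bar" });
fakeStream.emit("opentag", { name: "bar" });
fakeStream.emit("end");

pending.then((counts) => {
  for (const [name, n] of counts) console.log(`${name}: ${n}`);
});
```

With the real library, the same helper could be called as countElements(fs.createReadStream("example.xml").pipe(sax.createStream(true, {}))), since pipe() returns the destination stream. Only the running counts are kept in memory, which fits the incremental processing model.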

Alternatives / Similar