object-scan vs jmespath
object-scan allows traversal of complex JavaScript objects to find specific keys.
In web scraping, it's useful for parsing large, nested JSON datasets for specific data fields. object-scan can recursively find any key in any object structure:
import objectScan from 'object-scan';
const haystack = { a: { b: { c: 'd' }, e: { f: 'g' } } };
objectScan(['a.*.f'], { joined: true })(haystack);
// => [ 'a.e.f' ]
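To make the idea of a recursive key search concrete, here is a minimal sketch in plain Python that uses only the standard library, not object-scan itself. The `find_key` helper is hypothetical and belongs to neither library. It covers only the simple "match this key at any depth" case, while object-scan supports far richer patterns.

```python
from typing import Any, Iterator, List, Union

PathPart = Union[str, int]


def find_key(node: Any, target: str, path: tuple = ()) -> Iterator[List[PathPart]]:
    """Yield the path to every occurrence of `target` at any depth (hypothetical helper)."""
    if isinstance(node, dict):
        for key, value in node.items():
            current = path + (key,)
            if key == target:
                yield list(current)
            # keep descending: the same key may also appear deeper down
            yield from find_key(value, target, current)
    elif isinstance(node, list):
        # list items are addressed by their index
        for index, item in enumerate(node):
            yield from find_key(item, target, path + (index,))


haystack = {"a": {"b": {"c": "d"}, "e": {"f": "g"}}}
paths = list(find_key(haystack, "f"))
print(paths)  # [['a', 'e', 'f']]
```

The paths come back as lists of segments, which mirrors what object-scan returns when `joined` is `false`.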
JMESPath (pronounced “james path”) allows you to declaratively specify how to extract elements from a JSON document.
In web scraping, JMESPath is a powerful tool for parsing and reshaping large JSON datasets. It is fast and easily extensible, and it uses its own query language.
For more, see the JSON parsing introduction section.
Example Use
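// JavaScript: object-scan
const objectScan = require('object-scan');

const myNestedObject = {
  level1: {
    level2: {
      level3: {
        myTargetKey: 'value',
      },
    },
  },
};

// '**' matches any number of nesting levels, so the key is found at any depth
const searchTerm = 'myTargetKey';
const result = objectScan([`**.${searchTerm}`], { joined: false })(myNestedObject);
// with joined: false, each match is returned as an array of path segments
console.log(result);
// => [ [ 'level1', 'level2', 'level3', 'myTargetKey' ] ]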
# Python: jmespath
import jmespath

data = {
    "data": {
        "info": {
            "products": [
                {"price": {"usd": 1}, "_type": "product", "id": "123"},
                {"price": {"usd": 2}, "_type": "product", "id": "345"},
            ]
        }
    }
}

# easily reshape the nested dataset into a flat structure:
result = jmespath.search("data.info.products[*].{id:id, price:price.usd}", data)
print(result)
# => [{'id': '123', 'price': 1}, {'id': '345', 'price': 2}]
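For comparison, the query `data.info.products[*].{id:id, price:price.usd}` can be approximated by hand in plain Python. The sketch below uses only the standard library. `reshape_products` is a hypothetical helper name, and the code mirrors this one query rather than JMESPath semantics in general (for example, it returns an empty list where JMESPath would return null for a missing `products` key).

```python
data = {
    "data": {
        "info": {
            "products": [
                {"price": {"usd": 1}, "_type": "product", "id": "123"},
                {"price": {"usd": 2}, "_type": "product", "id": "345"},
            ]
        }
    }
}


def reshape_products(document: dict) -> list:
    """Hand-written approximation of the JMESPath projection (hypothetical helper)."""
    # data.info.products: walk down the nested keys, tolerating missing ones
    products = document.get("data", {}).get("info", {}).get("products", [])
    # [*].{id:id, price:price.usd}: build one flat dict per product
    return [
        {"id": product.get("id"), "price": product.get("price", {}).get("usd")}
        for product in products
    ]


flat = reshape_products(data)
print(flat)  # [{'id': '123', 'price': 1}, {'id': '345', 'price': 2}]
```

The hand-written version needs explicit handling for each nesting level, which is the boilerplate the declarative query string avoids.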