jmespath vs kiba
JMESPath (pronounced “james path”) allows you to declaratively specify how to extract elements from a JSON document.
In web scraping, JMESPath is a powerful tool for parsing and reshaping large JSON datasets. The library is fast and easy to extend, and it is built around its own expressive query language.
For more, see the JSON parsing introduction section.
Kiba is a lightweight Ruby gem for processing and transforming data in an ETL (Extract, Transform, Load) pipeline. You define a set of operations (sources, transforms, and destinations), and Kiba applies them to each row, which makes it straightforward to move data between various sources and formats.
Kiba provides a simple and intuitive API for defining the pipeline, and it is built on top of the Enumerator API, which allows for easy manipulation of large datasets with low memory usage.
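
The row-by-row pipeline idea described above can be sketched in plain Python with generators. This is a conceptual analogue only, not Kiba's API: the names `run_pipeline`, `add_year`, and `adults_only` are hypothetical, and the convention that a transform returning `None` drops the row is an assumption of this sketch.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]
Transform = Callable[[Row], Optional[Row]]


def run_pipeline(rows: Iterable[Row], transforms: List[Transform], destination: List[Row]) -> None:
    """Pull rows one at a time, pass each through every transform, then load it."""
    for row in rows:
        for transform in transforms:
            row = transform(row)
            if row is None:  # assumption: None means "drop this row"
                break
        if row is not None:
            destination.append(row)


def add_year(row: Row) -> Row:
    # Return a new row rather than mutating the input.
    return {**row, "age": row["age"] + 1}


def adults_only(row: Row) -> Optional[Row]:
    return row if row["age"] >= 18 else None


# A generator expression is consumed lazily, so only one row is in flight at a time.
people = ({"name": n, "age": a} for n, a in [("Alice", 25), ("Bob", 30), ("Carol", 16)])
loaded: List[Row] = []
run_pipeline(people, [add_year, adults_only], loaded)
print(loaded)  # [{'name': 'Alice', 'age': 26}, {'name': 'Bob', 'age': 31}]
```

Because the source is consumed lazily, memory use depends on the size of a single row and of the destination, not on the size of the input.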
Example Use
import jmespath

data = {
    "data": {
        "info": {
            "products": [
                {"price": {"usd": 1}, "_type": "product", "id": "123"},
                {"price": {"usd": 2}, "_type": "product", "id": "345"}
            ]
        }
    }
}
# easily reshape nested dataset to flat structure:
jmespath.search("data.info.products[*].{id:id, price:price.usd}", data)
# [{'id': '123', 'price': 1}, {'id': '345', 'price': 2}]
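
For comparison, the JMESPath query above does roughly the same work as the following plain-Python list comprehension. This is a sketch for illustration that uses only the standard library; the variable name `flat` is arbitrary.

```python
data = {
    "data": {
        "info": {
            "products": [
                {"price": {"usd": 1}, "_type": "product", "id": "123"},
                {"price": {"usd": 2}, "_type": "product", "id": "345"},
            ]
        }
    }
}

# Walk to the product list, then keep only `id` and the nested USD price.
flat = [
    {"id": product["id"], "price": product["price"]["usd"]}
    for product in data["data"]["info"]["products"]
]
print(flat)  # [{'id': '123', 'price': 1}, {'id': '345', 'price': 2}]
```

One difference worth noting: the comprehension raises `KeyError` when a key is missing, whereas a JMESPath expression evaluates to null for a missing key.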
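require 'kiba'
# the source and destination classes below come from the kiba-common gem
require 'kiba-common/sources/enumerable'
require 'kiba-common/destinations/array'

data = [{ name: 'Alice', age: 25 }, { name: 'Bob', age: 30 }]
output = []
job = Kiba.parse do
  source Kiba::Common::Sources::Enumerable, data
  # a block transform must return the row; its return value is what flows downstream
  transform { |row| row.merge(age: row[:age] + 1) }
  destination Kiba::Common::Destinations::Array, output
end
Kiba.run(job)
# output: [{ name: 'Alice', age: 26 }, { name: 'Bob', age: 31 }]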