mxj vs nested-lookup
mxj is a Go library for working with JSON and XML data. It allows you to convert between JSON and XML, merge JSON and XML documents, and extract values from JSON and XML using a simple and intuitive API.
One of the main features of mxj is that it decodes JSON and XML into a generic map, so nested values can be read with dot-notation paths such as address.city instead of being declared up front as structs.
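To make the dot-notation idea concrete, here is a minimal sketch of how such a path can be resolved against nested data. It is written in Python for brevity and is not mxj's implementation: `value_for_path` is a hypothetical helper, and the convention of using numeric segments (such as `phones.0`) to index lists is an assumption of this sketch, not mxj's path syntax.

```python
from typing import Any


def value_for_path(doc: Any, path: str) -> Any:
    """Follow a dot-separated path through nested dicts and lists.

    Hypothetical helper for illustration only.
    """
    current = doc
    for part in path.split("."):
        if isinstance(current, dict):
            current = current[part]       # raises KeyError if the key is absent
        elif isinstance(current, list):
            current = current[int(part)]  # numeric segments index into lists
        else:
            raise KeyError(part)          # cannot descend into a scalar
    return current


doc = {
    "name": "John Doe",
    "address": {"city": "Anytown"},
    "phones": ["555-555-5555"],
}
print(value_for_path(doc, "address.city"))  # Anytown
print(value_for_path(doc, "phones.0"))      # 555-555-5555
```

The caller has to know the full path to the value in advance; that is the main contrast with a key-based search that ignores depth.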
nested-lookup is a Python library that offers a convenient way to parse multi-depth JSON documents, which are often encountered in web scraping. With nested-lookup we can extract a deeply nested data field just by providing its key.
The library provides a number of functions for searching and extracting data from nested dictionaries, including:
nested_lookup: searches for a key within a nested dictionary and returns the associated value.
nested_update: updates a key-value pair within a nested dictionary.
nested_has: checks if a key exists within a nested dictionary.
nested_values: returns all the values within a nested dictionary, including values within nested dictionaries.
The library is designed to be flexible and can work with dictionaries of any size and structure, making it a useful tool for working with complex and nested data structures.
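The key-based search described above can be pictured as a recursive walk over dictionaries and lists. The following is a minimal sketch under that assumption, using only the standard library; `find_key_values` is a hypothetical name, and this is not the library's actual code.

```python
from typing import Any, List


def find_key_values(key: str, document: Any) -> List[Any]:
    """Collect every value stored under `key`, at any depth.

    Hypothetical helper for illustration only.
    """
    found: List[Any] = []
    if isinstance(document, dict):
        for k, v in document.items():
            if k == key:
                found.append(v)
            # keep descending: the value itself may hold more matches
            found.extend(find_key_values(key, v))
    elif isinstance(document, list):
        for item in document:
            found.extend(find_key_values(key, item))
    return found


doc = {
    "email_address": "test1@example.com",
    "other": {"email_address": "test4@example.com"},
}
print(find_key_values("email_address", doc))
# ['test1@example.com', 'test4@example.com']
```

Because the walk ignores where a key sits in the document, the caller needs only the key name, not the path leading to it.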
Example Use
package main

import (
	"fmt"

	"github.com/clbanning/mxj"
)

func main() {
	// Parse the JSON string
	jsonData := []byte(`
	{
		"name": "John Doe",
		"age": 30,
		"address": {
			"street": "Main St",
			"city": "Anytown",
			"state": "CA",
			"zip": "12345"
		},
		"phones": [
			"555-555-5555",
			"555-555-5556"
		]
	}
	`)
	mv, err := mxj.NewMapJson(jsonData)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}

	// Extract the name
	name, _ := mv.ValueForPath("name")
	fmt.Println("name:", name) // "John Doe"

	// Extract the city
	city, _ := mv.ValueForPath("address.city")
	fmt.Println("city:", city) // "Anytown"

	// Extract all phone numbers
	phones, _ := mv.ValuesForPath("phones")
	for _, phone := range phones {
		fmt.Println("phone:", phone)
	}
	// "555-555-5555"
	// "555-555-5556"
}
from nested_lookup import nested_lookup

my_document = {
    "name": "Rocko Ballestrini",
    "email_address": "test1@example.com",
    "other": {
        "secondary_email": "test2@example.com",
        "EMAIL_RECOVERY": "test3@example.com",
        "email_address": "test4@example.com",
    },
}
# retrieving all keys can be useful for a dataset overview:
from nested_lookup import get_all_keys
get_all_keys(my_document)
# ['name', 'email_address', 'other', 'secondary_email', 'EMAIL_RECOVERY', 'email_address']
# key/value stats can also be useful for a data overview:
from nested_lookup import get_occurrence_of_key, get_occurrence_of_value, get_occurrences_and_values
data = {"products": [{"category": "t-shirt"}, {"category": "underwear"}, {"category": "t-shirt"}]}
get_occurrence_of_key(data, key='category')
# 3
get_occurrence_of_value(data, value='t-shirt')
# 2
get_occurrences_and_values([data], "t-shirt")  # count t-shirt products
# {
#     't-shirt': {
#         'occurrences': 2,
#         'values': [{'category': 't-shirt'}, {'category': 't-shirt'}]
#     }
# }
# it can also be used to alter/delete values:
from nested_lookup import nested_alter, nested_delete
data = {"products": [{"price": 10}, {"price": 14}]}
nested_alter(data, "price", lambda price: price * 1.4)
# {'products': [{'price': 14.0}, {'price': 19.599999999999998}]}
nested_delete(data, "price")
# {'products': [{}, {}]}
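The alter step shown above can be pictured as the same kind of recursive walk, with a callback applied wherever the key matches. This is only a sketch under that assumption, not the library's code: `alter_key` is a hypothetical helper, and it mutates the document in place.

```python
from typing import Any, Callable


def alter_key(document: Any, key: str, func: Callable[[Any], Any]) -> Any:
    """Apply `func` to every value stored under `key`, mutating in place.

    Hypothetical helper for illustration only.
    """
    if isinstance(document, dict):
        for k, v in document.items():
            if k == key:
                # replacing the value of an existing key is safe while iterating
                document[k] = func(v)
            else:
                alter_key(v, key, func)
    elif isinstance(document, list):
        for item in document:
            alter_key(item, key, func)
    return document


data = {"products": [{"price": 10}, {"price": 14}]}
print(alter_key(data, "price", lambda price: price * 2))
# {'products': [{'price': 20}, {'price': 28}]}
```

Returning the mutated document as well as changing it in place lets the call be used either as a statement or as an expression.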