Skip to content

nested-lookupvsjmespath

Public Domain - 2 209
459.5 thousand (month) Feb 09 2022 0.2.25(1 year, 8 months ago)
2,053 2 55 MIT
1.0.1(1 year, 9 months ago) Feb 09 2022 179.6 million (month)

nested-lookup is a convenient way to parse multi-depth JSON documents which are often encountered in web scraping. Using nested-lookup we can easily extract deeply nested data-field just by providing key value.

The library provides a number of functions for searching and extracting data from nested dictionaries, including:

  • nested_lookup: search for a key within a nested dictionary and returns the associated value.
  • nested_update: update a key-value pair within a nested dictionary.
  • nested_has: check if a key exists within a nested dictionary.
  • nested_values: returns all the values within a nested dictionary, including values within nested dictionaries.

The library is designed to be flexible and can work with dictionaries of any size and structure, making it a useful tool for working with complex and nested data structures.

JMESPath (pronounced “james path”) allows you to declaratively specify how to extract elements from a JSON document.

In web scraping, jmespath is a powerful tool for parsing and reshaping large JSON datasets. Jmespath is fast and easily extendible following it's own powerful query language.

For more see the Json parsing introduction section.

Example Use


from nested_lookup import nested_lookup

my_document = {
   "name" : "Rocko Ballestrini",
   "email_address" : "test1@example.com",
   "other" : {
       "secondary_email" : "test2@example.com",
       "EMAIL_RECOVERY" : "test3@example.com",
       "email_address" : "test4@example.com",
    },
}

# retrieving all keys can be useful in dataset overview
from nested_lookup import get_all_keys
get_all_keys(my_document)
['name', 'email_address', 'other', 'secondary_email', 'EMAIL_RECOVERY', 'email_address']

# key/value stats can also be useful for data overview: 
from nested_lookup import get_occurrence_of_key, get_occurrence_of_value, get_occurrences_and_values
data = {"products": [{"category": "t-shirt"},{"category": "underwear"},{"category": "t-shirt"}]}

get_occurrence_of_key(data, key='category')
3
get_occurrence_of_value(data, value='t-shirt')
2
get_occurrences_and_values([data], "t-shirt")  # count t-shirt products
{
  't-shirt': {
    'occurrences': 2,
    'values': [{'category': 't-shirt'}, {'category': 't-shirt'}]
    }
  }

# it can also be used to delete/alter values:
from nested_lookup import nested_alter
data = {"products": [{"price": 10}, {"price": 14}]}

nested_alter(data, "price", lambda price: price * 1.4)
{'products': [{'price': 14.0}, {'price': 19.599999999999998}]}

nested_delete(data, "price")
{'products': [{}, {}]}
import jmespath

data = {
    "data": {
        "info": {
            "products": [
                {"price": {"usd": 1}, "_type": "product", "id": "123"}, 
                {"price": {"usd": 2}, "_type": "product", "id": "345"}
            ]
        }
    }
}

# easily reshape nested dataset to flat structure:
jmespath.search("data.info.products[*].{id:id, price:price.usd}", data)
[{'id': '123', 'price': 1}, {'id': '345', 'price': 2}]

Alternatives / Similar