chompjs
chompjs can be used in web scrapping for turning JavaScript objects embedded in pages into valid Python dictionaries.
In web scraping this is particularly useful for parsing Javascript variables like:
import chompjs
js = """
var myObj = {
myMethod: function(params) {
// ...
},
myValue: 100
}
"""
chompjs.parse_js_object(js, json_params={'strict': False})
{'myMethod': 'function(params) {\n // ...\n }', 'myValue': 100}
In practice this can be used to extract hidden JSON data like data from <script id=__NEXT_DATA__>
elements
from nextjs (and similar) websites. Unlike json.loads
command chompjs can ingest json documents that contain
javascript natives like functions making it a super easy way to scrape hidden web data objects.
Example Use
# basic use
import chompjs
js = """
var myObj = {
myMethod: function(params) {
// ...
},
myValue: 100
}
"""
chompjs.parse_js_object(js, json_params={'strict': False})
{'myMethod': 'function(params) {\n // ...\n }', 'myValue': 100}
# example how to use with hidden data parsing:
import httpx
import chompjs
from parsel import Selector
response = httpx.get("http://example.com")
hidden_script = Selector(response.text).css("script#__NEXT_DATA__::text").get()
data = chompjs.parse_js_object(hidden_script)
print(data['props'])