
ayakashi vs puppeteer

                    ayakashi                                puppeteer
License             AGPL-3.0-only                           Apache-2.0
Repo stats          8 / 1 / 205                             88,173 / 29 / 282
Downloads (month)   58                                      16.1 million
Created             Apr 18 2019                             Mar 23 2013
Latest version      1.0.0-beta8.4 (1 year, 2 months ago)    23.3.0 (6 days ago)

Ayakashi is a web scraping library for Node.js that allows developers to easily extract structured data from websites. It is built on top of the popular "puppeteer" library and provides a simple and intuitive API for defining and querying the structure of a website.

Features:

  • Powerful querying and data models
    Ayakashi finds things in the page and uses them through props and domQL. Directly inspired by the relational database world (and SQL), domQL makes DOM access easy and readable no matter how obscure the page's structure is. Props package domQL expressions as reusable structures that can be passed around to actions or used as models for data extraction.
  • High-level built-in actions
    Ready-made actions so you can focus on what matters. Easily handle infinite scrolling, single-page navigation, events and more. Plus, you can always build your own actions, either from scratch or by composing other actions.
  • Preload code on pages
    Need to include a bunch of code, a library you made or a 3rd party module and make it available on a page? Preloaders have you covered.
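To make the relational flavor of domQL concrete, here is a toy, pure-JavaScript sketch of how a domQL-style where clause could be evaluated against element-like objects. This is illustrative only, not ayakashi's actual engine: the elements, the matches helper, and the small eq/like operator set are all hypothetical stand-ins.

```javascript
// Toy evaluator for a domQL-style "where" clause.
// Each "element" is a plain object; eq and like are a small,
// hypothetical subset of the operators a real engine would support.
const operators = {
    eq: (actual, expected) => actual === expected,
    like: (actual, expected) =>
        typeof actual === "string" && actual.includes(expected)
};

function matches(element, where) {
    // every {attribute: {operator: value}} pair must hold
    return Object.entries(where).every(([attr, conditions]) =>
        Object.entries(conditions).every(([op, value]) =>
            operators[op](element[attr], value)
        )
    );
}

// Simulated DOM snapshot
const elements = [
    {tag: "div", class: "product-item", text: "Widget"},
    {tag: "div", class: "sidebar", text: "Links"},
    {tag: "li", class: "product-item", text: "Gadget"}
];

const products = elements.filter(el =>
    matches(el, {class: {eq: "product-item"}})
);
console.log(products.map(el => el.text)); // [ 'Widget', 'Gadget' ]
```

The same declarative shape ({class: {eq: "product-item"}}) appears in the ayakashi example further down; the point is that queries describe what to match rather than how to walk the DOM.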

Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It allows you to automate browser tasks such as generating screenshots, creating PDFs, and testing web pages by simulating user interactions.

Puppeteer is commonly used for web scraping, end-to-end testing, and browser automation.

Puppeteer is one of the most popular browser automation toolkits, though it is only available for Node.js. Its asynchronous API makes it straightforward to scale automation work by running many operations concurrently.
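The scaling point can be illustrated without a browser: because every page operation returns a promise, independent tasks can be driven concurrently with Promise.all. In this sketch, scrapePage is a hypothetical stand-in for real puppeteer calls (its setTimeout merely simulates network latency).

```javascript
// Stand-in for a real puppeteer task (e.g. page.goto + data extraction);
// setTimeout simulates network latency.
function scrapePage(url) {
    return new Promise(resolve =>
        setTimeout(() => resolve({url, title: `Title of ${url}`}), 100)
    );
}

async function scrapeAll(urls) {
    // All pages run concurrently: total time is roughly that of the
    // slowest page, not the sum of all pages.
    return Promise.all(urls.map(scrapePage));
}

scrapeAll(["https://example.com/a", "https://example.com/b"])
    .then(results => console.log(results.map(r => r.title)));
```

With a real puppeteer script, each task would typically open its own page (or reuse a pool of pages) from a single launched browser.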

Example Use


const ayakashi = require("ayakashi");

(async () => {
    const myAyakashi = ayakashi.init();

    // navigate the browser
    await myAyakashi.goTo("https://example.com/product");

    // parse the HTML:
    // first, define a selector (a prop)
    myAyakashi
        .select("productList")
        .where({class: {eq: "product-item"}});

    // then run the extraction against the current page
    const productList = await myAyakashi.extract("productList");
    console.log(productList);
})();

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    // go to pages
    await page.goto('https://www.example.com');
    // take a screenshot
    await page.screenshot({path: 'example.png'});
    // fill in the form
    await page.type('input[name="name"]', 'John Doe');
    await page.type('input[name="email"]', 'johndoe@example.com');
    await page.select('select[name="country"]', 'US');

    // submit the form and wait for the resulting navigation;
    // combined with Promise.all so a fast navigation is not missed
    await Promise.all([
        page.waitForNavigation(),
        page.click('button[type="submit"]'),
    ]);

    // take a screenshot
    await page.screenshot({path: 'form-submission.png'});

    await browser.close();
})();
