
geziyor vs mechanize

MPL-2.0 30 1 2,772
Jun 06 2019 2026-04-11(2026-04-11 21:30:25 ago)
4,440 8 6 MIT
Jul 25 2009 213.1 thousand (month) 2.14.0(2025-01-05 18:30:46 ago)

Geziyor is a fast web crawling and web scraping framework for Go. It crawls websites and extracts structured data from them, and is useful for data mining, monitoring, and automated testing.

Features:

  • JS Rendering
  • 5,000+ Requests/Sec
  • Caching (Memory/Disk/LevelDB)
  • Automatic Data Exporting (JSON, CSV, or custom)
  • Metrics (Prometheus, Expvar, or custom)
  • Limit Concurrency (Global/Per Domain)
  • Request Delays (Constant/Randomized)
  • Cookies, Middlewares, robots.txt
  • Automatic response decoding to UTF-8
  • Proxy management (Single, Round-Robin, Custom)
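As a rough illustration of the "Round-Robin" proxy option in the list above, here is a minimal sketch built only on Go's standard library. It is not Geziyor's own implementation; the helper name `roundRobinProxy` and the proxy hostnames are assumptions made for the example. The returned function has the signature that `http.Transport.Proxy` expects, so each outgoing request would be routed through the next proxy in the list.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"sync/atomic"
)

// roundRobinProxy (hypothetical helper) returns a function suitable for
// http.Transport.Proxy that cycles through the given proxy URLs in order.
func roundRobinProxy(rawURLs ...string) (func(*http.Request) (*url.URL, error), error) {
	if len(rawURLs) == 0 {
		return nil, fmt.Errorf("at least one proxy URL is required")
	}
	proxies := make([]*url.URL, 0, len(rawURLs))
	for _, raw := range rawURLs {
		u, err := url.Parse(raw)
		if err != nil {
			return nil, err
		}
		proxies = append(proxies, u)
	}
	var counter uint64
	return func(*http.Request) (*url.URL, error) {
		// The atomic counter keeps the rotation safe under concurrent requests.
		n := atomic.AddUint64(&counter, 1) - 1
		return proxies[n%uint64(len(proxies))], nil
	}, nil
}

func main() {
	// Placeholder proxy addresses, not real servers.
	pick, err := roundRobinProxy("http://proxy-a.example:8080", "http://proxy-b.example:8080")
	if err != nil {
		panic(err)
	}
	// A client would plug it in as:
	//   &http.Client{Transport: &http.Transport{Proxy: pick}}
	for i := 0; i < 3; i++ {
		u, _ := pick(nil)
		fmt.Println(u.Host)
	}
}
```

The selection function ignores the request itself, so the rotation is global rather than per domain; a per-domain variant would key the counter on the request host.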

Mechanize is a Ruby library for automating interaction with websites. It automatically stores and sends cookies, follows redirects, and can submit forms, so it behaves like a web browser without needing an actual browser engine.

Key features include:

  • Automatic cookie management: Stores cookies received from servers and sends them back on subsequent requests, maintaining session state across multiple pages.
  • Form handling: Can find, fill in, and submit HTML forms programmatically. Supports text inputs, selects, checkboxes, radio buttons, and file uploads.
  • Link following: Navigate through pages by clicking links using their text content, CSS selectors, or href patterns.
  • History and back/forward: Maintains a browsing history, allowing you to go back and forward through visited pages.
  • HTTP authentication: Supports basic and digest HTTP authentication.
  • Proxy support: Can route requests through HTTP proxies.
  • Redirect handling: Automatically follows HTTP redirects (configurable).

Mechanize is one of the oldest and most established web interaction libraries in Ruby. It is best suited for scraping traditional server-rendered websites with forms and multi-page workflows. For JavaScript-heavy sites, a browser automation tool like Selenium or Playwright is recommended instead.

Highlights


popular, production

Example Use


```go
// This example extracts all quotes from quotes.toscrape.com and exports them to a JSON file.
package main

import (
	"github.com/PuerkitoBio/goquery"
	"github.com/geziyor/geziyor"
	"github.com/geziyor/geziyor/client"
	"github.com/geziyor/geziyor/export"
)

func main() {
	geziyor.NewGeziyor(&geziyor.Options{
		StartURLs: []string{"http://quotes.toscrape.com/"},
		ParseFunc: quotesParse,
		Exporters: []export.Exporter{&export.JSON{}},
	}).Start()
}

// quotesParse exports every quote on the page, then follows the "next" link if there is one.
func quotesParse(g *geziyor.Geziyor, r *client.Response) {
	r.HTMLDoc.Find("div.quote").Each(func(i int, s *goquery.Selection) {
		g.Exports <- map[string]interface{}{
			"text":   s.Find("span.text").Text(),
			"author": s.Find("small.author").Text(),
		}
	})
	if href, ok := r.HTMLDoc.Find("li.next > a").Attr("href"); ok {
		g.Get(r.JoinURL(href), quotesParse)
	}
}
```
```ruby
require 'mechanize'

agent = Mechanize.new

# Navigate to a page
page = agent.get('https://example.com')
puts page.title

# Find and click a link
page = page.link_with(text: 'Products').click

# Extract data from the page
page.search('.product').each do |product|
  name = product.at('.name').text
  price = product.at('.price').text
  puts "#{name}: #{price}"
end

# Fill in and submit a login form
login_page = agent.get('https://example.com/login')
form = login_page.form_with(action: '/login')
form['username'] = 'user@example.com'
form['password'] = 'password123'
dashboard = agent.submit(form)

# Cookies are maintained automatically
puts dashboard.title # "Dashboard"

# Download a file
agent.get('https://example.com/report.csv').save('report.csv')
```
