youtube-dl vs cloudscraper

Unlicense 4121 30 140,026

230.8 thousand (month) Feb 22 2012 2021.12.17(2021-12-16 19:02:14 ago)

6,431 2 36 MIT

Dec 28 2012 4.3 million (month) 1.2.71(2023-04-25 23:20:15 ago)

youtube-dl is a command-line utility and Python library for downloading multimedia content from many websites, including YouTube, Vimeo, and TikTok. It supports a wide range of video and audio formats and can download both live streams and on-demand videos. Because it is written in Python, it can be integrated into other Python projects. youtube-dl also contains open-source scrapers for hundreds of websites, which makes it a good educational resource for learning how popular websites can be scraped.

cloudscraper is a simple Python module to bypass Cloudflare's anti-bot page (also known as "I'm Under Attack Mode", or IUAM), implemented with Requests. Cloudflare changes its techniques periodically, so the module has to be updated regularly to keep working.

This can be useful if you wish to scrape or crawl a website protected by Cloudflare. Cloudflare's anti-bot page currently just checks whether the client supports JavaScript, though additional techniques may be added in the future.

Because Cloudflare continually changes and hardens its protection page, cloudscraper requires a JavaScript engine or interpreter to solve the JavaScript challenges. This allows the script to impersonate a regular web browser without explicitly deobfuscating and parsing Cloudflare's JavaScript.

For reference, this is the default message Cloudflare uses for these sorts of pages:

```
Checking your browser before accessing website.com.

This process is automatic. Your browser will redirect to your requested content shortly.

Please allow up to 5 seconds...
```
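The default message quoted above can serve as a rough signal that a response is the anti-bot interstitial rather than the real page. The following is a minimal sketch of that idea using only the standard library. The helper name `looks_like_iuam_page` and the marker tuple are hypothetical, not part of cloudscraper, and a site with a customized challenge page would not match.

```python
# Phrases taken from the default interstitial text quoted above.
# Assumption: the site uses Cloudflare's default wording.
IUAM_MARKERS = (
    "Checking your browser before accessing",
    "Please allow up to 5 seconds",
)


def looks_like_iuam_page(body: str) -> bool:
    """Heuristic: True if the body contains every default IUAM phrase."""
    lowered = body.lower()
    return all(marker.lower() in lowered for marker in IUAM_MARKERS)


if __name__ == "__main__":
    challenge = (
        "<html><body>Checking your browser before accessing website.com. "
        "This process is automatic. Please allow up to 5 seconds...</body></html>"
    )
    print(looks_like_iuam_page(challenge))               # True
    print(looks_like_iuam_page("<html>Hello</html>"))    # False
```

A check like this is only useful for logging or deciding whether to retry through a scraper session. It does not solve the challenge itself.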

Any script using cloudscraper will sleep for about 5 seconds on its first visit to any site with Cloudflare's anti-bot page enabled; no delay occurs after the first request.
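Since the delay applies only to the first request, it makes sense to create one scraper and reuse it for every page on a site instead of building a new one per request. The sketch below shows that pattern with a hypothetical helper, `fetch_all`, which accepts any session-like object exposing `get(url)` and returning a response with a `.text` attribute. That is the same interface used in the cloudscraper example on this page.

```python
from typing import Dict, Iterable


def fetch_all(session, urls: Iterable[str]) -> Dict[str, str]:
    """Fetch every URL through the same session and return {url: body text}.

    Reusing one session means any one-time challenge delay is paid only
    on the first request instead of once per URL.
    """
    pages = {}
    for url in urls:
        response = session.get(url)  # same session object for every request
        pages[url] = response.text
    return pages
```

With cloudscraper installed, this could be called as `fetch_all(cloudscraper.create_scraper(), ["http://somesite.com/a", "http://somesite.com/b"])`. The URLs here are placeholders.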

cloudscraper is a great introduction to JavaScript fingerprinting and challenge-based scraper blocking, and it remains a useful educational tool even though it does not always work.

Highlights

popular, complex

Example Use

CLI:

```shell
$ youtube-dl 'https://www.youtube.com/watch?t=4&v=BaW_jenozKc'
```

Library:

```python
import youtube_dl

# define the download options
options = {
    'outtmpl': '%(title)s.%(ext)s',
    'format': 'best',
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'mp3',
        'preferredquality': '192',
    }]
}

# download the video
with youtube_dl.YoutubeDL(options) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=dQw4w9WgXcQ'])
```
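The `outtmpl` value in the library example is a Python `%`-style format string keyed by metadata field names. As a rough illustration of how such a template expands, the sketch below applies it to a hand-written dictionary. The metadata values are hypothetical, and youtube-dl itself does more than this (for example, sanitizing filenames), so treat it as an approximation rather than the library's actual code path.

```python
# Hypothetical metadata; in youtube-dl these values come from the site extractor.
info = {"title": "Example Video", "ext": "mp4"}

# Same template as the 'outtmpl' option in the library example.
outtmpl = "%(title)s.%(ext)s"

# Plain Python %-formatting with a mapping fills in the named fields.
filename = outtmpl % info
print(filename)  # Example Video.mp4
```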

```python
import cloudscraper

scraper = cloudscraper.create_scraper()  # returns a CloudScraper instance
# Or: scraper = cloudscraper.CloudScraper()  # CloudScraper inherits from requests.Session
print(scraper.get("http://somesite.com").text)  # => "..."
```
