Skip to content

cssselectvsembed

BSD 21 8 281
8.3 million (month) Apr 14 2012 1.2.0(1 year, 5 months ago)
2,063 6 64 MIT
v4.4.10(3 months ago) Oct 26 2013 4.0 thousand (month)

cssselect is a BSD-licensed Python library to parse CSS3 selectors and translate them to XPath 1.0 expressions.

XPath 1.0 expressions can be used in lxml or another XPath engine to find the matching elements in an XML or HTML document.

cssselect is used by other popular Python packages like parsel and scrapy but can also be used on it's own to generate valid XPath 1.0 expressions for parsing HTML and XML documents in other tools.

Note that because XPath selectors are more powerful than CSS selectors this translation is only possible one way. Converting XPath to CSS selectors is impractical and not supported by cssselect.

PHP library to get information from any web page (using oembed, opengraph, twitter-cards, scrapping the html, etc). It's compatible with any web service (youtube, vimeo, flickr, instagram, etc) and has adapters to some sites like (archive.org, github, facebook, etc).

Example Use


from cssselect import GenericTranslator, SelectorError

translator = GenericTranslator()
try:
    expression = translator.css_to_xpath('div.content')
    print(expression)
    'descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' content ')]'
except SelectorError as e:
    print(f'Invalid selector {e}')
use Embed\Embed;

$embed = new Embed();

//Load any url:
$info = $embed->get('https://www.youtube.com/watch?v=PP1xn5wHtxE');

//Get content info

$info->title; //The page title
$info->description; //The page description
$info->url; //The canonical url
$info->keywords; //The page keywords

$info->image; //The thumbnail or main image

$info->code->html; //The code to embed the image, video, etc
$info->code->width; //The exact width of the embed code (if exists)
$info->code->height; //The exact height of the embed code (if exists)
$info->code->ratio; //The aspect ratio (width/height)

$info->authorName; //The resource author
$info->authorUrl; //The author url

$info->cms; //The cms used
$info->language; //The language of the page
$info->languages; //The alternative languages

$info->providerName; //The provider name of the page (Youtube, Twitter, Instagram, etc)
$info->providerUrl; //The provider url
$info->icon; //The big icon of the site
$info->favicon; //The favicon of the site (an .ico file or a png with up to 32x32px)

$info->publishedTime; //The published time of the resource
$info->license; //The license url of the resource
$info->feeds; //The RSS/Atom feeds

Alternatives / Similar