xpath vs soup

xpath: MIT license, 739 stars, latest release v1.3.6 (2026-02-23)

soup: MIT license, 2,227 stars, latest release v1.2.5 (2022-01-16)

xpath is a Go library for selecting elements from an HTML document with XPath expressions. It operates on node trees produced by the golang.org/x/net/html package (one of Go's extended packages, not part of the standard library), typically through the companion htmlquery package. XPath expressions are more powerful and expressive than CSS selectors: for example, they can match elements by their text content or navigate from an element up to its parent.

soup is a Go library for parsing and querying HTML documents.

It provides a simple and intuitive interface for extracting information from HTML pages. It is inspired by the popular Python web scraping library BeautifulSoup and exposes a similar API, with functions such as Find and FindAll.

soup can also use Go's built-in HTTP client to download HTML content.

Note that, unlike BeautifulSoup, soup does not support CSS selectors. It has no XPath support either.

Example Use

```go
package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xpath"
)

func main() {
	// Create an HTML string (sample markup assumed: a paragraph and a link)
	page := `<html><body>
<p>Hello, World!</p>
<a href="https://example.com">Example</a>
</body></html>`

	// Parse the HTML string into a node tree
	// (htmlquery.Parse wraps golang.org/x/net/html)
	doc, err := htmlquery.Parse(strings.NewReader(page))
	if err != nil {
		fmt.Println("Error:", err)
		return
	}

	// Compile the XPath expression
	expr, err := xpath.Compile("//p")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}

	// Select matching nodes; the navigator from htmlquery adapts the
	// parsed node tree to the interface the xpath engine expects
	nodes := expr.Select(htmlquery.CreateXPathNavigator(doc))
	if nodes.MoveNext() {
		fmt.Println(nodes.Current().Value()) // > Hello, World!
	}
}
```

```go
package main

import (
	"fmt"
	"log"

	"github.com/anaskhan96/soup"
)

func main() {
	url := "https://www.bing.com/search?q=weather+Toronto"

	// soup has a basic HTTP client, though it's not recommended for scraping:
	resp, err := soup.Get(url)
	if err != nil {
		log.Fatal(err)
	}

	// create a soup object from the HTML
	doc := soup.HTMLParse(resp)

	// HTML elements can be found using the Find or FindStrict methods;
	// in this case, find the div element whose "class" attribute matches these values:
	grid := doc.FindStrict("div", "class", "b_antiTopBleed b_antiSideBleed b_antiBottomBleed")
	// note: to find all elements, the FindAll() method can be used the same way
	if grid.Error != nil {
		// stop early if the page layout does not contain the expected element
		log.Fatal(grid.Error)
	}

	// elements can be searched further for descendants:
	heading := grid.Find("div", "class", "wtr_titleCtrn").Find("div").Text()
	conditions := grid.Find("div", "class", "wtr_condition")
	primaryCondition := conditions.Find("div")
	secondaryCondition := primaryCondition.FindNextElementSibling()
	temp := primaryCondition.Find("div", "class", "wtr_condiTemp").Find("div").Text()
	others := primaryCondition.Find("div", "class", "wtr_condiAttribs").FindAll("div")
	caption := secondaryCondition.Find("div").Text()

	fmt.Println("City Name : " + heading)
	fmt.Println("Temperature : " + temp + "˚C")
	for _, i := range others {
		fmt.Println(i.Text())
	}
	fmt.Println(caption)
}
```