XPath Engine

The XPath engine is a generic engine that can be used to configure search engines entirely in the settings.

Configuration

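Request: search_url, lang_all, soft_max_redirects, method, request_body, cookies, headers

Paging: paging, page_size, first_page_num

Time Range: time_range_support, time_range_url, time_range_map

Safe-Search: safe_search_support, safe_search_map

Response: no_result_for_http_status

XPath selector: results_xpath, url_xpath, title_xpath, content_xpath, thumbnail_xpath, suggestion_xpath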

Example

Here is a simple example of an XPath engine configured in the engines: section; for further reading see Engine Overview.

- name : bitbucket
  engine : xpath
  paging : True
  search_url : https://bitbucket.org/repo/all/{pageno}?name={query}
  url_xpath : //article[@class="repo-summary"]//a[@class="repo-link"]/@href
  title_xpath : //article[@class="repo-summary"]//a[@class="repo-link"]
  content_xpath : //article[@class="repo-summary"]/p

Implementations

searx.engines.xpath.request(query, params)[source]

Build request parameters (see Making a Request).

searx.engines.xpath.response(resp)[source]

Scrape results from the response (see Result Types (template)).
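As a rough illustration of what scraping with XPath selectors amounts to, the sketch below applies selectors modeled on the bitbucket example to a hand-written page. It is not the engine's implementation: it uses Python's standard xml.etree.ElementTree as a stand-in, which supports only a subset of XPath (no /@href attribute step, so the attribute is read with .get()), and the sample markup and the scrape function are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. Real pages are HTML and
# generally need a proper HTML parser rather than an XML parser.
SAMPLE = """
<html><body>
  <article class="repo-summary">
    <a class="repo-link" href="/alice/demo">alice / demo</a>
    <p>A demo repository.</p>
  </article>
  <article class="repo-summary">
    <a class="repo-link" href="/bob/tools">bob / tools</a>
    <p>Assorted tools.</p>
  </article>
</body></html>
"""


def scrape(page):
    """Return one dict (url, title, content) per result item in the page."""
    root = ET.fromstring(page.strip())
    results = []
    # Plays the role of a results_xpath: one node per result item.
    for item in root.findall(".//article[@class='repo-summary']"):
        link = item.find(".//a[@class='repo-link']")
        summary = item.find("p")
        results.append({
            "url": link.get("href"),
            "title": "".join(link.itertext()).strip(),
            "content": "".join(summary.itertext()).strip(),
        })
    return results


results = scrape(SAMPLE)
```

Each entry in results carries the three fields that the url_xpath, title_xpath and content_xpath settings select.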

searx.engines.xpath.content_xpath = None

XPath selector of result’s content.

searx.engines.xpath.cookies = {}

Some engines might offer different results based on cookies. Possible use case: setting a safe-search cookie.

searx.engines.xpath.first_page_num = 1

Number of the first page (usually 0 or 1).

searx.engines.xpath.headers = {}

Some engines might offer different results based on headers. Possible use case: setting a header to moderate.

searx.engines.xpath.lang_all = 'en'

Replacement for {lang} in search_url if language all is selected.
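A minimal sketch of how the {lang} replacement could be chosen. The function name lang_replacement is hypothetical, and cutting a region-qualified code such as de-DE down to its two-letter prefix is an assumption of this example, not something stated above.

```python
LANG_ALL = "en"  # mirrors the lang_all default shown above


def lang_replacement(language, lang_all=LANG_ALL):
    """Pick the value substituted for {lang} in search_url."""
    if language == "all":
        # No specific language selected: fall back to lang_all.
        return lang_all
    # Assumption: region suffixes ('de-DE') are reduced to the language code.
    return language.split("-")[0][:2]
```

With the default, a user who selects all languages gets en substituted for {lang}.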

searx.engines.xpath.method = 'GET'

Some engines might require POST requests for search.

searx.engines.xpath.no_result_for_http_status = []

Return empty result for these HTTP status codes instead of throwing an error.

no_result_for_http_status: []

searx.engines.xpath.page_size = 1

Number of results on each page. Only needed if the site expects an offset rather than a page number.
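One plausible way to combine first_page_num and page_size is sketched below. It assumes the user's page numbers start at 1; the function name pageno_replacement is hypothetical. With page_size left at 1 the result is a plain page number, with a larger page_size it becomes an offset.

```python
def pageno_replacement(user_pageno, page_size=1, first_page_num=1):
    """Value substituted for {pageno}; user_pageno is assumed to start at 1."""
    return (user_pageno - 1) * page_size + first_page_num


# Page-number style site whose first page is 1.
pages = [pageno_replacement(n) for n in (1, 2, 3)]

# Offset style site: 10 results per page, first offset is 0.
offsets = [pageno_replacement(n, page_size=10, first_page_num=0) for n in (1, 2, 3)]
```

Here pages is [1, 2, 3] and offsets is [0, 10, 20].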

searx.engines.xpath.paging = False

Engine supports paging [True or False].

searx.engines.xpath.request_body = ''

The body of the request. This can only be used if a different method is set, e.g. POST. For formatting, see the documentation of search_url:

search={query}&page={pageno}{time_range}{safe_search}

searx.engines.xpath.results_xpath = ''

XPath selector for the list of result items.

searx.engines.xpath.safe_search_map = {0: '&filter=none', 1: '&filter=moderate', 2: '&filter=strict'}

Maps safe-search value to {safe_search} in search_url.

safesearch: true
safe_search_map:
  0: '&filter=none'
  1: '&filter=moderate'
  2: '&filter=strict'

searx.engines.xpath.safe_search_support = False

Engine supports safe-search.
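The interplay of safe_search_support and safe_search_map can be sketched as follows. The function name safe_search_replacement is hypothetical, and the mapping simply copies the default shown above.

```python
SAFE_SEARCH_MAP = {0: "&filter=none", 1: "&filter=moderate", 2: "&filter=strict"}


def safe_search_replacement(level, supported, mapping=SAFE_SEARCH_MAP):
    """Value substituted for {safe_search}; empty when safe-search is unsupported."""
    if not supported:
        return ""
    return mapping[level]
```

For example, safe_search_replacement(1, supported=True) gives '&filter=moderate', while any level with supported=False gives an empty string.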

searx.engines.xpath.search_url = None

Search URL of the engine. Example:

https://example.org/?search={query}&page={pageno}{time_range}{safe_search}

Replacements are:

{query}:

Search terms from user.

{pageno}:

Page number if the engine supports paging (see paging).

{lang}:

ISO 639-1 language code (en, de, fr, ...)

{time_range}:

URL parameter if engine supports time range. The value for the parameter is taken from time_range_map.

{safe_search}:

Safe-search URL parameter if the engine supports safe-search. The {safe_search} replacement is taken from the safe_search_map. Filter results:

0: none, 1: moderate, 2: strict

If not supported, the URL parameter is an empty string.
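The replacements listed above can be illustrated with ordinary string formatting. This is a sketch under assumptions, not the engine's code: the build_url helper is hypothetical, and URL-encoding the query with urllib.parse.quote_plus is a choice made for this example.

```python
from urllib.parse import quote_plus

SEARCH_URL = "https://example.org/?search={query}&page={pageno}{time_range}{safe_search}"


def build_url(query, pageno=1, lang="en", time_range="", safe_search=""):
    """Fill the search_url template; unsupported features pass an empty string."""
    return SEARCH_URL.format(
        query=quote_plus(query),   # assumption: query is URL-encoded this way
        pageno=pageno,
        lang=lang,                 # ignored here: the template has no {lang}
        time_range=time_range,
        safe_search=safe_search,
    )


url = build_url("xpath engine", pageno=2, safe_search="&filter=strict")
```

Because {time_range} and {safe_search} carry their own leading &, leaving them empty produces a clean URL without dangling parameters.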

searx.engines.xpath.soft_max_redirects = 0

Maximum number of redirects (soft limit). Record an error but don’t stop the engine.

searx.engines.xpath.suggestion_xpath = ''

XPath selector of result’s suggestion.

searx.engines.xpath.thumbnail_xpath = False

XPath selector of result’s thumbnail.

searx.engines.xpath.time_range_map = {'day': 24, 'month': 720, 'week': 168, 'year': 8760}

Maps time range value from user to {time_range_val} in time_range_url.

time_range_map:
  day: 1
  week: 7
  month: 30
  year: 365

searx.engines.xpath.time_range_support = False

Engine supports search time range.
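A minimal sketch of how time_range_support, time_range_map and time_range_url (all documented in this section) could work together to produce the {time_range} replacement. The function name time_range_replacement is hypothetical; the constants copy the module defaults.

```python
TIME_RANGE_URL = "&hours={time_range_val}"
TIME_RANGE_MAP = {"day": 24, "week": 168, "month": 720, "year": 8760}


def time_range_replacement(time_range, supported=True):
    """Value substituted for {time_range} in search_url."""
    # Empty string when unsupported or when the user requested no time range.
    if not supported or time_range not in TIME_RANGE_MAP:
        return ""
    return TIME_RANGE_URL.format(time_range_val=TIME_RANGE_MAP[time_range])
```

With these defaults, a request for the last week yields '&hours=168'; an engine that counts in days would override both settings, as in the YAML examples in this section.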

searx.engines.xpath.time_range_url = '&hours={time_range_val}'

Time range URL parameter in the search_url. If no time range is requested by the user, the URL parameter is an empty string. The {time_range_val} replacement is taken from the time_range_map.

time_range_url : '&days={time_range_val}'

searx.engines.xpath.title_xpath = None

XPath selector of result’s title.

searx.engines.xpath.url_xpath = None

XPath selector of result’s URL.