XPath Engine
The XPath engine is a generic engine that can be used to configure new engines in the settings.
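Configuration
Request:
- search_url
- lang_all
- soft_max_redirects
- cookies
- headers
- method
- request_body
Paging:
- paging
- page_size
- first_page_num
Time Range:
- time_range_support
- time_range_url
- time_range_map
Safe-Search:
- safe_search_support
- safe_search_map
Response:
- no_result_for_http_status
- results_xpath
- url_xpath
- title_xpath
- content_xpath
- thumbnail_xpath
- suggestion_xpath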
Example
Here is a simple example of an XPath engine configured in the engines: section. For further reading, see Engine Overview.
- name : bitbucket
  engine : xpath
  paging : True
  search_url : https://bitbucket.org/repo/all/{pageno}?name={query}
  url_xpath : //article[@class="repo-summary"]//a[@class="repo-link"]/@href
  title_xpath : //article[@class="repo-summary"]//a[@class="repo-link"]
  content_xpath : //article[@class="repo-summary"]/p
Implementations
- searx.engines.xpath.search_url = None
Search URL of the engine. Example:
https://example.org/?search={query}&page={pageno}{time_range}{safe_search}
Replacements are:
{query}: Search terms from the user.
{pageno}: Page number, if the engine supports paging (see paging).
{lang}: ISO 639-1 language code (en, de, fr, ...).
{time_range}: URL parameter, if the engine supports time range. The value for the parameter is taken from time_range_map.
{safe_search}: Safe-search URL parameter, if the engine supports safe-search. The {safe_search} replacement is taken from the safe_search_map. Filter results: 0: none, 1: moderate, 2: strict.
If not supported, the URL parameter is an empty string.
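The replacement step can be pictured as ordinary string formatting. The following is a minimal sketch, not the engine's actual code: the helper name build_search_url and its arguments are hypothetical, and it assumes that the placeholders are filled with Python's str.format after the query has been URL-encoded.

```python
from urllib.parse import quote_plus

# Template taken from the search_url example above.
SEARCH_URL = "https://example.org/?search={query}&page={pageno}{time_range}{safe_search}"


def build_search_url(template, query, pageno=1, lang="en", time_range="", safe_search=""):
    """Fill the placeholders of a search_url template (hypothetical helper).

    Placeholders that do not occur in the template are simply ignored by
    str.format, so one call works for templates with or without {lang}.
    """
    return template.format(
        query=quote_plus(query),  # URL-encode the user's search terms
        pageno=pageno,
        lang=lang,
        time_range=time_range,    # empty string when no time range is requested
        safe_search=safe_search,  # empty string when safe-search is unsupported
    )


url = build_search_url(SEARCH_URL, "xpath engine", pageno=2, safe_search="&filter=strict")
print(url)
```

Because unsupported features contribute an empty string, the template can always list {time_range} and {safe_search} at the end without producing stray separators.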
- searx.engines.xpath.lang_all = 'en'
Replacement for {lang} in search_url if language all is selected.
- searx.engines.xpath.no_result_for_http_status = []
Return empty result for these HTTP status codes instead of throwing an error.
no_result_for_http_status: []
- searx.engines.xpath.soft_max_redirects = 0
Maximum redirects, soft limit. Record an error but don’t stop the engine.
- searx.engines.xpath.results_xpath = ''
XPath selector for the list of result items.
- searx.engines.xpath.url_xpath = None
XPath selector of result’s url.
- searx.engines.xpath.content_xpath = None
XPath selector of result’s content.
- searx.engines.xpath.title_xpath = None
XPath selector of result’s title.
- searx.engines.xpath.thumbnail_xpath = False
XPath selector of result’s thumbnail.
- searx.engines.xpath.suggestion_xpath = ''
XPath selector of result’s suggestion.
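To illustrate how the selectors relate to each other, here is a small sketch that first selects the result items and then reads url, title and content relative to each item. It is only an approximation under stated assumptions: it uses the XPath subset of Python's standard xml.etree.ElementTree on a made-up, well-formed page, whereas the engine itself may use a different HTML parser with full XPath support. The sample markup and the function name extract_results are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page (ElementTree cannot parse sloppy HTML).
PAGE = """
<html><body>
  <article class="repo-summary">
    <a class="repo-link" href="/alice/demo">alice / demo</a>
    <p>A demo repository.</p>
  </article>
  <article class="repo-summary">
    <a class="repo-link" href="/bob/tools">bob / tools</a>
    <p>Assorted tools.</p>
  </article>
</body></html>
"""


def extract_results(markup):
    """Collect url, title and content for every result item (hypothetical helper)."""
    root = ET.fromstring(markup.strip())
    results = []
    # Plays the role of results_xpath: one node per result item.
    for item in root.findall(".//article[@class='repo-summary']"):
        link = item.find(".//a[@class='repo-link']")  # basis for url and title
        content = item.find("p")                      # basis for content
        results.append(
            {
                "url": link.get("href"),
                "title": link.text,
                "content": content.text,
            }
        )
    return results


results = extract_results(PAGE)
for result in results:
    print(result)
```

Selecting the items first and then evaluating the other selectors relative to each item keeps url, title and content of one result together, even if an item is missing one of them.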
- searx.engines.xpath.cookies = {}
Some engines might offer different results based on cookies. Possible use case: setting a safe-search cookie.
- searx.engines.xpath.headers = {}
Some engines might offer different results based on headers. Possible use case: setting a header to moderate.
- searx.engines.xpath.method = 'GET'
Some engines might require POST requests for search.
- searx.engines.xpath.request_body = ''
The body of the request. This can only be used if a different method is set, e.g. POST. For formatting see the documentation of search_url:
search={query}&page={pageno}{time_range}{safe_search}
- searx.engines.xpath.paging = False
Engine supports paging [True or False].
- searx.engines.xpath.page_size = 1
Number of results on each page. Only needed if the site requires an offset rather than a page number.
- searx.engines.xpath.first_page_num = 1
Number of the first page (usually 0 or 1).
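The interplay of page_size and first_page_num can be sketched with a little arithmetic. This is one plausible way to combine the two settings, assuming the user's page numbers start at 1; the function name page_param is hypothetical and the formula is an illustration rather than a quote of the engine's code.

```python
def page_param(pageno, page_size=1, first_page_num=1):
    """Translate a 1-based user page number into the value sent to the site."""
    return (pageno - 1) * page_size + first_page_num


# Site counts pages starting at 1: user page 3 stays 3.
print(page_param(3))

# Site expects a 0-based result offset with 10 results per page:
# user page 3 becomes offset 20.
print(page_param(3, page_size=10, first_page_num=0))
```

With the defaults (page_size 1, first_page_num 1) the value equals the user's page number, which is why page_size only matters for offset-based sites.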
- searx.engines.xpath.time_range_support = False
Engine supports search time range.
- searx.engines.xpath.time_range_url = '&hours={time_range_val}'
Time range URL parameter in the search_url. If no time range is requested by the user, the URL parameter is an empty string. The {time_range_val} replacement is taken from the time_range_map.
time_range_url : '&days={time_range_val}'
- searx.engines.xpath.time_range_map = {'day': 24, 'month': 720, 'week': 168, 'year': 8760}
Maps time range value from user to {time_range_val} in time_range_url.
time_range_map:
  day: 1
  week: 7
  month: 30
  year: 365
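As a rough illustration of how time_range_url and time_range_map work together, the sketch below builds the {time_range} fragment from the default values listed above. The function name time_range_param is hypothetical, and returning an empty string for an unknown or missing range is an assumption based on the description of time_range_url.

```python
# Default values as documented above.
TIME_RANGE_URL = "&hours={time_range_val}"
TIME_RANGE_MAP = {"day": 24, "week": 168, "month": 720, "year": 8760}


def time_range_param(time_range, url=TIME_RANGE_URL, mapping=TIME_RANGE_MAP):
    """Return the URL fragment that replaces {time_range} (hypothetical helper)."""
    if time_range not in mapping:
        # No time range requested by the user: contribute nothing to the URL.
        return ""
    return url.format(time_range_val=mapping[time_range])


print(time_range_param("week"))  # &hours=168
print(repr(time_range_param(None)))  # ''

# The same helper with the day-based settings from the example configuration.
days_map = {"day": 1, "week": 7, "month": 30, "year": 365}
print(time_range_param("month", url="&days={time_range_val}", mapping=days_map))  # &days=30
```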
- searx.engines.xpath.safe_search_support = False
Engine supports safe-search.
- searx.engines.xpath.safe_search_map = {0: '&filter=none', 1: '&filter=moderate', 2: '&filter=strict'}
Maps safe-search value to {safe_search} in search_url.
safesearch: true
safe_search_map:
  0: '&filter=none'
  1: '&filter=moderate'
  2: '&filter=strict'
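The safe-search lookup is analogous. The following sketch assumes that the user's filter level (0, 1 or 2) is looked up in safe_search_map and that the result is an empty string when the engine does not support safe-search; the function name safe_search_param and its supported flag are hypothetical stand-ins for the settings described above.

```python
# Default mapping as documented above.
SAFE_SEARCH_MAP = {0: "&filter=none", 1: "&filter=moderate", 2: "&filter=strict"}


def safe_search_param(level, supported=True, mapping=SAFE_SEARCH_MAP):
    """Return the URL fragment that replaces {safe_search} (hypothetical helper)."""
    if not supported:
        # Engine does not support safe-search: contribute nothing to the URL.
        return ""
    # Unknown levels also fall back to an empty fragment (an assumption).
    return mapping.get(level, "")


print(safe_search_param(1))  # &filter=moderate
print(repr(safe_search_param(2, supported=False)))  # ''
```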
- searx.engines.xpath.request(query, params)
Build request parameters (see Making a Request).
- searx.engines.xpath.response(resp) → EngineResults
Scrape results from the response (see Result Types).