XPath Engine
The XPath engine is a generic engine that makes it possible to configure engines in the settings.
Configuration
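Request: search_url, method, request_body, lang_all, soft_max_redirects, cookies, headers
Paging: paging, page_size, first_page_num
Time Range: time_range_support, time_range_url, time_range_map
Safe-Search: safe_search_support, safe_search_map
Response: no_result_for_http_status, results_xpath, url_xpath, title_xpath, content_xpath, thumbnail_xpath, suggestion_xpath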
Example
Here is a simple example of an XPath engine configured in the engines: section; for further reading, see Engine Overview.
- name : bitbucket
  engine : xpath
  paging : True
  search_url : https://bitbucket.org/repo/all/{pageno}?name={query}
  url_xpath : //article[@class="repo-summary"]//a[@class="repo-link"]/@href
  title_xpath : //article[@class="repo-summary"]//a[@class="repo-link"]
  content_xpath : //article[@class="repo-summary"]/p
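To illustrate what the three selectors in this example pick out, the following sketch applies equivalent lookups to a small, made-up HTML fragment. It uses the limited XPath subset of Python's standard-library ElementTree, not necessarily the XPath library the engine itself relies on, so the /@href step is replaced by an attribute lookup. The markup and the extract_results helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a result page; the real markup may differ.
PAGE = """
<html><body>
  <article class="repo-summary">
    <a class="repo-link" href="/alice/demo">alice / demo</a>
    <p>A demo repository.</p>
  </article>
  <article class="repo-summary">
    <a class="repo-link" href="/bob/tools">bob / tools</a>
    <p>Assorted tools.</p>
  </article>
</body></html>
"""


def extract_results(page):
    """Return one dict per article, mirroring url_xpath, title_xpath and content_xpath."""
    root = ET.fromstring(page.strip())
    results = []
    for article in root.findall(".//article[@class='repo-summary']"):
        link = article.find(".//a[@class='repo-link']")
        content = article.find("p")
        results.append({
            # ElementTree cannot select '/@href' directly, so read the attribute.
            "url": link.get("href"),
            "title": "".join(link.itertext()).strip(),
            "content": "".join(content.itertext()).strip(),
        })
    return results


results = extract_results(PAGE)
```

Each dictionary in results holds the link target, the link text and the paragraph text of one article element.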
Implementations
- searx.engines.xpath.request(query, params)
Build request parameters (see Making a Request).
- searx.engines.xpath.response(resp)
Scrape results from the response (see Result Types (template)).
- searx.engines.xpath.content_xpath = None
XPath selector of the result's content.
- searx.engines.xpath.cookies = {}
Some engines might return different results depending on cookies. Possible use case: setting a safe-search cookie.
- searx.engines.xpath.first_page_num = 1
Number of the first page (usually 0 or 1).
- searx.engines.xpath.headers = {}
Some engines might return different results depending on headers. Possible use case: setting a header to moderate.
- searx.engines.xpath.lang_all = 'en'
Replacement for {lang} in search_url if language all is selected.
- searx.engines.xpath.method = 'GET'
Some engines might require POST requests for search.
- searx.engines.xpath.no_result_for_http_status = []
Return an empty result for these HTTP status codes instead of throwing an error.
no_result_for_http_status: []
- searx.engines.xpath.page_size = 1
Number of results on each page. Only needed if the site expects an offset rather than a page number.
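The interplay of page_size and first_page_num can be sketched as follows. This is a minimal illustration, assuming the value substituted for {pageno} is derived from the user's 1-based page number as an offset; the page_value helper is hypothetical and the engine's actual computation may differ.

```python
def page_value(pageno, page_size=1, first_page_num=1):
    """Map a 1-based page number from the user to the value placed in {pageno}.

    Assumption: the site counts from first_page_num and advances by
    page_size per page (page_size=1 yields plain page numbers).
    """
    return (pageno - 1) * page_size + first_page_num


# Plain page numbers starting at 1.
third_page = page_value(3)
# Offset-style paging: 10 results per page, first offset is 0.
third_offset = page_value(3, page_size=10, first_page_num=0)
```

With the defaults the third page maps to 3; with page_size=10 and first_page_num=0 it maps to the offset 20.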
- searx.engines.xpath.paging = False
Engine supports paging [True or False].
- searx.engines.xpath.request_body = ''
The body of the request. This can only be used if a different method is set, e.g. POST. For formatting, see the documentation of search_url:
search={query}&page={pageno}{time_range}{safe_search}
- searx.engines.xpath.results_xpath = ''
XPath selector for the list of result items.
- searx.engines.xpath.safe_search_map = {0: '&filter=none', 1: '&filter=moderate', 2: '&filter=strict'}
Maps safe-search value to {safe_search} in search_url.
safesearch: true
safe_search_map:
  0: '&filter=none'
  1: '&filter=moderate'
  2: '&filter=strict'
- searx.engines.xpath.safe_search_support = False
Engine supports safe-search.
- searx.engines.xpath.search_url = None
Search URL of the engine. Example:
https://example.org/?search={query}&page={pageno}{time_range}{safe_search}
Replacements are:
{query}: Search terms from user.
{pageno}: Page number if engine supports paging (see paging).
{lang}: ISO 639-1 language code (en, de, fr ..)
{time_range}: URL parameter if engine supports time range. The value for the parameter is taken from time_range_map.
{safe_search}: Safe-search URL parameter if engine supports safe-search. The {safe_search} replacement is taken from the safe_search_map. Filter results: 0: none, 1: moderate, 2: strict. If not supported, the URL parameter is an empty string.
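The replacements listed for search_url can be sketched with Python's str.format. This is only an illustration under stated assumptions: the build_url helper and its parameters are hypothetical, the query is URL-encoded with quote_plus as one reasonable choice, and the engine's real request code may differ.

```python
from urllib.parse import quote_plus

search_url = "https://example.org/?search={query}&page={pageno}{time_range}{safe_search}"
safe_search_map = {0: "&filter=none", 1: "&filter=moderate", 2: "&filter=strict"}


def build_url(query, pageno=1, lang="en", safe_search=0,
              safe_search_support=False, time_range_param=""):
    """Fill the search_url placeholders; unsupported features contribute ''."""
    safe_param = safe_search_map[safe_search] if safe_search_support else ""
    return search_url.format(
        query=quote_plus(query),      # hypothetical encoding choice
        pageno=pageno,
        lang=lang,                    # unused by this template, ignored by format()
        time_range=time_range_param,  # already formatted fragment or ''
        safe_search=safe_param,
    )


plain = build_url("xpath engine")
filtered = build_url("xpath engine", pageno=2, safe_search=1, safe_search_support=True)
```

Without safe-search support, the {safe_search} and {time_range} placeholders collapse to empty strings, so the URL ends after the page parameter.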
- searx.engines.xpath.soft_max_redirects = 0
Maximum redirects, soft limit. Record an error but don't stop the engine.
- searx.engines.xpath.suggestion_xpath = ''
XPath selector of the result's suggestion.
- searx.engines.xpath.thumbnail_xpath = False
XPath selector of the result's thumbnail.
- searx.engines.xpath.time_range_map = {'day': 24, 'month': 720, 'week': 168, 'year': 8760}
Maps time range value from user to {time_range_val} in time_range_url.
time_range_map:
  day: 1
  week: 7
  month: 30
  year: 365
- searx.engines.xpath.time_range_support = False
Engine supports search time range.
- searx.engines.xpath.time_range_url = '&hours={time_range_val}'
Time range URL parameter in the search_url. If no time range is requested by the user, the URL parameter is an empty string. The {time_range_val} replacement is taken from the time_range_map.
time_range_url : '&days={time_range_val}'
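How time_range_url and time_range_map might combine into the {time_range} fragment is sketched below, using the default values shown in this section. The time_range_param helper is hypothetical, and treating an unknown or missing range as "no time range requested" is an assumption.

```python
time_range_url = "&hours={time_range_val}"
time_range_map = {"day": 24, "week": 168, "month": 720, "year": 8760}


def time_range_param(time_range, supported=True):
    """Return the {time_range} fragment, or '' when no range applies."""
    if not supported or time_range not in time_range_map:
        return ""
    return time_range_url.format(time_range_val=time_range_map[time_range])


week = time_range_param("week")
none_requested = time_range_param(None)
```

Switching to the days-based configuration would only require changing the two module-level values, for example '&days={time_range_val}' with day: 1, week: 7, month: 30 and year: 365.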
- searx.engines.xpath.title_xpath = None
XPath selector of the result's title.
- searx.engines.xpath.url_xpath = None
XPath selector of the result's url.