Commit Graph

1139 Commits

Author SHA1 Message Date
Paul Alcock bb34685dfa
Add IMDB support (#2980)
Closes #1145
2021-10-02 13:41:38 +02:00
Noémi Ványi 1d5feed4c1 [fix] style of stackexchange engine 2021-10-02 13:25:50 +02:00
Markus Heiser a1f9919587 [fix] engine stackexchange - decode HTML entities in title & content
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-02 11:49:12 +02:00
Markus Heiser 5e84d670a2 [mod] replace old stackoverflow engine by Stack Exchange API v2.3
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-02 11:43:51 +02:00
Markus Heiser 2cf9a61246 [mod] engines - add Stack Exchange API v2.3
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-02 11:41:36 +02:00
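As a rough illustration of what a query against the Stack Exchange API v2.3 can look like (the endpoint, parameter names and default site below are assumptions based on the public API documentation, not taken from these commits):

    from urllib.parse import urlencode

    # Hypothetical sketch of a Stack Exchange API v2.3 search request.
    api_site = 'stackoverflow'   # assumed default site
    search_url = 'https://api.stackexchange.com/2.3/search/advanced?{args}'

    def request(query, params):
        args = urlencode({
            'q': query,
            'page': params.get('pageno', 1),
            'pagesize': 10,
            'order': 'desc',
            'sort': 'relevance',
            'site': api_site,
        })
        params['url'] = search_url.format(args=args)
        return params
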
Noémi Ványi 3f05513601 [fix] DDG engine format and remove logger 2021-10-02 11:40:56 +02:00
Markus Heiser 9d9a89c6ef [mod] engine duckduckgo - update supported_languages_url
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-02 11:33:21 +02:00
Markus Heiser c218fb76b8 [mod] engine duckduckgo - use DuckDuckGo-Lite
Implement a scraper for DuckDuckGo-Lite [1].  The existing DuckDuckGo [2]
engine does not support paging.  DuckDuckGo-Lite is much faster, less verbose
and has a paging option (reverse engineered from the input form of [1]).

[1] https://lite.duckduckgo.com/lite
[2] https://duckduckgo.com/

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-02 11:33:12 +02:00
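A minimal sketch of the kind of paging request described above; the form field names ('q', 's', 'dc') and the page size are assumptions reverse-engineered from the Lite form, not taken from the commit itself:

    # Hypothetical sketch of a paged POST to DuckDuckGo-Lite.
    url = 'https://lite.duckduckgo.com/lite'

    def request(query, params):
        offset = (params.get('pageno', 1) - 1) * 20   # assumed page size
        params['url'] = url
        params['method'] = 'POST'
        params['data'] = {'q': query}
        if offset:
            # later pages submit an offset through hidden form fields
            params['data']['s'] = offset
            params['data']['dc'] = offset
        return params
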
Markus Heiser 987b0d6f5b [fix] remove minimum length of content for XPath engine
Instead of raising an exception and thereby hiding all results of the engine.

It makes sense to remove that requirement in order to allow the implementation of
search engines that do not always have a description.  In fact some search
engines that in 99% of cases have a description, like Brave Search or Mojeek,
crash completely if they for some reason include a result with no description.

To test this patch try Mojeek:

    !mjk xyz

before and after the patch.

Suggested-by: 0xhtml in https://github.com/searx/searx/discussions/2933
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-13 21:24:26 +02:00
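The pattern the commit argues for, roughly: fall back to an empty description instead of raising.  The use of extract_text and the result-dict keys are assumptions about the XPath engine's internals, for illustration only:

    from searx.utils import extract_text   # assumed helper from the project's utils

    def build_result(url, title_node, content_node):
        # A missing description must not abort the whole result list.
        content = extract_text(content_node) if content_node is not None else ''
        return {
            'url': url,
            'title': extract_text(title_node),
            'content': content,   # may legitimately be empty (e.g. Mojeek, Brave Search)
        }
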
dependabot[bot] e271d6d1e1 Bump pylint from 2.9.6 to 2.10.2
Bumps [pylint](https://github.com/PyCQA/pylint) from 2.9.6 to 2.10.2.
- [Release notes](https://github.com/PyCQA/pylint/releases)
- [Changelog](https://github.com/PyCQA/pylint/blob/main/ChangeLog)
- [Commits](https://github.com/PyCQA/pylint/compare/v2.9.6...v2.10.2)

---
updated-dependencies:
- dependency-name: pylint
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-22 20:53:12 +02:00
Noémi Ványi 23b3b56a06 add filter=0 to Google engine for more results
Closes #2944
2021-08-21 16:29:20 +02:00
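For context, filter=0 asks Google not to omit "similar" (near-duplicate) results; the URL construction below is illustrative only, not the engine's actual code:

    from urllib.parse import urlencode

    def google_search_url(query, pageno=1):
        args = urlencode({
            'q': query,
            'start': (pageno - 1) * 10,
            'filter': '0',   # do not omit "similar" results
        })
        return 'https://www.google.com/search?' + args
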
Émilien Devos ee443d9739 Fix google images
Proposed fix in https://github.com/searx/searx/pull/2115#issuecomment-876716010

Closes #2914
2021-08-02 20:14:54 +02:00
Samuel Dudík 13a608b0b8
Fix Seznam engine (#2905)
## What does this PR do?

Fixes the Seznam engine by updating XPath strings.

## Why is this change important?

Without this PR Seznam returns no results.

Co-authored-by: Noémi Ványi <kvch@users.noreply.github.com>
2021-08-02 20:09:31 +02:00
Marc Abonce Seguin a5839a66d6
Update onion engines to v3 (#2904)
downgrade httpx:
PR https://github.com/encode/httpx/pull/1522
made some breaking changes in AsyncHTTPTransport that affect
our code in https://github.com/searx/searx/blob/master/searx/network/client.py

remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
2021-08-02 20:03:55 +02:00
Adam Tauber a8988885a5 [fix] pep8 2021-07-05 19:57:53 +02:00
Adam Tauber 198aad43e0 [enh] add mongodb offline engine 2021-07-05 19:47:15 +02:00
Adam Tauber b2ed65eb6d [fix] initialize redis engine at the right time 2021-07-05 19:40:51 +02:00
Markus Heiser cbc50a9bc5 [fix] google-news engine - KeyError: 'hl' in request
Since we added

- 1c67b6aec [enh] google engine: supports "default language"

there is a KeyError: 'hl' in the request; error pattern::

    ERROR:searx.searx.search.processor.online:engine google news : exception : 'hl'
    Traceback (most recent call last):
      File "searx/search/processors/online.py", line 144, in search
        search_results = self._search_basic(query, params)
      File "searx/search/processors/online.py", line 118, in _search_basic
        self.engine.request(query, params)
      File "searx/engines/google_news.py", line 97, in request
        if lang_info['hl'] == 'en':
      KeyError: 'hl'

Closes: https://github.com/searxng/searxng/issues/154
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-03 16:53:31 +02:00
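The traceback above comes from an unconditional lang_info['hl'] lookup; a defensive variant (illustrative only, the actual fix may differ) tolerates a missing 'hl' when the "(all)" language is selected:

    def hl_is_english(lang_info):
        # Hypothetical helper: 'hl' may be absent with the "(all)" language choice,
        # so use .get() instead of indexing to avoid the KeyError.
        return lang_info.get('hl') == 'en'
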
Markus Heiser 98a63058e5 [fix] google answers: normalize space of the answers.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-03 16:53:31 +02:00
Markus Heiser 412677d495 [mod] google engine: reduce mobile UI parameters to what is needed
Reverse engineering shows that not all of the parameters used by google's mobile
UI (aka "more results" button) are needed [1].

[1] https://github.com/searxng/searxng/pull/160#issuecomment-865013625

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-03 16:53:31 +02:00
Alexandre Flament 8bf216eab6 [mod] google: add "use_mobile_ui" parameter to use mobile endpoint.
Disabled by default; it has to be enabled in settings.yml.

related to  #159
2021-07-03 16:53:31 +02:00
Alexandre Flament 3863f5a83f [enh] google engine: supports "default language"
Same behaviour as Whoogle [1].  Only the google engine with the
"Default language" choice "(all)" is changed by this patch.

When searching for a local place, the results are in the expected language,
without missing results [2]:

  > When a language is not specified, the language interpretation is left up to
  > Google to decide how the search results should be delivered.

The query parameters are copied from Whoogle.  With the ``all`` language:

- add parameter ``source=lnt``
- don't use parameter ``lr``
- don't add an ``Accept-Language`` HTTP header.

The new signature of function ``get_lang_info()`` is:

    lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)

Argument ``supported_any_language`` is True for google.py and False for the other
google engines.  With this patch the function now returns:

- query parameters: ``lang_info['params']``
- HTTP headers: ``lang_info['headers']``
- and as before this patch:
  - ``lang_info['subdomain']``
  - ``lang_info['country']``
  - ``lang_info['language']``

[1] https://github.com/benbusby/whoogle-search
[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
2021-07-03 16:53:31 +02:00
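A sketch of how a Google engine might consume the return value described above; only the signature and the returned keys come from the commit message, while the import path and the call site are assumptions:

    from urllib.parse import urlencode
    from searx.engines.google import get_lang_info   # assumed import path

    def build_google_request(query, params, lang_list, custom_aliases):
        lang_info = get_lang_info(params, lang_list, custom_aliases,
                                  supported_any_language=True)
        params['url'] = ('https://' + lang_info['subdomain'] + '/search?'
                         + urlencode({'q': query, **lang_info['params']}))
        params['headers'].update(lang_info['headers'])
        return params
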
Adam Tauber 01a8a5814a [fix] pylint 2021-05-30 19:25:03 +02:00
Adam Tauber 97269be680 [enh] add redis offline engine 2021-05-30 19:20:17 +02:00
Markus Heiser 650a1c0b89 [enh] xpath engine - add request parameter 'soft_max_redirects'
Make 'soft_max_redirects' configurable per XPath engine::

    - name : <engine-name>
      engine : xpath
      soft_max_redirects: 1
      ...

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-22 20:41:36 +02:00
Noémi Ványi 0313797dfd Add sqlite engine to pylint 2021-05-13 21:47:38 +02:00
Noémi Ványi 8e90a214ce Add sqlite engine
Closes #2808
2021-05-13 21:40:25 +02:00
Alexandre Flament 75d1f38b20 [fix] searxng fix: sjp engine 2021-05-03 21:51:29 +02:00
Alexandre Flament 14fe1779b7 [httpx] replace searx.poolrequests by searx.network
settings.yml:

* outgoing.networks:
   * can contain a network definition
   * properties: enable_http, verify, http2, max_connections, max_keepalive_connections,
     keepalive_expiry, local_addresses, support_ipv4, support_ipv6, proxies, max_redirects, retries
   * retries: 0 by default, number of times searx retries to send the HTTP request (using a different IP & proxy each time)
   * local_addresses can be "192.168.0.1/24" (it supports IPv6)
   * support_ipv4 & support_ipv6: both True by default
     see https://github.com/searx/searx/pull/1034
* each engine can define a "network" section:
   * either a full network description
   * or a reference to an existing network

* all HTTP requests of an engine use the same HTTP configuration (this was not the case before; see the proxy configuration in master)
2021-05-03 21:39:54 +02:00
Alexandre Flament 88a96baedc [enh] replace requests by httpx 2021-05-03 21:39:37 +02:00
Marc Abonce Seguin 3284132ae5 fix Qwant's fetch_languages function 2021-05-02 17:24:28 -07:00
Pierre Chevalier 3a0f896b68 [enh] Add Springer Nature engine
Springer Nature is a global publisher dedicated to providing services to the
research community [1], with an official API [2].

To test this PR, first get your API key following this page:

   https://dev.springernature.com/signup

In searx/engines/springer.py at line 24, add this API key.  I left my own key,
commented out in the line above.  Feel free to use it, if needed.

[1] https://www.springernature.com/
[2] https://dev.springernature.com/
2021-04-29 22:43:52 +02:00
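A hypothetical sketch of how the engine might use that key; the module-level api_key variable, the endpoint and the parameter names are assumptions for illustration only:

    from urllib.parse import urlencode

    api_key = ''   # paste the key obtained from https://dev.springernature.com/signup

    base_url = 'https://api.springernature.com/metadata/json?{args}'   # assumed endpoint

    def request(query, params):
        args = urlencode({
            'q': query,
            's': (params.get('pageno', 1) - 1) * 10,   # assumed start offset
            'p': 10,                                    # assumed page size
            'api_key': api_key,
        })
        params['url'] = base_url.format(args=args)
        return params
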
spongebob33 6513a56064 add core.ac.uk engine 2021-04-29 22:43:52 +02:00
Noémi Ványi dee75accf6 Fix remote PEP8 errors as well 2021-04-29 22:05:31 +02:00
Noémi Ványi 5cb29f6e46 Fix pep8 errors of database engines 2021-04-29 21:50:25 +02:00
Noémi Ványi c00a33feee Add MySQL engine 2021-04-29 21:41:58 +02:00
Noémi Ványi 22079ffdef Add PostgreSQL engine 2021-04-29 21:41:38 +02:00
Markus Heiser 34d7d97e1e [fix] youtube - send CONSENT Cookie to not be redirected
In the EU there exists the "General Data Protection Regulation" [1] aka GDPR (BTW:
very user friendly!), which requires consent to tracking.  To get consent
from the user, YouTube requests are redirected to confirm and obtain a CONSENT
Cookie from https://consent.youtube.com

This patch adds a CONSENT Cookie to the youtube request to avoid redirection.

[1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Reported-by: https://github.com/searx/searx/issues/2774
2021-04-29 21:40:39 +02:00
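Roughly what the commit describes, as a sketch; the search URL template and the exact cookie value are assumptions, only the idea of pre-setting a CONSENT cookie comes from the commit:

    from urllib.parse import quote_plus

    search_url = 'https://www.youtube.com/results?search_query={query}'   # assumed template

    def request(query, params):
        # Pre-set the consent cookie so the request is not redirected to
        # https://consent.youtube.com (exact cookie value is an assumption).
        params['cookies']['CONSENT'] = 'YES+'
        params['url'] = search_url.format(query=quote_plus(query))
        return params
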
Michael Ilsaas de0b735f3a Fix URL to solidtorrent result page 2021-04-28 23:57:54 +02:00
Noémi Ványi 8362257b9a
Merge pull request #2736 from plague-doctor/sjp
Add new engine: SJP - Słownik języka polskiego
2021-04-16 17:30:14 +02:00
Noémi Ványi e56323d3c8
Merge pull request #2759 from ypid/fix/typo
Fix grammar mistake in debug log output
2021-04-16 17:26:45 +02:00
Plague Doctor d275d7a35e Code refactoring. 2021-04-16 12:23:27 +10:00
Markus Heiser 062d589f86 [fix] xpath expressions to grab all items from bandcamp's response
I also found some items missing a thumbnail, and I used text_extract for content
and title to remove unneeded whitespace.

BTW: added bandcamp's favicon

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-04-15 08:52:11 +02:00
Kyle Anthony Williams 4d3c399ee9 [feat] add bandcamp engine 2021-04-15 08:52:11 +02:00
Robin Schneider dfc66ff0f0
Fix grammar mistake in debug log output 2021-04-11 22:12:53 +02:00
Plague Doctor 599ff39ddf Fix conflicts 2021-04-09 06:54:03 +10:00
Plague Doctor 6631f11305 Add new engine: SJP 2021-04-08 10:21:54 +10:00
Plague Doctor 7035bed4ee Add new engine: Wordnik.com 2021-04-08 09:58:00 +10:00
Noémi Ványi 07f5edce3d Add Meilisearch engine
Website: https://www.meilisearch.com/
2021-04-06 21:57:05 +02:00
Alexandre Flament 725a69616b
Merge pull request #2681 from dalf/fix-wikipedia-title
[fix] wikipedia: remove HTML from the title
2021-03-27 17:43:36 +01:00