Commit Graph

2926 Commits

Author SHA1 Message Date
Noémi Ványi ea38fea711
Pick image_proxy changes from searxng (#2965)
* [mod] /image_proxy: don't decompress images

* [fix] image_proxy: always close the httpx respone

previously, when the content type was not an image and some other error,
the httpx response was not closed

* [mod] /image_proxy: use HTTP/1 instead of HTTP/2

httpx: HTTP/2 is slow when a lot data is downloaded.
https://github.com/dalf/pyhttp-benchmark

also, the usage of HTTP/1 decreases the load average

* [mod] searx.utils.dict_subset: rewrite with comprehension

Co-authored-by: Alexandre Flament <alex@al-f.net>
2022-01-22 13:49:00 +01:00
Alexandre Flament ad7e00ad03 [fix] startpage autocompletion 2022-01-22 12:18:57 +01:00
Allen 0c351ea364
[enh] Add Tineye reverse image search (#3040)
* [enh] Add Tineye reverse image search 

Other optional parametesr:

"&sort=crawl_date" can be appended to search_string to sort results by date.
"&domain=example.org" can be implemented to search_string to get results from just one domain.

Public instances could get relatively fast timed-out for 3600s.

* [enh] Add TIneye to settings.yml 

Check if that's the right shortcut.

* [mod] Fix checks

* [mod] Try to fix checks

* [mod] Use Four spaces for indentation

And set paging back to True

Co-authored-by: Noémi Ványi <kvch@users.noreply.github.com>
2022-01-22 12:15:19 +01:00
Noémi Ványi fd9d6b58d5 Add scheme to img_src and thumbnail_url if missing from URL
Closes #3092
2022-01-22 11:59:21 +01:00
Noémi Ványi 148090df12 Minor fixes to satisfy the linter 2022-01-21 17:59:10 +01:00
Alexandre Flament d592159cc5 [fix] startpage: workaround to use the startpage network
workaround for the issue #762
2022-01-21 17:59:10 +01:00
Markus Heiser 036d80ed20 [mod] starpage engine: add comment about Startpage's FFox add-on
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser a4bc089091 [fix] startpage engine: fetch CAPTCHA & issues related to PR-695
In case of CAPTCHA raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails raise a SearxEngineResponseException and suspend for 7
days.

[1] https://github.com/searxng/searxng/pull/695

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser 1076d7e52e [fix] Get an actual `sc` argument from startpage's home page.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser a6184ac32c [pylint] Startpage engine
Fix remarks from pylint

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser 4750586fb0 [fix] startpage engine - avoid captcha
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into captchas:

1. some arguments has been removed and a new `sc` has been added.
2. search path changed from `do/search` to `sp/search`
3. POST request is no longer needed

Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser 99128537a8 [fix] googel engine - "some results are invalids: invalid content"
Fix google issues listet in the `/stats?engine=google` and message::

    some results are invalids: invalid content

The log is::

    DEBUG   searx                         : result: invalid content: {'url': 'https://de.wikipedia.org/wiki/Foo', 'title': 'Foo - Wikipedia', 'content': None, 'engine': 'google'}
    WARNING searx.engines.google          : ErrorContext('searx/search/processors/abstract.py', 111, 'result_container.extend(self.engine_name, search_results)', None, 'some results are invalids: invalid content', ()) True

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser 26c92d5f50 [fix] google engine: remove adds and fix mobile_ui selector
1. Fix issue reported in comment [1]
2. Fix XPath selector for the response of google's mobile UI, reported in
   comment [2]

[1] https://github.com/searxng/searxng/pull/777#issuecomment-1015121322
[2] https://github.com/searxng/searxng/pull/777#issuecomment-1015236238

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-20 08:33:53 +01:00
Émilien Devos a2ec27696c Update XPath for Google engine 2022-01-19 23:03:36 +01:00
Noémi Ványi f0842c76e5
Drop Python 3.6 support (#3133) 2022-01-16 15:04:32 +01:00
Noémi Ványi 179784068f
Bump pylint from 2.10.2 to 2.12.2 (#3124)
Bumps [pylint](https://github.com/PyCQA/pylint) from 2.10.2 to 2.12.2.
- [Release notes](https://github.com/PyCQA/pylint/releases)
- [Changelog](https://github.com/PyCQA/pylint/blob/main/ChangeLog)
- [Commits](https://github.com/PyCQA/pylint/compare/v2.10.2...v2.12.2)

---
updated-dependencies:
- dependency-name: pylint
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-15 20:23:09 +01:00
Dario Nuevo 1a18adcc16
New files engine: Prowlarr (#3118)
## What does this PR do?

Gives the user the possibility to search their own prowlarr instances.

Info: https://wiki.servarr.com/en/prowlarr
Github: https://github.com/Prowlarr/Prowlarr

## Why is this change important?

Prowlarr searchs multiple upstream search providers, thus allows to use that functionality through searx.
2022-01-15 19:18:15 +01:00
Andy Jones 3ddd0f8944
Update httpx and friends to 0.21.3 (#3121) 2022-01-15 19:16:10 +01:00
Allen 321ddc91bc
[enh] Add autocompleter from Brave (#3109)
* [enh] Add autocompleter from Brave

Raw response example: https://search.brave.com/api/suggest?q=how%20to:%20with%20j

Headers are needed in order to get a 200 response, thus Searx user-agent is used.

Other URL param could be  '&rich=false' or  '&rich=true'.
2022-01-15 19:08:53 +01:00
Noémi Ványi 82ac634070 make port configurable in MySQL engine
Closes #3117
2022-01-11 22:49:53 +01:00
Dario Nuevo 8f07442fb6
feature: new engine xpath_flex (#3119) 2022-01-11 22:44:19 +01:00
Dario Nuevo d1f6e0a3b1
products results: add possibility to show if a product is in stock or not.. (#3120) 2022-01-11 22:39:08 +01:00
searx-bot 1b1eaa6630
Update searx.data - update_firefox_version.py (#3079)
Co-authored-by: dalf <dalf@users.noreply.github.com>
2022-01-07 21:49:50 +01:00
searx-bot bf96bf5ce4
Update searx.data - update_ahmia_blacklist.py (#3080)
Co-authored-by: dalf <dalf@users.noreply.github.com>
2022-01-07 21:49:29 +01:00
Allen 0c2165324d
[Fix] Add suggestions + Fix xpaths (#3082)
* [mod] Add Suggestion to Petalsearch

* [Fix] Changed xpath for Petalsearch
2022-01-07 21:49:08 +01:00
Émilien Devos 8cde08ded2
Disable onesearch by default (#3099)
onesearch is not available everywhere and thus display an error by default in searx
2022-01-07 21:42:51 +01:00
Finn 5dc886136b
[fix] Qwant: Remove extra q from URL (#3091)
Fixes #3090
2022-01-07 21:41:39 +01:00
dalf c11b0189a8 Update searx.data - update_wikidata_units.py 2022-01-01 06:10:23 +00:00
israelyago b90616a25f
Remove categories from onesearch config
Co-authored-by: Noémi Ványi <kvch@users.noreply.github.com>
2021-11-18 08:19:19 -03:00
israelyago 6b3915a2dc
Removed paging from onesearch config
Co-authored-by: Noémi Ványi <kvch@users.noreply.github.com>
2021-11-18 08:18:50 -03:00
israelyago 0d28fd2efe
Merge branch 'master' into onesearch-engine 2021-11-17 15:27:11 -03:00
Israel Yago Pereira f1f3ad97d9 Remove debug log from onesearch engine 2021-11-17 15:15:17 -03:00
Israel Yago Pereira 4b785677d8 Onesearch pagination 2021-11-17 15:14:43 -03:00
Israel Yago Pereira 51530bc394 Fix code style 2021-11-17 15:14:43 -03:00
Israel Yago Pereira 258c6fbd5a Onesearch engine without pagination 2021-11-17 15:14:43 -03:00
Israel Yago Pereira 8e00249633 WIP: onesearch engine 2021-11-17 15:14:43 -03:00
Markus Heiser 4d36aee57b [fix] engine - yahoo: rewrite and fix issues
Languages are supported by mapping the language to a domain.  If domain is not
found in :py:obj:`lang2domain` URL ``<lang>.search.yahoo.com`` is used.

Closes #3020

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-16 20:30:10 +01:00
0xhtml ebbb9f60af [fix] Prevent missing setting error in ranking
Prevent error when the prefer_configured_language setting is missing.

Fixes #3063
2021-11-16 16:14:38 +01:00
searx-bot db2e8fd8b2
Update searx.data - update_wikidata_units.py (#3050)
Co-authored-by: dalf <dalf@users.noreply.github.com>
2021-11-15 20:32:34 +01:00
searx-bot 4129233774
Update searx.data - update_ahmia_blacklist.py (#3049)
Co-authored-by: dalf <dalf@users.noreply.github.com>
2021-11-15 20:32:25 +01:00
searx-bot bd9c03b483
Update searx.data - update_currencies.py (#3048)
Co-authored-by: dalf <dalf@users.noreply.github.com>
2021-11-15 20:32:14 +01:00
searx-bot abe43c4702
Update searx.data - update_external_bangs.py (#3047)
Co-authored-by: dalf <dalf@users.noreply.github.com>
2021-11-15 20:32:03 +01:00
Finn 8c3454fd1b
[enh] Improve ranking based on language (#3053)
Add configurable setting to rank search results higher when part of the
domain (e.g. 'en' in 'en.wikipedia.org' or 'de' in 'beispiel.de')
matches the selected search language. Does not apply to e.g. 'be' in
'youtube.com'.

Closes #206
2021-11-15 20:31:22 +01:00
Noémi Ványi 967e20dd1e adjust comment based on previous patch 2021-11-14 17:51:22 +01:00
Noémi Ványi dff7ee91f9 Check for settings under /etc/searx/settings.yml first
The patch introduced earlier broke the behaviour for instance
admins running searx from packages. This fix aims to provide
compatibility for everyone.

Closes #3061
2021-11-14 17:46:01 +01:00
Israel Yago Pereira a5fd30bf4d fix wrong func call 2021-11-12 13:12:50 -03:00
jecarr 3b3fb93074
Add codebase settings.yml to settings_loader.get_user_settings_path() 2021-11-10 16:16:05 +13:00
Alexandre Flament bc60d834c5 [enh] verify that Tor proxy works every time searx starts
based on @MarcAbonce commit on searx
2021-10-13 00:06:37 -07:00
Noémi Ványi 3bcca43abf Fix qwant engine, only get results from categories
Closes #3014
2021-10-12 20:06:37 +02:00
Alexandre Flament ee86a63556 [enh] flask debug mode: reload the app when searx/settings.yml changes 2021-10-10 21:38:35 +02:00