Commit Graph

3981 Commits

Author SHA1 Message Date
Alexandre Flament 0a8799b834
Merge pull request #2517 from dalf/debug-ci
Update pyenv pyenvinstall Make targets
2021-02-01 17:01:34 +01:00
Markus Heiser 8c45f1149d [hardening] github workflows - corrupted cache
aka: ensure that 'make test' works as expected

The cache contains a copy './local' which is - under some circumstance -
corrupted.  It is not possible to clear the cache [1] (see the top of the page).

Ensure that 'make test' works as expected [2] even if

- the python interpreter is missing
- the virtualenv exists but pyyaml is missing

To hardening when the workflow cache fails, this patch adds the new target
'travis.test' into the workflow.  This target probes to import a python module
'yaml'.  If this fails the virtualenv will be completely new build.

[1] https://github.com/actions/cache/issues/2#issuecomment-673493515
[2] https://github.com/searx/searx/pull/2517#discussion_r567240235

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-01 16:58:04 +01:00
Markus Heiser 38b39ef0ae [fix] re-add 'pip-exe' target - partial revert 9b48ae47
Target pip-exe is a prerequisite of the targets:

  - pyinstall
  - pyuninstall

and was accidentally deleted in commit 9b48ae47.

HINT:
  do not confuse pyinstall with penvinstall

pyinstall & pyuninstall
    Installing into user's HOME using pip from OS,
    therefore the message is needed.

pyenvinstall & pyenvuninstall
    Installing into virtualenv (./local) using pip which is provided by
    prerequisite 'pyenv' in the virtualenv.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-01 16:58:04 +01:00
Alexandre Flament d70c5a621a [mod] more robust make pyenv / make pyenvinstall
"make pyenv" ensures that ./local/py3/bin/python is an executable
2021-02-01 16:58:04 +01:00
Alexandre Flament 806af50738
Merge pull request #2494 from return42/rm-fabfile
[fix] remove Fabric file
2021-02-01 15:09:35 +01:00
Markus Heiser 40d2a116e1 [fix] Makefile target gh-pages & flatten history of branch gh.pages
1. This patch fixes error:

    rm -rf gh-pages/
    make V=1 gh-pages
    make[1]: Leaving directory '/800GBPCIex4/share/searx'
    [ -d "gh-pages/.git" ] || git clone  gh-pages
    fatal: repository 'gh-pages' does not exist

2. The gh-page build has been moved to ./build/gh-pages this also affects
   'travis-gh-pages'

3. The gh-pages commit messages now includes a ref to the repository and commit

4. Since a gh-pages history has only the drawback that the reposetory grows
   fast, this patch also flattens the history:

    cd build/gh-pages/; git log --oneline
    bash: cd: build/gh-pages/: Datei oder Verzeichnis nicht gefunden
    026126be (HEAD -> gh-pages, origin/gh-pages) make gh-pages: from https://github.com/return42/searx.git@71d66979c2935312e0aed7fc7c3cf6199fbe88a2

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-29 11:41:48 +01:00
Alexandre Flament 71d66979c2
Merge pull request #2482 from return42/fix-google-video
[fix] revise of the google-Video engine
2021-01-28 11:11:07 +01:00
Markus Heiser 7f505bdc6f [fix] google: avoid unnecessary SearxEngineXPathException errors
Avoid SearxEngineXPathException errors when parsing non valid results::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
Markus Heiser e436287385 [mod] checker: add some additional tests
BTW: fix indentation by 2 spaces

The additional tests has been commented out in the google engines to not release
any CAPTCHA issues.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
Markus Heiser b1fefec40d [fix] normalize the language & region aspects of all google engines
BTW: make the engines ready for search.checker:

- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:46 +01:00
Alexandre Flament 0f18e885bf
Merge pull request #2479 from Tobi823/master
Document workaround for using 2 languages simultaneously #1508
2021-01-27 21:29:42 +01:00
Alexandre Flament b661c3f5d4
Merge pull request #2509 from return42/fix-morty-key
[doc] improve admin-docs about result proxy (morty) configuration
2021-01-27 15:31:29 +01:00
Markus Heiser a69a8a3ed5 [doc] improve admin-docs about result proxy (morty) configuration
[1] https://github.com/searx/searx/pull/1872#issuecomment-768107138

Suggested-by @dalf [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-27 09:58:06 +01:00
Markus Heiser 923b490022 [mod] add Makfile targets for search.checker.<engine_name>
To check all engines:

    make search.checker

To check a engine 'google news' replace space by underline:

    make search.checker.google_news

To see HTTP requests and more use SEARX_DEBUG:

    make SEARX_DEBUG=1 search.checker.google_news

To filter out HTTP redirects:

    make SEARX_DEBUG=1 search.checker.google_news | grep -A1 "HTTP/1.1\" 3[0-9][0-9]"
    ...
    Engine google news                   Checking
    https://news.google.com:443 "GET /search?q=life&hl=en&lr=lang_en&ie=utf8&oe=utf8&ceid=US%3Aen&gl=US HTTP/1.1" 302 0
    https://news.google.com:443 "GET /search?q=life&hl=en-US&lr=lang_en&ie=utf8&oe=utf8&ceid=US:en&gl=US HTTP/1.1" 200 None
    --
    https://news.google.com:443 "GET /search?q=computer&hl=en&lr=lang_en&ie=utf8&oe=utf8&ceid=US%3Aen&gl=US HTTP/1.1" 302 0
    https://news.google.com:443 "GET /search?q=computer&hl=en-US&lr=lang_en&ie=utf8&oe=utf8&ceid=US:en&gl=US HTTP/1.1" 200 None
    --

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-26 11:46:36 +01:00
Alexandre Flament 6047087aac [mod] utils/fetch_languages.py: write files at the right location 2021-01-24 14:25:27 +01:00
Alexandre Flament 3330cf4a46 [enh] every monday, call utils/fetch_*.py scripts and create a PR automatically 2021-01-24 13:32:39 +01:00
Markus Heiser ff6804e545 [data] make engines.languages
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:52:32 +01:00
Markus Heiser 8cdad5d85d [fix] google-videos: parse values for 'length' & 'author'
The 'video.html' template from the 'oscar' design supports replacement
for *author* and *length*.  Google-videos does not have an author, alternatively
the publisher info from is used for the *author*.

Hint: these replacements are not supported by the 'simple' design.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:51:24 +01:00
Markus Heiser 89b3050b5c [fix] revise of the google-Video engine
This revise is based on the methods developed in the revise of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:39:30 +01:00
Alexandre Flament f4a17acb7a
Merge pull request #2498 from dalf/minor-fix-google-news
[fix] google_news: avoid one HTTP redirect except for the English results
2021-01-24 09:13:48 +01:00
Alexandre Flament 96c2996857
Merge pull request #2497 from return42/fix-test.sh
[fix] lxc.sh - SC2034: ubu2010_boilerplate appears unused.
2021-01-24 09:06:11 +01:00
Alexandre Flament 8c46b767d0 [fix] google_news: avoid one HTTP redirect except for the English results
also add
params['soft_max_redirects'] = 1
to avoid false error reporting in /stats/errors
2021-01-24 08:53:35 +01:00
Markus Heiser ea5c992d4f [fix] lxc.sh - SC2034: ubu2010_boilerplate appears unused.
$ make test.sh
  In utils/lxc.sh line 42:
  ubu2010_boilerplate="$ubu1904_boilerplate"
  ^-----------------^ SC2034: ubu2010_boilerplate appears unused. Verify use (or export if used externally).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 08:29:13 +01:00
Alexandre Flament 7d24850d49
Merge pull request #2483 from return42/fix-google-news
[fix] revise of the google-News engine
2021-01-23 20:21:09 +01:00
Markus Heiser 5f92dfcdbe [fix] google-news: query uses locale without country tag
Wthout country-region tag google will redirect to correct the contry tag [1]:

    SEARX_DEBUG=1 searx-checker -v "google news"
    ...
    https://news.google.com:443 "GET /search?q=computer&hl=en...      HTTP/1.1" 302 0
    https://news.google.com:443 "GET /search?q=computer&hl=en-US&.... HTTP/1.1" 200 None
    ...

[1] https://github.com/searx/searx/pull/2483#issuecomment-765600849

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-23 11:37:14 +01:00
Markus Heiser baec54c492 [fix] revise of the google-news engine
This revise is based on the methods developed in the revise of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 18:49:45 +01:00
Markus Heiser a8544798ec [fix] remove Fabric file
The fabfile.py has not been updated since 5 years.  I also asked [1] if someone
still use Fabric wtihout any response.  Lets drop outdated Fabric file.

[1] https://github.com/searx/searx/discussions/2400

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 17:57:55 +01:00
Adam Tauber f310305c54
Merge pull request #2481 from dalf/mod-check
Mod check
2021-01-20 18:48:29 +00:00
Alexandre Flament 73c86f9bf2 [mod] checker: disable by default 2021-01-19 21:44:48 +01:00
Alexandre Flament 3b7b852aa8 [fix] checker: minor fix about language detection 2021-01-19 21:29:31 +01:00
Alexandre Flament aa887eb375 [mod] checker : replace pycld3 by langdetect
pycld3 requires the native library cld3
langdetect is a pure python package
2021-01-19 21:26:04 +01:00
Tobi823 16a0a01553 Document workaround for using 2 languages simultaneously #1508 2021-01-18 17:23:09 +01:00
Alexandre Flament 0495e15df4
Merge pull request #2476 from dalf/fix-error-recording-and-checker
Fix error recording and checker
2021-01-18 08:29:25 +01:00
Alexandre Flament 67a1aab0d5 [fix] /stats/checker : remove the timestamp field when the checker is disabled 2021-01-18 08:19:53 +01:00
Alexandre Flament d473407ec9 [fix] checker: fix engine statistics
Without this commit, the URL /stats/errors shows percentage above 100% after the checker has run.
2021-01-18 08:19:44 +01:00
Alexandre Flament ca76f3119a [fix] error_recorder: record code and lineno about the engine
since the PR #2225 , code and lineno were sometimes meaningless
see /stats/errors
2021-01-17 16:25:11 +01:00
Alexandre Flament 80d7411f2c
Merge pull request #2452 from kvch/add-wilby-engine
Add wiby.me engine
2021-01-16 22:36:31 +01:00
Alexandre Flament b405646749
Merge pull request #2451 from mrwormo/invidious-engine
[Fix] Invidious Engine
2021-01-16 19:25:45 +01:00
Alexandre Flament 709dd960f1
Merge pull request #2473 from return42/fix-setup.py
[fix] setup.py requires pyyaml installed
2021-01-16 19:05:36 +01:00
Alexandre Flament 1d13ad8452
Merge pull request #2460 from dalf/engine-about
[enh] engines: add about variable
2021-01-16 19:05:17 +01:00
Markus Heiser c4a98862bf [fix] setup.py requires pyyaml installed
pip install -e .
...
Obtaining file:///usr/local/searx/searx-src
    ERROR: Command errored out with exit status 1:
     command: /usr/local/searx/searx-pyenv/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/usr/local/searx/searx-src/setup.py'"'"'; __file__='"'"'/usr/local/searx/searx-src/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'rn'"'"', '"'"'n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-vzer91m2
         cwd: /usr/local/searx/searx-src/
    Complete output (9 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/searx/searx-src/setup.py", line 10, in <module>
        from searx.version import VERSION_STRING
      File "/usr/local/searx/searx-src/searx/__init__.py", line 19, in <module>
        import searx.settings_loader
      File "/usr/local/searx/searx-src/searx/settings_loader.py", line 8, in <module>
        import yaml
    ModuleNotFoundError: No module named 'yaml'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-16 08:58:13 +01:00
Alexandre Flament a4dcfa025c [enh] engines: add about variable
move meta information from comment to the about variable
so the preferences, the documentation can show these information
2021-01-14 20:57:17 +01:00
Alexandre Flament 5a511f0d62 [fix] CI: fix docker push 2021-01-14 20:35:10 +01:00
Alexandre Flament 824fe40a28
Merge pull request #2467 from dalf/fix-ci
[fix] github actions: use ubuntu-20.04 instead of ubuntu-latest
2021-01-14 17:14:59 +01:00
Alexandre Flament 38090daa29 [fix] github actions: use ubuntu-20.04 instead of ubuntu-latest 2021-01-14 16:49:17 +01:00
mrwormo 2dff3887f0 [fix] Invidious engine by enabling requests by randomly picking amongst working instances 2021-01-14 12:12:56 +01:00
Alexandre Flament 484dc99580
Merge pull request #2419 from dalf/checker
[enh] add checker
2021-01-13 15:46:48 +01:00
Alexandre Flament 912c7e975c [fix] checker: don't run the checker when uwsgi is not properly configured
Before this commit, even with the scheduler disabled, the checker was running
at least once for each uwsgi worker.
2021-01-13 14:07:39 +01:00
Alexandre Flament 7f0c508598 [fix] checker: fix typo unknown instead of unknow 2021-01-12 11:47:17 +01:00
Alexandre Flament a0c8b413a6 [mod] searx.shared: minor tweaks
searx.shared.shared_abstract.SharedDict inherit from abc.ABC
searx.shared.shared_uwsgi.schedule can schedule multiple functions without issue
2021-01-12 11:47:17 +01:00