Table of Contents
- Searx milestones
- Milestone 1.1 - Clean up
- Milestone 1.2 - async
- Milestone 1.4 - network
- Milestone 1.5 - better statistics about the engines
- Milestone 1.6 - infoboxes engines
- Milestone 1.7 - better engine framework
- Milestone 1.8 - framework for the on_result plugins
- Milestone 1.9 - autocomplete
- Milestone 1.10 - data upgrade & build process
- Milestone 1.11 - upgrade oscar theme
- Documentation & packaging milestones
Searx milestones
Milestone 1.1 - Clean up
- #1471 : Drop Python 2. This is more than running 2to3: there are encoding issues here and there related to Python 2 support.
- clean up:
- webapp.py
- the dependencies. For example pyopenssl can be optional now (see https://requests.readthedocs.io/en/master/community/updates/#id1 version 2.24.0).
- Backport from the different forks:
- https://gitlab.e.foundation/e/cloud/my-spot ( searx#1674 )
- https://github.com/entropage/mijisou (not sure which commits, but at least have a look)
- Add typing to the core components.
Milestone 1.2 - async
- Replace requests with httpx or aiohttp
See https://github.com/searx/searx/pull/1856: use httpx instead of requests. Drop source IP rotation and proxy support in this milestone.
- but read also #503#issuecomment-647025488
- related to #899
- see also : https://bugs.python.org/issue36098 and encode/httpx#1031
- solution: monkey patch for a time ( for example encode/httpcore#107 )
- https://github.com/searx/searx/wiki/New-architecture-proposal (see also PR #1724): switch to async (starlette + httpx) instead of one thread per engine. After this task (and perhaps the previous one), the master branch would switch to maintenance mode for a time (a few weeks?). Incoming new work can go to feature branches.
- replace uwsgi by uvicorn
- fork at runtime:
- one for front-end with one asyncio loop.
- one for back-end with another asyncio loop.
- make sure translation works as before (see https://github.com/encode/starlette/issues/279#issuecomment-505243515 )
- make sure performance is at least equal (on low-end machines and on up-to-date hardware) <-- this one will take time.
- make sure the dev environment works (reload)
- define deployment configuration: encode/uvicorn#517
- update documentation / installation scripts.
- https://github.com/searx/searx/blob/master/utils/searx.sh
- https://github.com/searx/searx/blob/master/dockerfiles/docker-entrypoint.sh
- https://github.com/searx/searx/tree/master/docs
- safety net: if uwsgi is detected, then stop.
- Optional: implement (settings.yml can disable this feature): Extend global timeout when there is no result #948
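The switch from one thread per engine to one coroutine per engine can be sketched as follows. This is a minimal illustration using only the standard library, not the searx implementation: `fake_engine`, the engine names and the delays are hypothetical stand-ins for real HTTP requests (which would go through httpx in this milestone).

```python
import asyncio


async def fake_engine(name: str, delay: float) -> list[str]:
    # Hypothetical stand-in for an engine sending an HTTP request.
    await asyncio.sleep(delay)
    return [f"{name}: result"]


async def search(engines: dict[str, float], global_timeout: float) -> dict[str, list[str]]:
    # One task per engine on a single event loop, instead of one thread per engine.
    tasks = {
        asyncio.create_task(fake_engine(name, delay)): name
        for name, delay in engines.items()
    }
    done, pending = await asyncio.wait(tasks, timeout=global_timeout)
    # Engines that did not answer within the global timeout are cancelled.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return {tasks[task]: task.result() for task in done}


def run_demo() -> dict[str, list[str]]:
    # "slow" exceeds the global timeout and is dropped from the results.
    return asyncio.run(search({"fast": 0.01, "slow": 5.0}, global_timeout=0.5))
```

Extending the global timeout when there is no result (#948) would fit in `search`: if `done` is empty, wait on `pending` a second time before cancelling.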
Milestone 1.4 - network
- IP rotation per engine (for now it is per request even on different engines).
- allow specifying an IP range (useful for IPv6). Related to searx#1034 (it would be better to detect IPv6 support to avoid maintenance; dnspython can help).
- allow specifying a list of proxies.
- being able to define a retry policy.
- what will take time here is test, test, test.
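As a rough sketch of how these pieces could fit together, the code below gives each engine its own rotation over an IP range and a proxy list, plus a small retry policy. All names (`EngineNetwork`, `RetryPolicy`, `call_with_retry`) are hypothetical, and `send` stands in for whatever function actually performs the HTTP request.

```python
import ipaddress
import itertools
from dataclasses import dataclass
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def source_ips(cidr: str) -> Iterator[str]:
    # Walk the range by index so that a huge IPv6 range is never materialised.
    network = ipaddress.ip_network(cidr)
    for index in itertools.count():
        yield str(network[index % network.num_addresses])


class EngineNetwork:
    """Rotation state owned by ONE engine (hypothetical helper)."""

    def __init__(self, cidr: str, proxies: list[str]) -> None:
        self._ips = source_ips(cidr)
        self._proxies = itertools.cycle(proxies)

    def next_route(self) -> tuple[str, str]:
        return next(self._ips), next(self._proxies)


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 3
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError)


def call_with_retry(send: Callable[[str, str], T], network: EngineNetwork, policy: RetryPolicy) -> T:
    if policy.max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: BaseException | None = None
    for _ in range(policy.max_attempts):
        # Every attempt uses the next source IP and the next proxy of this engine.
        source_ip, proxy = network.next_route()
        try:
            return send(source_ip, proxy)
        except policy.retry_on as error:
            last_error = error
    assert last_error is not None
    raise last_error
```

Keeping one `EngineNetwork` instance per engine is what makes the rotation per engine rather than per request across all engines.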
Milestone 1.5 - better statistics about the engines
Updated version of the PR "measure response time with more details" (#447); see #162#issuecomment-76623027.
Record accurate statistics and display graphs of them (produce the graphs on the server to avoid JavaScript usage).
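One possible shape for this, sketched with the standard library only: response times are recorded per engine, summarised, and rendered as a static SVG on the server so that no JavaScript is needed. `EngineStats` and `svg_bars` are hypothetical names, not existing searx code.

```python
import html
import statistics
from collections import defaultdict


class EngineStats:
    def __init__(self) -> None:
        self._times: dict[str, list[float]] = defaultdict(list)
        self._errors: dict[str, int] = defaultdict(int)

    def record(self, engine: str, seconds: float, ok: bool = True) -> None:
        self._times[engine].append(seconds)
        if not ok:
            self._errors[engine] += 1

    def summary(self, engine: str) -> dict[str, float]:
        # Assumes at least one request was recorded for this engine.
        times = self._times[engine]
        return {
            "count": len(times),
            "median": statistics.median(times),
            "max": max(times),
            "error_rate": self._errors[engine] / len(times),
        }


def svg_bars(medians: dict[str, float], pixels_per_second: float = 100.0) -> str:
    # One horizontal bar per engine; the engine name is shown as a tooltip.
    rows = []
    for index, (engine, value) in enumerate(sorted(medians.items())):
        width = round(value * pixels_per_second)
        rows.append(
            f'<rect x="0" y="{index * 12}" width="{width}" height="10">'
            f"<title>{html.escape(engine)}</title></rect>"
        )
    return f'<svg xmlns="http://www.w3.org/2000/svg">{"".join(rows)}</svg>'
```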
Milestone 1.6 - infoboxes engines
The wikidata engine is the default engine to display infoboxes. Unfortunately, it is slow. The duckduckgo_definition engine is faster but requires some work to provide more user-friendly information.
- improve response time of the wikidata engine. Define helpers:
- load all property name translations at load time (one SPARQL request).
- build one big SPARQL request template at load time.
- use the big SPARQL request instead of asking for the HTML version.
- parse JSON with a functionToApply[propertyName]
- improve the results of the ddd (duckduckgo_definition) engine:
- define common data_type
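The `functionToApply[propertyName]` idea can be illustrated with a dispatch table applied to a response in the standard SPARQL JSON results format (`results.bindings`, each cell carrying a `value`). The property names and converters below are hypothetical examples, not the actual list the wikidata engine would use.

```python
from typing import Callable

# Hypothetical converters, keyed by the SPARQL variable (property) name.
FUNCTION_TO_APPLY: dict[str, Callable[[str], object]] = {
    "population": int,
    "inception": lambda value: value[:10],  # keep the date part of an ISO timestamp
    "website": str.strip,
}


def parse_bindings(response: dict) -> list[dict]:
    rows = []
    for binding in response["results"]["bindings"]:
        row = {}
        for name, cell in binding.items():
            # Properties without a dedicated converter are kept as plain strings.
            convert = FUNCTION_TO_APPLY.get(name, str)
            row[name] = convert(cell["value"])
        rows.append(row)
    return rows
```

Since the table is plain data, it can be built once at load time together with the SPARQL request template.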
Milestone 1.7 - better engine framework
- Apply this : #302#issuecomment-565828553 (the issue, but not the comment, is included in the version 1.0). See this gist : https://gist.github.com/dalf/3c3904699153a741f27842d8ea30b449
- #1802 : Engine code: describe which XPath can fail, which must not. The idea: if an engine fails, we should know why: missing XPath result, missing JSON result, internal error, unexpected data, etc. To sum up: the purpose is to create a better framework / toolset for the engines. It will take some time to review all the engines and find what kind of error to report (the purpose is not to fix them, but to be able to report the errors). For example: issue a warning if there is an unexpected HTTP redirect.
- Integrate https://github.com/searx/searx-checker into searx: see #1559 : Add some code directly into the engine to make sure that it is working as expected. For example, list some requests that should work, and the expected results. Most probably it should be code rather than data because each engine behaves differently. So CI can include a report.
- Expose the errors to a public API so searx-stats2 can collect them (for example: this XPath in this engine fails 40% of the time). Triple check that everything is anonymous.
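To make "which XPath can fail, which must not" concrete, here is a minimal sketch of an extraction helper that distinguishes required from optional paths and counts every miss so it can be reported later. It uses `xml.etree.ElementTree` (which supports only a subset of XPath) as a stand-in for lxml, and the names `extract`, `MissingXPathResult` and `ERROR_COUNTS` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Optional


class EngineError(Exception):
    """Base class for errors an engine can report."""


class MissingXPathResult(EngineError):
    """A path that must match did not match anything."""


# (engine, path, kind) -> number of misses; contains no user or query data.
ERROR_COUNTS: Counter = Counter()


def extract(root: ET.Element, path: str, engine: str, required: bool = True) -> Optional[str]:
    nodes = root.findall(path)
    if nodes:
        return "".join(nodes[0].itertext()).strip()
    ERROR_COUNTS[(engine, path, "required" if required else "optional")] += 1
    if required:
        # The engine cannot build a result without this value: report why.
        raise MissingXPathResult(f"{engine}: no match for {path}")
    # Optional value: the miss is counted, but the engine keeps going.
    return None
```

Counters like `ERROR_COUNTS` are the kind of aggregate that could be exposed to searx-stats2, since they hold only the engine name and the failing path.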
Milestone 1.8 - framework for the on_result plugins
Related issue: #2080
The on_result plugins can define some triggers: searx calls the "on_result" functions only when the host matches.
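A host-based trigger could look like the sketch below: plugins register for specific hosts, and the dispatcher calls them only for results whose URL host matches. The decorator `on_result_for`, the registry and the example plugin are hypothetical and do not reflect the current plugin API.

```python
from collections import defaultdict
from typing import Callable
from urllib.parse import urlparse

_PLUGINS_BY_HOST: dict[str, list[Callable[[dict], None]]] = defaultdict(list)


def on_result_for(*hosts: str):
    # Register the decorated function as an on_result hook for the given hosts.
    def register(function: Callable[[dict], None]):
        for host in hosts:
            _PLUGINS_BY_HOST[host].append(function)
        return function
    return register


@on_result_for("www.example.org")
def strip_query_string(result: dict) -> None:
    # Hypothetical plugin: drop the query string of the result URL.
    result["url"] = result["url"].split("?", 1)[0]


def run_on_result(result: dict) -> int:
    # Look up plugins by host instead of calling every plugin for every result.
    host = urlparse(result["url"]).hostname
    plugins = _PLUGINS_BY_HOST.get(host, [])
    for plugin in plugins:
        plugin(result)
    return len(plugins)
```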
Milestone 1.9 - autocomplete
- #392 : include answers in the autocomplete results.
- autocomplete with the external bangs.
Milestone 1.10 - data upgrade & build process
Related issues:
Check with the different searx package maintainers.
Milestone 1.11 - upgrade oscar theme
- Upgrade the dependencies (jquery, bootstrap, leaflet, etc.)
- Drop IE support
- Optimize some of the HTML <--- see performance on FF, Chrome, mobile, desktop:
- perhaps merge some files
- /translations.js slows down Searx. #2064
- reduce file size if possible (partial bootstrap).
Documentation & packaging milestones
Milestone 2.1 - common configuration files
Find a way to have a reference configuration. Currently, there are 3 versions of the filtron configuration:
- https://github.com/searx/searx-docker/blob/master/rules.json
- https://github.com/searx/searx/blob/master/utils/templates/etc/filtron/rules.json
- https://asciimoo.github.io/searx/admin/filtron.html#sample-configuration-of-filtron
The purpose is to ensure the default setup is secure (HTTP headers, known working filtron configuration, etc.). If it is updated, it is updated everywhere.
The same can be done for the reverse proxy configuration / HTTP headers.
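Until a single reference exists, drift between the copies could at least be detected automatically, for example in CI. The sketch below compares JSON documents by a fingerprint of their canonical form, so formatting differences are ignored; the function names and the sample rule content are hypothetical.

```python
import hashlib
import json


def fingerprint(json_text: str) -> str:
    # Canonical form: parsed, keys sorted, no insignificant whitespace.
    canonical = json.dumps(json.loads(json_text), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(copies: dict[str, str], reference: str) -> list[str]:
    # Return the names of the copies whose content differs from the reference.
    expected = fingerprint(copies[reference])
    return sorted(name for name, text in copies.items() if fingerprint(text) != expected)
```

In practice `copies` would be filled by reading the `rules.json` files from each repository, with one of them chosen as the reference.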