add generated files of parent

Noémi Ványi 2016-10-30 01:02:58 +02:00
parent ee9b902a39
commit 5a1f928155
4 changed files with 290 additions and 0 deletions

114
_sources/admin/filtron.txt Normal file

@ -0,0 +1,114 @@
How to protect an instance
==========================
Searx depends on external search services. To avoid abuse of these services, it is advised to limit the number of requests processed by searx.
An application firewall, ``filtron``, solves exactly this problem. Information on how to install it can be found at the `project page of filtron <https://github.com/asciimoo/filtron>`__.
Sample configuration of filtron
-------------------------------
An example configuration can be found below. This configuration limits the following:
* scripts or applications (roboagent limit)
* web crawlers (botlimit)
* IPs which send too many requests (IP limit)
* too many requests for json, csv or rss output (rss/json limit)
* too many requests sharing the same User-Agent (useragent limit)
.. code:: json
[
{
"name": "search request",
"filters": ["Param:q", "Path=^(/|/search)$"],
"interval": <time-interval-in-sec>,
"limit": <max-request-number-in-interval>,
"subrules": [
{
"name": "roboagent limit",
"interval": <time-interval-in-sec>,
"limit": <max-request-number-in-interval>,
"filters": ["Header:User-Agent=(curl|cURL|Wget|python-requests|Scrapy|FeedFetcher|Go-http-client)"],
"actions": [
{"name": "block",
"params": {"message": "Rate limit exceeded"}}
]
},
{
"name": "botlimit",
"limit": 0,
"stop": true,
"filters": ["Header:User-Agent=(Googlebot|bingbot|Baiduspider|yacybot|YandexMobileBot|YandexBot|Yahoo! Slurp|MJ12bot|AhrefsBot|archive.org_bot|msnbot|MJ12bot|SeznamBot|linkdexbot|Netvibes|SMTBot|zgrab|James BOT)"],
"actions": [
{"name": "block",
"params": {"message": "Rate limit exceeded"}}
]
},
{
"name": "IP limit",
"interval": <time-interval-in-sec>,
"limit": <max-request-number-in-interval>,
"stop": true,
"aggregations": ["Header:X-Forwarded-For"],
"actions": [
{"name": "block",
"params": {"message": "Rate limit exceeded"}}
]
},
{
"name": "rss/json limit",
"interval": <time-interval-in-sec>,
"limit": <max-request-number-in-interval>,
"stop": true,
"filters": ["Param:format=(csv|json|rss)"],
"actions": [
{"name": "block",
"params": {"message": "Rate limit exceeded"}}
]
},
{
"name": "useragent limit",
"interval": <time-interval-in-sec>,
"limit": <max-request-number-in-interval>,
"aggregations": ["Header:User-Agent"],
"actions": [
{"name": "block",
"params": {"message": "Rate limit exceeded"}}
]
}
]
}
]
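The ``<time-interval-in-sec>`` and ``<max-request-number-in-interval>`` placeholders above have to be replaced with numbers before the file is valid JSON. The following is a minimal sketch of one way to do that, not part of filtron itself: the helper names ``build_rules`` and ``dump_rules`` are hypothetical, only the outer rule and the "IP limit" subrule are reproduced, and any concrete values passed in are illustrative rather than recommendations.

```python
import json

# Action shared by the rules in the sample configuration.
BLOCK_ACTION = {"name": "block", "params": {"message": "Rate limit exceeded"}}


def build_rules(interval, limit):
    """Return a reduced rule set with numeric interval/limit values.

    Only the outer "search request" rule and the "IP limit" subrule are
    included; the other subrules follow the same pattern.
    """
    return [
        {
            "name": "search request",
            "filters": ["Param:q", "Path=^(/|/search)$"],
            "interval": interval,
            "limit": limit,
            "subrules": [
                {
                    "name": "IP limit",
                    "interval": interval,
                    "limit": limit,
                    "stop": True,
                    "aggregations": ["Header:X-Forwarded-For"],
                    "actions": [BLOCK_ACTION],
                },
            ],
        }
    ]


def dump_rules(rules):
    """Serialize the rules and confirm the result parses back as JSON."""
    text = json.dumps(rules, indent=4)
    json.loads(text)  # raises ValueError if the output is not valid JSON
    return text
```

Writing the result of ``dump_rules(build_rules(60, 30))`` to ``rules.json`` would give a file in the shape shown above; the values 60 and 30 are assumptions chosen only for the example.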
Route request through filtron
-----------------------------
Filtron can be started using the following command:
.. code:: bash
$ filtron -rules rules.json
It listens on 127.0.0.1:4004 and forwards filtered requests to 127.0.0.1:8888 by default.
Use it along with ``nginx`` with the following example configuration.
.. code:: bash
location / {
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Scheme $scheme;
proxy_pass http://127.0.0.1:4004/;
}
Requests arrive at port 4004, pass through filtron, and are then forwarded to port 8888, where searx is running.
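To check that requests really pass through this chain, one option is to send a handful of requests and tally the response codes. The sketch below is an assumption-laden illustration using only the Python standard library: the function names ``fetch_status`` and ``probe`` are hypothetical, and the URL in the usage note is a placeholder for wherever nginx is reachable.

```python
import collections
import urllib.error
import urllib.request


def fetch_status(url, user_agent):
    """Send one GET request and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still worth counting.
        return error.code


def probe(url, attempts, user_agent, fetch=fetch_status):
    """Send `attempts` requests and count how often each status code occurs."""
    return collections.Counter(fetch(url, user_agent) for _ in range(attempts))
```

For example, ``probe("http://localhost/?q=test", 20, "curl/7.50.0")`` sends twenty search requests with a User-Agent that matches the "roboagent limit" filter. If the limit in ``rules.json`` is lower than the number of attempts, the later requests should show up under a different status code than the earlier ones.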


@ -38,6 +38,7 @@ Administrator documentation
dev/install/installation
admin/api
admin/filtron
Developer documentation
-----------------------

174
admin/filtron.html Normal file

@ -0,0 +1,174 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>How to protect an instance &#8212; searx 0.9.0 documentation</title>
<link rel="stylesheet" href="../_static/style.css" type="text/css" />
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: '../',
VERSION: '0.9.0',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="top" title="searx 0.9.0 documentation" href="../index.html" />
<link rel="next" title="Development Quickstart" href="../dev/quickstart.html" />
<link rel="prev" title="Administration API" href="api.html" />
<link media="only screen and (max-device-width: 480px)" href="../_static/small_flask.css" type= "text/css" rel="stylesheet" />
<meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9">
</head>
<body role="document">
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<div class="section" id="how-to-protect-an-instance">
<h1>How to protect an instance<a class="headerlink" href="#how-to-protect-an-instance" title="Permalink to this headline"></a></h1>
<p>Searx depends on external search services. To avoid abuse of these services, it is advised to limit the number of requests processed by searx.</p>
<p>An application firewall, <code class="docutils literal"><span class="pre">filtron</span></code>, solves exactly this problem. Information on how to install it can be found at the <a class="reference external" href="https://github.com/asciimoo/filtron">project page of filtron</a>.</p>
<div class="section" id="sample-configuration-of-filtron">
<h2>Sample configuration of filtron<a class="headerlink" href="#sample-configuration-of-filtron" title="Permalink to this headline"></a></h2>
<p>An example configuration can be found below. This configuration limits the following:</p>
<blockquote>
<div><ul class="simple">
<li>scripts or applications (roboagent limit)</li>
<li>web crawlers (botlimit)</li>
<li>IPs which send too many requests (IP limit)</li>
<li>too many requests for json, csv or rss output (rss/json limit)</li>
<li>too many requests sharing the same User-Agent (useragent limit)</li>
</ul>
</div></blockquote>
<div class="code json highlight-default"><div class="highlight"><pre><span></span><span class="p">[</span>
<span class="p">{</span>
<span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;search request&quot;</span><span class="p">,</span>
<span class="s2">&quot;filters&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;Param:q&quot;</span><span class="p">,</span> <span class="s2">&quot;Path=^(/|/search)$&quot;</span><span class="p">],</span>
<span class="s2">&quot;interval&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="n">time</span><span class="o">-</span><span class="n">interval</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">sec</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;limit&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="nb">max</span><span class="o">-</span><span class="n">request</span><span class="o">-</span><span class="n">number</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">interval</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;subrules&quot;</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span>
<span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;roboagent limit&quot;</span><span class="p">,</span>
<span class="s2">&quot;interval&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="n">time</span><span class="o">-</span><span class="n">interval</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">sec</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;limit&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="nb">max</span><span class="o">-</span><span class="n">request</span><span class="o">-</span><span class="n">number</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">interval</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;filters&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;Header:User-Agent=(curl|cURL|Wget|python-requests|Scrapy|FeedFetcher|Go-http-client)&quot;</span><span class="p">],</span>
<span class="s2">&quot;actions&quot;</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;block&quot;</span><span class="p">,</span>
<span class="s2">&quot;params&quot;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&quot;message&quot;</span><span class="p">:</span> <span class="s2">&quot;Rate limit exceeded&quot;</span><span class="p">}}</span>
<span class="p">]</span>
<span class="p">},</span>
<span class="p">{</span>
<span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;botlimit&quot;</span><span class="p">,</span>
<span class="s2">&quot;limit&quot;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="s2">&quot;stop&quot;</span><span class="p">:</span> <span class="n">true</span><span class="p">,</span>
<span class="s2">&quot;filters&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;Header:User-Agent=(Googlebot|bingbot|Baiduspider|yacybot|YandexMobileBot|YandexBot|Yahoo! Slurp|MJ12bot|AhrefsBot|archive.org_bot|msnbot|MJ12bot|SeznamBot|linkdexbot|Netvibes|SMTBot|zgrab|James BOT)&quot;</span><span class="p">],</span>
<span class="s2">&quot;actions&quot;</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;block&quot;</span><span class="p">,</span>
<span class="s2">&quot;params&quot;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&quot;message&quot;</span><span class="p">:</span> <span class="s2">&quot;Rate limit exceeded&quot;</span><span class="p">}}</span>
<span class="p">]</span>
<span class="p">},</span>
<span class="p">{</span>
<span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;IP limit&quot;</span><span class="p">,</span>
<span class="s2">&quot;interval&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="n">time</span><span class="o">-</span><span class="n">interval</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">sec</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;limit&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="nb">max</span><span class="o">-</span><span class="n">request</span><span class="o">-</span><span class="n">number</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">interval</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;stop&quot;</span><span class="p">:</span> <span class="n">true</span><span class="p">,</span>
<span class="s2">&quot;aggregations&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;Header:X-Forwarded-For&quot;</span><span class="p">],</span>
<span class="s2">&quot;actions&quot;</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;block&quot;</span><span class="p">,</span>
<span class="s2">&quot;params&quot;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&quot;message&quot;</span><span class="p">:</span> <span class="s2">&quot;Rate limit exceeded&quot;</span><span class="p">}}</span>
<span class="p">]</span>
<span class="p">},</span>
<span class="p">{</span>
<span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;rss/json limit&quot;</span><span class="p">,</span>
<span class="s2">&quot;interval&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="n">time</span><span class="o">-</span><span class="n">interval</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">sec</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;limit&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="nb">max</span><span class="o">-</span><span class="n">request</span><span class="o">-</span><span class="n">number</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">interval</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;stop&quot;</span><span class="p">:</span> <span class="n">true</span><span class="p">,</span>
<span class="s2">&quot;filters&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;Param:format=(csv|json|rss)&quot;</span><span class="p">],</span>
<span class="s2">&quot;actions&quot;</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;block&quot;</span><span class="p">,</span>
<span class="s2">&quot;params&quot;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&quot;message&quot;</span><span class="p">:</span> <span class="s2">&quot;Rate limit exceeded&quot;</span><span class="p">}}</span>
<span class="p">]</span>
<span class="p">},</span>
<span class="p">{</span>
<span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;useragent limit&quot;</span><span class="p">,</span>
<span class="s2">&quot;interval&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="n">time</span><span class="o">-</span><span class="n">interval</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">sec</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;limit&quot;</span><span class="p">:</span> <span class="o">&lt;</span><span class="nb">max</span><span class="o">-</span><span class="n">request</span><span class="o">-</span><span class="n">number</span><span class="o">-</span><span class="ow">in</span><span class="o">-</span><span class="n">interval</span><span class="o">&gt;</span><span class="p">,</span>
<span class="s2">&quot;aggregations&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;Header:User-Agent&quot;</span><span class="p">],</span>
<span class="s2">&quot;actions&quot;</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span><span class="s2">&quot;name&quot;</span><span class="p">:</span> <span class="s2">&quot;block&quot;</span><span class="p">,</span>
<span class="s2">&quot;params&quot;</span><span class="p">:</span> <span class="p">{</span><span class="s2">&quot;message&quot;</span><span class="p">:</span> <span class="s2">&quot;Rate limit exceeded&quot;</span><span class="p">}}</span>
<span class="p">]</span>
<span class="p">}</span>
<span class="p">]</span>
<span class="p">}</span>
<span class="p">]</span>
</pre></div>
</div>
</div>
<div class="section" id="route-request-through-filtron">
<h2>Route request through filtron<a class="headerlink" href="#route-request-through-filtron" title="Permalink to this headline"></a></h2>
<p>Filtron can be started using the following command:</p>
<div class="code bash highlight-default"><div class="highlight"><pre><span></span>$ filtron -rules rules.json
</pre></div>
</div>
<p>It listens on 127.0.0.1:4004 and forwards filtered requests to 127.0.0.1:8888 by default.</p>
<p>Use it along with <code class="docutils literal"><span class="pre">nginx</span></code> with the following example configuration.</p>
<div class="code bash highlight-default"><div class="highlight"><pre><span></span>location / {
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Scheme $scheme;
proxy_pass http://127.0.0.1:4004/;
}
</pre></div>
</div>
<p>Requests arrive at port 4004, pass through filtron, and are then forwarded to port 8888, where searx is running.</p>
</div>
</div>
</div>
</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper"><div class="sidebar_container body">
<h1>Searx</h1>
<ul>
<li><a href="../index.html">Home</a></li>
<li><a href="https://github.com/asciimoo/searx">Source</a></li>
<li><a href="https://github.com/asciimoo/searx/wiki">Wiki</a></li>
<li><a href="https://github.com/asciimoo/searx/wiki/Searx-instances">Public instances</a></li>
</ul>
<hr />
<ul>
<li><a href="https://twitter.com/Searx_engine">Twitter</a></li>
<li><a href="https://flattr.com/submit/auto?user_id=asciimoo&url=https://github.com/asciimoo/searx&title=searx&language=&tags=github&category=software">Flattr</a></li>
<li><a href="https://gratipay.com/searx">Gratipay</a></li>
</ul>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
&copy; Copyright 2015-2016, Adam Tauber.
</div>
</body>
</html>


@ -72,6 +72,7 @@
<ul>
<li class="toctree-l1"><a class="reference internal" href="dev/install/installation.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="admin/api.html">Administration API</a></li>
<li class="toctree-l1"><a class="reference internal" href="admin/filtron.html">How to protect an instance</a></li>
</ul>
</div>
</div>