266 Commits

Author SHA1 Message Date
Joris Vink d0a6958747 Let http_state_create() take an "onfree" callback.
This function is called when an HTTP request is being free'd,
allowing you to perform any sort of state cleanup attached
to the HTTP request.
2019-04-28 21:48:16 +02:00
Joris Vink 2c88bc6120 Add asynchronous libcurl support.
This commit adds the CURL=1 build option. When enabled, it allows
you to schedule CURL easy handles onto the Kore event loop.

It also adds an easy to use HTTP client API that abstracts away the
settings required from libcurl to make HTTP requests.

Tied together with HTTP request state machines this means you can
write fully asynchronous HTTP client requests in an easy way.

Additionally this exposes that API to the Python code as well
allowing you to do things like:

	client = kore.httpclient("https://kore.io")
	status, body = await client.get()

Introduces 2 configuration options:
	- curl_recv_max
		Max incoming bytes for a response.

	- curl_timeout
		Timeout in seconds before a transfer is cancelled.

This API also allows you to take the CURL easy handle and send emails
with it, run FTP, etc. All asynchronously.
2019-04-24 00:15:17 +02:00
Joris Vink c89ba3daa3 check http timeouts better 2019-04-12 14:26:47 +02:00
Joris Vink aa49e181b6 Add http_[header|body]_timeout.
If the HTTP request headers or the HTTP body have not arrived before
these timeouts expire, Kore will send a 408 back to the client.
2019-04-11 20:51:49 +02:00
Joris Vink a191445f76 set body length+offset to 0 when populating data.
otherwise this isn't properly picked up by http_body_read() later
if dealing with in-memory HTTP bodies and you get inconsistent behaviour.
2019-04-02 22:26:44 +02:00
Joris Vink eb9b7f7b14 explicitly include sys/types.h
some smaller libc variants do not include this from sys/param.h.
2019-03-06 09:29:46 +01:00
Joris Vink bf1e8e5ffb bump copyright to 2019 2019-02-22 16:57:28 +01:00
Joris Vink 9aa0e95643 Rework accesslog handling.
Move away from the parent constantly hitting the disk for every
accesslog the workers are sending.

The workers will now write their own accesslogs to shared
memory before the parent will pick those up. The parent
will flush them to disk once every second or if they grow
larger than 1MB.

This removes the heavy penalty for having access logs
turned on when you are dealing with a large volume
of requests.
2018-12-22 09:25:00 +01:00
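The flush rule described in 9aa0e95643 (flush once every second, or when more than 1MB is pending) can be sketched as follows. This is only an illustration under stated assumptions: `AccessLogBuffer`, `sink` and the injectable `now` clock are hypothetical names, and an in-process `bytearray` stands in for the shared memory Kore actually uses.

```python
import time

FLUSH_INTERVAL = 1.0            # seconds, per the commit message
FLUSH_THRESHOLD = 1024 * 1024   # 1MB, per the commit message


class AccessLogBuffer:
    """Hypothetical stand-in for the shared-memory accesslog area."""

    def __init__(self, sink, now=time.monotonic):
        self.sink = sink              # callable that persists a bytes blob
        self.now = now                # injectable clock (assumption, eases testing)
        self.pending = bytearray()
        self.last_flush = now()

    def append(self, entry: bytes) -> None:
        # Worker side: only buffer the entry, never touch the disk.
        self.pending += entry

    def tick(self) -> bool:
        """Parent side: flush if the interval elapsed or the buffer is large."""
        due = self.now() - self.last_flush >= FLUSH_INTERVAL
        big = len(self.pending) > FLUSH_THRESHOLD
        if self.pending and (due or big):
            self.sink(bytes(self.pending))
            self.pending.clear()
            self.last_flush = self.now()
            return True
        return False
```

The point of the design is that the per-request cost is a memory append, while disk writes are batched and bounded by time and size.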
Joris Vink 61b385ae11 do not set CONN_CLOSE_EMPTY for 1.0 until we reply. 2018-11-30 22:12:43 +01:00
Joris Vink 2dd66586ff several python improvements.
- add kore.time() as equivalent for kore_time_ms().
- call waitpid() until no more children are available for reaping otherwise
  we risk missing a process if several die at the same time and only one
  SIGCHLD is delivered to us.
- drain a RECV socket operation if eof is set but no exception was given.
2018-10-30 20:28:27 +01:00
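The waitpid() point in 2dd66586ff is a general Unix pattern: signals of the same type can coalesce, so one SIGCHLD may stand for several dead children. A minimal sketch of the reaping loop, using Python's standard `os.waitpid` rather than Kore's own C code (the `reap_children` name and the injectable `waitpid` parameter are assumptions for illustration):

```python
import os


def reap_children(waitpid=os.waitpid):
    """Collect every exited child and return a list of (pid, status)."""
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call non-blocking.
            pid, status = waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children exist at all
        if pid == 0:
            break  # children exist, but none have exited yet
        reaped.append((pid, status))
    return reaped
```

A SIGCHLD handler (or the event loop reacting to it) would call `reap_children()` once and trust it to drain everything that is ready.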
Joris Vink dda2e1fb2c Some things still talk http/1.0. 2018-10-26 21:24:51 +02:00
Joris Vink 20a0103f1e Add async/await support for socket i/o in python.
This means you can now do things like:

	resp = await koresock.recv(1024)
	await koresock.send(resp)

directly from page handlers if they are defined as async.

Adds lots more to the python goo such as fatalx(), bind_unix(),
task_create() and socket_wrap().
2018-10-15 20:18:54 +02:00
Joris Vink 442bdef79b allow kore to bind to unix sockets via bind_unix. 2018-10-07 20:49:16 +02:00
Joris Vink 566fefd031 do not http_argument_urldecode for multipart data. 2018-08-16 14:11:28 +02:00
Joris Vink cf1f624367 let filerefs operate on ms precision for mtime. 2018-07-24 19:56:36 +02:00
Joris Vink 821c1df8ec use method not allowed when required 2018-07-18 16:24:28 +02:00
Joris Vink 916ce222b4 better fix for 5a5d9fd0.
Don't let net_recv_flush() do things as long as the HTTP layer
owns the buffer. When we have sent a response kick the read end
back into gear ourselves by calling net_recv_flush().
2018-07-18 16:10:41 +02:00
Joris Vink 5a5d9fd0c2 alloc space for nb->buf after taking ownership. 2018-07-18 14:36:13 +02:00
Joris Vink 1447f6573f better http header validation. 2018-07-17 20:17:05 +02:00
Joris Vink 616af063e3 Calculate an md over the incoming HTTP body.
This is calculated while the HTTP body is incoming over the wire, once
the body is fully received the digest will be available for the page
handlers to obtain.

You can obtain a hex string for this md via http_body_digest() or
dereference the http_request and look at http_body_digest manually
for the bytes.
2018-07-17 14:53:55 +02:00
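Computing a digest while the body is still arriving, as 616af063e3 describes, amounts to feeding each received chunk into a running hash. The sketch below uses Python's `hashlib`; the `BodyDigest` class is hypothetical and the choice of SHA-256 is an assumption made for illustration, since the commit message only says "md".

```python
import hashlib


class BodyDigest:
    """Hypothetical running digest over an incoming HTTP body."""

    def __init__(self, algorithm="sha256"):
        # Algorithm choice is an assumption, not taken from the commit.
        self._md = hashlib.new(algorithm)

    def feed(self, chunk: bytes) -> None:
        # Called for every chunk as it comes off the wire.
        self._md.update(chunk)

    def hexdigest(self) -> str:
        # Hex string form, comparable to what a helper function would return.
        return self._md.hexdigest()

    def digest(self) -> bytes:
        # Raw bytes form, comparable to reading the struct member directly.
        return self._md.digest()
```

Because the hash state is updated per chunk, the full body never has to be held in memory just to compute the digest.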
Joris Vink 0726a26c0c Allow restriction of methods for paths.
Now Kore will automatically send a 400 bad request in case the
method was not allowed on the path.
2018-07-17 14:23:57 +02:00
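As a rough illustration of the per-path method restriction in 0726a26c0c, the check reduces to a lookup before the handler runs. The `routes` mapping and `check_method` function are hypothetical, and treating unlisted paths as unrestricted is an assumption. The 400 status follows this commit message; the later entry 821c1df8ec in this log switches to "method not allowed" where required.

```python
def check_method(routes, path, method):
    """Return None if the method is allowed, else the status to send.

    `routes` is a hypothetical mapping of path -> set of allowed methods.
    Paths without an entry are treated as unrestricted (assumption).
    """
    allowed = routes.get(path)
    if allowed is not None and method.upper() not in allowed:
        return 400  # status named in this commit message
    return None
```

For example, with `routes = {"/upload": {"POST", "PUT"}}`, a GET on `/upload` yields 400 while a POST passes through.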
Joris Vink f02f88295c revert b5e122 for now. 2018-07-06 11:21:46 +02:00
Joris Vink 47c1a1d195 set referer to NULL in http_request_new(). 2018-07-05 05:02:49 +00:00
Joris Vink b5e122419b Let http_populate_post() listen to content-type 2018-07-03 08:25:06 +02:00
Joris Vink 4a8d8ab7f8 log referer in accesslog if present. 2018-06-29 22:37:48 +02:00
Joris Vink 72073701b0 Add last-modified and if-modified-since for filemaps. 2018-06-29 09:56:04 +02:00
Joris Vink 521ff6a11d catch more bad ranges in http_argument_urldecode() 2018-06-28 15:39:03 +02:00
Joris Vink 70e945afb7 limit http_argument_urldecode() to sane characters 2018-06-28 15:27:55 +02:00
Joris Vink afd76ff55d Change accesslog format to Combined Log Format. 2018-06-28 14:25:32 +02:00
Joris Vink 80f5425698 Add filemaps.
A filemap is a way of telling Kore to serve files from a directory
much like a traditional webserver can do.

Kore filemaps only handle files. Kore does not generate directory
indexes or deal with non-regular files.

The way files are sent to a client differs a bit per platform and
build options:

default:
  - mmap() backed file transfer due to TLS.

NOTLS=1
  - sendfile() under FreeBSD, macOS and Linux.
  - mmap() backed file for OpenBSD.

The opened file descriptors/mmap'd regions are cached and reused when
appropriate. If a file is no longer in use it will be closed and evicted
from the cache after 30 seconds.

New APIs are available allowing developers to use these facilities via:
  void net_send_fileref(struct connection *, struct kore_fileref *);
  void http_response_fileref(struct http_request *, struct kore_fileref *);

Kore will attempt to match media types based on file extensions. A few
default types are built-in. Others can be added via the new "http_media_type"
configuration directive.
2018-06-28 13:27:44 +02:00
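The caching behaviour described in 80f5425698 (reuse open files, evict them 30 seconds after they stop being used) can be sketched with a small reference-counted cache. Everything here is hypothetical: `FileRefCache`, the `opener` callable and the injectable `now` clock are illustrative names, not Kore's kore_fileref implementation.

```python
import time

EVICT_AFTER = 30.0  # seconds, as stated in the commit message


class FileRefCache:
    """Hypothetical cache of open file handles keyed by path."""

    def __init__(self, opener, now=time.monotonic):
        self.opener = opener   # assumption: path -> object with .close()
        self.now = now         # injectable clock (assumption, eases testing)
        self.refs = {}         # path -> [handle, users, idle_since]

    def acquire(self, path):
        entry = self.refs.get(path)
        if entry is None:
            entry = self.refs[path] = [self.opener(path), 0, None]
        entry[1] += 1          # one more user of this handle
        entry[2] = None        # in use again, so not idle
        return entry[0]

    def release(self, path):
        entry = self.refs[path]
        entry[1] -= 1
        if entry[1] == 0:
            entry[2] = self.now()   # start the idle timer

    def expire(self):
        """Close and drop handles that have been idle for EVICT_AFTER seconds."""
        cutoff = self.now() - EVICT_AFTER
        for path, (handle, users, idle_since) in list(self.refs.items()):
            if users == 0 and idle_since is not None and idle_since <= cutoff:
                handle.close()
                del self.refs[path]
```

Calling `expire()` periodically keeps hot files open across requests while bounding how long unused descriptors linger.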
Joris Vink 9be72aff57 bump size of http_version array. 2018-06-23 17:23:45 +02:00
Joris Vink 8aaf7aaf79 Alter where the version number comes from.
Now if we are a git repo we fetch the branch name and
commitid to build the version string. If there is no
git repo we'll look at the RELEASE file.
2018-06-22 14:24:42 +02:00
Joris Vink 439a3b36f0 Add kore_strtodouble().
Use it for http_argument_get_float() and http_argument_get_double().
2018-05-04 15:55:35 +02:00
Joris Vink 5487950f63 cut off port from the domain when needed. 2018-04-24 20:11:41 +02:00
Joris Vink d73a9114c0 Improve http_response() for server side errors.
In case http_response() is called with an error code indicating
a server side error (>= 500) do not append any headers set by the
caller.
2018-04-11 13:04:26 +02:00
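The rule in d73a9114c0 is simple enough to state in a few lines. In this sketch, `response_headers`, `base_headers` and `caller_headers` are hypothetical names; it only illustrates the ">= 500 drops caller headers" behaviour and is not Kore's http_response() code.

```python
def response_headers(status, base_headers, caller_headers):
    """Return the header list to send for a response.

    base_headers: headers the server always adds (hypothetical).
    caller_headers: headers set by the page handler (hypothetical).
    """
    headers = list(base_headers)
    if status < 500:
        # Caller-supplied headers are only appended for non-server errors.
        headers.extend(caller_headers)
    return headers
```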
Joris Vink 6a35a8a455 remove dead code. 2018-04-03 10:57:40 +02:00
Joris Vink 548068d2a0 Add http_request_ms configuration option.
This option allows a user to fine-tune the maximum number of milliseconds
a worker process will spend inside the http_process() loop.

By default this is 10ms.
2018-03-14 13:41:17 +01:00
Joris Vink 50c3d07b48 remove http_path_pool and http_host_pool.
No longer used.
2018-02-21 09:11:57 +01:00
Joris Vink dd2dff2318 Rework HTTP and worker processes.
The HTTP layer used to make a copy of each incoming header and its
value for a request. Stop doing that and make HTTP headers zero-copy
all across the board.

This change comes with some api function changes, notably the
http_request_header() function which now takes a const char ** rather
than a char ** out pointer.

This commit also constifies several members of http_request, beware.

Additionally rework how the worker processes deal with the accept lock.

Before:
	if a worker held the accept lock and it accepted a new connection
	it would release the lock for others and back off for 500ms before
	attempting to grab the lock again.

	This approach worked but under high load this starts becoming obvious.

Now:
	- workers not holding the accept lock and not having any connections
	  will wait less long before returning from kore_platform_event_wait().

	- workers not holding the accept lock will no longer blindly wait
	  an arbitrary amount in kore_platform_event_wait() but will look
	  at how long until the next lock grab is and base their timeout
	  on that.

	- if a worker's next_lock timeout is up and it failed to grab the
	  lock it will try again in half the time.

	- the worker process holding the lock will when releasing the lock
	  double check if it still has space for newer connections, if it does
	  it will keep the lock until it is full. This prevents the lock from
	  bouncing between several non busy worker processes all the time.

Additional fixes:

- Reduce the number of times we check the timeout list, only do it twice
  per second rather than every event tick.
- Fix solo worker count for TLS (we actually hold two processes, not one).
- Make sure we don't accidentally miscalculate the idle time causing new
  connections under heavy load to instantly drop.
- Swap from gettimeofday() to clock_gettime() now that MacOS caught up.
2018-02-14 13:48:49 +01:00
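Two of the timing rules from dd2dff2318 can be sketched as pure functions: basing the event-wait timeout on the time left until the next lock attempt, and retrying in half the time after a failed grab. The function names, the `max_wait_ms` cap and the use of integer milliseconds are all assumptions for illustration; the actual logic lives in Kore's worker code and may differ.

```python
def wait_timeout_ms(now_ms, next_lock_ms, max_wait_ms=100):
    """How long a worker without the accept lock should wait for events.

    Instead of a fixed, arbitrary wait, use the time left until the next
    lock attempt, clamped to [0, max_wait_ms] (the cap is hypothetical).
    """
    remaining = next_lock_ms - now_ms
    return max(0, min(remaining, max_wait_ms))


def next_lock_after_failure(now_ms, lock_timeout_ms):
    """After a failed grab, schedule the next attempt in half the timeout."""
    return now_ms + lock_timeout_ms // 2
```

For instance, with a 500ms lock timeout, a worker that fails to grab the lock at t=1000ms would try again at t=1250ms under this sketch.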
Joris Vink b3a48f3c15 Let http_request_limit matter.
Before http_request_limit just constrained the number of HTTP
requests we'd deal with in a single http_process_requests() call.

But it should really mean how many maximum HTTP requests are allowed
to be alive in the worker process before we start sending 503s back.

While here, drop the lock timeout for a worker to 100ms down from 500ms
and do not allow a worker to grab the accept lock if their HTTP request
queue is full.

This makes things much more pleasant memory wise as the http_request_pool
won't just grow over time.
2018-02-13 11:56:51 +01:00
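The changed meaning of http_request_limit in b3a48f3c15 (a cap on requests alive in a worker, with 503s beyond it, and no accept lock while full) can be illustrated with a simple counter. The `Worker` class and its method names are hypothetical and only model the bookkeeping, not Kore's request pool.

```python
class Worker:
    """Hypothetical model of a worker's alive-request accounting."""

    def __init__(self, request_limit):
        self.request_limit = request_limit
        self.alive = 0

    def can_grab_accept_lock(self):
        # A worker with a full request queue should not accept more clients.
        return self.alive < self.request_limit

    def request_new(self):
        """Return None if the request is admitted, or 503 if at the limit."""
        if self.alive >= self.request_limit:
            return 503
        self.alive += 1
        return None

    def request_free(self):
        # Called when a request finishes and its resources are released.
        self.alive -= 1
```

Bounding the number of alive requests is what keeps the request pool from growing without limit under sustained load.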
Joris Vink 548348f553 2018 2018-01-20 22:51:06 +01:00
Joris Vink b95b623e72 Allow param blocks to be marked as "querystring"
Before, "params get" would mean querystring and anything else
would just count toward a www-encoded body.

Now you can prefix the params block with "qs" indicating that
those configured parameters are allowed to occur in the query
string regardless of the method used.

This means you can do something like:

params qs:post /uri {
	...
}

to specify what the allowed parameters are in the querystring for
a POST request towards /uri.

inspired by and properly fixes #205.
2018-01-16 18:47:50 +01:00
Joris Vink 915b8e1d3c Use kore_bufs on the stack rather than the pools. 2018-01-15 22:31:54 +01:00
rouzier f0f1296265 Add patch support (#217)
Add PATCH to supported verbs in config and what not.
2018-01-02 22:27:59 +01:00
Joris Vink ae4201c647 make r const 2017-08-08 09:11:41 +02:00
Joris Vink 6415670753 set CONN_CLOSE_EMPTY for early HTTP errors.
while here fix missing connection response headers for errors.
2017-07-04 10:55:11 +02:00
Joris Vink 8e359ede13 flush out send buffer in http_error_response(). 2017-07-04 10:42:14 +02:00
Stanislav Yudin b73343aea4 add HTTP_METHOD_OPTIONS as another supported http method. (#186) 2017-04-04 09:37:19 +02:00
Joris Vink c545a922a1 Preserve the full host header under req->host.
Additionally make this header available via http_request_header().

prompted by #184
2017-03-30 09:38:23 +02:00
Joris Vink 59f7e85f45 Decouple pgsql from the http layer.
When the pgsql layer was introduced it was tightly coupled with the
http layer in order to make async work fluently.

The time has come to split these up and follow the same method we
used for tasks, allowing either http requests to be tied to a pgsql
data structure or a simple callback function.

This also reworks the internal queueing of pgsql requests until
connections to the db are available again.

The following API functions were changed:
	- kore_pgsql_query_init() -> kore_pgsql_setup()
		no longer takes an http_request parameter.
	- NEW kore_pgsql_init()
		must be called before operating on a kore_pgsql structure.
	- NEW kore_pgsql_bind_request()
		binds an http_request to a kore_pgsql data structure.
	- NEW kore_pgsql_bind_callback()
		binds a callback to a kore_pgsql data structure.

With all of this you can now build kore with PGSQL=1 NOHTTP=1.

The pgsql/ example has been updated to reflect these changes and
new features.
2017-03-24 12:53:07 +01:00