A commit from 2018 prevented HTTP responses with error codes > 500
from including any user-set headers, preventing a developer from
including things like content-type.
Reported by Arun Babu via users@.
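For example, with the fix a handler can once again attach its own
content-type to a 5xx response. A minimal sketch (the handler name and
JSON body are just illustration; http_response_header() and
http_response() are the regular Kore calls):

#include <kore/kore.h>
#include <kore/http.h>

int     page(struct http_request *);

int
page(struct http_request *req)
{
    /* with the fix, this user-set header is included even on a 5xx response */
    http_response_header(req, "content-type", "application/json");
    http_response(req, 503, "{\"error\": \"try later\"}", 22);

    return (KORE_RESULT_OK);
}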
If an iterator is passed, Kore will send the response with
transfer-encoding: chunked and call the iterator for every
chunk that is sent.
The iterator must return a UTF-8 string.
Works wonderfully with TemplateStream from jinja2.
This commit adds the CURL=1 build option. When enabled, it allows
you to schedule libcurl easy handles onto the Kore event loop.
It also adds an easy-to-use HTTP client API that abstracts away the
libcurl settings required to make HTTP requests.
Tied together with HTTP request state machines, this means you can
write fully asynchronous HTTP client requests with ease.
Additionally, this exposes that API to the Python code as well,
allowing you to do things like:
client = kore.httpclient("https://kore.io")
status, body = await client.get()
Introduces two configuration options:
  - curl_recv_max
    Maximum number of incoming bytes for a response.
  - curl_timeout
    Timeout in seconds before a transfer is cancelled.
This API also allows you to take the curl easy handle and use it to
send emails, run FTP transfers, etc., all asynchronously.
Move away from the parent process constantly hitting the disk for
every accesslog entry the workers are sending.
The workers now write their own accesslogs to shared memory, from
where the parent picks them up. The parent flushes them to disk once
every second, or sooner if they grow larger than 1MB.
This removes the heavy penalty for having access logs
turned on when you are dealing with a large volume
of requests.
- add kore.time() as the equivalent of kore_time_ms().
- call waitpid() until no more children are available for reaping, otherwise
  we risk missing a process if several die at the same time and only one
  SIGCHLD is delivered to us.
- drain a RECV socket operation if eof is set but no exception was given.
This means you can now do things like:
resp = await koresock.recv(1024)
await koresock.send(resp)
directly from page handlers if they are defined as async.
Adds lots more to the Python goo, such as fatalx(), bind_unix(),
task_create() and socket_wrap().
Don't let net_recv_flush() do anything as long as the HTTP layer
owns the buffer. Once we have sent a response, kick the read end
back into gear ourselves by calling net_recv_flush().
The digest is calculated while the HTTP body is incoming over the wire;
once the body is fully received, the digest will be available for the
page handlers to obtain.
You can obtain a hex string for this digest via http_body_digest(), or
dereference the http_request and look at its http_body_digest member
manually for the raw bytes.
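A rough sketch of grabbing the hex string from a handler; the buffer
size is a guess and the KORE_RESULT-style return value of
http_body_digest() is assumed here, check http.h for the real constants:

#include <string.h>

#include <kore/kore.h>
#include <kore/http.h>

int     page(struct http_request *);

int
page(struct http_request *req)
{
    char    digest[128];    /* assumed large enough for the hex string */

    if (!http_body_digest(req, digest, sizeof(digest))) {
        http_response(req, 500, NULL, 0);
        return (KORE_RESULT_OK);
    }

    /* echo the hex digest of the received body back to the client */
    http_response(req, 200, digest, strlen(digest));

    return (KORE_RESULT_OK);
}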
A filemap is a way of telling Kore to serve files from a directory,
much like a traditional webserver does.
Kore filemaps only handle files: Kore does not generate directory
indexes or deal with non-regular files.
The way files are sent to a client differs a bit per platform and
build options:
  default:
    - mmap() backed file transfer due to TLS.
  NOTLS=1:
    - sendfile() under FreeBSD, macOS and Linux.
    - mmap() backed file transfer for OpenBSD.
The opened file descriptors/mmap'd regions are cached and reused when
appropriate. If a file is no longer in use it will be closed and evicted
from the cache after 30 seconds.
New APIs are available, allowing developers to use these facilities:
void net_send_fileref(struct connection *, struct kore_fileref *);
void http_response_fileref(struct http_request *, struct kore_fileref *);
Kore will attempt to match media types based on file extensions. A few
default types are built-in. Others can be added via the new "http_media_type"
configuration directive.
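A hedged sketch of using the new calls from a handler; only the two
prototypes listed above are from this change, the kore_fileref_get()
lookup and the path are assumptions for illustration:

#include <kore/kore.h>
#include <kore/http.h>

int     page(struct http_request *);

int
page(struct http_request *req)
{
    struct kore_fileref     *ref;

    /* assumption: kore_fileref_get() resolves a path to a (cached) fileref */
    if ((ref = kore_fileref_get("assets/index.html")) == NULL) {
        http_response(req, 404, NULL, 0);
        return (KORE_RESULT_OK);
    }

    /* hand the fileref to the HTTP layer, which releases it when done */
    http_response_fileref(req, ref);

    return (KORE_RESULT_OK);
}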
This option allows a user to fine-tune the maximum number of milliseconds
a worker process will spend inside the http_process() loop.
By default this is 10ms.
The HTTP layer used to make a copy of each incoming header and its
value for a request. Stop doing that and make HTTP headers zero-copy
across the board.
This change comes with some API function changes, notably
http_request_header(), which now takes a const char ** rather
than a char ** out pointer.
This commit also constifies several members of http_request, beware.
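For example, a handler that reads a request header now hands in a const
pointer; a small sketch using http_request_header() and kore_log(),
with the handler itself being illustrative:

#include <syslog.h>

#include <kore/kore.h>
#include <kore/http.h>

int     page(struct http_request *);

int
page(struct http_request *req)
{
    const char  *agent;    /* previously a char *, now const */

    if (http_request_header(req, "user-agent", &agent))
        kore_log(LOG_NOTICE, "user-agent: %s", agent);

    http_response(req, 200, NULL, 0);

    return (KORE_RESULT_OK);
}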
Additional rework of how the worker processes deal with the accept lock.
Before:
  if a worker held the accept lock and it accepted a new connection,
  it would release the lock for others and back off for 500ms before
  attempting to grab the lock again.
  This approach worked, but under high load the back-off starts to
  become noticeable.
Now:
  - workers not holding the accept lock and not having any connections
    will wait for a shorter time before returning from
    kore_platform_event_wait().
  - workers not holding the accept lock will no longer blindly wait
    an arbitrary amount of time in kore_platform_event_wait(), but will
    look at how long it is until the next lock grab and base their
    timeout on that.
  - if a worker's next_lock timeout is up and it failed to grab the
    lock, it will try again in half that time.
  - the worker process holding the lock will, when releasing the lock,
    double check whether it still has room for newer connections; if it
    does, it will keep the lock until it is full. This prevents the lock
    from constantly bouncing between several non-busy worker processes.
Additional fixes:
  - Reduce the number of times we check the timeout list: only do it twice
    per second rather than every event tick.
  - Fix the solo worker count for TLS (we actually hold two processes, not one).
  - Make sure we don't accidentally miscalculate the idle time, causing new
    connections under heavy load to be dropped instantly.
  - Swap from gettimeofday() to clock_gettime() now that macOS has caught up.
Before, http_request_limit just constrained the number of HTTP
requests we would deal with in a single http_process_requests() call.
But it should really mean the maximum number of HTTP requests allowed
to be alive in the worker process before we start sending back 503s.
While here, drop the lock timeout for a worker from 500ms down to 100ms
and do not allow a worker to grab the accept lock if its HTTP request
queue is full.
This makes things much more pleasant memory-wise, as the http_request_pool
won't just grow over time.
Before, "params get" would mean the querystring, and anything else
would just count toward a www-encoded body.
Now you can prefix the params block with "qs", indicating that
those configured parameters are allowed to occur in the query
string regardless of the method used.
This means you can do something like:
params qs:post /uri {
    ...
}
to specify what the allowed parameters are in the querystring for
a POST request towards /uri.
Inspired by and properly fixes #205.
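An illustrative sketch of consuming such a querystring parameter from
the handler behind that POST route; http_populate_qs() and
http_argument_get_string() as the lookup helpers, the route and the
"id" parameter are assumptions here:

#include <string.h>

#include <kore/kore.h>
#include <kore/http.h>

int     page(struct http_request *);

int
page(struct http_request *req)
{
    char    *id;

    /* assumption: http_populate_qs() parses the configured querystring params */
    http_populate_qs(req);

    if (http_argument_get_string(req, "id", &id)) {
        /* the validated querystring value is available even on a POST */
        http_response(req, 200, id, strlen(id));
        return (KORE_RESULT_OK);
    }

    http_response(req, 400, NULL, 0);

    return (KORE_RESULT_OK);
}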