More BPF helper macros, more helper for granular syscall checking.
Use these throughout kore where it makes sense.
The new helpers:
- KORE_SYSCALL_DENY_ARG(name, arg, value, errno):
Deny the system call with errno if the argument matches value.
- KORE_SYSCALL_DENY_MASK(name, arg, mask, errno):
Deny the system call with errno if the mask argument does not match
the exact mask given.
- KORE_SYSCALL_DENY_WITH_FLAG(name, arg, flag, errno):
Deny the system call with errno if the argument contains the
given flag.
The reverse also exists:
- KORE_SYSCALL_ALLOW_ARG()
- KORE_SYSCALL_ALLOW_MASK()
- KORE_SYSCALL_ALLOW_WITH_FLAG()
With this commit all Kore processes (minus the parent) are running
under seccomp.
The worker processes get the bare minimum allowed syscalls while each module
like curl, pgsql, etc will add their own filters to allow what they require.
New API functions:
int kore_seccomp_filter(const char *name, void *filter, size_t len);
Adds a filter into the seccomp system (must be called before
seccomp is enabled).
New helpful macro:
define KORE_SYSCALL_ALLOW(name)
Allow the syscall with a given name, should be used in
a sock_filter data structure.
New hooks:
void kore_seccomp_hook(void);
Called before seccomp is enabled, allows developers to add their
own BPF filters into seccomp.
Prevents a stall in case there is still data in the read end of the socket
but PQisBusy() told us to not fetch a result yet. In that case we end up
stalling due to epoll not giving us another EPOLLIN event due to EPOLLET.
- Add kore_pgsql_query_param_fields() which allows you to pass in the
arrays for values, lengths and formats yourself.
- Add kore_pgsql_column_binary() which will return 1 if the given column
index contains a binary result or 0 if it contains a text result.
- Change the query call in req.pgsql() for Python to always use the
parameterized queries.
This adds the 'params' and 'binary' keywords to the req.pgsql method.
Eg:
result = await req.pgsql("db", "INSERT INTO foo (field) VALUES($1"),
params=["this is my value"])
Now anyone can schedule events and get a callback to work as long
as the user data structure that is added for the event begins
with a kore_event data structure.
All event state is now kept in that kore_event structure and renamed
CONN_[READ|WRITE]_POSSIBLE to KORE_EVENT_[READ|WRITE].
The HTTP layer used to make a copy of each incoming header and its
value for a request. Stop doing that and make HTTP headers zero-copy
all across the board.
This change comes with some api function changes, notably the
http_request_header() function which now takes a const char ** rather
than a char ** out pointer.
This commit also constifies several members of http_request, beware.
Additional rework how the worker processes deal with the accept lock.
Before:
if a worker held the accept lock and it accepted a new connection
it would release the lock for others and back off for 500ms before
attempting to grab the lock again.
This approach worked but under high load this starts becoming obvious.
Now:
- workers not holding the accept lock and not having any connections
will wait less long before returning from kore_platform_event_wait().
- workers not holding the accept lock will no longer blindly wait
an arbitrary amount in kore_platform_event_wait() but will look
at how long until the next lock grab is and base their timeout
on that.
- if a worker its next_lock timeout is up and failed to grab the
lock it will try again in half the time again.
- the worker process holding the lock will when releasing the lock
double check if it still has space for newer connections, if it does
it will keep the lock until it is full. This prevents the lock from
bouncing between several non busy worker processes all the time.
Additional fixes:
- Reduce the number of times we check the timeout list, only do it twice
per second rather then every event tick.
- Fix solo worker count for TLS (we actually hold two processes, not one).
- Make sure we don't accidentally miscalculate the idle time causing new
connections under heavy load to instantly drop.
- Swap from gettimeofday() to clock_gettime() now that MacOS caught up.
Limits the number of queued asynchronous queries kore will allow
before starting to return errors for KORE_PGSQL_ASYNC setups.
By default set to 1000.
Avoids uncontrolled growth of the pgsql_queue_wait pool.
- make sure conn_count per pgsqldb structure is initialized to 0.
- allow pgsql_conn_max to be 0, meaning just create a new connection
if none was free.
if we fail at rolling back an in-error transaction on a connection
just remove that connection and go back to rescanning rather then
returning an error to the caller, there may be more functional
connections in the pipeline.
- Make pgsql_conn_count count per database rather then globally.
This means you now define the number of clients *per* database registered
rather then the number of clients in total of all databases.
- In case a connection is in failed transaction state Kore will now
automatically rollback the transaction before placing that connection
back in the connection pool.
When the pgsql layer was introduced it was tightly coupled with the
http layer in order to make async work fluently.
The time has come to split these up and follow the same method we
used for tasks, allowing either http requests to be tied to a pgsql
data structure or a simple callback function.
This also reworks the internal queueing of pgsql requests until
connections to the db are available again.
The following API functions were changes:
- kore_pgsql_query_init() -> kore_pgsql_setup()
no longer takes an http_request parameter.
- NEW kore_pgsql_init()
must be called before operating on an kore_pgsql structure.
- NEW kore_pgsql_bind_request()
binds an http_request to a kore_pgsql data structure.
- NEW kore_pgsql_bind_callback()
binds a callback to a kore_pgsql data structure.
With all of this you can now build kore with PGSQL=1 NOHTTP=1.
The pgsql/ example has been updated to reflect these changes and
new features.
- adds new cleanup function that workers will call.
- adds kore_pgsql_nfields() to return number of fields in result.
- add kore_pgsql_fieldname() to return name of a given field.
This commit also changes the behaviour of pgsql_conn_release() in
that it will now cancel the active query before releasing the connection.
This makes sure that if long running queries are active they are hopefully
cancelled if an http request is removed while such queries are still running.
- Change pools to use mmap() for allocating regions.
- Change kore_malloc() to use pools for commonly sized objects.
(split into multiple of 2 buckets, starting at 8 bytes up to 8192).
- Rename kore_mem_free() to kore_free().
The preallocated pools will hold up to 128K of elements per block size.
In case a larger object is to be allocated kore_malloc() will use
malloc() instead.
Same as kore_pgsql_query_params but takes a va_list as last parameter
(non-v version takes a variable list of parameters).
Lets people write easier to call wrappers around the query calls. I use
it in a wrapper that takes next states (error, current, continue) as
arguments in a handler with multiple async queries.
Semantics for using pgsql API have changed quite heavily
with this commit. See the examples for more information.
Based on Github issue #95 by PauloMelo (paulo.melo@vintageform.pt)
with several modifications by me.
This commit renames certain POST centric variable and configuration
naming to the correct HTTP body stuff.
API changes include http_postbody_text() and http_postbody_bytes() to
have become http_body_text() and http_body_bytes().
The developer is still responsible for validating the method their
page handler is called with. Hopefully this becomes a configuration
option soon enough.
This function uses PQsendQueryParams() instead of the normal PQsendQuery()
allowing you to pass binary data in a cleaner fashion.
A basic call would look something like:
char *mydata = "Hello";
size_t mydata_len = strlen(mydata);
kore_pgsql_query_params(&pgsql, req,
"INSERT INTO foo VALUES($1::text)", KORE_PGSQL_FORMAT_TEXT, 1
mydata, mydata_len, KORE_PGSQL_FORMAT_TEXT);
kore_pgsql_query_params() is variadic, allowing you to pass any
count of parameters where each parameter has the following:
data pointer, data length, type of parameter.
I rather keep the old idioms instead of adding more complex things
on top of the async ones. Especially since the simple layer would
interfear with existing http state machines from your handler.
This simple query allows you to ditch rolling your own
state machine for handling async pgsql states and instead
asks you to provide 3 functions:
- init
- results
- done
You can see the different in complexity in the pgsql example,
which now contains a pgsql_simple.c holding the same asynchronous
query as in pgsql.c but using the simple pgsql api.
You can of course still roll your own in case you want more control.