174 Commits

Author SHA1 Message Date
Joris Vink e2fcedfaec Differentiate between normal shutdown and fatal.
The parent process never differentiated between a worker process
asking for a shutdown or a worker process calling fatalx() when
it came to its exit code.

I made some changes here so the parent process will exit with
an exit code 1 if anything worker related went wrong (fatalx/death policy).
2022-08-08 11:02:27 +02:00
Joris Vink a9f7bd7faf rename ssl prefixed things to tls. 2022-02-18 10:20:28 +01:00
Joris Vink 99a1581e19 Initial work splitting OpenSSL code away.
This work moves all TLS / crypto related code into a tls_openssl.c
file and adds a tls_none.c which contains just stubs.

Allows compilation of Kore with TLS_BACKEND=none to remove building
against OpenSSL.

Also adds code for SHA1/SHA2 taken from openssh-portable so we don't
depend on those being present anymore in libcrypto.
2022-02-17 13:45:28 +01:00
Joris Vink 6dc162e7ee Handle ECHILD when reaping workers on shutdown.
If the child process is already dead we must handle it accordingly
instead of getting stuck waiting on it.
2022-02-16 12:32:20 +01:00
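The ECHILD handling described in 6dc162e7ee can be illustrated with a small sketch. This is not Kore's code: `reap_worker()`, `spawn_short_lived()` and the `reap_result` enum are hypothetical names, and only standard POSIX `waitpid()` behaviour is assumed.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes, not part of Kore. */
enum reap_result { REAP_EXITED, REAP_ALREADY_GONE, REAP_ERROR };

/*
 * Wait for one worker. If waitpid() fails with ECHILD the process was
 * already reaped (or never was our child), so report it as gone
 * instead of waiting on it again.
 */
static enum reap_result
reap_worker(pid_t pid)
{
	int	status;
	pid_t	ret;

	for (;;) {
		ret = waitpid(pid, &status, 0);
		if (ret == pid)
			return (REAP_EXITED);
		if (ret == -1 && errno == EINTR)
			continue;	/* interrupted, try again */
		if (ret == -1 && errno == ECHILD)
			return (REAP_ALREADY_GONE);
		return (REAP_ERROR);
	}
}

/* Illustration only: fork a child that exits immediately. */
static pid_t
spawn_short_lived(void)
{
	pid_t	pid;

	if ((pid = fork()) == 0)
		_exit(0);

	return (pid);
}
```

Calling `reap_worker()` twice on the same pid shows both paths: the first call reaps the child, the second one hits ECHILD and returns without blocking.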
Joris Vink 23d762d682 Allow parent to send msgs to workers via kore_msg.
It wasn't possible for the parent process to send messages
directly via kore_msg_send() to other worker processes.

This is now rectified so that the parent process can call
kore_msg_send() with a worker destination and it will work.
2022-02-01 10:36:07 +01:00
Joris Vink b3f54e290a Change parent behaviour when calling waitpid().
Wait for any process in our process group only instead of WAIT_ANY.

This allows the parent process to start subprocesses that end up
in different process groups and are handled entirely in user code
instead (using signalfd, for example).
2022-02-01 10:34:12 +01:00
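A rough sketch of the difference described in b3f54e290a, assuming plain POSIX semantics: `waitpid()` with a pid of 0 only considers children in the caller's process group, while -1 (WAIT_ANY) considers every child. The function `demo_group_wait()` is a hypothetical name used for illustration and is not taken from Kore.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustration only. Returns 1 if waitpid(0, ...) reaped just the child
 * in our own process group and ignored the child in another group.
 */
static int
demo_group_wait(void)
{
	int	status;
	pid_t	same, other, ret;

	/* Child that stays in the parent's process group. */
	if ((same = fork()) == -1)
		return (0);
	if (same == 0)
		_exit(0);

	/* Child that moves into a process group of its own. */
	if ((other = fork()) == -1)
		return (0);
	if (other == 0) {
		(void)setpgid(0, 0);
		_exit(0);
	}

	/* Also set it from the parent so there is no race with the child. */
	(void)setpgid(other, other);

	/* pid 0: only children whose process group equals ours. */
	ret = waitpid(0, &status, 0);
	if (ret != same)
		return (0);

	/* Nothing left in our group, the other child is not considered. */
	ret = waitpid(0, &status, WNOHANG);
	if (ret != -1 || errno != ECHILD)
		return (0);

	/* "User code" stays responsible for the other child. */
	return (waitpid(other, &status, 0) == other);
}
```

With WAIT_ANY in place of 0 the first `waitpid()` could have returned either child, which is what made user-managed subprocesses awkward.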
Joris Vink 833ca646e7 i forgot, it's 2022. 2022-01-31 22:02:06 +01:00
Joris Vink 93a4fe2a15 Worker hook rework.
This commit adds improved hooks for Python and a new signal delivery hook.

For the Python API kore_worker_configure() and kore_worker_teardown() had
to be implemented before this commit. Now one can create a workerstart
and workerend method in their koreapp as those will be called when
they exist.

The new signal hook is either kore_worker_signal() or koreapp.signal.

This new hook is called after the worker event code handles the received
signal itself first.

With this commit there is also a new kore_signal_trap() API call allowing
you to more easily trap new signals. This API is also exported to the
Python part of the code as kore.sigtrap().
2021-12-22 09:50:26 +01:00
Joris Vink d8113e4545 Reset dom->acme_cert upon clearing. 2021-12-19 00:14:33 +01:00
Joris Vink efc7b3d9a6 Improve how the parent handles workers.
- Make sure we drain the worker log channel if it dies
  so we can flush out any lingering log messages.

- Get rid of the raise() in the parent to signal ourselves
  we should terminate. Instead depend on the new kore_quit.

- Always attempt to reap children one way or the other.
2021-11-03 17:23:05 +01:00
Joris Vink 23b95448cc Hide worker logs behind kore_quiet. 2021-10-05 12:29:50 +02:00
Joris Vink e98a4ddab5 Change how routes are configured in Kore.
Routes are now configured in a context per route:

route /path {
	handler handler_name
	methods get post head
	validate qs:get id v_id
}

All route related configurations are per-route, allowing multiple
routes for the same path (for different methods).

The param context is removed and merged into the route context now
so that you use the validate keyword to specify what needs validating.
2021-09-15 11:09:52 +02:00
Joris Vink a6677b873f On linux, keep track of seccomp tracing properly.
With the new process startup code we must handle the SIGSTOP
from the processes if seccomp_tracing is enabled. Otherwise
they just hang indefinitely and we assume they failed to start,
which is somewhat true.
2021-09-07 23:05:25 +02:00
Joris Vink 9fd30db598 Change timeout for worker startup a bit.
Also give some feedback we are waiting for process startup.
2021-09-07 22:14:28 +02:00
Joris Vink 3b20cda11c Rework worker startup/privsep config.
Starting with the privsep config, this commit changes the following:

- Removes the root, runas, keymgr_root, keymgr_runas, acme_root and
  acme_runas configuration options.

  Instead these are now configured via a privsep configuration context:

  privsep worker {
      root /tmp
      runas nobody
  }

  This is also configurable via Python using the new kore.privsep() method:

      kore.privsep("worker", root="/tmp", runas="nobody", skip=["chroot"])

Tied into this we also better handle worker startup:

- Per worker process, wait until it signalled it is ready.
- If a worker fails at startup, display its last log lines more clearly.
- Don't start acme process if no domain requires acme.
- Remove each process's individual startup log message in favour
  of a generalized one that displays its PID, root and user.
- At startup, log the kore version and built-ins in a nicer way.
- The worker processes now check things they need to start running
  before signaling they are ready (such as access to CA certs for
  TLS client authentication).
2021-09-07 21:59:22 +02:00
Joris Vink 7f56c7dbf2 Change how worker processes do logging.
Before each worker process would either directly print to stdout if
Kore was running in foreground mode, or syslog otherwise.

With this commit the workers will submit their log messages to the
parent process, which will put them onto either stdout or syslog.

This change is completely under the hood and users shouldn't care about it.
2021-09-06 13:28:38 +02:00
Joris Vink cef5ac4003 bump copyright year. 2021-01-11 23:46:08 +01:00
Joris Vink 9227347b90 Fix concurrency problem in coroutines.
If a coroutine is woken up from another coroutine running from an
http request we can end up in a case where the call path looks like:

0 kore_worker_entry
1 epoll wait		<- bound to pending timers
2 http_process		<- first coro sleep
3 kore_python_coro_run	<- wakes up request
4 http_process		<- wakes up another coroutine
5 return to kore_worker_entry

In the case where 4 wakes up another coroutine but 1 is bound to a timer
and no io activity occurs the coroutine isn't run until the timer fires.

Fix this issue by always checking for pending coroutines even if the
netwait isn't INFINITE.
2020-12-07 11:11:21 +01:00
Joris Vink 262a2512f1 Do not dispatch signals to workers without a valid pid.
thanks rille.
2020-10-16 13:06:08 +02:00
Frederic Cambus 9deb2e71bf Use kore_worker_name() when logging worker exits in worker_reaper(). 2020-09-15 12:19:08 +02:00
Frederic Cambus 3ac956c67d Use kore_worker_name() when logging worker shutdowns. 2020-09-08 15:15:33 +02:00
Frederic Cambus d9673857d8 Fix a couple of typos in various places. 2020-09-08 13:01:18 +02:00
Joris Vink 636469f555 Only reset accept_avail if we grabbed the lock.
Otherwise in certain scenarios it could mean that workers
unsuccessfully grabbed the lock, reset accept_avail and
no longer attempt to grab the lock afterwards.

This can cause a complete stall in workers processing requests.
2020-09-08 11:51:06 +02:00
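The stall described in 636469f555 can be sketched as follows. The shared atomic flag, the `struct worker` layout and `worker_acquire_lock()` are assumptions made for illustration; the commit message does not show how Kore implements its accept lock.

```c
#include <stdatomic.h>

/* Hypothetical per-worker state; field names follow the commit message. */
struct worker {
	int	has_lock;
	int	accept_avail;
};

/*
 * Try to take the shared accept lock. accept_avail is only cleared
 * when the lock was actually taken, so a worker that lost the race
 * still tries again on its next loop iteration.
 */
static void
worker_acquire_lock(struct worker *w, atomic_int *lock)
{
	int	expected = 0;

	if (!w->accept_avail)
		return;

	if (atomic_compare_exchange_strong(lock, &expected, 1)) {
		w->has_lock = 1;
		w->accept_avail = 0;
	}
}
```

Clearing `accept_avail` outside the `if` would reproduce the bug: the losing worker would believe there is nothing left to grab and stop trying.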
Joris Vink 8b9f7a6c12 improve our asynchronous curl support.
- Remove the edge trigger io hacks we had in place.
- Use level triggered io for the libcurl fds instead.
- Batch all curl events together and process them at the end
  of our worker event loop.
2020-08-17 15:15:04 +02:00
Joris Vink 08d66e3926 set a worker's running flag to 0 if it dies. 2020-08-10 09:33:34 +02:00
Joris Vink 2316f1016d Always prune disconnected clients at the end of the event loop. 2020-06-26 12:25:07 +02:00
Joris Vink 74432aeff7 Set netwait to 10ms if a signal is pending.
If a signal is delivered after the signal check in the worker
loop we could end up waiting for i/o before the signal gets
handled.
2020-06-16 17:29:45 +02:00
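A minimal sketch of the idea in 74432aeff7, under stated assumptions: `WAIT_INFINITE`, `sig_pending` and `pick_netwait()` are hypothetical names, and capping (rather than overwriting) an already shorter wait is a choice made here, not something the commit message states.

```c
#include <signal.h>

#define WAIT_INFINITE		(-1)	/* hypothetical: block until i/o */
#define SIGNAL_NETWAIT_MS	10	/* the 10ms from the commit message */

/* Would be set from a signal handler in a real program. */
static volatile sig_atomic_t	sig_pending = 0;

/*
 * Pick the timeout for the next event loop wait. With a signal pending
 * we never wait longer than 10ms, so it is handled on the next pass
 * even if no i/o happens.
 */
static int
pick_netwait(int netwait)
{
	if (sig_pending &&
	    (netwait == WAIT_INFINITE || netwait > SIGNAL_NETWAIT_MS))
		return (SIGNAL_NETWAIT_MS);

	return (netwait);
}
```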
Joris Vink 9d0aef0079 bump copyright 2020-02-10 14:47:33 +01:00
Joris Vink 01cc981632 Improve waiting on workers to exit take 2.
Keep track of what workers are running and account for those when things
exit. Somewhat reverts the entire last commit, that was the wrong approach.
2020-01-17 21:48:55 +01:00
Joris Vink d8ff8e2c18 Improve waiting on children to exit.
If waitpid() returns -1 and errno is ECHILD, just mark the worker
process as exited.

This could happen if Kore starts without keymgr/acme but those would still
be accounted for.
2020-01-17 21:43:56 +01:00
Joris Vink 7cf0006f52 fix potential NULL dereferences.
found by clang --analyze, reminded by fahlgren@
2019-11-13 11:23:02 +01:00
Joris Vink c78535aa5d Add acmev2 (RFC8555) support to Kore.
A new acme process is created that communicates with the acme servers.

This process does not hold any of your private keys (no account keys,
no domain keys etc).

Whenever the acme process requires a signed payload it will ask the keymgr
process to do the signing with the relevant keys.

This process is also sandboxed with pledge+unveil on OpenBSD and seccomp
syscall filtering on Linux.

The implementation only supports the tls-alpn-01 challenge. This means that
you do not need to open additional ports on your machine.

http-01 and dns-01 are currently not supported (no wildcard support).

A new configuration option "acme_provider" is available and can be set
to the acme server's directory. By default this will point to the
live letsencrypt environment:
    https://acme-v02.api.letsencrypt.org/directory

The acme process can be controlled via the following config options:
  - acme_root (where the acme process will chroot/chdir into).
  - acme_runas (the user the acme process will run as).

  If none are set, the values from 'root' and 'runas' are taken.

If you want to turn on acme for domains you do it as follows:

domain kore.io {
	acme yes
}

You do not need to specify certkey/certfile anymore, if they are still
present they will be overwritten by the acme system.

The keymgr will store all certificates and keys under its root
(keymgr_root), the account key is stored as "/account-key.pem" and all
obtained certificates go under "certificates/<domain>/fullchain.pem" while
keys go under "certificates/<domain>/key.pem".

Kore will automatically renew certificates if they will expire in 7 days
or less.
2019-11-06 19:43:48 +01:00
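The renewal rule from c78535aa5d ("7 days or less") boils down to a simple comparison. The sketch below is an assumption about how such a check could look; `cert_needs_renewal()` and `RENEW_THRESHOLD_SECS` are hypothetical names, and reading the expiry time out of the certificate is left out.

```c
#include <time.h>

/* 7 days, as stated in the commit message. */
#define RENEW_THRESHOLD_SECS	(7 * 24 * 60 * 60)

/*
 * Returns 1 if the certificate expires within the threshold (or has
 * already expired), 0 otherwise. not_after is the certificate's
 * expiry time, now is the current time.
 */
static int
cert_needs_renewal(time_t not_after, time_t now)
{
	return (difftime(not_after, now) <= (double)RENEW_THRESHOLD_SECS);
}
```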
Joris Vink 8311c036d9 Add seccomp_tracing configuration option for linux.
If set to "yes" then Kore will trace its child processes and properly
notify you of seccomp violations while still allowing the syscalls.

This can be very useful when running Kore on new platforms that have
not been properly tested with seccomp, allowing me to adjust the default
policies as we move further.
2019-10-31 12:52:10 +01:00
Joris Vink 790d020ce9 Stop a python coro from getting stuck with httpclient.
In cases where a request is immediately completed in libcurl's multi
handle and no additional i/o is happening a coro can get stuck waiting
to be run.

Prevent this by lowering netwait from KORE_WAIT_INFINITE if there
are pending python coroutines.
2019-10-22 17:06:32 +02:00
Joris Vink 0eb11794f5 Do not add keymgr's msg fd if not started.
Reshuffles the keymgr_active flag to keymgr.c and lets it be figured out
from inside kore_server_start() instead of the worker init code.
2019-10-07 10:31:35 +02:00
Joris Vink 97523e2768 only register tls related msg callbacks if needed 2019-10-04 19:20:37 +02:00
Joris Vink b0cf42726d Do not start keymgr if no tls enabled servers are present 2019-10-04 11:29:45 +02:00
Joris Vink 46375303cb Allow multiple binds on new server directive. 2019-09-27 20:00:35 +02:00
Joris Vink 7350131232 Allow listening of tls/notls ports at the same time.
Before kore needed to be built with NOTLS=1 to be able to do non TLS
connections. This has been like this for years.

It is time to allow non TLS listeners without having to rebuild Kore.

This commit changes your configuration format and will break the
configs of existing applications.

Configurations now get listener {} contexts:

listen default {
	bind 127.0.0.1 8888
}

The above will create a listener on 127.0.0.1, port 8888 that will serve
TLS (still the default).

If you want to turn off TLS on that listener, specify "tls no" in that
context.

Domains now need to be attached to a listener:

Eg:
	domain * {
		attach	default
	}

For the Python API this kills kore.bind(), and kore.bind_unix(). They are
replaced with:

	kore.listen("name", ip=None, port=None, path=None, tls=True).
2019-09-27 12:27:04 +02:00
Joris Vink 68e90507f4 properly seccomp keymgr 2019-09-25 14:41:09 +02:00
Joris Vink cd9971247c Add seccomp syscall filtering to kore.
With this commit all Kore processes (minus the parent) are running
under seccomp.

The worker processes get the bare minimum allowed syscalls while each module
like curl, pgsql, etc will add their own filters to allow what they require.

New API functions:
    int kore_seccomp_filter(const char *name, void *filter, size_t len);

    Adds a filter into the seccomp system (must be called before
    seccomp is enabled).

New helpful macro:
    define KORE_SYSCALL_ALLOW(name)

    Allow the syscall with a given name, should be used in
    a sock_filter data structure.

New hooks:
    void kore_seccomp_hook(void);

    Called before seccomp is enabled, allows developers to add their
    own BPF filters into seccomp.
2019-09-25 14:31:20 +02:00
Joris Vink c3b2a8b2a2 fix NOHTTP builds 2019-09-20 09:37:02 +02:00
Joris Vink 8e858983bf python pgsql changes.
- decouple pgsql from the HTTP request allowing it to be used in other
  contexts as well (such as a task, etc).

- change names to dbsetup() and dbquery().

eg:

result = kore.dbquery("db", "select foo from bar")
2019-09-04 19:57:28 +02:00
Joris Vink 4a64b4f07b Improve curl timeout handling.
In case libcurl instructs us to call the timeout function as soon
as possible (timeout == 0 in curl_timeout), don't try to be clever
with a timeout value of 10ms.

Instead call the timeout function once we get back in the worker
event loop. This makes things a lot snappier as we don't depend
on epoll/kqueue waiting for io for 10ms (which actually isn't 10ms...).
2019-06-13 12:59:17 +02:00
Joris Vink 07fc7a9097 Improve HTTP processing.
If netwait is INFINITE but there are requests pending reduce the
netwait back down to 100ms so we keep processing them.
2019-05-29 15:27:44 +02:00
Joris Vink d2aa64df5c add kore_proctitle().
manipulates the argv+environ pointers to get a sensible process title
under linux / darwin.
2019-03-29 16:24:14 +01:00
Joris Vink e1766e74ba always capture worker processes exiting.
even if they terminated normally.
2019-03-22 10:29:14 +01:00
Joris Vink 4238431b9e Add worker_death_policy setting.
By default kore will restart worker processes if they terminate
unexpectedly. However in certain scenarios you may want to bring down
an entire kore instance if a worker process fails.

By setting worker_death_policy to "terminate" the Kore server will
completely stop if a worker exits unexpectedly.
2019-03-22 09:49:50 +01:00
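The decision introduced by 4238431b9e can be pictured with the sketch below. The enums and `on_worker_exit()` are hypothetical and only mirror the behaviour described in the commit message (restart by default, full stop when the policy is "terminate"); they are not Kore's actual types.

```c
/* Hypothetical policy and action values, for illustration only. */
enum death_policy { POLICY_RESTART, POLICY_TERMINATE };
enum parent_action { ACTION_NONE, ACTION_RESTART_WORKER, ACTION_SHUTDOWN };

/*
 * Decide what the parent does when a worker exits. Only unexpected
 * exits are subject to the policy.
 */
static enum parent_action
on_worker_exit(enum death_policy policy, int unexpected)
{
	if (!unexpected)
		return (ACTION_NONE);

	if (policy == POLICY_TERMINATE)
		return (ACTION_SHUTDOWN);

	return (ACTION_RESTART_WORKER);
}
```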
Joris Vink 370041656e Get rid of WORKER_LOCK_TIMEOUT.
Instead let the workers send a message on the msg channel to each
other when they have given up the accept lock and it is now available
to be grabbed.
2019-03-21 14:03:11 +01:00
Joris Vink 8b0279879a rework timers so they fire more predictably.
this change also stops python coroutines from waking up very
late after their timeout has expired.

in filerefs, don't prime the timer until we actually have something
to expire, and kill the timer when the last ref drops.
2019-03-21 10:17:08 +01:00