qio_channel_yield() now updates ioc->read_coroutine/write_coroutine and calls
qio_channel_set_aio_fd_handlers(), so the code in the handlers has
become redundant and can be removed.
This does not make a difference in intermediate states because
aio_co_wake() really enters the coroutine immediately here: These
handlers are never run in coroutine context, and we're in the right
AioContext because qio_channel_attach_aio_context() asserts that the
handlers are inactive.
To make these conditions more obvious, assert the right AioContext.
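To make the remaining behaviour concrete, the read-side handler ends up
looking roughly like this (a minimal sketch assuming the usual static
handler name in io/channel.c; not the verbatim patch):

    static void qio_channel_restart_read(void *opaque)
    {
        QIOChannel *ioc = opaque;
        Coroutine *co = ioc->read_coroutine;

        /* These handlers never run in coroutine context, and the
         * assertion documents that we already are in the coroutine's
         * AioContext, so aio_co_wake() enters it immediately. */
        assert(qemu_get_current_aio_context() ==
               qemu_coroutine_get_aio_context(co));
        aio_co_wake(co);
    }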
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Similar to how qemu_co_sleep_ns() allows preemption from an external
coroutine entry, allow reentering qio_channel_yield() early.
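In rough terms the contract becomes the following (a sketch only; how the
external context reenters the coroutine is an assumption here):

    /* In the I/O coroutine: may now return before the fd is ready
     * if the coroutine is reentered externally in the meantime. */
    qio_channel_yield(ioc, G_IO_IN);

    /* Elsewhere: entering the yielded coroutine early, e.g. with
     * aio_co_wake(), now preempts the yield instead of being a bug. */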
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add the ability for a caller to wait for completion of the
background thread to synchronously dispatch its result, without
needing to wait for the main loop to run the idle callback.
This method needs very careful usage to avoid a dangerous
race condition with the freeing of the task. The completion
callback is normally invoked from an idle callback registered
with the main loop context. The qio_task_wait_thread method
must only be called if the completion callback has not yet
run. The only safe way to achieve this is to run the
qio_task_wait_thread method from the thread that executes
the main loop.
It is generally a bad idea to use this method since it will
block execution of the main loop; however, the design of
the character devices and their usage from vhost-user already
requires blocking execution.
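A hedged usage sketch (worker_func, complete_cb and their opaque
pointers are hypothetical placeholders):

    QIOTask *task = qio_task_new(OBJECT(src), complete_cb, opaque, NULL);

    qio_task_run_in_thread(task, worker_func, data, NULL, NULL);

    /* Only safe from the thread running the main loop, and only
     * while the completion idle callback has not yet run. */
    qio_task_wait_thread(task);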
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-3-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Currently the struct QIOTaskThreadData is only needed by the worker
thread, but a subsequent patch will need to access it from another
context.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190211182442.8542-2-berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The qio_channel_socket_close method was mistakenly unlinking the
UNIX server socket, even if the channel was a client connection. This
was not noticed with chardevs, since they never call close, but with the
VNC server, this caused the VNC server socket to be deleted after the
first client quit.
The qio_channel_socket_close method also needlessly reimplemented the
logic that already exists in socket_listen_cleanup(). Just call that
method directly, for listen sockets only.
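Conceptually the close path becomes (a sketch; QIO_CHANNEL_FEATURE_LISTEN
is the flag marking server sockets):

    /* In qio_channel_socket_close(): only listen sockets clean up
     * (and thus unlink) the server address; client connections
     * simply close their fd. */
    if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_LISTEN)) {
        socket_listen_cleanup(sioc->fd, errp);
    }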
This fixes a regression introduced in QEMU 3.0.0 with:

  commit d66f78e1ea
  Author: Pavel Balaev <mail@void.so>
  Date:   Mon May 21 19:17:35 2018 +0300

    Delete AF_UNIX socket after close
Fixes launchpad #1795100
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
GNUTLS takes a paranoid approach when seeing 0 bytes returned by the
underlying OS read() function. It will consider this an error and
return GNUTLS_E_PREMATURE_TERMINATION instead of propagating the 0
return value. It expects apps to arrange for clean termination at
the protocol level and not rely on seeing EOF from a read call to
detect shutdown. This is to harden apps against a malicious 3rd party
causing termination of the sockets layer.
This is unhelpful for the QEMU NBD code, which does have a clean
protocol-level shutdown, but still relies on seeing 0 from the I/O
channel read in the coroutine handling incoming replies.
The upshot is that when using a plain NBD connection, shutdown is
silent, but when using TLS, the client spams the console with:

  Cannot read from TLS channel: Broken pipe
The NBD connection has, however, called qio_channel_shutdown()
at this point to indicate that it is done with I/O. This gives
the opportunity to optimize the code such that when the channel
has been shut down in the read direction, the error code
GNUTLS_E_PREMATURE_TERMINATION gets turned into a '0' return
instead of an error.
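In the TLS channel's read path this amounts to something like the
following (a sketch; it assumes the crypto layer maps
GNUTLS_E_PREMATURE_TERMINATION to ECONNABORTED and that the channel
records the shutdown direction in tioc->shutdown):

    ret = qcrypto_tls_session_read(tioc->session,
                                   iov[i].iov_base, iov[i].iov_len);
    if (ret < 0) {
        if (errno == ECONNABORTED &&
            (tioc->shutdown & QIO_CHANNEL_SHUTDOWN_READ)) {
            /* We shut down reads ourselves, so report a clean EOF
             * rather than an error. */
            return 0;
        }
        /* ... other error handling ... */
    }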
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20181119134228.11031-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Missed in f69a8bde29.
Thanks Valgrind:
==955== 217 bytes in 1 blocks are definitely lost in loss record 275 of 321
==955== at 0x483A965: realloc (vg_replace_malloc.c:785)
==955== by 0x50B6839: __vasprintf_chk (in /usr/lib64/libc-2.28.so)
==955== by 0x49AA05C: g_vasprintf (in /usr/lib64/libglib-2.0.so.0.5800.1)
==955== by 0x4983440: g_strdup_vprintf (in /usr/lib64/libglib-2.0.so.0.5800.1)
==955== by 0x126048: qio_channel_websock_handshake_send_res (channel-websock.c:162)
==955== by 0x1266E6: qio_channel_websock_handshake_send_res_ok (channel-websock.c:362)
==955== by 0x126D3E: qio_channel_websock_handshake_process (channel-websock.c:468)
==955== by 0x126EF2: qio_channel_websock_handshake_read (channel-websock.c:511)
==955== by 0x12715B: qio_channel_websock_handshake_io (channel-websock.c:571)
==955== by 0x125027: qio_channel_fd_source_dispatch (channel-watch.c:84)
==955== by 0x496326C: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.5800.1)
==955== by 0x169EC3: glib_pollfds_poll (main-loop.c:215)
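The fix is a one-liner freeing the formatted response once it has been
copied into the output buffer (sketch of the leak site above):

    response = g_strdup_vprintf(resmsg, vargs);
    responselen = strlen(response);
    buffer_reserve(&ioc->encoutput, responselen);
    buffer_append(&ioc->encoutput, response, responselen);
    g_free(response);   /* previously leaked */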
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Since version 2.12.0, the AF_UNIX socket created for QMP exchange is
not deleted on instance shutdown.
This is because qio_channel_socket_finalize() is called after
qio_channel_socket_close().
Signed-off-by: Pavel Balaev <mail@void.so>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The fd_is_socket() helper method is useful in a few places, so put it in
the common sockets code. Make the code more compact while moving it.
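The compact form is essentially the following (sketch):

    bool fd_is_socket(int fd)
    {
        int optval;
        socklen_t optlen = sizeof(optval);
        /* getsockopt() only succeeds when fd really is a socket */
        return !getsockopt(fd, SOL_SOCKET, SO_TYPE, &optval, &optlen);
    }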
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A new parameter "context" is added to qio_channel_tls_handshake() is to
allow the TLS to be run on a non-default context. Still, no functional
change.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
qio_task_run_in_thread() has already been taught about non-default
contexts. Further, let all the QIO channel APIs use that context.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
qio_task_run_in_thread() allows the main thread to run blocking
operations in the background. However, it assumes that it is always
working with the default context. This patch allows the threaded QIO
task framework to run with a non-default GMainContext.
There is no functional change so far; the QIOTasks still always run
on the main context.
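The resulting prototype (NULL context preserves today's behaviour of
running on the default main context):

    void qio_task_run_in_thread(QIOTask *task,
                                QIOTaskWorker worker,
                                gpointer opaque,
                                GDestroyNotify destroy,
                                GMainContext *context);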
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Originally we stored the GSource tag IDs. That won't be enough if we
are going to support a non-default GMainContext in the QIO code. Switch
to storing the GSources themselves, without any functional change. We
still always pass in NULL, which means the default GMainContext.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Firstly, introduce an internal qio_channel_add_watch_full(), which
enhances qio_channel_add_watch() so that a context can be specified.
Then add a new API wrapper qio_channel_add_watch_source() to return a
GSource pointer rather than a tag ID.
Note that the _source() call will keep a reference of GSource so that
callers need to unref them explicitly when finished using the GSource.
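The two entry points, roughly as declared in include/io/channel.h:

    guint qio_channel_add_watch_full(QIOChannel *ioc,
                                     GIOCondition condition,
                                     QIOChannelFunc func,
                                     gpointer user_data,
                                     GDestroyNotify notify,
                                     GMainContext *context);

    /* The caller owns a reference on the returned GSource and must
     * g_source_unref() it when finished with it. */
    GSource *qio_channel_add_watch_source(QIOChannel *ioc,
                                          GIOCondition condition,
                                          QIOChannelFunc func,
                                          gpointer user_data,
                                          GDestroyNotify notify,
                                          GMainContext *context);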
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
It is strange that it was called gio_task_thread_result. Rename it to
follow the naming convention of the file.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
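For illustration, a file that only needs migration types switches from
the monolithic header to the shard:

    /* Before: pulls in every generated QAPI type */
    #include "qapi-types.h"

    /* After: just the shard this file actually uses */
    #include "qapi/qapi-types-migration.h"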
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
We are currently facing some migration failures on s390x when running
certain avocado-vt tests, e.g. when running the test
type_specific.io-github-autotest-qemu.migrate.with_reboot.exec.gzip_exec.
This test is using 'migrate -d "exec:nc localhost 5200"' for the migration.
The problem is detected at the receiving side, where the migration stream
apparently ends too early. However, the cause for the problem is at the
sending side: After writing the migration stream into the pipe to netcat,
the source QEMU calls qio_channel_command_close() which closes the pipe
and immediately (!) kills the child process afterwards (via the function
qio_channel_command_abort()). So if the sending netcat did not read the
final bytes from the pipe yet, or if it did not manage to send out all
its buffers yet, it is killed before the whole migration stream is passed
to the destination side.
QEMU cannot know how much time is required by the child process to send
over all migration data, so we should not kill it, neither directly nor
after a delay. Let's simply wait for the child process to exit gracefully
instead (this was also the behaviour of pclose() that was used in "exec:"
migration before the QIOChannel rework).
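The close path then reduces to a plain wait, along these lines (sketch):

    /* Wait for the child to exit on its own instead of killing it,
     * mirroring the old pclose() behaviour of "exec:" migration. */
    pid_t rv;
    do {
        rv = waitpid(cioc->pid, NULL, 0);
    } while (rv < 0 && errno == EINTR);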
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add /dev/fdset/ support to QIOChannelFile by calling qemu_open() instead
of open() and qemu_close() instead of close(). There is a subtle
semantic change since qemu_open() automatically sets O_CLOEXEC, but this
doesn't affect any of the users of the function.
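This makes constructs like the following work (the fdset id is a
hypothetical example):

    /* The fd must have been added to fdset 1 via the add-fd command */
    ioc = qio_channel_file_new_path("/dev/fdset/1", O_WRONLY, 0, &err);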
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the file descriptor underlying QIOChannelFile is closed in the
io_close() method, don't close it again in the finalize() method since
the file descriptor number may have been reused in the meantime.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code wrongly passes the mode to open() only if O_WRONLY is set.
Instead, the mode should be passed when O_CREAT is set (or O_TMPFILE on
Linux). Fix this by always passing the mode since open() will correctly
ignore the mode if it is not needed. Add a testcase which exercises this
bug and also change the existing testcase to check that the mode of the
created file is correct.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the current implementation of the websocket protocol in QEMU,
qio_channel_websock_handshake_io tries to read the handshake from the
channel to start communication over the socket. But this approach
doesn't cover the scenario where the socket is closed while handshaking.
Therefore, if G_IO_IN is caught and qio_channel_read returns zero, an
error has to be set and the connection has to be terminated. Without
that, the handshake causes 100% CPU load in the main QEMU loop, because
the main loop poll continues to receive and handle G_IO_IN events from
the websocket.
Steps to reproduce the 100% CPU load:
1) start qemu with the simplest configuration
$ qemu -vnc [::1]:1,websocket=7500
2) open any vnc listener (which doesn't follow websocket
protocol)
$ vncviewer :7500
3) kill listener
4) qemu main thread eats 100% CPU
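The EOF handling in the handshake read path becomes roughly (a sketch;
variable and message wording are illustrative):

    ret = qio_channel_read(wioc->master, buffer, avail, &err);
    if (ret == 0) {
        /* Peer closed the socket mid-handshake: raise an error and
         * finish the connection instead of polling G_IO_IN forever. */
        error_setg(&err, "connection closed during websocket handshake");
        goto error;
    }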
Signed-off-by: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The sources array does not escape out of qio_net_listener_wait_client, so
we have to free it.
Reported by Coverity.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes, with the change
to target/s390x/gen-features.c manually reverted, and blank lines
around deletions collapsed.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-3-armbru@redhat.com>
The existing QIOChannelSocket class provides the ability to
listen on a single socket at a time. This patch introduces
a QIONetListener class that provides a higher level API
concept around listening for network services, allowing
for listening on multiple sockets.
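A rough usage sketch (the callback and address setup are hypothetical;
exact signatures may differ slightly across QEMU versions):

    QIONetListener *listener = qio_net_listener_new();

    /* addr is a SocketAddress; it may resolve to several sockets,
     * e.g. IPv4 + IPv6, all of which the listener will watch. */
    if (qio_net_listener_open_sync(listener, addr, &err) < 0) {
        /* handle the error */
    }
    qio_net_listener_set_client_func(listener, on_new_client, NULL, NULL);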
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This fixes a compiler warning:
/qemu/io/channel-websock.c:163:5: error:
function might be possible candidate for ‘gnu_printf’ format attribute
[-Werror=suggest-attribute=format]
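The fix is to annotate the function with QEMU's printf-format attribute
macro, roughly:

    static void qio_channel_websock_handshake_send_res(QIOChannelWebsock *ioc,
                                                       const char *resmsg,
                                                       ...) GCC_FMT_ATTR(2, 3);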
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Coverity pointed out that 'date' is not freed in the error
path.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The noVNC server sends a header "Connection: keep-alive, Upgrade" which
fails our simple equality test. Split the header on ',', trim whitespace
and then check for 'upgrade' token.
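With GLib helpers the tokenised check looks roughly like this (sketch):

    gchar **tokens = g_strsplit(connection, ",", -1);
    gboolean upgrade = FALSE;
    size_t i;

    for (i = 0; tokens[i] != NULL; i++) {
        g_strstrip(tokens[i]);
        /* Token comparison is case insensitive per RFC 6455 */
        if (g_ascii_strcasecmp(tokens[i], "upgrade") == 0) {
            upgrade = TRUE;
        }
    }
    g_strfreev(tokens);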
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently most outbound I/O on the websock channel gets copied into the
rawoutput buffer, and then immediately copied again into the encoutput
buffer, with a header prepended. Now that qio_channel_websock_encode
accepts a struct iovec, we can trivially remove this bounce buffering
and write directly to encoutput.
In doing so, we also now correctly validate the encoutput size against
the QIO_CHANNEL_WEBSOCK_MAX_BUFFER limit.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring use of another Buffer, pass a struct iovec
into qio_channel_websock_encode, which gives callers more
flexibility in how they process data.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qio_channel_websock_encode method is only used in one place,
everything else calls qio_channel_websock_encode_buffer directly.
It can also be pushed up a level into the qio_channel_websock_writev
method, since every other caller of qio_channel_websock_write_wire
has already filled encoutput.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We must ensure we don't get flooded with ping replies if the outbound
channel is slow. Currently we do this by keeping the ping reply in a
separate temporary buffer and only writing it if the encoutput buffer
is completely empty. This is overly pessimistic, as it is reasonable
to add a ping reply to the encoutput buffer even if it has previous
data in it, as long as that previous data doesn't include a ping
reply.
To track this better, put the ping reply directly into the encoutput
buffer, and then record the size of encoutput at this time in
pong_remain. As we write encoutput to the underlying channel, we
can decrement the pong_remain counter. Once it hits zero, we can
accept further ping replies for transmission.
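In outline (pong_remain per the description above; the encoding step is
elided):

    /* On receiving a ping: append the pong to encoutput only if no
     * earlier pong bytes are still pending, then remember how much
     * of encoutput must drain before the next pong may be queued. */
    if (wioc->pong_remain == 0) {
        /* ... encode the pong frame into wioc->encoutput ... */
        wioc->pong_remain = wioc->encoutput.offset;
    }

    /* After writing 'ret' bytes of encoutput to the wire */
    wioc->pong_remain -= MIN(wioc->pong_remain, (size_t)ret);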
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The websocket GSource is monitoring the size of the rawoutput
buffer to determine if the channel can accept more writes.
The rawoutput buffer, however, is merely a temporary staging
buffer before data is copied into the encoutput buffer. Thus
its size will always be zero when the GSource runs.
This flaw causes the encoutput buffer to grow without bound
if the other end of the underlying data channel doesn't
read data being sent. This can be seen with VNC if a client
is on a slow WAN link and the guest OS is sending many screen
updates. A malicious VNC client can act like it is on a slow
link by playing a video in the guest and then reading data
very slowly, causing QEMU host memory to expand arbitrarily.
This issue is assigned CVE-2017-15268, publicly reported in
https://bugs.launchpad.net/qemu/+bug/1718964
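Conceptually, the fix makes the GSource watch the real output buffer
(sketch):

    /* In the websock GSource: base writability on encoutput, the
     * buffer actually written to the wire, not rawoutput. */
    if (wioc->encoutput.offset < QIO_CHANNEL_WEBSOCK_MAX_BUFFER) {
        cond |= G_IO_OUT;
    }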
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It is useful to trace websockets frame encoding/decoding when debugging
problems.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Make a best-effort attempt to close websocket connections according to
the RFC. Send the close message, as room permits in the socket buffer,
and immediately close the socket.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an immediate ping reply (pong) to the outgoing stream when a ping
is received. Unsolicited pongs are ignored.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keep pings and gratuitous pongs generated by web browsers from killing
websocket connections.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some browsers send pings/pongs with no payload, so allow empty payloads
instead of closing the connection.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Allows fragmented binary frames by saving the previous opcode. Handles
the case where an intermediary (e.g., a web proxy) fragments frames
originally sent unfragmented by the client.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Gets rid of unnecessary bit shifting and performs proper EOF checking to
avoid a large number of repeated calls to recvmsg() when a client
abruptly terminates a connection (bug fix).
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When checking the value of the Connection and Upgrade HTTP headers,
the websock RFC (6455) requires the comparison to be case insensitive.
The Connection value should be an exact match, not a substring.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When the websocket handshake fails it is useful to log the real
error message via the trace points for debugging purposes.
Fixes bug: #1715186
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When any error occurs while processing the websockets handshake,
QEMU just terminates the connection abruptly. This is in violation
of the HTTP specs and does not help the client understand what they
did wrong. This is particularly bad when the client gives the wrong
path, as a "404 Not Found" would be very helpful.
Refactor the handshake code so that it always sends a response to
the client unless there was an I/O error.
Fixes bug: #1715186
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some callers want to distinguish between clean EOF (no bytes read)
vs. a short read (at least one byte read, but EOF encountered
before reaching the desired length), as it allows clients the
ability to do a graceful shutdown when a server shuts down at
defined safe points in the protocol, rather than treating all
shutdown scenarios as an error due to EOF. However, we don't want
to require all callers to have to check for early EOF. So add
another wrapper function that can be used by the callers that care
about the distinction.
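The new wrapper's prototype, with its three-way result (roughly: 1 on
success, 0 on clean EOF before any byte was read, -1 on error):

    int qio_channel_read_all_eof(QIOChannel *ioc,
                                 char *buf,
                                 size_t buflen,
                                 Error **errp);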
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170905191114.5959-3-eblake@redhat.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
The new qio_channel_{read,write}{,v}_all functions are documented
as yielding until data is available. When used on a blocking
channel, this yield is done via qio_channel_wait() which spawns
a nested event loop under the hood (so it is that secondary loop
which yields as needed); but if we are already in a coroutine (at
which point QIO_CHANNEL_ERR_BLOCK is only possible if we are a
non-blocking channel), we want to yield the current coroutine
instead of spawning a nested event loop.
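The resulting pattern inside the _all helpers is roughly:

    if (len == QIO_CHANNEL_ERR_BLOCK) {
        if (qemu_in_coroutine()) {
            /* Already in a coroutine: yield it until readable */
            qio_channel_yield(ioc, G_IO_IN);
        } else {
            /* Blocking path: nested event loop under the hood */
            qio_channel_wait(ioc, G_IO_IN);
        }
        continue;
    }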
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170905191114.5959-2-eblake@redhat.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
[commit message updated]
Signed-off-by: Eric Blake <eblake@redhat.com>
These functions wait until they are able to read / write the full
requested data buffer(s).
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The non-blocking connect mechanism is obsolete, and it doesn't
work well for inet connections, because it calls getaddrinfo
first, and getaddrinfo blocks on DNS lookups. Since commits
e65c67e4 & d984464e, the non-blocking connect of migration goes
through QIOChannel in a different manner (using a thread), and
nobody uses this old non-blocking connect anymore.
Any newly written code which needs a non-blocking connect should
use the QIOChannel code, so we can drop NonBlockingConnectHandler
as a concept entirely.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>