qemu-e2k

Author	SHA1	Message	Date
Daniel P. Berrangé	7a21bee2aa	misc: fix commonly doubled up words Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220707163720.1421716-5-berrange@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2022-08-01 11:58:02 +02:00
Marc-André Lureau	1fbf2665e6	util: replace qemu_get_local_state_pathname() Simplify the function to only return the directory path. Callers are adjusted to use the GLib function to build paths, g_build_filename(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220420132624.2439741-39-marcandre.lureau@redhat.com>	2022-04-21 17:09:09 +04:00
Dr. David Alan Gilbert	555a76e5e5	virtiofsd: Error on bad socket group name Make the '--socket-group=' option fail if the group name is unknown: ./tools/virtiofsd/virtiofsd .... --socket-group=zaphod vhost socket: unable to find group 'zaphod' Reported-by: Xiaoling Gao <xiagao@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211014122554.34599-1-dgilbert@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 19:38:32 +01:00
Vivek Goyal	50cf6d6cb7	virtiofsd: Add a helper to stop all queues Use a helper to stop all the queues. Later in the patch series I am planning to use this helper at one more place later in the patch series. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-6-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 18:58:42 +01:00
Vivek Goyal	c68276556a	virtiofsd: Add a helper to send element on virtqueue We have open coded logic to take locks and push element on virtqueue at three places. Add a helper and use it everywhere. Code is easier to read and less number of lines of code. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-5-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 18:58:20 +01:00
Vivek Goyal	a88abc6f84	virtiofsd: Remove unused virtio_fs_config definition "struct virtio_fs_config" definition seems to be unused in fuse_virtio.c. Remove it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-4-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 18:58:02 +01:00
Sergio Lopez	046d91c83c	virtiofsd: Reverse req_list before processing it With the thread pool disabled, we add the requests in the queue to a GList, processing by iterating over there afterwards. For adding them, we're using "g_list_prepend()", which is more efficient but causes the requests to be processed in reverse order, breaking the read-ahead and request-merging optimizations in the host for sequential operations. According to the documentation, if you need to process the request in-order, using "g_list_prepend()" and then reversing the list with "g_list_reverse()" is more efficient than using "g_list_append()", so let's do it that way. Testing on a spinning disk (to boost the increase of read-ahead and request-merging) shows a 4x improvement on sequential write fio test: Test: fio --directory=/mnt/virtio-fs --filename=fio-file1 --runtime=20 --iodepth=16 --size=4G --direct=1 --blocksize=4K --ioengine libaio --rw write --name seqwrite-libaio Without "g_list_reverse()": ... Jobs: 1 (f=1): [W(1)][100.0%][w=22.4MiB/s][w=5735 IOPS][eta 00m:00s] seqwrite-libaio: (groupid=0, jobs=1): err= 0: pid=710: Tue Aug 24 12:58:16 2021 write: IOPS=5709, BW=22.3MiB/s (23.4MB/s)(446MiB/20002msec); 0 zone resets ... With "g_list_reverse()": ... Jobs: 1 (f=1): [W(1)][100.0%][w=84.0MiB/s][w=21.5k IOPS][eta 00m:00s] seqwrite-libaio: (groupid=0, jobs=1): err= 0: pid=716: Tue Aug 24 13:00:15 2021 write: IOPS=21.3k, BW=83.1MiB/s (87.2MB/s)(1663MiB/20001msec); 0 zone resets ... Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210824131158.39970-1-slp@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-09-16 14:50:48 +01:00
Hubert Jasudowicz	7ef2408a96	virtiofsd: Add missing newline in error message Signed-off-by: Hubert Jasudowicz <hubert.jasudowicz@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <e5914ad202a13e9c1bc2a5efa267ff3bd4f48db6.1625173475.git.hubert.jasudowicz@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-07-09 18:42:46 +02:00
Vivek Goyal	b5fd59cf90	virtiofsd: Set req->reply_sent right after sending reply There is no reason to set it in label "err". We should be able to set it right after sending reply. It is easier to read. Also got rid of label "err" because now only thing it was doing was return a code. We can return from the error location itself and no need to first jump to label "err". Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-8-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	1a5fff8e63	virtiofsd: Check EOF before short read In virtio_send_data_iov() we are checking first for short read and then EOF condition. Change the order. Basically check for error and EOF first and last remaining piece is short ready which will lead to retry automatically at the end of while loop. Just that it is little simpler to read to the code. There is no need to call "continue" and also one less call of "len-=ret". Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-7-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	bf7a3ee044	virtiofsd: Simplify skip byte logic We need to skip bytes in two cases. a. Before we start reading into in_sg, we need to skip iov_len bytes in the beginning which typically will have fuse_out_header. b. If preadv() does a short read, then we need to retry preadv() with remainig bytes and skip the bytes preadv() read in short read. For case a, there is no reason that skipping logic be inside the while loop. Move it outside. And only retain logic "b" inside while loop. Also get rid of variable "skip_size". Looks like we can do without it. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-6-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	0106f6f234	virtiofsd: get rid of in_sg_left variable in_sg_left seems to be being used primarly for debugging purpose. It is keeping track of how many bytes are left in the scatter list we are reading into. We already have another variable "len" which keeps track how many bytes are left to be read. And in_sg_left is greater than or equal to len. We have already ensured that in the beginning of function. if (in_len < tosend_len) { fuse_log(FUSE_LOG_ERR, "%s: elem %d too small for data len %zd\n", __func__, elem->index, tosend_len); ret = E2BIG; goto err; } So in_sg_left seems like a redundant variable. It probably was useful for debugging when code was being developed. Get rid of it. It helps simplify this function. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-5-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	97dbfc5ae6	virtiofsd: Use iov_discard_front() to skip bytes There are places where we need to skip few bytes from front of the iovec array. We have our own custom code for that. Looks like iov_discard_front() can do same thing. So use that helper instead. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-4-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	b31ff38931	virtiofsd: Get rid of unreachable code in read pvreadv() can return following. - error - 0 in case of EOF - short read We seem to handle all the cases already. We are retrying read in case of short read. So another check for short read seems like dead code. Get rid of it. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-3-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	04c9f7e04a	virtiofsd: Check for EINTR in preadv() and retry We don't seem to check for EINTR and retry. There are other places in code where we check for EINTR. So lets add a check. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-2-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Greg Kurz	4962b312cd	virtiofsd: Fix check of chown()'s return value Otherwise you always get this warning when using --socket-group=users vhost socket failed to set group to users (100) While here, print out the error if chown() fails. Fixes: `f6698f2b03` ("tools/virtiofsd: add support for --socket-group") Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <162040394890.714971.15502455176528384778.stgit@bahia.lan> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-13 17:48:47 +02:00
Mahmoud Mandour	67a010f64c	virtiofsd/fuse_virtio.c: Changed allocations of locals to GLib Replaced the allocation of local variables from malloc() to GLib allocation functions. In one instance, dropped the usage to an assert after a malloc() call and used g_malloc() instead. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-8-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	31dfd22d7c	virtiofsd: Changed allocations of fv_VuDev & its internals to GLib functions Changed the allocations of fv_VuDev structs, VuDev structs, and fv_QueueInfo strcuts from using calloc()/realloc() & free() to using the equivalent functions from GLib. In instances, removed the pair of allocation and assertion for non-NULL checking with a GLib function that aborts on error. Removed NULL-checking for fv_VuDev struct allocation and used a GLib function that crashes on error; namely, g_new0(). This is because allocating one struct should not be a problem on an healthy system. Also following the pattern of aborting-on-null behaviour that is taken with allocating VuDev structs and fv_QueueInfo structs. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-6-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	01c6c6f982	virtiofsd: Changed allocations of iovec to GLib's functions Replaced the calls to malloc()/calloc() and their respective calls to free() of iovec structs with GLib's allocation and deallocation functions and used g_autofree when appropriate. Replaced the allocation of in_sg_cpy to g_new() instead of a call to calloc() and a null-checking assertion. Not g_new0() because the buffer is immediately overwritten using memcpy. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Message-Id: <20210427181333.148176-1-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Dr. David Alan Gilbert	5bf5188a11	virtiofsd: Don't assume header layout virtiofsd incorrectly assumed a fixed set of header layout in the virt queue; assuming that the fuse and write headers were conveniently separated from the data; the spec doesn't allow us to take that convenience, so fix it up to deal with it the hard way. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210428110100.27757-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-05-06 19:47:44 +01:00
Greg Kurz	5bb8327b65	virtiofsd: Release vu_dispatch_lock when stopping queue QEMU can stop a virtqueue by sending a VHOST_USER_GET_VRING_BASE request to virtiofsd. As with all other vhost-user protocol messages, the thread that runs the main event loop in virtiofsd takes the vu_dispatch lock in write mode. This ensures that no other thread can access virtqueues or memory tables at the same time. In the case of VHOST_USER_GET_VRING_BASE, the main thread basically notifies the queue thread that it should terminate and waits for its termination: main() virtio_loop() vu_dispatch_wrlock() vu_dispatch() vu_process_message() vu_get_vring_base_exec() fv_queue_cleanup_thread() pthread_join() Unfortunately, the queue thread ends up calling virtio_send_msg() at some point, which itself needs to grab the lock: fv_queue_thread() g_list_foreach() fv_queue_worker() fuse_session_process_buf_int() do_release() lo_release() fuse_reply_err() send_reply() send_reply_iov() fuse_send_reply_iov_nofree() fuse_send_msg() virtio_send_msg() vu_dispatch_rdlock() <-- Deadlock ! Simply have the main thread to release the lock before going to sleep and take it back afterwards. A very similar patch was already sent by Vivek Goyal sometime back: https://listman.redhat.com/archives/virtio-fs/2021-January/msg00073.html The only difference here is that this done in fv_queue_set_started() because fv_queue_cleanup_thread() can also be called from virtio_loop() without the lock being held. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210312092212.782255-8-groug@kaod.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-03-15 20:01:55 +00:00
Greg Kurz	525a3030a8	virtiofsd: vu_dispatch locking should never fail pthread_rwlock_rdlock() and pthread_rwlock_wrlock() can fail if a deadlock condition is detected or the current thread already owns the lock. They can also fail, like pthread_rwlock_unlock(), if the mutex wasn't properly initialized. None of these are ever expected to happen with fv_VuDev::vu_dispatch_rwlock. Some users already check the return value and assert, some others don't. Introduce rdlock/wrlock/unlock wrappers that just do the former and use them everywhere for improved consistency and robustness. This is just cleanup. It doesn't fix any actual issue. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210203182434.93870-1-groug@kaod.org> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-02-16 17:03:09 +00:00
Vivek Goyal	e49393a349	virtiofsd: Use --thread-pool-size=0 to mean no thread pool Right now we create a thread pool and main thread hands over the request to thread in thread pool to process. Number of threads in thread pool can be managed by option --thread-pool-size. In tests we have noted that many of the workloads are getting better performance if we don't use a thread pool at all and process all the requests in the context of a thread receiving the request. Hence give user an option to be able to run virtiofsd without using a thread pool. To implement this, I have used existing option --thread-pool-size. This option defines how many maximum threads can be in the thread pool. Thread pool size zero freezes thead pool. I can't see why will one start virtiofsd with a frozen thread pool (hence frozen file system). So I am redefining --thread-pool-size=0 to mean, don't use a thread pool. Instead process the request in the context of thread receiving request from the queue. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20201109143548.GA1479853@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-12-18 10:08:24 +00:00
Markus Armbruster	4bd802b209	Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes, with the changes to the following files manually reverted: contrib/libvhost-user/libvhost-user-glib.h contrib/libvhost-user/libvhost-user.c contrib/libvhost-user/libvhost-user.h contrib/plugins/hotblocks.c contrib/plugins/hotpages.c contrib/plugins/howvec.c contrib/plugins/lockstep.c linux-user/mips64/cpu_loop.c linux-user/mips64/signal.c linux-user/sparc64/cpu_loop.c linux-user/sparc64/signal.c linux-user/x86_64/cpu_loop.c linux-user/x86_64/signal.c target/s390x/gen-features.c tests/fp/platform.h tests/migration/s390x/a-b-bios.c tests/plugin/bb.c tests/plugin/empty.c tests/plugin/insn.c tests/plugin/mem.c tests/test-rcu-simpleq.c tests/test-rcu-slist.c tests/test-rcu-tailq.c tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.c contrib/plugins/, tests/plugin/, and tests/test-rcu-slist.c appear not to include osdep.h intentionally. The remaining reverts are the same as in commit `bbfff19688`. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20201113061216.2483385-1-armbru@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Alexander Bulekov <alxndr@bu.edu>	2020-12-10 17:16:44 +01:00
Marc-André Lureau	0df750e9d3	libvhost-user: make it a meson subproject By making libvhost-user a subproject, check it builds standalone (without the global QEMU cflags etc). Note that the library still relies on QEMU include/qemu/atomic.h and linux_headers/. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20201125100640.366523-6-marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-12-08 13:48:58 -05:00
Philippe Mathieu-Daudé	2693026042	tools/virtiofsd: Check vu_init() return value (CID 1435958) Since commit `6f5fd83788`, vu_init() can fail if malloc() returns NULL. This fixes the following Coverity warning: CID 1435958 (#1 of 1): Unchecked return value (CHECKED_RETURN) Fixes: `6f5fd83788` ("libvhost-user: support many virtqueues") Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201102092339.2034297-1-philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-11-02 18:32:41 +00:00
Coiby Xu	049f55502a	libvhost-user: Allow vu_message_read to be replaced Allow vu_message_read to be replaced by one which will make use of the QIOChannel functions. Thus reading vhost-user message won't stall the guest. For slave channel, we still use the default vu_message_read. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Coiby Xu <coiby.xu@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200918080912.321299-2-coiby.xu@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-10-23 13:42:16 +01:00
Alex Bennée	f6698f2b03	tools/virtiofsd: add support for --socket-group If you like running QEMU as a normal user (very common for TCG runs) but you have to run virtiofsd as a root user you run into connection problems. Adding support for an optional --socket-group allows the users to keep using the command line. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200925125147.26943-2-alex.bennee@linaro.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Split long line	2020-10-12 12:39:38 +01:00
Vivek Goyal	04d325e86f	virtiofsd: Used glib "shared" thread pool glib offers thread pools and it seems to support "exclusive" and "shared" thread pools. https://developer.gnome.org/glib/stable/glib-Thread-Pools.html#g-thread-pool-new Currently we use "exlusive" thread pools but its performance seems to be poor. I tried using "shared" thread pools and performance seems much better. I posted performance results here. https://www.redhat.com/archives/virtio-fs/2020-September/msg00080.html So lets switch to shared thread pools. We can think of making it optional once somebody can show in what cases exclusive thread pools offer better results. For now, my simple performance tests across the board see better results with shared thread pools. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20200921213216.GE13362@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> With seccomp fix from Miklos	2020-09-25 12:45:58 +01:00
Stefan Hajnoczi	fd9279ec99	virtiofsd: probe unshare(CLONE_FS) and print an error An assertion failure is raised during request processing if unshare(CLONE_FS) fails. Implement a probe at startup so the problem can be detected right away. Unfortunately Docker/Moby does not include unshare in the seccomp.json list unless CAP_SYS_ADMIN is given. Other seccomp.json lists always include unshare (e.g. podman is unaffected): https://raw.githubusercontent.com/seccomp/containers-golang/master/seccomp.json Use "docker run --security-opt seccomp=path/to/seccomp.json ..." if the default seccomp.json is missing unshare. Cc: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200727190223.422280-4-stefanha@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-08-28 13:34:52 +01:00
Misono Tomohiro	bdfd667883	virtiofsd: Fix xattr operations Current virtiofsd has problems about xattr operations and they does not work properly for directory/symlink/special file. The fundamental cause is that virtiofsd uses openat() + f...xattr() systemcalls for xattr operation but we should not open symlink/special file in the daemon. Therefore the function is restricted. Fix this problem by: 1. during setup of each thread, call unshare(CLONE_FS) 2. in xattr operations (i.e. lo_getxattr), if inode is not a regular file or directory, use fchdir(proc_loot_fd) + ...xattr() + fchdir(root.fd) instead of openat() + f...xattr() (Note: for a regular file/directory openat() + f...xattr() is still used for performance reason) With this patch, xfstests generic/062 passes on virtiofs. This fix is suggested by Miklos Szeredi and Stefan Hajnoczi. The original discussion can be found here: https://www.redhat.com/archives/virtio-fs/2019-October/msg00046.html Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Message-Id: <20200227055927.24566-3-misono.tomohiro@jp.fujitsu.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-03 15:13:24 +00:00
Dr. David Alan Gilbert	6fa249027f	virtiofsd: fv_create_listen_socket error path socket leak If we fail when bringing up the socket we can leak the listen_fd; in practice the daemon will exit so it's not really a problem. Fixes: Coverity CID 1413121 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-02-10 17:24:43 +00:00
Eryu Guan	9883df8cca	virtiofsd: stop all queue threads on exit in virtio_loop() On guest graceful shutdown, virtiofsd receives VHOST_USER_GET_VRING_BASE request from VMM and shuts down virtqueues by calling fv_set_started(), which joins fv_queue_thread() threads. So when virtio_loop() returns, there should be no thread is still accessing data in fuse session and/or virtio dev. But on abnormal exit, e.g. guest got killed for whatever reason, vhost-user socket is closed and virtio_loop() breaks out the main loop and returns to main(). But it's possible fv_queue_worker()s are still working and accessing fuse session and virtio dev, which results in crash or use-after-free. Fix it by stopping fv_queue_thread()s before virtio_loop() returns, to make sure there's no-one could access fuse session and virtio dev. Reported-by: Qingming Su <qingming.su@linux.alibaba.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Stefan Hajnoczi	951b3120db	virtiofsd: add --thread-pool-size=NUM option Add an option to control the size of the thread pool. Requests are now processed in parallel by default. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Stefan Hajnoczi	a3d756c5ae	virtiofsd: process requests in a thread pool Introduce a thread pool so that fv_queue_thread() just pops VuVirtqElements and hands them to the thread pool. For the time being only one worker thread is allowed since passthrough_ll.c is not thread-safe yet. Future patches will lift this restriction so that multiple FUSE requests can be processed in parallel. The main new concept is struct FVRequest, which contains both VuVirtqElement and struct fuse_chan. We now have fv_VuDev for a device, fv_QueueInfo for a virtqueue, and FVRequest for a request. Some of fv_QueueInfo's fields are moved into FVRequest because they are per-request. The name FVRequest conforms to QEMU coding style and I expect the struct fv_* types will be renamed in a future refactoring. This patch series is not optimal. fbuf reuse is dropped so each request does malloc(se->bufsize), but there is no clean and cheap way to keep this with a thread pool. The vq_lock mutex is held for longer than necessary, especially during the eventfd_write() syscall. Performance can be improved in the future. prctl(2) had to be added to the seccomp whitelist because glib invokes it. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Stefan Hajnoczi	e7b337326d	virtiofsd: prevent fv_queue_thread() vs virtio_loop() races We call into libvhost-user from the virtqueue handler thread and the vhost-user message processing thread without a lock. There is nothing protecting the virtqueue handler thread if the vhost-user message processing thread changes the virtqueue or memory table while it is running. This patch introduces a read-write lock. Virtqueue handler threads are readers. The vhost-user message processing thread is a writer. This will allow concurrency for multiqueue in the future while protecting against fv_queue_thread() vs virtio_loop() races. Note that the critical sections could be made smaller but it would be more invasive and require libvhost-user changes. Let's start simple and improve performance later, if necessary. Another option would be an RCU-style approach with lighter-weight primitives. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Liu Bo	740b0b700a	Virtiofsd: fix memory leak on fuse queueinfo For fuse's queueinfo, both queueinfo array and queueinfos are allocated in fv_queue_set_started() but not cleaned up when the daemon process quits. This fixes the leak in proper places. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Masayoshi Mizuma	96814800d2	virtiofsd: Prevent multiply running with same vhost_user_socket virtiofsd can run multiply even if the vhost_user_socket is same path. ]# ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/tmp/share & [1] 244965 virtio_session_mount: Waiting for vhost-user socket connection... ]# ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/tmp/share & [2] 244966 virtio_session_mount: Waiting for vhost-user socket connection... ]# The user will get confused about the situation and maybe the cause of the unexpected problem. So it's better to prevent the multiple running. Create a regular file under localstatedir directory to exclude the vhost_user_socket. To create and lock the file, use qemu_write_pidfile() because the API has some sanity checks and file lock. Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Applied fixes from Stefan's review and moved osdep include Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Liu Bo	61cfc44982	virtiofsd: cleanup allocated resource in se This cleans up unfreed resources in se on quiting, including se->virtio_dev, se->vu_socket_path, se->vu_socketfd. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Dr. David Alan Gilbert	10477ac47f	virtiofsd: Kill threads when queues are stopped Kill the threads we've started when the queues get stopped. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> With improvements by: Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:37 +00:00
Dr. David Alan Gilbert	e17f7a580e	virtiofsd: Pass write iov's all the way through Pass the write iov pointing to guest RAM all the way through rather than copying the data. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	469f9d2fc4	virtiofsd: Plumb fuse_bufvec through to do_write_buf Let fuse_session_process_buf_int take a fuse_bufvec * instead of a fuse_buf; and then through to do_write_buf - where in the best case it can pass that straight through to op.write_buf without copying (other than skipping a header). Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Stefan Hajnoczi	cee8e35d43	virtiofsd: add --fd=FDNUM fd passing option Although --socket-path=PATH is useful for manual invocations, management tools typically create the UNIX domain socket themselves and pass it to the vhost-user device backend. This way QEMU can be launched immediately with a valid socket. No waiting for the vhost-user device backend is required when fd passing is used. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	eb49d187ef	virtiofsd: Fast path for virtio read Readv the data straight into the guests buffer. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> With fix by: Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	2f65e69a7f	virtiofsd: Keep track of replies Keep track of whether we sent a reply to a request; this is a bit paranoid but it means: a) We should always recycle an element even if there was an error in the request b) Never try and send two replies on one queue element Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	df57ba919e	virtiofsd: Send replies to messages Route fuse out messages back through the same queue elements that had the command that triggered the request. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	b509e1228b	virtiofsd: Start reading commands from queue Pop queue elements off queues, copy the data from them and pass that to fuse. Note: 'out' in a VuVirtqElement is from QEMU 'in' in libfuse is into the daemon So we read from the out iov's to get a fuse_in_header When we get a kick we've got to read all the elements until the queue is empty. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	5dcd1f5614	virtiofsd: Poll kick_fd for queue In the queue thread poll the kick_fd we're passed. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	e4c55a3c14	virtiofsd: Start queue threads Start a thread for each queue when we get notified it's been started. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> fix by: Signed-off-by: Jun Piao <piaojun@huawei.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00
Dr. David Alan Gilbert	f2cef5fb9a	virtiofsd: get/set features callbacks Add the get/set features callbacks. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-23 16:41:36 +00:00

1 2

53 Commits