qemu-e2k

Author	SHA1	Message	Date
Christian Ehrhardt	7b223e3860	tools/virtiofsd: Add rseq syscall to the seccomp allowlist The virtiofsd currently crashes when used with glibc 2.35. That is due to the rseq system call being added to every thread creation [1][2]. [1]: https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/ [2]: https://sourceware.org/pipermail/libc-alpha/2022-February/136040.html This happens not at daemon start, but when a guest connects /usr/lib/qemu/virtiofsd -f --socket-path=/tmp/testvfsd -o sandbox=chroot \ -o source=/var/guests/j-virtiofs --socket-group=kvm virtio_session_mount: Waiting for vhost-user socket connection... # start ok, now guest will connect virtio_session_mount: Received vhost-user socket connection virtio_loop: Entry fv_queue_set_started: qidx=0 started=1 fv_queue_set_started: qidx=1 started=1 Bad system call (core dumped) We have to put rseq on the seccomp allowlist to avoid that the daemon is crashing in this case. Reported-by: Michael Hudson-Doyle <michael.hudson@canonical.com> Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20220209111456.3328420-1-christian.ehrhardt@canonical.com [Moved rseq to its alphabetically ordered position in the seccomp allowlist. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-02-14 17:11:20 +00:00
Vivek Goyal	449e8171f9	virtiofsd: Drop membership of all supplementary groups (CVE-2022-0358) At the start, drop membership of all supplementary groups. This is not required. If we have membership of "root" supplementary group and when we switch uid/gid using setresuid/setsgid, we still retain membership of existing supplemntary groups. And that can allow some operations which are not normally allowed. For example, if root in guest creates a dir as follows. $ mkdir -m 03777 test_dir This sets SGID on dir as well as allows unprivileged users to write into this dir. And now as unprivileged user open file as follows. $ su test $ fd = open("test_dir/priviledge_id", O_RDWR\|O_CREAT\|O_EXCL, 02755); This will create SGID set executable in test_dir/. And that's a problem because now an unpriviliged user can execute it, get egid=0 and get access to resources owned by "root" group. This is privilege escalation. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2044863 Fixes: CVE-2022-0358 Reported-by: JIETAO XIAO <shawtao1125@gmail.com> Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <YfBGoriS38eBQrAb@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fixed missing {}'s style nit	2022-01-26 10:32:05 +00:00
Dr. David Alan Gilbert	555a76e5e5	virtiofsd: Error on bad socket group name Make the '--socket-group=' option fail if the group name is unknown: ./tools/virtiofsd/virtiofsd .... --socket-group=zaphod vhost socket: unable to find group 'zaphod' Reported-by: Xiaoling Gao <xiagao@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211014122554.34599-1-dgilbert@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 19:38:32 +01:00
Vivek Goyal	50cf6d6cb7	virtiofsd: Add a helper to stop all queues Use a helper to stop all the queues. Later in the patch series I am planning to use this helper at one more place later in the patch series. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-6-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 18:58:42 +01:00
Vivek Goyal	c68276556a	virtiofsd: Add a helper to send element on virtqueue We have open coded logic to take locks and push element on virtqueue at three places. Add a helper and use it everywhere. Code is easier to read and less number of lines of code. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-5-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 18:58:20 +01:00
Vivek Goyal	a88abc6f84	virtiofsd: Remove unused virtio_fs_config definition "struct virtio_fs_config" definition seems to be unused in fuse_virtio.c. Remove it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210930153037.1194279-4-vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 18:58:02 +01:00
Vivek Goyal	5afc8df46c	virtiofsd: xattr mapping add a new type "unsupported" Right now for xattr remapping, we support types of "prefix", "ok" or "bad". Type "bad" returns -EPERM on setxattr and hides xattr in listxattr. For getxattr, mapping code returns -EPERM but getxattr code converts it to -ENODATA. I need a new semantics where if an xattr is unsupported, then getxattr()/setxattr() return -ENOTSUP and listxattr() should hide the xattr. This is needed to simulate that security.selinux is not supported by virtiofs filesystem and in that case client falls back to some default label specified by policy. So add a new type "unsupported" which returns -ENOTSUP on getxattr() and setxattr() and hides xattrs in listxattr(). For example, one can use following mapping rule to not support security.selinux xattr and allow others. "-o xattrmap=/unsupported/all/security.selinux/security.selinux//ok/all///" Suggested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <YUt9qbmgAfCFfg5t@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-10-25 18:48:23 +01:00
Peter Maydell	7adb961995	virtiofsd pull 2021-08-16 Two minor fixes; one for performance, the other seccomp on s390x. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmFDS+oACgkQBRYzHrxb /edhFg//Wldr2jMom7TlbgkhsbjPQbdbsbdHx7rWXviq3bqMq2cB8UzcHvhUFkW4 j15ElKDO2ZFsjtjaFQLtQmuO4zoMN/V4u8c/0V93b/vaAh4IgMYiDXaZ18fpuYi9 Y02tiTcTUyUj4fv5OUoUeynNUkkzgxGrL8Q5oZK3KSHn5uwFOWgnZiAycbECKzJH wNljEpXjyMcZoIUJvoJ9oO246be3Flo82eA8UVOj+O0MMb2/tl18DL5IGOnglBYB U/9CkFoa5qHfiIQ63/OsndHBMXSTZiOqnv+S9RSbAdoZRTjVP1BjFYvNjnzz9/Nv czHfM+ecjLxNzG7WSHOrdm8D+L3E4O42Xuf6umcib1KfV6l/giQ3V9WfqfEX1JA4 V6XpZ5C25rs6AqFwbbh/Eo+wvb2syck30sXbwh7C1oDK/nfwHDgegd1EPE9VUaxi c0yoqV/ScHfo2fbK0QVkIt2mwi8lcjH3cg2gSKfZ0YiHcYBqC7RB9IonndEkIoqd mdvAdGafD27dDEsm1OkSxBDItmE+BZ7C2+7OF5x+a/rnt7L+yQszTG/A0DNH23kD ktvTjEBLx2jzhmR+4My5YnTYoK0rE3zs0Nh2zxg7Kx9Sh3LlMtdN5o2GS0USq0wm YCT4EfNAhPumPi+59mOCybgV6tayxz/ihgjAymQONAcRrTwddyo= =GMdE -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20210916' into staging virtiofsd pull 2021-08-16 Two minor fixes; one for performance, the other seccomp on s390x. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> # gpg: Signature made Thu 16 Sep 2021 14:51:38 BST # gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full] # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert-gitlab/tags/pull-virtiofs-20210916: virtiofsd: Reverse req_list before processing it tools/virtiofsd: Add fstatfs64 syscall to the seccomp allowlist Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-09-19 18:53:29 +01:00
Sergio Lopez	046d91c83c	virtiofsd: Reverse req_list before processing it With the thread pool disabled, we add the requests in the queue to a GList, processing by iterating over there afterwards. For adding them, we're using "g_list_prepend()", which is more efficient but causes the requests to be processed in reverse order, breaking the read-ahead and request-merging optimizations in the host for sequential operations. According to the documentation, if you need to process the request in-order, using "g_list_prepend()" and then reversing the list with "g_list_reverse()" is more efficient than using "g_list_append()", so let's do it that way. Testing on a spinning disk (to boost the increase of read-ahead and request-merging) shows a 4x improvement on sequential write fio test: Test: fio --directory=/mnt/virtio-fs --filename=fio-file1 --runtime=20 --iodepth=16 --size=4G --direct=1 --blocksize=4K --ioengine libaio --rw write --name seqwrite-libaio Without "g_list_reverse()": ... Jobs: 1 (f=1): [W(1)][100.0%][w=22.4MiB/s][w=5735 IOPS][eta 00m:00s] seqwrite-libaio: (groupid=0, jobs=1): err= 0: pid=710: Tue Aug 24 12:58:16 2021 write: IOPS=5709, BW=22.3MiB/s (23.4MB/s)(446MiB/20002msec); 0 zone resets ... With "g_list_reverse()": ... Jobs: 1 (f=1): [W(1)][100.0%][w=84.0MiB/s][w=21.5k IOPS][eta 00m:00s] seqwrite-libaio: (groupid=0, jobs=1): err= 0: pid=716: Tue Aug 24 13:00:15 2021 write: IOPS=21.3k, BW=83.1MiB/s (87.2MB/s)(1663MiB/20001msec); 0 zone resets ... Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210824131158.39970-1-slp@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-09-16 14:50:48 +01:00
Thomas Huth	8cfd339b3d	tools/virtiofsd: Add fstatfs64 syscall to the seccomp allowlist The virtiofsd currently crashes on s390x when doing something like this in the guest: mkdir -p /mnt/myfs mount -t virtiofs myfs /mnt/myfs touch /mnt/myfs/foo.txt stat -f /mnt/myfs/foo.txt The problem is that the fstatfs64 syscall is called in this case from the virtiofsd. We have to put it on the seccomp allowlist to avoid that the daemon gets killed in this case. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001728 Suggested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210914123214.181885-1-thuth@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-09-16 14:50:48 +01:00
Michael Tokarev	68857f13aa	spelling: sytem => system Signed-off-By: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <fefb5f5c-82bc-05e2-b4c1-665e9d6896ff@msgid.tls.msk.ru> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-09-15 15:51:07 +02:00
Hubert Jasudowicz	7ef2408a96	virtiofsd: Add missing newline in error message Signed-off-by: Hubert Jasudowicz <hubert.jasudowicz@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <e5914ad202a13e9c1bc2a5efa267ff3bd4f48db6.1625173475.git.hubert.jasudowicz@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-07-09 18:42:46 +02:00
Vivek Goyal	65a820d292	virtiofsd: Add an option to enable/disable posix acls fuse has an option FUSE_POSIX_ACL which needs to be opted in by fuse server to enable posix acls. As of now we are not opting in for this, so posix acls are disabled on virtiofs by default. Add virtiofsd option "-o posix_acl/no_posix_acl" to let users enable/disable posix acl support. By default it is disabled as of now due to performance concerns with cache=none. Currently even if file server has not opted in for FUSE_POSIX_ACL, user can still query acl and set acl, and system.posix_acl_access and system.posix_acl_default xattrs show up listxattr response. Miklos said this is confusing. So he said lets block and filter system.posix_acl_access and system.posix_acl_default xattrs in getxattr/setxattr/listxattr if user has explicitly disabled posix acls using -o no_posix_acl. As of now continuing to keeping the existing behavior if user did not specify any option to disable acl support due to concerns about backward compatibility. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210622150852.1507204-8-vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Vivek Goyal	f1aa1774df	virtiofsd: Switch creds, drop FSETID for system.posix_acl_access xattr When posix access acls are set on a file, it can lead to adjusting file permissions (mode) as well. If caller does not have CAP_FSETID and it also does not have membership of owner group, this will lead to clearing SGID bit in mode. Current fuse code is written in such a way that it expects file server to take care of chaning file mode (permission), if there is a need. Right now, host kernel does not clear SGID bit because virtiofsd is running as root and has CAP_FSETID. For host kernel to clear SGID, virtiofsd need to switch to gid of caller in guest and also drop CAP_FSETID (if caller did not have it to begin with). If SGID needs to be cleared, client will set the flag FUSE_SETXATTR_ACL_KILL_SGID in setxattr request. In that case server should kill sgid. Currently just switch to uid/gid of the caller and drop CAP_FSETID and that should do it. This should fix the xfstest generic/375 test case. We don't have to switch uid for this to work. That could be one optimization that pass a parameter to lo_change_cred() to only switch gid and not uid. Also this will not work whenever (if ever) we support idmapped mounts. In that case it is possible that uid/gid in request are 0/0 but still we need to clear SGID. So we will have to pick a non-root sgid and switch to that instead. That's an TODO item for future when idmapped mount support is introduced. This patch only adds the capability to switch creds and drop FSETID when acl xattr is set. This does not take affect yet. It can take affect when next patch adds the capability to enable posix_acl. Reported-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210622150852.1507204-7-vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Vivek Goyal	227e5d7fd5	virtiofsd: Add capability to change/restore umask When parent directory has default acl and a file is created in that directory, then umask is ignored and final file permissions are determined using default acl instead. (man 2 umask). Currently, fuse applies the umask and sends modified mode in create request accordingly. fuse server can set FUSE_DONT_MASK and tell fuse client to not apply umask and fuse server will take care of it as needed. With posix acls enabled, requirement will be that we want umask to determine final file mode if parent directory does not have default acl. So if posix acls are enabled, opt in for FUSE_DONT_MASK. virtiofsd will set umask of the thread doing file creation. And host kernel should use that umask if parent directory does not have default acls, otherwise umask does not take affect. Miklos mentioned that we already call unshare(CLONE_FS) for every thread. That means umask has now become property of per thread and it should be ok to manipulate it in file creation path. This patch only adds capability to change umask and restore it. It does not enable it yet. Next few patches will add capability to enable it based on if user enabled posix_acl or not. This should fix fstest generic/099. Reported-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210622150852.1507204-6-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Vivek Goyal	6d0028b947	virtiofsd: Add umask to seccom allow list Patches in this series are going to make use of "umask" syscall. So allow it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210622150852.1507204-5-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Vivek Goyal	c46ef954fa	virtiofsd: Add support for extended setxattr Add the bits to enable support for setxattr_ext if fuse offers it. Do not enable it by default yet. Let passthrough_ll opt-in. Enabling it by deafult kind of automatically means that you are taking responsibility of clearing SGID if ACL is set. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210622150852.1507204-4-vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up double def in fuse_common.h	2021-07-05 10:51:26 +01:00
Vivek Goyal	5290fb625d	virtiofsd: Fix xattr operations overwriting errno getxattr/setxattr/removexattr/listxattr operations handle regualar and non-regular files differently. For the case of non-regular files we do fchdir(/proc/self/fd) and the xattr operation and then revert back to original working directory. After this we are saving errno and that's buggy because fchdir() will overwrite the errno. FCHDIR_NOFAIL(lo->proc_self_fd); ret = getxattr(procname, name, value, size); FCHDIR_NOFAIL(lo->root.fd); if (ret == -1) saverr = errno In above example, if getxattr() failed, we will still return 0 to caller as errno must have been written by FCHDIR_NOFAIL(lo->root.fd) call. Fix all such instances and capture "errno" early and save in "saverr" variable. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210622150852.1507204-3-vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Vivek Goyal	061624455f	virtiofsd: Fix fuse setxattr() API change issue With kernel header updates fuse_setxattr_in struct has grown in size. But this new struct size only takes affect if user has opted in for fuse feature FUSE_SETXATTR_EXT otherwise fuse continues to send "fuse_setxattr_in" of older size. Older size is determined by FUSE_COMPAT_SETXATTR_IN_SIZE. Fix this. If we have not opted in for FUSE_SETXATTR_EXT, then expect that we will get fuse_setxattr_in of size FUSE_COMPAT_SETXATTR_IN_SIZE and not sizeof(struct fuse_sexattr_in). Fixes: `278f064e45` ("Update Linux headers to 5.13-rc4") Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210622150852.1507204-2-vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Greg Kurz	1d03e56607	virtiofsd: Don't allow file creation with FUSE_OPEN A well behaved FUSE client uses FUSE_CREATE to create files. It isn't supposed to pass O_CREAT along a FUSE_OPEN request, as documented in the "fuse_lowlevel.h" header : /** * Open a file * * Open flags are available in fi->flags. The following rules * apply. * * - Creation (O_CREAT, O_EXCL, O_NOCTTY) flags will be * filtered out / handled by the kernel. But if the client happens to do it anyway, the server ends up passing this flag to open() without the mandatory mode_t 4th argument. Since open() is a variadic function, glibc will happily pass whatever it finds on the stack to the syscall. If this file is compiled with -D_FORTIFY_SOURCE=2, glibc will even detect that and abort: * invalid openat64 call: O_CREAT or O_TMPFILE without mode *: terminated Specifying O_CREAT with FUSE_OPEN is a protocol violation. Check this in do_open(), print out a message and return an error to the client, EINVAL like we already do when fuse_mbuf_iter_advance() fails. The FUSE filesystem doesn't currently support O_TMPFILE, but the very same would happen if O_TMPFILE was passed in a FUSE_OPEN request. Check that as well. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210624101809.48032-1-groug@kaod.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Daniel P. Berrangé	d9a801f7e9	virtiofsd: use GDateTime for formatting timestamp for debug messages The GDateTime APIs provided by GLib avoid portability pitfalls, such as some platforms where 'struct timeval.tv_sec' field is still 'long' instead of 'time_t'. When combined with automatic cleanup, GDateTime often results in simpler code too. Localtime is changed to UTC to avoid the need to grant extra seccomp permissions for GLib's access of the timezone database. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210611164319.67762-1-berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Peter Maydell	1cbd2d9149	-----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJgudWYAAoJEO8Ells5jWIR3nIH/1N7d60CHf986IzLdUVF/b8g ME/SiDB+SdnYgmEmWhNhxWpWeroyPbKqhU/eSqvPj8E8BvKj9Ze1laFdaxs/kwos N03ly0T/jlbm1yMg0Y986zxjh3HE4fpQooWW3ToA3TgycDUtkHMMd0qVtRaTWv0M KG3MbyHsp7MkR3S4wHBkE9yrVDCziBibZvkxhhz1VpEHjRjNDoNbevotE5Gr43+N 50D2TxRNVd6MjN7KGJOXQHc7t22OKb2/1fKTS1Pp+oGnDxHh63G6pGQ4LpC8wEjW 2h49tcAWHQ4SafkDqyapXgTACHs4k4TV/zUg8cUDFtkAArawHppwYHoAXvz8kd8= =m1ZO -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging # gpg: Signature made Fri 04 Jun 2021 08:26:16 BST # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal] # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * remotes/jasowang/tags/net-pull-request: MAINTAINERS: Added eBPF maintainers information. docs: Added eBPF documentation. virtio-net: Added eBPF RSS to virtio-net. ebpf: Added eBPF RSS loader. ebpf: Added eBPF RSS program. net: Added SetSteeringEBPF method for NetClientState. net/tap: Added TUNSETSTEERINGEBPF code. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-06-04 13:38:49 +01:00
Andrew Melnychenko	f3fa412de2	ebpf: Added eBPF RSS program. RSS program and Makefile to build it. The bpftool used to generate '.h' file. The data in that file may be loaded by libbpf. EBPF compilation is not required for building qemu. You can use Makefile if you need to regenerate rss.bpf.skeleton.h. Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com> Signed-off-by: Andrew Melnychenko <andrew@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2021-06-04 15:25:46 +08:00
Mahmoud Mandour	bf99f30bc3	tools/virtiofsd/fuse_opt.c: Replaced a malloc with GLib's g_try_malloc Replaced a malloc() call and its respective free() with GLib's g_try_malloc() and g_free() calls. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Message-Id: <20210314032324.45142-8-ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Mahmoud Mandour	d14d4f4f18	tools/virtiofsd/buffer.c: replaced a calloc call with GLib's g_try_new0 Replaced a call to calloc() and its respective free() call with GLib's g_try_new0() and g_free() calls. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Message-Id: <20210314032324.45142-7-ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	b5fd59cf90	virtiofsd: Set req->reply_sent right after sending reply There is no reason to set it in label "err". We should be able to set it right after sending reply. It is easier to read. Also got rid of label "err" because now only thing it was doing was return a code. We can return from the error location itself and no need to first jump to label "err". Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-8-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	1a5fff8e63	virtiofsd: Check EOF before short read In virtio_send_data_iov() we are checking first for short read and then EOF condition. Change the order. Basically check for error and EOF first and last remaining piece is short ready which will lead to retry automatically at the end of while loop. Just that it is little simpler to read to the code. There is no need to call "continue" and also one less call of "len-=ret". Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-7-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	bf7a3ee044	virtiofsd: Simplify skip byte logic We need to skip bytes in two cases. a. Before we start reading into in_sg, we need to skip iov_len bytes in the beginning which typically will have fuse_out_header. b. If preadv() does a short read, then we need to retry preadv() with remainig bytes and skip the bytes preadv() read in short read. For case a, there is no reason that skipping logic be inside the while loop. Move it outside. And only retain logic "b" inside while loop. Also get rid of variable "skip_size". Looks like we can do without it. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-6-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	0106f6f234	virtiofsd: get rid of in_sg_left variable in_sg_left seems to be being used primarly for debugging purpose. It is keeping track of how many bytes are left in the scatter list we are reading into. We already have another variable "len" which keeps track how many bytes are left to be read. And in_sg_left is greater than or equal to len. We have already ensured that in the beginning of function. if (in_len < tosend_len) { fuse_log(FUSE_LOG_ERR, "%s: elem %d too small for data len %zd\n", __func__, elem->index, tosend_len); ret = E2BIG; goto err; } So in_sg_left seems like a redundant variable. It probably was useful for debugging when code was being developed. Get rid of it. It helps simplify this function. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-5-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	97dbfc5ae6	virtiofsd: Use iov_discard_front() to skip bytes There are places where we need to skip few bytes from front of the iovec array. We have our own custom code for that. Looks like iov_discard_front() can do same thing. So use that helper instead. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-4-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	b31ff38931	virtiofsd: Get rid of unreachable code in read pvreadv() can return following. - error - 0 in case of EOF - short read We seem to handle all the cases already. We are retrying read in case of short read. So another check for short read seems like dead code. Get rid of it. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-3-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Vivek Goyal	04c9f7e04a	virtiofsd: Check for EINTR in preadv() and retry We don't seem to check for EINTR and retry. There are other places in code where we check for EINTR. So lets add a check. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210518213538.693422-2-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-26 18:39:32 +01:00
Greg Kurz	4962b312cd	virtiofsd: Fix check of chown()'s return value Otherwise you always get this warning when using --socket-group=users vhost socket failed to set group to users (100) While here, print out the error if chown() fails. Fixes: `f6698f2b03` ("tools/virtiofsd: add support for --socket-group") Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <162040394890.714971.15502455176528384778.stgit@bahia.lan> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-13 17:48:47 +02:00
Mahmoud Mandour	67a010f64c	virtiofsd/fuse_virtio.c: Changed allocations of locals to GLib Replaced the allocation of local variables from malloc() to GLib allocation functions. In one instance, dropped the usage to an assert after a malloc() call and used g_malloc() instead. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-8-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	c9a276f57c	virtiofsd/passthrough_ll.c: Changed local allocations to GLib functions Changed the allocations of some local variables to GLib's allocation functions, such as g_try_malloc0(), and annotated those variables as g_autofree. Subsequently, I was able to remove the calls to free(). Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-7-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	31dfd22d7c	virtiofsd: Changed allocations of fv_VuDev & its internals to GLib functions Changed the allocations of fv_VuDev structs, VuDev structs, and fv_QueueInfo strcuts from using calloc()/realloc() & free() to using the equivalent functions from GLib. In instances, removed the pair of allocation and assertion for non-NULL checking with a GLib function that aborts on error. Removed NULL-checking for fv_VuDev struct allocation and used a GLib function that crashes on error; namely, g_new0(). This is because allocating one struct should not be a problem on an healthy system. Also following the pattern of aborting-on-null behaviour that is taken with allocating VuDev structs and fv_QueueInfo structs. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-6-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	e85d6d1ef2	virtiofsd: Changed allocation of lo_map_elems to GLib's functions Replaced (re)allocation of lo_map_elem structs from realloc() to GLib's g_try_realloc_n() and replaced the respective free() call with a g_free(). Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-5-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	f90a2d68c0	virtiofsd: Changed allocations of fuse_session to GLib's functions Replaced the allocation and deallocation of fuse_session structs from calloc() and free() calls to g_try_new0() and g_free(). Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-4-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	01c6c6f982	virtiofsd: Changed allocations of iovec to GLib's functions Replaced the calls to malloc()/calloc() and their respective calls to free() of iovec structs with GLib's allocation and deallocation functions and used g_autofree when appropriate. Replaced the allocation of in_sg_cpy to g_new() instead of a call to calloc() and a null-checking assertion. Not g_new0() because the buffer is immediately overwritten using memcpy. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Message-Id: <20210427181333.148176-1-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Mahmoud Mandour	98bbd186ed	virtiofsd: Changed allocations of fuse_req to GLib functions Replaced the allocation and deallocation of fuse_req structs using calloc()/free() call pairs to a GLib's g_try_new0() and g_free(). Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210420154643.58439-2-ma.mandourr@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-05-06 19:47:44 +01:00
Dr. David Alan Gilbert	5bf5188a11	virtiofsd: Don't assume header layout virtiofsd incorrectly assumed a fixed set of header layout in the virt queue; assuming that the fuse and write headers were conveniently separated from the data; the spec doesn't allow us to take that convenience, so fix it up to deal with it the hard way. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210428110100.27757-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-05-06 19:47:44 +01:00
Dr. David Alan Gilbert	d02a3c5a1b	virtiofs: Fixup printf args Fixup some fuse_log printf args for 32bit compatibility. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210428110100.27757-2-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-05-06 19:47:44 +01:00
Carlos Venegas	1221a929be	virtiofsd: Add help for -o xattr-mapping The option is not documented in help. Add small help about the option. Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com> Message-Id: <20210414201207.3612432-3-jose.carlos.venegas.munoz@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com>	2021-05-06 19:47:44 +01:00
Carlos Venegas	a87d29e0d7	virtiofsd: Allow use "-o xattrmap" without "-o xattr" When -o xattrmap is used, it will not work unless xattr is enabled. This patch enables xattr when -o xattrmap is used. Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com> Message-Id: <20210414201207.3612432-2-jose.carlos.venegas.munoz@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com>	2021-05-06 19:47:44 +01:00
Greg Kurz	0adb3aff39	virtiofsd: Fix side-effect in assert() It is bad practice to put an expression with a side-effect in assert() because the side-effect won't happen if the code is compiled with -DNDEBUG. Use an intermediate variable. Consolidate this in an macro to have proper line numbers when the assertion is hit. virtiofsd: ../../tools/virtiofsd/passthrough_ll.c:2797: lo_getxattr: Assertion `fchdir_res == 0' failed. Aborted 2796 /* fchdir should not fail here */ =>2797 FCHDIR_NOFAIL(lo->proc_self_fd); 2798 ret = getxattr(procname, name, value, size); 2799 FCHDIR_NOFAIL(lo->root.fd); Fixes: `bdfd667883` ("virtiofsd: Fix xattr operations") Cc: misono.tomohiro@jp.fujitsu.com Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210409100627.451573-1-groug@kaod.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2021-05-06 19:47:44 +01:00
Dr. David Alan Gilbert	99c3ac6dbe	virtiofsd: Fix security.capability comparison My security fix for the security.capability remap has a silly early segfault in a simple case where there is an xattrmapping but it doesn't remap the security.capability. Fixes: `e586edcb41` ("virtiofs: drop remapped security.capability xattr as needed") Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210401145845.78445-1-dgilbert@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-04-06 18:56:01 +01:00
Alex Bennée	320d0bca94	tools/virtiofsd: include --socket-group in help I confused myself wandering if this had been merged by looking at the help output. It seems fuse_opt doesn't automagically add to help output so lets do it now. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Updates: `f6698f2b03` ("tools/virtiofsd: add support for --socket-group") Message-Id: <20210323165308.15244-5-alex.bennee@linaro.org>	2021-03-24 14:24:56 +00:00
Greg Kurz	03ccaaae48	virtiofsd: Convert some functions to return bool Both currently only return 0 or 1. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210312141003.819108-3-groug@kaod.org> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-03-15 20:01:55 +00:00
Greg Kurz	20afcc23b3	virtiofsd: Don't allow empty paths in lookup_name() When passed an empty filename, lookup_name() returns the inode of the parent directory, unless the parent is the root in which case the st_dev doesn't match and lo_find() returns NULL. This is because lookup_name() passes AT_EMPTY_PATH down to fstatat() or statx(). This behavior doesn't quite make sense because users of lookup_name() then pass the name to unlinkat(), renameat() or renameat2(), all of which will always fail on empty names. Drop AT_EMPTY_PATH from the flags in lookup_name() so that it has the consistent behavior of "returning an existing child inode or NULL" for all directories. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210312141003.819108-2-groug@kaod.org> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-03-15 20:01:55 +00:00
Greg Kurz	28d1ad0ea4	virtiofsd: Don't allow empty filenames POSIX.1-2017 clearly stipulates that empty filenames aren't allowed ([1] and [2]). Since virtiofsd is supposed to mirror the host file system hierarchy and the host can be assumed to be linux, we don't really expect clients to pass requests with an empty path in it. If they do so anyway, this would eventually cause an error when trying to create/lookup the actual inode on the underlying POSIX filesystem. But this could still confuse some code that wouldn't be ready to cope with this. Filter out empty names coming from the client at the top level, so that the rest doesn't have to care about it. This is done everywhere we already call is_safe_path_component(), but in a separate helper since the usual error for empty path names is ENOENT instead of EINVAL. [1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_170 [2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13 Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210312141003.819108-4-groug@kaod.org> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-03-15 20:01:55 +00:00

1 2 3 4 5

242 Commits