qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Greg Kurz	feabd6cf78	9pfs: Convert V9fsFidState::fid_list to QSIMPLEQ The fid_list is currently open-coded. This doesn't seem to serve any purpose that cannot be met with QEMU's generic lists. Let's go for a QSIMPLEQ : this will allow to add new fids at the end of the list and to improve the logic in v9fs_mark_fids_unreclaim(). Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20210118142300.801516-3-groug@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2021-01-21 17:49:45 +01:00
Greg Kurz	2e53160fc6	9pfs: Convert V9fsFidState::clunked to bool This can only be 0 or 1. Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20210118142300.801516-2-groug@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2021-01-21 17:49:45 +01:00
Greg Kurz	acef3f8b47	9pfs/proxy: Check return value of proxy_marshal() This should always successfully write exactly two 32-bit integers. Make it clear with an assert(), like v9fs_receive_status() and v9fs_receive_response() already do when unmarshalling the same header. Fixes: Coverity CID 1438968 Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <161035859647.1221144.4691749806675653934.stgit@bahia.lan> Signed-off-by: Greg Kurz <groug@kaod.org>	2021-01-21 17:49:45 +01:00
Greg Kurz	89fbea8737	9pfs: Fully restart unreclaim loop (CVE-2021-20181) Depending on the client activity, the server can be asked to open a huge number of file descriptors and eventually hit RLIMIT_NOFILE. This is currently mitigated using a reclaim logic : the server closes the file descriptors of idle fids, based on the assumption that it will be able to re-open them later. This assumption doesn't hold of course if the client requests the file to be unlinked. In this case, we loop on the entire fid list and mark all related fids as unreclaimable (the reclaim logic will just ignore them) and, of course, we open or re-open their file descriptors if needed since we're about to unlink the file. This is the purpose of v9fs_mark_fids_unreclaim(). Since the actual opening of a file can cause the coroutine to yield, another client request could possibly add a new fid that we may want to mark as non-reclaimable as well. The loop is thus restarted if the re-open request was actually transmitted to the backend. This is achieved by keeping a reference on the first fid (head) before traversing the list. This is wrong in several ways: - a potential clunk request from the client could tear the first fid down and cause the reference to be stale. This leads to a use-after-free error that can be detected with ASAN, using a custom 9p client - fids are added at the head of the list : restarting from the previous head will always miss fids added by a some other potential request All these problems could be avoided if fids were being added at the end of the list. This can be achieved with a QSIMPLEQ, but this is probably too much change for a bug fix. For now let's keep it simple and just restart the loop from the current head. Fixes: CVE-2021-20181 Buglink: https://bugs.launchpad.net/qemu/+bug/1911666 Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <161064025265.1838153.15185571283519390907.stgit@bahia.lan> Signed-off-by: Greg Kurz <groug@kaod.org>	2021-01-15 08:44:28 +01:00
Philippe Mathieu-Daudé	e6b99460b1	hw/9pfs: Fix Kconfig dependency problem between 9pfs and Xen Commit `b2c00bce54` ("meson: convert hw/9pfs, cleanup") introduced CONFIG_9PFS (probably a wrong conflict resolution). This config is not used anywhere. Backends depend on CONFIG_FSDEV_9P which itself depends on CONFIG_VIRTFS. Remove the invalid CONFIG_9PFS and use CONFIG_FSDEV_9P instead, to fix the './configure --without-default-devices --enable-xen' build: /usr/bin/ld: libcommon.fa.p/hw_xen_xen-legacy-backend.c.o: in function `xen_be_register_common': hw/xen/xen-legacy-backend.c:754: undefined reference to `xen_9pfs_ops' /usr/bin/ld: libcommon.fa.p/fsdev_qemu-fsdev.c.o:(.data.rel+0x8): undefined reference to `local_ops' /usr/bin/ld: libcommon.fa.p/fsdev_qemu-fsdev.c.o:(.data.rel+0x20): undefined reference to `synth_ops' /usr/bin/ld: libcommon.fa.p/fsdev_qemu-fsdev.c.o:(.data.rel+0x38): undefined reference to `proxy_ops' collect2: error: ld returned 1 exit status Fixes: `b2c00bce54` ("meson: convert hw/9pfs, cleanup") Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20201104115706.3101190-3-philmd@redhat.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-11-05 15:21:11 +01:00
Xinhao Zhang	22e1367587	hw/9pfs : add space before the open parenthesis '(' Fix code style. Space required before the open parenthesis '('. Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com> Signed-off-by: Kai Deng <dengkai1@huawei.com> Reported-by: Euler Robot <euler.robot@huawei.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20201030043515.1030223-3-zhangxinhao1@huawei.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-11-05 15:14:03 +01:00
Xinhao Zhang	487729e9f6	hw/9pfs : open brace '{' following struct go on the same line Fix code style. Open braces for struct should go on the same line. Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com> Signed-off-by: Kai Deng <dengkai1@huawei.com> Reported-by: Euler Robot <euler.robot@huawei.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20201030043515.1030223-2-zhangxinhao1@huawei.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-11-05 15:14:03 +01:00
Xinhao Zhang	01011733ea	hw/9pfs : add spaces around operator Fix code style. Operator needs spaces both sides. Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com> Signed-off-by: Kai Deng <dengkai1@huawei.com> Reported-by: Euler Robot <euler.robot@huawei.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20201030043515.1030223-1-zhangxinhao1@huawei.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-11-05 15:14:03 +01:00
Christian Schoenebeck	b036d9ac69	9pfs: suppress performance warnings on qtest runs Don't trigger any performance warning if we're just running test cases, because tests intentionally run for edge cases. So far performance warnings were suppressed for the 'synth' fs driver backend only. This patch suppresses them for all 9p fs driver backends. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <a2d2ff2163f8853ea782a7a1d4e6f2afd7c29ffe.1603106145.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-10-19 14:25:40 +02:00
Eduardo Habkost	8063396bf3	Use OBJECT_DECLARE_SIMPLE_TYPE when possible This converts existing DECLARE_INSTANCE_CHECKER usage to OBJECT_DECLARE_SIMPLE_TYPE when possible. $ ./scripts/codeconverter/converter.py -i \ --pattern=AddObjectDeclareSimpleType $(git grep -l '' -- '*.[ch]') Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20200916182519.415636-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-18 14:12:32 -04:00
Christian Schoenebeck	c418f935ac	9pfs: disable msize warning for synth driver Previous patch introduced a performance warning being logged on host side if client connected with an 'msize' <= 8192. Disable this performance warning for the synth driver to prevent that warning from being printed whenever the 9pfs (qtest) test cases are running. Introduce a new export flag V9FS_NO_PERF_WARN for that purpose, which might also be used to disable such warnings from the CLI in future. We could have also prevented the warning by simply raising P9_MAX_SIZE in virtio-9p-test.c to any value larger than 8192, however in the context of test cases it makes sense running for edge cases, which includes the lowest 'msize' value supported by the server which is 4096, hence we want to preserve an msize of 4096 for the test client. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <E1kEyDy-0006nN-5A@lizzy.crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-09-15 12:12:03 +02:00
Christian Schoenebeck	62777d825b	9pfs: log warning if msize <= 8192 It is essential to choose a reasonable high value for 'msize' to avoid severely degraded file I/O performance. This parameter can only be chosen on client/guest side, and a Linux client defaults to an 'msize' of only 8192 if the user did not explicitly specify a value for 'msize', which results in very poor file I/O performance. Unfortunately many users are not aware that they should specify an appropriate value for 'msize' to avoid severe performance issues, so log a performance warning (with a QEMU wiki link explaining this issue in detail) on host side in that case to make it more clear. Currently a client cannot automatically pick a reasonable value for 'msize', because a good value for 'msize' depends on the file I/O potential of the underlying storage on host side, i.e. a feature invisible to the client, and even then a user would still need to trade off between performance profit and additional RAM costs, i.e. with growing 'msize' (RAM occupation), performance still increases, but performance delta will shrink continuously. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <e6fc84845c95816ad5baecb0abd6bfefdcf7ec9f.1599144062.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-09-15 12:12:03 +02:00
Eduardo Habkost	8110fa1d94	Use DECLARE_CHECKER macros Generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=TypeCheckMacro $(git grep -l '' -- '*.[ch]') Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-12-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-13-ehabkost@redhat.com> Message-Id: <20200831210740.126168-14-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:27:09 -04:00
Eduardo Habkost	db1015e92e	Move QOM typedefs and add missing includes Some typedefs and macros are defined after the type check macros. This makes it difficult to automatically replace their definitions with OBJECT_DECLARE_TYPE. Patch generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=QOMStructTypedefSplit $(git grep -l '' -- '.[ch]') which will split "typdef struct { ... } TypedefName" declarations. Followed by: $ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \ $(git grep -l '' -- '.[ch]') which will: - move the typedefs and #defines above the type check macros - add missing #include "qom/object.h" lines if necessary Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-9-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-10-ehabkost@redhat.com> Message-Id: <20200831210740.126168-11-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:26:43 -04:00
Peter Maydell	30aa19446d	9pfs: Fix severe performance issue of Treaddir requests. -----BEGIN PGP SIGNATURE----- iQJLBAABCgA1FiEEltjREM96+AhPiFkBNMK1h2Wkc5UFAl8zvx0XHHFlbXVfb3Nz QGNydWRlYnl0ZS5jb20ACgkQNMK1h2Wkc5Uthw//cXXwifzzjUaLccxkTCRejdZH tRLVhx8Asp4JG5WV+djF78dAh8UGw6DPMGIejqgZyBW3fDwQzbJGSycMWCfLtDwS 176rDS0yYfpHM4hVW3dVIvSC6ea1hXlzZQP4STe1ZSghVXYLjFLY6u5aFJmvtS2E vh33VecxE/MyKvJlTBpNG4h/oNz5PIJXPOsBI/N9kIX7sBDXZMI/X90SSJ0m/MJa heT/DRXTDJo+9m8K4Eibso/Akx8h+ZuyMwSR+b5e/9OKqylMdFKKBoGSSPDY2h8r q5OweV0Aewfj885qnD7BfH/Iis6re/qbFcQz6gxqZW0j/aW71yRoFXbFucvgX0ie 1HLiLHd/gv9HAwT8TeYUT7bldIDyk2jiD14cvhkE9PXlWmGigu0aMiXhPJ2/Jbx2 uJUIbLRXk6d/eds8q+2KO8+H6c6PmXMy40rqXDMFbUHCJIYDVH0K3hvH+4h8uE63 PKRuwoI+XOryw6dxEQlx206CfDUrjnZ+X4+v7UloTEy6/4BxlcagFQDCgyHEqyJL PVlkOjRyJWDt8Q1k6YpZImj+OaTzLmnLE8/ucLzCnaHEVqWQUJwwO/jeeCgFt3a0 oAUoTZUnpS7OM/oNWRx6YiheM8Ynk9nb6rAjeCpGnNgDhihq9Oh9/PKsXwTXUdyL sywT9dVI0Y4m3LyF7ok= =1Qh/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/cschoenebeck/tags/pull-9p-20200812' into staging 9pfs: Fix severe performance issue of Treaddir requests. # gpg: Signature made Wed 12 Aug 2020 11:06:21 BST # gpg: using RSA key 96D8D110CF7AF8084F88590134C2B58765A47395 # gpg: issuer "qemu_oss@crudebyte.com" # gpg: Good signature from "Christian Schoenebeck <qemu_oss@crudebyte.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: ECAB 1A45 4014 1413 BA38 4926 30DB 47C3 A012 D5F4 # Subkey fingerprint: 96D8 D110 CF7A F808 4F88 5901 34C2 B587 65A4 7395 * remotes/cschoenebeck/tags/pull-9p-20200812: 9pfs: clarify latency of v9fs_co_run_in_worker() 9pfs: differentiate readdir lock between 9P2000.u vs. 9P2000.L 9pfs: T_readdir latency optimization 9pfs: add new function v9fs_co_readdir_many() 9pfs: split out fs driver core of v9fs_co_readdir() 9pfs: make v9fs_readdir_response_size() public tests/virtio-9p: added split readdir tests Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-08-24 16:39:53 +01:00
Marc-André Lureau	b2c00bce54	meson: convert hw/9pfs, cleanup hw/Makefile.objs is gone so there is more code that can be removed. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:30:33 -04:00
Paolo Bonzini	243af0225a	trace: switch position of headers to what Meson requires Meson doesn't enjoy the same flexibility we have with Make in choosing the include path. In particular the tracing headers are using $(build_root)/$(<D). In order to keep the include directives unchanged, the simplest solution is to generate headers with patterns like "trace/trace-audio.h" and place forwarding headers in the source tree such that for example "audio/trace.h" includes "trace/trace-audio.h". This patch is too ugly to be applied to the Makefiles now. It's only a way to separate the changes to the tracing header files from the Meson rewrite of the tracing logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:18:24 -04:00
Christian Schoenebeck	da9f2eda25	9pfs: clarify latency of v9fs_co_run_in_worker() As we just fixed a severe performance issue with Treaddir request handling, clarify this overall issue as a comment on v9fs_co_run_in_worker() with the intention to hopefully prevent such performance mistakes in future (and fixing other yet outstanding ones). Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <4d34d332e1aaa8a2cf8dc0b5da4fd7727f2a86e8.1596012787.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-08-12 09:17:32 +02:00
Christian Schoenebeck	d2c5cf7ca1	9pfs: differentiate readdir lock between 9P2000.u vs. 9P2000.L Previous patch suggests that it might make sense to use a different mutex type now while handling readdir requests, depending on the precise protocol variant, as v9fs_do_readdir_with_stat() (used by 9P2000.u) uses a CoMutex to avoid deadlocks that might happen with QemuMutex otherwise, whereas do_readdir_many() (used by 9P2000.L) should better use a QemuMutex, as the precise behaviour of a failed CoMutex lock on fs driver side would not be clear. And to avoid the wrong lock type being used, be now strict and error out if a 9P2000.L client sends a Tread on a directory, and likeweise error out if a 9P2000.u client sends a Treaddir request. This patch is just intended as transitional measure, as currently 9P2000.u vs. 9P2000.L implementations currently differ where the main logic of fetching directory entries is located at (9P2000.u still being more top half focused, while 9P2000.L already being bottom half focused in regards to fetching directory entries that is). Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <9a2ddc347e533b0d801866afd9dfac853d2d4106.1596012787.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-08-12 09:17:32 +02:00
Christian Schoenebeck	0c4356ba7d	9pfs: T_readdir latency optimization Make top half really top half and bottom half really bottom half: Each T_readdir request handling is hopping between threads (main I/O thread and background I/O driver threads) several times for every individual directory entry, which sums up to huge latencies for handling just a single T_readdir request. Instead of doing that, collect now all required directory entries (including all potentially required stat buffers for each entry) in one rush on a background I/O thread from fs driver by calling the previously added function v9fs_co_readdir_many() instead of v9fs_co_readdir(), then assemble the entire resulting network response message for the readdir request on main I/O thread. The fs driver is still aborting the directory entry retrieval loop (on the background I/O thread inside of v9fs_co_readdir_many()) as soon as it would exceed the client's requested maximum R_readdir response size. So this will not introduce a performance penalty on another end. Also: No longer seek initial directory position in v9fs_readdir(), as this is now handled (more consistently) by v9fs_co_readdir_many() instead. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <c7c3d1cf4e86611538cef44897842819d9359d7a.1596012787.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-08-12 09:17:32 +02:00
Christian Schoenebeck	2149675b19	9pfs: add new function v9fs_co_readdir_many() The newly added function v9fs_co_readdir_many() retrieves multiple directory entries with a single fs driver request. It is intended to replace uses of v9fs_co_readdir(), the latter only retrieves a single directory entry per fs driver request instead. The reason for this planned replacement is that for every fs driver request the coroutine is dispatched from main I/O thread to a background I/O thread and eventually dispatched back to main I/O thread. Hopping between threads adds latency. So if a 9pfs Treaddir request reads a large amount of directory entries, this currently sums up to huge latencies of several hundred ms or even more. So using v9fs_co_readdir_many() instead of v9fs_co_readdir() will provide significant performance improvements. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <73dc827a12ef577ae7e644dcf34a5c0e443ab42f.1596012787.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-08-12 09:17:32 +02:00
Christian Schoenebeck	dd8151f4fe	9pfs: split out fs driver core of v9fs_co_readdir() The implementation of v9fs_co_readdir() has two parts: the outer part is executed by main I/O thread, whereas the inner part is executed by fs driver on a background I/O thread. Move the inner part to its own new, private function do_readdir(), so it can be shared by another upcoming new function. This is just a preparatory patch for the subsequent patch, with the purpose to avoid the next patch to clutter the overall diff. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <a426ee06e77584fa2d8253ce5d8bea519eb3ffd4.1596012787.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-08-12 09:17:32 +02:00
Christian Schoenebeck	29c9d2ca80	9pfs: make v9fs_readdir_response_size() public Rename function v9fs_readdir_data_size() -> v9fs_readdir_response_size() and make it callable from other units. So far this function is only used by 9p.c, however subsequent patches require the function to be callable from another 9pfs unit. And as we're at it; also make it clear for what this function is used for. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <3668ebc7d5b929a0e4f1357457060d96f50f76f4.1596012787.git.qemu_oss@crudebyte.com> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-08-12 09:17:32 +02:00
Vladimir Sementsov-Ogievskiy	92c451222c	virtio-9p: Use ERRP_GUARD() If we want to check error after errp-function call, we need to introduce local_err and then propagate it to errp. Instead, use the ERRP_GUARD() macro, benefits are: 1. No need of explicit error_propagate call 2. No need of explicit local_err variable: use errp directly 3. ERRP_GUARD() leaves errp as is if it's not NULL or &error_fatal, this means that we don't break error_abort (we'll abort on error_set, not on error_propagate) If we want to add some info to errp (by error_prepend() or error_append_hint()), we must use the ERRP_GUARD() macro. Otherwise, this info will not be added when errp == &error_fatal (the program will exit prior to the error_append_hint() or error_prepend() call). Fix such a case in v9fs_device_realize_common(). This commit is generated by command sed -n '/^virtio-9p$/,/^$/{s/^F: //p}' MAINTAINERS \| \ xargs git ls-files \| grep '\.[hc]$' \| \ xargs spatch \ --sp-file scripts/coccinelle/errp-guard.cocci \ --macro-file scripts/cocci-macro-file.h \ --in-place --no-show-diff --max-width 80 Reported-by: Kevin Wolf <kwolf@redhat.com> Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200707165037.1026246-7-armbru@redhat.com> [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and auto-propagated-errp.cocci to errp-guard.cocci. Commit message tweaked again.]	2020-07-10 15:18:09 +02:00
Markus Armbruster	9261ef5e32	Clean up some calls to ignore Error objects the right way Receiving the error in a local variable only to free it is less clear (and also less efficient) than passing NULL. Clean up. Cc: Daniel P. Berrange <berrange@redhat.com> Cc: Jerome Forissier <jerome@forissier.org> CC: Greg Kurz <groug@kaod.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20200630090351.1247703-4-armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-07-02 06:25:28 +02:00
Stefano Stabellini	84af75577c	xen/9pfs: increase max ring order to 9 The max order allowed by the protocol is 9. Increase the max order supported by QEMU to 9 to increase performance. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200521192627.15259-3-sstabellini@kernel.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-05-25 11:45:40 +02:00
Stefano Stabellini	a4c4d46272	xen/9pfs: yield when there isn't enough room on the ring Instead of truncating replies, which is problematic, wait until the client reads more data and frees bytes on the reply ring. Do that by calling qemu_coroutine_yield(). The corresponding qemu_coroutine_enter_if_inactive() is called from xen_9pfs_bh upon receiving the next notification from the client. We need to be careful to avoid races in case xen_9pfs_bh and the coroutine are both active at the same time. In xen_9pfs_bh, wait until either the critical section is over (ring->co == NULL) or until the coroutine becomes inactive (qemu_coroutine_yield() was called) before continuing. Then, simply wake up the coroutine if it is inactive. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200521192627.15259-2-sstabellini@kernel.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-05-25 11:45:39 +02:00
Stefano Stabellini	cf45183b71	Revert "9p: init_in_iov_from_pdu can truncate the size" This reverts commit `16724a1730`. It causes https://bugs.launchpad.net/bugs/1877688. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200521192627.15259-1-sstabellini@kernel.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-05-25 11:45:38 +02:00
Greg Kurz	ed463454ef	9p: Lock directory streams with a CoMutex Locking was introduced in QEMU 2.7 to address the deprecation of readdir_r(3) in glibc 2.24. It turns out that the frontend code is the worst place to handle a critical section with a pthread mutex: the code runs in a coroutine on behalf of the QEMU mainloop and then yields control, waiting for the fsdev backend to process the request in a worker thread. If the client resends another readdir request for the same fid before the previous one finally unlocked the mutex, we're deadlocked. This never bit us because the linux client serializes readdir requests for the same fid, but it is quite easy to demonstrate with a custom client. A good solution could be to narrow the critical section in the worker thread code and to return a copy of the dirent to the frontend, but this causes quite some changes in both 9p.c and codir.c. So, instead of that, in order for people to easily backport the fix to older QEMU versions, let's simply use a CoMutex since all the users for this sit in coroutines. Fixes: `7cde47d4a8` ("9p: add locking to V9fsDir") Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <158981894794.109297.3530035833368944254.stgit@bahia.lan> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-05-25 10:38:03 +02:00
Dan Robertson	03556ea920	9pfs: include linux/limits.h for XATTR_SIZE_MAX linux/limits.h should be included for the XATTR_SIZE_MAX definition used by v9fs_xattrcreate. Fixes: `3b79ef2cf4` ("9pfs: limit xattr size in xattrcreate") Signed-off-by: Dan Robertson <dan@dlrobertson.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200515203015.7090-2-dan@dlrobertson.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-05-25 10:38:03 +02:00
Markus Armbruster	b69c3c21a5	qdev: Unrealize must not fail Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>	2020-05-15 07:08:14 +02:00
Christian Schoenebeck	9bbb7e0fe0	xen-9pfs: Fix log messages of reply errors If delivery of some 9pfs response fails for some reason, log the error message by mentioning the 9P protocol reply type, not by client's request type. The latter could be misleading that the error occurred already when handling the request input. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <ad0e5a9b6abde52502aa40b30661d29aebe1590a.1589132512.git.qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-05-14 08:06:43 +02:00
Omar Sandoval	a5804fcf7b	9pfs: local: ignore O_NOATIME if we don't have permissions QEMU's local 9pfs server passes through O_NOATIME from the client. If the QEMU process doesn't have permissions to use O_NOATIME (namely, it does not own the file nor have the CAP_FOWNER capability), the open will fail. This causes issues when from the client's point of view, it believes it has permissions to use O_NOATIME (e.g., a process running as root in the virtual machine). Additionally, overlayfs on Linux opens files on the lower layer using O_NOATIME, so in this case a 9pfs mount can't be used as a lower layer for overlayfs (cf. `dabfe19719/vmtest/onoatimehack.c` and https://github.com/NixOS/nixpkgs/issues/54509). Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g., network filesystems. open(2) notes that O_NOATIME "may not be effective on all filesystems. One example is NFS, where the server maintains the access time." This means that we can honor it when possible but fall back to ignoring it. Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Message-Id: <e9bee604e8df528584693a4ec474ded6295ce8ad.1587149256.git.osandov@fb.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-05-14 08:06:43 +02:00
Greg Kurz	659f195328	9p/proxy: Fix export_flags The common fsdev options are set by qemu_fsdev_add() before it calls the backend specific option parsing code. In the case of "proxy" this means "writeout" or "readonly" were simply ignored. This has been broken from the beginning. Reported-by: Stéphane Graber <stgraber@ubuntu.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <158349633705.1237488.8895481990204796135.stgit@bahia.lan>	2020-03-10 16:12:49 +01:00
Christian Schoenebeck	af46a3b233	hw/9pfs/9p-synth: added directory for readdir test This will provide the following virtual files by the 9pfs synth driver: - /ReadDirDir/ReadDirFile99 - /ReadDirDir/ReadDirFile98 ... - /ReadDirDir/ReadDirFile1 - /ReadDirDir/ReadDirFile0 This virtual directory and its virtual 100 files will be used by the upcoming 9pfs readdir tests. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <5408c28c8de25dd575b745cef63bf785305ccef2.1579567020.git.qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-02-08 09:29:04 +01:00
Christian Schoenebeck	d36a5c2270	9pfs: validate count sent by client with T_readdir A good 9p client sends T_readdir with "count" parameter that's sufficiently smaller than client's initially negotiated msize (maximum message size). We perform a check for that though to avoid the server to be interrupted with a "Failed to encode VirtFS reply type 41" transport error message by bad clients. This count value constraint uses msize - 11, because 11 is the header size of R_readdir. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <3990d3891e8ae2074709b56449e96ab4b4b93b7d.1579567020.git.qemu_oss@crudebyte.com> [groug: added comment ] Signed-off-by: Greg Kurz <groug@kaod.org>	2020-02-08 09:28:54 +01:00
Christian Schoenebeck	e16453a31a	9pfs: require msize >= 4096 A client establishes a session by sending a Tversion request along with a 'msize' parameter which client uses to suggest server a maximum message size ever to be used for communication (for both requests and replies) between client and server during that session. If client suggests a 'msize' smaller than 4096 then deny session by server immediately with an error response (Rlerror for "9P2000.L" clients or Rerror for "9P2000.u" clients) instead of replying with Rversion. So far any msize submitted by client with Tversion was simply accepted by server without any check. Introduction of some minimum msize makes sense, because e.g. a msize < 7 would not allow any subsequent 9p operation at all, because 7 is the size of the header section common by all 9p message types. A substantial higher value of 4096 was chosen though to prevent potential issues with some message types. E.g. Rreadlink may yield up to a size of PATH_MAX which is usually 4096, and like almost all 9p message types, Rreadlink is not allowed to be truncated by the 9p protocol. This chosen size also prevents a similar issue with Rreaddir responses (provided client always sends adequate 'count' parameter with Treaddir), because even though directory entries retrieval may be split up over several T/Rreaddir messages; a Rreaddir response must not truncate individual directory entries though. So msize should be large enough to return at least one directory entry with the longest possible file name supported by host. Most file systems support a max. file name length of 255. Largest known file name lenght limit would be currently ReiserFS with max. 4032 bytes, which is also covered by this min. msize value because 4032 + 35 < 4096. Furthermore 4096 is already the minimum msize of the Linux kernel's 9pfs client. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <8ceecb7fb9fdbeabbe55c04339349a36929fb8e3.1579567019.git.qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-02-08 09:28:43 +01:00
Peter Maydell	760df0d121	* Register qdev properties as class properties (Marc-André) * Cleanups (Philippe) * virtio-scsi fix (Pan Nengyuan) * Tweak Skylake-v3 model id (Kashyap) * x86 UCODE_REV support and nested live migration fix (myself) * Advisory mode for pvpanic (Zhenwei) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJeK1CXAAoJEL/70l94x66DXtkIAI6W5wEY0Yme4M9Q5mGc0RV8 uscPLg0wsg88u6xne8ucCiGymvDREym2ii/aVI0Hi5ish84ZMdCrdck9cd+llpMf +a3slL26AKlOW8WtYSuyAE1RdLFXngeXdwal5KtWPEExJorkDUPTbwhBzQduQK1a myoHHcbwdd/96v7FvKnfG8jM6KZtHPQQ0i6+6fX4PN44jaULQNjze8GIrRBEwqw5 uCKJFQPBXiVcxKjH5/kzI1vl2hLJbF2ZGVEzX/U8OPZwyGPHIkWquURo8lvUTPfb ySlNTUTV2CyrN65TBRXQp/mJi44WvME5Jxlf5rNLBaYXPpL0zhmILKn5X5ya4U0= =TD0Y -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * Register qdev properties as class properties (Marc-André) * Cleanups (Philippe) * virtio-scsi fix (Pan Nengyuan) * Tweak Skylake-v3 model id (Kashyap) * x86 UCODE_REV support and nested live migration fix (myself) * Advisory mode for pvpanic (Zhenwei) # gpg: Signature made Fri 24 Jan 2020 20:16:23 GMT # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (58 commits) build-sys: clean up flags included in the linker command line target/i386: Add the 'model-id' for Skylake -v3 CPU models qdev: use object_property_help() qapi/qmp: add ObjectPropertyInfo.default-value qom: introduce object_property_help() qom: simplify qmp_device_list_properties() vl: print default value in object help qdev: register properties as class properties qdev: move instance properties to class properties qdev: rename DeviceClass.props qdev: set properties with device_class_set_props() object: return self in object_ref() object: release all props object: add object_class_property_add_link() object: express const link with link property object: add direct link flag object: rename link "child" to "target" object: check strong flag with & object: do not free class properties object: add object_property_set_default ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-01-27 09:44:04 +00:00
Marc-André Lureau	4f67d30b5e	qdev: set properties with device_class_set_props() The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-24 20:59:15 +01:00
Pan Nengyuan	ad30a9e904	virtio-9p-device: convert to new virtio_delete_queue Use virtio_delete_queue to make it more clear. Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200117060927.51996-3-pannengyuan@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>	2020-01-22 00:23:07 -05:00
Pan Nengyuan	9580d60e66	virtio-9p-device: fix memleak in virtio_9p_device_unrealize v->vq forgot to cleanup in virtio_9p_device_unrealize, the memory leak stack is as follow: Direct leak of 14336 byte(s) in 2 object(s) allocated from: #0 0x7f819ae43970 (/lib64/libasan.so.5+0xef970) ??:? #1 0x7f819872f49d (/lib64/libglib-2.0.so.0+0x5249d) ??:? #2 0x55a3a58da624 (./x86_64-softmmu/qemu-system-x86_64+0x2c14624) /mnt/sdb/qemu/hw/virtio/virtio.c:2327 #3 0x55a3a571bac7 (./x86_64-softmmu/qemu-system-x86_64+0x2a55ac7) /mnt/sdb/qemu/hw/9pfs/virtio-9p-device.c:209 #4 0x55a3a58e7bc6 (./x86_64-softmmu/qemu-system-x86_64+0x2c21bc6) /mnt/sdb/qemu/hw/virtio/virtio.c:3504 #5 0x55a3a5ebfb37 (./x86_64-softmmu/qemu-system-x86_64+0x31f9b37) /mnt/sdb/qemu/hw/core/qdev.c:876 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200117060927.51996-2-pannengyuan@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Acked-by: Greg Kurz <groug@kaod.org>	2020-01-22 00:23:07 -05:00
Daniel Henrique Barboza	b858e80a02	9pfs/9p.c: remove unneeded labels 'out' label in v9fs_xattr_write() and 'out_nofid' label in v9fs_complete_rename() can be replaced by appropriate return calls. CC: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Acked-by: Greg Kurz <groug@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-01-20 15:11:39 +01:00
Greg Kurz	16724a1730	9p: init_in_iov_from_pdu can truncate the size init_in_iov_from_pdu might not be able to allocate the full buffer size requested, which comes from the client and could be larger than the transport has available at the time of the request. Specifically, this can happen with read operations, with the client requesting a read up to the max allowed, which might be more than the transport has available at the time. Today the implementation of init_in_iov_from_pdu throws an error, both Xen and Virtio. Instead, change the V9fsTransport interface so that the size becomes a pointer and can be limited by the implementation of init_in_iov_from_pdu. Change both the Xen and Virtio implementations to set the size to the size of the buffer they managed to allocate, instead of throwing an error. However, if the allocated buffer size is less than P9_IOHDRSZ (the size of the header) still throw an error as the case is unhandable. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> CC: groug@kaod.org CC: anthony.perard@citrix.com CC: roman@zededa.com CC: qemu_oss@crudebyte.com [groug: fix 32-bit build] Signed-off-by: Greg Kurz <groug@kaod.org>	2020-01-20 15:11:39 +01:00
Daniel Henrique Barboza	846cf408a4	9p: local: always return -1 on error in local_unlinkat_common local_unlinkat_common() is supposed to always return -1 on error. This is being done by jumps to the 'err_out' label, which is a 'return ret' call, and 'ret' is initialized with -1. Unfortunately there is a condition in which the function will return 0 on error: in a case where flags == AT_REMOVEDIR, 'ret' will be 0 when reaching map_dirfd = openat_dir(...) And, if map_dirfd == -1 and errno != ENOENT, the existing 'err_out' jump will execute 'return ret', when ret is still set to zero at that point. This patch fixes it by changing all 'err_out' labels by 'return -1' calls, ensuring that the function will always return -1 on error conditions. 'ret' can be left unintialized since it's now being used just to store the result of 'unlinkat' calls. CC: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> [groug: changed prefix in title to be "9p: local:"] Signed-off-by: Greg Kurz <groug@kaod.org>	2020-01-20 15:11:39 +01:00
Jiajun Chen	841b8d099c	9pfs: local: Fix possible memory leak in local_link() There is a possible memory leak while local_link return -1 without free odirpath and oname. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Jaijun Chen <chenjiajun8@huawei.com> Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2020-01-20 15:11:39 +01:00
Vladimir Sementsov-Ogievskiy	4c5ec47e63	9pfs: make Error errp const where it is appropriate Mostly, Error is for returning error from the function, so the callee sets it. However error_append_security_model_hint and error_append_socket_sockfd_hint get already filled errp parameter. They don't change the pointer itself, only change the internal state of referenced Error object. So we can make it Error const errp, to stress the behavior. It will also help coccinelle script (in future) to distinguish such cases from common errp usage. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Greg Kurz <groug@kaod.org> Message-Id: <20191205174635.18758-9-vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message replaced] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2019-12-18 08:43:19 +01:00
Dan Schatzberg	68d654daee	9pfs: Fix divide by zero bug Some filesystems may return 0s in statfs (trivially, a FUSE filesystem can do so). QEMU should handle this gracefully and just behave the same as if statfs failed. Signed-off-by: Dan Schatzberg <dschatzberg@fb.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2019-11-23 15:51:48 +01:00
Christian Schoenebeck	6b6aa8285d	9p: Use variable length suffixes for inode remapping Use variable length suffixes for inode remapping instead of the fixed 16 bit size prefixes before. With this change the inode numbers on guest will typically be much smaller (e.g. around >2^1 .. >2^7 instead of >2^48 with the previous fixed size inode remapping. Additionally this solution is more efficient, since inode numbers in practice can take almost their entire 64 bit range on guest as well, so there is less likely a need for generating and tracking additional suffixes, which might also be beneficial for nested virtualization where each level of virtualization would shift up the inode bits and increase the chance of expensive remapping actions. The "Exponential Golomb" algorithm is used as basis for generating the variable length suffixes. The algorithm has a parameter k which controls the distribution of bits on increasing indeces (minimum bits at low index vs. maximum bits at high index). With k=0 the generated suffixes look like: Index Dec/Bin -> Generated Suffix Bin 1 [1] -> [1] (1 bits) 2 [10] -> [010] (3 bits) 3 [11] -> [110] (3 bits) 4 [100] -> [00100] (5 bits) 5 [101] -> [10100] (5 bits) 6 [110] -> [01100] (5 bits) 7 [111] -> [11100] (5 bits) 8 [1000] -> [0001000] (7 bits) 9 [1001] -> [1001000] (7 bits) 10 [1010] -> [0101000] (7 bits) 11 [1011] -> [1101000] (7 bits) 12 [1100] -> [0011000] (7 bits) ... 65533 [1111111111111101] -> [1011111111111111000000000000000] (31 bits) 65534 [1111111111111110] -> [0111111111111111000000000000000] (31 bits) 65535 [1111111111111111] -> [1111111111111111000000000000000] (31 bits) Hence minBits=1 maxBits=31 And with k=5 they would look like: Index Dec/Bin -> Generated Suffix Bin 1 [1] -> [000001] (6 bits) 2 [10] -> [100001] (6 bits) 3 [11] -> [010001] (6 bits) 4 [100] -> [110001] (6 bits) 5 [101] -> [001001] (6 bits) 6 [110] -> [101001] (6 bits) 7 [111] -> [011001] (6 bits) 8 [1000] -> [111001] (6 bits) 9 [1001] -> [000101] (6 bits) 10 [1010] -> [100101] (6 bits) 11 [1011] -> [010101] (6 bits) 12 [1100] -> [110101] (6 bits) ... 65533 [1111111111111101] -> [0011100000000000100000000000] (28 bits) 65534 [1111111111111110] -> [1011100000000000100000000000] (28 bits) 65535 [1111111111111111] -> [0111100000000000100000000000] (28 bits) Hence minBits=6 maxBits=28 Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:23 +02:00
Antonios Motakis	f3fe4a2d92	9p: stat_to_qid: implement slow path stat_to_qid attempts via qid_path_prefixmap to map unique files (which are identified by 64 bit inode nr and 32 bit device id) to a 64 QID path value. However this implementation makes some assumptions about inode number generation on the host. If qid_path_prefixmap fails, we still have 48 bits available in the QID path to fall back to a less memory efficient full mapping. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next (SHA1 7fc4c49e91). - Updated hash calls to new xxhash API. - Removed unnecessary parantheses in qpf_lookup_func(). - Removed unnecessary g_malloc0() result checks. - Log error message when running out of prefixes in qid_path_fullmap(). - Log warning message about potential degraded performance in qid_path_prefixmap(). - Wrapped qpf_table initialization to dedicated qpf_table_init() function. - Fixed typo in comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:14 +02:00
Antonios Motakis	1a6ed33cc5	9p: Added virtfs option 'multidevs=remap\|forbid\|warn' 'warn' (default): Only log an error message (once) on host if more than one device is shared by same export, except of that just ignore this config error though. This is the default behaviour for not breaking existing installations implying that they really know what they are doing. 'forbid': Like 'warn', but except of just logging an error this also denies access of guest to additional devices. 'remap': Allows to share more than one device per export by remapping inodes from host to guest appropriately. To support multiple devices on the 9p share, and avoid qid path collisions we take the device id as input to generate a unique QID path. The lowest 48 bits of the path will be set equal to the file inode, and the top bits will be uniquely assigned based on the top 16 bits of the inode and the device id. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next (SHA1 7fc4c49e91). - Added virtfs option 'multidevs', original patch simply did the inode remapping without being asked. - Updated hash calls to new xxhash API. - Updated docs for new option 'multidevs'. - Fixed v9fs_do_readdir() not having remapped inodes. - Log error message when running out of prefixes in qid_path_prefixmap(). - Fixed definition of QPATH_INO_MASK. - Wrapped qpp_table initialization to dedicated qpp_table_init() function. - Dropped unnecessary parantheses in qpp_lookup_func(). - Dropped unnecessary g_malloc0() result checks. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> [groug: - Moved "multidevs" parsing to the local backend. - Added hint to invalid multidevs option error. - Turn "remap" into "x-remap". ] Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:05 +02:00
Antonios Motakis	3b5ee9e86b	9p: Treat multiple devices on one export as an error The QID path should uniquely identify a file. However, the inode of a file is currently used as the QID path, which on its own only uniquely identifies files within a device. Here we track the device hosting the 9pfs share, in order to prevent security issues with QID path collisions from other devices. We only print a warning for now but a subsequent patch will allow users to have finer control over the desired behaviour. Failing the I/O will be one the proposed behaviour, so we also change stat_to_qid() to return an error here in order to keep other patches simpler. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Assign dev_id to export root's device already in v9fs_device_realize_common(), not postponed in stat_to_qid(). - error_report_once() if more than one device was shared by export. - Return -ENODEV instead of -ENOSYS in stat_to_qid(). - Fixed typo in log comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> [groug, changed to warning, updated message and changelog] Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:05 +02:00
Greg Kurz	ea52cdd443	fsdev: Add return value to fsdev_throttle_parse_opts() It is more convenient to use the return value of the function to notify errors, rather than to be tied up setting up the &local_err boilerplate. Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:05 +02:00
Greg Kurz	c0da0cb761	9p: Simplify error path of v9fs_device_realize_common() Make v9fs_device_unrealize_common() idempotent and use it for rollback, in order to reduce code duplication. Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:04 +02:00
Antonios Motakis	8703283352	9p: unsigned type for type, version, path There is no need for signedness on these QID fields for 9p. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Also make QID type unsigned. - Adjust donttouch_stat() to new types. - Adjust trace-events to new types. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:04 +02:00
Paolo Bonzini	98387d5802	9p: simplify source file selection Express the complex conditions in Kconfig rather than Makefiles, since Kconfig is better suited at expressing dependencies and detecting contradictions. Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 17:26:19 +02:00
Markus Armbruster	a27bd6c779	Include hw/qdev-properties.h less In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>	2019-08-16 13:31:53 +02:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	650d103d3e	Include hw/hw.h exactly where needed In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Anthony PERARD	a3434a2d56	xen: Import other xen/io/*.h A Xen public header have been imported into QEMU (by `f65eadb639` "xen: import ring.h from xen"), but there are other header that depends on ring.h which come from the system when building QEMU. This patch resolves the issue of having headers from the system importing a different copie of ring.h. This patch is prompt by the build issue described in the previous patch: 'Revert xen/io/ring.h of "Clean up a few header guard symbols"' ring.h and the new imported headers are moved to "include/hw/xen/interface" as those describe interfaces with a guest. The imported headers are cleaned up a bit while importing them: some part of the file that QEMU doesn't use are removed (description of how to make hypercall in grant_table.h have been removed). Other cleanup: - xen-mapcache.c and xen-legacy-backend.c don't need grant_table.h. - xenfb.c doesn't need event_channel.h. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Message-Id: <20190621105441.3025-3-anthony.perard@citrix.com>	2019-06-24 10:42:30 +01:00
Markus Armbruster	f91005e195	Supply missing header guards Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190604181618.19980-5-armbru@redhat.com>	2019-06-12 13:20:21 +02:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Markus Armbruster	0b8fa32f55	Include qemu/module.h where needed, drop it from qemu-common.h Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]	2019-06-12 13:18:33 +02:00
Markus Armbruster	dec9776049	trace-events: Fix attribution of trace points to source Some trace points are attributed to the wrong source file. Happens when we neglect to update trace-events for code motion, or add events in the wrong place, or misspell the file name. Clean up with help of cleanup-trace-events.pl. Same funnies as in the previous commit, of course. Manually shorten its change to linux-user/trace-events to */signal.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 20190314180929.27722-6-armbru@redhat.com Message-Id: <20190314180929.27722-6-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-03-22 16:18:07 +00:00
Markus Armbruster	500016e5db	trace-events: Shorten file names in comments We spell out sub/dir/ in sub/dir/trace-events' comments pointing to source files. That's because when trace-events got split up, the comments were moved verbatim. Delete the sub/dir/ part from these comments. Gets rid of several misspellings. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-3-armbru@redhat.com Message-Id: <20190314180929.27722-3-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-03-22 16:18:07 +00:00
Yang Zhong	b42075bb77	virtio: express virtio dependencies with Kconfig Signed-off-by: Yang Zhong <yang.zhong@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190123065618.3520-42-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-03-07 21:45:53 +01:00
Paolo Bonzini	e0e312f352	build: switch to Kconfig The make_device_config.sh script is replaced by minikconf, which is modified to support the same command line as its predecessor. The roots of the parsing are default-configs/.mak, Kconfig.host and hw/Kconfig. One difference with make_device_config.sh is that all symbols have to be defined in a Kconfig file, including those coming from the configure script. This is the reason for the Kconfig.host file introduced in the previous patch. Whenever a file in default-configs/.mak used $(...) to refer to a config-host.mak symbol, this is replaced by a Kconfig dependency; this part must be done already in this patch for bisectability. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Acked-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190123065618.3520-28-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-03-07 21:45:53 +01:00
Paolo Bonzini	82f5181777	kconfig: introduce kconfig files The Kconfig files were generated mostly with this script: for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* \| sort -u`; do set fnord `git grep -lw $i -- 'hw//Makefile.objs' ` shift if test $# = 1; then cat >> $(dirname $1)/Kconfig << EOF config ${i#CONFIG_} bool EOF git add $(dirname $1)/Kconfig else echo $i $ fi done sed -i '$d' hw//Kconfig for i in hw/; do if test -d $i && ! test -f $i/Kconfig; then touch $i/Kconfig git add $i/Kconfig fi done Whenever a symbol is referenced from multiple subdirectories, the script prints the list of directories that reference the symbol. These symbols have to be added manually to the Kconfig files. Kconfig.host and hw/Kconfig were created manually. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20190123065618.3520-27-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-03-07 21:45:53 +01:00
Paolo Bonzini	3fa749654e	9pfs: remove unnecessary conditionals The VIRTIO_9P \|\| VIRTFS && XEN condition can be computed in hw/Makefile.objs, removing an "if" from hw/9pfs/Makefile.objs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-03-07 21:45:53 +01:00
Paul Durrant	2d0ed5e642	xen: re-name XenDevice to XenLegacyDevice... ...and xen_backend.h to xen-legacy-backend.h Rather than attempting to convert the existing backend infrastructure to be QOM compliant (which would be hard to do in an incremental fashion), subsequent patches will introduce a completely new framework for Xen PV backends. Hence it is necessary to re-name parts of existing code to avoid name clashes. The re-named 'legacy' infrastructure will be removed once all backends have been ported to the new framework. This patch is purely cosmetic. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>	2019-01-14 13:45:40 +00:00
Greg Kurz	93aee84f57	9p: remove support for the "handle" backend The "handle" fsdev backend was deprecated in QEMU 2.12.0 with: commit `db3b3c7281` Author: Greg Kurz <groug@kaod.org> Date: Mon Jan 8 11:18:23 2018 +0100 9pfs: deprecate handle backend This backend raise some concerns: - doesn't support symlinks - fails +100 tests in the PJD POSIX file system test suite [1] - requires the QEMU process to run with the CAP_DAC_READ_SEARCH capability, which isn't recommended for security reasons This backend should not be used and wil be removed. The 'local' backend is the recommended alternative. [1] https://www.tuxera.com/community/posix-test-suite/ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> It has passed the two release cooling period without any complaint. Remove it now. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Thomas Huth <thuth@redhat.com>	2018-12-12 14:18:10 +01:00
Greg Kurz	75607e0dcc	xen/9pfs: use g_new(T, n) instead of g_malloc(sizeof(T) * n) Because it is a recommended coding practice (see HACKING). Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Anthony PERARD <anthony.perard@citrix.com>	2018-12-12 14:18:10 +01:00
Greg Kurz	1923923bfa	9p: use g_new(T, n) instead of g_malloc(sizeof(T) * n) Because it is a recommended coding practice (see HACKING). Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>	2018-12-12 14:18:10 +01:00
Greg Kurz	1d20398694	9p: fix QEMU crash when renaming files When using the 9P2000.u version of the protocol, the following shell command line in the guest can cause QEMU to crash: while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done With 9P2000.u, file renaming is handled by the WSTAT command. The v9fs_wstat() function calls v9fs_complete_rename(), which calls v9fs_fix_path() for every fid whose path is affected by the change. The involved calls to v9fs_path_copy() may race with any other access to the fid path performed by some worker thread, causing a crash like shown below: Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59 59 while (*path && fd != -1) { (gdb) bt #0 0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59 #1 0x0000555555a25e0c in local_opendir_nofollow (fs_ctx=0x555557d958b8, path=0x0) at hw/9pfs/9p-local.c:92 #2 0x0000555555a261b8 in local_lstat (fs_ctx=0x555557d958b8, fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185 #3 0x0000555555a2b367 in v9fs_co_lstat (pdu=0x555557d97498, path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53 #4 0x0000555555a1e9e2 in v9fs_stat (opaque=0x555557d97498) at hw/9pfs/9p.c:1083 #5 0x0000555555e060a2 in coroutine_trampoline (i0=-669165424, i1=32767) at util/coroutine-ucontext.c:116 #6 0x00007fffef4f5600 in __start_context () at /lib64/libc.so.6 #7 0x0000000000000000 in () (gdb) The fix is to take the path write lock when calling v9fs_complete_rename(), like in v9fs_rename(). Impact: DoS triggered by unprivileged guest users. Fixes: CVE-2018-19489 Cc: P J P <ppandit@redhat.com> Reported-by: zhibin hu <noirfate@gmail.com> Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-11-23 13:28:03 +01:00
Greg Kurz	5b3c77aa58	9p: take write lock on fid path updates (CVE-2018-19364) Recent commit `5b76ef50f6` fixed a race where v9fs_co_open2() could possibly overwrite a fid path with v9fs_path_copy() while it is being accessed by some other thread, ie, use-after-free that can be detected by ASAN with a custom 9p client. It turns out that the same can happen at several locations where v9fs_path_copy() is used to set the fid path. The fix is again to take the write lock. Fixes CVE-2018-19364. Cc: P J P <ppandit@redhat.com> Reported-by: zhibin hu <noirfate@gmail.com> Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-11-20 13:00:35 +01:00
Greg Kurz	5b76ef50f6	9p: write lock path in v9fs_co_open2() The assumption that the fid cannot be used by any other operation is wrong. At least, nothing prevents a misbehaving client to create a file with a given fid, and to pass this fid to some other operation at the same time (ie, without waiting for the response to the creation request). The call to v9fs_path_copy() performed by the worker thread after the file was created can race with any access to the fid path performed by some other thread. This causes use-after-free issues that can be detected by ASAN with a custom 9p client. Unlike other operations that only read the fid path, v9fs_co_open2() does modify it. It should hence take the write lock. Cc: P J P <ppandit@redhat.com> Reported-by: zhibin hu <noirfate@gmail.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-11-08 21:19:05 +01:00
Markus Armbruster	b836723dfe	fsdev: Clean up error reporting in qemu_fsdev_add() Calling error_report() from within a function that takes an Error ** argument is suspicious. qemu_fsdev_add() does that, and its caller fsdev_init_func() then fails without setting an error. Its caller main(), via qemu_opts_foreach(), is fine with it, but clean it up anyway. Cc: Greg Kurz <groug@kaod.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Greg Kurz <groug@kaod.org> Message-Id: <20181017082702.5581-32-armbru@redhat.com>	2018-10-19 14:51:34 +02:00
Markus Armbruster	1245402e3c	9pfs: Fix CLI parsing crash on error Calling error_report() in a function that takes an Error ** argument is suspicious. 9p-handle.c's handle_parse_opts() does that, and then fails without setting an error. Wrong. Its caller crashes when it tries to report the error: $ qemu-system-x86_64 -nodefaults -fsdev id=foo,fsdriver=handle qemu-system-x86_64: -fsdev id=foo,fsdriver=handle: warning: handle backend is deprecated qemu-system-x86_64: -fsdev id=foo,fsdriver=handle: fsdev: No path specified Segmentation fault (core dumped) Screwed up when commit `91cda4e8f3` (v2.12.0) converted the function to Error. Fix by calling error_setg() instead of error_report(). Fixes: `91cda4e8f3` Cc: Greg Kurz <groug@kaod.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20181017082702.5581-9-armbru@redhat.com>	2018-10-19 14:51:34 +02:00
Markus Armbruster	4b5766488f	error: Fix use of error_prepend() with &error_fatal, &error_abort From include/qapi/error.h: * Pass an existing error to the caller with the message modified: * error_propagate(errp, err); * error_prepend(errp, "Could not frobnicate '%s': ", name); Fei Li pointed out that doing error_propagate() first doesn't work well when @errp is &error_fatal or &error_abort: the error_prepend() is never reached. Since I doubt fixing the documentation will stop people from getting it wrong, introduce error_propagate_prepend(), in the hope that it lures people away from using its constituents in the wrong order. Update the instructions in error.h accordingly. Convert existing error_prepend() next to error_propagate to error_propagate_prepend(). If any of these get reached with &error_fatal or &error_abort, the error messages improve. I didn't check whether that's the case anywhere. Cc: Fei Li <fli@suse.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20181017082702.5581-2-armbru@redhat.com>	2018-10-19 14:51:34 +02:00
Keno Fischer	230f1b31c5	9p: darwin: Explicitly cast comparisons of mode_t with -1 Comparisons of mode_t with -1 require an explicit cast, since mode_t is unsigned on Darwin. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-29 12:32:10 +02:00
Keno Fischer	5c99fa375d	cutils: Provide strchrnul strchrnul is a GNU extension and thus unavailable on a number of targets. In the review for a commit removing strchrnul from 9p, I was asked to create a qemu_strchrnul helper to factor out this functionality. Do so, and use it in a number of other places in the code base that inlined the replacement pattern in a place where strchrnul could be used. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-29 12:32:10 +02:00
Keno Fischer	aca6897fba	9p: xattr: Properly translate xattrcreate flags As with unlinkat, these flags come from the client and need to be translated to their host values. The protocol values happen to match linux, but that need not be true in general. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-07 12:17:22 +02:00
Keno Fischer	67e8734574	9p: Properly check/translate flags in unlinkat The 9p-local code previously relied on P9_DOTL_AT_REMOVEDIR and AT_REMOVEDIR having the same numerical value and deferred any errorchecking to the syscall itself. However, while the former assumption is true on Linux, it is not true in general. 9p-handle did this properly however. Move the translation code to the generic 9p server code and add an error if unrecognized flags are passed. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-07 12:17:22 +02:00
Keno Fischer	5b7b2f9a85	9p: local: Avoid warning if FS_IOC_GETVERSION is not defined Both `stbuf` and `local_ioc_getversion` where unused when FS_IOC_GETVERSION was not defined, causing a compiler warning. Reorganize the code to avoid this warning. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-07 12:17:22 +02:00
Keno Fischer	a647502c58	9p: xattr: Fix crashes due to free of uninitialized value If the size returned from llistxattr/lgetxattr is 0, we skipped the malloc call, leaving xattr.value uninitialized. However, this value is later passed to `g_free` without any further checks, causing an error. Fix that by always calling g_malloc unconditionally. If `size` is 0, it will return NULL, which is safe to pass to g_free. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-07 12:17:22 +02:00
Keno Fischer	ec70b956fd	9p: Move a couple xattr functions to 9p-util These functions will need custom implementations on Darwin. Since the implementation is very similar among all of them, and 9p-util already has the _nofollow version of fgetxattrat, let's move them all there. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-07 12:17:22 +02:00
Keno Fischer	2306271c38	9p: local: Properly set errp in fstatfs error path In the review of 9p: Avoid warning if FS_IOC_GETVERSION is not defined Grep Kurz noted this error path was failing to set errp. Fix that. Signed-off-by: Keno Fischer <keno@juliacomputing.com> [added local: to commit title, Greg Kurz] Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-07 12:17:21 +02:00
Keno Fischer	fde1f3e4a0	9p: proxy: Fix size passed to `connect` The size to pass to the `connect` call is the size of the entire `struct sockaddr_un`. Passing anything shorter than this causes errors on darwin. Signed-off-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-06-07 12:17:21 +02:00
Paolo Bonzini	b5dfdb082f	hw: make virtio devices configurable via default-configs/ This is only half of the work, because the proxy devices (virtio--pci, virtio--ccw, etc.) are still included unconditionally. It is still a move in the right direction. Based-on: <20180522194943.24871-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-06-01 15:14:31 +02:00
Paul Durrant	58560f2ae7	xen: remove other open-coded use of libxengnttab Now that helpers are available in xen_backend, use them throughout all Xen PV backends. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>	2018-05-22 11:43:21 -07:00
Greg Kurz	8f9c64bfa5	9p: add trace event for v9fs_setattr() Don't print the tv_nsec part of atime and mtime, to stay below the 10 argument limit of trace events. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2018-05-02 08:59:24 +02:00
Marc-André Lureau	6ce7177ae2	9p: fix leak in synth_name_to_path() Leak found thanks to ASAN: Direct leak of 8 byte(s) in 1 object(s) allocated from: #0 0x55995789ac90 in __interceptor_malloc (/home/elmarco/src/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x1510c90) #1 0x7f0a91190f0c in g_malloc /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:94 #2 0x5599580a281c in v9fs_path_copy /home/elmarco/src/qemu/hw/9pfs/9p.c:196:17 #3 0x559958f9ec5d in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:116:9 #4 0x7f0a8766ebbf (/lib64/libc.so.6+0x50bbf) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-02-19 18:27:32 +01:00
Marc-André Lureau	e446a1eb5e	9p: v9fs_path_copy() readability lhs/rhs doesn't tell much about how argument are handled, dst/src is and const arguments is clearer in my mind. Use g_memdup() while at it. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2018-02-19 18:27:15 +01:00
Markus Armbruster	922a01a013	Move include qemu/option.h from qemu-common.h to actual users qemu-common.h includes qemu/option.h, but most places that include the former don't actually need the latter. Drop the include, and add it to the places that actually need it. While there, drop superfluous includes of both headers, and separate #include from file comment with a blank line. This cleanup makes the number of objects depending on qemu/option.h drop from 4545 (out of 4743) to 284 in my "build everything" tree. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-20-armbru@redhat.com> [Semantic conflict with commit `bdd6a90a9e` in block/nvme.c resolved]	2018-02-09 13:52:16 +01:00
Markus Armbruster	e688df6bc4	Include qapi/error.h exactly where needed This cleanup makes the number of objects depending on qapi/error.h drop from 1910 (out of 4743) to 1612 in my "build everything" tree. While there, separate #include from file comment with a blank line, and drop a useless comment on why qemu/osdep.h is included first. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-5-armbru@redhat.com> [Semantic conflict with commit `34e304e975` resolved, OSX breakage fixed]	2018-02-09 13:50:17 +01:00
Greg Kurz	357e2f7f4e	tests: virtio-9p: add FLUSH operation test The idea is to send a victim request that will possibly block in the server and to send a flush request to cancel the victim request. This patch adds two test to verifiy that: - the server does not reply to a victim request that was actually cancelled - the server replies to the flush request after replying to the victim request if it could not cancel it 9p request cancellation reference: http://man.cat-v.org/plan_9/5/flush Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> (groug, change the test to only write a single byte to avoid any alignment or endianess consideration)	2018-02-02 11:11:55 +01:00
Greg Kurz	354b86f85f	tests: virtio-9p: add WRITE operation test Trivial test of a successful write. Signed-off-by: Greg Kurz <groug@kaod.org> (groug, handle potential overflow when computing request size, add missing g_free(buf), backend handles one written byte at a time to validate the server doesn't do short-reads) Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-02-01 21:21:28 +01:00
Greg Kurz	82469aaefe	tests: virtio-9p: add LOPEN operation test Trivial test of a successful open. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-02-01 21:21:28 +01:00
Greg Kurz	2893ddd598	tests: virtio-9p: use the synth backend The purpose of virtio-9p-test is to test the virtio-9p device, especially the 9p server state machine. We don't really care what fsdev backend we're using. Moreover, if we want to be able to test the flush request or a device reset with in-flights I/O, it is close to impossible to achieve with a physical backend because we cannot ask it reliably to put an I/O on hold at a specific point in time. Fortunately, we can do that with the synthetic backend, which allows to register callbacks on read/write accesses to a specific file. This will be used by a later patch to test the 9P flush request. The walk request test is converted to using the synth backend. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-02-01 21:21:27 +01:00
Keno Fischer	fc78d5ee76	9pfs: Correctly handle cancelled requests # Background I was investigating spurious non-deterministic EINTR returns from various 9p file system operations in a Linux guest served from the qemu 9p server. ## EINTR, ERESTARTSYS and the linux kernel When a signal arrives that the Linux kernel needs to deliver to user-space while a given thread is blocked (in the 9p case waiting for a reply to its request in 9p_client_rpc -> wait_event_interruptible), it asks whatever driver is currently running to abort its current operation (in the 9p case causing the submission of a TFLUSH message) and return to user space. In these situations, the error message reported is generally ERESTARTSYS. If the userspace processes specified SA_RESTART, this means that the system call will get restarted upon completion of the signal handler delivery (assuming the signal handler doesn't modify the process state in complicated ways not relevant here). If SA_RESTART is not specified, ERESTARTSYS gets translated to EINTR and user space is expected to handle the restart itself. ## The 9p TFLUSH command The 9p TFLUSH commands requests that the server abort an ongoing operation. The man page [1] specifies: ``` If it recognizes oldtag as the tag of a pending transaction, it should abort any pending response and discard that tag. [...] When the client sends a Tflush, it must wait to receive the corresponding Rflush before reusing oldtag for subsequent messages. If a response to the flushed request is received before the Rflush, the client must honor the response as if it had not been flushed, since the completed request may signify a state change in the server ``` In particular, this means that the server must not send a reply with the orignal tag in response to the cancellation request, because the client is obligated to interpret such a reply as a coincidental reply to the original request. # The bug When qemu receives a TFlush request, it sets the `cancelled` flag on the relevant pdu. This flag is periodically checked, e.g. in `v9fs_co_name_to_path`, and if set, the operation is aborted and the error is set to EINTR. However, the server then violates the spec, by returning to the client an Rerror response, rather than discarding the message entirely. As a result, the client is required to assume that said Rerror response is a result of the original request, not a result of the cancellation and thus passes the EINTR error back to user space. This is not the worst thing it could do, however as discussed above, the correct error code would have been ERESTARTSYS, such that user space programs with SA_RESTART set get correctly restarted upon completion of the signal handler. Instead, such programs get spurious EINTR results that they were not expecting to handle. It should be noted that there are plenty of user space programs that do not set SA_RESTART and do not correctly handle EINTR either. However, that is then a userspace bug. It should also be noted that this bug has been mitigated by a recent commit to the Linux kernel [2], which essentially prevents the kernel from sending Tflush requests unless the process is about to die (in which case the process likely doesn't care about the response). Nevertheless, for older kernels and to comply with the spec, I believe this change is beneficial. # Implementation The fix is fairly simple, just skipping notification of a reply if the pdu was previously cancelled. We do however, also notify the transport layer that we're doing this, so it can clean up any resources it may be holding. I also added a new trace event to distinguish operations that caused an error reply from those that were cancelled. One complication is that we only omit sending the message on EINTR errors in order to avoid confusing the rest of the code (which may assume that a client knows about a fid if it sucessfully passed it off to pud_complete without checking for cancellation status). This does mean that if the server acts upon the cancellation flag, it always needs to set err to EINTR. I believe this is true of the current code. [1] https://9fans.github.io/plan9port/man/man9/flush.html [2] https://github.com/torvalds/linux/commit/9523feac272ccad2ad8186ba4fcc891 Signed-off-by: Keno Fischer <keno@juliacomputing.com> Reviewed-by: Greg Kurz <groug@kaod.org> [groug, send a zero-sized reply instead of detaching the buffer] Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2018-02-01 21:21:27 +01:00
Greg Kurz	066eb006b5	9pfs: drop v9fs_register_transport() No good reasons to do this outside of v9fs_device_realize_common(). Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>	2018-02-01 21:21:27 +01:00

1 2 3 4 5 ...

573 Commits