Commit Graph

59389 Commits

Author SHA1 Message Date
Christoph Hellwig 57dfe3ce10 block_dev: use bio_release_pages in blkdev_bio_end_io
Use bio_release_pages instead of duplicating it.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-29 09:47:31 -06:00
Christoph Hellwig 147a60538d iomap: use bio_release_pages in iomap_dio_bio_end_io
Use bio_release_pages instead of duplicating it.

Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-29 09:47:31 -06:00
Linus Torvalds 01305db842 XArray updates for 5.2-rc6
Account XArray nodes for the page cache to the appropriate cgroup
   (Johannes Weiner)
 Fix idr_get_next() when called under the RCU lock (Matthew Wilcox)
 Add a test for xa_insert() (Matthew Wilcox)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAl0WuKsACgkQDpNsjXcp
 gj73zgf9Eb477PuwYZpFBA9ZxI5v/6WyqbaWXKdqEhotARgIUuv1CfVnkt1IJE6P
 Z3QCRABZ3pIKHgIErJN53B7AdvdONUO4Xf9VFBqmxeWE7F9L3sROOpXc8IrR26kV
 hITQn8mwgacNQ8mLtQmcSFaCVC2E7yVNBhVd5zmcA6jNIAFsOJcP06KLJTe94OXe
 AB9TJvswxpzAEX8emHQ/a1SFBNZWJ7b53hBcu8CJn8CuGDxmo1/+qqoRyNY+WrDO
 OohFk2u1j6Esfc6j0k+Akt8mEFyfU2oxFfv5MjL0KYEyrHoU84eZljFGgf7rQqGj
 fqH9RO8J8eoj4D/3XaLL5QYRLIxRaw==
 =AXZy
 -----END PGP SIGNATURE-----

Merge tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-dax

Pull XArray fixes from Matthew Wilcox:

 - Account XArray nodes for the page cache to the appropriate cgroup
   (Johannes Weiner)

 - Fix idr_get_next() when called under the RCU lock (Matthew Wilcox)

 - Add a test for xa_insert() (Matthew Wilcox)

* tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-dax:
  XArray tests: Add check_insert
  idr: Fix idr_get_next race with idr_remove
  mm: fix page cache convergence regression
2019-06-29 17:14:57 +08:00
Linus Torvalds 0839c53762 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "15 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL
  mm, swap: fix THP swap out
  fork,memcg: alloc_thread_stack_node needs to set tsk->stack
  MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info
  mm/vmalloc.c: avoid bogus -Wmaybe-uninitialized warning
  mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
  initramfs: fix populate_initrd_image() section mismatch
  mm/oom_kill.c: fix uninitialized oc->constraint
  mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge
  mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails
  signal: remove the wrong signal_pending() check in restore_user_sigmask()
  fs/binfmt_flat.c: make load_flat_shared_library() work
  mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask
  fs/proc/array.c: allow reporting eip/esp for all coredumping threads
  mm/dev_pfn: exclude MEMORY_DEVICE_PRIVATE while computing virtual address
2019-06-29 17:11:01 +08:00
Linus Torvalds c949c30b26 Two more NFS client fixes for Linux 5.2
Stable bugfixes:
 - SUNRPC: Fix up calculation of client message length # 5.1+
 - NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O # 4.8+
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAl0Wf3EACgkQ18tUv7Cl
 QOs2ORAA5/CXFa471jUldOsHejxfFoddFBkuqf8qZ1AF3TZdFuITAsq+xydxfO5U
 hYzUUlOTKedEi+ISYLFs1tjU/nYRQJv7fFZxVwq6uDZ53Z/doiMLAIR67Eq7EcTY
 KBWA9zdldnBzb0S87+hkbmaNPR5pjqxBzLEfMmOQEAAh5pSGf5YSeUNTXLGj4wBd
 iXf25o1VSjUmNpSHaA3KsrqTJ4mJ7+i/17Iny1c4xRgZbJtoTm44DpceHCheJpbl
 DymRSgjSr0vFjJbufcKkbF2OPp1ZsnkDiKyJmZzgPOa3+TMGzisU5yiASoac6D+j
 gs426yEz9rvR/TMZtFS05nfu2clKuS8foLGwZelJ7XjQSXJgObCb4xf97jLIOWNb
 J+BWwsTmUIQS+fMUQDA+rlbyepJ+skVZpbjmUy+/Uy52oqtnYK6uTD469NdmxBwr
 7z2pnCUjJFTqo6BHeCQgR5XlSt1MGDByamcVAWONS+9zJttRhfUOjq0PIOLSrsBK
 5zRzJxtBoYLwP5py3zKAeV9RcvDNSgh5U6P0hhFRtHfqMUmtGeA58nNND2S6Qm3/
 vAB7WZL0aVSvc3zpz7qdctitMESQNspCkMooAp/EoIime3YkqKCS+AgED9jKLhJR
 /5eqtr6tehh6A4dshzSlDF7cFrKyUd+ulS0IN8vt1V2TYgOQmbY=
 =FATk
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull two more NFS client fixes from Anna Schumaker:
 "These are both stable fixes.

  One to calculate the correct client message length in the case of
  partial transmissions. And the other to set the proper TCP timeout for
  flexfiles"

* tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O
  SUNRPC: Fix up calculation of client message length
2019-06-29 17:02:22 +08:00
Linus Torvalds 43251dbd6a A small fix for a potential -rc1 regression from Jeff.
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl0WLe4THGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzizPrB/4tNUS8J9mW9Zd3xLAzZmwjq+WAfCV8
 wp3IjBHCgvn9SmTYOJtozjTLJVlmeGNVyrCaWbtzQ2YLKvyBTCUF4kg9EG7FMX9a
 ixzlHb2+Wu46LYWiA7jhUnoKNMMl1swm01BOvfmGprSwV70BAEF0i2/D7WHikolX
 rgcwGb58vUMmXQ1VGfIO9Pox2a8jaZNj82BZnDniMDxetZ5sRsZXGy43s14zC6Lt
 YnwDT70Y7+Pr9SwHMA5bnZ8kCtQpr0qAHmDVhEd965Io1XZ+2/EHF5IwqK0xGg+e
 KUQdRyhMWjIGG34SWMt5tbT+9Lzeju4CAka9NPSJ1tRtFnk1AvpILbnB
 =TpFR
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.2-rc7' of git://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "A small fix for a potential -rc1 regression from Jeff"

* tag 'ceph-for-5.2-rc7' of git://github.com/ceph/ceph-client:
  ceph: fix ceph_mdsc_build_path to not stop on first component
2019-06-29 17:01:02 +08:00
Linus Torvalds 9dda12b6fa for-linus-20190628
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl0WM9YQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmkyEADVPjXlIZETBpAl/oK/StNc1NMdfgBiWaX7
 kQHbFu3V4soDpvR8iQvMVyFc7dUpwo9lmgxIOcZSfdCf/ciJ/G4trhH4UljXfRsj
 2vdKV3rZXragrclN0zGtW90sBBYxSilaezzRQbnnXjEgGaHFkeJJR3xW00UMoGrm
 GDO2gSQdhDKqhJtKjiCASkyN9uWMkcLFdsGErPgA6e4S3NTbaLKaY/xFUCcMF7aX
 N1aYkIfdyl38QUU/N+5WLgiJYHkiZNqcrJ+a5aECioqqiNh9ST+UR1jCgo7tlt4h
 b3Gb5mxP0CPUuTh3VQD8GHCaPzDsxUIxThJkz5aih3M9NEQmm5Du0GDChaDuMoUR
 zyFT/Yl4JfeO93mlpxGUyC5WyFCQdj0QOBuyxInCchvJC5kbpRflMuKt+xRYlSqg
 331njdykyKkgutagLzzTME38RPUbttZVmbc6K422PXKkYW+FOlS352FZpl5qxDOu
 5+ihOXOLvO09VXu6kcC5UH4Yi6nuGYDS95oIZhJ0OODx10xnKSE4ZozlPXAEreAR
 NVJN7vbHVqLnphuplRK9Kh0VngdIhLkeTsUxaTnX6UQSioHPDJPqPP5nfSu9Xkyo
 e+2UAXkfVjnw45jAu8Mrsu0KhabCB5Pde8Jk+kmqPcuWXQEN5OHqeA09vtvKj81J
 lIagz1NZxw==
 =WzXj
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190628' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Just two small fixes.

  One from Paolo, fixing a silly mistake in BFQ. The other one is from
  me, ensuring that we have ->file cleared in the io_uring request a bit
  earlier. That avoids a use-before-free, if we encounter an error
  before ->file is assigned"

* tag 'for-linus-20190628' of git://git.kernel.dk/linux-block:
  block, bfq: fix operator in BFQQ_TOTALLY_SEEKY
  io_uring: ensure req->file is cleared on allocation
2019-06-29 16:58:35 +08:00
Oleg Nesterov 97abc889ee signal: remove the wrong signal_pending() check in restore_user_sigmask()
This is the minimal fix for stable, I'll send cleanups later.

Commit 854a6ed568 ("signal: Add restore_user_sigmask()") introduced
the visible change which breaks user-space: a signal temporary unblocked
by set_user_sigmask() can be delivered even if the caller returns
success or timeout.

Change restore_user_sigmask() to accept the additional "interrupted"
argument which should be used instead of signal_pending() check, and
update the callers.

Eric said:

: For clarity.  I don't think this is required by posix, or fundamentally to
: remove the races in select.  It is what linux has always done and we have
: applications who care so I agree this fix is needed.
:
: Further in any case where the semantic change that this patch rolls back
: (aka where allowing a signal to be delivered and the select like call to
: complete) would be advantage we can do as well if not better by using
: signalfd.
:
: Michael is there any chance we can get this guarantee of the linux
: implementation of pselect and friends clearly documented.  The guarantee
: that if the system call completes successfully we are guaranteed that no
: signal that is unblocked by using sigmask will be delivered?

Link: http://lkml.kernel.org/r/20190604134117.GA29963@redhat.com
Fixes: 854a6ed568 ("signal: Add restore_user_sigmask()")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Eric Wong <e@80x24.org>
Tested-by: Eric Wong <e@80x24.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: <stable@vger.kernel.org>	[5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-29 16:43:45 +08:00
Jann Horn 867bfa4a5f fs/binfmt_flat.c: make load_flat_shared_library() work
load_flat_shared_library() is broken: It only calls load_flat_file() if
prepare_binprm() returns zero, but prepare_binprm() returns the number of
bytes read - so this only happens if the file is empty.

Instead, call into load_flat_file() if the number of bytes read is
non-negative. (Even if the number of bytes is zero - in that case,
load_flat_file() will see nullbytes and return a nice -ENOEXEC.)

In addition, remove the code related to bprm creds and stop using
prepare_binprm() - this code is loading a library, not a main executable,
and it only actually uses the members "buf", "file" and "filename" of the
linux_binprm struct. Instead, call kernel_read() directly.

Link: http://lkml.kernel.org/r/20190524201817.16509-1-jannh@google.com
Fixes: 287980e49f ("remove lots of IS_ERR_VALUE abuses")
Signed-off-by: Jann Horn <jannh@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-29 16:43:45 +08:00
John Ogness cb8f381f16 fs/proc/array.c: allow reporting eip/esp for all coredumping threads
0a1eb2d474 ("fs/proc: Stop reporting eip and esp in /proc/PID/stat")
stopped reporting eip/esp and fd7d56270b ("fs/proc: Report eip/esp in
/prod/PID/stat for coredumping") reintroduced the feature to fix a
regression with userspace core dump handlers (such as minicoredumper).

Because PF_DUMPCORE is only set for the primary thread, this didn't fix
the original problem for secondary threads.  Allow reporting the eip/esp
for all threads by checking for PF_EXITING as well.  This is set for all
the other threads when they are killed.  coredump_wait() waits for all the
tasks to become inactive before proceeding to invoke a core dumper.

Link: http://lkml.kernel.org/r/87y32p7i7a.fsf@linutronix.de
Link: http://lkml.kernel.org/r/20190522161614.628-1-jlu@pengutronix.de
Fixes: fd7d56270b ("fs/proc: Report eip/esp in /prod/PID/stat for coredumping")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reported-by: Jan Luebbe <jlu@pengutronix.de>
Tested-by: Jan Luebbe <jlu@pengutronix.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-29 16:43:44 +08:00
Christoph Hellwig 89b171acb2 xfs: fix iclog allocation size
Properly allocate the space for the bio_vecs instead of just one byte
per bio_vec.

Fixes: 79b54d9bfc ("xfs: use bios directly to write log buffers")
Reported-by: syzbot+b75afdbe271a0d7ac4f6@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 21:02:45 -07:00
Eric Sandeen 250d4b4c40 xfs: remove unused header files
There are many, many xfs header files which are included but
unneeded (or included twice) in the xfs code, so remove them.

nb: xfs_linux.h includes about 9 headers for everyone, so those
explicit includes get removed by this.  I'm not sure what the
preference is, but if we wanted explicit includes everywhere,
a followup patch could remove those xfs_*.h includes from
xfs_linux.h and move them into the files that need them.
Or it could be left as-is.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:43 -07:00
Christoph Hellwig adfb5fb46a xfs: implement cgroup aware writeback
Link every newly allocated writeback bio to cgroup pointed to by the
writeback control structure, and charge every byte written back to it.

Tested-by: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:22 -07:00
Christoph Hellwig a247373596 xfs: simplify xfs_chain_bio
Move setting up operation and write hint to xfs_alloc_ioend, and
then just copy over all needed information from the previous bio
in xfs_chain_bio and stop passing various parameters to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:22 -07:00
Darrick J. Wong f327a00745 xfs: account for log space when formatting new AGs
When we're writing out a fresh new AG, make sure that we don't list an
internal log as free and that we create the rmap for the region.  growfs
never does this, but we will need it when we hook up mkfs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28 19:30:21 -07:00
Darrick J. Wong 8d90857cff xfs: refactor free space btree record initialization
Refactor the code that populates the free space btrees of a new AG so
that we can avoid code duplication once things start getting
complicated.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28 19:30:21 -07:00
Brian Foster 7e36a3a63d xfs: always update params on small allocation
xfs_alloc_ag_vextent_small() doesn't update the output parameters in
the event of an AGFL allocation. Instead, it updates the
xfs_alloc_arg structure directly to complete the allocation.

Update both args and the output params to provide consistent
behavior for future callers.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:20 -07:00
Brian Foster 6691cd9267 xfs: skip small alloc cntbt logic on NULL cursor
The small allocation helper is implemented in a way that is fairly
tightly integrated to the existing allocation algorithms. It expects
a cntbt cursor beyond the end of the tree, attempts to locate the
last record in the tree and only attempts an AGFL allocation if the
cntbt is empty.

The upcoming generic algorithm doesn't rely on the cntbt processing
of this function. It will only call this function when the cntbt
doesn't have a big enough extent or is empty and thus AGFL
allocation is the only remaining option. Tweak
xfs_alloc_ag_vextent_small() to handle a NULL cntbt cursor and skip
the cntbt logic. This facilitates use by the existing allocation
code and new code that only requires an AGFL allocation attempt.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:20 -07:00
Brian Foster c63cdd4fc9 xfs: move small allocation helper
Move the small allocation helper further up in the file to avoid the
need for a function declaration. The remaining declarations will be
removed by followup patches. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:19 -07:00
Brian Foster 2a4f35f984 xfs: clean up small allocation helper
xfs_alloc_ag_vextent_small() is kind of a mess. Clean it up in
preparation for future changes. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:30:19 -07:00
Christoph Hellwig caeaea9858 xfs: merge xfs_trans_bmap.c into xfs_bmap_item.c
Keep all bmap item related code together.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:29:42 -07:00
Christoph Hellwig 3cfce1e3ce xfs: merge xfs_trans_rmap.c into xfs_rmap_item.c
Keep all rmap item related code together in one file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:29:41 -07:00
Christoph Hellwig effd5e96e7 xfs: merge xfs_trans_refcount.c into xfs_refcount_item.c
Keep all the refcount item related code together in one file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:29:41 -07:00
Christoph Hellwig 81f4004173 xfs: merge xfs_trans_extfree.c into xfs_extfree_item.c
Keep all the extree item related code together in one file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:28:17 -07:00
Christoph Hellwig 73f0d23633 xfs: merge xfs_bud_init into xfs_trans_get_bud
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:36 -07:00
Christoph Hellwig 60883447f4 xfs: merge xfs_rud_init into xfs_trans_get_rud
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:36 -07:00
Christoph Hellwig ebeb8e0629 xfs: merge xfs_cud_init into xfs_trans_get_cud
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:35 -07:00
Christoph Hellwig 9c5e7c2ae3 xfs: merge xfs_efd_init into xfs_trans_get_efd
There is no good reason to keep these two functions separate.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:35 -07:00
Christoph Hellwig 95cf0e4a0d xfs: remove a pointless comment duplicated above all xfs_item_ops instances
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:34 -07:00
Christoph Hellwig 89ae379d56 xfs: use a list_head for iclog callbacks
Replace the hand grown linked list handling and cil context attachment
with the standard list_head structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:34 -07:00
Christoph Hellwig efe2330fdc xfs: remove the xfs_log_item_t typedef
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:33 -07:00
Christoph Hellwig b3b14aacc6 xfs: don't cast inode_log_items to get the log_item
The cast is not type safe, and we can just dereference the first
member instead to start with.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:33 -07:00
Christoph Hellwig 9ce632a28a xfs: add a flag to release log items on commit
We have various items that are released from ->iop_comitting.  Add a
flag to just call ->iop_release from the commit path to avoid tons
of boilerplate code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:32 -07:00
Christoph Hellwig ddf92053e4 xfs: split iop_unlock
The iop_unlock method is called when comitting or cancelling a
transaction.  In the latter case, the transaction may or may not be
aborted.  While there is no known problem with the current code in
practice, this implementation is limited in that any log item
implementation that might want to differentiate between a commit and a
cancellation must rely on the aborted state.  The aborted bit is only
set when the cancelled transaction is dirty, however.  This means that
there is no way to distinguish between a commit and a clean transaction
cancellation.

For example, intent log items currently rely on this distinction.  The
log item is either transferred to the CIL on commit or released on
transaction cancel. There is currently no possibility for a clean intent
log item in a transaction, but if that state is ever introduced a cancel
of such a transaction will immediately result in memory leaks of the
associated log item(s).  This is an interface deficiency and landmine.

To clean this up, replace the iop_unlock method with an iop_release
method that is specific to transaction cancel.  The existing
iop_committing method occurs at the same time as iop_unlock in the
commit path and there is no need for two separate callbacks here.
Overload the iop_committing method with the current commit time
iop_unlock implementations to eliminate the need for the latter and
further simplify the interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:32 -07:00
Christoph Hellwig 195cd83d1b xfs: don't use xfs_trans_free_items in the commit path
While commiting items looks very similar to freeing them on error it is
a different operation, and they will diverge a bit soon.

Split out the commit case from xfs_trans_free_items, inline it into
xfs_log_commit_cil and give it a separate trace point.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:31 -07:00
Christoph Hellwig 8e4b20ea83 xfs: remove the dummy iop_push implementation for inode creation items
This method should never be called, so don't waste code on it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:31 -07:00
Christoph Hellwig e8b78db77d xfs: don't require log items to implement optional methods
Just check if they are present first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:30 -07:00
Christoph Hellwig d15cbf2f38 xfs: stop using XFS_LI_ABORTED as a parameter flag
Just pass a straight bool aborted instead of abusing XFS_LI_ABORTED as a
flag in function parameters.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:30 -07:00
Christoph Hellwig 086252c34b xfs: fix a trivial comment typo in xfs_trans_committed_bulk
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:29 -07:00
Christoph Hellwig dbd329f1e4 xfs: add struct xfs_mount pointer to struct xfs_buf
We need to derive the mount pointer from a buffer in a lot of place.
Add a direct pointer to short cut the pointer chasing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:29 -07:00
Christoph Hellwig 8124b9b601 xfs: remove the b_io_length field in struct xfs_buf
This field is now always idential to b_length.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:28 -07:00
Christoph Hellwig e99b4bd0cb xfs: properly type the b_log_item field in struct xfs_buf
Now that the log code doesn't abuse this field any more we can
declare it as a struct xfs_buf_log_item pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:28 -07:00
Christoph Hellwig 0564501ff5 xfs: remove unused buffer cache APIs
Now that the log code uses bios directly we can drop various special
cases in the buffer cache code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:27 -07:00
Christoph Hellwig 6e9b3dd80f xfs: stop using bp naming for log recovery buffers
Now that we don't use struct xfs_buf to hold log recovery buffer rename
the related functions and variables to just talk of a buffer instead of
using the bp name that we usually use for xfs_buf related functionality.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:27 -07:00
Christoph Hellwig 6ad5b3255b xfs: use bios directly to read and write the log recovery buffers
The xfs_buf structure is basically used as a glorified container for
a memory allocation in the log recovery code.  Replace it with a
call to kmem_alloc_large and a simple abstraction to read into or
write from it synchronously using chained bios.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:26 -07:00
Christoph Hellwig 18ffb8c3f0 xfs: return an offset instead of a pointer from xlog_align
This simplifies both the helper and the callers.  We lost a bit of
size sanity checking, but that is already covered by KASAN if needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:26 -07:00
Christoph Hellwig 1058d0f5ee xfs: move the log ioend workqueue to struct xlog
Move the workqueue used for log I/O completions from struct xfs_mount
to struct xlog to keep it self contained in the log code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: destroy the log workqueue after ensuring log ios are done]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:25 -07:00
Christoph Hellwig 79b54d9bfc xfs: use bios directly to write log buffers
Currently the XFS logging code uses the xfs_buf structure and
associated APIs to write the log buffers to disk.  This requires
various special cases in the log code and is generally not very
optimal.

Instead of using a buffer just allocate a kmem_alloc_larger region for
each log buffer, and use a bio and bio_vec array embedded in the iclog
structure to write the buffer to disk.  This also allows for using
the bio split and chaining case to deal with the case of a log
buffer wrapping around the end of the log.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: don't split if/else with an #endif]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:25 -07:00
Christoph Hellwig 2d15d2c0e0 xfs: make use of the l_targ field in struct xlog
Use the slightly shorter way to get at the buftarg for the log device
wherever we can in the log and log recovery code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:24 -07:00
Christoph Hellwig abca1f33f8 xfs: remove the syncing argument from xlog_verify_iclog
The only caller unconditionally passes true here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:24 -07:00
Christoph Hellwig 9b0489c1d1 xfs: update both stat counters together in xlog_sync
Just a small bit of code tidying up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:23 -07:00
Christoph Hellwig db0a6faf93 xfs: factor out iclog size calculation from xlog_sync
Split out another self-contained bit of code from xlog_sync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:23 -07:00
Christoph Hellwig 5693384805 xfs: factor out splitting of an iclog from xlog_sync
Split out a self-contained chunk of code from xlog_sync that calculates
the split offset for an iclog that wraps the log end and bumps the
cycles for the second half.

Use the chance to bring some sanity to the variables used to track the
split in xlog_sync by not changing the count variable, and instead use
split as the offset for the split and use those to calculate the
sizes and offsets for the two write buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:22 -07:00
Christoph Hellwig 94860a301b xfs: factor out log buffer writing from xlog_sync
Replace the not very useful xlog_bdstrat wrapper with a new version that
that takes care of all the common logic for writing log buffers.  Use
the opportunity to avoid overloading the buffer address with the log
relative address, and to shed the unused return value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:22 -07:00
Christoph Hellwig 1f9489be02 xfs: don't use REQ_PREFLUSH for split log writes
If we have to split a log write because it wraps the end of the log we
can't just use REQ_PREFLUSH to flush before the first log write,
as the writes might get reordered somewhere in the I/O stack.  Issue
a manual flush in that case so that the ordering of the two log I/Os
doesn't matter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:21 -07:00
Christoph Hellwig 366fc4b898 xfs: remove XLOG_STATE_IOABORT
This value is the only flag in ic_state, which we otherwise use as
a state.  Switch it to a new debug-only field and also report and
actual error in the buffer in the I/O completion path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:21 -07:00
Christoph Hellwig 9bff313253 xfs: reformat xlog_get_lowest_lsn
Reformat xlog_get_lowest_lsn to our usual style.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:20 -07:00
Christoph Hellwig 4f62282a36 xfs: cleanup xlog_get_iclog_buffer_size
We don't really need all the messy branches in the function, as it
really does three things, out of which 2 are common for all branches:

 1) set up mount point log buffer size and count values if not already
    done from mount options
 2) calculate the number of log headers
 3) set up all the values in struct xlog based on the above

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:20 -07:00
Christoph Hellwig 76ce9823ac xfs: remove the l_iclog_size_log field from struct xlog
This field is never used, so we can simply kill it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:19 -07:00
Christoph Hellwig 72945d86dd xfs: make mem_to_page available outside of xfs_buf.c
Rename the function to kmem_to_page and move it to kmem.h together
with our kmem_large allocator that may either return kmalloced or
vmalloc pages.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:19 -07:00
Christoph Hellwig ce89755cdf xfs: renumber XBF_WRITE_FAIL
Assining a numerical value that is not close to the flags
defined near by is just asking for conflicts later on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:18 -07:00
Christoph Hellwig 153fd7b57c xfs: remove the never used _XBF_COMPOUND flag
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:18 -07:00
Christoph Hellwig 1e85a3670d xfs: remove the no-op spinlock_destroy stub
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28 19:27:17 -07:00
Darrick J. Wong 5467b34bd1 xfs: move xfs_ino_geometry to xfs_shared.h
The inode geometry structure isn't related to ondisk format; it's
support for the mount structure.  Move it to xfs_shared.h.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-06-28 19:25:35 -07:00
David Howells 1eda8bab70 afs: Add support for the UAE error table
Add support for mapping AFS UAE abort codes to Linux errno values.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-28 18:37:53 +01:00
Trond Myklebust 68f461593f NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O
Fix a typo where we're confusing the default TCP retrans value
(NFS_DEF_TCP_RETRANS) for the default TCP timeout value.

Fixes: 15d03055cf ("pNFS/flexfiles: Set reasonable default ...")
Cc: stable@vger.kernel.org # 4.8+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-06-28 11:48:52 -04:00
Ronnie Sahlberg 5de254dca8 cifs: fix crash querying symlinks stored as reparse-points
We never parsed/returned any data from .get_link() when the object is a windows reparse-point
containing a symlink. This results in the VFS layer oopsing accessing an uninitialized buffer:

...
[  171.407172] Call Trace:
[  171.408039]  readlink_copy+0x29/0x70
[  171.408872]  vfs_readlink+0xc1/0x1f0
[  171.409709]  ? readlink_copy+0x70/0x70
[  171.410565]  ? simple_attr_release+0x30/0x30
[  171.411446]  ? getname_flags+0x105/0x2a0
[  171.412231]  do_readlinkat+0x1b7/0x1e0
[  171.412938]  ? __ia32_compat_sys_newfstat+0x30/0x30
...

Fix this by adding code to handle these buffers and make sure we do return a valid buffer
to .get_link()

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-06-28 00:34:17 -05:00
David S. Miller d96ff269a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The new route handling in ip_mc_finish_output() from 'net' overlapped
with the new support for returning congestion notifications from BPF
programs.

In order to handle this I had to take the dev_loopback_xmit() calls
out of the switch statement.

The aquantia driver conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 21:06:39 -07:00
Linus Torvalds 7a702b4e82 for-linus-20190627
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE7btrcuORLb1XUhEwjrBW1T7ssS0FAl0UnRoACgkQjrBW1T7s
 sS1T0w/+PFooDZNaKJkhJCGm0XyRDYmmuivEX9ydUR1x9/doRbDZTqfjsQBJLoVK
 PulxDiuFbQWXzhBJFEMuU6YBR2fjFqUGsXz5qAXPB0zaahWcSY/0Y8VCU/PKq7A6
 3oJPl/lYwYkLTYUKsnN08hByosUA7WeRQRAxbSFWdCTlUfIw72mDhprMGjJIVAlu
 snLA5lUoy7hyoFdXR5qNhYAcX8sASmi01hXhdnsKMOv4z2Vb5NoQsgqL1W8tAnsf
 BdJKL82Qd7vWQahlbOtur46aeJAL2ukGSTskuA2jOQqsKxmpos+hWq36gToq7usa
 XgPii0Rz7/2s6ZvhmxV5kmzqHylT9giU1DxWybSVo9IZBsU2i1o9DV+yBY50tr45
 s0bmpSA/u4DP2uT8oRvh47LbDqiQFA8dyVWQKE25smSdjekuZHTO0tgXf8mwC2CW
 hDci4z+ONOyqIQyFrhP7UaKuSK6tAAUbYKtXUIN6rnuq1FjuTA2+wtIlOPZuDZQ2
 yrsSUefh4/sFMBSAgoGTg9f+PiCejBMKcxoqhU2/27mvkiInAyDPfoc4oGQcinOy
 OVX3B0A8B88l26sDkWdv15d92E1GKzZLj8h66TlYwDpN+seevftKtpblZ9fJWsSf
 0NejoMV/GcA/KsAp1sxqWwouRob8H6pbGXWb97DYRA2IVyjK3q4=
 =APjA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux

Pull pidfd fixes from Christian Brauner:
 "Userspace tools and libraries such as strace or glibc need a cheap and
  reliable way to tell whether CLONE_PIDFD is supported. The easiest way
  is to pass an invalid fd value in the return argument, perform the
  syscall and verify the value in the return argument has been changed
  to a valid fd.

  However, if CLONE_PIDFD is specified we currently check if pidfd == 0
  and return EINVAL if not.

  The check for pidfd == 0 was originally added to enable us to abuse
  the return argument for passing additional flags along with
  CLONE_PIDFD in the future.

  However, extending legacy clone this way would be a terrible idea and
  with clone3 on the horizon and the ability to reuse CLONE_DETACHED
  with CLONE_PIDFD there's no real need for this clutch. So remove the
  pidfd == 0 check and help userspace out.

  Also, accordig to Al, anon_inode_getfd() should only be used past the
  point of no failure and ksys_close() should not be used at all since
  it is far too easy to get wrong. Al's motto being "basically, once
  it's in descriptor table, it's out of your control". So Al's patch
  switches back to what we already had in v1 of the original patchset
  and uses a anon_inode_getfile() + put_user() + fd_install() sequence
  in the success path and a fput() + put_unused_fd() in the failure
  path.

  The other two changes should be trivial"

* tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
  proc: remove useless d_is_dir() check
  copy_process(): don't use ksys_close() on cleanups
  samples: make pidfd-metadata fail gracefully on older kernels
  fork: don't check parent_tidptr with CLONE_PIDFD
2019-06-28 08:41:18 +08:00
Linus Torvalds cd0f3aaebc AFS fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAXRMn5vu3V2unywtrAQICpA/+IIINk6MJVQDzGhOnvWrbGdPnOdJEUyLN
 B9U4bLZJRg/j+Sqodn+fXIfsEO4FQflkSJD+xoBi4pzBZcr0xkLUVOog/1S7dv4J
 bPVT9p2f3ITNiatmisOrUe1InuHa6Wb/cUnQaLLRhd7NqbawKGRQG4tv4CGwKn67
 dJIOOm/iTCs1ACES4C5QOpU7/DWK38Pn3BbnN21bFzDgfbtbdDTaFFkhFtXy78oB
 Gcj5g+ULpkKBcuJThFuJUPZ9E4qICNZR4kJXEULSvykDDRzluhJmQ+v8btm6NJsq
 hMqTrT9M2y114V1OqXj3me7tA6wOEAfTQ0WzpzF2SmyFQKnSly/EkWc4HZXFD/8O
 BczCcABUbuKNE/pJSELx6k1M0+00QfeLcjHPc6joZFCni3lMdYWOncn/syyHw5P+
 rc9JQsy3+dLcFsaVQ5eGmX6NDc70dCrAlS6MllIzSBcwAVCctTKwm0meaSW6B2y6
 VymPy+cqi1RxMKyiQ0hAeU7Xe6yqFcl6rtonfCQqRLxkfzrCXkDp6/ELOXBzDft1
 ey6+N3WsmWW7YSPuM/SIZKV66rshlflj0w+FRluZEEAF1NYeYqXUDvK/S8KC9kPG
 AXUDvhI+tBpxg1AVz94JN714VmkbY23xV0g44eQsdqSQm2YvsxiFCSWZZ6L/KEWe
 kWQc6BGDCB0=
 =YTdG
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:
 "The in-kernel AFS client has been undergoing testing on opendev.org on
  one of their mirror machines. They are using AFS to hold data that is
  then served via apache, and Ian Wienand had reported seeing oopses,
  spontaneous machine reboots and updates to volumes going missing. This
  patch series appears to have fixed the problem, very probably due to
  patch (2), but it's not 100% certain.

  (1) Fix the printing of the "vnode modified" warning to exclude checks
      on files for which we don't have a callback promise from the
      server (and so don't expect the server to tell us when it
      changes).

      Without this, for every file or directory for which we still have
      an in-core inode that gets changed on the server, we may get a
      message logged when we next look at it. This can happen in bulk
      if, for instance, someone does "vos release" to update a R/O
      volume from a R/W volume and a whole set of files are all changed
      together.

      We only really want to log a message if the file changed and the
      server didn't tell us about it or we failed to track the state
      internally.

  (2) Fix accidental corruption of either afs_vlserver struct objects or
      the the following memory locations (which could hold anything).
      The issue is caused by a union that points to two different
      structs in struct afs_call (to save space in the struct). The call
      cleanup code assumes that it can simply call the cleanup for one
      of those structs if not NULL - when it might be actually pointing
      to the other struct.

      This means that every Volume Location RPC op is going to corrupt
      something.

  (3) Fix an uninitialised spinlock. This isn't too bad, it just causes
      a one-off warning if lockdep is enabled when "vos release" is
      called, but the spinlock still behaves correctly.

  (4) Fix the setting of i_block in the inode. This causes du, for
      example, to produce incorrect results, but otherwise should not be
      dangerous to the kernel"

* tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix setting of i_blocks
  afs: Fix uninitialised spinlock afs_volume::cb_break_lock
  afs: Fix vlserver record corruption
  afs: Fix over zealous "vnode modified" warnings
2019-06-28 08:34:12 +08:00
Andreas Gruenbacher 36a7347de0 iomap: fix page_done callback for short writes
When we truncate a short write to have it retried, pass the truncated
length to the page_done callback instead of the full length.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-27 17:28:41 -07:00
Christoph Hellwig 8af54f291e fs: fold __generic_write_end back into generic_write_end
This effectively reverts a6d639da63 ("fs: factor out a
__generic_write_end helper") as we now open code what is left of that
helper in iomap.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-27 17:28:40 -07:00
Andreas Gruenbacher 8d3e72a180 iomap: don't mark the inode dirty in iomap_write_end
Marking the inode dirty for each page copied into the page cache can be
very inefficient for file systems that use the VFS dirty inode tracking,
and is completely pointless for those that don't use the VFS dirty inode
tracking.  So instead, only set an iomap flag when changing the in-core
inode size, and open code the rest of __generic_write_end.

Partially based on code from Christoph Hellwig.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-27 17:28:40 -07:00
David Howells 2e12256b9a keys: Replace uid/gid/perm permissions checking with an ACL
Replace the uid/gid/perm permissions checking on a key with an ACL to allow
the SETATTR and SEARCH permissions to be split.  This will also allow a
greater range of subjects to represented.

============
WHY DO THIS?
============

The problem is that SETATTR and SEARCH cover a slew of actions, not all of
which should be grouped together.

For SETATTR, this includes actions that are about controlling access to a
key:

 (1) Changing a key's ownership.

 (2) Changing a key's security information.

 (3) Setting a keyring's restriction.

And actions that are about managing a key's lifetime:

 (4) Setting an expiry time.

 (5) Revoking a key.

and (proposed) managing a key as part of a cache:

 (6) Invalidating a key.

Managing a key's lifetime doesn't really have anything to do with
controlling access to that key.

Expiry time is awkward since it's more about the lifetime of the content
and so, in some ways goes better with WRITE permission.  It can, however,
be set unconditionally by a process with an appropriate authorisation token
for instantiating a key, and can also be set by the key type driver when a
key is instantiated, so lumping it with the access-controlling actions is
probably okay.

As for SEARCH permission, that currently covers:

 (1) Finding keys in a keyring tree during a search.

 (2) Permitting keyrings to be joined.

 (3) Invalidation.

But these don't really belong together either, since these actions really
need to be controlled separately.

Finally, there are number of special cases to do with granting the
administrator special rights to invalidate or clear keys that I would like
to handle with the ACL rather than key flags and special checks.


===============
WHAT IS CHANGED
===============

The SETATTR permission is split to create two new permissions:

 (1) SET_SECURITY - which allows the key's owner, group and ACL to be
     changed and a restriction to be placed on a keyring.

 (2) REVOKE - which allows a key to be revoked.

The SEARCH permission is split to create:

 (1) SEARCH - which allows a keyring to be search and a key to be found.

 (2) JOIN - which allows a keyring to be joined as a session keyring.

 (3) INVAL - which allows a key to be invalidated.

The WRITE permission is also split to create:

 (1) WRITE - which allows a key's content to be altered and links to be
     added, removed and replaced in a keyring.

 (2) CLEAR - which allows a keyring to be cleared completely.  This is
     split out to make it possible to give just this to an administrator.

 (3) REVOKE - see above.


Keys acquire ACLs which consist of a series of ACEs, and all that apply are
unioned together.  An ACE specifies a subject, such as:

 (*) Possessor - permitted to anyone who 'possesses' a key
 (*) Owner - permitted to the key owner
 (*) Group - permitted to the key group
 (*) Everyone - permitted to everyone

Note that 'Other' has been replaced with 'Everyone' on the assumption that
you wouldn't grant a permit to 'Other' that you wouldn't also grant to
everyone else.

Further subjects may be made available by later patches.

The ACE also specifies a permissions mask.  The set of permissions is now:

	VIEW		Can view the key metadata
	READ		Can read the key content
	WRITE		Can update/modify the key content
	SEARCH		Can find the key by searching/requesting
	LINK		Can make a link to the key
	SET_SECURITY	Can change owner, ACL, expiry
	INVAL		Can invalidate
	REVOKE		Can revoke
	JOIN		Can join this keyring
	CLEAR		Can clear this keyring


The KEYCTL_SETPERM function is then deprecated.

The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set,
or if the caller has a valid instantiation auth token.

The KEYCTL_INVALIDATE function then requires INVAL.

The KEYCTL_REVOKE function then requires REVOKE.

The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an
existing keyring.

The JOIN permission is enabled by default for session keyrings and manually
created keyrings only.


======================
BACKWARD COMPATIBILITY
======================

To maintain backward compatibility, KEYCTL_SETPERM will translate the
permissions mask it is given into a new ACL for a key - unless
KEYCTL_SET_ACL has been called on that key, in which case an error will be
returned.

It will convert possessor, owner, group and other permissions into separate
ACEs, if each portion of the mask is non-zero.

SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY.  WRITE
permission turns on WRITE, REVOKE and, if a keyring, CLEAR.  JOIN is turned
on if a keyring is being altered.

The KEYCTL_DESCRIBE function translates the ACL back into a permissions
mask to return depending on possessor, owner, group and everyone ACEs.

It will make the following mappings:

 (1) INVAL, JOIN -> SEARCH

 (2) SET_SECURITY -> SETATTR

 (3) REVOKE -> WRITE if SETATTR isn't already set

 (4) CLEAR -> WRITE

Note that the value subsequently returned by KEYCTL_DESCRIBE may not match
the value set with KEYCTL_SETATTR.


=======
TESTING
=======

This passes the keyutils testsuite for all but a couple of tests:

 (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now
     returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed
     if the type doesn't have ->read().  You still can't actually read the
     key.

 (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't
     work as Other has been replaced with Everyone in the ACL.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-27 23:03:07 +01:00
David Howells a58946c158 keys: Pass the network namespace into request_key mechanism
Create a request_key_net() function and use it to pass the network
namespace domain tag into DNS revolver keys and rxrpc/AFS keys so that keys
for different domains can coexist in the same keyring.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: netdev@vger.kernel.org
cc: linux-nfs@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-afs@lists.infradead.org
2019-06-27 23:02:12 +01:00
Bob Peterson f29e62eed2 gfs2: replace more printk with calls to fs_info and friends
This patch replaces a few leftover printk errors with calls to
fs_info and similar, so that the file system having the error is
properly logged.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:30:27 +02:00
Bob Peterson 3792ce973f gfs2: dump fsid when dumping glock problems
Before this patch, if a glock error was encountered, the glock with
the problem was dumped. But sometimes you may have lots of file systems
mounted, and that doesn't tell you which file system it was for.

This patch adds a new boolean parameter fsid to the dump_glock family
of functions. For non-error cases, such as dumping the glocks debugfs
file, the fsid is not dumped in order to keep lock dumps and glocktop
as clean as possible. For all error cases, such as GLOCK_BUG_ON, the
file system id is now printed. This will make it easier to debug.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:27:43 +02:00
Bob Peterson 55317f5b00 gfs2: simplify gfs2_freeze by removing case
Function gfs2_freeze had a case statement that simply checked the
error code, but the break statements just made the logic hard to
read. This patch simplifies the logic in favor of a simple if.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:26:58 +02:00
Bob Peterson 04aea0ca14 gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN
Before this patch, the superblock flag indicating when a file system
is withdrawn was called SDF_SHUTDOWN. This patch simply renames it to
the more obvious SDF_WITHDRAWN.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:26:35 +02:00
Bob Peterson d14e1ca305 gfs2: Warn when a journal replay overwrites a rgrp with buffers
This patch adds some instrumentation in gfs2's journal replay that
indicates when we're about to overwrite a rgrp for which we already
have a valid buffer_head.

When this problem occurs, it's a situation in which this node has
been granted a rgrp glock and subsequently read in buffer_heads for
it, and possibly even made changes to the rgrp bits and/or
allocation values. But now another node has failed and forced us to
replay its journal, but its journal contains a copy of the same
rgrp, without a revoke, which means we're about to overwrite a
rgrp that we now rightfully own, with an obsolete copy. That is
always a problem. It means the other node (which failed and left
its journal to be replayed) failed to flush out its rgrp buffers,
write out the revoke, and invalidate its copy before it released
the glock to our possession.

No node should ever release a glock until its metadata has been
written to the journal and revoked and invalidated..

We also kludge around the problem and refuse to replace our good
copy with the journals bad copy by not marking the buffer dirty,
but never do it silently. That's wallpapering over a larger problem
that still exists. IOW, if this situation can happen to this node,
it can also happen to a different node and we wouldn't even know it
or be able to circumvent it: Suppose we have a 3-node cluster:
Node 1 fails, leaving an obsolete rgrp block in its journal without
a revoke. Node 2 grabs the rgrp as soon as the rgrp glock is
released and starts making changes, allocating and freeing blocks
from the rgrp, etc. Node 3 replays the journal from node 1,
oblivious and unaware that it's about to overwrite node 2's changes.
So we still need to be vocal and log the error to make it apparent
that a corruption path still exists in gfs2.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:04:07 +02:00
Bob Peterson 49eb776ed9 gfs2: log which portion of the journal is replayed
When a journal is replayed, gfs2 logs a message similar to:

jid=X: Replaying journal...

This patch adds the tail and block number so that the range of the
replayed block is also printed. These values will match the values
shown if the journal is dumped with gfs2_edit -p journalX. The
resulting output looks something like this:

jid=1: Replaying journal...0x28b7 to 0x2beb

This will allow us to better debug file system corruption problems.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:03:58 +02:00
Bob Peterson e955537e32 gfs2: eliminate tr_num_revoke_rm
For its journal processing, gfs2 kept track of the number of buffers
added and removed on a per-transaction basis. These values are used
to calculate space needed in the journal. But while these calculations
make sense for the number of buffers, they make no sense for revokes.
Revokes are managed in their own list, linked from the superblock.
So it's entirely unnecessary to keep separate per-transaction counts
for revokes added and removed. A single count will do the same job.
Therefore, this patch combines the transaction revokes into a single
count.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:03:53 +02:00
Bob Peterson 5b3a9f348b gfs2: kthread and remount improvements
Before this patch, gfs2 saved the pointers to the two daemon threads
(logd and quotad) in the superblock, but they were never cleared,
even if the threads were stopped (e.g. on remount -o ro). That meant
that certain error conditions (like a withdrawn file system) could
race. For example, xfstests generic/361 caused an IO error during
remount -o ro, which caused the kthreads to be stopped, then the
error flagged. Later, when the test unmounted the file system, it
would try to stop the threads a second time with kthread_stop.

This patch does two things: First, every time it stops the threads
it zeroes out the thread pointer, and also checks whether it's NULL
before trying to stop it. Second, in function gfs2_remount_fs, it
was returning if an error was logged by either of the two functions
for gfs2_make_fs_ro and _rw, which caused it to bypass the online
uevent at the bottom of the function. This removes that bypass in
favor of just running the whole function, then returning the error.
That way, unmounts and remounts won't hang forever.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 21:03:43 +02:00
Kefeng Wang 15a798f7de gfs2: Use IS_ERR_OR_NULL
Use IS_ERR_OR_NULL where appropriate.

(Several more places converted by Andreas.)

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 20:53:46 +02:00
Andreas Gruenbacher 2a27b755ed gfs2: Clean up freeing struct gfs2_sbd
Add a free_sbd function for freeing a struct gfs2_sbd.  Use that for
freeing a super-block descriptor, either directly or via kobject_put.
Free sd_lkstats inside the kobject release function: that way,
gfs2_put_super will no longer leak sd_lkstats.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-27 20:53:45 +02:00
Eric Biggers adbd9b4dee fscrypt: remove selection of CONFIG_CRYPTO_SHA256
fscrypt only uses SHA-256 for AES-128-CBC-ESSIV, which isn't the default
and is only recommended on platforms that have hardware accelerated
AES-CBC but not AES-XTS.  There's no link-time dependency, since SHA-256
is requested via the crypto API on first use.

To reduce bloat, we should limit FS_ENCRYPTION to selecting the default
algorithms only.  SHA-256 by itself isn't that much bloat, but it's
being discussed to move ESSIV into a crypto API template, which would
incidentally bring in other things like "authenc" support, which would
all end up being built-in since FS_ENCRYPTION is now a bool.

For Adiantum encryption we already just document that users who want to
use it have to enable CONFIG_CRYPTO_ADIANTUM themselves.  So, let's do
the same for AES-128-CBC-ESSIV and CONFIG_CRYPTO_SHA256.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-06-27 10:29:33 -07:00
Jeff Layton d6b8bd679c ceph: fix ceph_mdsc_build_path to not stop on first component
When ceph_mdsc_build_path is handed a positive dentry, it will return a
zero-length path string with the base set to that dentry.  This is not
what we want.  Always include at least one path component in the string.

ceph_mdsc_build_path has behaved this way for a long time but it didn't
matter until recent d_name handling rework.

Fixes: 964fff7491 ("ceph: use ceph_mdsc_build_path instead of clone_dentry_name")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-06-27 18:27:36 +02:00
Christian Brauner 30d158b143
proc: remove useless d_is_dir() check
Remove the d_is_dir() check from tgid_pidfd_to_pid().

It is pointless since you should never get &proc_tgid_base_operations
for f_op on a non-directory.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian@brauner.io>
2019-06-27 12:25:09 +02:00
Eric Sandeen 555b2c3da1 quota: honor quota type in Q_XGETQSTAT[V] calls
The code in quota_getstate and quota_getstatev is strange; it
says the returned fs_quota_stat[v] structure has room for only
one type of time limits, so fills it in with the first enabled
quota, even though every quotactl command must have a type sent
in by the user.

Instead of just picking the first enabled quota, fill in the
reply with the timers for the quota type that was actually
requested.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-25 17:51:35 +02:00
Ingo Molnar b9271f0c65 Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into perf/core, to refresh branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:25:52 +02:00
Ingo Molnar d2abae71eb Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into sched/core, to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:19:53 +02:00
Jens Axboe 9e645e1105 io_uring: add support for sqe links
With SQE links, we can create chains of dependent SQEs. One example
would be queueing an SQE that's a read from one file descriptor, with
the linked SQE being a write to another with the same set of buffers.

An SQE link will not stall the pipeline, it'll just ensure that
dependent SQEs aren't issued before the previous link has completed.

Any error at submission or completion time will break the chain of SQEs.
For completions, this also includes short reads or writes, as the next
SQE could depend on the previous one being fully completed.

Any SQE in a chain that gets canceled due to any of the above errors,
will get an CQE fill with -ECANCELED as the error value.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-24 08:00:18 -06:00
Christoph Hellwig a2357223c5 binfmt_flat: don't offset the data start
Ever since the initial commit of the binfmt_flat shared library
support back in the bitkeeper days we've offset the actual in-memory
.data start by one field per possible shared library, or 1 in case
shared library support isn't enabled.  I can't find anything in the
loader that actually makes use of it, nor was it present before
shared library support it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00
Christoph Hellwig a445d988b4 binfmt_flat: move the MAX_SHARED_LIBS definition to binfmt_flat.c
MAX_SHARED_LIBS is an implementation detail of the kernel loader,
and should be kept away from the file format definition.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00
Christoph Hellwig 6843d8aa5b binfmt_flat: remove the persistent argument from flat_get_addr_from_rp
The argument is never used.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00
Christoph Hellwig cf9a566c2c binfmt_flat: make support for old format binaries optional
No need to carry the extra code around, given that systems using flat
binaries are generally very resource constrained.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00
Christoph Hellwig aef0f78e74 binfmt_flat: add a ARCH_HAS_BINFMT_FLAT option
Allow architectures to opt into ARCH_HAS_BINFMT_FLAT support instead of
assuming that all nommu ports support the format.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00
Christoph Hellwig 3b97771842 binfmt_flat: add endianess annotations
Most binfmt_flat on-disk fields are big endian.  Use the proper __be32
type where applicable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00
Christoph Hellwig 06d2bfedd1 binfmt_flat: remove the uapi <linux/flat.h> header
The split between the two flat.h files is completely arbitrary, and the
uapi version even contains CONFIG_ ifdefs that can't work in userspace.
The only userspace program known to use the header is elf2flt, and it
ships with its own version of the combined header.

Use the chance to move the <asm/flat.h> inclusion out of this file, as it
is in no way needed for the format defintion, but just for the binfmt
implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:46 +10:00
Christoph Hellwig bdd15a2884 binfmt_flat: replace flat_argvp_envp_on_stack with a Kconfig variable
This will eventually allow us to kill the need for an <asm/flat.h> for
many cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:46 +10:00
Christoph Hellwig 1d52dca117 binfmt_flat: remove flat_old_ram_flag
Instead add a Kconfig variable that only h8300 selects.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:46 +10:00
Christoph Hellwig 02da283302 binfmt_flat: provide a default version of flat_get_relocate_addr
This way only the two architectures that do masking need to provide
the helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:46 +10:00
Christoph Hellwig 2f3196d49b binfmt_flat: remove flat_set_persistent
This helper is a no-op on all architectures, remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:46 +10:00
Christoph Hellwig 9ee24b2a38 binfmt_flat: remove flat_reloc_valid
This helper is the same for all architectures, open code it in the only
caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:46 +10:00
Greg Kroah-Hartman 8083f3d788 Merge 5.2-rc6 into char-misc-next
We need the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-23 09:23:33 +02:00
David S. Miller 92ad6325cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor SPDX change conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-22 08:59:24 -04:00
Theodore Ts'o 7633b08b27 ext4: rename htree_inline_dir_to_tree() to ext4_inlinedir_to_tree()
Clean up namespace pollution by the inline_data code.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-06-21 21:57:00 -04:00
Linus Torvalds c036f7dabc More NFS client fixes for Linux 5.2
Bugfixes:
 - SUNRPC: Fix a credential refcount leak
 - Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"
 - SUNRPC: Fix xps refcount imbalance on the error path
 - NFS4: Only set creation opendata if O_CREAT
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAl0NMHQACgkQ18tUv7Cl
 QOujAw//bL4p0ADvCSglqO0ceZcHpHuQVhj10f+tyXfkq3R7WDQmP2qok/+6uZjk
 rYxWh6nCqQqTBSmstf4ouLMjdQTk+zDQf38zSSWGPW0U6rCX4kl/yOyskBriYqnf
 W4U5+hOCvQ8prKdkcjbhYqdvz6qT9bhc5X7kHgtD66CtvyzUDraYw04Mojzodl91
 CPV97rDZbiHBAgZPBFKF+qoTXEL7hQlHUREcR/DZBLV3qrMBsraog1T1OpWRGev0
 OAxVAwZyXFcWDFm7mFcItMA4WcUnZbL77gPNtvZSfgYPxHsytHfH8KOmEG7zP5Yy
 +blko41nvNR2UVOPQ/zTvWj9pkuHQlacDUrlYdgnOmHuGhufj1/xx4Z5C2dkTTnp
 ufjcVXJOqZE7lWyA4IOWapc7gLM3q6I8sUR9w5nLc1meYphNAqHZgK0fTgfbMoo7
 JweUBcgGmNvkMpkX561HBe16ENKvcgQQp666VsnTTI/BPZ/BBdlayKGBbxA3j22F
 znF4gwznvQ0jVxtlNsibzSwM8GQbc6UM5fGPM+atlPJEzban0waQaIbCW347DViZ
 fTXP2NQmvH1x+YNgx6RTRwVBWWT02u/ijQHf7+NvwWLWTKb9OXwQVjlnWHu/E/Qi
 MpNXxtBTT6y+DG09VNV9CYwmAsULnlRQWe9RNBfMz+4sjLCyIjg=
 =+z30
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.2-3' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull more NFS client fixes from Anna Schumaker:
 "These are mostly refcounting issues that people have found recently.
  The revert fixes a suspend recovery performance issue.

   - SUNRPC: Fix a credential refcount leak

   - Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"

   - SUNRPC: Fix xps refcount imbalance on the error path

   - NFS4: Only set creation opendata if O_CREAT"

* tag 'nfs-for-5.2-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  SUNRPC: Fix a credential refcount leak
  Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"
  net :sunrpc :clnt :Fix xps refcount imbalance on the error path
  NFS4: Only set creation opendata if O_CREAT
2019-06-21 13:45:41 -07:00
Theodore Ts'o ddce3b9471 ext4: refactor initialize_dirent_tail()
Move the calculation of the location of the dirent tail into
initialize_dirent_tail().  Also prefix the function with ext4_ to fix
kernel namepsace polution.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-06-21 16:31:47 -04:00
Jens Axboe 60c112b0ad io_uring: ensure req->file is cleared on allocation
Stephen reports:

I hit the following General Protection Fault when testing io_uring via
the io_uring engine in fio. This was on a VM running 5.2-rc5 and the
latest version of fio. The issue occurs for both null_blk and fake NVMe
drives. I have not tested bare metal or real NVMe SSDs. The fio script
used is given below.

[io_uring]
time_based=1
runtime=60
filename=/dev/nvme2n1 (note /dev/nullb0 also fails)
ioengine=io_uring
bs=4k
rw=readwrite
direct=1
fixedbufs=1
sqthread_poll=1
sqthread_poll_cpu=0

general protection fault: 0000 [#1] SMP PTI
CPU: 0 PID: 872 Comm: io_uring-sq Not tainted 5.2.0-rc5-cpacket-io-uring #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:fput_many+0x7/0x90
Code: 01 48 85 ff 74 17 55 48 89 e5 53 48 8b 1f e8 a0 f9 ff ff 48 85 db 48 89 df 75 f0 5b 5d f3 c3 0f 1f 40 00 0f 1f 44 00 00 89 f6 <f0> 48 29 77 38 74 01 c3 55 48 89 e5 53 48 89 fb 65 48 \

RSP: 0018:ffffadeb817ebc50 EFLAGS: 00010246
RAX: 0000000000000004 RBX: ffff8f46ad477480 RCX: 0000000000001805
RDX: 0000000000000000 RSI: 0000000000000001 RDI: f18b51b9a39552b5
RBP: ffffadeb817ebc58 R08: ffff8f46b7a318c0 R09: 000000000000015d
R10: ffffadeb817ebce8 R11: 0000000000000020 R12: ffff8f46ad4cd000
R13: 00000000fffffff7 R14: ffffadeb817ebe30 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff8f46b7a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055828f0bbbf0 CR3: 0000000232176004 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? fput+0x13/0x20
 io_free_req+0x20/0x40
 io_put_req+0x1b/0x20
 io_submit_sqe+0x40a/0x680
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 io_submit_sqes+0xb9/0x160
 ? io_submit_sqes+0xb9/0x160
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __schedule+0x3f2/0x6a0
 ? __switch_to_asm+0x34/0x70
 io_sq_thread+0x1af/0x470
 ? __switch_to_asm+0x34/0x70
 ? wait_woken+0x80/0x80
 ? __switch_to+0x85/0x410
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __schedule+0x3f2/0x6a0
 kthread+0x105/0x140
 ? io_submit_sqes+0x160/0x160
 ? kthread+0x105/0x140
 ? io_submit_sqes+0x160/0x160
 ? kthread_destroy_worker+0x50/0x50
 ret_from_fork+0x35/0x40

which occurs because using a kernel side submission thread isn't valid
without using fixed files (registered through io_uring_register()). This
causes io_uring to put the request after logging an error, but before
the file field is set in the request. If it happens to be non-zero, we
attempt to fput() garbage.

Fix this by ensuring that req->file is initialized when the request is
allocated.

Cc: stable@vger.kernel.org # 5.1+
Reported-by: Stephen Bates <sbates@raithlin.com>
Tested-by: Stephen Bates <sbates@raithlin.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-21 14:16:28 -06:00
Theodore Ts'o f036adb399 ext4: rename "dirent_csum" functions to use "dirblock"
Functions such as ext4_dirent_csum_verify() and ext4_dirent_csum_set()
don't actually operate on a directory entry, but a directory block.
And while they take a struct ext4_dir_entry *dirent as an argument, it
had better be the first directory at the beginning of the direct
block, or things will go very wrong.

Rename the following functions so that things make more sense, and
remove a lot of confusing casts along the way:

   ext4_dirent_csum_verify	 -> ext4_dirblock_csum_verify
   ext4_dirent_csum_set		 -> ext4_dirblock_csum_set
   ext4_dirent_csum		 -> ext4_dirblock_csum
   ext4_handle_dirty_dirent_node -> ext4_handle_dirty_dirblock

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-06-21 15:49:26 -04:00
Benjamin Coddington 909105199a NFS4: Only set creation opendata if O_CREAT
We can end up in nfs4_opendata_alloc during task exit, in which case
current->fs has already been cleaned up.  This leads to a crash in
current_umask().

Fix this by only setting creation opendata if we are actually doing an open
with O_CREAT.  We can drop the check for NULL nfs4_open_createattrs, since
O_CREAT will never be set for the recovery path.

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-06-21 14:43:25 -04:00
Wang Shilong 5043a9643f f2fs: only set project inherit bit for directory
It doesn't make any sense to have project inherit bits
for regular files, even though this won't cause any
problem, but it is better fix this.

Cc: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-21 10:41:57 -07:00
Eric Biggers 360985573b f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags
f2fs copied all the on-disk i_flags from ext4, and along with it the
assumption that the on-disk i_flags are the same as the bits used by
FS_IOC_GETFLAGS and FS_IOC_SETFLAGS.  This is problematic because
reserving an on-disk inode flag in either filesystem's i_flags or in
these ioctls effectively reserves it in all the other places too.  In
fact, most of the "f2fs i_flags" are not used by f2fs at all.

Fix this by separating f2fs's i_flags from the ioctl bits and ext4's
i_flags.

In the process, un-reserve all "f2fs i_flags" that aren't actually
supported by f2fs.  This included various flags that were not settable
at all, as well as various flags that were settable by FS_IOC_SETFLAGS
but didn't actually do anything.

There's a slight chance we'll need to add some flag(s) back to
FS_IOC_SETFLAGS in order to avoid breaking users who expect f2fs to
accept some random flag(s).  But hopefully such users don't exist.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-21 10:41:57 -07:00
Kimberly Brown 176ef3c4de f2fs: replace ktype default_attrs with default_groups
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace the default_attrs fields in f2fs_sb_ktype
and f2fs_feat_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro
to create f2fs_groups and f2fs_feat_groups.

Fixes: fef4129ec2 ("f2fs: fix to be aware discard/preflush/dio command in is_idle()")
Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-21 10:41:56 -07:00
Linus Torvalds c884d8ac7f SPDX update for 5.2-rc6
Another round of SPDX updates for 5.2-rc6
 
 Here is what I am guessing is going to be the last "big" SPDX update for
 5.2.  It contains all of the remaining GPLv2 and GPLv2+ updates that
 were "easy" to determine by pattern matching.  The ones after this are
 going to be a bit more difficult and the people on the spdx list will be
 discussing them on a case-by-case basis now.
 
 Another 5000+ files are fixed up, so our overall totals are:
 	Files checked:            64545
 	Files with SPDX:          45529
 
 Compared to the 5.1 kernel which was:
 	Files checked:            63848
 	Files with SPDX:          22576
 This is a huge improvement.
 
 Also, we deleted another 20000 lines of boilerplate license crud, always
 nice to see in a diffstat.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXQyQYA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymnGQCghETUBotn1p3hTjY56VEs6dGzpHMAnRT0m+lv
 kbsjBGEJpLbMRB2krnaU
 =RMcT
 -----END PGP SIGNATURE-----

Merge tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull still more SPDX updates from Greg KH:
 "Another round of SPDX updates for 5.2-rc6

  Here is what I am guessing is going to be the last "big" SPDX update
  for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates
  that were "easy" to determine by pattern matching. The ones after this
  are going to be a bit more difficult and the people on the spdx list
  will be discussing them on a case-by-case basis now.

  Another 5000+ files are fixed up, so our overall totals are:
	Files checked:            64545
	Files with SPDX:          45529

  Compared to the 5.1 kernel which was:
	Files checked:            63848
	Files with SPDX:          22576

  This is a huge improvement.

  Also, we deleted another 20000 lines of boilerplate license crud,
  always nice to see in a diffstat"

* tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485
  ...
2019-06-21 09:58:42 -07:00
Linus Torvalds 05512b0f46 four small SMB3 fixes, all for stable
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl0MOgQACgkQiiy9cAdy
 T1EJYwv9HdThounjBOZJIxWEgrmZPDzrX4gC3qMtCW5kn8VIKu3em6QAh0N8F03y
 j4NZwH1bSUPaD+mtiWbSHA7An9On6xfGbLOPzJukVPYIjm58IxrUD5PUzgZk8lFg
 5WL4F5TO7uyecokK5RP/HijgT6bg9ZAcC++dV/ZAZ0+ihGIc5iRFT+eH0Y4k3aGr
 tf97JDf02N5olOKKfOg4yfbZA0tG/A2W6BShZg+f1HaNxhBmRtzmPqMAOnGQQiL6
 uig3la1KC7czC0gnaYqbQD7Nisy7KUbeF15B5/l8/ukUMMTQig4zoynBukMiasSQ
 XsHYGzNxLoNmnw3NKyZQ8FC3k1cYRz0gxQn++QQ2q1kbOxBvwlNaNWxJ/UXRQQTB
 Prc+PqYpUabzrNWjlXUP8uliRH6vDV/0Y5Ohotl8megKWNiJV5C3LV+Zaf04IDag
 Db0W6qTZbnlrQn+0rx7F643R+zBooS+cIfgP+U6zR5KyuO32bPN5+z/o1WXL71gQ
 2PH8qw6k
 =SJLT
 -----END PGP SIGNATURE-----

Merge tag '5.2-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Four small SMB3 fixes, all for stable"

* tag '5.2-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix GlobalMid_Lock bug in cifs_reconnect
  SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write
  cifs: add spinlock for the openFileList to cifsInodeInfo
  cifs: fix panic in smb2_reconnect
2019-06-21 09:51:44 -07:00
Kimberly Brown 7c7e301406 btrfs: sysfs: Replace default_attrs in ktypes with groups
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace the default_attrs fields in
btrfs_raid_ktype and space_info_ktype with default_groups.

Change "raid_attributes" to "raid_attrs", and use the ATTRIBUTE_GROUPS
macro to create raid_groups and space_info_groups.

Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-21 15:38:59 +02:00
Theodore Ts'o 4e19d6b65f ext4: allow directory holes
The largedir feature was intended to allow ext4 directories to have
unmapped directory blocks (e.g., directory holes).  And so the
released e2fsprogs no longer enforces this for largedir file systems;
however, the corresponding change to the kernel-side code was not made.

This commit fixes this oversight.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2019-06-20 21:19:02 -04:00
Theodore Ts'o 9382cde8cd jbd2: drop declaration of journal_sync_buffer()
The journal_sync_buffer() function was never carried over from jbd to
jbd2.  So get rid of the vestigal declaration of this (non-existent)
function.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-20 17:32:21 -04:00
Ross Zwisler 73131fbb00 ext4: use jbd2_inode dirty range scoping
Use the newly introduced jbd2_inode dirty range scoping to prevent us
from waiting forever when trying to complete a journal transaction.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
2019-06-20 17:26:26 -04:00
Ross Zwisler 6ba0e7dc64 jbd2: introduce jbd2_inode dirty range scoping
Currently both journal_submit_inode_data_buffers() and
journal_finish_inode_data_buffers() operate on the entire address space
of each of the inodes associated with a given journal entry.  The
consequence of this is that if we have an inode where we are constantly
appending dirty pages we can end up waiting for an indefinite amount of
time in journal_finish_inode_data_buffers() while we wait for all the
pages under writeback to be written out.

The easiest way to cause this type of workload is do just dd from
/dev/zero to a file until it fills the entire filesystem.  This can
cause journal_finish_inode_data_buffers() to wait for the duration of
the entire dd operation.

We can improve this situation by scoping each of the inode dirty ranges
associated with a given transaction.  We do this via the jbd2_inode
structure so that the scoping is contained within jbd2 and so that it
follows the lifetime and locking rules for that structure.

This allows us to limit the writeback & wait in
journal_submit_inode_data_buffers() and
journal_finish_inode_data_buffers() respectively to the dirty range for
a given struct jdb2_inode, keeping us from waiting forever if the inode
in question is still being appended to.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
2019-06-20 17:24:56 -04:00
Linus Torvalds 4ae004a9bc overlayfs fixes for 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXQvlwwAKCRDh3BK/laaZ
 PJH4AP4vnumu1Q22ZWiGkTeU93JTgHt3MGPG1r1DtnUmsIKRfwEAjDY8bvuOP7Vw
 EYQicghvAPTHWqyGUoe0QZJwPlMiZw4=
 =TKOM
 -----END PGP SIGNATURE-----

Merge tag 'ovl-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
 "Fix two regressions in this cycle, and a couple of older bugs"

* tag 'ovl-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: make i_ino consistent with st_ino in more cases
  ovl: fix typo in MODULE_PARM_DESC
  ovl: fix bogus -Wmaybe-unitialized warning
  ovl: don't fail with disconnected lower NFS
  ovl: fix wrong flags check in FS_IOC_FS[SG]ETXATTR ioctls
2019-06-20 14:19:34 -07:00
Linus Torvalds b910f6a7cc fuse fixes for 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXQvk/wAKCRDh3BK/laaZ
 POwjAP9hPq9pTlX3YZsz14DcoBdz8iyFXNWMj7eQCL4GioCMKgEA3XNajQyf9DLK
 bWRkAdYHVAcP0QueK5ReNYl3pV66mw4=
 =f2l/
 -----END PGP SIGNATURE-----

Merge tag 'fuse-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse fix from Miklos Szeredi:
 "Just a single revert, fixing a regression in -rc1"

* tag 'fuse-fixes-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  Revert "fuse: require /dev/fuse reads to have enough buffer capacity"
2019-06-20 14:16:16 -07:00
Linus Torvalds d72558b2b3 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAl0LhAIACgkQnJ2qBz9k
 QNnUqwf/d7fNZv0+GJVBIrIVbSUgHqzJYxakMWAS6NGMmd2fkPcoPRHitXWbi5MJ
 fhJPFceNVqY30RPQUePlDmWSitEDI0kdaNZ3Z8SzE9YszaEgoLNAN/dpOuPGpQfh
 kXQd7yM1cBZJoAv5kQsECiYXfY7nk+3J+DVsu69rBcsooxT5rfXs00Dz9ETao9gK
 L1SR/s5C6b2t0m0EfQpv/+PjbzPQPLKngvihvFesAT6lSA6QpRMY7M8+4Es3rzuI
 7h0kuThkJaIp9B+D9C8vYIT+uVQVjsN9wXozJHXRNvnK/4mfDvYJdWSkRhqP5p1a
 DBRo/jK8meV1ZvIEsLjARxHg0z7yAA==
 =PlCd
 -----END PGP SIGNATURE-----

Merge tag 'for_v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull two misc vfs fixes from Jan Kara:
 "One small quota fix fixing spurious EDQUOT errors and one fanotify fix
  fixing a bug in the new fanotify FID reporting code"

* tag 'for_v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: update connector fsid cache on add mark
  quota: fix a problem about transfer quota
2019-06-20 10:12:53 -07:00
Zhengyuan Liu ee102584ef fs/afs: use struct_size() in kzalloc()
As Gustavo said in other patches doing the same replace, we can now
use the new struct_size() helper to avoid leaving these open-coded and
prone to type mistake.

Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 18:12:17 +01:00
David Howells 4521819369 afs: Trace afs_server usage
Add a tracepoint (afs_server) to track the afs_server object usage count.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 18:12:17 +01:00
David Howells 051d25250b afs: Add some callback management tracepoints
Add a couple of tracepoints to track callback management:

 (1) afs_cb_miss - Logs when we were unable to apply a callback, either due
     to the inode being discarded or due to a competing thread applying a
     callback first.

 (2) afs_cb_break - Logs when we attempted to clear the noted callback
     promise, either due to the server explicitly breaking the callback,
     the callback promise lapsing or a local event obsoleting it.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 18:12:16 +01:00
David Howells fa59f52f5b afs: afs_unlink() doesn't need to check dentry->d_inode
Don't check that dentry->d_inode is valid in afs_unlink().  We should be
able to take that as given.

This caused Smatch to issue the following warning:

	fs/afs/dir.c:1392 afs_unlink() error: we previously assumed 'vnode' could be null (see line 1375)

Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 18:12:16 +01:00
David Howells 2cd42d19cf afs: Fix setting of i_blocks
The setting of i_blocks, which is calculated from i_size, has got
accidentally misordered relative to the setting of i_size when initially
setting up an inode.  Further, i_blocks isn't updated by afs_apply_status()
when the size is updated.

To fix this, break the i_size/i_blocks setting out into a helper function
and call it from both places.

Fixes: a58823ac45 ("afs: Fix application of status and callback to be under same lock")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 18:12:02 +01:00
Linus Torvalds 41a247d896 for-linus-20190620
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl0LUG0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgplqNEADDmx5+r01qqeVKHbcKPFdd1BZMxVhun0cI
 u1kaeQijGCuYeWACuwXAuMRovEjr/lz9ClJVAKqT+e+wKtbEnRzT1fgG2elYU/ta
 gSAFzqbQOidY+r4oF+xsqJLduOlFtNbiPtyFWzBf/FHe53FS3OT017FJ+SaIE4eD
 ljzo4QD2Sv3/c3CGbbCZUGdIMd4c/7qwU+dHeoVDOG3o8FAYCwewA/XCJQ9VZXgW
 38bpRPvEZ9nvXP00C5Khzsqyxo3P+A2qk1+z3Bx4d8Dw64+jUVoYNdws8qr13MZu
 +EwHy91cvBCF1mzu0+X3irDh+Di+uuzvQ0Nfd7E1xkTNUKSc7ql7XpYoAyF3D7E3
 /4M864cFcaXq6RVY25uq92vUPk4bKsugR19zmKe8PYKrhG0NhRncJNSXNV1coyhD
 Nfu4EKybTwBcdJO8hvs8moAjLPLPtcWopLrHq9CoCqTC8RAIG1IT8OWfaqQuEBCn
 RlzaCuAHP2QBdkZ/69BK48/OSSqhnsQF200pRDA+3NJnoX5UIcqdFwXNu0GUaqzg
 nmqWiNorIvKKmWTWDi8LgnqYM1WU6K30ix1yG848e9Clkw/pwKPLc0FuDBiynIqy
 GD0FZK4v4z8gz0GeASJqqefI63DnT8CeQvuCuRDQoyLl46ND7kXvI04Q3QvTXpUD
 I6wfzGZJvQ==
 =rUI5
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190620' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Three fixes that should go into this series.

  One is a set of two patches from Christoph, fixing a page leak on same
  page merges. Boiled down version of a bigger fix, but this one is more
  appropriate for this late in the cycle (and easier to backport to
  stable).

  The last patch is for a divide error in MD, from Mariusz (via Song)"

* tag 'for-linus-20190620' of git://git.kernel.dk/linux-block:
  md: fix for divide error in status_resync
  block: fix page leak when merging to same page
  block: return from __bio_try_merge_page if merging occured in the same page
2019-06-20 09:58:35 -07:00
David Howells 90fa9b6452 afs: Fix uninitialised spinlock afs_volume::cb_break_lock
Fix the cb_break_lock spinlock in afs_volume struct by initialising it when
the volume record is allocated.

Also rename the lock to cb_v_break_lock to distinguish it from the lock of
the same name in the afs_server struct.

Without this, the following trace may be observed when a volume-break
callback is received:

  INFO: trying to register non-static key.
  the code is fine but needs lockdep annotation.
  turning off the locking correctness validator.
  CPU: 2 PID: 50 Comm: kworker/2:1 Not tainted 5.2.0-rc1-fscache+ #3045
  Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
  Workqueue: afs SRXAFSCB_CallBack
  Call Trace:
   dump_stack+0x67/0x8e
   register_lock_class+0x23b/0x421
   ? check_usage_forwards+0x13c/0x13c
   __lock_acquire+0x89/0xf73
   lock_acquire+0x13b/0x166
   ? afs_break_callbacks+0x1b2/0x3dd
   _raw_write_lock+0x2c/0x36
   ? afs_break_callbacks+0x1b2/0x3dd
   afs_break_callbacks+0x1b2/0x3dd
   ? trace_event_raw_event_afs_server+0x61/0xac
   SRXAFSCB_CallBack+0x11f/0x16c
   process_one_work+0x2c5/0x4ee
   ? worker_thread+0x234/0x2ac
   worker_thread+0x1d8/0x2ac
   ? cancel_delayed_work_sync+0xf/0xf
   kthread+0x11f/0x127
   ? kthread_park+0x76/0x76
   ret_from_fork+0x24/0x30

Fixes: 68251f0a68 ("afs: Fix whole-volume callback handling")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 16:49:35 +01:00
David Howells a6853b9ce8 afs: Fix vlserver record corruption
Because I made the afs_call struct share pointers to an afs_server object
and an afs_vlserver object to save space, afs_put_call() calls
afs_put_server() on afs_vlserver object (which is only meant for the
afs_server object) because it sees that call->server isn't NULL.

This means that the afs_vlserver object gets unpredictably and randomly
modified, depending on what config options are set (such as lockdep).

Fix this by getting rid of the union and having two non-overlapping
pointers in the afs_call struct.

Fixes: ffba718e93 ("afs: Get rid of afs_call::reply[]")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 16:49:35 +01:00
David Howells 3647e42b55 afs: Fix over zealous "vnode modified" warnings
Occasionally, warnings like this:

	vnode modified 2af7 on {10000b:1} [exp 2af2] YFS.FetchStatus(vnode)

are emitted into the kernel log.  This indicates that when we were applying
the updated vnode (file) status retrieved from the server to an inode we
saw that the data version number wasn't what we were expecting (in this
case it's 0x2af7 rather than 0x2af2).

We've usually received a callback from the server prior to this point - or
the callback promise has lapsed - so the warning is merely informative and
the state is to be expected.

Fix this by only emitting the warning if the we still think that we have a
valid callback promise and haven't received a callback.

Also change the format slightly so so that the new data version doesn't
look like part of the text, the like is prefixed with "kAFS: " and the
message is ranked as a warning.

Fixes: 31143d5d51 ("AFS: implement basic file write support")
Reported-by: Ian Wienand <iwienand@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-20 16:49:34 +01:00
Amir Goldstein 7377f5bec1 fsnotify: get rid of fsnotify_nameremove()
For all callers of fsnotify_{unlink,rmdir}(), we made sure that d_parent
and d_name are stable.  Therefore, fsnotify_{unlink,rmdir}() do not need
the safety measures in fsnotify_nameremove() to stabilize parent and name.
We can now simplify those hooks and get rid of fsnotify_nameremove().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:47:54 +02:00
Amir Goldstein 49246466a9 fsnotify: move fsnotify_nameremove() hook out of d_delete()
d_delete() was piggy backed for the fsnotify_nameremove() hook when
in fact not all callers of d_delete() care about fsnotify events.

For all callers of d_delete() that may be interested in fsnotify events,
we made sure to call one of fsnotify_{unlink,rmdir}() hooks before
calling d_delete().

Now we can move the fsnotify_nameremove() call from d_delete() to the
fsnotify_{unlink,rmdir}() hooks.

Two explicit calls to fsnotify_nameremove() from nfs/afs sillyrename
are also removed. This will cause a change of behavior - nfs/afs will
NOT generate an fsnotify delete event when renaming over a positive
dentry.  This change is desirable, because it is consistent with the
behavior of all other filesystems.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:47:44 +02:00
Amir Goldstein 6146e78c03 configfs: call fsnotify_rmdir() hook
This will allow generating fsnotify delete events on unregister
of group/subsystem after the fsnotify_nameremove() hook is removed
from d_delete().

The rest of the d_delete() calls from this filesystem are either
called recursively from within debugfs_unregister_{group,subsystem},
called from a vfs function that already has delete hooks or are
called from shutdown/cleanup code.

Cc: Joel Becker <jlbec@evilplan.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:47:21 +02:00
Amir Goldstein 6679ea6dea debugfs: call fsnotify_{unlink,rmdir}() hooks
This will allow generating fsnotify delete events after the
fsnotify_nameremove() hook is removed from d_delete().

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:47:09 +02:00
Amir Goldstein 823e545c02 debugfs: simplify __debugfs_remove_file()
Move simple_unlink()+d_delete() from __debugfs_remove_file() into
caller __debugfs_remove() and rename helper for post remove file to
__debugfs_file_removed().

This will simplify adding fsnotify_unlink() hook.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:46:42 +02:00
Amir Goldstein fd0d506f2b devpts: call fsnotify_unlink() hook
This will allow generating fsnotify delete events after the
fsnotify_nameremove() hook is removed from d_delete().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:46:34 +02:00
Amir Goldstein 4bf2377472 tracefs: call fsnotify_{unlink,rmdir}() hooks
This will allow generating fsnotify delete events after the
fsnotify_nameremove() hook is removed from d_delete().

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:46:31 +02:00
Amir Goldstein 46008d9d3f btrfs: call fsnotify_rmdir() hook
This will allow generating fsnotify delete events after the
fsnotify_nameremove() hook is removed from d_delete().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:45:04 +02:00
Amir Goldstein 116b9731ad fsnotify: add empty fsnotify_{unlink,rmdir}() hooks
We would like to move fsnotify_nameremove() calls from d_delete()
into a higher layer where the hook makes more sense and so we can
consider every d_delete() call site individually.

Start by creating empty hook fsnotify_{unlink,rmdir}() and place
them in the proper VFS call sites.  After all d_delete() call sites
will be converted to use the new hook, the new hook will generate the
delete events and fsnotify_nameremove() hook will be removed.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-20 14:44:55 +02:00
Lianbo Jiang 4eb5fec31e fs/proc/vmcore: Enable dumping of encrypted memory when SEV was active
In the kdump kernel, the memory of the first kernel gets to be dumped
into a vmcore file.

Similarly to SME kdump, if SEV was enabled in the first kernel, the old
memory has to be remapped encrypted in order to access it properly.

Commit

  992b649a3f ("kdump, proc/vmcore: Enable kdumping encrypted memory with SME enabled")

took care of the SME case but it uses sme_active() which checks for SME
only. Use mem_encrypt_active() instead, which returns true when either
SME or SEV is active.

Unlike SME, the second kernel images (kernel and initrd) are loaded into
encrypted memory when SEV is active, hence the kernel elf header must be
remapped as encrypted in order to access it properly.

 [ bp: Massage commit message. ]

Co-developed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: bhe@redhat.com
Cc: dyoung@redhat.com
Cc: Ganesh Goudar <ganeshgr@chelsio.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: kexec@lists.infradead.org
Cc: linux-fsdevel@vger.kernel.org
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: mingo@redhat.com
Cc: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190430074421.7852-4-lijiang@redhat.com
2019-06-20 10:07:49 +02:00
Ard Biesheuvel 97a5fee2bd fs: cifs: switch to RC4 library interface
The CIFS code uses the sync skcipher API to invoke the ecb(arc4) skcipher,
of which only a single generic C code implementation exists. This means
that going through all the trouble of using scatterlists etc buys us
very little, and we're better off just invoking the arc4 library directly.

This also reverts commit 5f4b55699a ("CIFS: Fix BUG() in calc_seckey()"),
since it is no longer necessary to allocate sec_key on the heap.

Cc: linux-cifs@vger.kernel.org
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-20 14:19:55 +08:00
Colin Ian King c708b1c6de ext4: remove redundant assignment to node
Pointer 'node' is assigned a value that is never read, node is
later overwritten when it re-assigned a different value inside
the while-loop.  The assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2019-06-20 00:10:10 -04:00
Gabriel Krisman Bertazi 3ae72562ad ext4: optimize case-insensitive lookups
Temporarily cache a casefolded version of the file name under lookup in
ext4_filename, to avoid repeatedly casefolding it.  I got up to 30%
speedup on lookups of large directories (>100k entries), depending on
the length of the string under lookup.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-06-19 23:45:09 -04:00
zhangjs b03755ad6f ext4: make __ext4_get_inode_loc plug
Add a blk_plug to prevent the inode table readahead from being
submitted as small I/O requests.

Signed-off-by: zhangjs <zachary@baishancloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2019-06-19 23:41:29 -04:00
Theodore Ts'o c60990b361 ext4: clean up kerneldoc warnigns when building with W=1
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-06-19 16:30:03 -04:00
Jan Kara 936bbf3aea ext2: Always brelse bh on failure in ext2_iget()
All but one bail out paths in ext2_iget() is releasing bh. Move the
releasing of bh into a common error handling code.

Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-19 18:29:45 +02:00
Chengguang Xu edb895d3bf ext2: add missing brelse() in ext2_iget()
Add missing brelse() on error path of ext2_iget().

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-19 18:27:38 +02:00
Thomas Gleixner d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Thomas Gleixner 20c8ccb197 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499
Based on 1 normalized pattern(s):

  this work is licensed under the terms of the gnu gpl version 2 see
  the copying file in the top level directory

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 35 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.797835076@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:53 +02:00
Thomas Gleixner b6a3d1b71a treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 231
Based on 1 normalized pattern(s):

  this library is free software you can redistribute it and or modify
  it under the terms of the gnu general public license v2 as published
  by the free software foundation this library is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu lesser general public license for more details
  you should have received a copy of the gnu lesser general public
  license along with this library if not write to the free software
  foundation inc 59 temple place suite 330 boston ma 02111 1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 2 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.539286961@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:06 +02:00
Amir Goldstein c285a2f01d fanotify: update connector fsid cache on add mark
When implementing connector fsid cache, we only initialized the cache
when the first mark added to object was added by FAN_REPORT_FID group.
We forgot to update conn->fsid when the second mark is added by
FAN_REPORT_FID group to an already attached connector without fsid
cache.

Reported-and-tested-by: syzbot+c277e8e2f46414645508@syzkaller.appspotmail.com
Fixes: 77115225ac ("fanotify: cache fsid in fsnotify_mark_connector")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-19 15:53:58 +02:00
yangerkun c6d9c35d16 quota: fix a problem about transfer quota
Run below script as root, dquot_add_space will return -EDQUOT since
__dquot_transfer call dquot_add_space with flags=0, and dquot_add_space
think it's a preallocation. Fix it by set flags as DQUOT_SPACE_WARN.

mkfs.ext4 -O quota,project /dev/vdb
mount -o prjquota /dev/vdb /mnt
setquota -P 23 1 1 0 0 /dev/vdb
dd if=/dev/zero of=/mnt/test-file bs=4K count=1
chattr -p 23 test-file

Fixes: 7b9ca4c61b ("quota: Reduce contention on dq_data_lock")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-19 15:25:41 +02:00
Amir Goldstein 387e3746d0 locks: eliminate false positive conflicts for write lease
check_conflicting_open() is checking for existing fd's open for read or
for write before allowing to take a write lease.  The check that was
implemented using i_count and d_count is an approximation that has
several false positives.  For example, overlayfs since v4.19, takes an
extra reference on the dentry; An open with O_PATH takes a reference on
the dentry although the file cannot be read nor written.

Change the implementation to use i_readcount and i_writecount to
eliminate the false positive conflicts and allow a write lease to be
taken on an overlayfs file.

The change of behavior with existing fd's open with O_PATH is symmetric
w.r.t. current behavior of lease breakers - an open with O_PATH currently
does not break a write lease.

This increases the size of struct inode by 4 bytes on 32bit archs when
CONFIG_FILE_LOCKING is defined and CONFIG_IMA was not already
defined.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2019-06-19 08:49:38 -04:00
Ira Weiny d51f527f44 locks: Add trace_leases_conflict
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2019-06-19 08:49:37 -04:00
Amir Goldstein 6dde1e42f4 ovl: make i_ino consistent with st_ino in more cases
Relax the condition that overlayfs supports nfs export, to require
that i_ino is consistent with st_ino/d_ino.

It is enough to require that st_ino and d_ino are consistent.

This fixes the failure of xfstest generic/504, due to mismatch of
st_ino to inode number in the output of /proc/locks.

Fixes: 12574a9f4c ("ovl: consistent i_ino for non-samefs with xino")
Cc: <stable@vger.kernel.org> # v4.19
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-19 09:04:19 +02:00
YueHaibing c036061be9 ecryptfs: Make ecryptfs_xattr_handler static
Fix sparse warning:

fs/ecryptfs/inode.c:1138:28: warning:
 symbol 'ecryptfs_xattr_handler' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2019-06-19 05:53:49 +00:00
YueHaibing 29a51df060 ecryptfs: remove unnessesary null check in ecryptfs_keyring_auth_tok_for_sig
request_key and ecryptfs_get_encrypted_key never
return a NULL pointer, so no need do a null check.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2019-06-19 05:53:44 +00:00
Sascha Hauer 96827c3044 ecryptfs: use print_hex_dump_bytes for hexdump
The Kernel has nice hexdump facilities, use them rather a homebrew
hexdump function.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2019-06-19 05:53:37 +00:00
Linus Torvalds bed3c0d84e for-5.2-rc5-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl0JE18ACgkQxWXV+ddt
 WDskyw//fi4r3/D2lz7ZKHRz00o8h9gDyBw50y4FHXN44GrAWgM0dbi1l0VXRquy
 A9Xy4rBoquDdUKkSIvMr3ff33eK+Y2O9oUBeXMcb4tfg8TicAMdyTHduDwka1Ljg
 TXcupG8cWd7Boi9FfeSDDPQpV+NXxzL9VSy7uTMjMmmxWYWViMJo38ultsxUhoOY
 ZVNY8LNT/F6i/4Us9D5ymzqn6uQWzfu2GXZT2I3Pq5Ps3PBSc7OJVhrPYE9FQ/jv
 ptirddkoeOo6xQJ1Pb/UMPjkTZ5ct2Wy/lAvQPXiWf9FjjwR7zuSL1Xe3wzpUg/y
 llENXp3Ps+oMzTF1XKid43yHlt0Swqzr+EIiNvLbSj5E5o5msKZ+jXQYHV8LHZfW
 I12uizChQc3N11SrwC+gVou5oAMhTnJmHjlq96vWXaw0lcvX+yHMWP3w16OGM3Pc
 9aN0ap6SgRBM6XZkXR65Rf4sIAnCm12hDrVdHNPJhz95W6PQZkPhMS0FWjrOAUYy
 yqhrMqtY4ELNRBBBXJ4dq+k+l/I6lHkPbhPXVt0VekXdyKPr4o5WZLI1k3C2S14L
 wXrqw6wTU4pgaD6LCqQ7NFtCZF3Zz+8L+MHbbK1LLlMFUcMFs/+cRrg7EKipOxxn
 mm/mjOKD5lGUJPGKuFb87rCcgAOXcOWnF+FoR/FNf6HVEJJJOuY=
 =uzhr
 -----END PGP SIGNATURE-----

Merge tag 'for-5.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - regression where properties stored as xattrs are not properly
   persisted

 - a small readahead fix (the fstests testcase for that fix hangs on
   unpatched kernel, so we'd like get it merged to ease future testing)

 - fix a race during block group creation and deletion

* tag 'for-5.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix failure to persist compression property xattr deletion on fsync
  btrfs: start readahead also in seed devices
  Btrfs: fix race between block group removal and block group allocation
2019-06-18 11:20:24 -07:00
Nicolas Schier 253e748339 ovl: fix typo in MODULE_PARM_DESC
Change first argument to MODULE_PARM_DESC() calls, that each of them
matched the actual module parameter name.  The matching results in
changing (the 'parm' section from) the output of `modinfo overlay` from:

    parm: ovl_check_copy_up:Obsolete; does nothing
    parm: redirect_max:ushort
    parm: ovl_redirect_max:Maximum length of absolute redirect xattr value
    parm: redirect_dir:bool
    parm: ovl_redirect_dir_def:Default to on or off for the redirect_dir feature
    parm: redirect_always_follow:bool
    parm: ovl_redirect_always_follow:Follow redirects even if redirect_dir feature is turned off
    parm: index:bool
    parm: ovl_index_def:Default to on or off for the inodes index feature
    parm: nfs_export:bool
    parm: ovl_nfs_export_def:Default to on or off for the NFS export feature
    parm: xino_auto:bool
    parm: ovl_xino_auto_def:Auto enable xino feature
    parm: metacopy:bool
    parm: ovl_metacopy_def:Default to on or off for the metadata only copy up feature

into:

    parm: check_copy_up:Obsolete; does nothing
    parm: redirect_max:Maximum length of absolute redirect xattr value (ushort)
    parm: redirect_dir:Default to on or off for the redirect_dir feature (bool)
    parm: redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool)
    parm: index:Default to on or off for the inodes index feature (bool)
    parm: nfs_export:Default to on or off for the NFS export feature (bool)
    parm: xino_auto:Auto enable xino feature (bool)
    parm: metacopy:Default to on or off for the metadata only copy up feature (bool)

Signed-off-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-18 15:06:16 +02:00
Arnd Bergmann 1dac6f5b0e ovl: fix bogus -Wmaybe-unitialized warning
gcc gets a bit confused by the logic in ovl_setup_trap() and
can't figure out whether the local 'trap' variable in the caller
was initialized or not:

fs/overlayfs/super.c: In function 'ovl_fill_super':
fs/overlayfs/super.c:1333:4: error: 'trap' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    iput(trap);
    ^~~~~~~~~~
fs/overlayfs/super.c:1312:17: note: 'trap' was declared here

Reword slightly to make it easier for the compiler to understand.

Fixes: 146d62e5a5 ("ovl: detect overlapping layers")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-18 15:06:16 +02:00
Miklos Szeredi 9179c21dc6 ovl: don't fail with disconnected lower NFS
NFS mounts can be disconnected from fs root.  Don't fail the overlapping
layer check because of this.

The check is not authoritative anyway, since topology can change during or
after the check.

Reported-by: Antti Antinoja <antti@fennosys.fi> 
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 146d62e5a5 ("ovl: detect overlapping layers")
2019-06-18 15:06:16 +02:00
David S. Miller 13091aa305 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Honestly all the conflicts were simple overlapping changes,
nothing really interesting to report.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-17 20:20:36 -07:00
Christian Brauner d728cf7916 fs/namespace: fix unprivileged mount propagation
When propagating mounts across mount namespaces owned by different user
namespaces it is not possible anymore to move or umount the mount in the
less privileged mount namespace.

Here is a reproducer:

  sudo mount -t tmpfs tmpfs /mnt
  sudo --make-rshared /mnt

  # create unprivileged user + mount namespace and preserve propagation
  unshare -U -m --map-root --propagation=unchanged

  # now change back to the original mount namespace in another terminal:
  sudo mkdir /mnt/aaa
  sudo mount -t tmpfs tmpfs /mnt/aaa

  # now in the unprivileged user + mount namespace
  mount --move /mnt/aaa /opt

Unfortunately, this is a pretty big deal for userspace since this is
e.g. used to inject mounts into running unprivileged containers.
So this regression really needs to go away rather quickly.

The problem is that a recent change falsely locked the root of the newly
added mounts by setting MNT_LOCKED. Fix this by only locking the mounts
on copy_mnt_ns() and not when adding a new mount.

Fixes: 3bd045cc9c ("separate copying and locking mount tree on cross-userns copies")
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Tested-by: Christian Brauner <christian@brauner.io>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-17 17:36:09 -04:00
Eric Biggers 1b0b9cc8d3 vfs: fsmount: add missing mntget()
sys_fsmount() needs to take a reference to the new mount when adding it
to the anonymous mount namespace.  Otherwise the filesystem can be
unmounted while it's still in use, as found by syzkaller.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: syzbot+99de05d099a170867f22@syzkaller.appspotmail.com
Reported-by: syzbot+7008b8b8ba7df475fdc8@syzkaller.appspotmail.com
Fixes: 93766fbd26 ("vfs: syscall: Add fsmount() to create a mount for a superblock")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-17 17:36:07 -04:00
Ronnie Sahlberg 61cabc7b0a cifs: fix GlobalMid_Lock bug in cifs_reconnect
We can not hold the GlobalMid_Lock spinlock during the
dfs processing in cifs_reconnect since it invokes things that may sleep
and thus trigger :

BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:23

Thus we need to drop the spinlock during this code block.

RHBZ: 1716743

Cc: stable@vger.kernel.org
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-06-17 16:27:02 -05:00
Steve French 8d526d62db SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write
Some servers such as Windows 10 will return STATUS_INSUFFICIENT_RESOURCES
as the number of simultaneous SMB3 requests grows (even though the client
has sufficient credits).  Return EAGAIN on STATUS_INSUFFICIENT_RESOURCES
so that we can retry writes which fail with this status code.

This (for example) fixes large file copies to Windows 10 on fast networks.

Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-17 16:17:56 -05:00
Christoph Hellwig ff896738be block: return from __bio_try_merge_page if merging occured in the same page
We currently have an input same_page parameter to __bio_try_merge_page
to prohibit merging in the same page.  The rationale for that is that
some callers need to account for every page added to a bio.  Instead of
letting these callers call twice into the merge code to account for the
new vs existing page cases, just turn the paramter into an output one that
returns if a merge in the same page occured and let them act accordingly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-17 09:33:02 -06:00
Filipe Manana 3763771cf6 Btrfs: fix failure to persist compression property xattr deletion on fsync
After the recent series of cleanups in the properties and xattrs modules
that landed in the 5.2 merge window, we ended up with a regression where
after deleting the compression xattr property through the setflags ioctl,
we don't set the BTRFS_INODE_COPY_EVERYTHING flag in the inode anymore.
As a consequence, if the inode was fsync'ed when it had the compression
property set, after deleting the compression property through the setflags
ioctl and fsync'ing again the inode, the log will still contain the
compression xattr, because the inode did not had that bit set, which
made the fsync not delete all xattrs from the log and copy all xattrs
from the subvolume tree to the log tree.

This regression happens due to the fact that that series of cleanups
made btrfs_set_prop() call the old function do_setxattr() (which is now
named btrfs_setxattr()), and not the old version of btrfs_setxattr(),
which is now called btrfs_setxattr_trans().

Fix this by setting the BTRFS_INODE_COPY_EVERYTHING bit in the current
btrfs_setxattr() function and remove it from everywhere else, including
its setup at btrfs_ioctl_setflags(). This is cleaner, avoids similar
regressions in the future, and centralizes the setup of the bit. After
all, the need to setup this bit should only be in the xattrs module,
since it is an implementation of xattrs.

Fixes: 04e6863b19 ("btrfs: split btrfs_setxattr calls regarding transaction")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-17 16:37:17 +02:00
Ingo Molnar bddb363673 Merge branch 'x86/cpu' into perf/core, to pick up dependent changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:29:16 +02:00
Ingo Molnar 23da766ab1 Linux 5.2-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Gj1MeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGctkH/0At3+SQPY2JJSy8
 i6+TDeytFx9OggeGLPHChRfehkAlvMb/kd34QHnuEvDqUuCAMU6HZQJFKoK9mvFI
 sDJVayPGDSqpm+iv8qLpMBPShiCXYVnGZeVfOdv36jUswL0k6wHV1pz4avFkDeZa
 1F4pmI6O2XRkNTYQawbUaFkAngWUCBG9ECLnHJnuIY6ohShBvjI4+E2JUaht+8gO
 M2h2b9ieddWmjxV3LTKgsK1v+347RljxdZTWnJ62SCDSEVZvsgSA9W2wnebVhBkJ
 drSmrFLxNiM+W45mkbUFmQixRSmjv++oRR096fxAnodBxMw0TDxE1RiMQWE6rVvG
 N6MC6xA=
 =+B0P
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc5' into sched/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:12:27 +02:00
Nikolay Borisov 9ffbe8ac05 locking/lockdep: Rename lockdep_assert_held_exclusive() -> lockdep_assert_held_write()
All callers of lockdep_assert_held_exclusive() use it to verify the
correct locking state of either a semaphore (ldisc_sem in tty,
mmap_sem for perf events, i_rwsem of inode for dax) or rwlock by
apparmor. Thus it makes sense to rename _exclusive to _write since
that's the semantics callers care. Additionally there is already
lockdep_assert_held_read(), which this new naming is more consistent with.

No functional changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190531100651.3969-1-nborisov@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:09:24 +02:00
Tejun Heo 6631142229 blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration
wbc_account_io() collects information on cgroup ownership of writeback
pages to determine which cgroup should own the inode.  Pages can stay
associated with dead memcgs but we want to avoid attributing IOs to
dead blkcgs as much as possible as the association is likely to be
stale.  However, currently, pages associated with dead memcgs
contribute to the accounting delaying and/or confusing the
arbitration.

Fix it by ignoring pages associated with dead memcgs.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 10:39:42 -06:00
Linus Torvalds 4066524401 Fix rounding error in gfs2_iomap_page_prepare
-----BEGIN PGP SIGNATURE-----
 
 iQIbBAABAgAGBQJdA9AdAAoJENW/n+sDE2U67xgP+PnBqsdJpFVfodQyu+AEHdxJ
 et2nZmcsQGRXj5iha7lWI6Mu0x/7q9FOCS1I0aZDwcHn1A0R5R/xulKcnl6ySjrp
 bd6GnehP1KicufykAPr5lGHmiIIp/SdIDe7z6XUkV1Eksl6ls8WTWi+JEsP7uDjw
 TGQ6GTOziRUKHshuEkvvYfA3OhbebNzP1FjgPr6keg7PFeLAaiUWKxlTYj9Vov8F
 4YnOsdfFEFpiJy8w5N/Ggfmp1fbY+uyzoS70Xe+y7XTdkX57fPnzlCxLsji7PgAp
 AN38cpUHib46Ng8YyGfdB2wzJ5BAauYLy7zYGLiNdHXFvwkm4bP2CyeALyhG/Jg7
 HOBJ1owekaBV/o96NgyuJTfv65zdV0su7s24YBVJlY+aUOkZWJb2N4quyv/WyiTY
 Ds1xR4Uql6WOBt6fb4Ady+4lD0j01k5YMk56fcmHlykldVYXXeJXksS/0MxKBR5T
 qGWPHpldVXK6TZhRo8SkS/H2h6CdqevwxECa2YMrnWFVaioTo5jBtO6l5Nn7z0Ni
 qaELyzFdhDlnq8ywvmo1zBPT2NjF4m0XQkEiy5P1RvkxSq500zvOEVVCx8vXICY7
 TwUpCSRP5Tu9P9LurvgTf/Pltsf0pApKyYRIC4hu8/6jNxum1Rq0JZW2JepNDSJH
 qnQPowGLnhAJDWIfIPU=
 =Z7P5
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v5.2.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:
 "Fix rounding error in gfs2_iomap_page_prepare"

* tag 'gfs2-v5.2.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Fix rounding error in gfs2_iomap_page_prepare
2019-06-14 17:27:12 -10:00
Linus Torvalds 7b10315128 for-linus-20190614
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl0DzeIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgppgoD/wOQ4LaeFvnRZkOQ+1+64xot3okN2BvFFAB
 HN+yc8rS2EvHYMG/2PRpLJpGPzTdu83m7Erc5SHBOQVTFqbADxmjRlVtNezOrefk
 TWcpsNw2z4Fc7cRQ1u6CNZxqjTVF2QqQvC4bv3fNrZYgckhI4FuiIjJ4pxGI0o9N
 HjrGLmp9hdpr20geMomfKSYx9XWUyZs7SxIr8oOFiZYF+K5goO4K8fRsdDIpFGYD
 xvzwkohf0xeI1kdfehII+yanlUrdYdeSShLgGl7F/qbiupyKhkxSjvmH8tOShFYa
 qSQFRhacjKy84VTOA4MQjkpcTaJfsDqINIPuU8IR+fhTqFbQW8kq9ovlhW2GeVZ/
 63b8mWD0XTq7+9vnNeJid8MZ3LqmcKy5qbLWYnnh927sdzjc1Y/RgPWQS4pBllaT
 m1rtRMBS0OBcHU0ofsshzL0UZshVh+M+mrKBp+tctIUVmF0bDRqNOJqatihzgNL1
 hI/9GY0KbBJQdHJD1iUgw+egikw/GQfoUcNG0IVbHHIREqy+0u/mBSmNzVw/hWeS
 jynnsuaAOSrwWeaIeHBGYQFUeQrjkgS1brTYIsXcG5wIcX99uFrp41kuTJGKD2fR
 jxkU3EIGxyLQL25T4PaxFqJQrfh/2rAia3Jq99lao5/rLNUdgX/2Ysv+hiw8M/Y7
 N+DyskreNQ==
 =6C8e
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190614' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Remove references to old schedulers for the scheduler switching and
   blkio controller documentation (Andreas)

 - Kill duplicate check for report zone for null_blk (Chaitanya)

 - Two bcache fixes (Coly)

 - Ensure that mq-deadline is selected if zoned block device is enabled,
   as we need that to support them (Damien)

 - Fix io_uring memory leak (Eric)

 - ps3vram fallout from LBDAF removal (Geert)

 - Redundant blk-mq debugfs debugfs_create return check cleanup (Greg)

 - Extend NOPLM quirk for ST1000LM024 drives (Hans)

 - Remove error path warning that can now trigger after the queue
   removal/addition fixes (Ming)

* tag 'for-linus-20190614' of git://git.kernel.dk/linux-block:
  block/ps3vram: Use %llu to format sector_t after LBDAF removal
  libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk
  bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached
  bcache: fix stack corruption by PRECEDING_KEY()
  blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests
  blkio-controller.txt: Remove references to CFQ
  block/switching-sched.txt: Update to blk-mq schedulers
  null_blk: remove duplicate check for report zone
  blk-mq: no need to check return value of debugfs_create functions
  io_uring: fix memory leak of UNIX domain socket inode
  block: force select mq-deadline for zoned block devices
2019-06-14 15:41:18 -10:00
Andreas Gruenbacher 2741b6723b gfs2: Fix rounding error in gfs2_iomap_page_prepare
The pos and len arguments to the iomap page_prepare callback are not
block aligned, so we need to take that into account when computing the
number of blocks.

Fixes: d0a22a4b03 ("gfs2: Fix iomap write page reclaim deadlock")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-14 18:49:07 +02:00
Naohiro Aota c4e0540d0a btrfs: start readahead also in seed devices
Currently, btrfs does not consult seed devices to start readahead. As a
result, if readahead zone is added to the seed devices, btrfs_reada_wait()
indefinitely wait for the reada_ctl to finish.

You can reproduce the hung by modifying btrfs/163 to have larger initial
file size (e.g. xfs_io pwrite 4M instead of current 256K).

Fixes: 7414a03fbf ("btrfs: initial readahead code and prototypes")
Cc: stable@vger.kernel.org # 3.2+: ce7791ffee1e: Btrfs: fix race between readahead and device replace/removal
Cc: stable@vger.kernel.org # 3.2+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-14 17:33:46 +02:00
Wengang Wang be99ca2716 fs/ocfs2: fix race in ocfs2_dentry_attach_lock()
ocfs2_dentry_attach_lock() can be executed in parallel threads against the
same dentry.  Make that race safe.  The race is like this:

            thread A                               thread B

(A1) enter ocfs2_dentry_attach_lock,
seeing dentry->d_fsdata is NULL,
and no alias found by
ocfs2_find_local_alias, so kmalloc
a new ocfs2_dentry_lock structure
to local variable "dl", dl1

               .....

                                    (B1) enter ocfs2_dentry_attach_lock,
                                    seeing dentry->d_fsdata is NULL,
                                    and no alias found by
                                    ocfs2_find_local_alias so kmalloc
                                    a new ocfs2_dentry_lock structure
                                    to local variable "dl", dl2.

                                                   ......

(A2) set dentry->d_fsdata with dl1,
call ocfs2_dentry_lock() and increase
dl1->dl_lockres.l_ro_holders to 1 on
success.
              ......

                                    (B2) set dentry->d_fsdata with dl2
                                    call ocfs2_dentry_lock() and increase
				    dl2->dl_lockres.l_ro_holders to 1 on
				    success.

                                                  ......

(A3) call ocfs2_dentry_unlock()
and decrease
dl2->dl_lockres.l_ro_holders to 0
on success.
             ....

                                    (B3) call ocfs2_dentry_unlock(),
                                    decreasing
				    dl2->dl_lockres.l_ro_holders, but
				    see it's zero now, panic

Link: http://lkml.kernel.org/r/20190529174636.22364-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reported-by: Daniel Sobe <daniel.sobe@nxp.com>
Tested-by: Daniel Sobe <daniel.sobe@nxp.com>
Reviewed-by: Changwei Ge <gechangwei@live.cn>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13 17:34:56 -10:00
Ronnie Sahlberg 487317c994 cifs: add spinlock for the openFileList to cifsInodeInfo
We can not depend on the tcon->open_file_lock here since in multiuser mode
we may have the same file/inode open via multiple different tcons.

The current code is race prone and will crash if one user deletes a file
at the same time a different user opens/create the file.

To avoid this we need to have a spinlock attached to the inode and not the tcon.

RHBZ:  1580165

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-13 14:21:09 -05:00
Ronnie Sahlberg 0ff2b018b0 cifs: fix panic in smb2_reconnect
RH Bugzilla: 1702264

We need to protect so that the call to smb2_reconnect() in
smb2_reconnect_server() does not end up freeing the session
because it can lead to a use after free and crash.

Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-13 14:20:57 -05:00
Kimberly Brown dad4afe746 f2fs: replace ktype default_attrs with default_groups
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace the default_attrs fields in f2fs_sb_ktype
and f2fs_feat_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro
to create f2fs_groups and f2fs_feat_groups.

Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-13 13:50:22 +02:00
Kimberly Brown c9c5b5e156 dlm: Replace default_attrs in dlm_ktype with default_groups
The kobj_type default_attrs field is being replaced by the
default_groups field, so replace the default_attrs field in dlm_ktype
with default_groups. Use the ATTRIBUTE_GROUPS macro to create
dlm_groups.

Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-13 13:50:22 +02:00
Kimberly Brown 59137a93f3 ext4: replace ktype default_attrs with default_groups
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace the default_attrs field in ext4_sb_ktype
and ext4_feat_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro
to create ext4_groups and ext4_feat_groups.

Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-13 13:50:22 +02:00
Kimberly Brown ef254d13f1 gfs2: replace ktype default_attrs with default_groups
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace the default_attrs field in gfs2_ktype
with default_groups. Use the ATTRIBUTE_GROUPS macro to create
gfs2_groups.

Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-13 13:50:21 +02:00
Eric Biggers 355e8d26f7 io_uring: fix memory leak of UNIX domain socket inode
Opening and closing an io_uring instance leaks a UNIX domain socket
inode.  This is because the ->file of the io_uring instance's internal
UNIX domain socket is set to point to the io_uring file, but then
sock_release() sees the non-NULL ->file and assumes the inode reference
is held by the file so doesn't call iput().  That's not the case here,
since the reference is still meant to be held by the socket; the actual
inode of the io_uring file is different.

Fix this leak by NULL-ing out ->file before releasing the socket.

Reported-by: syzbot+111cb28d9f583693aefa@syzkaller.appspotmail.com
Fixes: 2b188cc1bb ("Add io_uring IO interface")
Cc: <stable@vger.kernel.org> # v5.1+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13 03:00:30 -06:00
Eric Sandeen f5b999c03f xfs: remove unused flag arguments
There are several functions which take a flag argument that is
only ever passed as "0," so remove these arguments.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 09:00:00 -07:00
Christoph Hellwig 76dee76921 xfs: remove the debug-only q_transp field from struct xfs_dquot
The field is only used for a few assertations.  Shrink the dqout
structure instead, similarly to what commit f3ca87389d
("xfs: remove i_transp") did for the xfs_inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:59:59 -07:00
Christoph Hellwig f9a196ee5a xfs: merge xfs_buf_zero and xfs_buf_iomove
xfs_buf_zero is the only caller of xfs_buf_iomove.  Remove support
for copying from or to the buffer in xfs_buf_iomove and merge the
two functions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:59:59 -07:00
Eric Sandeen 8c9ce2f707 xfs: remove unused flags arg from getsb interfaces
The flags value is always passed as 0 so remove the argument.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:59:58 -07:00
Eric Sandeen d03a2f1b9f xfs: include WARN, REPAIR build options in XFS_BUILD_OPTIONS
The XFS_BUILD_OPTIONS string, shown at module init time and
in modinfo output, does not currently include all available
build options.  So, add in CONFIG_XFS_WARN and CONFIG_XFS_REPAIR.

It has been suggested in some quarters
That this is not enough.
Well ...

Anybody who would like to see this in a sysfs file can send
a patch.  :)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong 4b4d98cca3 xfs: finish converting to inodes_per_cluster
Finish converting all the old inode_cluster_size >> inopblog users to
inodes_per_cluster.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong 490d451fa5 xfs: fix inode_cluster_size rounding mayhem
inode_cluster_size is supposed to represent the size (in bytes) of an
inode cluster buffer.  We avoid having to handle multiple clusters per
filesystem block on filesystems with large blocks by openly rounding
this value up to 1 FSB when necessary.  However, we never reset
inode_cluster_size to reflect this new rounded value, which adds to the
potential for mistakes in calculating geometries.

Fix this by setting inode_cluster_size to reflect the rounded-up size if
needed, and special-case the few places in the sparse inodes code where
we actually need the smaller value to validate on-disk metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong 494dba7b27 xfs: refactor inode geometry setup routines
Migrate all of the inode geometry setup code from xfs_mount.c into a
single libxfs function that we can share with xfsprogs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Darrick J. Wong ef32595999 xfs: separate inode geometry
Separate the inode geometry information into a distinct structure.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-06-12 08:37:40 -07:00
Filipe Manana 8eaf40c0e2 Btrfs: fix race between block group removal and block group allocation
If a task is removing the block group that currently has the highest start
offset amongst all existing block groups, there is a short time window
where it races with a concurrent block group allocation, resulting in a
transaction abort with an error code of EEXIST.

The following diagram explains the race in detail:

      Task A                                                        Task B

 btrfs_remove_block_group(bg offset X)

   remove_extent_mapping(em offset X)
     -> removes extent map X from the
        tree of extent maps
        (fs_info->mapping_tree), so the
        next call to find_next_chunk()
        will return offset X

                                                   btrfs_alloc_chunk()
                                                     find_next_chunk()
                                                       --> returns offset X

                                                     __btrfs_alloc_chunk(offset X)
                                                       btrfs_make_block_group()
                                                         btrfs_create_block_group_cache()
                                                           --> creates btrfs_block_group_cache
                                                               object with a key corresponding
                                                               to the block group item in the
                                                               extent, the key is:
                                                               (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)

                                                         --> adds the btrfs_block_group_cache object
                                                             to the list new_bgs of the transaction
                                                             handle

                                                   btrfs_end_transaction(trans handle)
                                                     __btrfs_end_transaction()
                                                       btrfs_create_pending_block_groups()
                                                         --> sees the new btrfs_block_group_cache
                                                             in the new_bgs list of the transaction
                                                             handle
                                                         --> its call to btrfs_insert_item() fails
                                                             with -EEXIST when attempting to insert
                                                             the block group item key
                                                             (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
                                                             because task A has not removed that key yet
                                                         --> aborts the running transaction with
                                                             error -EEXIST

   btrfs_del_item()
     -> removes the block group's key from
        the extent tree, key is
        (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)

A sample transaction abort trace:

  [78912.403537] ------------[ cut here ]------------
  [78912.403811] BTRFS: Transaction aborted (error -17)
  [78912.404082] WARNING: CPU: 2 PID: 20465 at fs/btrfs/extent-tree.c:10551 btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
  (...)
  [78912.405642] CPU: 2 PID: 20465 Comm: btrfs Tainted: G        W         5.0.0-btrfs-next-46 #1
  [78912.405941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
  [78912.406586] RIP: 0010:btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
  (...)
  [78912.407636] RSP: 0018:ffff9d3d4b7e3b08 EFLAGS: 00010282
  [78912.407997] RAX: 0000000000000000 RBX: ffff90959a3796f0 RCX: 0000000000000006
  [78912.408369] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff909636b16860
  [78912.408746] RBP: ffff909626758a58 R08: 0000000000000000 R09: 0000000000000000
  [78912.409144] R10: ffff9095ff462400 R11: 0000000000000000 R12: ffff90959a379588
  [78912.409521] R13: ffff909626758ab0 R14: ffff9095036c0000 R15: ffff9095299e1158
  [78912.409899] FS:  00007f387f16f700(0000) GS:ffff909636b00000(0000) knlGS:0000000000000000
  [78912.410285] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [78912.410673] CR2: 00007f429fc87cbc CR3: 000000014440a004 CR4: 00000000003606e0
  [78912.411095] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [78912.411496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [78912.411898] Call Trace:
  [78912.412318]  __btrfs_end_transaction+0x5b/0x1c0 [btrfs]
  [78912.412746]  btrfs_inc_block_group_ro+0xcf/0x160 [btrfs]
  [78912.413179]  scrub_enumerate_chunks+0x188/0x5b0 [btrfs]
  [78912.413622]  ? __mutex_unlock_slowpath+0x100/0x2a0
  [78912.414078]  btrfs_scrub_dev+0x2ef/0x720 [btrfs]
  [78912.414535]  ? __sb_start_write+0xd4/0x1c0
  [78912.414963]  ? mnt_want_write_file+0x24/0x50
  [78912.415403]  btrfs_ioctl+0x17fb/0x3120 [btrfs]
  [78912.415832]  ? lock_acquire+0xa6/0x190
  [78912.416256]  ? do_vfs_ioctl+0xa2/0x6f0
  [78912.416685]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
  [78912.417116]  do_vfs_ioctl+0xa2/0x6f0
  [78912.417534]  ? __fget+0x113/0x200
  [78912.417954]  ksys_ioctl+0x70/0x80
  [78912.418369]  __x64_sys_ioctl+0x16/0x20
  [78912.418812]  do_syscall_64+0x60/0x1b0
  [78912.419231]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [78912.419644] RIP: 0033:0x7f3880252dd7
  (...)
  [78912.420957] RSP: 002b:00007f387f16ed68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [78912.421426] RAX: ffffffffffffffda RBX: 000055f5becc1df0 RCX: 00007f3880252dd7
  [78912.421889] RDX: 000055f5becc1df0 RSI: 00000000c400941b RDI: 0000000000000003
  [78912.422354] RBP: 0000000000000000 R08: 00007f387f16f700 R09: 0000000000000000
  [78912.422790] R10: 00007f387f16f700 R11: 0000000000000246 R12: 0000000000000000
  [78912.423202] R13: 00007ffda49c266f R14: 0000000000000000 R15: 00007f388145e040
  [78912.425505] ---[ end trace eb9bfe7c426fc4d3 ]---

Fix this by calling remove_extent_mapping(), at btrfs_remove_block_group(),
only at the very end, after removing the block group item key from the
extent tree (and removing the free space tree entry if we are using the
free space tree feature).

Fixes: 04216820fe ("Btrfs: fix race between fs trimming and block group remove/allocation")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-12 15:55:42 +02:00
Fumiya Shigemitsu fdbd3e8c9f ext2: Fix a typo in ext2_getattr argument
Fix a typo in a ext2_getattr argument

Signed-off-by: Fumiya Shigemitsu <shfy1014@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-12 15:10:41 +02:00
Chengguang Xu 1fe0341544 ext2: fix a typo in comment
Just fix a typo in comment and remove redundant blank line
in ext2_data_block_valid().

Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-06-12 15:08:02 +02:00
Aubrey Li 68bc30bb9f proc: Add /proc/<pid>/arch_status
Exposing architecture specific per process information is useful for
various reasons. An example is the AVX512 usage on x86 which is important
for task placement for power/performance optimizations.

Adding this information to the existing /prcc/pid/status file would be the
obvious choise, but it has been agreed on that a explicit arch_status file
is better in separating the generic and architecture specific information.

[ tglx: Massage changelog ]

Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: peterz@infradead.org
Cc: hpa@zytor.com
Cc: ak@linux.intel.com
Cc: tim.c.chen@linux.intel.com
Cc: dave.hansen@intel.com
Cc: arjan@linux.intel.com
Cc: adobriyan@gmail.com
Cc: aubrey.li@intel.com
Cc: linux-api@vger.kernel.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Linux API <linux-api@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190606012236.9391-1-aubrey.li@linux.intel.com
2019-06-12 11:42:13 +02:00
Linus Torvalds 6fa425a265 for-5.2-rc4-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlz/z0wACgkQxWXV+ddt
 WDt/4g//eS+USsVpHV65AoOC+LWHQFvKHVNX97l53yy7GvINxgCO5+X7MctvCP1P
 cEhhyGhapTFWinOj2zoMSz+mk1//wkqPsC9KzN+ha1fx0ktj+IKms+xbh3kFsygq
 dbqLMBHobGpCVV1z7kg+YN0OnG3W90qv/5TqeYhPADBEzEDffXuj+5qEul88J47h
 n4GP309mJcJwO1SmYOMFTchZDKJJnFMejr+KS+hOvzh3i5C6ZVOLeEs2yksdFvUi
 X9zeM9sbzshPonzRQVR9xazzW3JKP69rgkz+fo5TLnKqYwXiGN9ObCCjtKm/rck6
 pkJbvSmqV6QpX/pBUwdI/8DjgQyrPlfVyVcv5lU960mwya0eCkJYTa95bMC7N4UJ
 NNfROq9aJHKmct/rHECLsOwRTy2KuAqpX7+Ktjy6Sarw9bvlPwpgBep7PvYR9DXf
 mC9QHcRTYDCoKR5SF5xzB5lQHVOfWe6dduCugW8BhvO8t3ty+IW8p9WfcNXkTVu8
 SQWuxF2Y9uOEuTJMmyw6gbl1KRppg+U95r7vpKKbPFXMrsJDcGy2eUhMHfChwnkO
 brI2scsAaJg+UMvTjjO/8DVw4qpR9UDZaDgPqsGcFCNIQv65bPlL/UnQD112M2Ba
 w9FsvwbyI1AgUC0JsslLoHQVcOqHl2DnxFyvtbFTy0zcJK8fPxY=
 =UByh
 -----END PGP SIGNATURE-----

Merge tag 'for-5.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fix from David Sterba:
 "One regression fix to TRIM ioctl.

  The range cannot be used as its meaning can be confusing regarding
  physical and logical addresses. This confusion in code led to
  potential corruptions when the range overlapped data.

  The original patch made it to several stable kernels and was promptly
  reverted, the version for master branch is different due to additional
  changes but the change is effectively the same"

* tag 'for-5.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: Always trim all unallocated space in btrfs_trim_free_extents
2019-06-11 15:10:15 -10:00
Amir Goldstein 941d935ac7 ovl: fix wrong flags check in FS_IOC_FS[SG]ETXATTR ioctls
The ioctl argument was parsed as the wrong type.

Fixes: b21d9c435f ("ovl: support the FS_IOC_FS[SG]ETXATTR ioctls")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-11 17:17:41 +02:00
Miklos Szeredi 766741fcaa Revert "fuse: require /dev/fuse reads to have enough buffer capacity"
This reverts commit d4b13963f2.

The commit introduced a regression in glusterfs-fuse.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-06-11 13:35:22 +02:00
Eric Biggers 0bb06cac06 fscrypt: remove unnecessary includes of ratelimit.h
These should have been removed during commit 544d08fde2 ("fscrypt: use
a common logging function"), but I missed them.

Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-06-10 19:01:33 -07:00
Wang Shilong 7ddf79a103 ext4: only set project inherit bit for directory
It doesn't make any sense to have project inherit bits
for regular files, even though this won't cause any
problem, but it is better fix this.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
2019-06-10 00:13:32 -04:00
Theodore Ts'o 02b016ca7f ext4: enforce the immutable flag on open files
According to the chattr man page, "a file with the 'i' attribute
cannot be modified..."  Historically, this was only enforced when the
file was opened, per the rest of the description, "... and the file
can not be opened in write mode".

There is general agreement that we should standardize all file systems
to prevent modifications even for files that were opened at the time
the immutable flag is set.  Eventually, a change to enforce this at
the VFS layer should be landing in mainline.  Until then, enforce this
at the ext4 level to prevent xfstests generic/553 from failing.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: stable@kernel.org
2019-06-09 22:04:33 -04:00
Darrick J. Wong 2e53840362 ext4: don't allow any modifications to an immutable file
Don't allow any modifications to a file that's marked immutable, which
means that we have to flush all the writable pages to make the readonly
and we have to check the setattr/setflags parameters more closely.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2019-06-09 21:41:41 -04:00
Amir Goldstein fe0da9c09b fuse: copy_file_range needs to strip setuid bits and update timestamps
Like ->write_iter(), we update mtime and strip setuid of dst file before
copy and like ->read_iter(), we update atime of src file after copy.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:07:07 -07:00
Amir Goldstein 5dae222a5f vfs: allow copy_file_range to copy across devices
We want to enable cross-filesystem copy_file_range functionality
where possible, so push the "same superblock only" checks down to
the individual filesystem callouts so they can make their own
decisions about cross-superblock copy offload and fallack to
generic_copy_file_range() for cross-superblock copy.

[Amir] We do not call ->remap_file_range() in case the files are not
on the same sb and do not call ->copy_file_range() in case the files
do not belong to the same filesystem driver.

This changes behavior of the copy_file_range(2) syscall, which will
now allow cross filesystem in-kernel copy.  CIFS already supports
cross-superblock copy, between two shares to the same server. This
functionality will now be available via the copy_file_range(2) syscall.

Cc: Steve French <stfrench@microsoft.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:20 -07:00
Amir Goldstein 8c3f406c09 xfs: use file_modified() helper
Note that by using the helper, the order of calling file_remove_privs()
after file_update_mtime() in xfs_file_aio_write_checks() has changed.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:19 -07:00
Amir Goldstein e38f7f53c3 vfs: introduce file_modified() helper
The combination of file_remove_privs() and file_update_mtime() is
quite common in filesystem ->write_iter() methods.

Modelled after the helper file_accessed(), introduce file_modified()
and use it from generic_remap_file_range_prep().

Note that the order of calling file_remove_privs() before
file_update_mtime() in the helper was matched to the more common order by
filesystems and not the current order in generic_remap_file_range_prep().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:19 -07:00
Amir Goldstein 96e6e8f4a6 vfs: add missing checks to copy_file_range
Like the clone and dedupe interfaces we've recently fixed, the
copy_file_range() implementation is missing basic sanity, limits and
boundary condition tests on the parameters that are passed to it
from userspace. Create a new "generic_copy_file_checks()" function
modelled on the generic_remap_checks() function to provide this
missing functionality.

[Amir] Shorten copy length instead of checking pos_in limits
because input file size already abides by the limits.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:19 -07:00
Amir Goldstein a31713517d vfs: introduce generic_file_rw_checks()
Factor out helper with some checks on in/out file that are
common to clone_file_range and copy_file_range.

Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:19 -07:00
Dave Chinner 64bf5ff58d vfs: no fallback for ->copy_file_range
Now that we have generic_copy_file_range(), remove it as a fallback
case when offloads fail. This puts the responsibility for executing
fallbacks on the filesystems that implement ->copy_file_range and
allows us to add operational validity checks to
generic_copy_file_range().

Rework vfs_copy_file_range() to call a new do_copy_file_range()
helper to execute the copying callout, and move calls to
generic_file_copy_range() into filesystem methods where they
currently return failures.

[Amir] overlayfs is not responsible of executing the fallback.
It is the responsibility of the underlying filesystem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:19 -07:00
Dave Chinner f16acc9d9b vfs: introduce generic_copy_file_range()
Right now if vfs_copy_file_range() does not use any offload
mechanism, it falls back to calling do_splice_direct(). This fails
to do basic sanity checks on the files being copied. Before we
start adding this necessarily functionality to the fallback path,
separate it out into generic_copy_file_range().

generic_copy_file_range() has the same prototype as
->copy_file_range() so that filesystems can use it in their custom
->copy_file_range() method if they so choose.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:18 -07:00
Greg Kroah-Hartman 0154ec71d5 Merge 5.2-rc4 into char-misc-next
We want the char/misc driver fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-09 09:11:21 +02:00
Linus Torvalds 2759e05cdb A change to call iput() asynchronously to avoid a possible deadlock
when iput_final() needs to wait for in-flight I/O (e.g. readahead) and
 a fixup for a cleanup that went into -rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAlz8BN0THGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi0P5B/0dESu+t5AHG1qJbGdfDEfX6TmDRwn+
 Z/O22zlEwm25nAljfOYVsga6ZL08VUi7aWrdnyu9CkABPO9XZNDCQvqs/7FBUiaO
 ShRYY8hFWoAnpdSPraaDEiA7Z+4dNF5fSVMpNtRzjPFXDyxSrn/aArXqwAHNMFQY
 fNBL8gzKYQV0bwsCv13SpCA/ENjXMY61+mhYSA1OXaTLDQXDLwqzxaotlE6tzIvK
 NnS5SRr2ph1t3ChfUOLCoad2ZwkSETfDiqEeTp36oOe3ns6qGnIk7rSqMqTJN891
 a92n2RdXXpcBugvJUlHTalZdNrSJbPuGT0WFYLKNY6BERramtiTAGvf2
 =bJxo
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.2-rc4' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A change to call iput() asynchronously to avoid a possible deadlock
  when iput_final() needs to wait for in-flight I/O (e.g. readahead) and
  a fixup for a cleanup that went into -rc1"

* tag 'ceph-for-5.2-rc4' of git://github.com/ceph/ceph-client:
  ceph: fix error handling in ceph_get_caps()
  ceph: avoid iput_final() while holding mutex or in dispatch thread
  ceph: single workqueue for inode related works
2019-06-08 15:57:35 -07:00
Linus Torvalds 9331b6740f SPDX update for 5.2-rc4
Another round of SPDX header file fixes for 5.2-rc4
 
 These are all more "GPL-2.0-or-later" or "GPL-2.0-only" tags being
 added, based on the text in the files.  We are slowly chipping away at
 the 700+ different ways people tried to write the license text.  All of
 these were reviewed on the spdx mailing list by a number of different
 people.
 
 We now have over 60% of the kernel files covered with SPDX tags:
 	$ ./scripts/spdxcheck.py -v 2>&1 | grep Files
 	Files checked:            64533
 	Files with SPDX:          40392
 	Files with errors:            0
 
 I think the majority of the "easy" fixups are now done, it's now the
 start of the longer-tail of crazy variants to wade through.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXPuGTg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykBvQCg2SG+HmDH+tlwKLT/q7jZcLMPQigAoMpt9Uuy
 sxVEiFZo8ZU9v1IoRb1I
 =qU++
 -----END PGP SIGNATURE-----

Merge tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull yet more SPDX updates from Greg KH:
 "Another round of SPDX header file fixes for 5.2-rc4

  These are all more "GPL-2.0-or-later" or "GPL-2.0-only" tags being
  added, based on the text in the files. We are slowly chipping away at
  the 700+ different ways people tried to write the license text. All of
  these were reviewed on the spdx mailing list by a number of different
  people.

  We now have over 60% of the kernel files covered with SPDX tags:
	$ ./scripts/spdxcheck.py -v 2>&1 | grep Files
	Files checked:            64533
	Files with SPDX:          40392
	Files with errors:            0

  I think the majority of the "easy" fixups are now done, it's now the
  start of the longer-tail of crazy variants to wade through"

* tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (159 commits)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 450
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 449
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 448
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 446
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 445
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 444
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 443
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 442
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 440
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 438
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 435
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 434
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 432
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 431
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 429
  ...
2019-06-08 12:52:42 -07:00
David S. Miller a6cdeeb16b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07 11:00:14 -07:00
Nikolay Borisov 8103d10b71 btrfs: Always trim all unallocated space in btrfs_trim_free_extents
This patch removes support for range parameters of FITRIM ioctl when
trimming unallocated space on devices. This is necessary since ranges
passed from user space are generally interpreted as logical addresses,
whereas btrfs_trim_free_extents used to interpret them as device
physical extents. This could result in counter-intuitive behavior for
users so it's best to remove that support altogether.

Additionally, the existing range support had a bug where if an offset
was passed to FITRIM which overflows u64 e.g. -1 (parsed as u64
18446744073709551615) then wrong data was fed into btrfs_issue_discard,
which in turn leads to wrap-around when aligning the passed range and
results in wrong regions being discarded which leads to data corruption.

Fixes: c2d1b3aae3 ("btrfs: Honour FITRIM range constraints during free space trim")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-07 14:52:05 +02:00
Jan Kara 1571c029a2 dax: Fix xarray entry association for mixed mappings
When inserting entry into xarray, we store mapping and index in
corresponding struct pages for memory error handling. When it happened
that one process was mapping file at PMD granularity while another
process at PTE granularity, we could wrongly deassociate PMD range and
then reassociate PTE range leaving the rest of struct pages in PMD range
without mapping information which could later cause missed notifications
about memory errors. Fix the problem by calling the association /
deassociation code if and only if we are really going to update the
xarray (deassociating and associating zero or empty entries is just
no-op so there's no reason to complicate the code with trying to avoid
the calls for these cases).

Cc: <stable@vger.kernel.org>
Fixes: d2c997c0f1 ("fs, dax: use page->mapping to warn if truncate...")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-06-06 22:18:49 -07:00
Linus Torvalds 01047631df Changes since last update:
- Fix some forgotten strings in a log debugging function
 - Fix incorrect unit conversion in online fsck code
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlz1Ur0ACgkQ+H93GTRK
 tOshcA//X+tEFZGF+oosQ3h/IJqr9pODZfVf76mQIpg/5IU7WUzONGjTsjiHyH93
 zC2qw7l7m/3LyCSXfuJfuWOCyPXRRj7Dv+iUCbjmAh0OAn6Alpa1VN9jxABTeTuY
 xeCoNdRBCg6wF2XLLVgEN+VEdrFc7F7x/4OyoW5dmmBQakNOWlVgLtQqO+7gK/hS
 Qc+xw9ekKvkceHi//NXJTSXAZ0EfntzNZ/Fg41O2cztWV4oKxl2a4ej+j0g8/u9e
 rabTP54RH8dNJIejRmWU3dxz/w7OHSOO84LW/Q4LMKykfhLV6lhPL25igy1eLNhY
 OpcCWQio4ZKBwd8UoXCH0HA4dprusPipI1JIXp/YDHGFb0PumhkZaO/1xM0G5J+o
 3n1A1m7MW16jXAwlwBq0A6UNUFGyMfhfQiCluikBwkpR5dhWQoLQ5/IWZxs7ibUy
 0M5POwnYUsx7xcx3/TjUsLaGbBrqD/F4LXXGmxVJQa6Dt6B0ctyLDnjN08mNwR0e
 4Kvck4yxUHrUoXGaOhtddZmloVcrY58CwwHPabwoA3qC39ktKvp5PPrTNPiao8Ew
 oILteFrpQxPfg1AItXO/QMcGvj7ragHSA2O0XqD5Tm61jWErfpIAyRQ89oXT5GUt
 Z52uOI164u/8T4S/HUvi9tqYuZ+xudG4krIClDCXBaww1W2xNjo=
 =WFWW
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Here are a couple more bug fixes for 5.2. Changes since last update:

   - Fix some forgotten strings in a log debugging function

   - Fix incorrect unit conversion in online fsck code"

* tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: inode btree scrubber should calculate im_boffset correctly
  xfs: fix broken log reservation debugging
2019-06-06 12:36:54 -07:00
Linus Torvalds dc8ca9cc6e Revert commit "gfs2: Replace gl_revokes with a GLF flag".
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc+SSIAAoJENW/n+sDE2U6Vx8P/jP9FW/JNMj4JeXVMPvSP4eB
 raZDEUNksGnWu7/JAjoz5PruPZaeW+CoDq25oiB7MalMopEXZ4RRovhmrrZBXqP3
 jI7/4yoHRJf5907ZFfbNHX3vL8up+I3Ej1jf8fsEpbTMlO95n5dtsTJIcuBpRB6L
 Oq4M7sIrSU8FxRhOt+wBw042R3FWLYkNZ7mYV+VbiC/OXGSnOWj/uhZ04m83B39a
 B0EylS0RIMVkS187+7gVxn5Rcyg0go+/Hi3pdMwBcWFOsVfAPI4Gr7n4F/CI4he5
 lb4B6CMlQR0m1Nsqvvx+s3TUxlnAxuRJ9Kdam2T0emVvNkQTOpBvA6mHsk3WNfrl
 6YE06wkpk29tLGD0a1cKwY4tiQrJ7isi2n/9pCw8NsWVJM2vMMILIBjNWFQ6wKOo
 JnrKe3gjEe/lXN6JuZRZQaP2VkaRO9a86Op7WsPmt5LWAIyRFj0EQMJFay+dhjcl
 N/MpAXzhuNrHGHpuf0X+9caWhLciAZmd3t/CbEaNCbHMlha7S9iFaIChy899gFob
 L8ODwrHGozOQXSJzqOxLgGEos7BQLyrxnurBqe1aU3JHmQXLD32xQA69qTphvRf9
 wlYSVm4eitg5QgGQPgwMilAb51O+3gQkDuas8uUS9LbRcnauhjNw4zabM4L8tOT2
 fW4JhUxxw39fj+/GBtX2
 =ilXy
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v5.2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:
 "A revert for a patch that turned out to be broken"

* tag 'gfs2-v5.2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  Revert "gfs2: Replace gl_revokes with a GLF flag"
2019-06-06 12:33:52 -07:00
Linus Torvalds 5d6b501fe5 overlayfs fixes for 5.2-rc4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXPkV+wAKCRDh3BK/laaZ
 PMGyAQDq6ry0bTRPIL52Ek+eRS/pi3bIsH96e22Q6W/NrAQEfwD+NtFZneAW/Tux
 AuKIRWqS7UdqCjLurwMHfR9bOHrBDQI=
 =z7pu
 -----END PGP SIGNATURE-----

Merge tag 'ovl-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
 "Here's one fix for a class of bugs triggered by syzcaller, and one
  that makes xfstests fail less"

* tag 'ovl-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: doc: add non-standard corner cases
  ovl: detect overlapping layers
  ovl: support the FS_IOC_FS[SG]ETXATTR ioctls
2019-06-06 12:31:15 -07:00
Linus Torvalds 211758573b fuse fixes for 5.2-rc4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXPjJMAAKCRDh3BK/laaZ
 PDzlAP9CgHZsgCVfB5afSb9rqY9Fdzr3LxSOwaCXavA5XGJAVQEAhjldnlMOjEvO
 LrDEPG3zziJuQgCmMJ9xXoBYYjkCwgo=
 =nff/
 -----END PGP SIGNATURE-----

Merge tag 'fuse-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse fixes from Miklos Szeredi:
 "This fixes a leaked inode lock in an error cleanup path and a data
  consistency issue with copy_file_range().

  It also adds a new flag for the WRITE request that allows userspace
  filesystems to clear suid/sgid bits on the file if necessary"

* tag 'fuse-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: extract helper for range writeback
  fuse: fix copy_file_range() in the writeback case
  fuse: add FUSE_WRITE_KILL_PRIV
  fuse: fallocate: fix return with locked inode
2019-06-06 12:25:56 -07:00
Linus Torvalds 459aa077a2 NFS client fixes for Linux 5.2
Stable bugfixes:
 - SUNRPC: Fix regression in umount of a secure mount
 - SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
 - NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
 - NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
 
 Other bugfixes:
 - xprtrdma: Use struct_size() in kzalloc()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlz4KzoACgkQ18tUv7Cl
 QOt7OhAAvG+DVZ6V5+q4zvabKgoievlL56Ys4SaAp3+OlxC6VaiyQUDs/6U9C/xH
 dmVbGYWdFXjqJE1JPXxmu0jOdRiZcnhIq+hiHNOK0qZOBCnE5zzZ1r1tdNY0GHQ2
 JOkREqsXsaeUWuO0pCY7JOmzd5aU1XLhg1/8+9Z7gNwamMfwLkEqi7FGtXi+xsGz
 gQVxMJlHsV2F21IKdKS0TJrcqr2okya/MnOQRbbMC2RT/MYNxDrhAPBJ1Shcx3HB
 NlccAn4jhIL0bCPRvFPib6KrO01U0Ye/KECN8j2qHRT4QS2s0dsnnQ6f2tEs9mJ8
 cRTVh1uniF6ZuDxSr6KIIN3mKA9DX2SK83H16ahAaRBLM8dwF+4MIr6gDdtJvsVw
 nY0YDpnAaKFypuCPBV/jFu7fk97hul4ntymJGVeFdlqu/HtWs1Z1iM93DDVJbKr8
 a3AND6woOQ2asvySPo+X66PKt79gofga4C+ZDuMfJax8+K9imqIyforgLrAmd/yL
 sGAlLzenf6fmOB5C1bPTtrFFbs6XiHXMidDGwmm1kOZIDuN+O2TTwc24gyXx0IyJ
 OhmjDn2CKmzS2WVPhetRgurzkdTigJu4PebC421qWSFhxlf/NfghW+rpM9su/hwv
 /r9+bpdjZ8YD5FUJvxsX4NZLr+SWbNTzX/ARNdRFsGr0NLpG/50=
 =rrp/
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.2-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
 "These are mostly stable bugfixes found during testing, many during the
  recent NFS bake-a-thon.

  Stable bugfixes:
   - SUNRPC: Fix regression in umount of a secure mount
   - SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
   - NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
   - NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled

  Other bugfixes:
   - xprtrdma: Use struct_size() in kzalloc()"

* tag 'nfs-for-5.2-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
  NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
  SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
  SUNRPC fix regression in umount of a secure mount
  xprtrdma: Use struct_size() in kzalloc()
2019-06-06 12:19:37 -07:00
Linus Torvalds 44e843eb5c As a result of some of Al Viro's great work, here are a few cleanups
with fixes for adfs:
 
 - factor out filename comparison, so we can be sure that adfs_compare()
   (used for namei compare) and adfs_match() (used for lookup) have the
   same behaviour.
 - factor out filename lowering (which is not the same as tolower() which
   will lower top-bit-set characters) to ensure that we have the same
   behaviour when comparing filenames as when we hash them.
 - factor out the object fixups, so we are applying all fixups to
   directory objects in the same way, independent of the disk format.
 - factor out the object name fixup (into the previously factored out
   function) to ensure that filenames are appropriately translated -
   for example, adfs allows '/' in filenames, which being the Unix path
   separator, need to be translated to a different character, which is
   normally '.' (DOS 8.3 filenames represent the . as a / on adfs, so
   this is the expected reverse translation.)
 - remove filename truncation; Al asked about this and apparently the
   decision is to remove it.  In any case, adfs's truncation was buggy,
   so this rids us of that bug by removing the truncation feature.
 - we now have only one location which adds the "filetype" suffix to the
   filename, so there's no point that code being out of line.
 - since we translate '/' into '.', an adfs filename of "/" or "//" would
   end up being translated to "." and ".." which have special meanings.
   In this case, change the first character to "^" to avoid these special
   directory names being abused.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAXPD4oPTnkBvkraxkAQK35Q//bql8SDnF9sqy8YlVUmAgoIMdyQtiFuD7
 5UgRBPDl4bIm2Y+mtuYjU/u3nq6tZit0HukybUvD5yA64Opy1Ahkf8OB/f0f6TTU
 xjx55/D9QRj4loGxXHM6PuEO9GX+pIsYPFQoufPYq7hksQB2y1ETkjENk4W4PY2m
 gvbtQqQmz/B+G7PZvrMsZQV/BwYF3vhP8S/qLbgl3PAbciVofruXtPJNRxgOL8ot
 hbOIfT5x30YZpILzXqDZJq4mviWPru+FVJ1uIW1Nd5s8T/9seICxXMjFaMQJUSMn
 oIHCzC1WlP4uRbjwmJ+lyLlEPyYrgYN3+H1FcIO0MTYfBXwYZrVdFLW4TtWBracc
 8bRa+p9jeRe9jdlKpGaX12a4W7xQ3SmB2i8UFE+/epnBqPvuDPOM0h19XK+FclTH
 fAg7Ej1uBC1ROkTlW4OXBsLakXvIIka859DQYQduVKDw8kTSH4QyTdAE7qvOGz/d
 Y0XMeIUz+U1izVJ8ShHCVyttXnkBkC6Xwoc6RY3IET++Fu0hXCMdQMSf6Ta8zJjA
 EUAdb3GLYdJTqX6Oy9NTUd42GPnZwR5KMUOJd6v9ETU7gLDRDyu9ILpgrF+vFaKj
 Xnf7B+D4l1jdB5cU/MYzfzF7Ky80vDHjVr62PSvtb5X9F0pHOINgZJ5an8n80bDc
 dxQq92h9hCI=
 =mQU/
 -----END PGP SIGNATURE-----

Merge tag 'for-rc-adfs' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ADFS cleanups/fixes from Russell King:
 "As a result of some of Al Viro's great work, here are a few cleanups
  with fixes for adfs:

   - factor out filename comparison, so we can be sure that
     adfs_compare() (used for namei compare) and adfs_match() (used for
     lookup) have the same behaviour.

   - factor out filename lowering (which is not the same as tolower()
     which will lower top-bit-set characters) to ensure that we have the
     same behaviour when comparing filenames as when we hash them.

   - factor out the object fixups, so we are applying all fixups to
     directory objects in the same way, independent of the disk format.

   - factor out the object name fixup (into the previously factored out
     function) to ensure that filenames are appropriately translated -
     for example, adfs allows '/' in filenames, which being the Unix
     path separator, need to be translated to a different character,
     which is normally '.' (DOS 8.3 filenames represent the . as a / on
     adfs, so this is the expected reverse translation.)

   - remove filename truncation; Al asked about this and apparently the
     decision is to remove it. In any case, adfs's truncation was buggy,
     so this rids us of that bug by removing the truncation feature.

   - we now have only one location which adds the "filetype" suffix to
     the filename, so there's no point that code being out of line.

   - since we translate '/' into '.', an adfs filename of "/" or "//"
     would end up being translated to "." and ".." which have special
     meanings. In this case, change the first character to "^" to avoid
     these special directory names being abused"

* tag 'for-rc-adfs' of git://git.armlinux.org.uk/~rmk/linux-arm:
  fs/adfs: fix filename fixup handling for "/" and "//" names
  fs/adfs: move append_filetype_suffix() into adfs_object_fixup()
  fs/adfs: remove truncated filename hashing
  fs/adfs: factor out filename fixup
  fs/adfs: factor out object fixups
  fs/adfs: factor out filename case lowering
  fs/adfs: factor out filename comparison
2019-06-06 11:02:54 -07:00
Bob Peterson 638803d456 Revert "gfs2: Replace gl_revokes with a GLF flag"
Commit 73118ca8ba introduced a glock reference counting bug in
gfs2_trans_remove_revoke.  Given that, replacing gl_revokes with a GLF flag is
no longer useful, so revert that commit.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-06 16:29:26 +02:00
Linus Torvalds 47358b6475 pstore fixes for v5.2-rc4
- Avoid NULL deref when unloading/reloading ramoops module (Pi-Hsun Shih)
 - Run ramoops without crash dump region
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlz3NB8WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjGyEACklx2W3qjE51oCWpRuN9B29As0
 XzuWrQr15WzD2zAtdG4wc6/OI3Gfu+/xXb9YClqXFN8TqGEwyGIcz0kOOExvzJN2
 bdseq8gA4JkL8NK3LKjGMXCvUBQiCGUKfGa4xXL8I2NfyZkykUqRa0PVkkfNYOEf
 q6zjPz73BDRvpZUw7+50sKDJalcicwOzn3GMXw7C43qDuuychpwzLTL5ZmFrQ3oX
 qJqz7mIfsP5DpJk8SUTZl+W4eZ6/ianfML883ia9Zg8AP6ix/iET0iQHXw59DbOZ
 XeFmXBudou+JNAjqlDbGppBwJOu3iHXFKh7eJre2W2swkdah/V8CvYo36qdJ9zHP
 zs4/Wt/yloWYZqtY4UWsMhs47ryvm8iC2Ki//OPTZh30fIeqGAcVknbFbu1EHron
 autOEy8DiKH5I76BGGaR78We6AVt04HXTT0kFcDgczv3MLhfOpHLoL4w4fM0NvNq
 3CSDEkr6dsTQPCPUoApBo3rfbiVROzgXdDLLLxULWphtL6rAvvn/FmAPQsC7OdN3
 TdZQ0AjMtiQO32TFfm9badadDXW2QjXJF91TQBqtGacR+ipiXSnImeZC24VCdXyT
 pO9U/rbrU3tds3+Qu1WNh87IvEWOjzC/sjDKSd/ClZqk9F0KVGGSxc9YXgxNzLIR
 gC0luMlt7acj4Jzkog==
 =g26W
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore fixes from Kees Cook:

 - Avoid NULL deref when unloading/reloading ramoops module (Pi-Hsun
   Shih)

 - Run ramoops without crash dump region

* tag 'pstore-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/ram: Run without kernel crash dump region
  pstore: Set tfm to NULL on free_buf_for_compression
2019-06-05 12:42:26 -07:00
Yan, Zheng 7b2f936fc8 ceph: fix error handling in ceph_get_caps()
The function return 0 even when interrupted or try_get_cap_refs()
return error.

Fixes: 1199d7da2d ("ceph: simplify arguments and return semantics of try_get_cap_refs")
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-06-05 20:34:39 +02:00
Yan, Zheng 3e1d0452ed ceph: avoid iput_final() while holding mutex or in dispatch thread
iput_final() may wait for reahahead pages. The wait can cause deadlock.
For example:

  Workqueue: ceph-msgr ceph_con_workfn [libceph]
    Call Trace:
     schedule+0x36/0x80
     io_schedule+0x16/0x40
     __lock_page+0x101/0x140
     truncate_inode_pages_range+0x556/0x9f0
     truncate_inode_pages_final+0x4d/0x60
     evict+0x182/0x1a0
     iput+0x1d2/0x220
     iterate_session_caps+0x82/0x230 [ceph]
     dispatch+0x678/0xa80 [ceph]
     ceph_con_workfn+0x95b/0x1560 [libceph]
     process_one_work+0x14d/0x410
     worker_thread+0x4b/0x460
     kthread+0x105/0x140
     ret_from_fork+0x22/0x40

  Workqueue: ceph-msgr ceph_con_workfn [libceph]
    Call Trace:
     __schedule+0x3d6/0x8b0
     schedule+0x36/0x80
     schedule_preempt_disabled+0xe/0x10
     mutex_lock+0x2f/0x40
     ceph_check_caps+0x505/0xa80 [ceph]
     ceph_put_wrbuffer_cap_refs+0x1e5/0x2c0 [ceph]
     writepages_finish+0x2d3/0x410 [ceph]
     __complete_request+0x26/0x60 [libceph]
     handle_reply+0x6c8/0xa10 [libceph]
     dispatch+0x29a/0xbb0 [libceph]
     ceph_con_workfn+0x95b/0x1560 [libceph]
     process_one_work+0x14d/0x410
     worker_thread+0x4b/0x460
     kthread+0x105/0x140
     ret_from_fork+0x22/0x40

In above example, truncate_inode_pages_range() waits for readahead pages
while holding s_mutex. ceph_check_caps() waits for s_mutex and blocks
OSD dispatch thread. Later OSD replies (for readahead) can't be handled.

ceph_check_caps() also may lock snap_rwsem for read. So similar deadlock
can happen if iput_final() is called while holding snap_rwsem.

In general, it's not good to call iput_final() inside MDS/OSD dispatch
threads or while holding any mutex.

The fix is introducing ceph_async_iput(), which calls iput_final() in
workqueue.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-06-05 20:34:39 +02:00
Yan, Zheng 1cf89a8dee ceph: single workqueue for inode related works
We have three workqueue for inode works. Later patch will introduce
one more work for inode. It's not good to introcuce more workqueue
and add more 'struct work_struct' to 'struct ceph_inode_info'.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-06-05 20:34:39 +02:00
Thomas Gleixner 55716d2643 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428
Based on 1 normalized pattern(s):

  this file is released under the gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 68 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:16 +02:00
Thomas Gleixner 921a3d4d31 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 405
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 59 temple place suite 330 boston ma 021110
  1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 5 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190112.221098808@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:13 +02:00
Thomas Gleixner 7336d0e654 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 398
Based on 1 normalized pattern(s):

  this copyrighted material is made available to anyone wishing to use
  modify copy or redistribute it subject to the terms and conditions
  of the gnu general public license version 2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 44 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531081038.653000175@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:12 +02:00
Thomas Gleixner 2b27bdcc20 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 336
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 51 franklin st fifth floor boston ma 02110
  1301 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 246 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000436.674189849@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:07 +02:00
Thomas Gleixner 4505153954 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 59 temple place suite 330 boston ma 02111
  1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 136 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:06 +02:00
Thomas Gleixner 9f8068503d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 294
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license 2 as published
  by the free software foundation this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation this program is distributed in the hope
  that it [would] be useful but without any warranty without even the
  implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 9 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.804956444@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:38 +02:00
Thomas Gleixner 2025cf9e19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thomas Gleixner 50acfb2b76 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 286
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 this program is distributed
  in the hope that it will be useful but without any warranty without
  even the implied warranty of merchantability or fitness for a
  particular purpose see the gnu general public license for more
  details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 97 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.025053186@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thomas Gleixner 9c92ab6191 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282
Based on 1 normalized pattern(s):

  this software is licensed under the terms of the gnu general public
  license version 2 as published by the free software foundation and
  may be copied distributed and modified under those terms this
  program is distributed in the hope that it will be useful but
  without any warranty without even the implied warranty of
  merchantability or fitness for a particular purpose see the gnu
  general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 285 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thomas Gleixner e47ca50905 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 275
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license this program
  is distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 59 temple place suite 330 boston ma 021110
  1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 2 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141334.789682544@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:30:30 +02:00
Daniel Rosenberg 4d3aed7090 f2fs: Add option to limit required GC for checkpoint=disable
This extends the checkpoint option to allow checkpoint=disable:%u[%]
This allows you to specify what how much of the disk you are willing
to lose access to while mounting with checkpoint=disable. If the amount
lost would be higher, the mount will return -EAGAIN. This can be given
as a percent of total space, or in blocks.

Currently, we need to run garbage collection until the amount of holes
is smaller than the OVP space. With the new option, f2fs can mark
space as unusable up front instead of requiring garbage collection until
the number of holes is small enough.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:48 -07:00
Daniel Rosenberg a4c3ecaaad f2fs: Fix accounting for unusable blocks
Fixes possible underflows when dealing with unusable blocks.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:47 -07:00
Daniel Rosenberg 9a9aecaad9 f2fs: Fix root reserved on remount
On a remount, you can currently set root reserved if it was not
previously set. This can cause an underflow if reserved has been set to
a very high value, since then root reserved + current reserved could be
greater than user_block_count. inc_valid_block_count later subtracts out
these values from user_block_count, causing an underflow.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:47 -07:00
Daniel Rosenberg ae4ad7ea09 f2fs: Lower threshold for disable_cp_again
The existing threshold for allowable holes at checkpoint=disable time is
too high. The OVP space contains reserved segments, which are always in
the form of free segments. These must be subtracted from the OVP value.

The current threshold is meant to be the maximum value of holes of a
single type we can have and still guarantee that we can fill the disk
without failing to find space for a block of a given type.

If the disk is full, ignoring current reserved, which only helps us,
the amount of unused blocks is equal to the OVP area. Of that, there
are reserved segments, which must be free segments, and the rest of the
ovp area, which can come from either free segments or holes. The maximum
possible amount of holes is OVP-reserved.

Now, consider the disk when mounting with checkpoint=disable.
We must be able to fill all available free space with either data or
node blocks. When we start with checkpoint=disable, holes are locked to
their current type. Say we have H of one type of hole, and H+X of the
other. We can fill H of that space with arbitrary typed blocks via SSR.
For the remaining H+X blocks, we may not have any of a given block type
left at all. For instance, if we were to fill the disk entirely with
blocks of the type with fewer holes, the H+X blocks of the opposite type
would not be used. If H+X > OVP-reserved, there would be more holes than
could possibly exist, and we would have failed to find a suitable block
earlier on, leading to a crash in update_sit_entry.

If H+X <= OVP-reserved, then the holes end up effectively masked by the OVP
region in this case.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-06-03 13:27:47 -07:00
Darrick J. Wong 025197ebb0 xfs: inode btree scrubber should calculate im_boffset correctly
The im_boffset field is in units of bytes, whereas XFS_INO_OFFSET
returns a value in units of inodes.  Convert the units so that scrub on
a 64k-block filesystem works correctly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-06-03 09:18:40 -07:00
Greg Kroah-Hartman c9c2c27d7c debugfs: make debugfs_create_u32_array() return void
The single user of debugfs_create_u32_array() does not care about the
return value of it, so make it return void as there is no need to do
anything with the return value.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-03 16:34:27 +02:00