Commit Graph

351 Commits

Author SHA1 Message Date
Linus Torvalds b04a23421b Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:

 - Report constant st_ino values across copy-up even if underlying
   layers are on different filesystems, but using different st_dev
   values for each layer.

   Ideally we'd report the same st_dev across the overlay, and it's
   possible to do for filesystems that use only 32bits for st_ino by
   unifying the inum space. It would be nice if it wasn't a choice of 32
   or 64, rather filesystems could report their current maximum (that
   could change on resize, so it wouldn't be set in stone).

 - miscellaneus fixes and a cleanup of ovl_fill_super(), that was long
   overdue.

 - created a path_put_init() helper that clears out the pointers after
   putting the ref.

   I think this could be useful elsewhere, so added it to <linux/path.h>

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (30 commits)
  ovl: remove unneeded arg from ovl_verify_origin()
  ovl: Put upperdentry if ovl_check_origin() fails
  ovl: rename ufs to ofs
  ovl: clean up getting lower layers
  ovl: clean up workdir creation
  ovl: clean up getting upper layer
  ovl: move ovl_get_workdir() and ovl_get_lower_layers()
  ovl: reduce the number of arguments for ovl_workdir_create()
  ovl: change order of setup in ovl_fill_super()
  ovl: factor out ovl_free_fs() helper
  ovl: grab reference to workbasedir early
  ovl: split out ovl_get_indexdir() from ovl_fill_super()
  ovl: split out ovl_get_lower_layers() from ovl_fill_super()
  ovl: split out ovl_get_workdir() from ovl_fill_super()
  ovl: split out ovl_get_upper() from ovl_fill_super()
  ovl: split out ovl_get_lowerstack() from ovl_fill_super()
  ovl: split out ovl_get_workpath() from ovl_fill_super()
  ovl: split out ovl_get_upperpath() from ovl_fill_super()
  ovl: use path_put_init() in error paths for ovl_fill_super()
  vfs: add path_put_init()
  ...
2017-11-17 13:36:59 -08:00
Amir Goldstein d976807606 ovl: remove unneeded arg from ovl_verify_origin()
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:16 +01:00
Vivek Goyal 5455f92b54 ovl: Put upperdentry if ovl_check_origin() fails
If ovl_check_origin() fails, we should put upperdentry. We have a reference
on it by now. So goto out_put_upper instead of out.

Fixes: a9d019573e ("ovl: lookup non-dir copy-up-origin by file handle")
Cc: <stable@vger.kernel.org> #4.12
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:16 +01:00
Miklos Szeredi ad204488d3 ovl: rename ufs to ofs
Rename all "struct ovl_fs" pointers to "ofs".  The "ufs" name is historical
and can only be found in overlayfs/super.c.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:16 +01:00
Miklos Szeredi 4155c10a03 ovl: clean up getting lower layers
Move calling ovl_get_lower_layers() into ovl_get_lowerstack().

ovl_get_lowerstack() now returns the root dentry's filled in ovl_entry.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:15 +01:00
Miklos Szeredi bca44b52f8 ovl: clean up workdir creation
Move calling ovl_get_workdir() into ovl_get_workpath().

Rename ovl_get_workdir() to ovl_make_workdir() and ovl_get_workpath() to
ovl_get_workdir().

Workpath is now not needed outside ovl_get_workdir().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:15 +01:00
Miklos Szeredi 5064975e7f ovl: clean up getting upper layer
Merge ovl_get_upper() and ovl_get_upperpath().

The resulting function is named ovl_get_upper(), though it still returns
upperpath as well.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:15 +01:00
Miklos Szeredi 520d7c867f ovl: move ovl_get_workdir() and ovl_get_lower_layers()
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:15 +01:00
Miklos Szeredi 6e88256e19 ovl: reduce the number of arguments for ovl_workdir_create()
Remove "sb" and "dentry" arguments of ovl_workdir_create() and related
functions.  Move setting MS_RDONLY flag to callers.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:15 +01:00
Miklos Szeredi c6fe625493 ovl: change order of setup in ovl_fill_super()
Move ovl_get_upper() immediately after ovl_get_upperpath(),
ovl_get_workdir() immediately after ovl_get_workdir() and
ovl_get_lower_layers() immediately after ovl_get_lowerstack().

Also move prepare_creds() up to where other allocations are happening.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:15 +01:00
Miklos Szeredi a9075cdb46 ovl: factor out ovl_free_fs() helper
This can be called both from ovl_put_super() and in the error cleanup path
from ovl_fill_super().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-10 09:39:15 +01:00
Miklos Szeredi 95e6d4177c ovl: grab reference to workbasedir early
and related cleanups.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:29 +01:00
Miklos Szeredi f7e3a7d947 ovl: split out ovl_get_indexdir() from ovl_fill_super()
It's okay to get rid of the intermediate error label due to ufs being
zeroed on allocation.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Miklos Szeredi c0d91fb910 ovl: split out ovl_get_lower_layers() from ovl_fill_super()
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Miklos Szeredi 8ed61dc37e ovl: split out ovl_get_workdir() from ovl_fill_super()
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Miklos Szeredi 21a3b317a6 ovl: split out ovl_get_upper() from ovl_fill_super()
And don't clobber ufs->upper_mnt on error.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Miklos Szeredi 53dbb0b478 ovl: split out ovl_get_lowerstack() from ovl_fill_super()
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Miklos Szeredi 87ad447a9d ovl: split out ovl_get_workpath() from ovl_fill_super()
It's okay to get rid of the intermediate error label due to ufs being
zeroed on allocation.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Miklos Szeredi 6ee8acf0f7 ovl: split out ovl_get_upperpath() from ovl_fill_super()
It's okay to get rid of the intermediate error label due to ufs being
zeroed on allocation.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Miklos Szeredi 8aafcb593d ovl: use path_put_init() in error paths for ovl_fill_super()
This allows simplifying the error cleanup later.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:28 +01:00
Amir Goldstein f30536f0f9 ovl: update cache version of impure parent on rename
ovl_rename() updates dir cache version for impure old parent if an entry
with copy up origin is moved into old parent, but it did not update
cache version if the entry moved out of old parent has a copy up origin.

[SzM] Same for new dir: we updated the version if an entry with origin was
moved in, but not if an entry with origin was moved out.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
Amir Goldstein a0c5ad307a ovl: relax same fs constraint for constant st_ino
For the case of all layers not on the same fs, return the copy up origin
inode st_dev/st_ino for non-dir from stat(2).

This guaranties constant st_dev/st_ino for non-dir across copy up.
Like the same fs case, st_ino of non-dir is also persistent.

If the st_dev/st_ino for copied up object would have been the same as
that of the real underlying lower file, running diff on underlying lower
file and overlay copied up file would result in diff reporting that the
two files are equal when in fact, they may have different content.

Therefore, unlike the same fs case, st_dev is not persistent because it
uses the unique anonymous bdev allocated for the lower layer.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
Chandan Rajendra ba1e563cdc ovl: return anonymous st_dev for lower inodes
For non-samefs setup, to make sure that st_dev/st_ino pair is unique
across the system, we return a unique anonymous st_dev for stat(2)
of lower layer inode.

A following patch is going to fix constant st_dev/st_ino across copy up
by returning origin st_dev/st_ino for copied up objects.

If the st_dev/st_ino for copied up object would have been the same as
that of the real underlying lower file, running diff on underlying lower
file and overlay copied up file would result in diff reporting that the
2 files are equal when in fact, they may have different content.

[amir: simplify ovl_get_pseudo_dev()
       split from allocate anonymous bdev patch]

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
Chandan Rajendra 2a9c6d066e ovl: allocate anonymous devs for lowerdirs
Generate unique values of st_dev per lower layer for non-samefs
overlay mount. The unique values are obtained by allocating anonymous
bdevs for each of the lowerdirs in the overlayfs instance.

The anonymous bdev is going to be returned by stat(2) for lowerdir
non-dir entries in non-samefs case.

[amir: split from ovl_getattr() and re-structure patches]

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
Chandan Rajendra b93436320c ovl: re-structure overlay lower layers in-memory
Define new structures to represent overlay instance lower layers and
overlay merge dir lower layers to make room for storing more per layer
information in-memory.

Instead of keeping the fs instance lower layers in an array of struct
vfsmount, keep them in an array of new struct ovl_layer, that has a
pointer to struct vfsmount.

Instead of keeping the dentry lower layers in an array of struct path,
keep them in an array of new struct ovl_path, that has a pointer to
struct dentry and to struct ovl_layer.

Add a small helper to find the fs layer id that correspopnds to a lower
struct ovl_path and use it in ovl_lookup().

[amir: split re-structure from anonymous bdev patch]

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
Amir Goldstein ee023c30d7 ovl: move include of ovl_entry.h into overlayfs.h
Most overlayfs c files already explicitly include ovl_entry.h
to use overlay entry struct definitions and upcoming changes
are going to require even more c files to include this header.

All overlayfs c files include overlayfs.h and overlayfs.h itself
refers to some structs defined in ovl_entry.h, so it seems more
logic to include ovl_entry.h from overlayfs.h than from c files.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
zhangyi (F) 07f6fff148 ovl: fix rmdir problem on non-merge dir with origin xattr
An "origin && non-merge" upper dir may have leftover whiteouts that
were created in past mount. overlayfs does no clear this dir when we
delete it, which may lead to rmdir fail or temp file left in workdir.

Simple reproducer:
  mkdir lower upper work merge
  mkdir -p lower/dir
  touch lower/dir/a
  mount -t overlay overlay -olowerdir=lower,upperdir=upper,\
    workdir=work merge
  rm merge/dir/a
  umount merge
  rm -rf lower/*
  touch lower/dir  (*)
  mount -t overlay overlay -olowerdir=lower,upperdir=upper,\
    workdir=work merge
  rm -rf merge/dir

Syslog dump:
  overlayfs: cleanup of 'work/#7' failed (-39)

(*): if we do not create the regular file, the result is different:
  rm: cannot remove "dir/": Directory not empty

This patch adds a check for the case of non-merge dir that may contain
whiteouts, and calls ovl_check_empty_dir() to check and clear whiteouts
from upper dir when an empty dir is being deleted.

[amir: split patch from ovl_check_empty_dir() cleanup
       rename ovl_is_origin() to ovl_may_have_whiteouts()
       check OVL_WHITEOUTS flag instead of checking origin xattr]

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
zhangyi (F) 95e598e7ac ovl: simplify ovl_check_empty_and_clear()
Filter out non-whiteout non-upper entries from list of merge dir entries
while checking if merge dir is empty in ovl_check_empty_dir().
The remaining work for ovl_clear_empty() is to clear all entries on the
list.

[amir: split patch from rmdir bug fix]

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:27 +01:00
Amir Goldstein b79e05aaa1 ovl: no direct iteration for dir with origin xattr
If a non-merge dir in an overlay mount has an overlay.origin xattr, it
means it was once an upper merge dir, which may contain whiteouts and
then the lower dir was removed under it.

Do not iterate real dir directly in this case to avoid exposing whiteouts.

[SzM] Set OVL_WHITEOUT for all merge directories as well.

[amir] A directory that was just copied up does not have the OVL_WHITEOUTS
flag. We need to set it to fix merge dir iteration.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:26 +01:00
Amir Goldstein 4eae06de48 ovl: lockdep annotate of nested OVL_I(inode)->lock
This fixes a lockdep splat when mounting a nested overlayfs.

Fixes: a015dafcaf ("ovl: use ovl_inode mutex to synchronize...")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-11-09 10:23:26 +01:00
Ingo Molnar 8c5db92a70 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:32:44 +01:00
Amir Goldstein fa0096e3ba ovl: do not cleanup unsupported index entries
With index=on, ovl_indexdir_cleanup() tries to cleanup invalid index
entries (e.g. bad index name). This behavior could result in cleaning of
entries created by newer kernels and is therefore undesirable.
Instead, abort mount if such entries are encountered. We still cleanup
'stale' entries and 'orphan' entries, both those cases can be a result
of offline changes to lower and upper dirs.

When encoutering an index entry of type directory or whiteout, kernel
was supposed to fallback to read-only mount, but the fill_super()
operation returns EROFS in this case instead of returning success with
read-only mount flag, so mount fails when encoutering directory or
whiteout index entries. Bless this behavior by returning -EINVAL on
directory and whiteout index entries as we do for all unsupported index
entries.

Fixes: 61b674710c ("ovl: do not cleanup directory and whiteout index..")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2017-10-24 16:06:17 +02:00
Amir Goldstein 7937a56fdf ovl: handle ENOENT on index lookup
Treat ENOENT from index entry lookup the same way as treating a returned
negative dentry. Apparently, either could be returned if file is not
found, depending on the underlying file system.

Fixes: 359f392ca5 ("ovl: lookup index entry for copy up origin")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2017-10-24 16:06:17 +02:00
Amir Goldstein 6eaf011144 ovl: fix EIO from lookup of non-indexed upper
Commit fbaf94ee3c ("ovl: don't set origin on broken lower hardlink")
attempt to avoid the condition of non-indexed upper inode with lower
hardlink as origin. If this condition is found, lookup returns EIO.

The protection of commit mentioned above does not cover the case of lower
that is not a hardlink when it is copied up (with either index=off/on)
and then lower is hardlinked while overlay is offline.

Changes to lower layer while overlayfs is offline should not result in
unexpected behavior, so a permanent EIO error after creating a link in
lower layer should not be considered as correct behavior.

This fix replaces EIO error with success in cases where upper has origin
but no index is found, or index is found that does not match upper
inode. In those cases, lookup will not fail and the returned overlay inode
will be hashed by upper inode instead of by lower origin inode.

Fixes: 359f392ca5 ("ovl: lookup index entry for copy up origin")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-24 16:06:16 +02:00
Will Deacon 506458efaf locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it
can be used instead of lockless_dereference() without any change in
semantics.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-24 13:17:33 +02:00
Dan Carpenter 0ce5cdc9d7 ovl: Return -ENOMEM if an allocation fails ovl_lookup()
The error code is missing here so it means we return ERR_PTR(0) or NULL.
The other error paths all return an error code so this probably should
as well.

Fixes: 02b69b284c ("ovl: lookup redirects")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-19 16:19:52 +02:00
Hirofumi Nakagawa b3885bd6ed ovl: add NULL check in ovl_alloc_inode
This was detected by fault injection test

Signed-off-by: Hirofumi Nakagawa <nklabs@gmail.com>
Fixes: 13cf199d00 ("ovl: allocate an ovl_inode struct")
Cc: <stable@vger.kernel.org> # v4.13
2017-10-19 16:19:51 +02:00
Amir Goldstein 85fdee1eef ovl: fix regression caused by exclusive upper/work dir protection
Enforcing exclusive ownership on upper/work dirs caused a docker
regression: https://github.com/moby/moby/issues/34672.

Euan spotted the regression and pointed to the offending commit.
Vivek has brought the regression to my attention and provided this
reproducer:

Terminal 1:

  mount -t overlay -o workdir=work,lowerdir=lower,upperdir=upper none
        merged/

Terminal 2:

  unshare -m

Terminal 1:

  umount merged
  mount -t overlay -o workdir=work,lowerdir=lower,upperdir=upper none
        merged/
  mount: /root/overlay-testing/merged: none already mounted or mount point
         busy

To fix the regression, I replaced the error with an alarming warning.
With index feature enabled, mount does fail, but logs a suggestion to
override exclusive dir protection by disabling index.
Note that index=off mount does take the inuse locks, so a concurrent
index=off will issue the warning and a concurrent index=on mount will fail.

Documentation was updated to reflect this change.

Fixes: 2cac0c00a6 ("ovl: get exclusive ownership on upper/work dirs")
Cc: <stable@vger.kernel.org> # v4.13
Reported-by: Euan Kemp <euank@euank.com>
Reported-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-05 15:53:18 +02:00
Amir Goldstein 5820dc0888 ovl: fix missing unlock_rename() in ovl_do_copy_up()
Use the ovl_lock_rename_workdir() helper which requires
unlock_rename() only on lock success.

Fixes: ("fd210b7d67ee ovl: move copy up lock out")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-05 15:53:18 +02:00
Amir Goldstein dc7ab6773e ovl: fix dentry leak in ovl_indexdir_cleanup()
index dentry was not released when breaking out of the loop
due to index verification error.

Fixes: 415543d5c6 ("ovl: cleanup bad and stale index entries on mount")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-05 15:53:18 +02:00
Amir Goldstein 9f4ec904db ovl: fix dput() of ERR_PTR in ovl_cleanup_index()
Fixes: caf70cb2ba ("ovl: cleanup orphan index entries")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-05 15:53:18 +02:00
Amir Goldstein e0082a0f04 ovl: fix error value printed in ovl_lookup_index()
Fixes: 359f392ca5 ("ovl: lookup index entry for copy up origin")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-10-05 15:53:18 +02:00
Linus Torvalds 0f0d12728e Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull mount flag updates from Al Viro:
 "Another chunk of fmount preparations from dhowells; only trivial
  conflicts for that part. It separates MS_... bits (very grotty
  mount(2) ABI) from the struct super_block ->s_flags (kernel-internal,
  only a small subset of MS_... stuff).

  This does *not* convert the filesystems to new constants; only the
  infrastructure is done here. The next step in that series is where the
  conflicts would be; that's the conversion of filesystems. It's purely
  mechanical and it's better done after the merge, so if you could run
  something like

	list=$(for i in MS_RDONLY MS_NOSUID MS_NODEV MS_NOEXEC MS_SYNCHRONOUS MS_MANDLOCK MS_DIRSYNC MS_NOATIME MS_NODIRATIME MS_SILENT MS_POSIXACL MS_KERNMOUNT MS_I_VERSION MS_LAZYTIME; do git grep -l $i fs drivers/staging/lustre drivers/mtd ipc mm include/linux; done|sort|uniq|grep -v '^fs/namespace.c$')

	sed -i -e 's/\<MS_RDONLY\>/SB_RDONLY/g' \
	        -e 's/\<MS_NOSUID\>/SB_NOSUID/g' \
	        -e 's/\<MS_NODEV\>/SB_NODEV/g' \
	        -e 's/\<MS_NOEXEC\>/SB_NOEXEC/g' \
	        -e 's/\<MS_SYNCHRONOUS\>/SB_SYNCHRONOUS/g' \
	        -e 's/\<MS_MANDLOCK\>/SB_MANDLOCK/g' \
	        -e 's/\<MS_DIRSYNC\>/SB_DIRSYNC/g' \
	        -e 's/\<MS_NOATIME\>/SB_NOATIME/g' \
	        -e 's/\<MS_NODIRATIME\>/SB_NODIRATIME/g' \
	        -e 's/\<MS_SILENT\>/SB_SILENT/g' \
	        -e 's/\<MS_POSIXACL\>/SB_POSIXACL/g' \
	        -e 's/\<MS_KERNMOUNT\>/SB_KERNMOUNT/g' \
	        -e 's/\<MS_I_VERSION\>/SB_I_VERSION/g' \
	        -e 's/\<MS_LAZYTIME\>/SB_LAZYTIME/g' \
	        $list

  and commit it with something along the lines of 'convert filesystems
  away from use of MS_... constants' as commit message, it would save a
  quite a bit of headache next cycle"

* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  VFS: Differentiate mount flags (MS_*) from internal superblock flags
  VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)
  vfs: Add sb_rdonly(sb) to query the MS_RDONLY flag on s_flags
2017-09-14 18:54:01 -07:00
Michal Hocko 0ee931c4e3 mm: treewide: remove GFP_TEMPORARY allocation flag
GFP_TEMPORARY was introduced by commit e12ba74d8f ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation.  As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory.  So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification.  I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring.  This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL.  Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
seems to be a heuristic without any measured advantage for most (if not
all) its current users.  The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers.  So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
  Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13 18:53:16 -07:00
Linus Torvalds c353f88f3d Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
 "This fixes d_ino correctness in readdir, which brings overlayfs on par
  with normal filesystems regarding inode number semantics, as long as
  all layers are on the same filesystem.

  There are also some bug fixes, one in particular (random ioctl's
  shouldn't be able to modify lower layers) that touches some vfs code,
  but of course no-op for non-overlay fs"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix false positive ESTALE on lookup
  ovl: don't allow writing ioctl on lower layer
  ovl: fix relatime for directories
  vfs: add flags to d_real()
  ovl: cleanup d_real for negative
  ovl: constant d_ino for non-merge dirs
  ovl: constant d_ino across copy up
  ovl: fix readdir error value
  ovl: check snprintf return
2017-09-13 09:11:44 -07:00
Amir Goldstein 939ae4efd5 ovl: fix false positive ESTALE on lookup
Commit b9ac5c274b ("ovl: hash overlay non-dir inodes by copy up origin")
verifies that the origin lower inode stored in the overlayfs inode matched
the inode of a copy up origin dentry found by lookup.

There is a false positive result in that check when lower fs does not
support file handles and copy up origin cannot be followed by file handle
at lookup time.

The false negative happens when finding an overlay inode in cache on a
copied up overlay dentry lookup. The overlay inode still 'remembers' the
copy up origin inode, but the copy up origin dentry is not available for
verification.

Relax the check in case copy up origin dentry is not available.

Fixes: b9ac5c274b ("ovl: hash overlay non-dir inodes by copy up...")
Cc: <stable@vger.kernel.org> # v4.13
Reported-by: Jordi Pujol <jordipujolp@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-12 17:22:20 +02:00
Miklos Szeredi cd91304e71 ovl: fix relatime for directories
Need to treat non-regular overlayfs files the same as regular files when
checking for an atime update.

Add a d_real() flag to make it return the upper dentry for all file types.

Reported-by: "zhangyi (F)" <yi.zhang@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-05 12:53:11 +02:00
Miklos Szeredi 495e642939 vfs: add flags to d_real()
Add a separate flags argument (in addition to the open flags) to control
the behavior of d_real().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-04 21:42:22 +02:00
Miklos Szeredi 191a3980c6 ovl: cleanup d_real for negative
d_real() is never called with a negative dentry.  So remove the
d_is_negative() check (which would never trigger anyway, since d_is_reg()
returns false for a negative dentry).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-09-04 16:44:42 +02:00
Peter Zijlstra ff7a5fb0f1 overlayfs, locking: Remove smp_mb__before_spinlock() usage
While we could replace the smp_mb__before_spinlock() with the new
smp_mb__after_spinlock(), the normal pattern is to use
smp_store_release() to publish an object that is used for
lockless_dereference() -- and mirrors the regular rcu_assign_pointer()
/ rcu_dereference() patterns.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:02 +02:00