linux/fs
Linus Torvalds 0cbee99269 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace updates from Eric Biederman:
 "Long ago and far away when user namespaces where young it was realized
  that allowing fresh mounts of proc and sysfs with only user namespace
  permissions could violate the basic rule that only root gets to decide
  if proc or sysfs should be mounted at all.

  Some hacks were put in place to reduce the worst of the damage could
  be done, and the common sense rule was adopted that fresh mounts of
  proc and sysfs should allow no more than bind mounts of proc and
  sysfs.  Unfortunately that rule has not been fully enforced.

  There are two kinds of gaps in that enforcement.  Only filesystems
  mounted on empty directories of proc and sysfs should be ignored but
  the test for empty directories was insufficient.  So in my tree
  directories on proc, sysctl and sysfs that will always be empty are
  created specially.  Every other technique is imperfect as an ordinary
  directory can have entries added even after a readdir returns and
  shows that the directory is empty.  Special creation of directories
  for mount points makes the code in the kernel a smidge clearer about
  it's purpose.  I asked container developers from the various container
  projects to help test this and no holes were found in the set of mount
  points on proc and sysfs that are created specially.

  This set of changes also starts enforcing the mount flags of fresh
  mounts of proc and sysfs are consistent with the existing mount of
  proc and sysfs.  I expected this to be the boring part of the work but
  unfortunately unprivileged userspace winds up mounting fresh copies of
  proc and sysfs with noexec and nosuid clear when root set those flags
  on the previous mount of proc and sysfs.  So for now only the atime,
  read-only and nodev attributes which userspace happens to keep
  consistent are enforced.  Dealing with the noexec and nosuid
  attributes remains for another time.

  This set of changes also addresses an issue with how open file
  descriptors from /proc/<pid>/ns/* are displayed.  Recently readlink of
  /proc/<pid>/fd has been triggering a WARN_ON that has not been
  meaningful since it was added (as all of the code in the kernel was
  converted) and is not now actively wrong.

  There is also a short list of issues that have not been fixed yet that
  I will mention briefly.

  It is possible to rename a directory from below to above a bind mount.
  At which point any directory pointers below the renamed directory can
  be walked up to the root directory of the filesystem.  With user
  namespaces enabled a bind mount of the bind mount can be created
  allowing the user to pick a directory whose children they can rename
  to outside of the bind mount.  This is challenging to fix and doubly
  so because all obvious solutions must touch code that is in the
  performance part of pathname resolution.

  As mentioned above there is also a question of how to ensure that
  developers by accident or with purpose do not introduce exectuable
  files on sysfs and proc and in doing so introduce security regressions
  in the current userspace that will not be immediately obvious and as
  such are likely to require breaking userspace in painful ways once
  they are recognized"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  vfs: Remove incorrect debugging WARN in prepend_path
  mnt: Update fs_fully_visible to test for permanently empty directories
  sysfs: Create mountpoints with sysfs_create_mount_point
  sysfs: Add support for permanently empty directories to serve as mount points.
  kernfs: Add support for always empty directories.
  proc: Allow creating permanently empty directories that serve as mount points
  sysctl: Allow creating permanently empty directories that serve as mountpoints.
  fs: Add helper functions for permanently empty directories.
  vfs: Ignore unlocked mounts in fs_fully_visible
  mnt: Modify fs_fully_visible to deal with locked ro nodev and atime
  mnt: Refactor the logic for mounting sysfs and proc in a user namespace
2015-07-03 15:20:57 -07:00
..
9p Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block 2015-06-25 16:00:17 -07:00
adfs fs/adfs: remove unneeded cast 2015-06-30 19:44:57 -07:00
affs fs/affs/symlink.c: remove unneeded err variable 2015-06-30 19:44:57 -07:00
afs net: Add a struct net parameter to sock_create_kern 2015-05-11 10:50:17 -04:00
autofs4 don't pass nameidata to ->follow_link() 2015-05-10 22:20:15 -04:00
befs fs/befs/btree.c: remove unneeded initializations 2015-06-25 17:00:43 -07:00
bfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
btrfs Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2015-06-30 20:07:45 -07:00
cachefiles VFS: fs/cachefiles: d_backing_inode() annotations 2015-04-15 15:06:59 -04:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-07-02 11:35:00 -07:00
cifs cifs: Unset CIFS_MOUNT_POSIX_PATHS flag when following dfs mounts 2015-06-29 14:50:22 -05:00
coda VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
configfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
cramfs
debugfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
devpts devpts: if initialization failed, don't crash when opening /dev/ptmx 2015-06-30 19:44:58 -07:00
dlm net: Add a struct net parameter to sock_create_kern 2015-05-11 10:50:17 -04:00
ecryptfs get rid of assorted nameidata-related debris 2015-05-15 01:10:37 -04:00
efivarfs Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-05-06 10:57:37 -07:00
efs fs/efs: femove unneeded cast 2015-06-25 17:00:42 -07:00
exofs exofs: switch to {simple,page}_symlink_inode_operations 2015-05-10 22:18:27 -04:00
exportfs VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) 2015-02-22 11:38:41 -05:00
ext2 xfs: update for 4.2-rc1 2015-06-30 20:16:08 -07:00
ext3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2015-06-24 20:07:10 -07:00
ext4 xfs: update for 4.2-rc1 2015-06-30 20:16:08 -07:00
f2fs Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block 2015-06-25 16:00:17 -07:00
fat writeback: separate out include/linux/backing-dev-defs.h 2015-06-02 08:33:34 -06:00
freevxfs freevxfs: switch to simple_follow_link() 2015-05-10 22:18:27 -04:00
fscache fs/fscache/object-list.c: use __seq_open_private() 2014-10-13 17:52:21 +01:00
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
gfs2 GFS2: merge window 2015-06-27 09:47:46 -07:00
hfs writeback: separate out include/linux/backing-dev-defs.h 2015-06-02 08:33:34 -06:00
hfsplus writeback: separate out include/linux/backing-dev-defs.h 2015-06-02 08:33:34 -06:00
hostfs Merge branch 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-06-22 12:51:21 -07:00
hpfs VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
hugetlbfs mm/hugetlb: reduce arch dependent code about hugetlb_prefault_arch_hook 2015-06-24 17:49:41 -07:00
isofs VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
jbd jbd: Deletion of an unnecessary check before the function call "iput" 2014-11-18 10:15:29 +01:00
jbd2 Revert "jbd2: speedup jbd2_journal_dirty_metadata()" 2015-06-27 09:41:50 -07:00
jffs2 MTD fixes for 4.2 2015-06-23 17:38:39 -07:00
jfs jfs: switch to simple_follow_link() 2015-05-10 22:18:26 -04:00
kernfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
lockd nfsd: eliminate NFSD_DEBUG 2015-04-21 16:16:02 -04:00
logfs logfs: fix a pagecache leak for symlinks 2015-05-10 22:18:28 -04:00
minix fs/minix: remove unneeded cast 2015-06-25 17:00:42 -07:00
ncpfs VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
nfs NFS client updates for Linux 4.2 2015-07-02 11:32:23 -07:00
nfs_common
nfsd nfsd: wrap too long lines in nfsd4_encode_read 2015-06-22 14:15:05 -04:00
nilfs2 Merge branch 'akpm' (patches from Andrew) 2015-06-26 09:52:05 -07:00
nls
notify fs/notify: don't use module_init for non-modular inotify_user code 2015-06-16 14:12:34 -04:00
ntfs mm: do not ignore mapping_gfp_mask in page cache allocation paths 2015-06-24 17:49:44 -07:00
ocfs2 Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block 2015-06-25 16:00:17 -07:00
omfs omfs: fix potential integer overflow in allocator 2015-05-28 18:25:19 -07:00
openpromfs
overlayfs Merge branch 'overlayfs-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2015-07-02 11:23:00 -07:00
proc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
pstore Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
qnx4
qnx6 VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
quota Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
ramfs VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
reiserfs Merge branch 'akpm' (patches from Andrew) 2015-06-26 09:52:05 -07:00
romfs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
squashfs VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
sysfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
sysv sysv: switch to simple_follow_link() 2015-05-10 22:18:26 -04:00
tracefs sysfs: Create mountpoints with sysfs_create_mount_point 2015-07-01 10:36:47 -05:00
ubifs This pull request includes the following UBI/UBIFS changes: 2015-06-25 14:11:34 -07:00
udf udf: fix udf_load_pvoldesc() 2015-05-21 15:19:15 +02:00
ufs Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block 2015-06-25 16:00:17 -07:00
xfs xfs: update for 4.2-rc1 2015-06-30 20:16:08 -07:00
Kconfig f2fs: relocate Kconfig from misc filesystems 2015-04-10 15:08:35 -07:00
Kconfig.binfmt mm: split ET_DYN ASLR from mmap ASLR 2015-04-14 16:49:05 -07:00
Makefile um: Remove hppfs 2015-05-31 13:23:08 +02:00
aio.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
anon_inodes.c
attr.c
bad_inode.c don't bother with most of the bad_file_ops methods 2015-02-20 04:03:58 -05:00
binfmt_aout.c assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
binfmt_elf.c fs/binfmt_elf.c:load_elf_binary(): return -EINVAL on zero-length mappings 2015-05-28 18:25:18 -07:00
binfmt_elf_fdpic.c
binfmt_em86.c syscalls: implement execveat() system call 2014-12-13 12:42:51 -08:00
binfmt_flat.c
binfmt_misc.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
binfmt_script.c syscalls: implement execveat() system call 2014-12-13 12:42:51 -08:00
block_dev.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2015-06-30 19:46:34 -07:00
buffer.c buffer: remove unusued 'ret' variable 2015-06-02 09:22:34 -06:00
char_dev.c fs: introduce f_op->mmap_capabilities for nommu mmap support 2015-01-20 14:02:58 -07:00
compat.c vfs: make first argument of dir_context.actor typed 2014-10-31 17:48:54 -04:00
compat_binfmt_elf.c
compat_ioctl.c Bluetooth: bnep: Add support for get bnep features via ioctl 2015-04-03 23:21:34 +02:00
coredump.c coredump: add __printf attribute to cn_*printf functions 2015-06-25 17:00:43 -07:00
dax.c dax: expose __dax_fault for filesystems with locking constraints 2015-06-04 09:18:18 +10:00
dcache.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
dcookies.c
direct-io.c direct-io: only inc/dec inode->i_dio_count for file systems 2015-04-24 15:45:28 -04:00
drop_caches.c vmscan: per memory cgroup slab shrinkers 2015-02-12 18:54:09 -08:00
eventfd.c eventfd: don't take the spinlock in eventfd_poll 2015-02-17 14:34:52 -08:00
eventpoll.c epoll: optimize setting task running after blocking 2015-02-13 21:21:40 -08:00
exec.c parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures 2015-05-12 22:03:44 +02:00
fcntl.c vfs: renumber FMODE_NONOTIFY and add to uniqueness check 2015-01-08 15:10:52 -08:00
fhandle.c vfs: read file_handle only once in handle_to_path 2015-06-02 10:29:07 -07:00
file.c mm: rcu-protected get_mm_exe_file() 2015-04-17 09:04:07 -04:00
file_table.c ->aio_read and ->aio_write removed 2015-04-11 22:29:43 -04:00
filesystems.c
fs-writeback.c writeback: do foreign inode detection iff cgroup writeback is enabled 2015-06-17 12:47:37 -06:00
fs_pin.c fs_pin: Allow for the possibility that m_list or s_list go unused. 2015-04-09 11:39:55 -05:00
fs_struct.c
inode.c Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block 2015-06-25 16:00:17 -07:00
internal.h trylock_super(): replacement for grab_super_passive() 2015-02-22 11:38:42 -05:00
ioctl.c fsioctl.c: make generic_block_fiemap() signal-tolerant 2015-02-10 14:30:30 -08:00
libfs.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
locks.c proc: show locks in /proc/pid/fdinfo/X 2015-04-17 09:04:12 -04:00
mbcache.c
mount.h fs: use seq_open_private() for proc_mounts 2015-06-30 19:44:56 -07:00
mpage.c writeback: implement foreign cgroup inode detection 2015-06-02 08:40:20 -06:00
namei.c turn user_{path_at,path,lpath,path_dir}() into static inlines 2015-05-15 01:10:45 -04:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
no-block.c
nsfs.c VFS: assorted weird filesystems: d_inode() annotations 2015-04-15 15:06:58 -04:00
open.c VFS: Handle lower layer dentry/inode in pathwalk 2015-05-11 08:13:10 -04:00
pipe.c VFS: assorted weird filesystems: d_inode() annotations 2015-04-15 15:06:58 -04:00
pnode.c mnt: Don't propagate unmounts to locked mounts 2015-04-02 20:34:20 -05:00
pnode.h mnt: Honor MNT_LOCKED when detaching mounts 2015-04-09 11:39:55 -05:00
posix_acl.c VFS: assorted d_backing_inode() annotations 2015-04-15 15:06:59 -04:00
proc_namespace.c fs: use seq_open_private() for proc_mounts 2015-06-30 19:44:56 -07:00
read_write.c new_sync_write(): discard ->ki_pos unless the return value is positive 2015-04-11 22:29:46 -04:00
readdir.c vfs: make first argument of dir_context.actor typed 2014-10-31 17:48:54 -04:00
select.c locking/arch: Rename set_mb() to smp_store_mb() 2015-05-19 08:32:00 +02:00
seq_file.c Merge branch 'akpm' (patches from Andrew) 2015-07-01 17:47:51 -07:00
signalfd.c fs: Convert show_fdinfo functions to void 2014-11-05 14:13:23 -05:00
splice.c Merge branch 'akpm' (patches from Andrew) 2015-06-24 20:47:21 -07:00
stack.c
stat.c VFS: assorted d_backing_inode() annotations 2015-04-15 15:06:59 -04:00
statfs.c
super.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
sync.c vfs: add support for a lazytime mount option 2015-02-05 02:45:00 -05:00
timerfd.c fs: Convert show_fdinfo functions to void 2014-11-05 14:13:23 -05:00
utimes.c
xattr.c evm: fix potential race when removing xattrs 2015-05-21 13:28:47 -04:00