linux/kernel
Linus Torvalds 0cbee99269 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace updates from Eric Biederman:
 "Long ago and far away when user namespaces were young it was realized
  that allowing fresh mounts of proc and sysfs with only user namespace
  permissions could violate the basic rule that only root gets to decide
  if proc or sysfs should be mounted at all.

  Some hacks were put in place to reduce the worst of the damage that could
  be done, and the common sense rule was adopted that fresh mounts of
  proc and sysfs should allow no more than bind mounts of proc and
  sysfs.  Unfortunately that rule has not been fully enforced.

  There are two kinds of gaps in that enforcement.  Only filesystems
  mounted on empty directories of proc and sysfs should be ignored but
  the test for empty directories was insufficient.  So in my tree
  directories on proc, sysctl and sysfs that will always be empty are
  created specially.  Every other technique is imperfect as an ordinary
  directory can have entries added even after a readdir returns and
  shows that the directory is empty.  Special creation of directories
  for mount points makes the code in the kernel a smidge clearer about
  its purpose.  I asked container developers from the various container
  projects to help test this and no holes were found in the set of mount
  points on proc and sysfs that are created specially.

  This set of changes also starts enforcing that the mount flags of fresh
  mounts of proc and sysfs are consistent with the existing mount of
  proc and sysfs.  I expected this to be the boring part of the work but
  unfortunately unprivileged userspace winds up mounting fresh copies of
  proc and sysfs with noexec and nosuid clear when root set those flags
  on the previous mount of proc and sysfs.  So for now only the atime,
  read-only and nodev attributes which userspace happens to keep
  consistent are enforced.  Dealing with the noexec and nosuid
  attributes remains for another time.

  This set of changes also addresses an issue with how open file
  descriptors from /proc/<pid>/ns/* are displayed.  Recently readlink of
  /proc/<pid>/fd has been triggering a WARN_ON that has not been
  meaningful since it was added (as all of the code in the kernel was
  converted) and is now actively wrong.

  There is also a short list of issues that have not been fixed yet that
  I will mention briefly.

  It is possible to rename a directory from below to above a bind mount,
  at which point any directory pointers below the renamed directory can
  be walked up to the root directory of the filesystem.  With user
  namespaces enabled a bind mount of the bind mount can be created
  allowing the user to pick a directory whose children they can rename
  to outside of the bind mount.  This is challenging to fix and doubly
  so because all obvious solutions must touch code that is in the
  performance-critical part of pathname resolution.

  As mentioned above there is also a question of how to ensure that
  developers by accident or with purpose do not introduce executable
  files on sysfs and proc and in doing so introduce security regressions
  in the current userspace that will not be immediately obvious and as
  such are likely to require breaking userspace in painful ways once
  they are recognized."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  vfs: Remove incorrect debugging WARN in prepend_path
  mnt: Update fs_fully_visible to test for permanently empty directories
  sysfs: Create mountpoints with sysfs_create_mount_point
  sysfs: Add support for permanently empty directories to serve as mount points.
  kernfs: Add support for always empty directories.
  proc: Allow creating permanently empty directories that serve as mount points
  sysctl: Allow creating permanently empty directories that serve as mountpoints.
  fs: Add helper functions for permanently empty directories.
  vfs: Ignore unlocked mounts in fs_fully_visible
  mnt: Modify fs_fully_visible to deal with locked ro nodev and atime
  mnt: Refactor the logic for mounting sysfs and proc in a user namespace
2015-07-03 15:20:57 -07:00
bpf bpf: allow networking programs to use bpf_trace_printk() for debugging 2015-06-15 15:53:50 -07:00
configs kconfig: add xenconfig defconfig helper 2015-06-16 11:04:29 +01:00
debug debug: prevent entering debug mode on panic/exception. 2015-02-19 12:39:03 -06:00
events This patch series contains several clean ups and even a new trace clock 2015-06-26 14:02:43 -07:00
gcov gcov: add support for GCC 5.1 2015-06-30 19:44:57 -07:00
irq Merge branch 'irq/for-x86' into irq/core 2015-06-20 19:14:31 +02:00
livepatch Merge branches 'for-4.1/upstream-fixes', 'for-4.2/kaslr' and 'for-4.2/upstream' into for-linus 2015-06-22 16:26:56 +02:00
locking Merge branch 'sched-hrtimers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-24 15:09:40 -07:00
power Power management and ACPI fixes for v4.2-rc1 2015-07-01 14:17:44 -07:00
printk printk: improve the description of /dev/kmsg line format 2015-06-30 19:44:59 -07:00
rcu This patch series contains several clean ups and even a new trace clock 2015-06-26 14:02:43 -07:00
sched Merge branch 'sched-hrtimers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-24 15:09:40 -07:00
time Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-07-01 15:44:18 -07:00
trace This patch series contains several clean ups and even a new trace clock 2015-06-26 14:02:43 -07:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS 2015-05-12 09:46:00 +02:00
Kconfig.preempt
Makefile make certificate list change message more useful 2015-07-02 16:42:13 -07:00
acct.c acct: check FMODE_CAN_WRITE 2015-04-11 22:27:55 -04:00
async.c kernel/async.c: switch to pr_foo() 2014-10-09 22:26:04 -04:00
audit.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-06-27 13:53:16 -07:00
audit.h Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-04-22 14:49:23 -07:00
audit_tree.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
audit_watch.c VFS: audit: d_backing_inode() annotations 2015-04-15 15:06:55 -04:00
auditfilter.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-02-11 20:07:47 -08:00
auditsc.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-06-27 13:53:16 -07:00
backtracetest.c
bounds.c page-cgroup: get rid of NR_PCG_FLAGS 2014-08-08 15:57:18 -07:00
capability.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
cgroup.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
cgroup_freezer.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
compat.c compat: cleanup coding in compat_get_bitmap() and compat_put_bitmap() 2015-06-04 23:57:18 +02:00
configs.c
context_tracking.c context_tracking: Inherit TIF_NOHZ through forks instead of context switches 2015-05-07 12:02:51 +02:00
cpu.c cpu: Remove new instance of __cpuinit that crept back in 2015-05-27 12:58:39 -07:00
cpu_pm.c
cpuset.c kernel, cpuset: remove exception for __GFP_THISNODE 2015-04-14 16:49:03 -07:00
crash_dump.c crash_dump: Make is_kdump_kernel() accessible from modules 2014-08-25 15:42:19 -07:00
cred.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
delayacct.c delayacct: Remove braindamaged type conversions 2014-07-23 10:18:06 -07:00
dma.c
elfcore.c
exec_domain.c Remove rest of exec domains. 2015-04-12 21:03:31 +02:00
exit.c exit,stats: /* obey this comment */ 2015-06-25 17:00:43 -07:00
extable.c ftrace/x86/extable: Add is_ftrace_trampoline() function 2014-11-19 15:25:26 -05:00
fork.c Merge branch 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2015-06-26 19:50:04 -07:00
freezer.c freezer: remove obsolete comments in __thaw_task() 2014-10-21 23:44:20 +02:00
futex.c Merge branch 'sched-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-24 14:46:01 -07:00
futex_compat.c
groups.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
hung_task.c kernel/hung_task.c: change hung_task.c to use for_each_process_thread() 2015-04-15 16:35:22 -07:00
irq_work.c percpu: Convert remaining __get_cpu_var uses in 3.18-rcX 2014-10-29 11:18:18 -04:00
jump_label.c module, jump_label: Fix module locking 2015-05-27 11:09:50 +09:30
kallsyms.c kernel/kallsyms.c: use __seq_open_private() 2014-10-14 02:18:16 +02:00
kcmp.c kcmp: fix standard comparison bug 2014-09-10 15:42:12 -07:00
kexec.c kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path 2015-06-30 19:44:57 -07:00
kmod.c usermodehelper: kill the kmod_thread_locker logic 2014-12-10 17:41:17 -08:00
kprobes.c kprobes: makes kprobes/enabled works correctly for optimized kprobes. 2015-02-13 21:21:42 -08:00
ksysfs.c
kthread.c kernel/kthread.c: partial revert of 81c98869fa ("kthread: ensure locality of task_struct allocations") 2014-10-09 22:25:51 -04:00
latencytop.c
module-internal.h
module.c Minor merge needed, due to function move. 2015-07-01 10:49:25 -07:00
module_signing.c
notifier.c rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
nsproxy.c bury struct proc_ns in fs/proc 2014-12-04 14:34:54 -05:00
padata.c padata: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:38 -08:00
panic.c kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path 2015-06-30 19:44:57 -07:00
params.c Minor merge needed, due to function move. 2015-07-01 10:49:25 -07:00
pid.c fork: report pid reservation failure properly 2015-04-17 09:04:06 -04:00
pid_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-16 15:53:03 -08:00
profile.c profile: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:38 -08:00
ptrace.c ptrace: ptrace_detach() can no longer race with SIGKILL 2015-04-17 09:04:06 -04:00
range.c kernel: avoid overflow in cmp_range 2015-01-17 10:02:23 +13:00
reboot.c kernel/reboot.c: add orderly_reboot for graceful reboot 2015-04-15 16:35:23 -07:00
relay.c kernel/relay.c: use kvfree() in relay_free_page_array() 2015-06-30 19:44:59 -07:00
resource.c kernel/resource.c: remove deprecated __check_region() and friends 2015-04-15 16:35:22 -07:00
seccomp.c seccomp, filter: add and use bpf_prog_create_from_user from seccomp 2015-05-09 17:35:05 -04:00
signal.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2015-06-27 13:26:03 -07:00
smp.c smp: Fix error case handling in smp_call_function_*() 2015-04-19 13:19:23 -07:00
smpboot.c watchdog: add watchdog_cpumask sysctl to assist nohz 2015-06-24 17:49:40 -07:00
smpboot.h
softirq.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-02-09 15:24:03 -08:00
stacktrace.c stacktrace: introduce snprint_stack_trace for buffer output 2014-12-13 12:42:48 -08:00
stop_machine.c sched/stop_machine: Fix deadlock between multiple stop_two_cpus() 2015-06-19 10:03:12 +02:00
sys.c prctl: more prctl(PR_SET_MM_*) checks 2015-06-25 17:00:37 -07:00
sys_ni.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
sysctl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
sysctl_binary.c kernel: add panic_on_warn 2014-12-10 17:41:10 -08:00
system_certificates.S
system_keyring.c KEYS: validate certificate trust only with builtin keys 2014-07-17 09:35:17 -04:00
task_work.c
taskstats.c netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
test_kprobes.c kernel/test_kprobes.c: use current logging functions 2014-08-08 15:57:18 -07:00
torture.c rcu: Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() 2015-05-27 12:56:15 -07:00
tracepoint.c
tsacct.c sched: Make task->start_time nanoseconds based 2014-07-23 10:18:05 -07:00
uid16.c groups: Consolidate the setgroups permission checks 2014-12-05 17:19:27 -06:00
up.c
user-return-notifier.c scheduler: Replace __get_cpu_var with this_cpu_ptr 2014-08-26 13:45:45 -04:00
user.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2014-12-17 12:31:40 -08:00
user_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2014-12-17 12:31:40 -08:00
utsname.c copy address of proc_ns_ops into ns_common 2014-12-04 14:34:47 -05:00
utsname_sysctl.c
watchdog.c watchdog: add watchdog_cpumask sysctl to assist nohz 2015-06-24 17:49:40 -07:00
workqueue.c Minor merge needed, due to function move. 2015-07-01 10:49:25 -07:00
workqueue_internal.h