Commit Graph

1808 Commits

Author SHA1 Message Date
Kees Cook 7612bfeecc Yama: access task_struct->comm directly
The core ptrace access checking routine holds a task lock, and when
reporting a failure, Yama takes a separate task lock. To avoid a
potential deadlock with two ptracers taking the opposite locks, do not
use get_task_comm() and just use ->comm directly since accuracy is not
important for the report.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
CC: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-08-17 20:40:38 +10:00
Kees Cook 9d8dad742a Yama: higher restrictions should block PTRACE_TRACEME
The higher ptrace restriction levels should be blocking even
PTRACE_TRACEME requests. The comments in the LSM documentation are
misleading about when the checks happen (the parent does not go through
security_ptrace_access_check() on a PTRACE_TRACEME call).

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org # 3.5.x and later
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-08-10 19:58:07 +10:00
Mel Gorman 6290c2c439 selinux: tag avc cache alloc as non-critical
Failing to allocate a cache entry will only harm performance not
correctness.  Do not consume valuable reserve pages for something like
that.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-31 18:42:47 -07:00
Linus Torvalds 27c1ee3f92 Merge branch 'akpm' (Andrew's patch-bomb)
Merge Andrew's first set of patches:
 "Non-MM patches:

   - lots of misc bits

   - tree-wide have_clk() cleanups

   - quite a lot of printk tweaks.  I draw your attention to "printk:
     convert the format for KERN_<LEVEL> to a 2 byte pattern" which
     looks a bit scary.  But afaict it's solid.

   - backlight updates

   - lib/ feature work (notably the addition and use of memweight())

   - checkpatch updates

   - rtc updates

   - nilfs updates

   - fatfs updates (partial, still waiting for acks)

   - kdump, proc, fork, IPC, sysctl, taskstats, pps, etc

   - new fault-injection feature work"

* Merge emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
  drivers/misc/lkdtm.c: fix missing allocation failure check
  lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table()
  fault-injection: add tool to run command with failslab or fail_page_alloc
  fault-injection: add selftests for cpu and memory hotplug
  powerpc: pSeries reconfig notifier error injection module
  memory: memory notifier error injection module
  PM: PM notifier error injection module
  cpu: rewrite cpu-notifier-error-inject module
  fault-injection: notifier error injection
  c/r: fcntl: add F_GETOWNER_UIDS option
  resource: make sure requested range is included in the root range
  include/linux/aio.h: cpp->C conversions
  fs: cachefiles: add support for large files in filesystem caching
  pps: return PTR_ERR on error in device_create
  taskstats: check nla_reserve() return
  sysctl: suppress kmemleak messages
  ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION
  ipc: compat: use signed size_t types for msgsnd and msgrcv
  ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC
  ipc: add COMPAT_SHMLBA support
  ...
2012-07-30 17:25:34 -07:00
Cyrill Gorcunov 1d151c337d c/r: fcntl: add F_GETOWNER_UIDS option
When we restore file descriptors we would like them to look exactly as
they were at dumping time.

With help of fcntl it's almost possible, the missing snippet is file
owners UIDs.

To be able to read their values the F_GETOWNER_UIDS is introduced.

This option is valid iif CONFIG_CHECKPOINT_RESTORE is turned on, otherwise
returning -EINVAL.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-30 17:25:21 -07:00
Al Viro e3fea3f70f selinux: fix selinux_inode_setxattr oops
OK, what we have so far is e.g.
	setxattr(path, name, whatever, 0, XATTR_REPLACE)
with name being good enough to get through xattr_permission().
Then we reach security_inode_setxattr() with the desired value and size.
Aha.  name should begin with "security.selinux", or we won't get that
far in selinux_inode_setxattr().  Suppose we got there and have enough
permissions to relabel that sucker.  We call security_context_to_sid()
with value == NULL, size == 0.  OK, we want ss_initialized to be non-zero.
I.e. after everything had been set up and running.  No problem...

We do 1-byte kmalloc(), zero-length memcpy() (which doesn't oops, even
thought the source is NULL) and put a NUL there.  I.e. form an empty
string.  string_to_context_struct() is called and looks for the first
':' in there.  Not found, -EINVAL we get.  OK, security_context_to_sid_core()
has rc == -EINVAL, force == 0, so it silently returns -EINVAL.
All it takes now is not having CAP_MAC_ADMIN and we are fucked.

All right, it might be a different bug (modulo strange code quoted in the
report), but it's real.  Easily fixed, AFAICS:

Deal with size == 0, value == NULL case in selinux_inode_setxattr()

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Dave Jones <davej@redhat.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-30 15:36:50 +10:00
Alan Cox 3b9fc37280 smack: off by one error
Consider the input case of a rule that consists entirely of non space
symbols followed by a \0. Say 64 + \0

In this case strlen(data) = 64
kzalloc of subject and object are 64 byte objects
sscanfdata, "%s %s %s", subject, ...)

will put 65 bytes into subject.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-30 15:04:17 +10:00
Josh Boyer 8ded2bbc18 posix_types.h: Cleanup stale __NFDBITS and related definitions
Recently, glibc made a change to suppress sign-conversion warnings in
FD_SET (glibc commit ceb9e56b3d1).  This uncovered an issue with the
kernel's definition of __NFDBITS if applications #include
<linux/types.h> after including <sys/select.h>.  A build failure would
be seen when passing the -Werror=sign-compare and -D_FORTIFY_SOURCE=2
flags to gcc.

It was suggested that the kernel should either match the glibc
definition of __NFDBITS or remove that entirely.  The current in-kernel
uses of __NFDBITS can be replaced with BITS_PER_LONG, and there are no
uses of the related __FDELT and __FDMASK defines.  Given that, we'll
continue the cleanup that was started with commit 8b3d1cda4f
("posix_types: Remove fd_set macros") and drop the remaining unused
macros.

Additionally, linux/time.h has similar macros defined that expand to
nothing so we'll remove those at the same time.

Reported-by: Jeff Law <law@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
[ .. and fix up whitespace as per akpm ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-26 13:36:43 -07:00
Linus Torvalds 3c4cfadef6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David S Miller:

 1) Remove the ipv4 routing cache.  Now lookups go directly into the FIB
    trie and use prebuilt routes cached there.

    No more garbage collection, no more rDOS attacks on the routing
    cache.  Instead we now get predictable and consistent performance,
    no matter what the pattern of traffic we service.

    This has been almost 2 years in the making.  Special thanks to
    Julian Anastasov, Eric Dumazet, Steffen Klassert, and others who
    have helped along the way.

    I'm sure that with a change of this magnitude there will be some
    kind of fallout, but such things ought the be simple to fix at this
    point.  Luckily I'm not European so I'll be around all of August to
    fix things :-)

    The major stages of this work here are each fronted by a forced
    merge commit whose commit message contains a top-level description
    of the motivations and implementation issues.

 2) Pre-demux of established ipv4 TCP sockets, saves a route demux on
    input.

 3) TCP SYN/ACK performance tweaks from Eric Dumazet.

 4) Add namespace support for netfilter L4 conntrack helpers, from Gao
    Feng.

 5) Add config mechanism for Energy Efficient Ethernet to ethtool, from
    Yuval Mintz.

 6) Remove quadratic behavior from /proc/net/unix, from Eric Dumazet.

 7) Support for connection tracker helpers in userspace, from Pablo
    Neira Ayuso.

 8) Allow userspace driven TX load balancing functions in TEAM driver,
    from Jiri Pirko.

 9) Kill off NLMSG_PUT and RTA_PUT macros, more gross stuff with
    embedded gotos.

10) TCP Small Queues, essentially minimize the amount of TCP data queued
    up in the packet scheduler layer.  Whereas the existing BQL (Byte
    Queue Limits) limits the pkt_sched --> netdevice queuing levels,
    this controls the TCP --> pkt_sched queueing levels.

    From Eric Dumazet.

11) Reduce the number of get_page/put_page ops done on SKB fragments,
    from Alexander Duyck.

12) Implement protection against blind resets in TCP (RFC 5961), from
    Eric Dumazet.

13) Support the client side of TCP Fast Open, basically the ability to
    send data in the SYN exchange, from Yuchung Cheng.

    Basically, the sender queues up data with a sendmsg() call using
    MSG_FASTOPEN, then they do the connect() which emits the queued up
    fastopen data.

14) Avoid all the problems we get into in TCP when timers or PMTU events
    hit a locked socket.  The TCP Small Queues changes added a
    tcp_release_cb() that allows us to queue work up to the
    release_sock() caller, and that's what we use here too.  From Eric
    Dumazet.

15) Zero copy on TX support for TUN driver, from Michael S. Tsirkin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1870 commits)
  genetlink: define lockdep_genl_is_held() when CONFIG_LOCKDEP
  r8169: revert "add byte queue limit support".
  ipv4: Change rt->rt_iif encoding.
  net: Make skb->skb_iif always track skb->dev
  ipv4: Prepare for change of rt->rt_iif encoding.
  ipv4: Remove all RTCF_DIRECTSRC handliing.
  ipv4: Really ignore ICMP address requests/replies.
  decnet: Don't set RTCF_DIRECTSRC.
  net/ipv4/ip_vti.c: Fix __rcu warnings detected by sparse.
  ipv4: Remove redundant assignment
  rds: set correct msg_namelen
  openvswitch: potential NULL deref in sample()
  tcp: dont drop MTU reduction indications
  bnx2x: Add new 57840 device IDs
  tcp: avoid oops in tcp_metrics and reset tcpm_stamp
  niu: Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return value
  niu: Fix to check for dma mapping errors.
  net: Fix references to out-of-scope variables in put_cmsg_compat()
  net: ethernet: davinci_emac: add pm_runtime support
  net: ethernet: davinci_emac: Remove unnecessary #include
  ...
2012-07-24 10:01:50 -07:00
Linus Torvalds e05644e17e Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Nothing groundbreaking for this kernel, just cleanups and fixes, and a
  couple of Smack enhancements."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
  Smack: Maintainer Record
  Smack: don't show empty rules when /smack/load or /smack/load2 is read
  Smack: user access check bounds
  Smack: onlycap limits on CAP_MAC_ADMIN
  Smack: fix smack_new_inode bogosities
  ima: audit is compiled only when enabled
  ima: ima_initialized is set only if successful
  ima: add policy for pseudo fs
  ima: remove unused cleanup functions
  ima: free securityfs violations file
  ima: use full pathnames in measurement list
  security: Fix nommu build.
  samples: seccomp: add .gitignore for untracked executables
  tpm: check the chip reference before using it
  TPM: fix memleak when register hardware fails
  TPM: chip disabled state erronously being reported as error
  MAINTAINERS: TPM maintainers' contacts update
  Merge branches 'next-queue' and 'next' into next
  Remove unused code from MPI library
  Revert "crypto: GnuPG based MPI lib - additional sources (part 4)"
  ...
2012-07-23 18:49:06 -07:00
Linus Torvalds a66d2c8f7e Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull the big VFS changes from Al Viro:
 "This one is *big* and changes quite a few things around VFS.  What's in there:

   - the first of two really major architecture changes - death to open
     intents.

     The former is finally there; it was very long in making, but with
     Miklos getting through really hard and messy final push in
     fs/namei.c, we finally have it.  Unlike his variant, this one
     doesn't introduce struct opendata; what we have instead is
     ->atomic_open() taking preallocated struct file * and passing
     everything via its fields.

     Instead of returning struct file *, it returns -E...  on error, 0
     on success and 1 in "deal with it yourself" case (e.g.  symlink
     found on server, etc.).

     See comments before fs/namei.c:atomic_open().  That made a lot of
     goodies finally possible and quite a few are in that pile:
     ->lookup(), ->d_revalidate() and ->create() do not get struct
     nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
     flags instead, ->create() gets "do we want it exclusive" flag.

     With the introduction of new helper (kern_path_locked()) we are rid
     of all struct nameidata instances outside of fs/namei.c; it's still
     visible in namei.h, but not for long.  Come the next cycle,
     declaration will move either to fs/internal.h or to fs/namei.c
     itself.  [me, miklos, hch]

   - The second major change: behaviour of final fput().  Now we have
     __fput() done without any locks held by caller *and* not from deep
     in call stack.

     That obviously lifts a lot of constraints on the locking in there.
     Moreover, it's legal now to call fput() from atomic contexts (which
     has immediately simplified life for aio.c).  We also don't need
     anti-recursion logics in __scm_destroy() anymore.

     There is a price, though - the damn thing has become partially
     asynchronous.  For fput() from normal process we are guaranteed
     that pending __fput() will be done before the caller returns to
     userland, exits or gets stopped for ptrace.

     For kernel threads and atomic contexts it's done via
     schedule_work(), so theoretically we might need a way to make sure
     it's finished; so far only one such place had been found, but there
     might be more.

     There's flush_delayed_fput() (do all pending __fput()) and there's
     __fput_sync() (fput() analog doing __fput() immediately).  I hope
     we won't need them often; see warnings in fs/file_table.c for
     details.  [me, based on task_work series from Oleg merged last
     cycle]

   - sync series from Jan

   - large part of "death to sync_supers()" work from Artem; the only
     bits missing here are exofs and ext4 ones.  As far as I understand,
     those are going via the exofs and ext4 trees resp.; once they are
     in, we can put ->write_super() to the rest, along with the thread
     calling it.

   - preparatory bits from unionmount series (from dhowells).

   - assorted cleanups and fixes all over the place, as usual.

  This is not the last pile for this cycle; there's at least jlayton's
  ESTALE work and fsfreeze series (the latter - in dire need of fixes,
  so I'm not sure it'll make the cut this cycle).  I'll probably throw
  symlink/hardlink restrictions stuff from Kees into the next pile, too.
  Plus there's a lot of misc patches I hadn't thrown into that one -
  it's large enough as it is..."

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
  ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
  btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
  switch dentry_open() to struct path, make it grab references itself
  spufs: shift dget/mntget towards dentry_open()
  zoran: don't bother with struct file * in zoran_map
  ecryptfs: don't reinvent the wheels, please - use struct completion
  don't expose I_NEW inodes via dentry->d_inode
  tidy up namei.c a bit
  unobfuscate follow_up() a bit
  ext3: pass custom EOF to generic_file_llseek_size()
  ext4: use core vfs llseek code for dir seeks
  vfs: allow custom EOF in generic_file_llseek code
  vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
  vfs: Remove unnecessary flushing of block devices
  vfs: Make sys_sync writeout also block device inodes
  vfs: Create function for iterating over block devices
  vfs: Reorder operations during sys_sync
  quota: Move quota syncing to ->sync_fs method
  quota: Split dquot_quota_sync() to writeback and cache flushing part
  vfs: Move noop_backing_dev_info check from sync into writeback
  ...
2012-07-23 12:27:27 -07:00
Al Viro 765927b2d5 switch dentry_open() to struct path, make it grab references itself
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-23 00:01:29 +04:00
Al Viro d35abdb288 hold task_lock around checks in keyctl
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:58:01 +04:00
Al Viro 67d1214551 merge task_work and rcu_head, get rid of separate allocation for keyring case
task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:57:56 +04:00
Al Viro 41f9d29f09 trimming task_work: kill ->data
get rid of the only user of ->data; this is _not_ the final variant - in the
end we'll have task_work and rcu_head identical and just use cred->rcu,
at which point the separate allocation will be gone completely.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-22 23:57:54 +04:00
David S. Miller abaa72d7fd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
2012-07-19 11:17:30 -07:00
Linus Torvalds e2f3b78557 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull SELinux regression fixes from James Morris.

Andrew Morton has a box that hit that open perms problem.

I also renamed the "epollwakeup" selinux name for the new capability to
be "block_suspend", to match the rename done by commit d9914cf661
("PM: Rename CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND").

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  SELinux: do not check open perms if they are not known to policy
  SELinux: include definition of new capabilities
2012-07-18 13:42:44 -07:00
Eric Paris 3d2195c332 SELinux: do not check open perms if they are not known to policy
When I introduced open perms policy didn't understand them and I
implemented them as a policycap.  When I added the checking of open perm
to truncate I forgot to conditionalize it on the userspace defined
policy capability.  Running an old policy with a new kernel will not
check open on open(2) but will check it on truncate.  Conditionalize the
truncate check the same as the open check.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: stable@vger.kernel.org # 3.4.x
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-16 11:41:47 +10:00
Eric Paris 64919e6091 SELinux: include definition of new capabilities
The kernel has added CAP_WAKE_ALARM and CAP_EPOLLWAKEUP.  We need to
define these in SELinux so they can be mediated by policy.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-16 11:40:31 +10:00
Rafal Krypa 65ee7f45cf Smack: don't show empty rules when /smack/load or /smack/load2 is read
This patch removes empty rules (i.e. with access set to '-') from the
rule list presented to user space.

Smack by design never removes labels nor rules from its lists. Access
for a rule may be set to '-' to effectively disable it. Such rules would
show up in the listing generated when /smack/load or /smack/load2 is
read. This may cause clutter if many rules were disabled.

As a rule with access set to '-' is equivalent to no rule at all, they
may be safely hidden from the listing.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:24 -07:00
Casey Schaufler 3518721a89 Smack: user access check bounds
Some of the bounds checking used on the /smack/access
interface was lost when support for long labels was
added. No kernel access checks are affected, however
this is a case where /smack/access could be used
incorrectly and fail to detect the error. This patch
reintroduces the original checks.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:24 -07:00
Casey Schaufler 1880eff77e Smack: onlycap limits on CAP_MAC_ADMIN
Smack is integrated with the POSIX capabilities scheme,
using the capabilities CAP_MAC_OVERRIDE and CAP_MAC_ADMIN to
determine if a process is allowed to ignore Smack checks or
change Smack related data respectively. Smack provides an
additional restriction that if an onlycap value is set
by writing to /smack/onlycap only tasks with that Smack
label are allowed to use CAP_MAC_OVERRIDE.

This change adds CAP_MAC_ADMIN as a capability that is affected
by the onlycap mechanism.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:23 -07:00
Casey Schaufler eb982cb4cf Smack: fix smack_new_inode bogosities
In January of 2012 Al Viro pointed out three bits of code that
he titled "new_inode_smack bogosities". This patch repairs these
errors.

1. smack_sb_kern_mount() included a NULL check that is impossible.
   The check and NULL case are removed.
2. smack_kb_kern_mount() included pointless locking. The locking is
   removed. Since this is the only place that lock was used the lock
   is removed from the superblock_smack structure.
3. smk_fill_super() incorrectly and unnecessarily set the Smack label
   for the smackfs root inode. The assignment has been removed.

Targeted for git://gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2012-07-13 15:49:23 -07:00
Dmitry Kasatkin 417c6c8ee2 ima: audit is compiled only when enabled
IMA auditing code was compiled even when CONFIG_AUDIT was not enabled.
This patch compiles auditing code only when possible and enabled.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-05 16:43:59 -04:00
Dmitry Kasatkin 7ff2267af5 ima: ima_initialized is set only if successful
Set ima_initialized only if initialization was successful.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-05 16:43:57 -04:00
Dmitry Kasatkin 8445d64dd7 ima: add policy for pseudo fs
Exclude DEVPTS and BINFMT filesystems from the measurement policy.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-05 16:42:33 -04:00
David S. Miller c90a9bb907 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-07-05 03:44:25 -07:00
Paul Mundt 75331a597c security: Fix nommu build.
The security + nommu configuration presently blows up with an undefined
reference to BDI_CAP_EXEC_MAP:

security/security.c: In function 'mmap_prot':
security/security.c:687:36: error: dereferencing pointer to incomplete type
security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function)
security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in

include backing-dev.h directly to fix it up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-03 21:41:03 +10:00
Dmitry Kasatkin c7de7adc18 ima: remove unused cleanup functions
IMA cannot be used as module and does not need __exit functions.
Removed them.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-02 16:43:30 -04:00
Dmitry Kasatkin 0ea4f8ae41 ima: free securityfs violations file
On ima_fs_init() error, free securityfs violations file.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2012-07-02 16:43:30 -04:00
Mimi Zohar 08e1b76ae3 ima: use full pathnames in measurement list
The IMA measurement list contains filename hints, which can be
ambigious without the full pathname.  This patch replaces the
filename hint with the full pathname, simplifying for userspace
the correlating of file hash measurements with files.

Change log v1:
- Revert to short filenames, when full pathname is longer than IMA
  measurement buffer size. (Based on Dmitry's review)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-02 16:43:29 -04:00
Paul Mundt 659b5e7652 security: Fix nommu build.
The security + nommu configuration presently blows up with an undefined
reference to BDI_CAP_EXEC_MAP:

security/security.c: In function 'mmap_prot':
security/security.c:687:36: error: dereferencing pointer to incomplete type
security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function)
security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in

include backing-dev.h directly to fix it up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-02 23:56:04 +10:00
Pablo Neira Ayuso a31f2d17b3 netlink: add netlink_kernel_cfg parameter to netlink_kernel_create
This patch adds the following structure:

struct netlink_kernel_cfg {
        unsigned int    groups;
        void            (*input)(struct sk_buff *skb);
        struct mutex    *cb_mutex;
};

That can be passed to netlink_kernel_create to set optional configurations
for netlink kernel sockets.

I've populated this structure by looking for NULL and zero parameters at the
existing code. The remaining parameters that always need to be set are still
left in the original interface.

That includes optional parameters for the netlink socket creation. This allows
easy extensibility of this interface in the future.

This patch also adapts all callers to use this new interface.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-29 16:46:02 -07:00
David S. Miller 01f534d0ae selinux: netlink: Move away from NLMSG_PUT().
And use nlmsg_data() while we're here too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-26 21:54:06 -07:00
James Morris 66dd07b88a Merge commit 'v3.5-rc2' into next 2012-06-10 22:52:10 +10:00
Alban Crequy 2597a8344c netfilter: selinux: switch hook PFs to nfproto
This patch is a cleanup. Use NFPROTO_* for consistency with other
netfilter code.

Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Reviewed-by: Vincent Sanders <vincent.sanders@collabora.co.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-06-07 14:58:43 +02:00
Linus Torvalds 1193755ac6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs changes from Al Viro.
 "A lot of misc stuff.  The obvious groups:
   * Miklos' atomic_open series; kills the damn abuse of
     ->d_revalidate() by NFS, which was the major stumbling block for
     all work in that area.
   * ripping security_file_mmap() and dealing with deadlocks in the
     area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
     general.
   * ->encode_fh() switched to saner API; insane fake dentry in
     mm/cleancache.c gone.
   * assorted annotations in fs (endianness, __user)
   * parts of Artem's ->s_dirty work (jff2 and reiserfs parts)
   * ->update_time() work from Josef.
   * other bits and pieces all over the place.

  Normally it would've been in two or three pull requests, but
  signal.git stuff had eaten a lot of time during this cycle ;-/"

Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
'truncate_range' inode method was removed by the VM changes, the VFS
update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
to sparse fix added twice, with other changes nearby).

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
  nfs: don't open in ->d_revalidate
  vfs: retry last component if opening stale dentry
  vfs: nameidata_to_filp(): don't throw away file on error
  vfs: nameidata_to_filp(): inline __dentry_open()
  vfs: do_dentry_open(): don't put filp
  vfs: split __dentry_open()
  vfs: do_last() common post lookup
  vfs: do_last(): add audit_inode before open
  vfs: do_last(): only return EISDIR for O_CREAT
  vfs: do_last(): check LOOKUP_DIRECTORY
  vfs: do_last(): make ENOENT exit RCU safe
  vfs: make follow_link check RCU safe
  vfs: do_last(): use inode variable
  vfs: do_last(): inline walk_component()
  vfs: do_last(): make exit RCU safe
  vfs: split do_lookup()
  Btrfs: move over to use ->update_time
  fs: introduce inode operation ->update_time
  reiserfs: get rid of resierfs_sync_super
  reiserfs: mark the superblock as dirty a bit later
  ...
2012-06-01 10:34:35 -07:00
Al Viro 98de59bfe4 take calculation of final prot in security_mmap_file() into a helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 10:37:17 -04:00
Al Viro 8b3ec6814c take security_mmap_file() outside of ->mmap_sem
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 10:37:01 -04:00
Linus Torvalds fb21affa49 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull second pile of signal handling patches from Al Viro:
 "This one is just task_work_add() series + remaining prereqs for it.

  There probably will be another pull request from that tree this
  cycle - at least for helpers, to get them out of the way for per-arch
  fixes remaining in the tree."

Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  keys: kill task_struct->replacement_session_keyring
  keys: kill the dummy key_replace_session_keyring()
  keys: change keyctl_session_to_parent() to use task_work_add()
  genirq: reimplement exit_irq_thread() hook via task_work_add()
  task_work_add: generic process-context callbacks
  avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
  parisc: need to check NOTIFY_RESUME when exiting from syscall
  move key_repace_session_keyring() into tracehook_notify_resume()
  TIF_NOTIFY_RESUME is defined on all targets now
2012-05-31 18:47:30 -07:00
Christopher Yeoh ac34ebb3a6 aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()
A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.

Rather than having an additional check_access parameter to these
functions, the first paramater type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY which means check that the contents of the iovec
are valid, but do not check the memory that they point to.  This is used
by process_vm_readv/writev where we need to validate that a iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:32 -07:00
Boaz Harrosh 81ab6e7b26 kmod: convert two call sites to call_usermodehelper_fns()
Both kernel/sys.c && security/keys/request_key.c where inlining the exact
same code as call_usermodehelper_fns(); So simply convert these sites to
directly use call_usermodehelper_fns().

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:28 -07:00
Andrew Morton 4f1c28d241 security/keys/keyctl.c: suppress memory allocation failure warning
This allocation may be large.  The code is probing to see if it will
succeed and if not, it falls back to vmalloc().  We should suppress any
page-allocation failure messages when the fallback happens.

Reported-by: Dave Jones <davej@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:26 -07:00
Al Viro e5467859f7 split ->file_mmap() into ->mmap_addr()/->mmap_file()
... i.e. file-dependent and address-dependent checks.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-31 13:11:54 -04:00
Al Viro d007794a18 split cap_mmap_addr() out of cap_file_mmap()
... switch callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-31 13:10:54 -04:00
Al Viro cc1dad7183 selinuxfs snprintf() misuses
a) %d does _not_ produce a page worth of output
b) snprintf() doesn't return negatives - it used to in old glibc, but
that's the kernel...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-29 23:28:33 -04:00
David Howells 423b978802 KEYS: Fix some sparse warnings
Fix some sparse warnings in the keyrings code:

 (1) compat_keyctl_instantiate_key_iov() should be static.

 (2) There were a couple of places where a pointer was being compared against
     integer 0 rather than NULL.

 (3) keyctl_instantiate_key_common() should not take a __user-labelled iovec
     pointer as the caller must have copied the iovec to kernel space.

 (4) __key_link_begin() takes and __key_link_end() releases
     keyring_serialise_link_sem under some circumstances and so this should be
     declared.

     Note that adding __acquires() and __releases() for this doesn't help cure
     the warnings messages - something only commenting out both helps.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-05-25 20:51:42 +10:00
Oleg Nesterov 413cd3d9ab keys: change keyctl_session_to_parent() to use task_work_add()
Change keyctl_session_to_parent() to use task_work_add() and move
key_replace_session_keyring() logic into task_work->func().

Note that we do task_work_cancel() before task_work_add() to ensure that
only one work can be pending at any time.  This is important, we must not
allow user-space to abuse the parent's ->task_works list.

The callback, replace_session_keyring(), checks PF_EXITING.  I guess this
is not really needed but looks better.

As a side effect, this fixes the (unlikely) race.  The callers of
key_replace_session_keyring() and keyctl_session_to_parent() lack the
necessary barriers, the parent can miss the request.

Now we can remove task_struct->replacement_session_keyring and related
code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Smith <dsmith@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-23 22:11:23 -04:00
Al Viro 1227dd773d TIF_NOTIFY_RESUME is defined on all targets now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-23 22:09:19 -04:00
Linus Torvalds 644473e9c6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace enhancements from Eric Biederman:
 "This is a course correction for the user namespace, so that we can
  reach an inexpensive, maintainable, and reasonably complete
  implementation.

  Highlights:
   - Config guards make it impossible to enable the user namespace and
     code that has not been converted to be user namespace safe.

   - Use of the new kuid_t type ensures the if you somehow get past the
     config guards the kernel will encounter type errors if you enable
     user namespaces and attempt to compile in code whose permission
     checks have not been updated to be user namespace safe.

   - All uids from child user namespaces are mapped into the initial
     user namespace before they are processed.  Removing the need to add
     an additional check to see if the user namespace of the compared
     uids remains the same.

   - With the user namespaces compiled out the performance is as good or
     better than it is today.

   - For most operations absolutely nothing changes performance or
     operationally with the user namespace enabled.

   - The worst case performance I could come up with was timing 1
     billion cache cold stat operations with the user namespace code
     enabled.  This went from 156s to 164s on my laptop (or 156ns to
     164ns per stat operation).

   - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
     Most uid/gid setting system calls treat these value specially
     anyway so attempting to use -1 as a uid would likely cause
     entertaining failures in userspace.

   - If setuid is called with a uid that can not be mapped setuid fails.
     I have looked at sendmail, login, ssh and every other program I
     could think of that would call setuid and they all check for and
     handle the case where setuid fails.

   - If stat or a similar system call is called from a context in which
     we can not map a uid we lie and return overflowuid.  The LFS
     experience suggests not lying and returning an error code might be
     better, but the historical precedent with uids is different and I
     can not think of anything that would break by lying about a uid we
     can't map.

   - Capabilities are localized to the current user namespace making it
     safe to give the initial user in a user namespace all capabilities.

  My git tree covers all of the modifications needed to convert the core
  kernel and enough changes to make a system bootable to runlevel 1."

Fix up trivial conflicts due to nearby independent changes in fs/stat.c

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
  userns:  Silence silly gcc warning.
  cred: use correct cred accessor with regards to rcu read lock
  userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
  userns: Convert cgroup permission checks to use uid_eq
  userns: Convert tmpfs to use kuid and kgid where appropriate
  userns: Convert sysfs to use kgid/kuid where appropriate
  userns: Convert sysctl permission checks to use kuid and kgids.
  userns: Convert proc to use kuid/kgid where appropriate
  userns: Convert ext4 to user kuid/kgid where appropriate
  userns: Convert ext3 to use kuid/kgid where appropriate
  userns: Convert ext2 to use kuid/kgid where appropriate.
  userns: Convert devpts to use kuid/kgid where appropriate
  userns: Convert binary formats to use kuid/kgid where appropriate
  userns: Add negative depends on entries to avoid building code that is userns unsafe
  userns: signal remove unnecessary map_cred_ns
  userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
  userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
  userns: Convert stat to return values mapped from kuids and kgids
  userns: Convert user specfied uids and gids in chown into kuids and kgid
  userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
  ...
2012-05-23 17:42:39 -07:00