Commit Graph

2978 Commits

Author SHA1 Message Date
Linus Torvalds
7a1e8b80fb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - TPM core and driver updates/fixes
   - IPv6 security labeling (CALIPSO)
   - Lots of Apparmor fixes
   - Seccomp: remove 2-phase API, close hole where ptrace can change
     syscall #"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
  apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
  tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
  tpm: Factor out common startup code
  tpm: use devm_add_action_or_reset
  tpm2_i2c_nuvoton: add irq validity check
  tpm: read burstcount from TPM_STS in one 32-bit transaction
  tpm: fix byte-order for the value read by tpm2_get_tpm_pt
  tpm_tis_core: convert max timeouts from msec to jiffies
  apparmor: fix arg_size computation for when setprocattr is null terminated
  apparmor: fix oops, validate buffer size in apparmor_setprocattr()
  apparmor: do not expose kernel stack
  apparmor: fix module parameters can be changed after policy is locked
  apparmor: fix oops in profile_unpack() when policy_db is not present
  apparmor: don't check for vmalloc_addr if kvzalloc() failed
  apparmor: add missing id bounds check on dfa verification
  apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
  apparmor: use list_next_entry instead of list_entry_next
  apparmor: fix refcount race when finding a child profile
  apparmor: fix ref count leak when profile sha1 hash is read
  apparmor: check that xindex is in trans_table bounds
  ...
2016-07-29 17:38:46 -07:00
Linus Torvalds
a867d7349e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns vfs updates from Eric Biederman:
 "This tree contains some very long awaited work on generalizing the
  user namespace support for mounting filesystems to include filesystems
  with a backing store.  The real world target is fuse but the goal is
  to update the vfs to allow any filesystem to be supported.  This
  patchset is based on a lot of code review and testing to approach that
  goal.

  While looking at what is needed to support the fuse filesystem it
  became clear that there were things like xattrs for security modules
  that needed special treatment.  That the resolution of those concerns
  would not be fuse specific.  That sorting out these general issues
  made most sense at the generic level, where the right people could be
  drawn into the conversation, and the issues could be solved for
  everyone.

  At a high level what this patchset does a couple of simple things:

   - Add a user namespace owner (s_user_ns) to struct super_block.

   - Teach the vfs to handle filesystem uids and gids not mapping into
     to kuids and kgids and being reported as INVALID_UID and
     INVALID_GID in vfs data structures.

  By assigning a user namespace owner filesystems that are mounted with
  only user namespace privilege can be detected.  This allows security
  modules and the like to know which mounts may not be trusted.  This
  also allows the set of uids and gids that are communicated to the
  filesystem to be capped at the set of kuids and kgids that are in the
  owning user namespace of the filesystem.

  One of the crazier corner casees this handles is the case of inodes
  whose i_uid or i_gid are not mapped into the vfs.  Most of the code
  simply doesn't care but it is easy to confuse the inode writeback path
  so no operation that could cause an inode write-back is permitted for
  such inodes (aka only reads are allowed).

  This set of changes starts out by cleaning up the code paths involved
  in user namespace permirted mounts.  Then when things are clean enough
  adds code that cleanly sets s_user_ns.  Then additional restrictions
  are added that are possible now that the filesystem superblock
  contains owner information.

  These changes should not affect anyone in practice, but there are some
  parts of these restrictions that are changes in behavior.

   - Andy's restriction on suid executables that does not honor the
     suid bit when the path is from another mount namespace (think
     /proc/[pid]/fd/) or when the filesystem was mounted by a less
     privileged user.

   - The replacement of the user namespace implicit setting of MNT_NODEV
     with implicitly setting SB_I_NODEV on the filesystem superblock
     instead.

     Using SB_I_NODEV is a stronger form that happens to make this state
     user invisible.  The user visibility can be managed but it caused
     problems when it was introduced from applications reasonably
     expecting mount flags to be what they were set to.

  There is a little bit of work remaining before it is safe to support
  mounting filesystems with backing store in user namespaces, beyond
  what is in this set of changes.

   - Verifying the mounter has permission to read/write the block device
     during mount.

   - Teaching the integrity modules IMA and EVM to handle filesystems
     mounted with only user namespace root and to reduce trust in their
     security xattrs accordingly.

   - Capturing the mounters credentials and using that for permission
     checks in d_automount and the like.  (Given that overlayfs already
     does this, and we need the work in d_automount it make sense to
     generalize this case).

  Furthermore there are a few changes that are on the wishlist:

   - Get all filesystems supporting posix acls using the generic posix
     acls so that posix_acl_fix_xattr_from_user and
     posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

   - Reducing the permission checks in places such as remount to allow
     the superblock owner to perform them.

   - Allowing the superblock owner to chown files with unmapped uids and
     gids to something that is mapped so the files may be treated
     normally.

  I am not considering even obvious relaxations of permission checks
  until it is clear there are no more corner cases that need to be
  locked down and handled generically.

  Many thanks to Seth Forshee who kept this code alive, and putting up
  with me rewriting substantial portions of what he did to handle more
  corner cases, and for his diligent testing and reviewing of my
  changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...
2016-07-29 15:54:19 -07:00
Linus Torvalds
6784725ab0 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  Probably the most interesting part long-term is ->d_init() - that will
  have a bunch of followups in (at least) ceph and lustre, but we'll
  need to sort the barrier-related rules before it can get used for
  really non-trivial stuff.

  Another fun thing is the merge of ->d_iput() callers (dentry_iput()
  and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
  except the one in __d_lookup_lru())"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
  fs/dcache.c: avoid soft-lockup in dput()
  vfs: new d_init method
  vfs: Update lookup_dcache() comment
  bdev: get rid of ->bd_inodes
  Remove last traces of ->sync_page
  new helper: d_same_name()
  dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
  vfs: clean up documentation
  vfs: document ->d_real()
  vfs: merge .d_select_inode() into .d_real()
  unify dentry_iput() and dentry_unlink_inode()
  binfmt_misc: ->s_root is not going anywhere
  drop redundant ->owner initializations
  ufs: get rid of redundant checks
  orangefs: constify inode_operations
  missed comment updates from ->direct_IO() prototype change
  file_inode(f)->i_mapping is f->f_mapping
  trim fsnotify hooks a bit
  9p: new helper - v9fs_parent_fid()
  debugfs: ->d_parent is never NULL or negative
  ...
2016-07-28 12:59:05 -07:00
Linus Torvalds
554828ee0d Merge branch 'salted-string-hash'
This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.

That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.

It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.

Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.

* merge branch 'salted-string-hash':
  fs/dcache.c: Save one 32-bit multiply in dcache lookup
  vfs: make the string hashes salt the hash
2016-07-28 12:26:31 -07:00
Arnd Bergmann
7616ac70d1 apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
The newly added Kconfig option could never work and just causes a build error
when disabled:

security/apparmor/lsm.c:675:25: error: 'CONFIG_SECURITY_APPARMOR_HASH_DEFAULT' undeclared here (not in a function)
 bool aa_g_hash_policy = CONFIG_SECURITY_APPARMOR_HASH_DEFAULT;

The problem is that the macro undefined in this case, and we need to use the IS_ENABLED()
helper to turn it into a boolean constant.

Another minor problem with the original patch is that the option is even offered
in sysfs when SECURITY_APPARMOR_HASH is not enabled, so this also hides the option
in that case.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6059f71f1e ("apparmor: add parameter to control whether policy hashing is used")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-27 17:39:26 +10:00
Kees Cook
f5509cc18d mm: Hardened usercopy
This is the start of porting PAX_USERCOPY into the mainline kernel. This
is the first set of features, controlled by CONFIG_HARDENED_USERCOPY. The
work is based on code by PaX Team and Brad Spengler, and an earlier port
from Casey Schaufler. Additional non-slab page tests are from Rik van Riel.

This patch contains the logic for validating several conditions when
performing copy_to_user() and copy_from_user() on the kernel object
being copied to/from:
- address range doesn't wrap around
- address range isn't NULL or zero-allocated (with a non-zero copy size)
- if on the slab allocator:
  - object size must be less than or equal to copy size (when check is
    implemented in the allocator, which appear in subsequent patches)
- otherwise, object must not span page allocations (excepting Reserved
  and CMA ranges)
- if on the stack
  - object must not extend before/after the current process stack
  - object must be contained by a valid stack frame (when there is
    arch/build support for identifying stack frames)
- object must not overlap with kernel text

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-26 14:41:47 -07:00
Linus Torvalds
bbce2ad2d7 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.8:

  API:
   - first part of skcipher low-level conversions
   - add KPP (Key-agreement Protocol Primitives) interface.

  Algorithms:
   - fix IPsec/cryptd reordering issues that affects aesni
   - RSA no longer does explicit leading zero removal
   - add SHA3
   - add DH
   - add ECDH
   - improve DRBG performance by not doing CTR by hand

  Drivers:
   - add x86 AVX2 multibuffer SHA256/512
   - add POWER8 optimised crc32c
   - add xts support to vmx
   - add DH support to qat
   - add RSA support to caam
   - add Layerscape support to caam
   - add SEC1 AEAD support to talitos
   - improve performance by chaining requests in marvell/cesa
   - add support for Araneus Alea I USB RNG
   - add support for Broadcom BCM5301 RNG
   - add support for Amlogic Meson RNG
   - add support Broadcom NSP SoC RNG"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
  crypto: vmx - Fix aes_p8_xts_decrypt build failure
  crypto: vmx - Ignore generated files
  crypto: vmx - Adding support for XTS
  crypto: vmx - Adding asm subroutines for XTS
  crypto: skcipher - add comment for skcipher_alg->base
  crypto: testmgr - Print akcipher algorithm name
  crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
  crypto: nx - off by one bug in nx_of_update_msc()
  crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
  crypto: scatterwalk - Inline start/map/done
  crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
  crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
  crypto: scatterwalk - Fix test in scatterwalk_done
  crypto: api - Optimise away crypto_yield when hard preemption is on
  crypto: scatterwalk - add no-copy support to copychunks
  crypto: scatterwalk - Remove scatterwalk_bytes_sglen
  crypto: omap - Stop using crypto scatterwalk_bytes_sglen
  crypto: skcipher - Remove top-level givcipher interface
  crypto: user - Remove crypto_lookup_skcipher call
  crypto: cts - Convert to skcipher
  ...
2016-07-26 13:40:17 -07:00
Al Viro
4f3ccd7657 qstr: constify dentry_init_security
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-20 23:30:06 -04:00
John Johansen
d4d03f74a7 apparmor: fix arg_size computation for when setprocattr is null terminated
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Vegard Nossum
e89b808132 apparmor: fix oops, validate buffer size in apparmor_setprocattr()
When proc_pid_attr_write() was changed to use memdup_user apparmor's
(interface violating) assumption that the setprocattr buffer was always
a single page was violated.

The size test is not strictly speaking needed as proc_pid_attr_write()
will reject anything larger, but for the sake of robustness we can keep
it in.

SMACK and SELinux look safe to me, but somebody else should probably
have a look just in case.

Based on original patch from Vegard Nossum <vegard.nossum@oracle.com>
modified for the case that apparmor provides null termination.

Fixes: bb646cdb12
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-12 08:43:10 -07:00
Heinrich Schuchardt
f4ee2def2d apparmor: do not expose kernel stack
Do not copy uninitalized fields th.td_hilen, th.td_data.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
58acf9d911 apparmor: fix module parameters can be changed after policy is locked
the policy_lock parameter is a one way switch that prevents policy
from being further modified. Unfortunately some of the module parameters
can effectively modify policy by turning off enforcement.

split policy_admin_capable into a view check and a full admin check,
and update the admin check to test the policy_lock parameter.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
5f20fdfed1 apparmor: fix oops in profile_unpack() when policy_db is not present
BugLink: http://bugs.launchpad.net/bugs/1592547

If unpack_dfa() returns NULL due to the dfa not being present,
profile_unpack() is not checking if the dfa is not present (NULL).

Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
3197f5adf5 apparmor: don't check for vmalloc_addr if kvzalloc() failed
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
15756178c6 apparmor: add missing id bounds check on dfa verification
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Jeff Mahoney
ff118479a7 apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
While using AppArmor, SYS_CAP_RESOURCE is insufficient to call prlimit
on another task. The only other example of a AppArmor mediating access to
another, already running, task (ignoring fork+exec) is ptrace.

The AppArmor model for ptrace is that one of the following must be true:
1) The tracer is unconfined
2) The tracer is in complain mode
3) The tracer and tracee are confined by the same profile
4) The tracer is confined but has SYS_CAP_PTRACE

1), 2, and 3) are already true for setrlimit.

We can match the ptrace model just by allowing CAP_SYS_RESOURCE.

We still test the values of the rlimit since it can always be overridden
using a value that means unlimited for a particular resource.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
Geliang Tang
38dbd7d8be apparmor: use list_next_entry instead of list_entry_next
list_next_entry has been defined in list.h, so I replace list_entry_next
with it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
de7c4cc947 apparmor: fix refcount race when finding a child profile
When finding a child profile via an rcu critical section, the profile
may be put and scheduled for deletion after the child is found but
before its refcount is incremented.

Protect against this by repeating the lookup if the profiles refcount
is 0 and is one its way to deletion.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
0b938a2e2c apparmor: fix ref count leak when profile sha1 hash is read
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
23ca7b640b apparmor: check that xindex is in trans_table bounds
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f7da2de011 apparmor: ensure the target profile name is always audited
The target profile name was not being correctly audited in a few
cases because the target variable was not being set and gotos
passed the code to set it at apply:

Since it is always based on new_profile just drop the target var
and conditionally report based on new_profile.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
7ee6da25dc apparmor: fix audit full profile hname on successful load
Currently logging of a successful profile load only logs the basename
of the profile. This can result in confusion when a child profile has
the same name as the another profile in the set. Logging the hname
will ensure there is no confusion.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
bf15cf0c64 apparmor: fix log failures for all profiles in a set
currently only the profile that is causing the failure is logged. This
makes it more confusing than necessary about which profiles loaded
and which didn't. So make sure to log success and failure messages for
all profiles in the set being loaded.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f351841f8d apparmor: fix put() parent ref after updating the active ref
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
6059f71f1e apparmor: add parameter to control whether policy hashing is used
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
bd35db8b8c apparmor: internal paths should be treated as disconnected
Internal mounts are not mounted anywhere and as such should be treated
as disconnected paths.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
f2e561d190 apparmor: fix disconnected bind mnts reconnection
Bind mounts can fail to be properly reconnected when PATH_CONNECT is
specified. Ensure that when PATH_CONNECT is specified the path has
a root.

BugLink: http://bugs.launchpad.net/bugs/1319984

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
d671e89020 apparmor: fix update the mtime of the profile file on replacement
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
9049a79221 apparmor: exec should not be returning ENOENT when it denies
The current behavior is confusing as it causes exec failures to report
the executable is missing instead of identifying that apparmor
caused the failure.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
b6b1b81b3a apparmor: fix uninitialized lsm_audit member
BugLink: http://bugs.launchpad.net/bugs/1268727

The task field in the lsm_audit struct needs to be initialized if
a change_hat fails, otherwise the following oops will occur

BUG: unable to handle kernel paging request at 0000002fbead7d08
IP: [<ffffffff8171153e>] _raw_spin_lock+0xe/0x50
PGD 1e3f35067 PUD 0
Oops: 0002 [#1] SMP
Modules linked in: pppox crc_ccitt p8023 p8022 psnap llc ax25 btrfs raid6_pq xor xfs libcrc32c dm_multipath scsi_dh kvm_amd dcdbas kvm microcode amd64_edac_mod joydev edac_core psmouse edac_mce_amd serio_raw k10temp sp5100_tco i2c_piix4 ipmi_si ipmi_msghandler acpi_power_meter mac_hid lp parport hid_generic usbhid hid pata_acpi mpt2sas ahci raid_class pata_atiixp bnx2 libahci scsi_transport_sas [last unloaded: tipc]
CPU: 2 PID: 699 Comm: changehat_twice Tainted: GF          O 3.13.0-7-generic #25-Ubuntu
Hardware name: Dell Inc. PowerEdge R415/08WNM9, BIOS 1.8.6 12/06/2011
task: ffff8802135c6000 ti: ffff880212986000 task.ti: ffff880212986000
RIP: 0010:[<ffffffff8171153e>]  [<ffffffff8171153e>] _raw_spin_lock+0xe/0x50
RSP: 0018:ffff880212987b68  EFLAGS: 00010006
RAX: 0000000000020000 RBX: 0000002fbead7500 RCX: 0000000000000000
RDX: 0000000000000292 RSI: ffff880212987ba8 RDI: 0000002fbead7d08
RBP: ffff880212987b68 R08: 0000000000000246 R09: ffff880216e572a0
R10: ffffffff815fd677 R11: ffffea0008469580 R12: ffffffff8130966f
R13: ffff880212987ba8 R14: 0000002fbead7d08 R15: ffff8800d8c6b830
FS:  00002b5e6c84e7c0(0000) GS:ffff880216e40000(0000) knlGS:0000000055731700
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000002fbead7d08 CR3: 000000021270f000 CR4: 00000000000006e0
Stack:
 ffff880212987b98 ffffffff81075f17 ffffffff8130966f 0000000000000009
 0000000000000000 0000000000000000 ffff880212987bd0 ffffffff81075f7c
 0000000000000292 ffff880212987c08 ffff8800d8c6b800 0000000000000026
Call Trace:
 [<ffffffff81075f17>] __lock_task_sighand+0x47/0x80
 [<ffffffff8130966f>] ? apparmor_cred_prepare+0x2f/0x50
 [<ffffffff81075f7c>] do_send_sig_info+0x2c/0x80
 [<ffffffff81075fee>] send_sig_info+0x1e/0x30
 [<ffffffff8130242d>] aa_audit+0x13d/0x190
 [<ffffffff8130c1dc>] aa_audit_file+0xbc/0x130
 [<ffffffff8130966f>] ? apparmor_cred_prepare+0x2f/0x50
 [<ffffffff81304cc2>] aa_change_hat+0x202/0x530
 [<ffffffff81308fc6>] aa_setprocattr_changehat+0x116/0x1d0
 [<ffffffff8130a11d>] apparmor_setprocattr+0x25d/0x300
 [<ffffffff812cee56>] security_setprocattr+0x16/0x20
 [<ffffffff8121fc87>] proc_pid_attr_write+0x107/0x130
 [<ffffffff811b7604>] vfs_write+0xb4/0x1f0
 [<ffffffff811b8039>] SyS_write+0x49/0xa0
 [<ffffffff8171a1bf>] tracesys+0xe1/0xe6

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
ec34fa24a9 apparmor: fix replacement bug that adds new child to old parent
When set atomic replacement is used and the parent is updated before the
child, and the child did not exist in the old parent so there is no
direct replacement then the new child is incorrectly added to the old
parent. This results in the new parent not having the child(ren) that
it should and the old parent when being destroyed asserting the
following error.

AppArmor: policy_destroy: internal error, policy '<profile/name>' still
contains profiles

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
dcda617a0c apparmor: fix refcount bug in profile replacement
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
James Morris
e1e5fa9616 Merge tag 'keys-misc-20160708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2016-07-09 12:49:00 +10:00
James Morris
c632809953 Merge branch 'smack-for-4.8' of https://github.com/cschaufler/smack-next into next 2016-07-08 10:41:15 +10:00
Vegard Nossum
30a46a4647 apparmor: fix oops, validate buffer size in apparmor_setprocattr()
When proc_pid_attr_write() was changed to use memdup_user apparmor's
(interface violating) assumption that the setprocattr buffer was always
a single page was violated.

The size test is not strictly speaking needed as proc_pid_attr_write()
will reject anything larger, but for the sake of robustness we can keep
it in.

SMACK and SELinux look safe to me, but somebody else should probably
have a look just in case.

Based on original patch from Vegard Nossum <vegard.nossum@oracle.com>
modified for the case that apparmor provides null termination.

Fixes: bb646cdb12
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-07-08 10:26:25 +10:00
James Morris
d011a4d861 Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/selinux into next 2016-07-07 10:15:34 +10:00
Seth Forshee
0b3c9761d1 evm: Translate user/group ids relative to s_user_ns when computing HMAC
The EVM HMAC should be calculated using the on disk user and
group ids, so the k[ug]ids in the inode must be translated
relative to the s_user_ns of the inode's super block.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-07-05 15:13:10 -05:00
Al Viro
b223f4e215 Merge branch 'd_real' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into work.misc 2016-06-30 23:34:49 -04:00
Eric Richter
544e1cea03 ima: extend the measurement entry specific pcr
Extend the PCR supplied as a parameter, instead of assuming that the
measurement entry uses the default configured PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:22 -04:00
Eric Richter
a422638d49 ima: change integrity cache to store measured pcr
IMA avoids re-measuring files by storing the current state as a flag in
the integrity cache. It will then skip adding a new measurement log entry
if the cache reports the file as already measured.

If a policy measures an already measured file to a new PCR, the measurement
will not be added to the list. This patch implements a new bitfield for
specifying which PCR the file was measured into, rather than if it was
measured.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:22 -04:00
Eric Richter
67696f6d79 ima: redefine duplicate template entries
Template entry duplicates are prevented from being added to the
measurement list by checking a hash table that contains the template
entry digests. However, the PCR value is not included in this comparison,
so duplicate template entry digests with differing PCRs may be dropped.

This patch redefines duplicate template entries as template entries with
the same digest and same PCR values.

Reported-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
5f6f027b50 ima: change ima_measurements_show() to display the entry specific pcr
IMA assumes that the same default Kconfig PCR is extended for each
entry. This patch replaces the default configured PCR with the policy
defined PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
14b1da85bb ima: include pcr for each measurement log entry
The IMA measurement list entries include the Kconfig defined PCR value.
This patch defines a new ima_template_entry field for including the PCR
as specified in the policy rule.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:21 -04:00
Eric Richter
725de7fabb ima: extend ima_get_action() to return the policy pcr
Different policy rules may extend different PCRs. This patch retrieves
the specific PCR for the matched rule.  Subsequent patches will include
the rule specific PCR in the measurement list and extend the appropriate
PCR.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:20 -04:00
Eric Richter
0260643ce8 ima: add policy support for extending different pcrs
This patch defines a new IMA measurement policy rule option "pcr=",
which allows extending different PCRs on a per rule basis. For example,
the system independent files could extend the default IMA Kconfig
specified PCR, while the system dependent files could extend a different
PCR.

The following is an example of this usage with an SELinux policy; the
rule would extend PCR 11 with system configuration files:

  measure func=FILE_CHECK mask=MAY_READ obj_type=system_conf_t pcr=11

Changelog v3:
- FIELD_SIZEOF returns bytes, not bits. Fixed INVALID_PCR

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:20 -04:00
Eric Richter
96d450bbec integrity: add measured_pcrs field to integrity cache
To keep track of which measurements have been extended to which PCRs, this
patch defines a new integrity_iint_cache field named measured_pcrs. This
field is a bitmask of the PCRs measured. Each bit corresponds to a PCR
index. For example, bit 10 corresponds to PCR 10.

Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-06-30 01:14:19 -04:00
Huw Davies
4fee5242bf calipso: Add a label cache.
This works in exactly the same way as the CIPSO label cache.
The idea is to allow the lsm to cache the result of a secattr
lookup so that it doesn't need to perform the lookup for
every skbuff.

It introduces two sysctl controls:
 calipso_cache_enable - enables/disables the cache.
 calipso_cache_bucket_size - sets the size of a cache bucket.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:17 -04:00
Huw Davies
a04e71f631 netlabel: Pass a family parameter to netlbl_skbuff_err().
This makes it possible to route the error to the appropriate
labelling engine.  CALIPSO is far less verbose than CIPSO
when encountering a bogus packet, so there is no need for a
CALIPSO error handler.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:16 -04:00
Huw Davies
2917f57b6b calipso: Allow the lsm to label the skbuff directly.
In some cases, the lsm needs to add the label to the skbuff directly.
A NF_INET_LOCAL_OUT IPv6 hook is added to selinux to match the IPv4
behaviour.  This allows selinux to label the skbuffs that it requires.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:06:15 -04:00
Huw Davies
e1adea9270 calipso: Allow request sockets to be relabelled by the lsm.
Request sockets need to have a label that takes into account the
incoming connection as well as their parent's label.  This is used
for the outgoing SYN-ACK and for their child full-socket.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:05:29 -04:00
Huw Davies
1f440c99d3 netlabel: Prevent setsockopt() from changing the hop-by-hop option.
If a socket has a netlabel in place then don't let setsockopt() alter
the socket's IPv6 hop-by-hop option.  This is in the same spirit as
the existing check for IPv4.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:05:27 -04:00
Huw Davies
ceba1832b1 calipso: Set the calipso socket label to match the secattr.
CALIPSO is a hop-by-hop IPv6 option.  A lot of this patch is based on
the equivalent CISPO code.  The main difference is due to manipulating
the options in the hop-by-hop header.

Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-27 15:02:51 -04:00
Seth Forshee
aad82892af selinux: Add support for unprivileged mounts from user namespaces
Security labels from unprivileged mounts in user namespaces must
be ignored. Force superblocks from user namespaces whose labeling
behavior is to use xattrs to use mountpoint labeling instead.
For the mountpoint label, default to converting the current task
context into a form suitable for file objects, but also allow the
policy writer to specify a different label through policy
transition rules.

Pieced together from code snippets provided by Stephen Smalley.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 11:02:54 -05:00
Seth Forshee
809c02e091 Smack: Handle labels consistently in untrusted mounts
The SMACK64, SMACK64EXEC, and SMACK64MMAP labels are all handled
differently in untrusted mounts. This is confusing and
potentically problematic. Change this to handle them all the same
way that SMACK64 is currently handled; that is, read the label
from disk and check it at use time. For SMACK64 and SMACK64MMAP
access is denied if the label does not match smk_root. To be
consistent with suid, a SMACK64EXEC label which does not match
smk_root will still allow execution of the file but will not run
with the label supplied in the xattr.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 11:02:22 -05:00
Seth Forshee
9f50eda2a9 Smack: Add support for unprivileged mounts from user namespaces
Security labels from unprivileged mounts cannot be trusted.
Ideally for these mounts we would assign the objects in the
filesystem the same label as the inode for the backing device
passed to mount. Unfortunately it's currently impossible to
determine which inode this is from the LSM mount hooks, so we
settle for the label of the process doing the mount.

This label is assigned to s_root, and also to smk_default to
ensure that new inodes receive this label. The transmute property
is also set on s_root to make this behavior more explicit, even
though it is technically not necessary.

If a filesystem has existing security labels, access to inodes is
permitted if the label is the same as smk_root, otherwise access
is denied. The SMACK64EXEC xattr is completely ignored.

Explicit setting of security labels continues to require
CAP_MAC_ADMIN in init_user_ns.

Altogether, this ensures that filesystem objects are not
accessible to subjects which cannot already access the backing
store, that MAC is not violated for any objects in the fileystem
which are already labeled, and that a user cannot use an
unprivileged mount to gain elevated MAC privileges.

sysfs, tmpfs, and ramfs are already mountable from user
namespaces and support security labels. We can't rule out the
possibility that these filesystems may already be used in mounts
from user namespaces with security lables set from the init
namespace, so failing to trust lables in these filesystems may
introduce regressions. It is safe to trust labels from these
filesystems, since the unprivileged user does not control the
backing store and thus cannot supply security labels, so an
explicit exception is made to trust labels from these
filesystems.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:42 -05:00
Andy Lutomirski
380cf5ba6b fs: Treat foreign mounts as nosuid
If a process gets access to a mount from a different user
namespace, that process should not be able to take advantage of
setuid files or selinux entrypoints from that filesystem.  Prevent
this by treating mounts from other mount namespaces and those not
owned by current_user_ns() or an ancestor as nosuid.

This will make it safer to allow more complex filesystems to be
mounted in non-root user namespaces.

This does not remove the need for MNT_LOCK_NOSUID.  The setuid,
setgid, and file capability bits can no longer be abused if code in
a user namespace were to clear nosuid on an untrusted filesystem,
but this patch, by itself, is insufficient to protect the system
from abuse of files that, when execed, would increase MAC privilege.

As a more concrete explanation, any task that can manipulate a
vfsmount associated with a given user namespace already has
capabilities in that namespace and all of its descendents.  If they
can cause a malicious setuid, setgid, or file-caps executable to
appear in that mount, then that executable will only allow them to
elevate privileges in exactly the set of namespaces in which they
are already privileges.

On the other hand, if they can cause a malicious executable to
appear with a dangerous MAC label, running it could change the
caller's security context in a way that should not have been
possible, even inside the namespace in which the task is confined.

As a hardening measure, this would have made CVE-2014-5207 much
more difficult to exploit.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:41 -05:00
Seth Forshee
d07b846f62 fs: Limit file caps to the user namespace of the super block
Capability sets attached to files must be ignored except in the
user namespaces where the mounter is privileged, i.e. s_user_ns
and its descendants. Otherwise a vector exists for gaining
privileges in namespaces where a user is not already privileged.

Add a new helper function, current_in_user_ns(), to test whether a user
namespace is the same as or a descendant of another namespace.
Use this helper to determine whether a file's capability set
should be applied to the caps constructed during exec.

--EWB Replaced in_userns with the simpler current_in_userns.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-06-24 10:40:31 -05:00
Herbert Xu
d56d72c6a0 KEYS: Use skcipher for big keys
This patch replaces use of the obsolete blkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David Howells <dhowells@redhat.com>
2016-06-24 21:24:58 +08:00
Dan Carpenter
38327424b4 KEYS: potential uninitialized variable
If __key_link_begin() failed then "edit" would be uninitialized.  I've
added a check to fix that.

This allows a random user to crash the kernel, though it's quite
difficult to achieve.  There are three ways it can be done as the user
would have to cause an error to occur in __key_link():

 (1) Cause the kernel to run out of memory.  In practice, this is difficult
     to achieve without ENOMEM cropping up elsewhere and aborting the
     attempt.

 (2) Revoke the destination keyring between the keyring ID being looked up
     and it being tested for revocation.  In practice, this is difficult to
     time correctly because the KEYCTL_REJECT function can only be used
     from the request-key upcall process.  Further, users can only make use
     of what's in /sbin/request-key.conf, though this does including a
     rejection debugging test - which means that the destination keyring
     has to be the caller's session keyring in practice.

 (3) Have just enough key quota available to create a key, a new session
     keyring for the upcall and a link in the session keyring, but not then
     sufficient quota to create a link in the nominated destination keyring
     so that it fails with EDQUOT.

The bug can be triggered using option (3) above using something like the
following:

	echo 80 >/proc/sys/kernel/keys/root_maxbytes
	keyctl request2 user debug:fred negate @t

The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system.  Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.

Assuming the failure occurs, something like the following will be seen:

	kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h
	------------[ cut here ]------------
	kernel BUG at ../mm/slab.c:2821!
	...
	RIP: 0010:[<ffffffff811600f9>] kfree_debugcheck+0x20/0x25
	RSP: 0018:ffff8804014a7de8  EFLAGS: 00010092
	RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
	RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
	RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000
	R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202
	R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001
	...
	Call Trace:
	  kfree+0xde/0x1bc
	  assoc_array_cancel_edit+0x1f/0x36
	  __key_link_end+0x55/0x63
	  key_reject_and_link+0x124/0x155
	  keyctl_reject_key+0xb6/0xe0
	  keyctl_negate_key+0x10/0x12
	  SyS_keyctl+0x9f/0xe7
	  do_syscall_64+0x63/0x13a
	  entry_SYSCALL64_slow_path+0x25/0x25

Fixes: f70e2e0619 ('KEYS: Do preallocation for __key_link()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-16 17:15:04 -10:00
Heinrich Schuchardt
309c5fad5d selinux: fix type mismatch
avc_cache_threshold is of type unsigned int.  Do not use a signed
new_value in sscanf(page, "%u", &new_value).

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
[PM: subject prefix fix, description cleanup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-15 16:20:28 -04:00
David Howells
965475acca KEYS: Strip trailing spaces
Strip some trailing spaces.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-06-14 10:29:44 +01:00
Linus Torvalds
8387ff2577 vfs: make the string hashes salt the hash
We always mixed in the parent pointer into the dentry name hash, but we
did it late at lookup time.  It turns out that we can simplify that
lookup-time action by salting the hash with the parent pointer early
instead of late.

A few other users of our string hashes also wanted to mix in their own
pointers into the hash, and those are updated to use the same mechanism.

Hash users that don't have any particular initial salt can just use the
NULL pointer as a no-salt.

Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-10 20:21:46 -07:00
Paul Moore
8bebe88c09 selinux: import NetLabel category bitmaps correctly
The existing ebitmap_netlbl_import() code didn't correctly handle the
case where the ebitmap_node was not aligned/sized to a power of two,
this patch fixes this (on x86_64 ebitmap_node contains six bitmaps
making a range of 0..383).

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-06-09 10:40:37 -04:00
Rafal Krypa
18d872f77c Smack: ignore null signal in smack_task_kill
Kill with signal number 0 is commonly used for checking PID existence.
Smack treated such cases like any other kills, although no signal is
actually delivered when sig == 0.

Checking permissions when sig == 0 didn't prevent an unprivileged caller
from learning whether PID exists or not. When it existed, kernel returned
EPERM, when it didn't - ESRCH. The only effect of policy check in such
case is noise in audit logs.

This change lets Smack silently ignore kill() invocations with sig == 0.

Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2016-06-08 13:52:31 -07:00
Mike Danese
40d273782f security: tomoyo: simplify the gc kthread creation
The code is doing the equivalent of the kthread_run macro.

Signed-off-by: Mike Danese <mikedanese@google.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-06-06 20:23:55 +10:00
Casey Schaufler
2885c1e3e0 LSM: Fix for security_inode_getsecurity and -EOPNOTSUPP
Serge Hallyn pointed out that the current implementation of
security_inode_getsecurity() works if there is only one hook
provided for it, but will fail if there is more than one and
the attribute requested isn't supplied by the first module.
This isn't a problem today, since only SELinux and Smack
provide this hook and there is (currently) no way to enable
both of those modules at the same time. Serge, however, wants
to introduce a capability attribute and an inode_getsecurity
hook in the capability security module to handle it. This
addresses that upcoming problem, will be required for "extreme
stacking" and is just a better implementation.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-06-06 19:59:18 +10:00
Stephan Mueller
4693fc734d KEYS: Add placeholder for KDF usage with DH
The values computed during Diffie-Hellman key exchange are often used
in combination with key derivation functions to create cryptographic
keys.  Add a placeholder for a later implementation to configure a
key derivation function that will transform the Diffie-Hellman
result returned by the KEYCTL_DH_COMPUTE command.

[This patch was stripped down from a patch produced by Mat Martineau that
 had a bug in the compat code - so for the moment Stephan's patch simply
 requires that the placeholder argument must be NULL]

Original-signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-06-03 16:14:34 +10:00
Stephen Smalley
7ea59202db selinux: Only apply bounds checking to source types
The current bounds checking of both source and target types
requires allowing any domain that has access to the child
domain to also have the same permissions to the parent, which
is undesirable.  Drop the target bounds checking.

KaiGai Kohei originally removed all use of target bounds in
commit 7d52a155e3 ("selinux: remove dead code in
type_attribute_bounds_av()") but this was reverted in
commit 2ae3ba3938 ("selinux: libsepol: remove dead code in
check_avtab_hierarchy_callback()") because it would have
required explicitly allowing the parent any permissions
to the child that the child is allowed to itself.

This change in contrast retains the logic for the case where both
source and target types are bounded, thereby allowing access
if the parent of the source is allowed the corresponding
permissions to the parent of the target.  Further, this change
reworks the logic such that we only perform a single computation
for each case and there is no ambiguity as to how to resolve
a bounds violation.

Under the new logic, if the source type and target types are both
bounded, then the parent of the source type must be allowed the same
permissions to the parent of the target type.  If only the source
type is bounded, then the parent of the source type must be allowed
the same permissions to the target type.

Examples of the new logic and comparisons with the old logic:
1. If we have:
	typebounds A B;
then:
	allow B self:process <permissions>;
will satisfy the bounds constraint iff:
	allow A self:process <permissions>;
is also allowed in policy.

Under the old logic, the allow rule on B satisfies the
bounds constraint if any of the following three are allowed:
	allow A B:process <permissions>; or
	allow B A:process <permissions>; or
	allow A self:process <permissions>;
However, either of the first two ultimately require the third to
satisfy the bounds constraint under the old logic, and therefore
this degenerates to the same result (but is more efficient - we only
need to perform one compute_av call).

2. If we have:
	typebounds A B;
	typebounds A_exec B_exec;
then:
	allow B B_exec:file <permissions>;
will satisfy the bounds constraint iff:
	allow A A_exec:file <permissions>;
is also allowed in policy.

This is essentially the same as #1; it is merely included as
an example of dealing with object types related to a bounded domain
in a manner that satisfies the bounds relationship.  Note that
this approach is preferable to leaving B_exec unbounded and having:
	allow A B_exec:file <permissions>;
in policy because that would allow B's entrypoints to be used to
enter A.  Similarly for _tmp or other related types.

3. If we have:
	typebounds A B;
and an unbounded type T, then:
	allow B T:file <permissions>;
will satisfy the bounds constraint iff:
	allow A T:file <permissions>;
is allowed in policy.

The old logic would have been identical for this example.

4. If we have:
	typebounds A B;
and an unbounded domain D, then:
	allow D B:unix_stream_socket <permissions>;
is not subject to any bounds constraints under the new logic
because D is not bounded.  This is desirable so that we can
allow a domain to e.g. connectto a child domain without having
to allow it to do the same to its parent.

The old logic would have required:
	allow D A:unix_stream_socket <permissions>;
to also be allowed in policy.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: re-wrapped description to appease checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-05-31 12:01:59 -04:00
Al Viro
4093d306a9 securityfs: ->d_parent is never NULL or negative
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-29 16:22:06 -04:00
Al Viro
07a8e62fde drbd: ->d_parent is never NULL or negative
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-29 16:21:55 -04:00
Linus Torvalds
d102a56edb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Followups to the parallel lookup work:

   - update docs

   - restore killability of the places that used to take ->i_mutex
     killably now that we have down_write_killable() merged

   - Additionally, it turns out that I missed a prerequisite for
     security_d_instantiate() stuff - ->getxattr() wasn't the only thing
     that could be called before dentry is attached to inode; with smack
     we needed the same treatment applied to ->setxattr() as well"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  switch ->setxattr() to passing dentry and inode separately
  switch xattr_handler->set() to passing dentry and inode separately
  restore killability of old mutex_lock_killable(&inode->i_mutex) users
  add down_write_killable_nested()
  update D/f/directory-locking
2016-05-27 17:14:05 -07:00
Al Viro
3767e255b3 switch ->setxattr() to passing dentry and inode separately
smack ->d_instantiate() uses ->setxattr(), so to be able to call it before
we'd hashed the new dentry and attached it to inode, we need ->setxattr()
instances getting the inode as an explicit argument rather than obtaining
it from dentry.

Similar change for ->getxattr() had been done in commit ce23e64.  Unlike
->getxattr() (which is used by both selinux and smack instances of
->d_instantiate()) ->setxattr() is used only by smack one and unfortunately
it got missed back then.

Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-27 20:09:16 -04:00
Jann Horn
dca6b41491 Yama: fix double-spinlock and user access in atomic context
Commit 8a56038c2a ("Yama: consolidate error reporting") causes lockups
when someone hits a Yama denial. Call chain:

process_vm_readv -> process_vm_rw -> process_vm_rw_core -> mm_access
-> ptrace_may_access
task_lock(...) is taken
__ptrace_may_access -> security_ptrace_access_check
-> yama_ptrace_access_check -> report_access -> kstrdup_quotable_cmdline
-> get_cmdline -> access_process_vm -> get_task_mm
task_lock(...) is taken again

task_lock(p) just calls spin_lock(&p->alloc_lock), so at this point,
spin_lock() is called on a lock that is already held by the current
process.

Also: Since the alloc_lock is a spinlock, sleeping inside
security_ptrace_access_check hooks is probably not allowed at all? So it's
not even possible to print the cmdline from in there because that might
involve paging in userspace memory.

It would be tempting to rewrite ptrace_may_access() to drop the alloc_lock
before calling the LSM, but even then, ptrace_may_access() itself might be
called from various contexts in which you're not allowed to sleep; for
example, as far as I understand, to be able to hold a reference to another
task, usually an RCU read lock will be taken (see e.g. kcmp() and
get_robust_list()), so that also prohibits sleeping. (And using e.g. FUSE,
a user can cause pagefault handling to take arbitrary amounts of time -
see https://bugs.chromium.org/p/project-zero/issues/detail?id=808.)

Therefore, AFAIK, in order to print the name of a process below
security_ptrace_access_check(), you'd have to either grab a reference to
the mm_struct and defer the access violation reporting or just use the
"comm" value that's stored in kernelspace and accessible without big
complications. (Or you could try to use some kind of atomic remote VM
access that fails if the memory isn't paged in, similar to
copy_from_user_inatomic(), and if necessary fall back to comm, but
that'd be kind of ugly because the comm/cmdline choice would look
pretty random to the user.)

Fix it by deferring reporting of the access violation until current
exits kernelspace the next time.

v2: Don't oops on PTRACE_TRACEME, call report_access under
task_lock(current). Also fix nonsensical comment. And don't use
GPF_ATOMIC for memory allocation with no locks held.
This patch is tested both for ptrace attach and ptrace traceme.

Fixes: 8a56038c2a ("Yama: consolidate error reporting")
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-05-26 09:56:18 +10:00
Andy Shevchenko
b8b572789c security/integrity/ima/ima_policy.c: use %pU to output UUID in printable format
Instead of open coded variant re-use extension that vsprintf.c provides
us for ages.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Linus Torvalds
f4f27d0028 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - A new LSM, "LoadPin", from Kees Cook is added, which allows forcing
     of modules and firmware to be loaded from a specific device (this
     is from ChromeOS, where the device as a whole is verified
     cryptographically via dm-verity).

     This is disabled by default but can be configured to be enabled by
     default (don't do this if you don't know what you're doing).

   - Keys: allow authentication data to be stored in an asymmetric key.
     Lots of general fixes and updates.

   - SELinux: add restrictions for loading of kernel modules via
     finit_module().  Distinguish non-init user namespace capability
     checks.  Apply execstack check on thread stacks"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits)
  LSM: LoadPin: provide enablement CONFIG
  Yama: use atomic allocations when reporting
  seccomp: Fix comment typo
  ima: add support for creating files using the mknodat syscall
  ima: fix ima_inode_post_setattr
  vfs: forbid write access when reading a file into memory
  fs: fix over-zealous use of "const"
  selinux: apply execstack check on thread stacks
  selinux: distinguish non-init user namespace capability checks
  LSM: LoadPin for kernel file loading restrictions
  fs: define a string representation of the kernel_read_file_id enumeration
  Yama: consolidate error reporting
  string_helpers: add kstrdup_quotable_file
  string_helpers: add kstrdup_quotable_cmdline
  string_helpers: add kstrdup_quotable
  selinux: check ss_initialized before revalidating an inode label
  selinux: delay inode label lookup as long as possible
  selinux: don't revalidate an inode's label when explicitly setting it
  selinux: Change bool variable name to index.
  KEYS: Add KEYCTL_DH_COMPUTE command
  ...
2016-05-19 09:21:36 -07:00
Linus Torvalds
a7fd20d1c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support SPI based w5100 devices, from Akinobu Mita.

   2) Partial Segmentation Offload, from Alexander Duyck.

   3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

   4) Allow cls_flower stats offload, from Amir Vadai.

   5) Implement bpf blinding, from Daniel Borkmann.

   6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
      actually using FASYNC these atomics are superfluous.  From Eric
      Dumazet.

   7) Run TCP more preemptibly, also from Eric Dumazet.

   8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
      driver, from Gal Pressman.

   9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

  10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

  11) Support tunneling offloads in qed, from Manish Chopra.

  12) aRFS offloading in mlx5e, from Maor Gottlieb.

  13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
      Leitner.

  14) Add MSG_EOR support to TCP, this allows controlling packet
      coalescing on application record boundaries for more accurate
      socket timestamp sampling.  From Martin KaFai Lau.

  15) Fix alignment of 64-bit netlink attributes across the board, from
      Nicolas Dichtel.

  16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

  17) Several conversions of drivers to ethtool ksettings, from Philippe
      Reynes.

  18) Checksum neutral ILA in ipv6, from Tom Herbert.

  19) Factorize all of the various marvell dsa drivers into one, from
      Vivien Didelot

  20) Add VF support to qed driver, from Yuval Mintz"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
  Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
  Revert "phy dp83867: Make rgmii parameters optional"
  r8169: default to 64-bit DMA on recent PCIe chips
  phy dp83867: Make rgmii parameters optional
  phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
  bpf: arm64: remove callee-save registers use for tmp registers
  asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
  switchdev: pass pointer to fib_info instead of copy
  net_sched: close another race condition in tcf_mirred_release()
  tipc: fix nametable publication field in nl compat
  drivers: net: Don't print unpopulated net_device name
  qed: add support for dcbx.
  ravb: Add missing free_irq() calls to ravb_close()
  qed: Remove a stray tab
  net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fec-mpc52xx: use phydev from struct net_device
  bpf, doc: fix typo on bpf_asm descriptions
  stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
  net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fs-enet: use phydev from struct net_device
  ...
2016-05-17 16:26:30 -07:00
Linus Torvalds
c52b76185b Merge branch 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull 'struct path' constification update from Al Viro:
 "'struct path' is passed by reference to a bunch of Linux security
  methods; in theory, there's nothing to stop them from modifying the
  damn thing and LSM community being what it is, sooner or later some
  enterprising soul is going to decide that it's a good idea.

  Let's remove the temptation and constify all of those..."

* 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify ima_d_path()
  constify security_sb_pivotroot()
  constify security_path_chroot()
  constify security_path_{link,rename}
  apparmor: remove useless checks for NULL ->mnt
  constify security_path_{mkdir,mknod,symlink}
  constify security_path_{unlink,rmdir}
  apparmor: constify common_perm_...()
  apparmor: constify aa_path_link()
  apparmor: new helper - common_path_perm()
  constify chmod_common/security_path_chmod
  constify security_sb_mount()
  constify chown_common/security_path_chown
  tomoyo: constify assorted struct path *
  apparmor_path_truncate(): path->mnt is never NULL
  constify vfs_truncate()
  constify security_path_truncate()
  [apparmor] constify struct path * in a bunch of helpers
2016-05-17 14:41:03 -07:00
Linus Torvalds
7f427d3a60 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull parallel filesystem directory handling update from Al Viro.

This is the main parallel directory work by Al that makes the vfs layer
able to do lookup and readdir in parallel within a single directory.
That's a big change, since this used to be all protected by the
directory inode mutex.

The inode mutex is replaced by an rwsem, and serialization of lookups of
a single name is done by a "in-progress" dentry marker.

The series begins with xattr cleanups, and then ends with switching
filesystems over to actually doing the readdir in parallel (switching to
the "iterate_shared()" that only takes the read lock).

A more detailed explanation of the process from Al Viro:
 "The xattr work starts with some acl fixes, then switches ->getxattr to
  passing inode and dentry separately.  This is the point where the
  things start to get tricky - that got merged into the very beginning
  of the -rc3-based #work.lookups, to allow untangling the
  security_d_instantiate() mess.  The xattr work itself proceeds to
  switch a lot of filesystems to generic_...xattr(); no complications
  there.

  After that initial xattr work, the series then does the following:

   - untangle security_d_instantiate()

   - convert a bunch of open-coded lookup_one_len_unlocked() to calls of
     that thing; one such place (in overlayfs) actually yields a trivial
     conflict with overlayfs fixes later in the cycle - overlayfs ended
     up switching to a variant of lookup_one_len_unlocked() sans the
     permission checks.  I would've dropped that commit (it gets
     overridden on merge from #ovl-fixes in #for-next; proper resolution
     is to use the variant in mainline fs/overlayfs/super.c), but I
     didn't want to rebase the damn thing - it was fairly late in the
     cycle...

   - some filesystems had managed to depend on lookup/lookup exclusion
     for *fs-internal* data structures in a way that would break if we
     relaxed the VFS exclusion.  Fixing hadn't been hard, fortunately.

   - core of that series - parallel lookup machinery, replacing
     ->i_mutex with rwsem, making lookup_slow() take it only shared.  At
     that point lookups happen in parallel; lookups on the same name
     wait for the in-progress one to be done with that dentry.

     Surprisingly little code, at that - almost all of it is in
     fs/dcache.c, with fs/namei.c changes limited to lookup_slow() -
     making it use the new primitive and actually switching to locking
     shared.

   - parallel readdir stuff - first of all, we provide the exclusion on
     per-struct file basis, same as we do for read() vs lseek() for
     regular files.  That takes care of most of the needed exclusion in
     readdir/readdir; however, these guys are trickier than lookups, so
     I went for switching them one-by-one.  To do that, a new method
     '->iterate_shared()' is added and filesystems are switched to it
     as they are either confirmed to be OK with shared lock on directory
     or fixed to be OK with that.  I hope to kill the original method
     come next cycle (almost all in-tree filesystems are switched
     already), but it's still not quite finished.

   - several filesystems get switched to parallel readdir.  The
     interesting part here is dealing with dcache preseeding by readdir;
     that needs minor adjustment to be safe with directory locked only
     shared.

     Most of the filesystems doing that got switched to in those
     commits.  Important exception: NFS.  Turns out that NFS folks, with
     their, er, insistence on VFS getting the fuck out of the way of the
     Smart Filesystem Code That Knows How And What To Lock(tm) have
     grown the locking of their own.  They had their own homegrown
     rwsem, with lookup/readdir/atomic_open being *writers* (sillyunlink
     is the reader there).  Of course, with VFS getting the fuck out of
     the way, as requested, the actual smarts of the smart filesystem
     code etc. had become exposed...

   - do_last/lookup_open/atomic_open cleanups.  As the result, open()
     without O_CREAT locks the directory only shared.  Including the
     ->atomic_open() case.  Backmerge from #for-linus in the middle of
     that - atomic_open() fix got brought in.

   - then comes NFS switch to saner (VFS-based ;-) locking, killing the
     homegrown "lookup and readdir are writers" kinda-sorta rwsem.  All
     exclusion for sillyunlink/lookup is done by the parallel lookups
     mechanism.  Exclusion between sillyunlink and rmdir is a real rwsem
     now - rmdir being the writer.

     Result: NFS lookups/readdirs/O_CREAT-less opens happen in parallel
     now.

   - the rest of the series consists of switching a lot of filesystems
     to parallel readdir; in a lot of cases ->llseek() gets simplified
     as well.  One backmerge in there (again, #for-linus - rockridge
     fix)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (74 commits)
  ext4: switch to ->iterate_shared()
  hfs: switch to ->iterate_shared()
  hfsplus: switch to ->iterate_shared()
  hostfs: switch to ->iterate_shared()
  hpfs: switch to ->iterate_shared()
  hpfs: handle allocation failures in hpfs_add_pos()
  gfs2: switch to ->iterate_shared()
  f2fs: switch to ->iterate_shared()
  afs: switch to ->iterate_shared()
  befs: switch to ->iterate_shared()
  befs: constify stuff a bit
  isofs: switch to ->iterate_shared()
  get_acorn_filename(): deobfuscate a bit
  btrfs: switch to ->iterate_shared()
  logfs: no need to lock directory in lseek
  switch ecryptfs to ->iterate_shared
  9p: switch to ->iterate_shared()
  fat: switch to ->iterate_shared()
  romfs, squashfs: switch to ->iterate_shared()
  more trivial ->iterate_shared conversions
  ...
2016-05-17 11:01:31 -07:00
Linus Torvalds
91e8d0cbc9 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "A rather small set of patches from the timer departement:

   - Some more y2038 work
   - Yet another new clocksource driver
   - The usual set of small fixes, cleanups and enhancements"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/tegra: Remove unused suspend/resume code
  clockevents/driversi/mps2: add MPS2 Timer driver
  dt-bindings: document the MPS2 timer bindings
  clocksource/drivers/mtk_timer: Add __init attribute
  clockevents/drivers/dw_apb_timer: Implement ->set_state_oneshot_stopped()
  time: Introduce do_sys_settimeofday64()
  security: Introduce security_settime64()
  clocksource: Add missing include of of.h.
2016-05-17 09:49:28 -07:00
Kees Cook
b937190c40 LSM: LoadPin: provide enablement CONFIG
Instead of being enabled by default when SECURITY_LOADPIN is selected,
provide an additional (default off) config to determine the boot time
behavior. As before, the "loadpin.enabled=0/1" kernel parameter remains
available.

Suggested-by: James Morris <jmorris@namei.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-05-17 20:10:30 +10:00
Al Viro
0e0162bb8c Merge branch 'ovl-fixes' into for-linus
Backmerge to resolve a conflict in ovl_lookup_real();
"ovl_lookup_real(): use lookup_one_len_unlocked()" instead,
but it was too late in the cycle to rebase.
2016-05-17 02:17:59 -04:00
David S. Miller
e800072c18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
In netdevice.h we removed the structure in net-next that is being
changes in 'net'.  In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.

The mlx5 conflicts have to do with vxlan support dependencies.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 15:59:24 -04:00
James Morris
a6926cc989 Merge branch 'stable-4.7' of git://git.infradead.org/users/pcmoore/selinux into next 2016-05-06 09:31:34 +10:00
James Morris
0250abcd72 Merge tag 'keys-next-20160505' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2016-05-06 09:29:00 +10:00
Sasha Levin
74f430cd0f Yama: use atomic allocations when reporting
Access reporting often happens from atomic contexes. Avoid
lockups when allocating memory for command lines.

Fixes: 8a56038c2a ("Yama: consolidate error reporting")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2016-05-04 10:54:05 -07:00
David Howells
d55201ce08 Merge branch 'keys-trust' into keys-next
Here's a set of patches that changes how certificates/keys are determined
to be trusted.  That's currently a two-step process:

 (1) Up until recently, when an X.509 certificate was parsed - no matter
     the source - it was judged against the keys in .system_keyring,
     assuming those keys to be trusted if they have KEY_FLAG_TRUSTED set
     upon them.

     This has just been changed such that any key in the .ima_mok keyring,
     if configured, may also be used to judge the trustworthiness of a new
     certificate, whether or not the .ima_mok keyring is meant to be
     consulted for whatever process is being undertaken.

     If a certificate is determined to be trustworthy, KEY_FLAG_TRUSTED
     will be set upon a key it is loaded into (if it is loaded into one),
     no matter what the key is going to be loaded for.

 (2) If an X.509 certificate is loaded into a key, then that key - if
     KEY_FLAG_TRUSTED gets set upon it - can be linked into any keyring
     with KEY_FLAG_TRUSTED_ONLY set upon it.  This was meant to be the
     system keyring only, but has been extended to various IMA keyrings.
     A user can at will link any key marked KEY_FLAG_TRUSTED into any
     keyring marked KEY_FLAG_TRUSTED_ONLY if the relevant permissions masks
     permit it.

These patches change that:

 (1) Trust becomes a matter of consulting the ring of trusted keys supplied
     when the trust is evaluated only.

 (2) Every keyring can be supplied with its own manager function to
     restrict what may be added to that keyring.  This is called whenever a
     key is to be linked into the keyring to guard against a key being
     created in one keyring and then linked across.

     This function is supplied with the keyring and the key type and
     payload[*] of the key being linked in for use in its evaluation.  It
     is permitted to use other data also, such as the contents of other
     keyrings such as the system keyrings.

     [*] The type and payload are supplied instead of a key because as an
         optimisation this function may be called whilst creating a key and
         so may reject the proposed key between preparse and allocation.

 (3) A default manager function is provided that permits keys to be
     restricted to only asymmetric keys that are vouched for by the
     contents of the system keyring.

     A second manager function is provided that just rejects with EPERM.

 (4) A key allocation flag, KEY_ALLOC_BYPASS_RESTRICTION, is made available
     so that the kernel can initialise keyrings with keys that form the
     root of the trust relationship.

 (5) KEY_FLAG_TRUSTED and KEY_FLAG_TRUSTED_ONLY are removed, along with
     key_preparsed_payload::trusted.

This change also makes it possible in future for userspace to create a private
set of trusted keys and then to have it sealed by setting a manager function
where the private set is wholly independent of the kernel's trust
relationships.

Further changes in the set involve extracting certain IMA special keyrings
and making them generally global:

 (*) .system_keyring is renamed to .builtin_trusted_keys and remains read
     only.  It carries only keys built in to the kernel.  It may be where
     UEFI keys should be loaded - though that could better be the new
     secondary keyring (see below) or a separate UEFI keyring.

 (*) An optional secondary system keyring (called .secondary_trusted_keys)
     is added to replace the IMA MOK keyring.

     (*) Keys can be added to the secondary keyring by root if the keys can
         be vouched for by either ring of system keys.

 (*) Module signing and kexec only use .builtin_trusted_keys and do not use
     the new secondary keyring.

 (*) Config option SYSTEM_TRUSTED_KEYS now depends on ASYMMETRIC_KEY_TYPE as
     that's the only type currently permitted on the system keyrings.

 (*) A new config option, IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY,
     is provided to allow keys to be added to IMA keyrings, subject to the
     restriction that such keys are validly signed by a key already in the
     system keyrings.

     If this option is enabled, but secondary keyrings aren't, additions to
     the IMA keyrings will be restricted to signatures verifiable by keys in
     the builtin system keyring only.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-05-04 17:20:20 +01:00
Mimi Zohar
cf90ea9340 ima: fix the string representation of the LSM/IMA hook enumeration ordering
This patch fixes the string representation of the LSM/IMA hook enumeration
ordering used for displaying the IMA policy.

Fixes: d9ddf077bb ("ima: support for kexec image and initramfs")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Tested-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-05-04 18:46:00 +10:00
Mimi Zohar
05d1a717ec ima: add support for creating files using the mknodat syscall
Commit 3034a14 "ima: pass 'opened' flag to identify newly created files"
stopped identifying empty files as new files.  However new empty files
can be created using the mknodat syscall.  On systems with IMA-appraisal
enabled, these empty files are not labeled with security.ima extended
attributes properly, preventing them from subsequently being opened in
order to write the file data contents.  This patch defines a new hook
named ima_post_path_mknod() to mark these empty files, created using
mknodat, as new in order to allow the file data contents to be written.

In addition, files with security.ima xattrs containing a file signature
are considered "immutable" and can not be modified.  The file contents
need to be written, before signing the file.  This patch relaxes this
requirement for new files, allowing the file signature to be written
before the file contents.

Changelog:
- defer identifying files with signatures stored as security.ima
  (based on Dmitry Rozhkov's comments)
- removing tests (eg. dentry, dentry->d_inode, inode->i_size == 0)
  (based on Al's review)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Al Viro <<viro@zeniv.linux.org.uk>
Tested-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
2016-05-01 09:23:52 -04:00
Mimi Zohar
42a4c60319 ima: fix ima_inode_post_setattr
Changing file metadata (eg. uid, guid) could result in having to
re-appraise a file's integrity, but does not change the "new file"
status nor the security.ima xattr.  The IMA_PERMIT_DIRECTIO and
IMA_DIGSIG_REQUIRED flags are policy rule specific.  This patch
only resets these flags, not the IMA_NEW_FILE or IMA_DIGSIG flags.

With this patch, changing the file timestamp will not remove the
file signature on new files.

Reported-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Tested-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
2016-05-01 09:23:52 -04:00
Stephen Smalley
c2316dbf12 selinux: apply execstack check on thread stacks
The execstack check was only being applied on the main
process stack.  Thread stacks allocated via mmap were
only subject to the execmem permission check.  Augment
the check to apply to the current thread stack as well.
Note that this does NOT prevent making a different thread's
stack executable.

Suggested-by: Nick Kralevich <nnk@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-26 15:47:57 -04:00
Stephen Smalley
8e4ff6f228 selinux: distinguish non-init user namespace capability checks
Distinguish capability checks against a target associated
with the init user namespace versus capability checks against
a target associated with a non-init user namespace by defining
and using separate security classes for the latter.

This is needed to support e.g. Chrome usage of user namespaces
for the Chrome sandbox without needing to allow Chrome to also
exercise capabilities on targets in the init user namespace.

Suggested-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-26 15:41:43 -04:00
Baolin Wang
457db29bfc security: Introduce security_settime64()
security_settime() uses a timespec, which is not year 2038 safe
on 32bit systems. Thus this patch introduces the security_settime64()
function with timespec64 type. We also convert the cap_settime() helper
function to use the 64bit types.

This patch then moves security_settime() to the header file as an
inline helper function so that existing users can be iteratively
converted.

None of the existing hooks is using the timespec argument and therefor
the patch is not making any functional changes.

Cc: Serge Hallyn <serge.hallyn@canonical.com>,
Cc: James Morris <james.l.morris@oracle.com>,
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
Cc: Paul Moore <pmoore@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Kees Cook <keescook@chromium.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
[jstultz: Reworded commit message]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2016-04-22 11:48:30 -07:00
Kees Cook
9b091556a0 LSM: LoadPin for kernel file loading restrictions
This LSM enforces that kernel-loaded files (modules, firmware, etc)
must all come from the same filesystem, with the expectation that
such a filesystem is backed by a read-only device such as dm-verity
or CDROM. This allows systems that have a verified and/or unchangeable
filesystem to enforce module and firmware loading restrictions without
needing to sign the files individually.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-21 10:47:27 +10:00
Kees Cook
8a56038c2a Yama: consolidate error reporting
Use a common error reporting function for Yama violation reports, and give
more detail into the process command lines.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-04-21 10:47:26 +10:00
Roopa Prabhu
10c9ead9f3 rtnetlink: add new RTM_GETSTATS message to dump link stats
This patch adds a new RTM_GETSTATS message to query link stats via netlink
from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
returns a lot more than just stats and is expensive in some cases when
frequent polling for stats from userspace is a common operation.

RTM_GETSTATS is an attempt to provide a light weight netlink message
to explicity query only link stats from the kernel on an interface.
The idea is to also keep it extensible so that new kinds of stats can be
added to it in the future.

This patch adds the following attribute for NETDEV stats:
struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
        [IFLA_STATS_LINK_64]  = { .len = sizeof(struct rtnl_link_stats64) },
};

Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
a single interface or all interfaces with NLM_F_DUMP.

Future possible new types of stat attributes:
link af stats:
    - IFLA_STATS_LINK_IPV6  (nested. for ipv6 stats)
    - IFLA_STATS_LINK_MPLS  (nested. for mpls/mdev stats)
extended stats:
    - IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge,
      vlan, vxlan etc)
    - IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are
      available via ethtool today)

This patch also declares a filter mask for all stat attributes.
User has to provide a mask of stats attributes to query. filter mask
can be specified in the new hdr 'struct if_stats_msg' for stats messages.
Other important field in the header is the ifindex.

This api can also include attributes for global stats (eg tcp) in the future.
When global stats are included in a stats msg, the ifindex in the header
must be zero. A single stats message cannot contain both global and
netdev specific stats. To easily distinguish them, netdev specific stat
attributes name are prefixed with IFLA_STATS_LINK_

Without any attributes in the filter_mask, no stats will be returned.

This patch has been tested with mofified iproute2 ifstat.

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-20 15:43:42 -04:00
Paul Moore
1ac4247626 selinux: check ss_initialized before revalidating an inode label
There is no point in trying to revalidate an inode's security label if
the security server is not yet initialized.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-19 16:37:27 -04:00
Paul Moore
20cdef8d57 selinux: delay inode label lookup as long as possible
Since looking up an inode's label can result in revalidation, delay
the lookup as long as possible to limit the performance impact.

Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-19 16:37:07 -04:00
Paul Moore
2c97165bef selinux: don't revalidate an inode's label when explicitly setting it
There is no point in attempting to revalidate an inode's security
label when we are in the process of setting it.

Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-19 16:36:28 -04:00
Prarit Bhargava
0fd71a620b selinux: Change bool variable name to index.
security_get_bool_value(int bool) argument "bool" conflicts with
in-kernel macros such as BUILD_BUG().  This patch changes this to
index which isn't a type.

Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: James Morris <james.l.morris@oracle.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Perepechko <anserper@ya.ru>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: selinux@tycho.nsa.gov
Cc: Eric Paris <eparis@redhat.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
[PM: wrapped description for checkpatch.pl, use "selinux:..." as subj]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-04-14 11:24:50 -04:00
Mat Martineau
ddbb411487 KEYS: Add KEYCTL_DH_COMPUTE command
This adds userspace access to Diffie-Hellman computations through a
new keyctl() syscall command to calculate shared secrets or public
keys using input parameters stored in the keyring.

Input key ids are provided in a struct due to the current 5-arg limit
for the keyctl syscall. Only user keys are supported in order to avoid
exposing the content of logon or encrypted keys.

The output is written to the provided buffer, based on the assumption
that the values are only needed in userspace.

Future support for other types of key derivation would involve a new
command, like KEYCTL_ECDH_COMPUTE.

Once Diffie-Hellman support is included in the crypto API, this code
can be converted to use the crypto API to take advantage of possible
hardware acceleration and reduce redundant code.

Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2016-04-12 19:54:58 +01:00