kernfs_node->parent and ->name are currently marked as "published"
indicating that kernfs users may access them directly; however, those
fields may get updated by kernfs_rename[_ns]() and unrestricted access
may lead to erroneous values or oops.
Protect ->parent and ->name updates with a irq-safe spinlock
kernfs_rename_lock and implement the following accessors for these
fields.
* kernfs_name() - format the node's name into the specified buffer
* kernfs_path() - format the node's path into the specified buffer
* pr_cont_kernfs_name() - pr_cont a node's name (doesn't need buffer)
* pr_cont_kernfs_path() - pr_cont a node's path (doesn't need buffer)
* kernfs_get_parent() - pin and return a node's parent
All can be called under any context. The recursive sysfs_pathname()
in fs/sysfs/dir.c is replaced with kernfs_path() and
sysfs_rename_dir_ns() is updated to use kernfs_get_parent() instead of
dereferencing parent directly.
v2: Dummy definition of kernfs_path() for !CONFIG_KERNFS was missing
static inline making it cause a lot of build warnings. Add it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs assumed 0755 for all newly created directories and kernfs
inherited it. This assumption is unnecessarily restrictive and
inconsistent with kernfs_create_file[_ns](). This patch adds @mode
parameter to kernfs_create_dir[_ns]() and update uses in sysfs
accordingly. Among others, this will be useful for implementations of
the planned ->mkdir() method.
This patch doesn't introduce any behavior differences.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs has just been separated out from sysfs and we're already in
full conflict mode. Nothing can make the situation any worse. Let's
take the chance to name things properly.
This patch performs the following renames.
* s/SYSFS_DIR/KERNFS_DIR/
* s/SYSFS_KOBJ_ATTR/KERNFS_FILE/
* s/SYSFS_KOBJ_LINK/KERNFS_LINK/
* s/SYSFS_{TYPE_FLAGS}/KERNFS_{TYPE_FLAGS}/
* s/SYSFS_FLAG_{FLAG}/KERNFS_{FLAG}/
* s/sysfs_type()/kernfs_type()/
* s/SD_DEACTIVATED_BIAS/KN_DEACTIVATED_BIAS/
This patch is strictly rename only and doesn't introduce any
functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs has just been separated out from sysfs and we're already in
full conflict mode. Nothing can make the situation any worse. Let's
take the chance to name things properly.
s_ prefix for kernfs members is used inconsistently and a misnomer
now. It's not like kernfs_node is used widely across the kernel
making the ability to grep for the members particularly useful. Let's
just drop the prefix.
This patch is strictly rename only and doesn't introduce any
functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernfs has just been separated out from sysfs and we're already in
full conflict mode. Nothing can make the situation any worse. Let's
take the chance to name things properly.
This patch performs the following renames.
* s/sysfs_elem_dir/kernfs_elem_dir/
* s/sysfs_elem_symlink/kernfs_elem_symlink/
* s/sysfs_elem_attr/kernfs_elem_file/
* s/sysfs_dirent/kernfs_node/
* s/sd/kn/ in kernfs proper
* s/parent_sd/parent/
* s/target_sd/target/
* s/dir_sd/parent/
* s/to_sysfs_dirent()/rb_to_kn()/
* misc renames of local vars when they conflict with the above
Because md, mic and gpio dig into sysfs details, this patch ends up
modifying them. All are sysfs_dirent renames and trivial. While we
can avoid these by introducing a dummy wrapping struct sysfs_dirent
around kernfs_node, given the limited usage outside kernfs and sysfs
proper, I don't think such workaround is called for.
This patch is strictly rename only and doesn't introduce any
functional difference.
- mic / gpio renames were missing. Spotted by kbuild test robot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, it's assumed that there's a single kernfs hierarchy in the
system anchored at sysfs_root which is defined as a global struct. To
allow other users of kernfs, this will be made dynamic. Introduce a
new global variable sysfs_root_sd which points to &sysfs_root and
convert all &sysfs_root users.
This patch doesn't introduce any behavior difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move core dir code to fs/kernfs/dir.c. fs/sysfs/dir.c now only
contains sysfs_warn_dup() and sysfs wrappers around kernfs interfaces.
The respective declarations in fs/sysfs/sysfs.h are moved to
fs/kernfs/kernfs-internal.h.
This is pure relocation.
v2: sysfs_symlink_target_lock was mistakenly relocated to kernfs. It
should remain with sysfs. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs interface for finding, getting and putting
sysfs_dirents.
* sysfs_find_dirent() is renamed to kernfs_find_ns() and lockdep
assertion for sysfs_mutex is added.
* sysfs_get_dirent_ns() is renamed to kernfs_find_and_get().
* Macro inline dancing around __sysfs_get/put() are removed and
kernfs_get/put() are made proper functions implemented in
fs/sysfs/dir.c.
While the conversions are mostly equivalent, there's one difference -
kernfs_get() doesn't return the input param as its return value. This
change is intentional. While passing through the input increases
writability in some areas, it is unnecessary and has been shown to
cause confusion regarding how the last ref is handled.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, sysfs_dirent active_ref lockdep annotation uses
attribute->[s]key as the lockdep key, which forces
kernfs_create_file_ns() to assume that sysfs_dirent->priv is pointing
to a struct attribute which may not be true for non-sysfs users. This
patch restructures the lockdep annotation such that
* kernfs_ops contains lockdep_key which is used by default for files
created kernfs_create_file_ns().
* kernfs_create_file_ns_key() is introduced which takes an extra @key
argument. The created file will use the specified key for
active_ref lockdep annotation. If NULL is specified, lockdep for
the file is disabled.
* sysfs_add_file_mode_ns() is updated to use
kernfs_create_file_ns_key() with the appropriate key from the
attribute or NULL if ignore_lockdep is set.
This makes the lockdep annotation properly contained in kernfs while
allowing sysfs to cleanly keep its current behavior. This patch
doesn't introduce any behavior differences.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs_add_one() is a wrapper around __sysfs_add_one() which prints out
duplicate name warning if __sysfs_add_one() fails with -EEXIST. The
previous kernfs conversions moved all dup warnings to sysfs interface
functions and sysfs_add_one() doesn't have any user left.
Remove sysfs_add_one() and update __sysfs_add_one() to take its name.
This patch doesn't make any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs interface to manipulate a directory which takes and
returns sysfs_dirents.
create_dir() is renamed to kernfs_create_dir_ns() and its argumantes
and return value are updated. create_dir() usages are replaced with
kernfs_create_dir_ns() and sysfs_create_subdir() usages are replaced
with kernfs_create_dir(). Dup warnings are handled explicitly by
sysfs users of the kernfs interface.
sysfs_enable_ns() is renamed to kernfs_enable_ns().
This patch doesn't introduce any behavior changes.
v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
v3: kernfs_enable_ns() added.
v4: Refreshed on top of "sysfs: drop kobj_ns_type handling, take #2"
so that this patch removes sysfs_enable_ns().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A directory sysfs_dirent points to the associated kobj. A regular or
bin file points to the associated [bin_]attribute. This patch
replaces sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with void *
->priv.
This is to prepare for kernfs interface so that sysfs can specify the
private data in the same way for directories and files. This lower
debuggability but not by much - the whole thing was overlaid in a
union anyway. If debuggability becomes an issue, we can later add
->priv accessors which explicitly check for the sysfs_dirent type and
performs casting.
This patch doesn't introduce any behavior difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs rename interface, krenfs_rename[_ns]().
This is just rename of sysfs_rename(). No functional changes.
Function comment is added to kernfs_rename_ns() and @new_parent_sd is
renamed to @new_parent for consistency with other kernfs interfaces.
v2: Dummy implementation for !CONFIG_SYSFS updated to return -ENOSYS.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce kernfs removal interfaces - kernfs_remove() and
kernfs_remove_by_name[_ns]().
These are just renames of sysfs_remove() and sysfs_hash_and_remove().
No functional changes.
v2: Dummy kernfs_remove_by_name_ns() for !CONFIG_SYSFS updated to
return -ENOSYS instead of 0.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the kobject based interface guarantees that a parent
sysfs_dirent is always a directory; however, the planned kernfs
interface will be directly based on sysfs_dirents and the caller may
specify non-directory node as the parent. Add an explicit check in
__sysfs_add_one() so that such attempts fail with -EINVAL.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The way namespace tags are implemented in sysfs is more complicated
than necessary. As each tag is a pointer value and required to be
non-NULL under a namespace enabled parent, there's no need to record
separately what type each tag is. If multiple namespace types are
needed, which currently aren't, we can simply compare the tag to a set
of allowed tags in the superblock assuming that the tags, being
pointers, won't have the same value across multiple types.
This patch rips out kobj_ns_type handling from sysfs. sysfs now has
an enable switch to turn on namespace under a node. If enabled, all
children are required to have non-NULL namespace tags and filtered
against the super_block's tag.
kobject namespace determination is now performed in
lib/kobject.c::create_dir() making sysfs_read_ns_type() unnecessary.
The sanity checks are also moved. create_dir() is restructured to
ease such addition. This removes most kobject namespace knowledge
from sysfs proper which will enable proper separation and layering of
sysfs.
This is the second try. The first one was cb26a31157 ("sysfs: drop
kobj_ns_type handling") which tried to automatically enable namespace
if there are children with non-NULL namespace tags; however, it was
broken for symlinks as they should inherit the target's tag iff
namespace is enabled in the parent. This led to namespace filtering
enabled incorrectly for wireless net class devices through phy80211
symlinks and thus network configuration failure. a1212d278c
("Revert "sysfs: drop kobj_ns_type handling"") reverted the commit.
This shouldn't introduce any behavior changes, for real.
v2: Dummy implementation of sysfs_enable_ns() for !CONFIG_SYSFS was
missing and caused build failure. Reported by kbuild test robot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit cb26a31157.
It mysteriously causes NetworkManager to not find the wireless device
for me. As far as I can tell, Tejun *meant* for this commit to not make
any semantic changes, but there clearly are some. So revert it, taking
into account some of the calling convention changes that happened in
this area in subsequent commits.
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sysfs_assoc_lock is an odd piece of locking. In general, whoever owns
a kobject is responsible for synchronizing sysfs operations and sysfs
proper assumes that, for example, removal won't race with any other
operation; however, this doesn't work for symlinking because an entity
performing symlink doesn't usually own the target kobject and thus has
no control over its removal.
sysfs_assoc_lock synchronizes symlink operations against kobj->sd
disassociation so that symlink code doesn't end up dereferencing
already freed sysfs_dirent by racing with removal of the target
kobject.
This is quite obscure and the generic name of the lock and lack of
comments make it difficult to understand its role. Let's rename it to
sysfs_symlink_target_lock and add comments explaining what's going on.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Separate out sysfs_warn_dup() out of sysfs_add_one(). This will help
separating out the core sysfs functionalities into kernfs so that it
can be used by non-sysfs users too.
This doesn't make any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Most removal related logic is implemented in fs/sysfs/dir.c. Move
sysfs_hash_and_remove() to fs/sysfs/dir.c so that __sysfs_remove()
doesn't have to be public.
This is pure relocation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
375b611e60 ("sysfs: remove sysfs_buffer->ops") introduced
sysfs_file_ops() which determines the associated file operation of a
given sysfs_dirent. As file ops access should be protected by an
active reference, the new function includes a lockdep assertion on the
sysfs_dirent; unfortunately, I forgot to take attr->ignore_lockdep
flag into account and the lockdep assertion trips spuriously for files
which opt out from active reference lockdep checking.
# cat /sys/devices/pci0000:00/0000:00:01.2/usb1/authorized
------------[ cut here ]------------
WARNING: CPU: 1 PID: 540 at /work/os/work/fs/sysfs/file.c:79 sysfs_file_ops+0x4e/0x60()
Modules linked in:
CPU: 1 PID: 540 Comm: cat Not tainted 3.11.0-work+ #3
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
0000000000000009 ffff880016205c08 ffffffff81ca0131 0000000000000000
ffff880016205c40 ffffffff81096d0d ffff8800166cb898 ffff8800166f6f60
ffffffff8125a220 ffff880011ab1ec0 ffff88000aff0c78 ffff880016205c50
Call Trace:
[<ffffffff81ca0131>] dump_stack+0x4e/0x82
[<ffffffff81096d0d>] warn_slowpath_common+0x7d/0xa0
[<ffffffff81096dea>] warn_slowpath_null+0x1a/0x20
[<ffffffff8125994e>] sysfs_file_ops+0x4e/0x60
[<ffffffff8125a274>] sysfs_open_file+0x54/0x300
[<ffffffff811df612>] do_dentry_open.isra.17+0x182/0x280
[<ffffffff811df820>] finish_open+0x30/0x40
[<ffffffff811f0623>] do_last+0x503/0xd90
[<ffffffff811f0f6b>] path_openat+0xbb/0x6d0
[<ffffffff811f23ba>] do_filp_open+0x3a/0x90
[<ffffffff811e09a9>] do_sys_open+0x129/0x220
[<ffffffff811e0abe>] SyS_open+0x1e/0x20
[<ffffffff81caf3c2>] system_call_fastpath+0x16/0x1b
---[ end trace aa48096b111dafdb ]---
Rename fs/sysfs/dir.c::ignore_lockdep() to sysfs_ignore_lockdep() and
move it to fs/sysfs/sysfs.h and make sysfs_file_ops() skip lockdep
assertion if sysfs_ignore_lockdep() is true.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the previous changes, sysfs regular file code is ready to handle
bin files too. This patch makes bin files share the regular file
path.
* sysfs_create/remove_bin_file() are moved to fs/sysfs/file.c.
* sysfs_init_inode() is updated to use the new sysfs_bin_operations
instead of bin_fops for bin files.
* fs/sysfs/bin.c and the related pieces are removed.
This patch shouldn't introduce any behavior difference to bin file
accesses.
Overall, this unification reduces the amount of duplicate logic, makes
behaviors more consistent and paves the road for building simpler and
more versatile interface which will allow other subsystems to make use
of sysfs for their pseudo filesystems.
v2: Stale fs/sysfs/bin.c reference dropped from
Documentation/DocBook/filesystems.tmpl. Reported by kbuild test
robot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay@vrfy.org>
Cc: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs bin file handling will be merged into the regular file support.
This patch copies mmap support from bin so that fs/sysfs/file.c can
handle mmapping bin files.
The code is copied mostly verbatim with the following updates.
* ->mmapped and ->vm_ops are added to sysfs_open_file and bin_buffer
references are replaced with sysfs_open_file ones.
* Symbols are prefixed with sysfs_.
* sysfs_unmap_bin_file() grabs sysfs_open_dirent and traverses
->files. Invocation of this function is added to
sysfs_addrm_finish().
* sysfs_bin_mmap() is added to sysfs_bin_operations.
This is a preparation and the new mmap path isn't used yet.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Given a sysfs_dirent, there is no reason to have multiple versions of
removal functions. A function which removes the specified
sysfs_dirent and its descendants is enough.
This patch intorduces [__}sysfs_remove() which replaces all internal
variations of removal functions. This will be the only removal
function in the planned new sysfs_dirent based interface.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, sysfs directory removal is inconsistent in that it would
remove any files directly under it but wouldn't recurse into
directories. Thanks to group subdirectories, this doesn't even match
with kobject boundaries. sysfs is in the process of being separated
out so that it can be used by multiple subsystems and we want to have
a consistent behavior - either removal of a sysfs_dirent should remove
every descendant entries or none instead of something inbetween.
This patch implements proper recursive removal in
__sysfs_remove_dir(). The function now walks its subtree in a
post-order walk to remove all descendants.
This is a behavior change but kobject / driver layer, which currently
is the only consumer, has already been updated to handle duplicate
removal attempts, so nothing should be broken after this change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs currently has a rather weird behavior regarding removals. A
directory removal would delete all files directly under it but
wouldn't recurse into subdirectories, which, while a bit inconsistent,
seems to make sense at the first glance as each directory is
supposedly associated with a kobject and each kobject can take care of
the directory deletion; however, this doesn't really hold as we have
groups which can be directories without a kobject associated with it
and require explicit deletions.
We're in the process of separating out sysfs from kboject / driver
core and want a consistent behavior. A removal should delete either
only the specified node or everything under it. I think it is helpful
to support recursive atomic removal and later patches will implement
it.
Such change means that a sysfs_dirent associated with kobject may be
deleted before the kobject itself is removed if one of its ancestor
gets removed before it. As sysfs_remove_dir() puts the base ref, we
may end up with dangling pointer on descendants. This can be solved
by holding an extra reference on the sd from kobject.
Acquire an extra reference on the associated sysfs_dirent on directory
creation and put it after removal.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysfs_addrm_start/finish() enclose sysfs_dirent additions and
deletions and sysfs_addrm_cxt is used to record information necessary
to finish the operations. Currently, sysfs_addrm_start() takes
@parent_sd, records it in sysfs_addrm_cxt, and assumes that all
operations in the block are performed under that @parent_sd.
This assumption has been fine until now but we want to make some
operations behave recursively and, while having @parent_sd recorded in
sysfs_addrm_cxt doesn't necessarily prevents that, it becomes
confusing.
This patch removes sysfs_addrm_cxt->parent_sd and makes
sysfs_add_one() take an explicit @parent_sd parameter. Note that
sysfs_remove_one() doesn't need the extra argument as its parent is
always known from the target @sd.
While at it, add __acquires/releases() notations to
sysfs_addrm_start/finish() respectively.
This patch doesn't make any functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some internal sysfs functions which take explicit namespace argument
are weird in that they place the optional @ns in front of @name which
is contrary to the established convention. This is confusing and
error-prone especially as @ns and @name may be interchanged without
causing compilation warning.
Swap the positions of @name and @ns in the following internal
functions.
sysfs_find_dirent()
sysfs_rename()
sysfs_hash_and_remove()
sysfs_name_hash()
sysfs_name_compare()
create_dir()
This patch doesn't introduce any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pre-existing sysfs interfaces which take explicit namespace
argument are weird in that they place the optional @ns in front of
@name which is contrary to the established convention. For example,
we end up forcing vast majority of sysfs_get_dirent() users to do
sysfs_get_dirent(parent, NULL, name), which is silly and error-prone
especially as @ns and @name may be interchanged without causing
compilation warning.
This renames sysfs_get_dirent() to sysfs_get_dirent_ns() and swap the
positions of @name and @ns, and sysfs_get_dirent() is now a wrapper
around sysfs_get_dirent_ns(). This makes confusions a lot less
likely.
There are other interfaces which take @ns before @name. They'll be
updated by following patches.
This patch doesn't introduce any functional changes.
v2: EXPORT_SYMBOL_GPL() wasn't updated leading to undefined symbol
error on module builds. Reported by build test robot. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The way namespace tags are implemented in sysfs is more complicated
than necessary. As each tag is a pointer value and required to be
non-NULL under a namespace enabled parent, there's no need to record
separately what type each tag is or where namespace is enabled.
If multiple namespace types are needed, which currently aren't, we can
simply compare the tag to a set of allowed tags in the superblock
assuming that the tags, being pointers, won't have the same value
across multiple types. Also, whether to filter by namespace tag or
not can be trivially determined by whether the node has any tagged
children or not.
This patch rips out kobj_ns_type handling from sysfs. sysfs no longer
cares whether specific type of namespace is enabled or not. If a
sysfs_dirent has a non-NULL tag, the parent is marked as needing
namespace filtering and the value is tested against the allowed set of
tags for the superblock (currently only one but increasing this number
isn't difficult) and the sysfs_dirent is ignored if it doesn't match.
This removes most kobject namespace knowledge from sysfs proper which
will enable proper separation and layering of sysfs. The namespace
sanity checks in fs/sysfs/dir.c are replaced by the new sanity check
in kobject_namespace(). As this is the only place ktype->namespace()
is called for sysfs, this doesn't weaken the sanity check
significantly. I omitted converting the sanity check in
sysfs_do_create_link_sd(). While the check can be shifted to upper
layer, mistakes there are well contained and should be easily visible
anyway.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some unrecognizable reason, namespace information is communicated
to sysfs through ktype->namespace() callback when there's *nothing*
which needs the use of a callback. The whole sequence of operations
is completely synchronous and sysfs operations simply end up calling
back into the layer which just invoked it in order to find out the
namespace information, which is completely backwards, obfuscates
what's going on and unnecessarily tangles two separate layers.
This patch doesn't remove ktype->namespace() but shifts its handling
to kobject layer. We probably want to get rid of the callback in the
long term.
This patch adds an explicit param to sysfs_{create|rename|move}_dir()
and renames them to sysfs_{create|rename|move}_dir_ns(), respectively.
ktype->namespace() invocations are moved to the calling sites of the
above functions. A new helper kboject_namespace() is introduced which
directly tests kobj_ns_type_operations->type which should give the
same result as testing sysfs_fs_type(parent_sd) and returns @kobj's
namespace tag as necessary. kobject_namespace() is extern as it will
be used from another file in the following patches.
This patch should be an equivalent conversion without any functional
difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The expansion of to_sysfs_dirent() contains an unncessary trailing
semicolon making it impossible to use in the middle of statements.
Drop it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Do have_submounts(), shrink_dcache_parent() and d_drop() atomically.
check_submounts_and_drop() can deal with negative dentries and
non-directories as well.
Non-directories can also be mounted on. And just like directories we don't
want these to disappear with invalidation.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Here's the big driver core merge for 3.11-rc1
Lots of little things, and larger firmware subsystem updates, all
described in the shortlog. Nice thing here is that we finally get rid
of CONFIG_HOTPLUG, after 10+ years, thanks to Stephen Rohtwell (it had
been always on for a number of kernel releases, now it's just removed.)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlHRsGMACgkQMUfUDdst+ylIIACfW8lLxOPVK+iYG699TWEBAkp0
LFEAnjlpAMJ1JnoZCuWDZObNCev93zGB
=020+
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here's the big driver core merge for 3.11-rc1
Lots of little things, and larger firmware subsystem updates, all
described in the shortlog. Nice thing here is that we finally get rid
of CONFIG_HOTPLUG, after 10+ years, thanks to Stephen Rohtwell (it had
been always on for a number of kernel releases, now it's just
removed)"
* tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (27 commits)
driver core: device.h: fix doc compilation warnings
firmware loader: fix another compile warning with PM_SLEEP unset
build some drivers only when compile-testing
firmware loader: fix compile warning with PM_SLEEP set
kobject: sanitize argument for format string
sysfs_notify is only possible on file attributes
firmware loader: simplify holding module for request_firmware
firmware loader: don't export cache_firmware and uncache_firmware
drivers/base: Use attribute groups to create sysfs memory files
firmware loader: fix compile warning
firmware loader: fix build failure with !CONFIG_FW_LOADER_USER_HELPER
Documentation: Updated broken link in HOWTO
Finally eradicate CONFIG_HOTPLUG
driver core: firmware loader: kill FW_ACTION_NOHOTPLUG requests before suspend
driver core: firmware loader: don't cache FW_ACTION_NOHOTPLUG firmware
Documentation: Tidy up some drivers/base/core.c kerneldoc content.
platform_device: use a macro instead of platform_driver_register
firmware: move EXPORT_SYMBOL annotations
firmware: Avoid deadlock of usermodehelper lock at shutdown
dell_rbu: Select CONFIG_FW_LOADER_USER_HELPER explicitly
...
Fix a typo subling->sibling in the comment of sysfs_link_sibling().
Signed-off-by: Warner Wang <warner.wang@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It might be a kernel disaster if one sysfs entry is freed but
still referenced by sysfs tree.
Recently Dave and Sasha reported one use-after-free problem on
sysfs entry, and the problem has been troubleshooted with help
of debug message added in this patch.
Given sysfs_get_dirent/sysfs_put are exported APIs, even inside
sysfs they are called in many contexts(kobject/attribe add/delete,
inode init/drop, dentry lookup/release, readdir, ...), it is healthful
to check the removed flag before freeing one entry and dump message
if it is freeing without being removed first.
Cc: Dave Jones <davej@redhat.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The inode->i_mutex isn't hold when updating filp->f_pos
in read()/write(), so the filp->f_pos might be read as
0 or 1 in readdir() when there is concurrent read()/write()
on this same file, then may cause use after free in readdir().
The bug can be reproduced with Li Zefan's test code on the
link:
https://patchwork.kernel.org/patch/2160771/
This patch fixes the use after free under this situation.
Cc: stable <stable@vger.kernel.org>
Reported-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It seems that sysfs has an interesting way of doing the same thing.
This removes the cpu_relax unfortunately, but if it's really needed,
it would be better to add this to include/linux/atomic.h to benefit
all atomic ops users.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case of 'if (filp->f_pos == 0 or 1)' of sysfs_readdir(),
the failure from filldir() isn't handled, and the reference counter
of the sysfs_dirent object pointed by filp->private_data will be
released without clearing filp->private_data, so use after free
bug will be triggered later.
This patch returns immeadiately under the situation for fixing the bug,
and it is reasonable to return from readdir() when filldir() fails.
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While readdir() is running, lseek() may set filp->f_pos as zero,
then may leave filp->private_data pointing to one sysfs_dirent
object without holding its reference counter, so the sysfs_dirent
object may be used after free in next readdir().
This patch holds inode->i_mutex to avoid the problem since
the lock is always held in readdir path.
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The warning check for duplicate sysfs entries can cause a buffer overflow
when printing the warning, as strcat() doesn't check buffer sizes.
Use strlcat() instead.
Since strlcat() doesn't return a pointer to the passed buffer, unlike
strcat(), I had to convert the nested concatenation in sysfs_add_one() to
an admittedly more obscure comma operator construct, to avoid emitting code
for the concatenation if CONFIG_BUG is disabled.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here's the big driver core pull request for 3.6-rc1.
Unlike 3.5, this kernel should be a lot tamer, with the printk changes now
settled down. All we have here is some extcon driver updates, w1 driver
updates, a few printk cleanups that weren't needed for 3.5, but are good to
have now, and some other minor fixes/changes in the driver core.
All of these have been in the linux-next releases for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAlARgIUACgkQMUfUDdst+ynDHgCfRNwIB9L+zZvjcKE5e1BhDbUl
wVUAn398DFgbJ1+PjGkd1EMR2uVTh7Ou
=MIFu
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core changes from Greg Kroah-Hartman:
"Here's the big driver core pull request for 3.6-rc1.
Unlike 3.5, this kernel should be a lot tamer, with the printk changes
now settled down. All we have here is some extcon driver updates, w1
driver updates, a few printk cleanups that weren't needed for 3.5, but
are good to have now, and some other minor fixes/changes in the driver
core.
All of these have been in the linux-next releases for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (38 commits)
printk: Export struct log size and member offsets through vmcoreinfo
Drivers: hv: Change the hex constant to a decimal constant
driver core: don't trigger uevent after failure
extcon: MAX77693: Add extcon-max77693 driver to support Maxim MAX77693 MUIC device
sysfs: fail dentry revalidation after namespace change fix
sysfs: fail dentry revalidation after namespace change
extcon: spelling of detach in function doc
extcon: arizona: Stop microphone detection if we give up on it
extcon: arizona: Update cable reporting calls and split headset
PM / Runtime: Do not increment device usage counts before probing
kmsg - do not flush partial lines when the console is busy
kmsg - export "continuation record" flag to /dev/kmsg
kmsg - avoid warning for CONFIG_PRINTK=n compilations
kmsg - properly print over-long continuation lines
driver-core: Use kobj_to_dev instead of re-implementing it
driver-core: Move kobj_to_dev from genhd.h to device.h
driver core: Move deferred devices to the end of dpm_list before probing
driver core: move uevent call to driver_register
driver core: fix shutdown races with probe/remove(v3)
Extcon: Arizona: Add driver for Wolfson Arizona class devices
...
don't assume that KOBJ_NS_TYPE_NONE==0. Also save a test-n-branch.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we change the namespace tag of a sysfs entry, the associated dentry
is still kept around. readdir() will work correctly and not display the
old entries, but open() will still succeed, so will reads and writes.
This will no longer happen if sysfs is remounted, hinting that this is a
cache-related problem.
I am using the following sequence to demonstrate that:
shell1:
ip link add type veth
unshare -nm
shell2:
ip link set veth1 <pid_of_shell_1>
cat /sys/devices/virtual/net/veth1/ifindex
Before that patch, this will succeed (fail to fail). After it, it will
correctly return an error. Differently from a normal rename, which we
handle fine, changing the object namespace will keep it's path intact.
So this check seems necessary as well.
[ v2: get type from parent, as suggested by Eric Biederman ]
Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: Tejun Heo <tj@kernel.org>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>