Commit Graph

25 Commits

Author SHA1 Message Date
Tejun Heo dd14cbc994 sysfs: fix race condition around sd->s_dentry, take#2
Allowing attribute and symlink dentries to be reclaimed means
sd->s_dentry can change dynamically.  However, updates to the field
are unsynchronized leading to race conditions.  This patch adds
sysfs_lock and use it to synchronize updates to sd->s_dentry.

Due to the locking around ->d_iput, the check in sysfs_drop_dentry()
is complex.  sysfs_lock only protect sd->s_dentry pointer itself.  The
validity of the dentry is protected by dcache_lock, so whether dentry
is alive or not can only be tested while holding both locks.

This is minimal backport of sysfs_drop_dentry() rewrite in devel
branch.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-12 16:08:47 -07:00
Tejun Heo 6aa054aadf sysfs: fix condition check in sysfs_drop_dentry()
The condition check doesn't make much sense as it basically always
succeeds.  This causes NULL dereferencing on certain cases.  It seems
that parentheses are put in the wrong place.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-12 16:08:46 -07:00
Eric Sandeen dc351252b3 sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses
Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir.  This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.

Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-06-12 16:08:46 -07:00
Alexey Dobriyan e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Alan Stern e7b0d26a86 [PATCH] sysfs: reinstate exclusion between method calls and attribute unregistration
This patch (as869) reinstates the mutual exclusion between sysfs
attribute method calls and attribute unregistration.  The
previously-reported deadlocks have been fixed, and this exclusion is
by far the simplest way to avoid races during driver unbinding.

The check for orphaned read-buffers has been moved down slightly, so
that the remainder of a partially-read buffer will still be available
to userspace even after the attribute has been unregistered.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-15 15:29:26 -07:00
Hugh Dickins 266d4f4037 [PATCH] suspend regression: sysfs deadlock
Suspend deadlocks when trying to unregister /sys/block/sr0.

This comes from Oliver's commit 94bebf4d1b
"Driver core: fix race in sysfs between sysfs_remove_file() and
read()/write()".

sysfs_write_file downs buffer->sem while calling flush_write_buffer, and
flushing that particular write buffer entails downing buffer->sem in
orphan_all_buffers, resulting in the obvious self-deadlock.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 17:59:14 -08:00
Arjan van de Ven c5ef1c42c5 [PATCH] mark struct inode_operations const 3
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Eric W. Biederman b592fcfe7f sysfs: Shadow directory support
The problem.  When implementing a network namespace I need to be able
to have multiple network devices with the same name.  Currently this
is a problem for /sys/class/net/*. 

What I want is a separate /sys/class/net directory in sysfs for each
network namespace, and I want to name each of them /sys/class/net.

I looked and the VFS actually allows that.  All that is needed is
for /sys/class/net to implement a follow link method to redirect
lookups to the real directory you want. 

Implementing a follow link method that is sensitive to the current
network namespace turns out to be 3 lines of code so it looks like a
clean approach.  Modifying sysfs so it doesn't get in my was is a bit
trickier. 

I am calling the concept of multiple directories all at the same path
in the filesystem shadow directories.  With the directory entry really
at that location the shadow master. 

The following patch modifies sysfs so it can handle a directory
structure slightly different from the kobject tree so I can implement
the shadow directories for handling /sys/class/net/.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:14 -08:00
Frederik Deweerdt d3fc373ac5 sysfs: suppress lockdep warnings
Lockdep issues the following warning:
[    9.064000] =============================================
[    9.064000] [ INFO: possible recursive locking detected ]
[    9.064000] 2.6.20-rc3-mm1 #3
[    9.064000] ---------------------------------------------
[    9.064000] init/1 is trying to acquire lock:
[    9.064000]  (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.064000]
[    9.064000] but task is already holding lock:
[    9.064000]  (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.065000]
[    9.065000] other info that might help us debug this:
[    9.065000] 2 locks held by init/1:
[    9.065000]  #0:  (tty_mutex){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.065000]  #1:  (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.065000]
[    9.065000] stack backtrace:
[    9.065000]  [<c010390d>] show_trace_log_lvl+0x1a/0x30
[    9.066000]  [<c0103935>] show_trace+0x12/0x14
[    9.066000]  [<c0103a2f>] dump_stack+0x16/0x18
[    9.066000]  [<c0138cb8>] print_deadlock_bug+0xb9/0xc3
[    9.066000]  [<c0138d17>] check_deadlock+0x55/0x5a
[    9.066000]  [<c013a953>] __lock_acquire+0x371/0xbf0
[    9.066000]  [<c013b7a9>] lock_acquire+0x69/0x83
[    9.066000]  [<c03e6b7e>] __mutex_lock_slowpath+0x75/0x2d1
[    9.066000]  [<c03e6afc>] mutex_lock+0x1c/0x1f
[    9.066000]  [<c01b249c>] sysfs_drop_dentry+0xb1/0x133
[    9.066000]  [<c01b25d1>] sysfs_hash_and_remove+0xb3/0x142
[    9.066000]  [<c01b30ed>] sysfs_remove_file+0xd/0x10
[    9.067000]  [<c02849e0>] device_remove_file+0x23/0x2e
[    9.067000]  [<c02850b2>] device_del+0x188/0x1e6
[    9.067000]  [<c028511b>] device_unregister+0xb/0x15
[    9.067000]  [<c0285318>] device_destroy+0x9c/0xa9
[    9.067000]  [<c0261431>] vcs_remove_sysfs+0x1c/0x3b
[    9.067000]  [<c0267a08>] con_close+0x5e/0x6b
[    9.067000]  [<c02598f2>] release_dev+0x4c4/0x6e5
[    9.067000]  [<c0259faa>] tty_release+0x12/0x1c
[    9.067000]  [<c0174872>] __fput+0x177/0x1a0
[    9.067000]  [<c01746f5>] fput+0x3b/0x41
[    9.068000]  [<c0172ee1>] filp_close+0x36/0x65
[    9.068000]  [<c0172f73>] sys_close+0x63/0xa4
[    9.068000]  [<c0102a96>] sysenter_past_esp+0x5f/0x99
[    9.068000]  =======================

This is due to sysfs_hash_and_remove() holding dir->d_inode->i_mutex
before calling sysfs_drop_dentry() which calls orphan_all_buffers()
which in turn takes node->i_mutex.

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:13 -08:00
Oliver Neukum 94bebf4d1b Driver core: fix race in sysfs between sysfs_remove_file() and read()/write()
This patch prevents a race between IO and removing a file from sysfs.
It introduces a list of sysfs_buffers associated with a file at the inode.
Upon removal of a file the list is walked and the buffers marked orphaned.
IO to orphaned buffers fails with -ENODEV. The driver can safely free
associated data structures or be unloaded.

Signed-off-by: Oliver Neukum <oliver@neukum.name>
Acked-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 10:37:13 -08:00
Theodore Ts'o ba52de123d [PATCH] inode-diet: Eliminate i_blksize from the inode structure
This eliminates the i_blksize field from struct inode.  Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.

Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.

[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Randy.Dunlap 995982ca79 sysfs_remove_bin_file: no return value, dump_stack on error
Make sysfs_remove_bin_file() void.  If it detects an error,
printk the file name and call dump_stack().

sysfs_hash_and_remove() now returns an error code indicating
its success or failure so that sysfs_remove_bin_file() can
know success/failure.

Convert the only driver that checked the return value of
sysfs_remove_bin_file().

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:39 -07:00
Arjan van de Ven 232ba9dbd6 [PATCH] lockdep: annotate the sysfs i_mutex to be a separate class
sysfs has a different i_mutex lock order behavior for i_mutex than the
other filesystems; sysfs i_mutex is called in many places with subsystem
locks held.  At the same time, many of the VFS locking rules do not apply
to sysfs at all (cross directory rename for example).  To untangle this
mess (which gives false positives in lockdep), we're giving sysfs inodes
their own class for i_mutex.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-12 12:52:54 -07:00
Christoph Hellwig f5e54d6e53 [PATCH] mark address_space_operations const
Same as with already do with the file operations: keep them in .rodata and
prevents people from doing runtime patching.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:04 -07:00
Eric Sesterhenn 99cee0cd75 BUG_ON() Conversion in fs/sysfs/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:18:38 +02:00
Eric Sesterhenn 58d49283b8 [PATCH] sysfs: kzalloc conversion
this converts fs/sysfs to kzalloc() usage.
compile tested with make allyesconfig

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:58 -08:00
Greg Kroah-Hartman 641e6f30a0 [PATCH] sysfs: sysfs_remove_dir() needs to invalidate the dentry
When calling sysfs_remove_dir() don't allow any further sysfs functions
to work for this kobject anymore.  This fixes a nasty USB cdc-acm oops
on disconnect.

Many thanks to Bob Copeland and Paul Fulghum for taking the time to
track this down.

Cc: Bob Copeland <email@bobcopeland.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:57 -08:00
Randy Dunlap 16f7e0fe2e [PATCH] capable/capability.h (fs/)
fs: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:13 -08:00
Jes Sorensen 1b1dcc1b57 [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.

Modified-by: Ingo Molnar <mingo@elte.hu>

(finished the conversion)

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-01-09 15:59:24 -08:00
James Bottomley 36676bcbf9 [PATCH] Fix oops in sysfs_hash_and_remove_file()
The problem arises if an entity in sysfs is created and removed without
ever having been made completely visible.  In SCSI this is triggered by
removing a device while it's initialising.

The problem appears to be that because it was never made visible in sysfs,
the sysfs dentry has a null d_inode which oopses when a reference is made
to it.  The solution is simply to check d_inode and assume the object was
never made visible (and thus doesn't need deleting) if it's NULL.

(akpm: possibly a stopgap for 2.6.13 scsi problems.  May not be the
long-term fix)

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-26 19:37:13 -07:00
Maneesh Soni 9ca1eb3282 [PATCH] sysfs: fix sysfs_setattr
o sysfs_dirent's s_mode field should also be updated in sysfs_setattr(), else
  there could be inconsistency in the two fields. s_mode is used while
  ->readdir so as not to bring in the inode to cache.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29 13:12:49 -07:00
Christoph Hellwig 5f45f1a78f [PATCH] remove duplicate get_dentry functions in various places
Various filesystem drivers have grown a get_dentry() function that's a
duplicate of lookup_one_len, except that it doesn't take a maximum length
argument and doesn't check for \0 or / in the passed in filename.

Switch all these places to use lookup_one_len.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Greg KH <greg@kroah.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:20 -07:00
Maneesh Soni 8215534ce7 [PATCH] sysfs-iattr: set inode attributes
o Following patch sets the attributes for newly allocated inodes for sysfs
  objects. If the object has non-default attributes, inode attributes are
  set as saved in sysfs_dirent->s_iattr, pointer to struct iattr.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:37 -07:00
Maneesh Soni 988d186de5 [PATCH] sysfs-iattr: add sysfs_setattr
o This adds ->i_op->setattr VFS method for sysfs inodes. The changed
  attribues are saved in the persistent sysfs_dirent structure as a pointer
  to struct iattr. The struct iattr is allocated only for those sysfs_dirent's
  for which default attributes are getting changed. Thanks to Jon Smirl for
  this suggestion.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:37 -07:00
Linus Torvalds 1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00