Commit Graph

3601 Commits

Author SHA1 Message Date
David Howells cf9a2ae8d4 [PATCH] BLOCK: Move functions out of buffer code [try #6]
Move some functions out of the buffering code that aren't strictly buffering
specific.  This is a precursor to being able to disable the block layer.

 (*) Moved some stuff out of fs/buffer.c:

     (*) The file sync and general sync stuff moved to fs/sync.c.

     (*) The superblock sync stuff moved to fs/super.c.

     (*) do_invalidatepage() moved to mm/truncate.c.

     (*) try_to_release_page() moved to mm/filemap.c.

 (*) Moved some related declarations between header files:

     (*) declarations for do_invalidatepage() and try_to_release_page() moved
     	 to linux/mm.h.

     (*) __set_page_dirty_buffers() moved to linux/buffer_head.h.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30 20:31:19 +02:00
Oleg Nesterov cf342e52e3 [PATCH] Don't need to disable interrupts for tasklist_lock
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30 20:31:18 +02:00
Jens Axboe caa38fb0f4 [PATCH] ext3: make meta data reads use READ_META
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-30 20:29:42 +02:00
Jens Axboe fc46379daf [PATCH] cfq-iosched: kill cfq_exit_lock
cfq_exit_lock is protecting two things now:

- The per-ioc rbtree of cfq_io_contexts

- The per-cfqd linked list of cfq_io_contexts

The per-cfqd linked list can be protected by the queue lock, as it is (by
definition) per cfqd as the queue lock is.

The per-ioc rbtree is mainly used and updated by the process itself only.
The only outside use is the io priority changing. If we move the
priority changing to not browsing the rbtree, we can remove any locking
from the rbtree updates and lookup completely. Let the sys_ioprio syscall
just mark processes as having the iopriority changed and lazily update
the private cfq io contexts the next time io is queued, and we can
remove this locking as well.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-09-30 20:29:36 +02:00
Linus Torvalds c0341b0f47 Merge git://oss.sgi.com:8090/xfs/xfs-2.6
* git://oss.sgi.com:8090/xfs/xfs-2.6: (49 commits)
  [XFS] Remove v1 dir trace macro - missed in a past commit.
  [XFS] 955947: Infinite loop in xfs_bulkstat() on formatter() error
  [XFS] pv 956241, author: nathans, rv: vapo - make ino validation checks
  [XFS] pv 956240, author: nathans, rv: vapo - Minor fixes in
  [XFS] Really fix use after free in xfs_iunpin.
  [XFS] Collapse sv_init and init_sv into just the one interface.
  [XFS] standardize on one sema init macro
  [XFS] Reduce endian flipping in alloc_btree, same as was done for
  [XFS] Minor cleanup from dio locking fix, remove an extra conditional.
  [XFS] Fix kmem_zalloc_greedy warnings on 64 bit platforms.
  [XFS] pv 955157, rv bnaujok - break the loop on EFAULT formatter() error
  [XFS] pv 955157, rv bnaujok - break the loop on formatter() error
  [XFS] Fixes the leak in reservation space because we weren't ungranting
  [XFS] Add lock annotations to xfs_trans_update_ail and
  [XFS] Fix a porting botch on the realtime subvol growfs code path.
  [XFS] Minor code rearranging and cleanup to prevent some coverity false
  [XFS] Remove a no-longer-correct debug assert from dio completion
  [XFS] Add a greedy allocation interface, allocating within a min/max size
  [XFS] Improve error handling for the zero-fsblock extent detection code.
  [XFS] Be more defensive with page flags (error/private) for metadata
  ...
2006-09-29 09:36:55 -07:00
Vivek Goyal 632dd2053a [PATCH] Kcore elf note namesz field fix
o As per ELF specifications, it looks like that elf note "namesz" field
  contains the length of "name" including the size of null character.  And
  currently we are filling "namesz" without taking into the consideration
  the null character size.

o Kexec-tools performs this check deligently hence I ran into the issue
  while trying to open /proc/kcore in kexec-tools for some info.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:25 -07:00
Andrew Morton 327dcaadc0 [PATCH] expand_fdtable(): remove pointless unlock+lock
This unlock/lock on a super-unlikely path isn't worth the kernel text.

Cc: Vadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:25 -07:00
Vadim Lobanov 74d392aaab [PATCH] Clean up expand_fdtable() and expand_files()
Perform a code cleanup against the expand_fdtable() and expand_files()
functions inside fs/file.c.  It aims to make the flow of code within these
functions simpler and easier to understand, via added comments and modest
refactoring.

Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:25 -07:00
Andreas Gruenbacher 39f0247d38 [PATCH] Access Control Lists for tmpfs
Add access control lists for tmpfs.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:24 -07:00
Andreas Gruenbacher f0c8bd164e [PATCH] Generic infrastructure for acls
The patches solve the following problem: We want to grant access to devices
based on who is logged in from where, etc.  This includes switching back and
forth between multiple user sessions, etc.

Using ACLs to define device access for logged-in users gives us all the
flexibility we need in order to fully solve the problem.

Device special files nowadays usually live on tmpfs, hence tmpfs ACLs.

Different distros have come up with solutions that solve the problem to
different degrees: SUSE uses a resource manager which tracks login sessions
and sets ACLs on device inodes as appropriate.  RedHat uses pam_console, which
changes the primary file ownership to the logged-in user.  Others use a set of
groups that users must be in in order to be granted the appropriate accesses.

The freedesktop.org project plans to implement a combination of a
console-tracker and a HAL-device-list based solution to grant access to
devices to users, and more distros will likely follow this approach.

These patches have first been posted here on 2 February 2005, and again
on 8 January 2006. We have been shipping them in SLES9 and SLES10 with
no problems reported.  The previous submission is archived here:

   http://lkml.org/lkml/2006/1/8/229
   http://lkml.org/lkml/2006/1/8/230
   http://lkml.org/lkml/2006/1/8/231

This patch:

Add some infrastructure for access control lists on in-memory
filesystems such as tmpfs.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:24 -07:00
Chris Snook 4e6fd33b75 [PATCH] enforce RLIMIT_NOFILE in poll()
POSIX states that poll() shall fail with EINVAL if nfds > OPEN_MAX.  In
this context, POSIX is referring to sysconf(OPEN_MAX), which is the value
of current->signal->rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not
the compile-time constant which happens to also be named OPEN_MAX.  In the
current code, an application may poll up to max_fdset file descriptors,
even if this exceeds RLIMIT_NOFILE.  The current code also breaks
applications which poll more than max_fdset descriptors, which worked circa
2.4.18 when the check was against NR_OPEN, which is 1024*1024.  This patch
enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has
been changed at run time with ulimit -n.

To elaborate on the rationale for this, there are three cases:

1) RLIMIT_NOFILE is at the default value of 1024

In this (default) case, the patch changes nothing.  Calls with nfds > 1024
fail with EINVAL both before and after the patch, and calls with nfds <=
1024 pass the check both before and after the patch, since 1024 is the
initial value of max_fdset.

2) RLIMIT_NOFILE has been raised above the default

In this case, poll() becomes more permissive, allowing polling up to
RLIMIT_NOFILE file descriptors even if less than 1024 have been opened.
The patch won't introduce new errors here.  If an application somehow
depends on poll() failing when it polls with duplicate or invalid file
descriptors, it's already broken, since this is already allowed below 1024,
and will also work above 1024 if enough file descriptors have been open at
some point to cause max_fdset to have been increased above nfds.

3) RLIMIT_NOFILE has been lowered below the default

In this case, the system administrator or the user has gone out of their
way to protect the system from inefficient (or malicious) applications
wasting kernel memory.  The current code allows polling up to 1024 file
descriptors even if RLIMIT_NOFILE is much lower, which is not what the user
or administrator intended.  Well-written applications which only poll
valid, unique file descriptors will never notice the difference, because
they'll hit the limit on open() first.  If an application gets broken
because of the patch in this case, then it was already poorly/maliciously
designed, and allowing it to work in the past was a violation of POSIX and
a DoS risk on low-resource systems.

With this patch, poll() will permit exactly what POSIX suggests, no more,
no less, and for any run-time value set with ulimit -n, not just 256 or
1024.  There are existing apps which which poll a large number of file
descriptors, some of which may be invalid, and if those numbers stradle
1024, they currently fail with or without the patch in -mm, though they
worked fine under 2.4.18.

Signed-off-by: Chris Snook <csnook@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:23 -07:00
Andreas Mohr e518ddb7ba [PATCH] fs/namei.c: replace multiple current->fs by shortcut variable
Replace current->fs by fs helper variable to reduce some indirection
overhead and (at least at the moment, before the current_thread_info() %gs
PDA improvement is available) get rid of more costly current references.
Reduces fs/namei.o from 37786 to 37082 Bytes (704 Bytes saved).

[akpm@osdl.org: cleanup]
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:22 -07:00
Alexey Dobriyan 6ea36ddbd1 [PATCH] Ban register_filesystem(NULL);
Everyone passes valid pointer there.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:20 -07:00
Alexey Dobriyan d826380b30 [PATCH] 9p: fix leak on error path
If register_filesystem() fails mux workqueue must be killed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@lanl.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:20 -07:00
Alexey Dobriyan 368bdb3d61 [PATCH] cramfs: make cramfs_uncompress_exit() return void
It always returns 0, so relying on it is useless.  The only caller isn't
checking return value.  In general, un-, de-, -free functions should return
void.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:20 -07:00
Alexey Dobriyan a4376e13ce [PATCH] freevxfs: fix leak on error path
If register_filesystem() fails, vxfs_inode cache must be destroyed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:20 -07:00
Alexey Dobriyan 50d44ed009 [PATCH] cramfs: rewrite init_cramfs_fs()
Two lines -- two bugs. :-(

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:20 -07:00
Frederik Deweerdt f7ca54f486 [PATCH] fix mem_write() return value
At the beginning of the routine, "copied" is set to 0, but it is no good
because in lines 805 and 812 it is set to other values.  Finally, the
routine returns as if it copied 12 (=ENOMEM) bytes less than it actually
did.

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Acked-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:19 -07:00
Jason Baron 87d7c8aca8 [PATCH] block_dev.c mutex_lock_nested() fix
In the case below we are locking the whole disk not a partition.  This
change simply brings the code in line with the piece above where when we
are the 'first' opener, and we are a partition.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:19 -07:00
Ian Kent 44938af6e0 [PATCH] autofs4: pending flag not cleared on mount fail
During testing I've found that the mount pending flag can be left set at
exit from autofs4_lookup after a failed mount request.  This shouldn't be
allowed to happen and causes incorrect error returns.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:18 -07:00
Ian Kent be3ca7fecb [PATCH] autofs4: autofs4_follow_link false negative fix
The check for an empty directory in the autofs4_follow_link method fails
occassionally due to old dentrys.  We had the same problem
autofs4_revalidate ages ago.  I thought we wouldn't need this in
autofs4_follow_link, silly me.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:18 -07:00
Jonathan Corbet cf3e43dbe0 [PATCH] cdev documentation
Add some documentation comments for the cdev interface.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Acked-by: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:16 -07:00
Alan Cox 3cfd0885fa [PATCH] tty: stop the tty vanishing under procfs access
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:16 -07:00
Joel & Rebecca VanderZee fb50ae7446 [PATCH] I/O Error attempting to read last partial block of a file in an ISO9660 file system
There was an I/O error that prevented reading the last partial block of
large files in an ISO9660 filesystem.  The error was generated when a file
comprised more than one section and had a size that was not an exact
multiple of the filesystem block size.  This patch removes the check (and
failure) for reading into the last partial block (and possibly beyond) for
multiple-section files.

It worked in my testing to prevent reading beyond the end of the section;
my first patch just incremented the sect_size block count for a partial
block and continued doing the check.  But there is a commment in the source
code about reading beyond the end of the file to fill a page cache.
Failing to access beyond the section would prevent reading beyond the end
of the file.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:15 -07:00
Jan Kara b525a7e444 [PATCH] dquot: add proper locking when using current->signal->tty
Dquot passes the tty to tty_write_message without locking

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:14 -07:00
Oleg Nesterov bce9a234ce [PATCH] elf_fdpic_core_dump: don't take tasklist_lock
do_each_thread() is rcu-safe, and all tasks which use this ->mm must sleep
in wait_for_completion(&mm->core_done) at this point, so we can use RCU
locks.

Also, remove unneeded INIT_LIST_HEAD(new) before list_add(new, head).

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:14 -07:00
Oleg Nesterov 486ccb05fd [PATCH] elf_core_dump: don't take tasklist_lock
do_each_thread() is rcu-safe, and all tasks which use this ->mm must sleep
in wait_for_completion(&mm->core_done) at this point, so we can use RCU
locks.

Also, remove unneeded INIT_LIST_HEAD(new) before list_add(new, head).

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:14 -07:00
Ernie Petrides ee731f4f78 [PATCH] fix wrong error code on interrupted close syscalls
The problem is that close() syscalls can call a file system's flush
handler, which in turn might sleep interruptibly and ultimately pass back
an -ERESTARTSYS return value.  This happens for files backed by an
interruptible NFS mount under nfs_file_flush() when a large file has just
been written and nfs_wait_bit_interruptible() detects that there is a
signal pending.

I have a test case where the "strace" command is used to attach to a
process sleeping in such a close().  Since the SIGSTOP is forced onto the
victim process (removing it from the thread's "blocked" mask in
force_sig_info()), the RPC wait is interrupted and the close() is
terminated early.

But the file table entry has already been cleared before the flush handler
was called.  Thus, when the syscall is restarted, the file descriptor
appears closed and an EBADF error is returned (which is wrong).  What's
worse, there is the hypothetical case where another thread of a
multi-threaded application might have reused the file descriptor, in which
case that file would be mistakenly closed.

The bottom line is that close() syscalls are not restartable, and thus
-ERESTARTSYS return values should be mapped to -EINTR.  This is consistent
with the close(2) manual page.  The fix is below.

Signed-off-by: Ernie Petrides <petrides@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:13 -07:00
Amos Waterland 01d553d0fe [PATCH] Chardev checking of overlapping ranges
The code in __register_chrdev_region checks that if the driver wishing to
register has the same major as an existing driver the new minor range is
strictly less than the existing minor range.  However, it does not also
check that the new minor range is strictly greater than the existing minor
range.  That is, if driver X has registered with major=x and minor=0-3,
__register_chrdev_region will allow driver Y to register with major=x and
minor=1-4.

Signed-off-by: Amos Waterland <apw@us.ibm.com>
Cc: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:12 -07:00
Kirill Korotaev 3b9b8ab65d [PATCH] Fix unserialized task->files changing
Fixed race on put_files_struct on exec with proc.  Restoring files on
current on error path may lead to proc having a pointer to already kfree-d
files_struct.

->files changing at exit.c and khtread.c are safe as exit_files() makes all
things under lock.

Found during OpenVZ stress testing.

[akpm@osdl.org: add export]
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:12 -07:00
Chris Mason ae78bf9c4f [PATCH] add -o flush for fat
Fat is commonly used on removable media.  Mounting with -o flush tells the
FS to write things to disk as quickly as possible.  It is like -o sync, but
much faster (and not as safe).

Signed-off-by: Chris Mason <mason@suse.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:12 -07:00
Alexey Dobriyan 50462062a0 [PATCH] fs.h: ifdef security fields
[assuming BSD security levels are deleted]
The only user of i_security, f_security, s_security fields is SELinux,
however, quite a few security modules are trying to get into kernel.
So, wrap them under CONFIG_SECURITY. Adding config option for each
security field is likely an overkill.

Following Stephen Smalley's suggestion, i_security initialization is
moved to security_inode_alloc() to not clutter core code with ifdefs
and make alloc_inode() codepath tiny little bit smaller and faster.

The user of (highly greppable) struct fown_struct::security field is
still to be found. I've checked every "fown_struct" and every "f_owner"
occurence. Additionally it's removal doesn't break i386 allmodconfig
build.

struct inode, struct file, struct super_block, struct fown_struct
become smaller.

P.S. Combined with two reiserfs inode shrinking patches sent to
linux-fsdevel, I can finally suck 12 reiserfs inodes into one page.

		/proc/slabinfo

	-ext2_inode_cache	388	10
	+ext2_inode_cache	384	10
	-inode_cache		280	14
	+inode_cache		276	14
	-proc_inode_cache	296	13
	+proc_inode_cache	292	13
	-reiser_inode_cache	336	11
	+reiser_inode_cache	332	12 <=
	-shmem_inode_cache	372	10
	+shmem_inode_cache	368	10

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:11 -07:00
Alexey Dobriyan cfe14677f2 [PATCH] reiserfs: ifdef ACL stuff from inode
Shrink reiserfs inode more (by 8 bytes) for ACL non-users:

	-reiser_inode_cache     344     11
	+reiser_inode_cache     336     11

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:11 -07:00
Alexey Dobriyan 068fbb315d [PATCH] reiserfs: ifdef xattr_sem
Shrink reiserfs inode by 12 bytes for xattr non-users (me).

	-reiser_inode_cache     356     11
	+reiser_inode_cache     344     11

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:11 -07:00
Chris Mason a317202714 [PATCH] Fix reiserfs latencies caused by data=ordered
ReiserFS does periodic cleanup of old transactions in order to limit the
length of time a journal replay may take after a crash.  Sometimes, writing
metadata from an old (already committed) transaction may require committing
a newer transaction, which also requires writing all data=ordered buffers.
This can cause very long stalls on journal_begin.

This patch makes sure new transactions will not need to be committed before
trying a periodic reclaim of an old transaction.  It is low risk because if
a bad decision is made, it just means a slightly longer journal replay
after a crash.

Signed-off-by: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:11 -07:00
Chris Mason 25736b1c69 [PATCH] reiserfs_fsync should only use barriers when they are enabled
make sure that reiserfs_fsync only triggers barriers when mounted with -o
barrier=flush

Signed-off-by: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:11 -07:00
Eric Sandeen 39b3f6d6e9 [PATCH] mount udf UDF_PART_FLAG_READ_ONLY partitions with MS_RDONLY
There's a bug where a UDF_PART_FLAG_READ_ONLY udf partition gets mounted
read-write, then subsequent problems happen; files seem to be able to be
removed, but file creation results in EIO or worse, oops.

EIO is coming from udf_new_block(), which returns EIO if the right flags
aren't set; only UDF_PART_FLAG_READ_ONLY is set in this case.  We probably
s hould not have gotten this far...

Attached patch seems to fix it - and includes a printk to alert the user
that their "rw" mount request has been converted to "ro."

Here's the testcase I used:

[root@magnesium ~]# mkisofs -R -J -udf -o testiso /tmp/
...
Total translation table size: 0
Total rockridge attributes bytes: 342923
Total directory bytes: 382312
Path table size(bytes): 104
Max brk space used 103000
105059 extents written (205 MB)

[root@magnesium ~]# mount -o loop testiso /mnt/test/
[root@magnesium ~]# ls /mnt/test/fsfile
/mnt/test/fsfile
[root@magnesium ~]# rm /mnt/test/fsfile
[root@magnesium ~]# ls /mnt/test/fsfile
ls: /mnt/test/fsfile: No such file or directory
[root@magnesium ~]# touch /mnt/test/fsfile
touch: cannot touch `/mnt/test/fsfile': Input/output error
[root@magnesium tmp]# grep udf /proc/mounts
/dev/loop1 /mnt/test udf rw 0 0

Force readonly mounts of UDF partitions marked as read-only.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:09 -07:00
Olaf Hering e1dfa92dca [PATCH] ignore partition table on disks with AIX label
The on-disk data structures from AIX are not known, also the filesystem
layout is not known.  There is a msdos partition signature at the end of
the first block, and the kernel recognizes 3 small (and overlapping)
partitions.  But they are not usable.  Maybe the firmware uses it to find
the bootloader for AIX, but AIX boots also if the first block is cleared.

This is the content of the partition table:
 # dd if=/dev/sdb count=$(( 4 * 16 )) bs=1 skip=$(( 0x1be )) | xxd
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000010: 80ff ffff 41ff ffff 1b11 0000 381b 0000  ....A.......8...
0000020: 00ff ffff 41ff ffff 0211 0000 1900 0000  ....A...........
0000030: 80ff ffff 41ff ffff 1b11 0000 381b 0000  ....A.......8...

Handle the whole disk as empty disk.

This fixes also YaST which compares the output from parted (and formerly
fdisk) with /proc/partitions.  fdisk recognizes the AIX label since a long
time, SuSE has a patch for parted to handle the disk label as unknown.

dmesg will look like this:
 sda: [AIX]  unknown partition table

Tested on an IBM B50 with AIX V4.3.3.

Signed-off-by: Olaf Hering <olh@suse.de>
Cc: Albert Cahalan <acahalan@gmail.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:09 -07:00
Miklos Szeredi 650a898342 [PATCH] vfs: define new lookup flag for chdir
In the "operation does permission checking" model used by fuse, chdir
permission is not checked, since there's no chdir method.

For this case set a lookup flag, which will be passed to ->permission(), so
fuse can distinguish it from permission checks for other operations.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:08 -07:00
Miklos Szeredi 5b35e8e58a [PATCH] fuse: use dentry in statfs
Some filesystems may want to report different values depending on the path
within the filesystem, i.e.  one mount is actually several filesystems.  This
can be the case for a network filesystem exported by an unprivileged server
(e.g.  sshfs).

This is now possible, thanks to David Howells "VFS: Permit filesystem to
perform statfs with a known root dentry" patch.

This change is backward compatible, so no need to change interface version.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:08 -07:00
Eugene Teo 8454aeef6f [PATCH] Require mmap handler for a.out executables
Files supported by fs/proc/base.c, i.e.  /proc/<pid>/*, are not capable of
meeting the validity checks in ELF load_elf_*() handling because they have
no mmap handler which is required by ELF.  In order to stop a.out
executables being used as part of an exploit attack against /proc-related
vulnerabilities, we make a.out executables depend on ->mmap() existing.

Signed-off-by: Eugene Teo <eteo@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:08 -07:00
Josh Triplett 9c4dbee79d [PATCH] fs: add lock annotation to grab_super
grab_super gets called with sb_lock held, and releases it.  Add a lock
annotation to this function so that sparse can check callers for lock
pairing, and so that sparse will not complain about this function since it
intentionally uses the lock in this manner.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:08 -07:00
Josh Triplett ddc0a51d2e [PATCH] hugetlbfs: add lock annotation to hugetlbfs_forget_inode()
hugetlbfs_forget_inode releases inode_lock.  Add a lock annotation to this
function so that sparse can check callers for lock pairing, and so that
sparse will not complain about this functions since it intentionally uses
the lock in this manner.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:08 -07:00
Josh Triplett 105f4d7a81 [PATCH] fuse: add lock annotations to request_end and fuse_read_interrupt
request_end and fuse_read_interrupt release fc->lock.  Add lock annotations
to these two functions so that sparse can check callers for lock pairing,
and so that sparse will not complain about these functions since they
intentionally use locks in this manner.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:08 -07:00
Josh Triplett 99fc705996 [PATCH] afs: add lock annotations to afs_proc_cell_servers_{start,stop}
afs_proc_cell_servers_start acquires a lock, and afs_proc_cell_servers_stop
releases that lock.  Add lock annotations to these two functions so that
sparse can check callers for lock pairing, and so that sparse will not
complain about these functions since they intentionally use locks in this
manner.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:07 -07:00
Josh Triplett 58f555e5f6 [PATCH] mbcache: add lock annotation for __mb_cache_entry_release_unlock()
__mb_cache_entry_release_unlock releases mb_cache_spinlock, so annotate it
accordingly.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:07 -07:00
Olaf Hering 42012cc4a2 [PATCH] use gcc -O1 in fs/reiserfs only for ancient gcc versions
Only compile with -O1 if the (very old) compiler is broken.  We use
reiserfs alot since SLES9 on ppc64, and it was never seen with gcc33.
Assume the broken gcc is gcc-3.4 or older.

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:07 -07:00
Pekka J Enberg c0d92cbc58 [PATCH] libfs: remove page up-to-date check from simple_readpage
Remove the unnecessary PageUptodate check from simple_readpage.  The only
two callers for ->readpage that don't have explicit PageUptodate check are
read_cache_pages and page_cache_read which operate on newly allocated pages
which don't have the flag set.

[akpm: use the allegedly-faster clear_page(), too]
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:06 -07:00
Randy Dunlap 15a67dd8cc [PATCH] fs/namespace: handle init/registration errors
Check and handle init errors.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:05 -07:00
Andrew Morton 4d7dd8fd95 [PATCH] blockdev.c: check driver layer errors
Check driver layer errors.

Fix from: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>

In blockdevc-check-errors.patch, add_bd_holder() is modified to return error
values when some of its operation failed.  Among them, it returns -EEXIST when
a given bd_holder object already exists in the list.

However, in this case, the function completed its work successfully and need
no action by its caller other than freeing unused bd_holder object.  So I
think it's better to return success after freeing by itself.

Otherwise, bd_claim-ing with same claim pointer will fail.
Typically, lvresize will fails with following message:
  device-mapper: reload ioctl failed: Invalid argument
and you'll see messages like below in kernel log:
  device-mapper: table: 254:13: linear: dm-linear: Device lookup failed
  device-mapper: ioctl: error adding target to table

Similarly, it should not add bd_holder to the list if either one of symlinking
fails.  I don't have a test case for this to happen but it should cause
dereference of freed pointer.

If a matching bd_holder is found in bd_holder_list, add_bd_holder() completes
its job by just incrementing the reference count.  In this case, it should be
considered as success but it used to return 'fail' to let the caller free
temporary bd_holder.  Fixed it to return success and free given object by
itself.

Also, if either one of symlinking fails, the bd_holder should not be added to
the list so that it can be discarded later.  Otherwise, the caller will free
bd_holder which is in the list.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:04 -07:00
Zoltan Menyhart d1807793e1 [PATCH] JBD: memory leak in "journal_init_dev()"
We leak a bh ref in "journal_init_dev()" in case of failure.

Signed-off-by: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:03 -07:00
Dave Kleikamp f71b2f10f5 [PATCH] JBD: Make journal_brelse_array() static
It's always good to make symbols static when we can, and this also eliminates
the need to rename the function in jbd2

Suggested by Eric Sandeen.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:03 -07:00
Tim Shimmin 65e8697a12 [XFS] Remove v1 dir trace macro - missed in a past commit.
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-29 15:23:02 +10:00
Vlad Apostolov 6e73b41888 [XFS] 955947: Infinite loop in xfs_bulkstat() on formatter() error
SGI-PV: 955947
SGI-Modid: xfs-linux-melb:xfs-kern:26986a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:06:21 +10:00
Vlad Apostolov 6f1f216840 [XFS] pv 956241, author: nathans, rv: vapo - make ino validation checks
consistent in bulkstat

SGI-PV: 956241
SGI-Modid: xfs-linux-melb:xfs-kern:26984a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:06:15 +10:00
Vlad Apostolov 6216ff1883 [XFS] pv 956240, author: nathans, rv: vapo - Minor fixes in
kmem_zalloc_greedy()

SGI-PV: 956240
SGI-Modid: xfs-linux-melb:xfs-kern:26983a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:06:10 +10:00
David Chinner f273ab848b [XFS] Really fix use after free in xfs_iunpin.
The previous attempts to fix the linux inode use-after-free in xfs_iunpin
simply made the problem harder to hit. We actually need complete exclusion
between xfs_reclaim and xfs_iunpin, as well as ensuring that the i_flags
are consistent during both of these functions. Introduce a new spinlock
for exclusion and the i_flags, and fix up xfs_iunpin to use igrab before
marking the inode dirty.

SGI-PV: 952967
SGI-Modid: xfs-linux-melb:xfs-kern:26964a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:06:03 +10:00
Eric Sandeen 01106eae97 [XFS] Collapse sv_init and init_sv into just the one interface.
SGI-PV: 907752
SGI-Modid: xfs-linux-melb:xfs-kern:26925a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:05:52 +10:00
Eric Sandeen 7ae67d78e7 [XFS] standardize on one sema init macro
One sema to rule them all, one sema to find them...

SGI-PV: 907752
SGI-Modid: xfs-linux-melb:xfs-kern:26911a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:05:46 +10:00
Eric Sandeen 91d8723204 [XFS] Reduce endian flipping in alloc_btree, same as was done for
ialloc_btree.

SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26910a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:05:40 +10:00
Nathan Scott edcd4bce5e [XFS] Minor cleanup from dio locking fix, remove an extra conditional.
SGI-PV: 955696
SGI-Modid: xfs-linux-melb:xfs-kern:26908a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:05:33 +10:00
Nathan Scott 215101c360 [XFS] Fix kmem_zalloc_greedy warnings on 64 bit platforms.
SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26907a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:04:43 +10:00
Vlad Apostolov e132f54ce8 [XFS] pv 955157, rv bnaujok - break the loop on EFAULT formatter() error
SGI-PV: 955157
SGI-Modid: xfs-linux-melb:xfs-kern:26869a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:04:31 +10:00
Vlad Apostolov 22de606a0b [XFS] pv 955157, rv bnaujok - break the loop on formatter() error
SGI-PV: 955157
SGI-Modid: xfs-linux-melb:xfs-kern:26866a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:04:24 +10:00
Tim Shimmin 955e47ad28 [XFS] Fixes the leak in reservation space because we weren't ungranting
space for the unmount record - which becomes a problem in the freeze/thaw
scenario.

SGI-PV: 942533
SGI-Modid: xfs-linux-melb:xfs-kern:26815a

Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:04:16 +10:00
Josh Triplett 22d91f65d5 [XFS] Add lock annotations to xfs_trans_update_ail and
xfs_trans_delete_ail

xfs_trans_update_ail and xfs_trans_delete_ail get called with the AIL lock
held, and release it. Add lock annotations to these two functions so that
sparse can check callers for lock pairing, and so that sparse will not
complain about these functions since they intentionally use locks in this
manner.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26807a

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:04:07 +10:00
Nathan Scott 68c3271515 [XFS] Fix a porting botch on the realtime subvol growfs code path.
SGI-PV: 955515
SGI-Modid: xfs-linux-melb:xfs-kern:26806a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:03:53 +10:00
Nathan Scott d432c80e68 [XFS] Minor code rearranging and cleanup to prevent some coverity false
positives.

SGI-PV: 955502
SGI-Modid: xfs-linux-melb:xfs-kern:26805a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:03:44 +10:00
Nathan Scott b627259c60 [XFS] Remove a no-longer-correct debug assert from dio completion
handling.

SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26804a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:03:33 +10:00
Nathan Scott 77e4635ae1 [XFS] Add a greedy allocation interface, allocating within a min/max size
range.

SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26803a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:03:27 +10:00
Nathan Scott 572d95f49f [XFS] Improve error handling for the zero-fsblock extent detection code.
SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26802a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:03:20 +10:00
Nathan Scott 948ecdb4c1 [XFS] Be more defensive with page flags (error/private) for metadata
buffers.

SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26801a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:03:13 +10:00
Nathan Scott efb8ad7e94 [XFS] Add a debug flag for allocations which are known to be larger than
one page.

SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26800a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:03:05 +10:00
Eric Sandeen 3f89243c5b [XFS] Remove several macros that are no longer used anywhere
SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26749a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:57 +10:00
Eric Sandeen 065d312e15 [XFS] Remove unused iop_abort log item operation
SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26747a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:44 +10:00
Eric Sandeen 43129c16e8 [XFS] Remove a couple of unused BUF macros
SGI-PV: 955302
SGI-Modid: xfs-linux-melb:xfs-kern:26746a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:37 +10:00
Vlad Apostolov 17370097da [XFS] pass file mode on DMAPI remove events
SGI-PV: 953687
SGI-Modid: xfs-linux-melb:xfs-kern:26639a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:30 +10:00
Nathan Scott 745b1f47fc [XFS] Remove last bulkstat false-positives with debug kernels.
SGI-PV: 953819
SGI-Modid: xfs-linux-melb:xfs-kern:26628a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:23 +10:00
Nathan Scott a3c6685eaa [XFS] Ensure xlog_state_do_callback does not report spurious warnings on
ramdisks.

SGI-PV: 954802
SGI-Modid: xfs-linux-melb:xfs-kern:26627a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:14 +10:00
Nathan Scott bb3c7d2936 [XFS] Increase the size of the buffer holding the local inode cluster
list, to increase our potential readahead window and in turn improve
bulkstat performance.

SGI-PV: 944409
SGI-Modid: xfs-linux-melb:xfs-kern:26607a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:09 +10:00
Nathan Scott 2627509330 [XFS] Drop unneeded endian conversion in bulkstat and start readahead for
batches of inode cluster buffers at once, before any blocking reads are
issued.

SGI-PV: 944409
SGI-Modid: xfs-linux-melb:xfs-kern:26606a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:02:03 +10:00
Nathan Scott 51bdd70681 [XFS] When issuing metadata readahead, submit bio with READA not READ.
SGI-PV: 944409
SGI-Modid: xfs-linux-melb:xfs-kern:26603a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:01:57 +10:00
Nathan Scott 8b56f083c2 [XFS] Rework DMAPI bulkstat calls in such a way that we can directly
extract inline attributes out of the bulkstat buffer (for that case),
rather than using an (extremely expensive for large icount filesystems)
iget for fetching attrs.

SGI-PV: 944409
SGI-Modid: xfs-linux-melb:xfs-kern:26602a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:01:46 +10:00
Tim Shimmin 726801ba06 [XFS] Add EA list callbacks for xfs kernel use. Cleanup some namespace
code.

SGI-PV: 954372
SGI-Modid: xfs-linux-melb:xfs-kern:26583a

Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:01:37 +10:00
Nathan Scott 69e23b9a5e [XFS] Update XFS for i_blksize removal from generic inode structure
SGI-PV: 954366
SGI-Modid: xfs-linux-melb:xfs-kern:26565a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 11:01:22 +10:00
Nathan Scott 29b6d22b01 [XFS] remove accidentally reintroduced vfs unmount flag, unneeded in
current kernels

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26564a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:59:06 +10:00
Christoph Hellwig fe48cae9ed [XFS] remove bhv_lookup, _range version works aswell and has more useful
semantics.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26563a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:58:52 +10:00
Nathan Scott 1121b219bf [XFS] use NULL for pointer initialisation instead of zero-cast-to-ptr
SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26562a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:58:40 +10:00
Christoph Hellwig 8801bb99e4 [XFS] endianess annotations for xfs_bmbt_key Trivial as there are no
incore users.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26561a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:58:17 +10:00
Christoph Hellwig 576039cf3c [XFS] endianess annotate XFS_BMAP_BROOT_PTR_ADDR Make sure it returns a
__be64 and let the callers use the proper macros.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26560a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:58:06 +10:00
Christoph Hellwig 397b5208d5 [XFS] endianess annotations for xfs_bmbt_ptr_t/xfs_bmdr_ptr_t
SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26559a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:57:52 +10:00
Christoph Hellwig b113bcb83e [XFS] add xfs_btree_check_lptr_disk variant which handles endian
conversion

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26558a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:57:42 +10:00
Christoph Hellwig c38e5e84db [XFS] remove left over INT_ comments in *alloc*.c We can verify endianess
handling with sparse now, no need for comments.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26557a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:57:17 +10:00
Christoph Hellwig 61a2584867 [XFS] endianess annotations for xfs_inobt_rec_t / xfs_inobt_key_t
SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26556a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:57:04 +10:00
Christoph Hellwig e21010053a [XFS] endianess annotation for xfs_agfl_t. Trivial, xfs_agfl_t is always
used for ondisk values.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26553a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:56:51 +10:00
Nathan Scott ed9d88f7b7 [XFS] Fix sparse warning found when page tracing enabled, due to
overloaded gfp_t param.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26552a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:56:43 +10:00
Nathan Scott 673cdf5c72 [XFS] Fix rounding bug in xfs_free_file_space found by sparse checking.
SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26551a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:56:26 +10:00
Alexey Dobriyan 87395deb0b [XFS] move XFS_IOC_GETVERSION to main multiplexer
Avoids doing an unnecessary inode to vnode conversion and avoids a memory
allocation.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26492a

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:56:01 +10:00
Tim Shimmin 128dabc5e9 [XFS] cleanup the field types of some item format structures
SGI-PV: 954365
SGI-Modid: xfs-linux-melb:xfs-kern:26406a

Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:55:43 +10:00
Nathan Scott f07c225036 [XFS] Improve xfsbufd delayed write submission patterns, after blktrace
analysis.

Under a sequential create+allocate workload, blktrace reported backward
writes being issued by xfsbufd, and frequent inappropriate queue unplugs.
We now insert at the tail when moving from the delwri lists to the temp
lists, which maintains correct ordering, and we avoid unplugging queues
deep in the submit paths when we'd shortly do it at a higher level anyway.
blktrace now reports much healthier write patterns from xfsbufd for this
workload (and likely many others).

SGI-PV: 954310
SGI-Modid: xfs-linux-melb:xfs-kern:26396a

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:52:15 +10:00
Alexey Dobriyan f37ea14969 [XFS] pass inode to xfs_ioc_space(), simplify some code. There is trivial
"inode => vnode => inode" conversion, but only flags and mode of final
inode are looked at. Pass original inode instead.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:26395a

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2006-09-28 10:52:04 +10:00
Eric W. Biederman aafe6c2a2b [PATCH] de_thread: Use tsk not current
Ingo Oeser pointed out that because current expands to an inline function
it is more space efficient and somewhat faster to simply keep a cached copy
of current in another variable.  This patch implements that for the
de_thread function.

(akpm: saves nearly 100 bytes of text on x86)

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:20 -07:00
Adrian Bunk 66f37509fc [PATCH] fs/nfs/: make code static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:20 -07:00
Martin Bligh b7b52630de [PATCH] add newline to nfs dprintk
Add missing \n to dprintk

Signed-off-by: Martin Bligh <mbligh@google.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Eric W. Biederman c18258c6f0 [PATCH] pid: Implement transfer_pid and use it to simplify de_thread
In de_thread we move pids from one process to another, a rather ugly case.
The function transfer_pid makes it clear what we are doing, and makes the
action atomic.  This is useful we ever want to atomically traverse the
process group and session lists, in a rcu safe manner.

Even if the atomic properties this change should be a win as transfer_pid
should be less code to execute than executing both attach_pid and
detach_pid, and this should make de_thread slightly smaller as only a
single function call needs to be emitted.  The only downside is that the
code might be slower to execute as the odds are against transfer_pid being
in cache.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Eric W. Biederman b89a81712f [PATCH] sysctl: Allow /proc/sys without sys_sysctl
Since sys_sysctl is deprecated start allow it to be compiled out.  This
should catch any remaining user space code that cares, and paves the way
for further sysctl cleanups.

[akpm@osdl.org: If sys_sysctl() is not compiled-in, emit a warning]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Andrew Morton 8b0e330b77 [PATCH] alloc_fdtable() cleanup
free_fdset(NULL, ...) is legal.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Adrian Bunk 36b756f2b5 [PATCH] reiserfs: warn about the useless nolargeio option
Since the nolargeio option no longer has any effect, print a warning
instead of setting a write-only variable.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Cc: Hans Reiser <reiser@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Theodore Ts'o ba52de123d [PATCH] inode-diet: Eliminate i_blksize from the inode structure
This eliminates the i_blksize field from struct inode.  Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.

Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.

[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Theodore Ts'o 577c4eb09d [PATCH] inode-diet: Move i_cdev into a union
Move the i_cdev pointer in struct inode into a union.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Theodore Ts'o eaf796e7ef [PATCH] inode-diet: Move i_bdev into a union
Move the i_bdev pointer in struct inode into a union.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Theodore Ts'o 8e18e2941c [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private
The following patches reduce the size of the VFS inode structure by 28 bytes
on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
in the inode size on a UP kernel that is configured in a production mode
(i.e., with no spinlock or other debugging functions enabled; if you want to
save memory taken up by in-core inodes, the first thing you should do is
disable the debugging options; they are responsible for a huge amount of bloat
in the VFS inode structure).

This patch:

The filesystem or device-specific pointer in the inode is inside a union,
which is pretty pointless given that all 30+ users of this field have been
using the void pointer.  Get rid of the union and rename it to i_private, with
a comment to explain who is allowed to use the void pointer.  This is just a
cleanup, but it allows us to reuse the union 'u' for something something where
the union will actually be used.

[judith@osdl.org: powerpc build fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Judith Lebzelter <judith@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
OGAWA Hirofumi 6a1d9805ec [PATCH] fat: cleanup fat_get_block(s)
get_blocks() was removed.  So, this removes it on fat, and will take
advantage of the multi block mapping.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Ian Kent bcdc5e019d [PATCH] autofs4 needs to force fail return revalidate
For a long time now I have had a problem with not being able to return a
lookup failure on an existsing directory.  In autofs this corresponds to a
mount failure on a autofs managed mount entry that is browsable (and so the
mount point directory exists).

While this problem has been present for a long time I've avoided resolving
it because it was not very visible.  But now that autofs v5 has "mount and
expire on demand" of nested multiple mounts, such as is found when mounting
an export list from a server, solving the problem cannot be avoided any
longer.

I've tried very hard to find a way to do this entirely within the autofs4
module but have not been able to find a satisfactory way to achieve it.

So, I need to propose a change to the VFS.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
David Howells f269fdd182 [PATCH] NOMMU: move the fallback arch_vma_name() to a sensible place
Move the fallback arch_vma_name() to a sensible place (kernel/signal.c).

Currently it's in fs/proc/task_mmu.c, a file that is dependent on both
CONFIG_PROC_FS and CONFIG_MMU being enabled, but it's used from
kernel/signal.c from where it is called unconditionally.

[akpm@osdl.org: build fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:15 -07:00
David Howells dbf8685c8e [PATCH] NOMMU: Implement /proc/pid/maps for NOMMU
Implement /proc/pid/maps for NOMMU by reading the vm_area_list attached to
current->mm->context.vmlist.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:14 -07:00
David Howells 5da6185bca [PATCH] NOMMU: Set BDI capabilities for /dev/mem and /dev/kmem
Set the backing device info capabilities for /dev/mem and /dev/kmem to
permit direct sharing under no-MMU conditions and full mapping capabilities
under MMU conditions.  Make the BDI used by these available to all directly
mappable character devices.

Also comment the capabilities for /dev/zero.

[akpm@osdl.org: ifdef reductions]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:14 -07:00
Alexey Dobriyan 1a1d92c10d [PATCH] Really ignore kmem_cache_destroy return value
* Rougly half of callers already do it by not checking return value
* Code in drivers/acpi/osl.c does the following to be sure:

	(void)kmem_cache_destroy(cache);

* Those who check it printk something, however, slab_error already printed
  the name of failed cache.
* XFS BUGs on failed kmem_cache_destroy which is not the decision
  low-level filesystem driver should make. Converted to ignore.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Panagiotis Issaris f52720ca5f [PATCH] fs: Removing useless casts
* Removing useless casts
* Removing useless wrapper
* Conversion from kmalloc+memset to kzalloc

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Panagiotis Issaris f8314dc60c [PATCH] fs: Conversions from kmalloc+memset to k(z|c)alloc
Conversions from kmalloc+memset to kzalloc.

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Jffs2-bit-acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Eric Sandeen 32c2d2bc4b [PATCH] more ext3 16T overflow fixes
Some of the changes in balloc.c are just cosmetic, as Andreas pointed out -
if they overflow they'll then underflow and things are fine.

5th hunk actually fixes an overflow problem.

Also check for potential overflows in inode & block counts when resizing.

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Dave Kleikamp a4e4de36dc [PATCH] ext3: Fix sparse warnings
Fixing up some endian-ness warnings in preparation to clone ext4 from ext3.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Dave Kleikamp e9ad5620bf [PATCH] ext3: More whitespace cleanups
More white space cleanups in preparation of cloning ext4 from ext3.
Removing spaces that precede a tab.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Vasily Averin 7543fc7b3a [PATCH] ext3: wrong error behavior
SWsoft Virtuozzo/OpenVZ Linux kernel team has discovered that ext3 error
behavior was broken in linux kernels since 2.5.x versions by the following
patch:

2002/10/31 02:15:26-05:00 tytso@snap.thunk.org
Default mount options from superblock for ext2/3 filesystems
http://linux.bkbits.net:8080/linux-2.6/gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ

In case ext3 file system is mounted with errors=continue
(EXT3_ERRORS_CONTINUE) errors should be ignored when possible.  However at
present in case of any error kernel aborts journal and remounts filesystem
to read-only.  Such behavior was hit number of times and noted to differ
from that of 2.4.x kernels.

This patch fixes this:
- do nothing in case of EXT3_ERRORS_CONTINUE,
- set EXT3_MOUNT_ABORT and call journal_abort() in all other cases
- panic() should be called after ext3_commit_super() to save
 sb marked as EXT3_ERROR_FS

Signed-off-by: Vasily Averin <vvs@sw.ru>
Acked-by: Kirill Korotaev <dev@sw.ru>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Mingming Cao 36faadc144 [PATCH] ext3: more comments about block allocation/reservation code
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Mingming Cao 321fb9e818 [PATCH] ext3: turn on reservation dump on block allocation errors
In the past there were a few kernel panics related to block reservation
tree operations failure (insert/remove etc).  It would be very useful to
get the block allocation reservation map info when such error happens.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen 37ed322290 [PATCH] JBD: 16T fixes
These are a few places I've found in jbd that look like they may not be
16T-safe, or consistent with the use of unsigned longs for block
containers.  Problems here would be somewhat hard to hit, would require
journal blocks past the 8T boundary, which would not be terribly common.
Still, should fix.

(some of these have come from the ext4 work on jbd as well).

I think there's one more possibility that the wrap() function may not be
safe IF your last block in the journal butts right up against the 232 block
boundary, but that seems like a VERY remote possibility, and I'm not
worrying about it at this point.

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen eee194e76c [PATCH] ext3: inode numbers are unsigned long
This is primarily format string fixes, with changes to ialloc.c where large
inode counts could overflow, and also pass around journal_inum as an
unsigned long, just to be pedantic about it....

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen 41f04d852e [PATCH] ext2: fix mounts at 16T
Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen 855565e81a [PATCH] fix ext3 mounts at 16T
I need to do some actual IO testing now, but this gets things mounting for
a 16T ext3 filesystem.  (patched up e2fsprogs is needed too, I'll send that
off the kernel list)

This patch fixes these issues in the kernel:

o sbi->s_groups_count overflows in ext3_fill_super()

	sbi->s_groups_count = (le32_to_cpu(es->s_blocks_count) -
			       le32_to_cpu(es->s_first_data_block) +
			       EXT3_BLOCKS_PER_GROUP(sb) - 1) /
			      EXT3_BLOCKS_PER_GROUP(sb);

  at 16T, s_blocks_count is already maxed out; adding
  EXT3_BLOCKS_PER_GROUP(sb) overflows it and groups_count comes out to 0.
  Not really what we want, and causes a failed mount.

  Feel free to check my math (actually, please do!), but changing it this
  way should work & avoid the overflow:

  (A + B - 1)/B changed to: ((A - 1)/B) + 1

o ext3_check_descriptors() overflows range checks

  ext3_check_descriptors() iterates over all block groups making sure
  that various bits are within the right block ranges...  on the last pass
  through, it is checking the error case

   [item] >= block + EXT3_BLOCKS_PER_GROUP(sb)

  where "block" is the first block in the last block group.  The last
  block in this group (and the last one that will fit in 32 bits) is block
  + EXT3_BLOCKS_PER_GROUP(sb)- 1.  block + EXT3_BLOCKS_PER_GROUP(sb) wraps
  back around to 0.

  so, make things clearer with "first_block" and "last_block" where those
  are first and last, inclusive, and use <, > rather than <, >=.

  Finally, the last block group may be smaller than the rest, so account
  for this on the last pass through: last_block = sb->s_blocks_count - 1;

(a similar patch could be done for ext2; does anyone in their right mind
use ext2 at 16T?  I'll send an ext2 patch doing the same thing if that's
warranted)

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Alexey Dobriyan 2aed348469 [PATCH] jbd: use BUILD_BUG_ON in journal init
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Mingming Cao ae6ddcc5f2 [PATCH] ext3 and jbd cleanup: remove whitespace
Remove whitespace from ext3 and jbd, before we clone ext4.

Signed-off-by: Mingming Cao<cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Josh Triplett e7ab8d6505 [PATCH] jbd: add lock annotation to jbd_sync_bh
jbd_sync_bh releases journal->j_list_lock.  Add a lock annotation to this
function so that sparse can check callers for lock pairing, and so that
sparse will not complain about this function since it intentionally uses
the lock in this manner.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:08 -07:00
Linus Torvalds b278240839 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
  [PATCH] Don't set calgary iommu as default y
  [PATCH] i386/x86-64: New Intel feature flags
  [PATCH] x86: Add a cumulative thermal throttle event counter.
  [PATCH] i386: Make the jiffies compares use the 64bit safe macros.
  [PATCH] x86: Refactor thermal throttle processing
  [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
  [PATCH] Fix unwinder warning in traps.c
  [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
  [PATCH] x86: Move direct PCI scanning functions out of line
  [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
  [PATCH] Don't leak NT bit into next task
  [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
  [PATCH] Fix some broken white space in ia32_signal.c
  [PATCH] Initialize argument registers for 32bit signal handlers.
  [PATCH] Remove all traces of signal number conversion
  [PATCH] Don't synchronize time reading on single core AMD systems
  [PATCH] Remove outdated comment in x86-64 mmconfig code
  [PATCH] Use string instructions for Core2 copy/clear
  [PATCH] x86: - restore i8259A eoi status on resume
  [PATCH] i386: Split multi-line printk in oops output.
  ...
2006-09-26 13:07:55 -07:00
Linus Torvalds dd77a4ee0f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (47 commits)
  Driver core: Don't call put methods while holding a spinlock
  Driver core: Remove unneeded routines from driver core
  Driver core: Fix potential deadlock in driver core
  PCI: enable driver multi-threaded probe
  Driver Core: add ability for drivers to do a threaded probe
  sysfs: add proper sysfs_init() prototype
  drivers/base: check errors
  drivers/base: Platform notify needs to occur before drivers attach to the device
  v4l-dev2: handle __must_check
  add CONFIG_ENABLE_MUST_CHECK
  add __must_check to device management code
  Driver core: fixed add_bind_files() definition
  Driver core: fix comments in drivers/base/power/resume.c
  sysfs_remove_bin_file: no return value, dump_stack on error
  kobject: must_check fixes
  Driver core: add ability for devices to create and remove bin files
  Class: add support for class interfaces for devices
  Driver core: create devices/virtual/ tree
  Driver core: add device_rename function
  Driver core: add ability for classes to handle devices properly
  ...
2006-09-26 11:49:46 -07:00
Andrew Morton 8d6b5eeea5 [PATCH] binfmt_elf: consistently use loff_t
As David Howells <dhowells@redhat.com> points out, binfmt_elf sometimes uses
off_t, sometimes uses loff_t.  Use loff_t throughout.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:53 -07:00
Christoph Lameter 972d1a7b14 [PATCH] ZVC: Support NR_SLAB_RECLAIMABLE / NR_SLAB_UNRECLAIMABLE
Remove the atomic counter for slab_reclaim_pages and replace the counter
and NR_SLAB with two ZVC counter that account for unreclaimable and
reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE.

Change the check in vmscan.c to refer to to NR_SLAB_RECLAIMABLE.  The
intend seems to be to check for slab pages that could be freed.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:51 -07:00
Christoph Lameter 182e8e2373 [PATCH] reduce MAX_NR_ZONES: make display of highmem counters conditional on CONFIG_HIGHMEM
Do not display HIGHMEM memory sizes if CONFIG_HIGHMEM is not set.

Make HIGHMEM dependent texts and make display of highmem counters optional

Some texts are depending on CONFIG_HIGHMEM.

Remove those strings and remove the display of highmem counter values if
CONFIG_HIGHMEM is not set.

[akpm@osdl.org: remove some ifdefs]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:46 -07:00
Peter Zijlstra d08b3851da [PATCH] mm: tracking shared dirty pages
Tracking of dirty pages in shared writeable mmap()s.

The idea is simple: write protect clean shared writeable pages, catch the
write-fault, make writeable and set dirty.  On page write-back clean all the
PTE dirty bits and write protect them once again.

The implementation is a tad harder, mainly because the default
backing_dev_info capabilities were too loosely maintained.  Hence it is not
enough to test the backing_dev_info for cap_account_dirty.

The current heuristic is as follows, a VMA is eligible when:
 - its shared writeable
    (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
 - it is not a 'special' mapping
    (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
 - the backing_dev_info is cap_account_dirty
    mapping_cap_account_dirty(vma->vm_file->f_mapping)
 - f_op->mmap() didn't change the default page protection

Page from remap_pfn_range() are explicitly excluded because their COW
semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and
because they don't have a backing store anyway.

mprotect() is taught about the new behaviour as well.  However it overrides
the last condition.

Cleaning the pages on write-back is done with page_mkclean() a new rmap call.
It can be called on any page, but is currently only implemented for mapped
pages, if the page is found the be of a VMA that accounts dirty pages it will
also wrprotect the PTE.

Finally, in fs/buffers.c:try_to_free_buffers(); remove clear_page_dirty() from
under ->private_lock.  This seems to be safe, since ->private_lock is used to
serialize access to the buffers, not the page itself.  This is needed because
clear_page_dirty() will call into page_mkclean() and would thereby violate
locking order.

[dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:44 -07:00
Jan Kara 3998b9301d [PATCH] jbd: fix commit of ordered data buffers
Original commit code assumes, that when a buffer on BJ_SyncData list is
locked, it is being written to disk.  But this is not true and hence it can
lead to a potential data loss on crash.  Also the code didn't count with
the fact that journal_dirty_data() can steal buffers from committing
transaction and hence could write buffers that no longer belong to the
committing transaction.  Finally it could possibly happen that we tried
writing out one buffer several times.

The patch below tries to solve these problems by a complete rewrite of the
data commit code.  We go through buffers on t_sync_datalist, lock buffers
needing write out and store them in an array.  Buffers are also immediately
refiled to BJ_Locked list or unfiled (if the write out is completed).  When
the array is full or we have to block on buffer lock, we submit all
accumulated buffers for IO.

[suitable for 2.6.18.x around the 2.6.19-rc2 timeframe]

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:44 -07:00
Andi Kleen 758333458a [PATCH] Check return value of copy_to_user in compat_sys_pselect7
Fix

linux/fs/compat.c: In function compat_sys_pselect7
linux/fs/compat.c:1869: warning: ignoring return value of copy_to_user, declared with attribute warn_unused_result

To make it easier to handle I changed to semantics to not try to
write out a timespec if an error occurred. I hope that's ok.

Cc: dwmw2@infradead.org

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Andi Kleen c16b63e09d [PATCH] i386/x86-64: Don't randomize stack top when no randomization personality is set
Based on patch from Frank van Maarseveen <frankvm@frankvm.com>, but
extended.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andrew Morton f20a9ead0d sysfs: add proper sysfs_init() prototype
Don't be crufty.  Mark it __must_check too.

Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:39 -07:00
Randy.Dunlap 995982ca79 sysfs_remove_bin_file: no return value, dump_stack on error
Make sysfs_remove_bin_file() void.  If it detects an error,
printk the file name and call dump_stack().

sysfs_hash_and_remove() now returns an error code indicating
its success or failure so that sysfs_remove_bin_file() can
know success/failure.

Convert the only driver that checked the return value of
sysfs_remove_bin_file().

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:39 -07:00
Greg Kroah-Hartman ceeee1fb28 SYSFS: allow sysfs_create_link to create symlinks in the root of sysfs
This is needed to make the compatible link for /sys/block in the future.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Randy Dunlap 6468b3afa7 Debugfs: kernel-doc fixes for debugfs
Fix kernel-doc and typos/spellos in fs/debugfs/.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Juha Yrjölä eea3f8911f sysfs: Make poll behaviour consistent
When no events have been reported by sysfs_notify(), sd->s_events
was previously set to zero.  The initial value for new readers is
also zero, so poll was blocking, regardless of whether the attribute
was read by the process or not.

Make poll behave consistently by setting the initial value of
sd->s_events to non-zero.

Signed-off-by: Juha Yrjola <juha.yrjola@solidboot.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Ian Kent c0ba7e5147 [PATCH] autofs4: zero timeout prevents shutdown
If the timeout of an autofs mount is set to zero then umounts are disabled.
 This works fine, however the kernel module checks the expire timeout and
goes no further if it is zero.  This is not the right thing to do at
shutdown as the module is passed an option to expire mounts regardless of
their timeout setting.

This patch allows autofs to honor the force expire option.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-25 17:38:35 -07:00
Mark Fasheh 0d5dc6c2dd ocfs2: Teach ocfs2_drop_lock() to use ->set_lvb() callback
With this, we don't need to pass an additional struct with function pointer.

Now that the callbacks are fully used, comment the remaining API.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:48 -07:00
Mark Fasheh b5e500e23e ocfs2: Remove ->unblock lockres operation
Have ocfs2_process_blocked_lock() call ocfs2_generic_unblock_lock(), which
gets to be ocfs2_unblock_lock() now that it's the only possible unblock
function.

Remove the ->unblock() callback from the structure, and all lock type
specific unblock functions.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:48 -07:00