Commit Graph

4497 Commits

Author SHA1 Message Date
Jun Chen f46ba2235f [PATCH] fs: make nls_cp936.c handle some U00XY characters and U20AC correctly
Twenty characters in cp936 are not correctly handled.  They're all in the
U00 plane.  nls_cp936 converts all U00XY to XY but this is not correct for
some characters.(e.g.  U00B7 -> A1A4, U00A8 -> A1A7).

This problem is fixed by generating u2c_00 based on all c2u_xx and changing
uni2char() to give U00 plane a special handling.  The "€"(U20AC,80 in
cp936) is also be handled properly.

Acked-by: Gang Chen <cgdlut@gmail.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:46 -08:00
Helge Deller 15ad7cdcfd [PATCH] struct seq_operations and struct file_operations constification
- move some file_operations structs into the .rodata section

 - move static strings from policy_types[] array into the .rodata section

 - fix generic seq_operations usages, so that those structs may be defined
   as "const" as well

[akpm@osdl.org: couple of fixes]
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:46 -08:00
Yan Burman 4a08a9f681 [PATCH] jffs: replace kmalloc+memset with kzalloc
Replace kmalloc+memset with kzalloc

Signed-off-by: Yan Burman <burman.yan@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:45 -08:00
Yan Burman b593e48d2b [PATCH] affs: replace kmalloc+memset with kzalloc
Replace kmalloc+memset with kzalloc

Signed-off-by: Yan Burman <burman.yan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:45 -08:00
Adrian Bunk 3ee6f61ca0 [PATCH] remove NFSD_OPTIMIZE_SPACE
This patch removes the unused NFSD_OPTIMIZE_SPACE.

Additionally, it does differently what NFSD_OPTIMIZE_SPACE was supposed to do:

Nowadays, gcc knows best when to inline code, and CONFIG_CC_OPTIMIZE_FOR_SIZE
even tells gcc globally whether to optimize for size or for speed.  Therefore,
this patch also removes all inline's from these files.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:45 -08:00
David Rientjes 8de61e69c2 [PATCH] fs: remove unused variable
Removed unused 'have_pt_gnu_stack' variable.

Reported by David Binderman <dcb314@hotmail.com>

Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:44 -08:00
Eric Sandeen a8f48a9561 [PATCH] ext3/4: don't do orphan processing on readonly devices
If you do something like:

  # touch foo
  # tail -f foo &
  # rm foo
  # <take snapshot>
  # <mount snapshot>

you'll panic, because ext3/4 tries to do orphan list processing on the
readonly snapshot device, and:

  kernel: journal commit I/O error
  kernel: Assertion failure in journal_flush_Rsmp_e2f189ce() at journal.c:1356: "!journal->j_checkpoint_transactions"
  kernel: Kernel panic: Fatal exception

for a truly readonly underlying device, it's reasonable and necessary
to just skip orphan list processing.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:44 -08:00
Mariusz Kozlowski 71a3d1b4f7 [PATCH] fs: ufs add missing bracket
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:44 -08:00
Adrian Bunk 0da1480ec3 [PATCH] proper prototype for remove_inode_dquot_ref()
Add a proper prototype for remove_inode_dquot_ref() in
include/linux/quotaops.h

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:44 -08:00
Adrian Bunk 3982cd99c3 [PATCH] fs/sysv/: doc cleanup
Remove two different changelog files from fs/sysv/ and merges the INTRO
file into Documentation/filesystems/sysv-fs.txt

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:44 -08:00
Jiri Kosina c949d4eb40 [PATCH] autofs: fix error code path in autofs_fill_sb()
When kernel is compiled with old version of autofs (CONFIG_AUTOFS_FS), and
new (observed at least with 5.x.x) automount deamon is started, kernel
correctly reports incompatible version of kernel and userland daemon, but
then screws things up instead of correct handling of the error:

 autofs: kernel does not match daemon version
 =====================================
 [ BUG: bad unlock balance detected! ]
 -------------------------------------
 automount/4199 is trying to release lock (&type->s_umount_key) at:
 [<c0163b9e>] get_sb_nodev+0x76/0xa4
 but there are no more locks to release!

 other info that might help us debug this:
 no locks held by automount/4199.

 stack backtrace:
  [<c0103b15>] dump_trace+0x68/0x1b2
  [<c0103c77>] show_trace_log_lvl+0x18/0x2c
  [<c01041db>] show_trace+0xf/0x11
  [<c010424d>] dump_stack+0x12/0x14
  [<c012e02c>] print_unlock_inbalance_bug+0xe7/0xf3
  [<c012fd4f>] lock_release+0x8d/0x164
  [<c012b452>] up_write+0x14/0x27
  [<c0163b9e>] get_sb_nodev+0x76/0xa4
  [<c0163689>] vfs_kern_mount+0x83/0xf6
  [<c016373e>] do_kern_mount+0x2d/0x3e
  [<c017513f>] do_mount+0x607/0x67a
  [<c0175224>] sys_mount+0x72/0xa4
  [<c0102b96>] sysenter_past_esp+0x5f/0x99
 DWARF2 unwinder stuck at sysenter_past_esp+0x5f/0x99
 Leftover inexact backtrace:
  =======================

and then deadlock comes.

The problem: autofs_fill_super() returns EINVAL to get_sb_nodev(), but
before that, it calls kill_anon_super() to destroy the superblock which
won't be needed.  This is however way too soon to call kill_anon_super(),
because get_sb_nodev() has to perform its own cleanup of the superblock
first (deactivate_super(), etc.).  The correct time to call
kill_anon_super() is in the autofs_kill_sb() callback, which is called by
deactivate_super() at proper time, when the superblock is ready to be
killed.

I can see the same faulty codepath also in autofs4.  This patch solves
issues in both filesystems in a same way - it postpones the
kill_anon_super() until the proper time is signalized by deactivate_super()
calling the kill_sb() callback.

[raven@themaw.net: update comment]
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Ian Kent <raven@themaw.net>
Cc: <stable@kernel.org>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:43 -08:00
Hugh Dickins ec0837f230 [PATCH] ext4 balloc: fix _with_rsv freeze
Port fix to the off-by-one in find_next_usable_block's memscan from ext2 to
ext4; but it didn't cause a serious problem for ext4 because the additional
ext4_test_allocatable check rescued it from the error.

[akpm@osdl.org: build fix]
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:43 -08:00
Hugh Dickins 341cee4385 [PATCH] ext4 balloc: use io_error label
ext4_new_blocks has a nice io_error label for setting -EIO, so goto that in
the one place that doesn't already use it.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:43 -08:00
Hugh Dickins b78a657f0a [PATCH] ext4 balloc: say rb_entry not list_entry
The reservations tree is an rb_tree not a list, so it's less confusing to use
rb_entry() than list_entry() - though they're both just container_of().

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:42 -08:00
Hugh Dickins b2f2c76d17 [PATCH] ext4 balloc: fix off-by-one against rsv_end
rsv_end is the last block within the reservation, so alloc_new_reservation
should accept start_block == rsv_end as success.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:42 -08:00
Hugh Dickins e7dc95db26 [PATCH] ext4 balloc: fix off-by-one against grp_goal
grp_goal 0 is a genuine goal (unlike -1), so ext4_try_to_allocate_with_rsv
should treat it as such.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:42 -08:00
Hugh Dickins cd16c8f72a [PATCH] ext4 balloc: reset windowsz when full
ext4_new_blocks should reset the reservation window size to 0 when squeezing
the last blocks out of an almost full filesystem, so the retry doesn't skip
any groups with less than half that free, reporting ENOSPC too soon.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:42 -08:00
Hisashi Hifumi 126039256c [PATCH] jbd2: wait for already submitted t_sync_datalist buffer to complete
In the current jbd code, if a buffer on BJ_SyncData list is dirty and not
locked, the buffer is refiled to BJ_Locked list, submitted to the IO and
waited for IO completion.

But the fsstress test showed the case that when a buffer was already
submitted to the IO just before the buffer_dirty(bh) check, the buffer was
not waited for IO completion.

Following patch solves this problem.  If it is assumed that a buffer is
submitted to the IO before the buffer_dirty(bh) check and still being
written to disk, this buffer is refiled to BJ_Locked list.

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Cc: Jan Kara <jack@ucw.cz>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:42 -08:00
Vladimir V. Saveliev c55747682e [PATCH] reiserfs: do not add save links for O_DIRECT writes
We add a save link for O_DIRECT writes to protect the i_size against the
crashes before we actually finish the I/O.  If we hit an -ENOSPC in
aops->prepare_write(), we would do a truncate() to release the blocks which
might have got initialized.  Now the truncate would add another save link
for the same inode causing a reiserfs panic for having multiple save links
for the same inode.

Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Signed-off-by: Amit Arora <amitarora@in.ibm.com>
Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:42 -08:00
Yan Burman 01afb2134e [PATCH] reiser: replace kmalloc+memset with kzalloc
Replace kmalloc+memset with kzalloc

Signed-off-by: Yan Burman <burman.yan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:41 -08:00
Eric Dumazet b3423415fb [PATCH] dcache: avoid RCU for never-hashed dentries
Some dentries don't need to be globally visible in dentry hashtable.
(pipes & sockets)

Such dentries dont need to wait for a RCU grace period at delete time.
Being able to free them permits a better CPU cache use (hot cache)

This patch combined with (dont insert pipe dentries into dentry_hashtable)
reduced time of { pipe(p); close(p[0]); close(p[1]);} on my UP machine (1.6
GHz Pentium-M) from 3.23 us to 2.86 us (But this patch does not depend on
other patches, only bench results)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:41 -08:00
Eric Dumazet d18de5a272 [PATCH] don't insert pipe dentries into dentry_hashtable.
We currently insert pipe dentries into the global dentry hashtable.  This
is suboptimal because there is currently no way these entries can be used
for a lookup().  (/proc/xxx/fd/xxx uses a different mechanism).  Inserting
them in dentry hashtable slows dcache lookups.

To let __dpath() still work correctly (ie not adding a " (deleted)") after
dentry name, we do :

 - Right after d_alloc(), pretend they are hashed by clearing the
   DCACHE_UNHASHED bit.

 - Call d_instantiate() instead of d_add() : dentry is not inserted in
   hash table.

__dpath() & friends work as intended during dentry lifetime.

 - At dismantle time, once dput() must clear the dentry, setting again
   DCACHE_UNHASHED bit inside the custom d_delete() function provided by
   pipe code, so that dput() can just kill_it.

This patch, combined with (avoid RCU for never hashed dentries) reduced
time of { pipe(p); close(p[0]); close(p[1]);} on my UP machine (1.6GHz
Pentium-M) from 3.23 us to 2.86 us (But this patch does not depend on other
patches, only bench results)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:41 -08:00
Adrian Bunk 9711ef9945 [PATCH] make fs/proc/base.c:proc_pid_instantiate() static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:40 -08:00
Adrian Bunk c585646dd1 [PATCH] fs/lockd/host.c: make 2 functions static
Make the following needlessly global functions static:

 - nlm_lookup_host()
 - nsm_find()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:40 -08:00
Adrian Bunk 7ddae86095 [PATCH] make fs/jbd2/transaction.c:__kbd2_journal_temp_unlink_buffer() static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:40 -08:00
Adrian Bunk d394e122bc [PATCH] make fs/jbd/transaction.c:__journal_temp_unlink_buffer() static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:40 -08:00
Adrian Bunk 8487f2e406 [PATCH] make ecryptfs_version_str_map[] static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:39 -08:00
Mingming Cao 1df1e63b9e [PATCH] ext4: fix reservation extension
Hugh Dickins wrote:
> Not found anything relevant, but I keep noticing these lines
> in ext2_try_to_allocate_with_rsv(), ext3 and ext4 similar:
>
> 		} else if (grp_goal > 0 &&
> 				(my_rsv->rsv_end - grp_goal + 1) < *count)
> 			try_to_extend_reservation(my_rsv, sb,
> 					*count-my_rsv->rsv_end + grp_goal - 1);
>
> They're wrong, a no-op in most groups, aren't they?  rsv_end is an
> absolute block number, whereas grp_goal is group-relative, so the
> calculation ought to bring in group_first_block?  Or I'm confused.
>

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:39 -08:00
Mingming Cao 2bd94bd79e [PATCH] ext3: fix reservation extension
Hugh Dickins wrote:
> Not found anything relevant, but I keep noticing these lines
> in ext2_try_to_allocate_with_rsv(), ext3 and ext4 similar:
>
> 		} else if (grp_goal > 0 &&
> 				(my_rsv->rsv_end - grp_goal + 1) < *count)
> 			try_to_extend_reservation(my_rsv, sb,
> 					*count-my_rsv->rsv_end + grp_goal - 1);
>
> They're wrong, a no-op in most groups, aren't they?  rsv_end is an
> absolute block number, whereas grp_goal is group-relative, so the
> calculation ought to bring in group_first_block?  Or I'm confused.
>

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:39 -08:00
Ingo Molnar 0231606785 [PATCH] hotplug CPU: clean up hotcpu_notifier() use
There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn,
prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus
generating compiler warnings of unused symbols, hence forcing people to add
#ifdefs.

the compiler can skip truly unused functions just fine:

    text    data     bss     dec     hex filename
 1624412  728710 3674856 6027978  5bfaca vmlinux.before
 1624412  728710 3674856 6027978  5bfaca vmlinux.after

[akpm@osdl.org: topology.c fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:39 -08:00
Alexey Dobriyan de21c57b90 [PATCH] reiserfs: add missing D-cache flushing
Looks like, reiserfs_prepare_file_region_for_write() doesn't contain
several flush_dcache_page() calls.

Found with help from Dmitriy Monakhov <dmonakhov@openvz.org>

[akpm@osdl.org: small speedup]
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Magnus Damm 360276042d [PATCH] elf: fix kcore note size calculation
- Define "CORE" string as CORE_STR in single common place.
 - Include terminating zero in CORE_STR length calculation for elf_buflen.
 - Use roundup(,4) to include alignment in elf_buflen calculation.

[akpm@osdl.org: simplification suggested by Roland]
Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Cc: Daniel Jacobowitz <drow@false.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Magnus Damm 386d9a7edd [PATCH] elf: Always define elf_addr_t in linux/elf.h
Define elf_addr_t in linux/elf.h.  The size of the type is determined using
ELF_CLASS.  This allows us to remove the defines that today are spread all
over .c and .h files.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Cc: Daniel Jacobowitz <drow@false.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Mike Galbraith c36264dfb2 [PATCH] remove the syslog interface when printk is disabled
Attempts to read() from the non-existent dmesg buffer will return zero and
userspace tends to get stuck in a busyloop.

So just remove /dev/kmsg altogether if CONFIG_PRINTK=n.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:38 -08:00
Andrey Savochkin b46be05004 [PATCH] retries in ext4_prepare_write() violate ordering requirements
In journal=ordered or journal=data mode retry in ext4_prepare_write()
breaks the requirements of journaling of data with respect to metadata.
The fix is to call commit_write to commit allocated zero blocks before
retry.

Signed-off-by: Kirill Korotaev <dev@openvz.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ken Chen <kenneth.w.chen@intel.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:37 -08:00
Andrey Savochkin e92a4d595b [PATCH] retries in ext3_prepare_write() violate ordering requirements
In journal=ordered or journal=data mode retry in ext3_prepare_write()
breaks the requirements of journaling of data with respect to metadata.
The fix is to call commit_write to commit allocated zero blocks before
retry.

Signed-off-by: Kirill Korotaev <dev@openvz.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ken Chen <kenneth.w.chen@intel.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:37 -08:00
Andrew Morton 93f210dd9e [PATCH] protect ext2 ioctl modifying append_only immutable etc with i_mutex
Port commit a090d9132c into ext2:

All modifications of ->i_flags in inodes that might be visible to somebody
else must be under ->i_mutex.  That patch fixes ext2 ioctl() setting S_APPEND.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:37 -08:00
Adrian Bunk 6fb50ea79c [PATCH] ext4_ext_split(): remove dead code
The Coverity checker noted that this was dead code, since in all places
above in this function, "err" is immediately checked.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:36 -08:00
Phillip Lougher 8bb0269160 [PATCH] corrupted cramfs filesystems cause kernel oops
Steve Grubb's fzfuzzer tool (http://people.redhat.com/sgrubb/files/
fsfuzzer-0.6.tar.gz) generates corrupt Cramfs filesystems which cause
Cramfs to kernel oops in cramfs_uncompress_block().  The cause of the oops
is an unchecked corrupted block length field read by cramfs_readpage().

This patch adds a sanity check to cramfs_readpage() which checks that the
block length field is sensible.  The (PAGE_CACHE_SIZE << 1) size check is
intentional, even though the uncompressed data is not going to be larger
than PAGE_CACHE_SIZE, gzip sometimes generates compressed data larger than
the original source data.  Mkcramfs checks that the compressed size is
always less than or equal to PAGE_CACHE_SIZE << 1.  Of course Cramfs could
use the original uncompressed data in this case, but it doesn't.

Signed-off-by: Phillip Lougher <phillip@lougher.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:36 -08:00
Andrew Morton 8984d137df [PATCH] ext4: uninline large functions
Saves nearly 4kbytes on x86.

Cc: Arnaldo Carvalho de Melo <acme@mandriva.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:35 -08:00
Andrew Morton 3a229b39eb [PATCH] ext3: uninline large functions
Saves nearly 4kbytes on x86.

Cc: Arnaldo Carvalho de Melo <acme@mandriva.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:35 -08:00
Andrew Morton 0723305844 [PATCH] vfs_getattr(): remove dead code
As Mikulas points out, (1 << anything) won't be evaluating to zero.  This code
is long-dead.

Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:35 -08:00
Vasily Averin dc168427e6 [PATCH] VFS: extra check inside dentry_unhash()
d_count check after dget() is always true.

Signed-off-by:	Vasily Averin <vvs@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:35 -08:00
Randy Dunlap 18debbbcce [PATCH] hpfs: fix printk format warnings
Fix hpfs printk warnings:

  fs/hpfs/dir.c:87: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long unsigned int'
  fs/hpfs/dir.c:147: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long int'
  fs/hpfs/dir.c:148: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long int'
  fs/hpfs/dnode.c:537: warning: format '%08x' expects type 'unsigned int', but argument 5 has type 'long unsigned int'
  fs/hpfs/dnode.c:854: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'loff_t'
  fs/hpfs/ea.c:247: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long unsigned int'
  fs/hpfs/inode.c:254: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long unsigned int'
  fs/hpfs/map.c:129: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'ino_t'
  fs/hpfs/map.c:135: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'ino_t'
  fs/hpfs/map.c:140: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'ino_t'
  fs/hpfs/map.c:147: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'ino_t'
  fs/hpfs/map.c:154: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'ino_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:35 -08:00
Alexey Dobriyan 352d94d040 [PATCH] hpfs: bring hpfs_error() into shape
- switch to error message buffer in .bss
 - missing va_end() (htf it worked before?)
 - use vsnprintf()
 - rename variables to understandable "fmt", "args".
 - "const char *fmt", yes.
 - add __attribute__((format ...

Still, put that coffee down before reading more.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:35 -08:00
Alexey Dobriyan 4a6e617a4b [PATCH] fs/*: trivial vsnprintf() conversion
It would very lame to get buffer overflow via one of the following.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:35 -08:00
Heiko Carstens 7116e994b4 [PATCH] compat: fix uaccess handling
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:33 -08:00
Heiko Carstens 841d5fb7c7 [PATCH] binfmt: fix uaccess handling
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:33 -08:00
Mika Kukkonen 736c4b8572 [PATCH] Function v9fs_get_idpool returns int, not u32 as called twice in fs/9p/vfs_inode.c
Function v9fs_get_idpool returns int, not u32.  Actually it returns -1 on
errors, and these two callers check if the value is smaller than 0, which
was caught by gcc with extra warning flags.  Compile tested only but should
be OK, as the value computed in v9fs_get_idpool() is also int.

Signed-of-by: Mika Kukkonen <mikukkon@iki.fi>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@lanl.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:33 -08:00
Eric Sandeen e6c4021190 [PATCH] handle ext4 directory corruption better
I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz

Basically it makes a filesystem, splats some random bits over it, then
tries to mount it and do some simple filesystem actions.

At best, the filesystem catches the corruption gracefully.  At worst,
things spin out of control.

As you might guess, we found a couple places in ext4 where things spin out
of control :)

First, we had a corrupted directory that was never checked for
consistency...  it was corrupt, and pointed to another bad "entry" of
length 0.  The for() loop looped forever, since the length of
ext4_next_entry(de) was 0, and we kept looking at the same pointer over and
over and over and over...  I modeled this check and subsequent action on
what is done for other directory types in ext4_readdir...

(adding this check adds some computational expense; I am testing a followup
patch to reduce the number of times we check and re-check these directory
entries, in all cases.  Thanks for the idea, Andreas).

Next we had a root directory inode which had a corrupted size, claimed to
be > 200M on a 4M filesystem.  There was only really 1 block in the
directory, but because the size was so large, readdir kept coming back for
more, spewing thousands of printk's along the way.

Per Andreas' suggestion, if we're in this read error condition and we're
trying to read an offset which is greater than i_blocks worth of bytes,
stop trying, and break out of the loop.

With these two changes fsfuzz test survives quite well on ext4.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:33 -08:00
Eric Sandeen 40b851348f [PATCH] handle ext3 directory corruption better
I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz

Basically it makes a filesystem, splats some random bits over it, then
tries to mount it and do some simple filesystem actions.

At best, the filesystem catches the corruption gracefully.  At worst,
things spin out of control.

As you might guess, we found a couple places in ext3 where things spin out
of control :)

First, we had a corrupted directory that was never checked for
consistency...  it was corrupt, and pointed to another bad "entry" of
length 0.  The for() loop looped forever, since the length of
ext3_next_entry(de) was 0, and we kept looking at the same pointer over and
over and over and over...  I modeled this check and subsequent action on
what is done for other directory types in ext3_readdir...

(adding this check adds some computational expense; I am testing a followup
patch to reduce the number of times we check and re-check these directory
entries, in all cases.  Thanks for the idea, Andreas).

Next we had a root directory inode which had a corrupted size, claimed to
be > 200M on a 4M filesystem.  There was only really 1 block in the
directory, but because the size was so large, readdir kept coming back for
more, spewing thousands of printk's along the way.

Per Andreas' suggestion, if we're in this read error condition and we're
trying to read an offset which is greater than i_blocks worth of bytes,
stop trying, and break out of the loop.

With these two changes fsfuzz test survives quite well on ext3.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:33 -08:00
Marcus Meissner 59287c0913 [PATCH] binfmt_elf: randomize PIE binaries (2nd try)
Randomizes -pie compiled binaries from 64k (0x10000) up to ELF_ET_DYN_BASE.

0 -> 64k is excluded to allow NULL ptr accesses to fail.

Signed-off-by: Marcus Meissner <meissner@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:33 -08:00
Andreas Gruenbacher ed2908f313 [PATCH] Remove superfluous lock_super() in extN xattr code
lock_super() is unnecessary for setting super-block feature flags.  Use the
provided *_SET_COMPAT_FEATURE() macros as well.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:32 -08:00
Suzuki K P 87b4126f10 [PATCH] fix reiserfs bad path release panic
One of our test team hit a reiserfs_panic while running fsstress tests on
2.6.19-rc1.  The message looks like :

  REISERFS: panic(device Null superblock):
  reiserfs[5676]: assertion !(p->path_length != 1 ) failed at
  fs/reiserfs/stree.c:397:reiserfs_check_path: path not properly relsed.

The backtrace looked :

  kernel BUG in reiserfs_panic at fs/reiserfs/prints.c:361!
	.reiserfs_check_path+0x58/0x74
	.reiserfs_get_block+0x1444/0x1508
	.__block_prepare_write+0x1c8/0x558
	.block_prepare_write+0x34/0x64
	.reiserfs_prepare_write+0x118/0x1d0
	.generic_file_buffered_write+0x314/0x82c
	.__generic_file_aio_write_nolock+0x350/0x3e0
	.__generic_file_write_nolock+0x78/0xb0
	.generic_file_write+0x60/0xf0
	.reiserfs_file_write+0x198/0x2038
	.vfs_write+0xd0/0x1b4
	.sys_write+0x4c/0x8c
	syscall_exit+0x0/0x4

Upon debugging I found that the restart_transaction was not releasing
the path if the th->refcount was > 1.

/*static*/
int restart_transaction(struct reiserfs_transaction_handle *th,
                           			struct inode *inode, struct path *path)
{
	[...]

         /* we cannot restart while nested */
         if (th->t_refcount > 1) { <<- Path is not released in this case!
                 return 0;
         }

         pathrelse(path); <<- Path released here.
	[...]

This could happen in such a situation :

In reiserfs/inode.c: reiserfs_get_block() ::

      if (repeat == NO_DISK_SPACE || repeat == QUOTA_EXCEEDED) {
          /* restart the transaction to give the journal a chance to free
           ** some blocks.  releases the path, so we have to go back to
           ** research if we succeed on the second try
           */
          SB_JOURNAL(inode->i_sb)->j_next_async_flush = 1;

        -->>  retval = restart_transaction(th, inode, &path); <<--

  We are supposed to release the path, no matter we succeed or fail. But
if the th->refcount is > 1, the path is still valid. And,

          if (retval)
                   goto failure;
          repeat =
              _allocate_block(th, block, inode,
                             &allocated_block_nr, NULL, create);

If the above allocate_block fails with NO_DISK_SPACE or QUOTA_EXCEEDED,
we would have path which is not released.

         if (repeat != NO_DISK_SPACE && repeat != QUOTA_EXCEEDED) {
                   goto research;
         }
         if (repeat == QUOTA_EXCEEDED)
                   retval = -EDQUOT;
         else
                   retval = -ENOSPC;
         goto failure;
	[...]

       failure:
	[...]
         reiserfs_check_path(&path); << Panics here !

Attached here is a patch which could fix the issue.

fix reiserfs/inode.c : restart_transaction() to release the path in all
cases.

The restart_transaction() doesn't release the path when the the journal
handle has a refcount > 1.  This would trigger a reiserfs_panic() if we
encounter an -ENOSPC / -EDQUOT in reiserfs_get_block().

Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: <reiserfs-dev@namesys.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:32 -08:00
Tejun Heo 593be07ae8 [PATCH] file: kill unnecessary timer in fdtable_defer
free_fdtable_rc() schedules timer to reschedule fddef->wq if
schedule_work() on it returns 0.  However, schedule_work() guarantees that
the target work is executed at least once after the scheduling regardless
of its return value.  0 return simply means that the work was already
pending and thus no further action was required.

Another problem is that it used contant '5' as @expires argument to
mod_timer().

Kill unnecessary fddef->timer.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:32 -08:00
Miklos Szeredi 875d95ec9e [PATCH] fuse: fix compile without CONFIG_BLOCK
Randy Dunlap wote:
> Should FUSE depend on BLOCK?  Without that and with BLOCK=n, I get:
>
> inode.c:(.text+0x3acc5): undefined reference to `sb_set_blocksize'
> inode.c:(.text+0x3a393): undefined reference to `get_sb_bdev'
> fs/built-in.o:(.data+0xd718): undefined reference to `kill_block_super

Most fuse filesystems work fine without block device support, so I
think a better solution is to disable the 'fuseblk' filesystem type if
BLOCK=n.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:32 -08:00
Miklos Szeredi 0ec7ca41f6 [PATCH] fuse: add DESTROY operation
Add a DESTROY operation for block device based filesystems.  With the help of
this operation, such a filesystem can flush dirty data to the device
synchronously before the umount returns.

This is needed in situations where the filesystem is assumed to be clean
immediately after unmount (e.g.  ejecting removable media).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:32 -08:00
Miklos Szeredi b2d2272fae [PATCH] fuse: add bmap support
Add support for the BMAP operation for block device based filesystems.  This
is needed to support swap-files and lilo.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:32 -08:00
Miklos Szeredi d809161402 [PATCH] fuse: add blksize option
Add 'blksize' option for block device based filesystems.  During
initialization this is used to set the block size on the device and the super
block.  The default block size is 512bytes.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:31 -08:00
Miklos Szeredi d6392f873f [PATCH] fuse: add support for block device based filesystems
I never intended this, but people started using fuse to implement block device
based "real" filesystems (ntfs-3g, zfs).

The following four patches add better support for these kinds of filesystems.
Unlike "normal" fuse filesystems, using this feature should require superuser
privileges (enforced by the fusermount utility).

Thanks to Szabolcs Szakacsits for the input and testing.

This patch adds a 'fuseblk' filesystem type, which is only different from the
'fuse' filesystem type in how the 'dev_name' mount argument is interpreted.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:31 -08:00
Miklos Szeredi bdcf250804 [PATCH] fuse: minor cleanup in fuse_dentry_revalidate
Remove unneeded code from fuse_dentry_revalidate().  This made some sense
while the validity time could wrap around, but now it's a very obvious no-op.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:31 -08:00
Pekka Enberg 960cc398a7 [PATCH] ext4: fsid for statvfs
Update ext4_statfs to return an FSID that is a 64 bit XOR of the 128 bit
filesystem UUID as suggested by Andreas Dilger.  See the following Bugzilla
entry for details:

  http://bugzilla.kernel.org/show_bug.cgi?id=136

Cc: Andreas Dilger <adilger@clusterfs.com>
Cc: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:31 -08:00
Pekka Enberg 50ee0a32b1 [PATCH] ext3: fsid for statvfs
Update ext3_statfs to return an FSID that is a 64 bit XOR of the 128 bit
filesystem UUID as suggested by Andreas Dilger.  See the following Bugzilla
entry for details:

  http://bugzilla.kernel.org/show_bug.cgi?id=136

Cc: Andreas Dilger <adilger@clusterfs.com>
Cc: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:31 -08:00
Pekka Enberg e4fca01ea2 [PATCH] ext2: fsid for statvfs
Update ext2_statfs to return an FSID that is a 64 bit XOR of the 128 bit
filesystem UUID as suggested by Andreas Dilger.  See the following Bugzilla
entry for details:

  http://bugzilla.kernel.org/show_bug.cgi?id=136

Cc: Andreas Dilger <adilger@clusterfs.com>
Cc: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:31 -08:00
Stas Sergeev 317a40ac22 [PATCH] honour MNT_NOEXEC for access()
Make access(X_OK) take the "noexec" mount option into account.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:30 -08:00
Suzuki K P 57881dd9df [PATCH] Fix check_partition routines
check_partition() stops its probe once it hits an I/O error from the
partition checkers.  This would prevent the actual partition checker
getting a chance to verify the partition.

So this patch lets check_partition() continue probing untill it hits a
success while recording the I/O error which might have been reported by the
checking routines.

Also, it does some cleanup of the partition methods for ibm, atari and
amiga to return -1 upon hitting an I/O error.

Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:30 -08:00
Suzuki Kp 5127d002f9 [PATCH] fix rescan_partitions to return errors properly
The current rescan_partition implementation ignores the errors that comes from
the lower layer.  It reports success for unknown partitions as well as I/O
error cases while reading the partition information.

The unknown partition is not (and will not be) considered as an error in the
kernel, since there are legal users of it (e.g, members of a RAID5 MD Device
or a new disk which is not partitioned at all ).  Changing this behaviour
would scare the user about a serious problem with their disk and is not
recommended.  Thus for both "unknown partitions" to the Linux (eg., DEC
VMS,Novell Netware) and the legal users of NULL partition, would still be
reported as "SUCCESS".

The patch attached here, scares the user about something which he does need to
worry about.  i.e, returning -EIO on disk I/O errors while reading the
partition information.

Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Cc: Erik Mouw <erik@harddisk-recovery.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:30 -08:00
Rafael J. Wysocki 58e14b148d [PATCH] Use freezeable workqueues in XFS
Make the workqueues used by XFS freezeable, so their worker threads don't
submit any I/O after the suspend image has been created.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@suspend2.net>
Cc: David Chinner <dgc@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:29 -08:00
Nigel Cunningham 7dfb71030f [PATCH] Add include/linux/freezer.h and move definitions from sched.h
Move process freezing functions from include/linux/sched.h to freezer.h, so
that modifications to the freezer or the kernel configuration don't require
recompiling just about everything.

[akpm@osdl.org: fix ueagle driver]
Signed-off-by: Nigel Cunningham <nigel@suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:27 -08:00
Christoph Lameter e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
Christoph Lameter e94b176609 [PATCH] slab: remove SLAB_KERNEL
SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:24 -08:00
Christoph Lameter f7267c0c07 [PATCH] slab: remove SLAB_USER
SLAB_USER is an alias of GFP_USER

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:24 -08:00
Christoph Lameter e6b4f8da3a [PATCH] slab: remove SLAB_NOFS
SLAB_NOFS is an alias of GFP_NOFS.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:23 -08:00
Guillem Jover 8fb4fc68ca [PATCH] Allow user processes to raise their oom_adj value
Currently a user process cannot rise its own oom_adj value (i.e.
unprotecting itself from the OOM killer).  As this value is stored in the
task structure it gets inherited and the unprivileged childs will be unable
to rise it.

The EPERM will be handled by the generic proc fs layer, as only processes
with the proper caps or the owner of the process will be able to write to
the file.  So we allow only the processes with CAP_SYS_RESOURCE to lower
the value, otherwise it will get an EACCES which seems more appropriate
than EPERM.

Signed-off-by: Guillem Jover <guillem.jover@nokia.com>
Acked-by: Andrea Arcangeli <andrea@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:21 -08:00
Andrey Mirkin 822191a2fa [PATCH] skip data conversion in compat_sys_mount when data_page is NULL
OpenVZ Linux kernel team has found a problem with mounting in compat mode.

Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode
leads to oops:

  Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: compat_sys_mount+0xd6/0x290
  Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task ffff810034c86bc0)
  Call Trace: ia32_sysret+0x0/0xa

The problem is that data_page pointer can be NULL, so we should skip data
conversion in this case.

Signed-off-by: Andrey Mirkin <amirkin@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:20 -08:00
Patrick Caulfield ac33d07105 [DLM] Clean up lowcomms
This fixes up most of the things pointed out by akpm and Pavel Machek
with comments below indicating why some things have been left:

Andrew Morton wrote:
>
>> +static struct nodeinfo *nodeid2nodeinfo(int nodeid, gfp_t alloc)
>> +{
>> +	struct nodeinfo *ni;
>> +	int r;
>> +	int n;
>> +
>> +	down_read(&nodeinfo_lock);
>
> Given that this function can sleep, I wonder if `alloc' is useful.
>
> I see lots of callers passing in a literal "0" for `alloc'.  That's in fact
> a secret (GFP_ATOMIC & ~__GFP_HIGH).  I doubt if that's what you really
> meant.  Particularly as the code could at least have used __GFP_WAIT (aka
> GFP_NOIO) which is much, much more reliable than "0".  In fact "0" is the
> least reliable mode possible.
>
> IOW, this is all bollixed up.

When 0 is passed into nodeid2nodeinfo the function does not try to allocate a
new structure at all. it's an indication that the caller only wants the nodeinfo
struct for that nodeid if there actually is one in existance.
I've tidied the function itself so it's more obvious, (and tidier!)

>> +/* Data received from remote end */
>> +static int receive_from_sock(void)
>> +{
>> +	int ret = 0;
>> +	struct msghdr msg;
>> +	struct kvec iov[2];
>> +	unsigned len;
>> +	int r;
>> +	struct sctp_sndrcvinfo *sinfo;
>> +	struct cmsghdr *cmsg;
>> +	struct nodeinfo *ni;
>> +
>> +	/* These two are marginally too big for stack allocation, but this
>> +	 * function is (currently) only called by dlm_recvd so static should be
>> +	 * OK.
>> +	 */
>> +	static struct sockaddr_storage msgname;
>> +	static char incmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
>
> whoa.  This is globally singly-threaded code??

Yes. it is only ever run in the context of dlm_recvd.
>>
>> +static void initiate_association(int nodeid)
>> +{
>> +	struct sockaddr_storage rem_addr;
>> +	static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
>
> Another static buffer to worry about.  Globally singly-threaded code?

Yes. Only ever called by dlm_sendd.

>> +
>> +/* Send a message */
>> +static int send_to_sock(struct nodeinfo *ni)
>> +{
>> +	int ret = 0;
>> +	struct writequeue_entry *e;
>> +	int len, offset;
>> +	struct msghdr outmsg;
>> +	static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
>
> Singly-threaded?

Yep.

>>
>> +static void dealloc_nodeinfo(void)
>> +{
>> +	int i;
>> +
>> +	for (i=1; i<=max_nodeid; i++) {
>> +		struct nodeinfo *ni = nodeid2nodeinfo(i, 0);
>> +		if (ni) {
>> +			idr_remove(&nodeinfo_idr, i);
>
> Didn't that need locking?

Not. it's only ever called at DLM shutdown after all the other threads
have been stopped.

>>
>> +static int write_list_empty(void)
>> +{
>> +	int status;
>> +
>> +	spin_lock_bh(&write_nodes_lock);
>> +	status = list_empty(&write_nodes);
>> +	spin_unlock_bh(&write_nodes_lock);
>> +
>> +	return status;
>> +}
>
> This function's return value is meaningless.  As soon as the lock gets
> dropped, the return value can get out of sync with reality.
>
> Looking at the caller, this _might_ happen to be OK, but it's a nasty and
> dangerous thing.  Really the locking should be moved into the caller.

It's just an optimisation to allow the caller to schedule if there is no work
to do. if something arrives immediately afterwards then it will get picked up
when the process re-awakes (and it will be woken by that arrival).

The 'accepting' atomic has gone completely. as Andrew pointed out it didn't
really achieve much anyway. I suspect it was a plaster over some other
startup or shutdown bug to be honest.


Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Pavel Machek <pavel@ucw.cz>
2006-12-07 09:25:13 -05:00
Steven Whitehouse 34126f9f41 [GFS2] Change gfs2_fsync() to use write_inode_now()
This is a bit better than the previous version of gfs2_fsync()
although it would be better still if we were able to call a
function which only wrote the inode & metadata. Its no big deal
though that this will potentially write the data as well since
the VFS has already done that before calling gfs2_fsync(). I've
also added a comment to explain whats going on here.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>
2006-12-07 09:13:14 -05:00
Andi Kleen 67fd44fea2 [PATCH] x86-64: Implement compat code for SIOCSIFHWBROADCAST
This network ioctl wasn't handled before.

Reported by Alexandra.Kossovsky@oktetlabs.ru (Alexandra Kossovsky)
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-07 02:14:03 +01:00
Chuck Lever c041b5ff8d NLM: fix print format for tk_pid
The tk_pid field is an unsigned short.  The proper print format specifier for
that type is %5u, not %4d.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:54 -05:00
Trond Myklebust a18030445f NFS: Clean up calls to mark_inode_dirty() part 2
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:42 -05:00
Trond Myklebust 9cf85e0a24 NFS: Fix up writeback_control->nr_to_write accounting
We're really accounting for the same page twice now: once in
generic_writepages(), and once in nfs_scan_dirty().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:41 -05:00
Trond Myklebust 3925675cb3 NFS: Fix up the dirty page accounting
There is now no reason to account for the dirty pages in the NFS code,
since the VM code will now do it for us via __set_page_dirty_nobuffers(),
and set_page_writeback().

We still need to keep the accounting of stable writes, though.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:41 -05:00
Trond Myklebust e507d9ebbb NFS: Ensure the inode is marked as dirty if we break out of nfs_wb_all()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:40 -05:00
Trond Myklebust fa8d8c5b77 NFS: Fix nfs_release_page
invalidate_inode_pages2_range() will clear the PG_dirty bit before calling
try_to_release_page().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:40 -05:00
Trond Myklebust 61822ab5e3 NFS: Ensure we only call set_page_writeback() under the page lock
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:40 -05:00
Trond Myklebust e261f51f25 NFS: Make nfs_updatepage() mark the page as dirty.
This will ensure that we can call set_page_writeback() from within
nfs_writepage(), which is always called with the page lock set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:39 -05:00
Trond Myklebust 4d770ccf42 NFS: Ensure that nfs_wb_page() calls writepage when necessary.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:39 -05:00
Trond Myklebust 1a54533ec8 NFS: Add nfs_set_page_dirty()
We will want to allow nfs_writepage() to distinguish between pages that
have been marked as dirty by the VM, and those that have been marked as
dirty by nfs_updatepage().
In the former case, the entire page will want to be written out, and so any
requests that were pending need to be flushed out first.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:38 -05:00
Trond Myklebust 200baa2112 NFS: Remove nfs_writepage_sync()
Maintaining two parallel ways of doing synchronous writes is rather
pointless. This patch gets rid of the legacy nfs_writepage_sync(), and
replaces it with the faster asynchronous writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:38 -05:00
Trond Myklebust e21195a740 NFS: More cleanups of fs/nfs/write.c
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:37 -05:00
Trond Myklebust 87a4ce1608 NFS: Remove call to igrab() from nfs_writepage()
We always ensure that the nfs_open_context holds a reference to the dentry,
so the test in nfs_writepage() for whether or not the inode is referenced
is redundant.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:37 -05:00
Trond Myklebust 49a70f2786 NFS: Cleanup: add common helper nfs_page_length()
Clean up a lot of ad-hoc page length calculations in fs/nfs/write.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:36 -05:00
Trond Myklebust 277459d2e2 NFS: Store pointer to the nfs_page in page->private
This will allow fast lookup of the nfs_page from the struct page instead of
having to search the radix tree.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:36 -05:00
Trond Myklebust 1c75950b9a NFS: cleanup of nfs_sync_inode_wait()
Allow callers to directly pass it a struct writeback_control.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:35 -05:00
Trond Myklebust 3f442547b7 NFS: Clean up nfs_scan_dirty()
Pass down struct writeback control.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:35 -05:00
Trond Myklebust 28c6925fce NFS: Clean up nfs_flush_inode()
Make it take a struct writepages argument, and rename to
nfs_flush_mapping().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:34 -05:00
Frank Filz eb5f8545ff NFS: Remove use of the Big Kernel Lock around nfs calls to readlink
Remove use of the Big Kernel Lock around indirect calls to
nfs3_proc_readlink and nfs4_proc_readlink, both of which
basically call rpc_call_sync.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:31 -05:00
Frank Filz cae823c4c0 NFS: Remove use of the Big Kernel Lock around calls to rpc_call_sync
Remove use of the Big Kernel Lock around calls to rpc_call_sync.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:31 -05:00
Frank Filz a99b71c9c4 NFS: Remove use of the Big Kernel Lock around calls to rpc_execute.
Remove use of the Big Kernel Lock around calls to rpc_execute.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:30 -05:00
Trond Myklebust e8e058e830 NFS: Fix nfs_sync_inode_wait(FLUSH_INVALIDATE)
Currently nfs_sync_inode_wait() will fail to loop correctly when we call
nfs_sync_inode_wait with the FLUSH_INVALIDATE argument.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:28 -05:00