Commit Graph

5358 Commits

Author SHA1 Message Date
Mark Fasheh 6f16bf655c ocfs2: small cleanup of ocfs2_request_delete()
There are two checks in there (one for inode newness, one for other mounted
nodes) which are unnecessary, so remove them. The DLM will allow the trylock
in either case without any messaging overhead.

Removing these makes ocfs2_request_delete() a one liner function, so just
move the trylock out one level into ocfs2_query_inode_wipe().

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 14:40:55 -07:00
Tiger Yang 68e2b740c4 ocfs2: remove unused code
Remove node messaging code that becomes unused with the delete inode vote
removal.

[Removed even more cruft which I spotted during review --Mark]

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 14:40:16 -07:00
Tiger Yang 500086300e ocfs2: Remove delete inode vote
Ocfs2 currently does cluster-wide node messaging to check the open state of
an inode during delete. This patch removes that mechanism in favor of an
inode cluster lock which is taken at shared read when an inode is first read
and dropped in clear_inode(). This allows a deleting node to test the
liveness of an inode by attempting to take an exclusive lock.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 14:39:48 -07:00
Mark Fasheh a9f5f70739 ocfs2: filter more error prints
We don't want to print anything at all in ocfs2_lookup() when getting an
error from ocfs2_iget() - it could be something as innocuous as a signal
being detected in the dlm.

ocfs2_permission() should filter on -ENOENT which ocfs2_meta_lock() can
return if the inode was deleted on another node.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 13:39:08 -07:00
Sunil Mushran bebe6f120b ocfs2: Replace panic() with emergency_restart() when fencing
We have noticed panic() hanging leading us to a situation in which
the node, while otherwise dead, is still disk heartbeating. This
leads to a hung cluster as the other nodes are waiting for this
node to stop disk heartbeating. This situation is only resolved
by power resetting the box.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 13:39:02 -07:00
Sunil Mushran 5d262cc7dd ocfs2: Silence compiler warnings
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 13:38:55 -07:00
Mark Fasheh be9e986b82 ocfs2: Local mounts should skip inode updates
We don't want the extent map and uptodate cache destruction in
ocfs2_meta_lock_update() on a local mount, so skip that.

This fixes several bugs with uptodate being cleared on buffers and extent
maps being corrupted.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 13:35:21 -07:00
Sunil Mushran 0d01af6e5d ocfs2_dlm: Call cond_resched_lock() once per hash bucket scan
In dlm_migrate_all_locks(), we currently call cond_resched_lock() after
processing each lockres in a hash bucket. Move it outside the loop so as to
call it only after the entire hash bucket has been processed.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 13:33:11 -07:00
Srinivas Eeda 756a1501dd ocfs2_dlm: fix race in dlm_remaster_locks
There is a possibility that dlm_remaster_locks could overwride node->state
with DLM_RECO_NODE_DATA_REQUESTED after dlm_reco_data_done_handler sets the
node->state to DLM_RECO_NODE_DATA_DONE. This could lead to recovery getting
stuck and requires a cluster reboot. Synchronize with dlm_reco_state_lock
spinlock.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-04-26 13:33:02 -07:00
Steve French 984acfe1cf [CIFS] prefixpath mounts to servers supporting posix paths used wrong slash
Acked-by: Alexander Bokovoy <abokovoy@ru.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-26 16:42:50 +00:00
Steve French deb0420c6f [CIFS] Update cifs version to 1.49
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-26 14:35:54 +00:00
David Woodhouse ef2e58ea6b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2007-04-26 09:31:28 +01:00
Andrew Morton f6449f4ece [JFFS2] Fix compr_rubin.c build after include file elimination.
It seems to be silly season lately.

(Oops, test builds are more useful if the file in question is actually
configured on. dwmw2).

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-26 07:27:04 +01:00
Patrick McHardy af65bdfce9 [NETLINK]: Switch cb_lock spinlock to mutex and allow to override it
Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:29:03 -07:00
Arnaldo Carvalho de Melo b529ccf279 [NETLINK]: Introduce nlmsg_hdr() helper
For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
number of direct accesses to skb->data and for consistency with all the other
cast skb member helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:34 -07:00
Eric Dumazet ae40eb1ef3 [NET]: Introduce SIOCGSTAMPNS ioctl to get timestamps with nanosec resolution
Now network timestamps use ktime_t infrastructure, we can add a new
ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'.
User programs can thus access to nanosecond resolution.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
CC: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:04 -07:00
David Woodhouse 61c4b23770 [JFFS2] Handle inodes with only a single metadata node with non-zero isize
This should never happen unless there's corruption on the medium and the
actual data nodes go missing. But the failure mode (an oops when we assume
the fragtree isn't empty and go looking for its last node) isn't useful.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-25 17:04:23 +01:00
David Woodhouse c00c310eac [JFFS2] Tidy up licensing/copyright boilerplate.
In particular, remove the bit in the LICENCE file about contacting
Red Hat for alternative arrangements. Their errant IS department broke
that arrangement a long time ago -- the policy of collecting copyright
assignments from contributors came to an end when the plug was pulled on
the servers hosting the project, without notice or reason.

We do still dual-license it for use with eCos, with the GPL+exception
licence approved by the FSF as being GPL-compatible. It's just that nobody
has the right to license it differently.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-25 14:16:47 +01:00
vignesh eaa33a9ac0 [CIFS] Replace kmalloc/memset combination with kzalloc
Signed-off-by: Vignesh Babu <vignesh.babu@wipro.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-25 12:13:48 +00:00
Steve French 5858ae44e2 [CIFS] Add IPv6 support
IPv6 support was started a few years ago in the cifs client, but lacked a
kernel helper function for parsing the ascii form of the ipv6 address. Now
that that is added (and now IPv6 is the default that some OS use now) it
was fairly easy to finish  the cifs ipv6 support.  This  requires that
CIFS_EXPERIMENTAL be enabled and (at least until the mount.cifs module is
modified to use a new ipv6 friendly call instead of gethostbyname) and the
ipv6 address be passed on the mount as "ip=" mount option.

Thanks

Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-25 11:59:10 +00:00
Steve French cbac3cba66 [CIFS] New CIFS POSIX mkdir performance improvement (part 2)
Fix incorrect parsing of return data

Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-25 11:46:06 +00:00
Joakim Tjernlund 0dec4c8bc6 [JFFS2] Better fix for all-zero node headers
No need to check for all-zero header since the header cannot
be zero due to other checks.

Replace the all-zero header check in readinode.c with a
check for the magic word.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-25 04:13:06 +01:00
David Woodhouse df8e96f391 [JFFS2] Improve read_inode memory usage, v2.
We originally used to read every node and allocate a jffs2_tmp_dnode_info
structure for each, before processing them in (reverse) version order
and discarding the ones which are obsoleted by later nodes.

With huge logfiles, this behaviour caused memory problems. For example, a
file involved in OLPC trac #1292 has 1822391 nodes, and would cause the XO
machine to run out of memory during the first stage of read_inode().

Instead of just inserting nodes into a tree in version order as we find
them, we now put them into a tree in order of their offset within the
file, which allows us to immediately discard nodes which are completely
obsoleted.

We don't use a full tree with 'fragments' pointing to the real data
structure, as we do in the normal fragtree. We sort only on the start
address, and add an 'overlapped' flag to the tmp_dnode_info to indicate
that the node in question is (partially) overlapped by another.

When the scan is complete, we start at the end of the file, adding each
node to a real fragtree as before. Where the node is non-overlapped, we
just add it (it doesn't matter that it's not the latest version; there is
no overlap). When the node at the end of the tree _is_ overlapped, we sort
it and all its overlapping nodes into version order and then add them to
the fragtree in that order.

This 'early discard' reduces the peak allocation of tmp_dnode_info
structures from 1.8M to a mere 62872 (3.5%) in the degenerate case
referenced above.

This version of the patch also correctly rememembers the highest node
version# seen for an inode when it's scanned.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-25 03:23:42 +01:00
Jeff Mahoney 9b7f375505 reiserfs: fix xattr root locking/refcount bug
The listxattr() and getxattr() operations are only protected by a read
lock.  As a result, if either of these operations run in parallel, a race
condition exists where the xattr_root will end up being cached twice, which
results in the leaking of a reference and a BUG() on umount.

This patch refactors get_xa_root(), __get_xa_root(), and create_xa_root(),
into one get_xa_root() function that takes the appropriate locking around
the entire critical section.

Reported, diagnosed and tested by Andrea Righi <a.righi@cineca.it>

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Andrea Righi <a.righi@cineca.it>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Edward Shishkin <edward@namesys.com>
Cc: Alex Zarochentsev <zam@namesys.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24 08:23:09 -07:00
Latchesar Ionkov c959df9f01 v9fs: don't use primary fid when removing file
v9fs_insert uses v9fs_fid_lookup (which also locks the fid) to get the
primary fid associated with the dentry and destroys the v9fs_fid struct
after removing the file.  If another process called v9fs_fid_lookup on the
same dentry, it may wait undefinitely for the fid's lock (as the struct is
freed).

This patch changes v9fs_remove to use a cloned fid, so the primary fid is
not locked and freed.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Cc: Eric Van Hensbergen <ericvh@hera.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24 08:23:08 -07:00
Steve French 2dd29d3133 [CIFS] New CIFS POSIX mkdir performance improvement
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-23 22:07:35 +00:00
David Woodhouse 44b998e1eb [JFFS2] Improve failure mode if inode checking leaves unchecked space.
We should never find the unchecked size is non-zero after we've finished
checking all inodes. If it happens, used to BUG(), leaving the alloc_sem
held and deadlocking. Instead, just return -ENOSPC after complaining. The
GC thread will die, but read-only operation should be able to continue and
the file system should be unmountable.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-23 12:11:46 +01:00
David Woodhouse 566865a2a4 [JFFS2] Fix cross-endian build.
When compiling a LE-capable JFFS2 on PowerPC, wbuf.c fails to compile:

fs/jffs2/wbuf.c:973: error: braced-group within expression allowed only inside a function
fs/jffs2/wbuf.c:973: error: initializer element is not constant
fs/jffs2/wbuf.c:973: error: (near initialization for ‘oob_cleanmarker.magic’)
fs/jffs2/wbuf.c:974: error: braced-group within expression allowed only inside a function
fs/jffs2/wbuf.c:974: error: initializer element is not constant
fs/jffs2/wbuf.c:974: error: (near initialization for ‘oob_cleanmarker.nodetype’)
fs/jffs2/wbuf.c:975: error: braced-group within expression allowed only inside a function
fs/jffs2/wbuf.c:976: error: initializer element is not constant
fs/jffs2/wbuf.c:976: error: (near initialization for ‘oob_cleanmarker.totlen’)

Provide constant_cpu_to_je{16,32} functions, and use them for initialising the
offending structure.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-23 12:07:17 +01:00
Trond Myklebust 2b82f190c8 NFS: Fix race in nfs_set_page_dirty
Protect nfs_set_page_dirty() against races with nfs_inode_add_request.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-20 22:56:30 -07:00
Trond Myklebust 612c9384fd NFS: Fix the 'desynchronized value of nfs_i.ncommit' error
Redirtying a request that is already marked for commit will screw up the
accounting for NR_UNSTABLE_NFS as well as nfs_i.ncommit.
Ensure that all requests on the commit queue are labelled with the
PG_NEED_COMMIT flag, and avoid moving them onto the dirty list inside
nfs_page_mark_flush().

Also inline nfs_mark_request_dirty() into nfs_page_mark_flush() for
atomicity reasons. Avoid dropping the spinlock until we're done marking the
request in the radix tree and have added it to the ->dirty list.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-20 22:56:29 -07:00
Trond Myklebust 6d677e3504 NFS: Don't clear PG_writeback until after we've processed unstable writes
Ensure that we don't release the PG_writeback lock until after the page has
either been redirtied, or queued on the nfs_inode 'commit' list.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-20 22:56:29 -07:00
Trond Myklebust 8e821cad12 NFS: clean up the unstable write code
Get rid of the inlined #ifdefs.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-20 22:56:29 -07:00
Joakim Tjernlund a491486a20 [JFFS2] Obsolete dirent nodes immediately on unlink, where possible.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-20 23:09:28 -04:00
Evgeniy Dushistov 07a0cfec30 ufs proper handling of zero link case
This patch should fix or partly fix this bug:
http://bugzilla.kernel.org/show_bug.cgi?id=8276

The problem is:

- if we see "zero link case" during reading inode operation, we call
  ufs_error(which remount fs readonly), but not "mark" inode as bad (1)

- in readonly case we do not fill some data structures, which are used in
  read and write case (2)

- VFS call ufs_delete_inode if link count is zero (3)

so (1)->(3)->(2) cause oops, this patch should fix such scenario

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Jim Paris <jim@jtan.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-17 16:36:27 -07:00
Alan Cox c4bbafda70 exec.c: fix coredump to pipe problem and obscure "security hole"
The patch checks for "|" in the pattern not the output and doesn't nail a
pid on to a piped name (as it is a program name not a file)

Also fixes a very very obscure security corner case.  If you happen to have
decided on a core pattern that starts with the program name then the user
can run a program called "|myevilhack" as it stands.  I doubt anyone does
this.

Signed-off-by: Alan Cox <alan@redhat.com>
Confirmed-by: Christopher S. Aker <caker@theshore.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-17 16:36:26 -07:00
Joakim Tjernlund c2aecda79c [JFFS2] Speed up mount for directly-mapped NOR flash
Remove excessive scanning of empty flash after a clean
marker for users of the point/unpoint method. cfi_cmdset_0001
uses point/unpoint by default iff flash mapping is linear.
The speedup is several orders of magnitude if FS is less than
half full.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-17 14:07:34 -04:00
Artem Bityutskiy 10731f8300 [JFFS2] fix buffer sise calculations in jffs2_get_inode_nodes()
In read inode we have an optimization which prevents one
min. I/O unit (e.g. NAND page) to be read more then once.

Namely, at the beginning we do not know which node type we read,
so we read so we assume we read the directory entry, because it
has the smallest node header. When we read it, we read up to the
next min. I/O unit, just because if later we'll need to read more,
we already have this data.

If it turns out to be that the node is not directory entry, and
we need more data, and we did not read it because it sits in the
next min. I/O unit, we read the whole next (or several next)
min. I/O unit(s). And if it happens to be that we read a data node,
and we've read part of its data, we calculate partial CRC.
So if later we need to check data CRC, we'll only read the rest
of the data from further min. I/O units and continue CRC checking.

This code was a bit messy and buggy. The bug was that it assumed
relatively large min. I/O unit, so that the largest node header
could overlap only one min. I/O unit boundary.

This parch clean-ups the code a bit and fixes this bug.
The patch was not tested on flash with small min. I/O unit, like
NOR-ECC, nut it was tested on NAND with 512 bytes NAND page, so
it at least does not break NAND. It was also tested with mtdram
so it should not break NOR.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-17 14:05:48 -04:00
Adrian Hunter 7f762ab24c [JFFS2] Disable summary after wbuf recovery
After a write error, any data in the write buffer must
be relocated.  This is handled by the jffs2_wbuf_recover
function.  This function does not fix up the erase block
summary information that is collected for writing at the
end of the block, which results in an incorrect summary
(or BUG if the summary was found to be empty).

As the summary is not essential (it is an optimisation),
it may be disabled for the current erase block when this
situation arises.  This patch does that.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-17 13:56:44 -04:00
Adrian Hunter 99c2594f0e [JFFS2] Prevent list corruption when handling write errors
If a write error occurs, the affected block is placed on the
bad_used_list.  In the case that the write error occured
when writing summary data the block was also being placed on
the dirty_list, which caused list corruption and ultimately
a soft lockup in jffs2_mark_node_obsolete. This fixes that.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-17 13:56:23 -04:00
Artem Bityutskiy b0afbbec49 [JFFS2] fix deadlock on error path
When the MTD driver returns write failure, the following deadlock
occurs:

We are in __jffs2_flush_wbuf(), we hold &c->wbuf_sem. Write failure.
jffs2_wbuf_recover()->jffs2_reserve_space_gc()->jffs2_do_reserve_space()
->jffs2_erase_pending_blocks()->jffs2_flash_read()

and it tries to lock &c->wbuf_sem again. Deadlock.

Reported-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-17 13:53:51 -04:00
Thomas Gleixner 53043002ef [JFFS2] check node crc before doing anything else
Check the node CRC on scan before doing anything else with the node.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-17 18:26:18 +01:00
J. Bruce Fields c2fa1b8a6c locks: create posix-to-flock helper functions
Factor out a bit of messy code by creating posix-to-flock counterparts
to the existing flock-to-posix helper functions.

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-04-16 13:40:37 -04:00
J. Bruce Fields 226a998dbf locks: trivial removal of unnecessary parentheses
Remove some unnecessary parentheses.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-04-16 13:40:37 -04:00
Trond Myklebust eb4cac10d9 NFS: Fix a list corruption problem
We must remove the request from whatever list it is currently on before we
can add it to the dirty list.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-15 16:48:11 -07:00
Trond Myklebust 5a6d41b32a NFS: Ensure PG_writeback is cleared when writeback fails
If the writebacks are cancelled via nfs_cancel_dirty_list, or due to the
memory allocation failing in nfs_flush_one/nfs_flush_multi, then we must
ensure that the PG_writeback flag is cleared.

Also ensure that we actually own the PG_writeback flag whenever we
schedule a new writeback by making nfs_set_page_writeback() return the
value of test_set_page_writeback().
The PG_writeback page flag ends up replacing the functionality of the
PG_FLUSHING nfs_page flag, so we rip that out too.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-14 21:46:48 -07:00
Trond Myklebust 60fa3f769f NFS: Fix two bugs in the O_DIRECT write code
Do not flag an error if the COMMIT call fails and we decide to resend the
writes. Let the resend flag the error if it fails.

If a write has failed, then nfs_direct_write_result should not attempt to
send a commit. It should just exit asap and return the error to the user.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-14 21:46:48 -07:00
Trond Myklebust e1552e1998 NFS: Fix an Oops in nfs_setattr()
It looks like nfs_setattr() and nfs_rename() also need to test whether the
target is a regular file before calling nfs_wb_all()...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-14 21:46:47 -07:00
Jeff Mahoney c3724b129b [PATCH] autofs4: fix race in unhashed dentry code
Commit f50b6f8691 introduced a race in
autofs4 between autofs_lookup_unhashed() and autofs_dentry_release().

autofs_dentry_release() ends up clearing the ->dentry and ->inode members
of autofs_info before removing it from the rehash list.  The list is
protected by the rehash lock in both functions, but since
autofs_dentry_release() starts tearing the autofs_info struct down before
removing it from the list, autofs_lookup_unhashed() can get a autofs_info
with a NULL dentry.

This patch moves the clearing of ->dentry and ->inode after the removal
from the rehash list.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-12 15:31:42 -07:00
Vladimir Saveliev 6d205f1205 [PATCH] reiserfs: fix key decrementing
This patch fixes a bug in function decrementing a key of stat data item.

Offset of reiserfs keys are compared as signed values.  To set key offset
to maximal possible value maximal signed value has to be used.

This bug is responsible for severe reiserfs filesystem corruption which
shows itself as warning vs-13060.  reiserfsck fixes this corruption by
filesystem tree rebuilding.

Signed-off-by: Vladimir Saveliev <vs@namesys.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-12 15:31:42 -07:00
Stephen Rothwell 1a38147ed0 [POWERPC] Make struct property's value a void *
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13 03:55:18 +10:00
Timo Savola a5bfffac64 [PATCH] fuse: validate rootmode mount option
If rootmode isn't valid, we hit the BUG() in fuse_init_inode.  Now
EINVAL is returned.

Signed-off-by: Timo Savola <tsavola@movial.fi>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-08 19:47:55 -07:00
Steve French 5268df2ead [CIFS] Add write perm for usr to file on windows should remove r/o dos attr
Remove read only dos attribute on chmod when adding any write permission (ie on any of
user/group/other (not all of user/group/other ie  0222) when
mounted to windows.

Suggested by: Urs Fleisch

Signed-off-by: Urs Fleisch <urs.fleisch@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-06 19:28:16 +00:00
Andrew Morton 2363cc0264 [PATCH] remove protection of LANANA-reserved majors
Revert all this.  It can cause device-mapper to receive a different major from
earlier kernels and it turns out that the Amanda backup program (via GNU tar,
apparently) checks major numbers on files when performing incremental backups.

Which is a bit broken of Amanda (or tar), but this feature isn't important
enough to justify the churn.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-04 21:12:47 -07:00
Steve French 3a9f462f6d [CIFS] Remove unnecessary parm to cifs_reopen_file
Also expand debug entry to show which character on a failed Unicode
mapping.

Acked-by: Shaggy <shaggy@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-04 17:10:24 +00:00
Igor Mammedov aaf737adb6 [CIFS] Switch cifsd to kthread_run from kernel_thread
cifsd was the only cifs thread that had not been switched to the newer
kthread interface

Signed-off-by: Igor Mammedov <niallain at gmail.com>
Signed-off-by: Wilhelm Meier <wilhelm.meier@fh-kl.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-03 19:16:43 +00:00
Christoph Hellwig c33f8d3274 [CIFS] Remove unnecessary checks
file->f_path.dentry or file->f_path.dentry.d_inode can't be NULL since at
least ten years, similar for all but very few arguments passed in from the
VFS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-04-02 18:47:20 +00:00
Robert P. J. Day 8dc64fca75 [JFFS2] Delete everything related to obsolete JFFS2_PROC option
Delete everything related to the apparently non-existent kernel config
option JFFS2_PROC.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-04-02 14:11:25 -04:00
Andrew Morton 7479d2b90b [PATCH] revert "retries in ext4_prepare_write() violate ordering requirements"
Revert b46be05004.  Same reasoning as for ext3.

Cc: Kirill Korotaev <dev@openvz.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ken Chen <kenneth.w.chen@intel.com>
Cc: Andrey Savochkin <saw@sw.ru>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 10:06:08 -07:00
Andrew Morton 1aa9b4b9bc [PATCH] revert "retries in ext3_prepare_write() violate ordering requirements"
Revert e92a4d595b.

Dmitry points out

"When we block_prepare_write() failed while ext3_prepare_write() we jump to
 "failure" label and call ext3_prepare_failure() witch search last mapped bh
 and invoke commit_write untill it.  This is wrong!!  because some bh from
 begining to the last mapped bh may be not uptodate.  As a result we commit to
 disk not uptodate page content witch contains garbage from previous usage."

and

"Unexpected file size increasing."

   Call trace the same as it was in first issue but result is different.
   For example we have file with i_size is zero.  we want write two blocks ,
   but fs has only one free block.

   ->ext3_prepare_write(...from == 0, to == 2048)
     retry:
     ->block_prepare_write() == -ENOSPC# we failed but allocated one block here.
     ->ext3_prepare_failure()
       ->commit_write( from == 0, to == 1024) # after this i_size becomes 1024 :)
     if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
        goto retry;

   Finally when all retries will be spended ext3_prepare_failure return
   -ENOSPC, but i_size was increased and later block trimm procedures can't
   help here.

We don't appear to have the horsepower to fix these issues, so let's put
things back the way they were for now.

Cc: Kirill Korotaev <dev@openvz.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ken Chen <kenneth.w.chen@intel.com>
Cc: Andrey Savochkin <saw@sw.ru>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 10:06:08 -07:00
Brian Pomerantz 0322170260 [PATCH] fix page leak during core dump
When the dump cannot occur most likely because of a full file system and
the page to be written is the zero page, the call to page_cache_release()
is missed.

Signed-off-by: Brian Pomerantz <bapper@mvista.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 10:06:08 -07:00
Andrew Morton 05565b65a5 [PATCH] proc: fix linkage with CONFIG_SYSCTL=y, CONFIG_PROC_SYSCTL=n
We're using #ifdef CONFIG_SYSCTL, but we should be using CONFIG_PROC_SYSCTL,
so we get

 fs/built-in.o: In function `proc_root_init':
 /usr/src/linux/fs/proc/root.c:83: undefined reference to `proc_sys_init'

Fix that up and remove an ifdef-in-C.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Helge Hafting <helgehaf@aitel.hist.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-02 10:06:08 -07:00
Linus Torvalds 22c8c65d24 Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: partial write fix
2007-03-29 08:23:52 -07:00
Paolo 'Blaisorblade' Giarrusso 75e8defbe4 [PATCH] uml: hostfs variable renaming
* rename name to host_root_path
* rename data to req_root.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-29 08:22:25 -07:00
Jeff Dike 622e696938 [PATCH] uml: fix compilation problems
Fix a few miscellaneous compilation problems -
	an assignment with mismatched types in ldt.c
	a missing include in mconsole.h which needs a definition of uml_pt_regs
	I missed removing an include of user_util.h in hostfs

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-29 08:22:25 -07:00
Dmitriy Monakhov d9993c37ef [PATCH] splice: partial write fix
Currently if partial write has happened while ->commit_write() then page
wasn't marked as accessed and rebalanced.

Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-03-29 14:26:42 +02:00
Linus Torvalds e5c465f5d9 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2_dlm: Check for migrateable lockres in dlm_empty_lockres()
  ocfs2_dlm: Fix lockres ref counting bug
2007-03-28 14:02:03 -07:00
Jeff Garzik a9c87a10db Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes 2007-03-28 02:21:18 -04:00
Zach Brown 28defbea64 [PATCH] aio: remove bare user-triggerable error printk
The user can generate console output if they cause do_mmap() to fail
during sys_io_setup().  This was seen in a regression test that does
exactly that by spinning calling mmap() until it gets -ENOMEM before
calling io_setup().

We don't need this printk at all, just remove it.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 17:53:25 -07:00
Jean Tourrilhes ed4bb10631 [PATCH] wext: Add missing ioctls to 64<->32 conversion
Johannes Berg and Michael Buesch noticed that the WPA ioctls
were missing from the 64<->32 bit conversion. This means that when
using a 32 bits userspace on a 64 bit kernel, those ioctls fail.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-03-27 14:10:17 -04:00
Linus Torvalds e0ab0bb6d2 Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  Export __splice_from_pipe()
  2/2 splice: dont readpage
  1/2 splice: dont steal
  make elv_register() output atomic
  block: blk_max_pfn is somtimes wrong
2007-03-27 09:05:49 -07:00
Mika Kukkonen 5c46010af2 [PATCH] Fix kernel build with EMBEDDED & PROC_FS & !PROC_SYSCTL
Without attached patch against current -git I get following with
!PROC_SYSCTL (with EMBEDDED and PROC_FS set):

    CC      init/version.o
    LD      init/built-in.o
    LD      vmlinux
  fs/built-in.o: In function `do_proc_sys_lookup':
  proc_sysctl.c:(.text+0x26583): undefined reference to `sysctl_head_next'
  fs/built-in.o: In function `proc_sys_revalidate':
  proc_sysctl.c:(.text+0x265bb): undefined reference to `sysctl_head_finish'
  fs/built-in.o: In function `proc_sys_readdir':
  proc_sysctl.c:(.text+0x26720): undefined reference to `sysctl_head_next'
  proc_sysctl.c:(.text+0x267d8): undefined reference to `sysctl_head_finish'
  proc_sysctl.c:(.text+0x268e7): undefined reference to `sysctl_head_next'
  proc_sysctl.c:(.text+0x26910): undefined reference to `sysctl_head_finish'
  fs/built-in.o: In function `proc_sys_write':
  proc_sysctl.c:(.text+0x2695d): undefined reference to `sysctl_perm'
  proc_sysctl.c:(.text+0x2699c): undefined reference to `sysctl_head_finish'
  fs/built-in.o: In function `proc_sys_read':
  proc_sysctl.c:(.text+0x269e9): undefined reference to `sysctl_perm'
  proc_sysctl.c:(.text+0x26a25): undefined reference to `sysctl_head_finish'
  fs/built-in.o: In function `proc_sys_permission':
  proc_sysctl.c:(.text+0x26ad1): undefined reference to `sysctl_perm'
  proc_sysctl.c:(.text+0x26adb): undefined reference to `sysctl_head_finish'
  fs/built-in.o: In function `proc_sys_lookup':
  proc_sysctl.c:(.text+0x26b39): undefined reference to `sysctl_head_finish'
  make: *** [vmlinux] Virhe 1

All those functions are in fs/proc/proc_sysctl.c, which has no CONFIG_
#define's in it, so the patch makes the compilation of that file to depend
on CONFIG_PROC_SYSCTL (the simplest choice).

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:16 -07:00
J. Bruce Fields 79f6523a16 [PATCH] knfsd: nfsd4: remove superfluous cancel_delayed_work() call
This cancel_delayed_work call is called from a function that is only called
from a piece of code that immediate follows a cancel and destruction of the
workqueue, so it's clearly a mistake.

Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:14 -07:00
Bruce Fields 21315edd48 [PATCH] knfsd: nfsd4: demote "clientid in use" printk to a dprintk
The reused clientid here is a more of a problem for the client than the
server, and the client can report the problem itself if it's serious.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:14 -07:00
Bruce Fields 54c0440949 [PATCH] knfsd: nfsd4: fix inheritance flags on v4 ace derived from posix default ace
A regression introduced in the last set of acl patches removed the
INHERIT_ONLY flag from aces derived from the posix acl.  Fix.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:14 -07:00
NeilBrown 598b9a5637 [PATCH] knfsd: allow nfsd READDIR to return 64bit cookies
->readdir passes lofft_t offsets (used as nfs cookies) to
nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
becomes an 'off_t', which isn't good.

So filesystems that returned 64bit offsets would lose.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-27 09:05:14 -07:00
Mark Fasheh 40bee44eae Export __splice_from_pipe()
Ocfs2 wants to implement it's own splice write actor so that it can better
manage cluster / page locks. This lets us re-use the rest of splice write
while only providing our own code where it's actually important.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-03-27 08:55:47 +02:00
Nick Piggin 08c7259163 2/2 splice: dont readpage
Splice does not need to readpage to bring the page uptodate before writing
to it, because prepare_write will take care of that for us.

Splice is also wrong to SetPageUptodate before the page is actually uptodate.
This results in the old uninitialised memory leak. This gets fixed as a
matter of course when removing the readpage logic.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-03-27 08:55:39 +02:00
Nick Piggin 485ddb4b97 1/2 splice: dont steal
Stealing pages with splice is problematic because we cannot just insert
an uptodate page into the pagecache and hope the filesystem can take care
of it later.

We also cannot just ClearPageUptodate, then hope prepare_write does not
write anything into the page, because I don't think prepare_write gives
that guarantee.

Remove support for SPLICE_F_MOVE for now. If we really want to bring it
back, we might be able to do so with a the new filesystem buffered write
aops APIs I'm working on. If we really don't want to bring it back, then
we should decide that sooner rather than later, and remove the flag and
all the stealing infrastructure before anybody starts using it.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-03-27 08:55:08 +02:00
Sunil Mushran 2f5bf1f2d0 ocfs2_dlm: Check for migrateable lockres in dlm_empty_lockres()
In dlm_migrate_lockres(), we check upfront whether the lockres is a
candidate for migration. This patch encapsulates that code in a separate
function so that dlm_empty_lockres() can also use it during umount. This
patch addresses the umount process spinning problem.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-26 16:51:15 -07:00
Sunil Mushran 78062cb2e5 ocfs2_dlm: Fix lockres ref counting bug
During umount, the umount thread migrates the lockres' and the dlm_thread
frees the empty lockres'. Due to a race, the reference counting on the
lockres goes awry leading to extra puts.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-26 16:50:52 -07:00
Adrian Bunk d32b687e2e 9p: make struct v9fs_cached_file_operations static
This patch makes te needlessly global struct v9fs_cached_file_operations
static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-03-26 18:14:44 +00:00
Andrew Morton 105fd108a6 [PATCH] "ext[34]: EA block reference count racing fix" performance fix
A little mistake in 8a2bfdcbfa is making all
transactions synchronous, which reduces ext3 performance to comical levels.

Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-23 11:01:22 -07:00
David Howells aa289b4723 [PATCH] FDPIC: fix the /proc/pid/stat representation of executable boundaries
Fix the /proc/pid/stat representation of executable boundaries.  It should
show the bounds of the executable, but instead shows the bounds of the
loader.

Before the patch is applied, the bug can be seen by examining, say, inetd:

	# ps | grep inetd
	  610         root          0   S   /usr/sbin/inetd -i
	# cat /proc/610/maps
	c0bb0000-c0bba788 r-xs 00000000 00:0b 14582157  /lib/ld-uClibc-0.9.28.so
	c3180000-c31dede4 r-xs 00000000 00:0b 14582179  /lib/libuClibc-0.9.28.so
	c328c000-c328ea00 rw-p 00008000 00:0b 14582157  /lib/ld-uClibc-0.9.28.so
	c3290000-c329b6c0 rw-p 00000000 00:00 0
	c32a0000-c32c0000 rwxp 00000000 00:00 0
	c32d4000-c32d8000 rw-p 00000000 00:00 0
	c3394000-c3398000 rw-p 00000000 00:00 0
	c3458000-c345f464 r-xs 00000000 00:0b 16384612  /usr/sbin/inetd
	c3470000-c34748f8 rw-p 00004000 00:0b 16384612  /usr/sbin/inetd
	c34cc000-c34d0000 rw-p 00000000 00:00 0
	c34d4000-c34d8000 rw-p 00000000 00:00 0
	c34d8000-c34dc000 rw-p 00000000 00:00 0
	# cat /proc/610/stat
	610 (inetd) S 1 610 610 0 -1 256 0 0 0 0 0 8 0 0 19 0 1 0 94392000718
	950272 0 4294967295 3233480704 3233523592 3274440352 3274439976
 	3273467584 0 0 4096 90115 3221712796 0 0 17 0 0 0 0

The code boundaries are 3233480704 to 3233523592, which are:

	(gdb) p/x 3233480704
	$1 = 0xc0bb0000
	(gdb) p/x 3233523592
	$2 = 0xc0bba788

Which corresponds to this line in the maps file:

	c0bb0000-c0bba788 r-xs 00000000 00:0b 14582157  /lib/ld-uClibc-0.9.28.so

Which is wrong.  After the patch is applied, the maps file is pretty much
identical (there's some minor shuffling of the location of some of the
anonymous VMAs), but the stat file is now:

	# cat /proc/610/stat
	610 (inetd) S 1 610 610 0 -1 256 0 0 0 0 0 7 0 0 18 0 1 0 94392000722
	950272 0 4294967295 3276111872 3276141668 3274440352 3274439976
	3273467584 0 0 4096 90115 3221712796 0 0 17 0 0 0 0

The code boundaries are then 3276111872 to 3276141668, which are:

	(gdb) p/x 3276111872
	$1 = 0xc3458000
	(gdb) p/x 3276141668
	$2 = 0xc345f464

And these correspond to this line in the maps file instead:

	c3458000-c345f464 r-xs 00000000 00:0b 16384612  /usr/sbin/inetd

Which is now correct.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-23 11:01:21 -07:00
Linus Torvalds 12998096cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Allow reset of file to ATTR_NORMAL when archive bit not set
  [CIFS] Do not negotiate new POSIX_PATH_OPERATIONS_CAP yet
  [CIFS] reset mode when client notices that ATTR_READONLY is no longer set
2007-03-22 19:47:09 -07:00
Rafael J. Wysocki b43376927a [PATCH] Make XFS workqueues nonfreezable
Since freezable workqueues are broken in 2.6.21-rc
(cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
it's better to change the only user of them, which is XFS, to use "normal"
nonfreezable workqueues.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: David Chinner <dgc@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-22 19:39:06 -07:00
Steve French 066fcb06d3 [CIFS] Allow reset of file to ATTR_NORMAL when archive bit not set
When a file had a dos attribute of 0x1 (readonly - but dos attribute
of archive was not set) - doing chmod 0777 or equivalent would
try to set a dos attribute of 0 (which some servers ignore)
rather than ATTR_NORMAL (0x20) which most servers accept.
Does not affect servers which support the CIFS Unix Extensions.

Acked-by: Prasad Potluri <pvp@us.ibm.com>
Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-03-23 00:45:08 +00:00
James Bottomley d1cabd6326 [PATCH] fix process crash caused by randomisation and 64k pages
This bug was seen on ppc64, but it could have occurred on any
architecture with a page size of 64k or above.  The problem is that in
fs/binfmt_elf.c:randomize_stack_top() randomizes the stack to within
0x7ff pages.  On 4k page machines, this is 8MB; on 64k page boxes, this
is 128MB.

The problem is that the new binary layout (selected in
arch_pick_mmap_layout) places the mapping segment 128MB or the stack
rlimit away from the top of the process memory, whichever is larger.  If
you chose an rlimit of less than 128MB (most defaults are in the 8Mb
range) then you can end up having your entire stack randomized away.

The fix is to make randomize_stack_top() only steal at most 8MB, which this
patch does.  However, I have to point out that even with this, your stack
rlimit might not be exactly what you get if it's > 128MB, because you're
still losing the random offset of up to 8MB.

The true fix should be to leave an explicit gap for the randomization plus
a buffer when determining mmap_base, but that would involve fixing all the
architectures.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:06 -07:00
Johannes Berg e5d480ff17 [PATCH] change misleading EFI partition support description
Remove the misleading "Presently only useful on the IA-64 platform" text
from the EFI partition Kconfig.

EFI partitions are also used by Apple on their Intel-based machines and
thus you need EFI partition support if you (for example) want to attach
such a machine in target disk mode.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:06 -07:00
Trond Myklebust 634707388b [PATCH] nfs: nfs_getattr() can't call nfs_sync_mapping_range() for non-regular files
Looks like we need a check in nfs_getattr() for a regular file. It makes
no sense to call nfs_sync_mapping_range() on anything else. I think that
should fix your problem: it will stop the NFS client from interfering
with dirty pages on that inode's mapping.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Olof Johansson <olof@lixom.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:06 -07:00
Peter Zijlstra 89a09141df [PATCH] nfs: fix congestion control
The current NFS client congestion logic is severly broken, it marks the
backing device congested during each nfs_writepages() call but doesn't
mirror this in nfs_writepage() which makes for deadlocks.  Also it
implements its own waitqueue.

Replace this by a more regular congestion implementation that puts a cap on
the number of active writeback pages and uses the bdi congestion waitqueue.

Also always use an interruptible wait since it makes sense to be able to
SIGKILL the process even for mounts without 'intr'.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:05 -07:00
suzuki b74a2f0913 [PATCH] fix rescan_partitions to return errors properly
The only error code which comes from the partition checkers is -1, when
they finds an EIO.  As per the discussion, ENOMEM values were ignored,
as they might scare the users.

So, with the current code, we end up returning -1 and not EIO for the
ioctl() calls.  Which doesn't give any clue to the user of what went
wrong.

Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:05 -07:00
Vasily Averin 1174cf7301 [PATCH] smbfs: double free memory corruption
smbfs allocates rq_trans2buffer to handle server's multi transaction2 response
messages.  As struct smb_request may be reused, rq_trans2buffer is freed
before each new request.  However if last servers's response is not multi but
single trans2 message then new rq_trans2buffer is not allocated but last
smb_rput still tries to free it again.

To prevent this issue rq_trans2buffer pointer should be set to NULL after
kfree.

Signed-off-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:05 -07:00
Michael Halcrow b228b8e5bf [PATCH] eCryptfs: fix possible NULL ptr deref in ecryptfs_d_release()
ecryptfs_d_release() first dereferences a pointer (via
ecryptfs_dentry_to_lower()) and then afterwards checks to see if the
pointer it just dereferenced is NULL (via ecryptfs_dentry_to_private()).

This patch moves all of the work done on the dereferenced pointer inside a
block governed by the condition that the pointer is non-NULL.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:05 -07:00
Evgeniy Dushistov 0465fc0a1c [PATCH] ufs2: tindirect truncate fix
During modification of code to support UFS2 writing, the case with
"three indirect" blocks in truncate path was missed, this patch fixes
this situation.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:03 -07:00
Evgeniy Dushistov 4b25a37e20 [PATCH] ufs: zeroize the rest of block in truncate
This patch fix behaviour in such test scenario:

  lseek(fd, BIG_OFFSET)
  write(fd, buf, sizeof(buf))
  truncate(BIG_OFFSET)
  truncate(BIG_OFFSET + sizeof(buf))
  read(fd, buf...)

Because of if file big enough(BIG_OFFSET) we start allocate space by block,
ordinary block size > page size, so we should zeroize the rest of block in
truncate(except last framgnet, about which VFS should care), to not get
garbage, when we extend file.

Also patch corrects conversion from pointer to block to physical block number,
this helps in case of not common used UFS types.

And add to debug output inode number.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:03 -07:00
Evgeniy Dushistov 5431bf97ce [PATCH] ufs: prepare write + change blocks on the fly
This fixes "change blocks numbers on the fly" in case when "prepare
write page" is in the call chain, in this case some buffers may be not
uptodate and not mapped, we should care to map them and load from disk.

This patch was tested with:
 - ufs regressions simple tests
 - fsx-linux
 - ltp(20060306)
 - untar and build kernel

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:03 -07:00
Evgeniy Dushistov 2189850f42 [PATCH] ufs2: more correct work with time
This patch corrects work with time in UFS2 case.

1) According to UFS2 disk layout modification/access and so on "time"
   should be hold in two variables one 64bit for seconds and another 32bit for
   nanoseconds,

   at now for some unknown reason we suppose that "inode time" holds in
   three variables 32bit for seconds, 32bit for milliseconds and 32bit for
   nanoseconds.

2) We set amount of nanoseconds in "VFS inode" to 0 during read, instead of
   getting values from "on disk inode"(this should close
   http://bugzilla.kernel.org/show_bug.cgi?id=7991).

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Bjoern Jacke <bjoern@j3e.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-16 19:25:03 -07:00
Steve French 38e2aff670 [CIFS] Do not negotiate new POSIX_PATH_OPERATIONS_CAP yet
Samba server now expects that clients which send the new
POSIX_PATH_OPERATIONS_CAP send all opens with this new
SMB - and expects that clients that could send the new
posix open/create but don't as indicating that they really
want Windows semantics on that handle (which allows Samba
to support clients which want to support both types of
behaviors on different handles on the same mount)

We will put this capability back in the SetFSInfo
negotiation with servers like Samba when the
new POSIXCreate (create/open/mkdir) code is finished.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-03-16 05:12:53 +00:00
Alan Stern e7b0d26a86 [PATCH] sysfs: reinstate exclusion between method calls and attribute unregistration
This patch (as869) reinstates the mutual exclusion between sysfs
attribute method calls and attribute unregistration.  The
previously-reported deadlocks have been fixed, and this exclusion is
by far the simplest way to avoid races during driver unbinding.

The check for orphaned read-buffers has been moved down slightly, so
that the remainder of a partially-read buffer will still be available
to userspace even after the attribute has been unregistered.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-15 15:29:26 -07:00
Alan Stern d9a9cdfb07 [PATCH] sysfs and driver core: add callback helper, used by SCSI and S390
This patch (as868) adds a helper routine for device drivers that need
to set up a callback to perform some action in a different process's
context.  This is intended for use by attribute methods that want to
unregister themselves or their parent device.  Attribute method calls
are mutually exclusive with unregistration, so such actions cannot be
taken directly.

Two attribute methods are converted to use the new helper routine: one
for SCSI device deletion and one for System/390 ccwgroup devices.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-15 15:29:26 -07:00
Linus Torvalds 4e337adae4 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2_dlm: Add missing locks in dlm_empty_lockres
  ocfs2_dlm: Missing get/put lockres in dlm_run_purge_lockres
  configfs: add missing mutex_unlock()
  ocfs2: add some missing address space callbacks
  ocfs2: Concurrent access of o2hb_region->hr_task was not locked
  ocfs2: Proper cleanup in case of error in ocfs2_register_hb_callbacks()
2007-03-14 15:29:08 -07:00
Al Viro 82c05a13c9 [PATCH] cifs endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 15:27:50 -07:00
Al Viro a033f35a22 [PATCH] include of asm/pgtable.h in nfsfh is bogus
not needed and actually breaks build on frv, while we are at it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 15:27:49 -07:00
Al Viro 04ff97086b [PATCH] sanitize security_getprocattr() API
have it return the buffer it had allocated

Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 15:27:48 -07:00
Sunil Mushran b36c3f8498 ocfs2_dlm: Add missing locks in dlm_empty_lockres
__dlm_lockres_unused() expects the caller to take the lockres spinlock.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-14 14:37:35 -07:00
Sunil Mushran 3fca0894a4 ocfs2_dlm: Missing get/put lockres in dlm_run_purge_lockres
In some circumstances, this was causing us to reference freed memory.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-14 14:37:33 -07:00
Joel Becker afdf04ea09 configfs: add missing mutex_unlock()
d_alloc() failure in configfs_register_subsystem() would fail to unlock
the mutex taken above.  Reorganize the exit path to ensure the unlock
happens.

Reported-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-14 14:37:21 -07:00
Joel Becker 03f981cf2e ocfs2: add some missing address space callbacks
Under load, OCFS2 would crash in invalidate_inode_pages2_range() because
invalidate_complete_page2() was unable to invalidate a page.  It would
appear that JBD is holding on to the page.  ext3 has a specific
->releasepage() handler to cover this case.

Steal ext3's ->releasepage(), ->invalidatepage(), and ->migratepage(), as
they appear completely appropriate for OCFS2.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-14 14:37:16 -07:00
Joel Becker e6c352dbc0 ocfs2: Concurrent access of o2hb_region->hr_task was not locked
This means that a build-up and a teardown could race which would result in a
double-kthread_stop().

Protect the setting and clearing of hr_task with o2hb_live_lock, as it's not
a common thing and not performance critical.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-14 14:37:12 -07:00
Joel Becker c24f72cc7c ocfs2: Proper cleanup in case of error in ocfs2_register_hb_callbacks()
If ocfs2_register_hb_callbacks() succeeds on its first callback but fails
its second, it doesn't release the first on the way out. Fix that.

While we're at it, o2hb_unregister_callback() never returns anything but
0, so let's make it void.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-03-14 14:37:09 -07:00
Robert P. J. Day 3a6effe81f [JFFS2] Remove superfluous source file fs/jffs2/comprtest.c
Delete the obsolete source file fs/jffs2/comprtest.c.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-03-10 10:16:36 +01:00
Alan Tyson f5c1e2ea71 [CIFS] reset mode when client notices that ATTR_READONLY is no longer set
Signed-off-by: Alan Tyso <atyson@hp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-03-10 06:05:14 +00:00
Linus Torvalds 6e96783f58 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
  [JFFS2] print a message when marking bad block
  [JFFS2] Check for all-zero node headers
  [MTD] [OneNAND] Classify the page data and oob buffer
  [MTD] [OneNAND] Exit the loop when transferring/filling of the oob is finished
  [MTD] [OneNAND] add Nokia Copyright and a credit
  [MTD] [OneNAND] Fix typo & wrong comments
  [MTD] [OneNAND] Use oob buffer instead of main one in oob functions
  [MTD] Correct partition failed erase address
  [JFFS2] Use yield() between GC passes in background thread.
  [MTD] [NAND] Correct misspelled preprocessor variable.
  [MTD] [MAPS] dilnetpc: Fix printk warning
  [MTD] [NOR] Fix oops in cfi_amdstd_sync
  [MTD] ESB2 check for closed ROM window
  [JFFS2] Fix writebuffer recovery in the first page of a block
  [MTD] [NAND] make oobavail public
2007-03-09 10:34:55 -08:00
Artem Bityutskiy 0feba829ee [JFFS2] print a message when marking bad block
New bad eraseblock is an event which is important enough to be printed
about.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-03-09 12:29:39 +00:00
David Woodhouse c7258a4477 [JFFS2] Check for all-zero node headers
Due to a poor choice of CRC32 seed, a node header which is all zeroes
would pass the CRC32 check. Explicitly check for this case, and treat it
as we do a CRC failure.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-03-09 11:44:00 +00:00
Peter Zijlstra 908e0a8a26 [PATCH] ecryptfs: nested locking annotation
ecryptfs uses a lock_parent() function, which I hope really locks the parents
and is not abused

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 07:39:16 -08:00
suzuki 9bebff6ca5 [PATCH] check_partition(): fix error check
Fix inverted check introduced in 57881dd9df "Fix
check_partition routines".

Signed-off-by: Suzuki K P <suzuki@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 07:38:22 -08:00
Davide Libenzi f6dfb4fd7d [PATCH] Add epoll compat_ code to fs/compat.c
IA64 and ARM-OABI are currently using their own version of epoll compat_
code.

An architecture needs epoll_event translation if alignof(u64) in 32 bit
mode is different from alignof(u64) in 64 bit mode.  If an architecture
needs epoll_event translation, it must define struct compat_epoll_event in
asm/compat.h and set CONFIG_HAVE_COMPAT_EPOLL_EVENT and use
compat_sys_epoll_ctl and compat_sys_epoll_wait.

All 64 bit architecture should use compat_sys_epoll_pwait.

[sfr: restructure and move to fs/compat.c, remove MIPS version
of compat_sys_epoll_pwait, use __put_user_unaligned]

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 07:38:22 -08:00
Paolo 'Blaisorblade' Giarrusso a6eb0be6d5 [PATCH] uml: hostfs: make hostfs= option work as a jail, as intended.
When a given host directory is specified to be mounted both in hostfs=path1
and with mount option -o path2, we should give access to path1/path2, but this
does not happen.  Fix that in the simpler way.

Also, root_ino can be the empty string, since we use %s/%s as format.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 07:38:21 -08:00
Paolo 'Blaisorblade' Giarrusso bca271136f [PATCH] uml: hostfs: fix double free
Fix double free in the error path - when name is assigned into root_inode we
do not own it any more and we must not kfree() it - see patch for details.

Thanks to William Stearns for the initial report.

CC: William Stearns <wstearns@pobox.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-08 07:38:21 -08:00
David Woodhouse f8a922c7bb [JFFS2] Use yield() between GC passes in background thread.
The garbage collection thread is strictly an optimisation. Everything it
does would also be done just-in-time in the context of something in
userspace trying to access the file system.

Sometimes, however, it's a pessimisation. Especially during early boot
when it's checksumming nodes and scanning inodes which are shortly going
to be pulled in by read_inode anyway. We end up building the rbtree of
node coverage twice for the same inode.

By switching to yield() instead of cond_resched() in the main loop, we
observe boot times on the OLPC system going down from about 100 seconds to
60.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-03-08 10:28:30 +00:00
Vitaly Wool 180bfb31fe [JFFS2] Fix writebuffer recovery in the first page of a block
For the case when nand_write_page fail with -EIO for the first page in an
eraseblock, jffs2_wbuf_recover ends up producing a BUG in jffs2_block_refile
as jeb->first_node is not yet set up (it's set up later in jffs2_wbuf_recover).
This BUG is not really a bug; it's just jffs2_wbuf_recover calling
jffs2_block_refile with the wrong second parameter.
This patch takes care of this situation.

Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-03-08 09:18:31 +00:00
Steven Whitehouse c3f49bc209 [GFS2] Fix bz 229873, alternate test: assertion "!ip->i_inode.i_mapping->nrpages" failed
The following removes an incorrect assertion from the GFS2 glops code. This
fixes Red Hat bz 229873. Thanks to Abhijith Das for testing the patch
and confirming the fix.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>
2007-03-07 14:03:53 -05:00
akpm@linux-foundation.org 95d97b7dd7 [GFS2] build fix
fs/gfs2/glock.c:2198: error: 'THIS_MODULE' undeclared here (not in a function)

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-03-07 14:03:25 -05:00
Steven Whitehouse 631c42e170 [GFS2] go_drop_bh is never used, so remove it
The ->go_drop_bh function is never used, so this removes it and the single
caller,

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:02:53 -05:00
Steven Whitehouse 04b159b132 [GFS2] Remove unused variable
Remove an unused variable.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:02:30 -05:00
Steven Whitehouse 1be3867955 [GFS2] Fix bz 229831, lookup returns wrong inode
The following patch fixes Red Hat bz 229831. Without this patch its
possible for the wrong inode to be returned in certain cases. It is a
pretty unusual event, so that its taken some time to track down. Thanks
and due to Josef Whiter who did a lot of the testing required to thrack
this down and fix it.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:01:53 -05:00
Steven Whitehouse cad5b93927 [GFS2] Fix bz 230143, incorrect flushing of rgrps
The below patch fixes a problem where we were not flushing rgrps
correctly. It only occurred in the specific case that a callback was
received for an rgrp which was dirty and when a journal log flush had
not already resulted in the rgrp being flushed anyway. This fixes Red
Hat bz 230143,

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 14:00:14 -05:00
Wendy Cheng fb0d3bce8e [GFS2] pass formal ino in do_filldir_main
ok, the following is the minimum changes to get NFSD going before we
settle down this issue .. would appreciate this in the tree so other NFS
related works can get done in parallel.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:58:45 -05:00
Adrian Bunk 84c6e8cd35 [DLM] fs/dlm/user.c should #include "user.h"
Every file should include the headers containing the prototypes for
it's global functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:58:21 -05:00
Josef Whiter a13cbe3753 [GFS2] fix hangup when multiple processes are trying to write to the same file
This fixes a problem I encountered while running bonnie++.  When you have one
thread that opens a file and starts to write to it, and then another thread that
tries to open and write to the same file, the second thread will loop forever
trying to grab the inode lock for that inode.  Basically we come in through
generic_buffered_file_write, which calls gfs2_prepare_write, which then attempts
to grab the glock.  Because we don't own the lock, gfs2_prepare_write gets
GLR_TRYFAILED, which returns AOP_TRUNCATED_PAGE to generic_buffered_file_write.
At this point generic_buffered_file_write loops around again and immediately
retries the prepare_write.  This means that the second process never gets off of
the processor in order to allow the process that holds the lock to finish its
work and let go of the lock.  This patch makes gfs2_glock_nq schedule() if it
gets back a GLR_TRYFAILED, which resolves this problem.

Signed-off-by: Josef Whiter <jwhiter@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:58:02 -05:00
Wendy Cheng a7d2b2bdc9 [GFS2] NFS filehandle check
File handle checking error found in '07 NFS connectathon. The fh_type
and fh_len are not necessarily identical. Some of the client machines
could fail mount with stale filehandle without this patch.

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:57:34 -05:00
Richard Fearn d5a6751b32 [GFS2] add newline to printk message
Patch for the 2.6.20 stable tree that adds a missing newline to one of
the printk messages in fs/gfs2/ops_fstype.c.

Signed-off-by: Richard Fearn <richardfearn@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:57:10 -05:00
Josef Whiter 2e95b6653b [GFS2] fix locking mistake
This patch fixes a locking mistake in the quota code, we do a mutex_lock instead
of a mutex_unlock.

Signed-off-by: Josef Whiter <jwhiter@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07 13:56:41 -05:00
Hugh Dickins 266d4f4037 [PATCH] suspend regression: sysfs deadlock
Suspend deadlocks when trying to unregister /sys/block/sr0.

This comes from Oliver's commit 94bebf4d1b
"Driver core: fix race in sysfs between sysfs_remove_file() and
read()/write()".

sysfs_write_file downs buffer->sem while calling flush_write_buffer, and
flushing that particular write buffer entails downing buffer->sem in
orphan_all_buffers, resulting in the obvious self-deadlock.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 17:59:14 -08:00
Linus Torvalds 9f6632d629 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] cifs_prepare_write was incorrectly rereading page in some cases
  [CIFS] Fix set file size to zero when doing chmod to Samba 3.0.26pre
  [CIFS] Remove some unused functions/declarations
  [CIFS] New file for previous commit
  [CIFS] cifs export operations
  [CIFS] small piece missing from previous patch
  [CIFS] Fix locking problem around some cifs uses of i_size write
2007-03-06 17:32:22 -08:00
Linus Torvalds 8328258e74 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
  sdhci: release irq during suspend
  sdhci: make isr tolerant of read errors
  mmc: require explicit support for high-speed
  ncpfs: make sure server connection survives a kill
2007-03-06 17:31:29 -08:00
Dave Kleikamp 57bf63d69c [PATCH] fs: nobh_truncate_page() fix
This fixes a regression caused by 22c8ca78f2.

nobh_prepare_write() no longer marks the page uptodate, so
nobh_truncate_page() needs to do it.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 09:30:25 -08:00
Mark Lord 0de1517e23 [PATCH] Fix 2.6.21 rfcomm lockups
Any attempt to open/use a bluetooth rfcomm device locks up
scheduling completely on my machine.

Interrupts (ping, alt-sysrq) seem to be alive, but nothing else.

This was working fine in 2.6.20, broken now in 2.6.21-rc2-git*

Reverting this change (below) fixes it:

| author    Marcel Holtmann <marcel@holtmann.org>
|      Sat, 17 Feb 2007 22:58:57 +0000 (23:58 +0100)
| committer    David S. Miller <davem@sunset.davemloft.net>
|      Mon, 26 Feb 2007 19:42:41 +0000 (11:42 -0800)
| commit    c1a3313698
| tree    337a876f72    tree | snapshot
| parent    f5ffd4620a    commit | diff
| | [Bluetooth] Make use of device_move() for RFCOMM TTY devices
| | In the case of bound RFCOMM TTY devices the parent is not available
| before its usage. So when opening a RFCOMM TTY device, move it to
| the corresponding ACL device as a child. When closing the device,
| move it back to the virtual device tree.
| Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

The simplest fix for this bug is to prevent sysfs_move_dir()
from self-deadlocking when (old_parent == new_parent).

This patch prevents total system lockup when using rfcomm devices.

Signed-off-by:  Mark Lord <mlord@pobox.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-06 09:30:24 -08:00
Pierre Ossman c5f93cf19d ncpfs: make sure server connection survives a kill
Use internal buffers instead of the ones supplied by the caller
so that a caller can be interrupted without having to abort the
entire ncp connection.

Signed-off-by: Pierre Ossman <ossman@cendio.se>
Acked-by: Petr Vandrovec <petr@vandrovec.name>
2007-03-06 13:26:27 +01:00
Steve French 8a236264f7 [CIFS] cifs_prepare_write was incorrectly rereading page in some cases
Noticed by Shaggy.

Signed-off-by: Shaggy <shaggy@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-03-06 00:31:00 +00:00
Dmitriy Monakhov a8fa74ab52 [PATCH] ecryptfs: handle AOP_TRUNCATED_PAGE better
- In fact we don't have to fail if AOP_TRUNCATED_PAGE was returned from
  prepare_write or commit_write. It is beter to retry attempt where it
  is possible.

- Rearange ecryptfs_get_lower_page() error handling logic, make it more clean.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Dmitriy Monakhov 82b1652840 [PATCH] ecryptfs: lower root result must be adirectory
- Currently after path_lookup succeed we dot't have any guarantie what
  it is DIR. This must be explicitly demanded.
- path_lookup can't return negative dentry, So inode check is useless.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Hugh Dickins 759b9775c2 [PATCH] shmem and simple const super_operations
shmem's super_operations were missed from the recent const-ification;
and simple_fill_super()'s, which can share with get_sb_pseudo()'s.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
Dmitriy Monakhov ad5f119679 [PATCH] ecryptfs: check xattr operation support fix
- ecryptfs_write_inode_size_to_metadata() error code was ignored.
  - i_op->setxattr() must be supported by lower fs because used below.

Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Mingming Cao 8a2bfdcbfa [PATCH] ext[34]: EA block reference count racing fix
There are race issues around ext[34] xattr block release code.

ext[34]_xattr_release_block() checks the reference count of xattr block
(h_refcount) and frees that xattr block if it is the last one reference it.
 Unlike ext2, the check of this counter is unprotected by any lock.
ext[34]_xattr_release_block() will free the mb_cache entry before freeing
that xattr block.  There is a small window between the check for the re
h_refcount ==1 and the call to mb_cache_entry_free().  During this small
window another inode might find this xattr block from the mbcache and reuse
it, racing a refcount updates.  The xattr block will later be freed by the
first inode without notice other inode is still use it.  Later if that
block is reallocated as a datablock for other file, then more serious
problem might happen.

We need put a lock around places checking the refount as well to avoid
racing issue.  Another place need this kind of protection is in
ext3_xattr_block_set(), where it will modify the xattr block content in-
the-fly if the refcount is 1 (means it's the only inode reference it).

This will also fix another issue: the xattr block may not get freed at all
if no lock is to protect the refcount check at the release time.  It is
possible that the last two inodes could release the shared xattr block at
the same time.  But both of them think they are not the last one so only
decreased the h_refcount without freeing xattr block at all.

We need to call lock_buffer() after ext3_journal_get_write_access() to
avoid deadlock (because the later will call lock_buffer()/unlock_buffer
() as well).

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-01 14:53:38 -08:00
Michael Halcrow 65dc814571 [PATCH] eCryptfs: no path_release() after path_lookup() error
Dmitriy Monakhov wrote:
> if path_lookup() return non zero code we don't have to worry about
> 'nd' parameter, but ecryptfs_read_super does path_release(&nd) after
> path_lookup has failed, and dentry counter becomes negative

Do not do a path_release after a path_lookup error.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-01 14:53:38 -08:00
Michael Halcrow 1ed6d896de [PATCH] eCryptfs: remove unnecessary flush_dcache_page()
Remove unnecessary flush_dcache_page() call. Thanks to Dmitriy
Monakhov for pointing this out.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-01 14:53:38 -08:00
Michael Halcrow a8d547d5cf [PATCH] eCryptfs: set O_LARGEFILE when opening lower file
O_LARGEFILE should be set here when opening the lower file.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-01 14:53:37 -08:00
Michael Halcrow ae73fc093a [PATCH] eCryptfs: resolve lower page unlocking problem
eCryptfs lower file handling code has several issues:
  - Retval from prepare_write()/commit_write() wasn't checked to equality
    to AOP_TRUNCATED_PAGE.
  - In some places page wasn't unmapped and unlocked after error.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-01 14:53:37 -08:00
Steve French c7af1857ef [CIFS] Fix set file size to zero when doing chmod to Samba 3.0.26pre
In fixing a bug Samba 3.0.26pre allowed some clients (including Linux cifs
client) to change file size to zero in SET_FILE_UNIX_BASIC (which Linux cifs
client uses for chmod).

The server has been "fixed" now but that also fixes the client to net send
file size zero on chmod.

Fixes Samba bugzilla bug # 4418.

Fixed with help from Jeremy Allison

Signed-off-by: Jeremy Allison <jra@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-03-01 04:11:22 +00:00
Steve French 99ee4dbd7c [CIFS] Remove some unused functions/declarations
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-27 05:35:17 +00:00
Steve French 1ae1bc44d4 [CIFS] New file for previous commit
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-27 05:16:30 +00:00
Steve French 35c11fdda7 [CIFS] cifs export operations
For nfsd to work over cifs mounts (which presumably makes sense when trying
to reexport mounts to windows, network appliances or Samba servers to nfs
clients via nfs server).

This is the first stage of that enablement, marked experimental and turned
off by default.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-27 05:09:35 +00:00
Steve French ba6a46a03f [CIFS] small piece missing from previous patch
There were two i_size_writes in the new truncate
function - we missed one in the last patch.
Noticed by Shaggy when he reviewed.

Thank you Shaggy ...

CC: Shaggy <shaggy@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-26 20:06:29 +00:00
Linus Torvalds 60f29b1e16 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
  JFS: Get rid of "may be used uninitialized" warnings
2007-02-26 11:44:51 -08:00
Linus Torvalds a7538a7f87 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  Revert "Driver core: let request_module() send a /sys/modules/kmod/-uevent"
  Driver core: fix error by cleanup up symlinks properly
  make kernel/kmod.c:kmod_mk static
  power management: fix struct layout and docs
  power management: no valid states w/o pm_ops
  Driver core: more fallout from class_device changes for pcmcia
  sysfs: move struct sysfs_dirent to private header
  driver core: refcounting fix
  Driver core: remove class_device_rename
2007-02-26 11:41:30 -08:00
Steve French 3677db10a6 [CIFS] Fix locking problem around some cifs uses of i_size write
Could cause hangs on smp systems in i_size_read on a cifs inode
whose size has been previously simultaneously updated from
different processes.

Thanks to Brian Wang for some great testing/debugging on this
hard problem.

Fixes kernel bugzilla #7903

CC: Shirish Pargoankar <shirishp@us.ibm.com>
CC: Shaggy <shaggy@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-26 16:46:11 +00:00
Alan Stern dfa87c824a sysfs: allow attributes to be added to groups
This patch (as860) adds two new sysfs routines:
sysfs_add_file_to_group() and sysfs_remove_file_from_group().
A later patch adds code that uses the new routines.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-23 15:03:46 -08:00
Adam J. Richter d56c3eae67 sysfs: move struct sysfs_dirent to private header
struct sysfs_dirent is private to the fs/sysfs/ subtree.  It is
not even referenced as an opaque structure outside of that subtree.

The following patch moves the declaration from include/linux/sysfs.h to
fs/sysfs/sysfs.h, making it clearer that nothing else in the kernel
dereferences it.

I have been running this patch for years.  Please integrate and forward
upstream if there are no objections.

From: "Adam J. Richter" <adam@yggdrasil.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-23 14:52:09 -08:00
Linus Torvalds 9654640d0a Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] One line missing from previous commit
  [CIFS] mtime bounces from local to remote when cifs nocmtime i_flags overwritten
  [CIFS] fix &&/& typo in cifs_setattr()
2007-02-21 13:02:17 -08:00
Peter Zijlstra 6d740cd5b1 [PATCH] lockdep: annotate BLKPG_DEL_PARTITION
>=============================================
>[ INFO: possible recursive locking detected ]
>2.6.19-1.2909.fc7 #1
>---------------------------------------------
>anaconda/587 is trying to acquire lock:
> (&bdev->bd_mutex){--..}, at: [<c05fb380>] mutex_lock+0x21/0x24
>
>but task is already holding lock:
> (&bdev->bd_mutex){--..}, at: [<c05fb380>] mutex_lock+0x21/0x24
>
>other info that might help us debug this:
>1 lock held by anaconda/587:
> #0:  (&bdev->bd_mutex){--..}, at: [<c05fb380>] mutex_lock+0x21/0x24
>
>stack backtrace:
> [<c0405812>] show_trace_log_lvl+0x1a/0x2f
> [<c0405db2>] show_trace+0x12/0x14
> [<c0405e36>] dump_stack+0x16/0x18
> [<c043bd84>] __lock_acquire+0x116/0xa09
> [<c043c960>] lock_acquire+0x56/0x6f
> [<c05fb1fa>] __mutex_lock_slowpath+0xe5/0x24a
> [<c05fb380>] mutex_lock+0x21/0x24
> [<c04d82fb>] blkdev_ioctl+0x600/0x76d
> [<c04946b1>] block_ioctl+0x1b/0x1f
> [<c047ed5a>] do_ioctl+0x22/0x68
> [<c047eff2>] vfs_ioctl+0x252/0x265
> [<c047f04e>] sys_ioctl+0x49/0x63
> [<c0404070>] syscall_call+0x7/0xb

Annotate BLKPG_DEL_PARTITION's bd_mutex locking and add a little comment
clarifying the bd_mutex locking, because I confused myself and initially
thought the lock order was wrong too.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:16 -08:00
Glauber de Oliveira Costa 63967fa911 [PATCH] Missing __user in pointer referenced within copy_from_user
Pointers to user data should be marked with a __user hint.  This one is
missing.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:15 -08:00
Christoph Hellwig 2be3c79046 [PATCH] affs: implement ->drop_inode
affs wants to truncate the inode when the last user goes away, currently it
does that through a potentially racy i_count check in ->put_inode.  But we
already have a method that's called just after the we dropped the last
reference, ->drop_inode.  This patch implements affs_drop_inode to take
advantage of this.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:15 -08:00
Ian Kent c9ffec4848 [PATCH] autofs4: check for directory re-create in lookup
This problem was identified and fixed some time ago by Jeff Moyer but it fell
through the cracks somehow.

It is possible that a user space application could remove and re-create a
directory during a request.  To avoid returning a failure from lookup
incorrectly when our current dentry is unhashed we need to check if another
positive, hashed dentry matching this one exists and if so return it instead
of a fail.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:15 -08:00
Ian Kent f50b6f8691 [PATCH] autofs4: fix another race between mount and expire
Jeff Moyer has identified a race between mount and expire.

What happens is that during an expire the situation can arise that a directory
is removed and another lookup is done before the expire issues a completion
status to the kernel module.  In this case, since the the lookup gets a new
dentry, it doesn't know that there is an expire in progress and when it posts
its mount request, matches the existing expire request and waits for its
completion.  ENOENT is then returned to user space from lookup (as the dentry
passed in is now unhashed) without having performed the mount request.

The solution used here is to keep track of dentrys in this unhashed state and
reuse them, if possible, in order to preserve the flags.  Additionally, this
infrastructure will provide the framework for the reintroduction of caching of
mount fails removed earlier in development.

Signed-off-by: Ian Kent <raven@themaw.net>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:15 -08:00
Ian Kent e8514478f6 [PATCH] autofs4: header file update
The current header file definitions for autofs version 5 have caused a couple
of problems for application builds downstream.

This fixes the problem by separating the definitions.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:15 -08:00
Nick Piggin 22c8ca78f2 [PATCH] fs: fix nobh data leak
nobh_prepare_write leaks data similarly to how simple_prepare_write did. Fix
by not marking the page uptodate until nobh_commit_write time. Again, this
could break weird use-cases, but none appear to exist in the tree.

We can safely remove the set_page_dirty, because as the comment says,
nobh_commit_write does set_page_dirty. If a filesystem wants to allocate
backing store for a page dirtied via mmap, page_mkwrite is the suggested
approach.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:15 -08:00
Nick Piggin 955eff5acc [PATCH] fs: fix libfs data leak
simple_prepare_write leaks uninitialised kernel data.  This happens because
the it leaves an uninitialised "hole" over the part of the page that the
write is expected to go to.  This is fine, but it then marks the page
uptodate, which means a concurrent read can come in and copy the
uninitialised memory into userspace before it written to.

Fix it by simply marking it uptodate in simple_commit_write instead, after
the hole has been filled in.  This could theoretically break an fs that
uses simple_prepare_write and not simple_commit_write, and that relies on
the incorrect simple_prepare_write behaviour.  Luckily, none of those
exists in the tree.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:15 -08:00
Aneesh Kumar K.V e627432c29 [PATCH] ext[234]: update documentation
Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:14 -08:00
OGAWA Hirofumi 94412a96c4 [PATCH] FAT: DIO-write fallback to normal buffered
If the DIO write on FAT is expanding the size, it will be fail by -EINVAL,
because FAT can't handle it now.

This patch fallback it to the normal buffered-write and would return
success.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:14 -08:00
Nick Piggin ffda9d3022 [PATCH] fs: fix __block_write_full_page error case buffer submission
Andrew noticed that unlocking the page before submitting all buffers for
writeout could cause problems if the IO completes before we've finished
messing around with the page buffers, and they subsequently get freed.

Even if there were no bug, it is a good idea to bring the error case
into line with the common case here.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:13 -08:00
Andrew Morton b446b60e4e [PATCH] rework reserved major handling
Several people have reported failures in dynamic major device number handling
due to the recent changes in there to avoid handing out the local/experimental
majors.

Rolf reports that this is due to a gcc-4.1.0 bug.

The patch refactors that code a lot in an attempt to provoke the compiler into
behaving.

Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:13 -08:00
Andrew Morton 5085b607fb [PATCH] xfs warning fix
fs/xfs/linux-2.6/xfs_super.c:903: warning: 'noinline' attribute ignored

Cc: David Chinner <dgc@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-20 17:10:13 -08:00
Greg Banks c9ce228306 [PATCH] Fix a free-wrong-pointer bug in nfs/acl server.
Due to type confusion, when an nfsacl verison 2 'ACCESS' request
finishes and tries to clean up, it calls fh_put on entiredly the
wrong thing and this can cause an oops.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-19 16:13:28 -08:00
Erez Zadok a6e6df25ec [PATCH] fs/stack.c: Copy i_nlink after all other attributes are copied
A user-specified get_nlinks may depend on other inode attributes.

Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-19 14:21:50 -08:00
Linus Torvalds 4935361766 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (49 commits)
  [MTD] [NAND] S3C2412 fix hw ecc
  [MTD] [NAND] Work around false compiler warning in CAFÉ driver
  [JFFS2] printk warning fixes
  [MTD] [MAPS] ichxrom warning fix
  [MTD] [MAPS] amd76xrom warning fix
  [MTD] [MAPS] esb2rom warning fixes
  [MTD] [MAPS] ck804xrom warning fix
  [MTD] [MAPS] netsc520 warning fix
  [MTD] [MAPS] sc520cdp warning fix
  [MTD] [ONENAND] onenand_base warning fix
  [MTD] [NAND] eXcite nand flash driver
  [MTD] Improve heuristic for detecting wrong-endian RedBoot partition table
  [MTD] Fix RedBoot partition parsing regression harder.
  [MTD] [NAND] S3C2410: Hardware ECC correction code
  [JFFS2] Use MTD_OOB_AUTO to automatically place cleanmarker on NAND
  [MTD] Clarify OOB-operation interface comments
  [MTD] remove unused ecctype,eccsize fields from struct mtd_info
  [MTD] [NOR] Intel: remove ugly PROGREGION macros
  [MTD] [NOR] STAA: use writesize instead off eccsize to represent ECC block
  [MTD] OneNAND: Invalidate bufferRAM after erase
  ...
2007-02-19 13:34:11 -08:00
Linus Torvalds 2874b391bd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: implement optional loose read cache
  9p: Use kthread_stop instead of sending a SIGKILL.
2007-02-19 13:33:01 -08:00
Linus Torvalds cb4aaf46c0 Merge branch 'audit.b37' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b37' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] AUDIT_FD_PAIR
  [PATCH] audit config lockdown
  [PATCH] minor update to rule add/delete messages (ver 2)
2007-02-19 13:29:54 -08:00
Linus Torvalds 874ff01bd9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
  Documentation/kernel-docs.txt update.
  arch/cris: typo in KERN_INFO
  Storage class should be before const qualifier
  kernel/printk.c: comment fix
  update I/O sched Kconfig help texts - CFQ is now default, not AS.
  Remove duplicate listing of Cris arch from README
  kbuild: more doc. cleanups
  doc: make doc. for maxcpus= more visible
  drivers/net/eexpress.c: remove duplicate comment
  add a help text for BLK_DEV_GENERIC
  correct a dead URL in the IP_MULTICAST help text
  fix the BAYCOM_SER_HDX help text
  fix SCSI_SCAN_ASYNC help text
  trivial documentation patch for platform.txt
  Fix typos concerning hierarchy
  Fix comment typo "spin_lock_irqrestore".
  Fix misspellings of "agressive".
  drivers/scsi/a100u2w.c: trivial typo patch
  Correct trivial typo in log2.h.
  Remove useless FIND_FIRST_BIT() macro from cardbus.c.
  ...
2007-02-19 13:29:02 -08:00
Linus Torvalds ebbe46f73a Merge branch 'kill-jffs' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6
* 'kill-jffs' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  Remove JFFS (version 1), as scheduled.
2007-02-19 13:25:36 -08:00
Linus Torvalds 733abe4fff Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
  PCI: Make PCI device numa-node attribute visible in sysfs
  PCI: add systems for automatic breadth-first device sorting
  PCI: PCI devices get assigned redundant IRQs
  PCI: Make CARDBUS_MEM_SIZE and CARDBUS_IO_SIZE boot options
  PCI: pci.txt fix __devexit() usage
  PCI/sysfs/kobject kernel-doc fixes
2007-02-19 12:59:55 -08:00
Andrew Morton 7be26bfb2e [JFFS2] printk warning fixes
fs/jffs2/wbuf.c: In function 'jffs2_check_oob_empty':
fs/jffs2/wbuf.c:993: warning: format '%d' expects type 'int', but argument 3 has type 'size_t'
fs/jffs2/wbuf.c:993: warning: format '%d' expects type 'int', but argument 4 has type 'size_t'
fs/jffs2/wbuf.c: In function 'jffs2_check_nand_cleanmarker':
fs/jffs2/wbuf.c:1036: warning: format '%d' expects type 'int', but argument 3 has type 'size_t'
fs/jffs2/wbuf.c:1036: warning: format '%d' expects type 'int', but argument 4 has type 'size_t'
fs/jffs2/wbuf.c: In function 'jffs2_write_nand_cleanmarker':
fs/jffs2/wbuf.c:1062: warning: format '%d' expects type 'int', but argument 3 has type 'size_t'
fs/jffs2/wbuf.c:1062: warning: format '%d' expects type 'int', but argument 4 has type 'size_t'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-02-18 16:43:14 +00:00
Eric Van Hensbergen e03abc0c96 9p: implement optional loose read cache
While cacheing is generally frowned upon in the 9p world, it has its
place -- particularly in situations where the remote file system is
exclusive and/or read-only.  The vacfs views of venti content addressable
store are a real-world instance of such a situation.  To facilitate higher
performance for these workloads (and eventually use the fscache patches),
we have enabled a "loose" cache mode which does not attempt to maintain
any form of consistency on the page-cache or dcache.  This results in over
two orders of magnitude performance improvement for cacheable block reads
in the Bonnie benchmark.  The more aggressive use of the dcache also seems
to improve metadata operational performance.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-02-18 10:16:10 -06:00
Eric W. Biederman 2c0463a9ae 9p: Use kthread_stop instead of sending a SIGKILL.
Since the kthread api does not bump the reference count on
processes that tracked it is not safe allow user space to
kill the threads, as I still retain a pointer to the task_struct.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-02-18 10:16:10 -06:00
Al Viro db3495099d [PATCH] AUDIT_FD_PAIR
Provide an audit record of the descriptor pair returned by pipe() and
socketpair().  Rewritten from the original posted to linux-audit by
John D. Ramsdell <ramsdell@mitre.org>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-02-17 21:30:15 -05:00
Jeff Garzik 419ee448ff Remove JFFS (version 1), as scheduled.
Unmaintained for years, few if any users.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-17 16:10:59 -05:00
Tobias Klauser c5a69d57eb Storage class should be before const qualifier
The C99 specification states in section 6.11.5:

The placement of a storage-class specifier other than at the
beginning of the declaration specifiers in a declaration is an
obsolescent feature.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-17 20:11:19 +01:00
Uwe Kleine-König 1b3c3714cb Fix typos concerning hierarchy
heirarchical, hierachical -> hierarchical
        heirarchy, hierachy -> hierarchy

Signed-off-by: Uwe Kleine-König <zeisberg@informatik.uni-freiburg.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-17 19:23:03 +01:00
Robert P. J. Day bbf2f9fb1c Fix misspellings of "agressive".
Fix the various misspellings of "agressive", as well as a couple
other things on the same lines while we're there.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-17 19:20:16 +01:00
Robert P. J. Day 405ae7d381 Replace remaining references to "driverfs" with "sysfs".
Globally, s/driverfs/sysfs/g.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-17 19:13:42 +01:00
Steve French 004c46b9e5 [CIFS] One line missing from previous commit
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-17 04:34:13 +00:00
Steve French 1b2b212603 [CIFS] mtime bounces from local to remote when cifs nocmtime i_flags overwritten
atime flag was also overwritten. Noticed by Shirish when he was debugging
an atime problem.  Should help performance a bit too.

cifs should be getting time stamps from the server (that was the original
intent too)

Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-17 04:30:54 +00:00
Randy Dunlap f95d882d81 PCI/sysfs/kobject kernel-doc fixes
Fix kernel-doc warnings in PCI, sysfs, and kobject files.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-16 15:30:10 -08:00
Cornelia Huck 873760fbf4 debugfs: Remove misleading comments.
Just mention which error will be returned if debugfs is disabled. Callers
should be able to figure out themselves what they need to check.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-16 15:19:17 -08:00
Peter Oberparleiter 66f5496393 debugfs: implement symbolic links
debugfs: implement symbolic links

Implement a new function debugfs_create_symlink() which can be used
to create symbolic links in debugfs. This function can be useful
for people moving functionality from /proc to debugfs (e.g. the
gcov-kernel patch).

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-16 15:19:17 -08:00
Mariusz Kozlowski b92be9f1ec Driver: remove redundant kobject_unregister checks
Here is a patch that removes all redundant kobject_unregister argument checks.

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-16 15:19:17 -08:00
Thomas Hisch 008983d966 [PATCH] ecryptfs: fix forgotten format specifier
Add format specifier %d for uid in ecryptfs_printk

Signed-off-by: Thomas Hisch <t.hisch@gmail.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
Michael Halcrow eb95e7ffa5 [PATCH] eCryptfs: Reduce stack usage in ecryptfs_generate_key_packet_set()
eCryptfs is gobbling a lot of stack in ecryptfs_generate_key_packet_set()
because it allocates a temporary memory-hungry ecryptfs_key_record struct.
This patch introduces a new kmem_cache for that struct and converts
ecryptfs_generate_key_packet_set() to use it.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields 3160a711ef [PATCH] knfsd: nfsd4: fix handling of directories without default ACLs
When setting an ACL that lacks inheritable ACEs on a directory, we should set
a default ACL of zero length, not a default ACL with all bits denied.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields bec50c47aa [PATCH] knfsd: nfsd4: acls: avoid unnecessary denies
We're inserting deny's between some ACEs in order to enforce posix draft acl
semantics which prevent permissions from accumulating across entries in an
acl.

That's fine, but we're doing that by inserting a deny after *every* allow,
which is overkill.  We shouldn't be adding them in places where they actually
make no difference.

Also replaced some helper functions for creating acl entries; I prefer just
assigning directly to the struct fields--it takes a few more lines, but the
field names provide some documentation that I think makes the result easier
understand.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields f43daf6787 [PATCH] knfsd: nfsd4: acls: don't return explicit mask
Return just the effective permissions, and forget about the mask.  It isn't
worth the complexity.

WARNING: This breaks backwards compatibility with overly-picky nfsv4->posix
acl translation, as may has been included in some patched versions of libacl.
To our knowledge no such version was every distributed by anyone outside citi.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields f34f924274 [PATCH] knfsd: nfsd4: fix error return on unsupported acl
We should be returning ATTRNOTSUPP, not NOTSUPP, when acls are unsupported.

Also fix a comment.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields a4db5fe5df [PATCH] knfsd: nfsd4: fix memory leak on kmalloc failure in savemem
The wrong pointer is being kfree'd in savemem() when defer_free returns with
an error.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields 28e05dd845 [PATCH] knfsd: nfsd4: represent nfsv4 acl with array instead of linked list
Simplify the memory management and code a bit by representing acls with an
array instead of a linked list.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields 575a6290f0 [PATCH] knfsd: nfsd4: simplify nfsv4->posix translation
The code that splits an incoming nfsv4 ACL into inheritable and effective
parts can be combined with the the code that translates each to a posix acl,
resulting in simpler code that requires one less pass through the ACL.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields 7bdfa68c5e [PATCH] knfsd: nfsd4: relax checking of ACL inheritance bits
The rfc allows us to be more permissive about the ACL inheritance bits we
accept:

	"If the server supports a single "inherit ACE" flag that applies to
	both files and directories, the server may reject the request
	(i.e., requiring the client to set both the file and directory
	inheritance flags). The server may also accept the request and
	silently turn on the ACE4_DIRECTORY_INHERIT_ACE flag."

Let's take the latter option--the ACL is a complex attribute that could be
rejected for a wide variety of reasons, and the protocol gives us little
ability to explain the reason for the rejection, so erroring out is a
user-unfriendly last resort.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
J. Bruce Fields f534a257ac [PATCH] knfsd: nfsd4: fix non-terminated string
The server name is expected to be a null-terminated string, so we can't pass
in the raw client identifier.

What's more, the client identifier is just a binary, not necessarily
printable, blob.  Let's just use the ip address instead.  The server name
appears to exist just to help debugging by making some printk's more
informative.

Note that the string is copies into the rpc client structure, so the pointer
to the local variable does not outlive the function call.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:14:01 -08:00
Dmitriy Monakhov beb497ab48 [PATCH] __page_symlink retry loop error code fix
If prepare_write or commit_write return AOP_TRUNCATED_PAGE we jump to
"retry" label and than if find_or_create_page() failed function return
incorrect error code.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Steve French c14e894bd4 [CIFS] fix &&/& typo in cifs_setattr()
Thanks to Dirk for pointing this out.

Signed-off-by: Dirk Mueller <dmueller@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-15 01:33:18 +00:00
Linus Torvalds 414f827c46 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (94 commits)
  [PATCH] x86-64: Remove mk_pte_phys()
  [PATCH] i386: Fix broken CONFIG_COMPAT_VDSO on i386
  [PATCH] i386: fix 32-bit ioctls on x64_32
  [PATCH] x86: Unify pcspeaker platform device code between i386/x86-64
  [PATCH] i386: Remove extern declaration from mm/discontig.c, put in header.
  [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c
  [PATCH] i386: Move mce_disabled to asm/mce.h
  [PATCH] i386: paravirt unhandled fallthrough
  [PATCH] x86_64: Wire up compat epoll_pwait
  [PATCH] x86: Don't require the vDSO for handling a.out signals
  [PATCH] i386: Fix Cyrix MediaGX detection
  [PATCH] i386: Fix warning in cpu initialization
  [PATCH] i386: Fix warning in microcode.c
  [PATCH] x86: Enable NMI watchdog for AMD Family 0x10 CPUs
  [PATCH] x86: Add new CPUID bits for AMD Family 10 CPUs in /proc/cpuinfo
  [PATCH] i386: Remove fastcall in paravirt.[ch]
  [PATCH] x86-64: Fix wrong gcc check in bitops.h
  [PATCH] x86-64: survive having no irq mapping for a vector
  [PATCH] i386: geode configuration fixes
  [PATCH] i386: add option to show more code in oops reports
  ...
2007-02-14 09:46:06 -08:00
Eric W. Biederman 86a71dbd3e [PATCH] sysctl: hide the sysctl proc inodes from selinux
Since the security checks are applied on each read and write of a sysctl file,
just like they are applied when calling sys_sysctl, they are redundant on the
standard VFS constructs.  Since it is difficult to compute the security labels
on the standard VFS constructs we just mark the sysctl inodes in proc private
so selinux won't even bother with them.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:10:00 -08:00
Eric W. Biederman 3fbfa98112 [PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables
It isn't needed anymore, all of the users are gone, and all of the ctl_table
initializers have been converted to use explicit names of the fields they are
initializing.

[akpm@osdl.org: NTFS fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:10:00 -08:00
Eric W. Biederman 77b14db502 [PATCH] sysctl: reimplement the sysctl proc support
With this change the sysctl inodes can be cached and nothing needs to be done
when removing a sysctl table.

For a cost of 2K code we will save about 4K of static tables (when we remove
de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
about half that on a 32bit arch.

The speed feels about the same, even though we can now cache the sysctl
dentries :(

We get the core advantage that we don't need to have a 1 to 1 mapping between
ctl table entries and proc files.  Making it possible to have /proc/sys vary
depending on the namespace you are in.  The currently merged namespaces don't
have an issue here but the network namespace under /proc/sys/net needs to have
different directories depending on which network adapters are visible.  By
simply being a cache different directories being visible depending on who you
are is trivial to implement.

[akpm@osdl.org: fix uninitialised var]
[akpm@osdl.org: fix ARM build]
[bunk@stusta.de: make things static]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:10:00 -08:00
Eric W. Biederman 0b4d414714 [PATCH] sysctl: remove insert_at_head from register_sysctl
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name.  Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman 2abc26fc6b [PATCH] sysctl: create sys/fs/binfmt_misc as an ordinary sysctl entry
binfmt_misc has a mount point in the middle of the sysctl and that mount point
is created as a proc_generic directory.

Doing it that way gets in the way of cleaning up the sysctl proc support as it
continues the existence of a horrible hack.  So instead simply create the
directory as an ordinary sysctl directory.  At least that removes the magic
special case.

[akpm@osdl.org: warning fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman 0e03036c97 [PATCH] sysctl: register the ocfs2 sysctl numbers
ocfs2 was did not have the binary number it uses under CTL_FS registered in
sysctl.h.  Register it to avoid future conflicts, and change the name of the
definition to be in line with the rest of the sysctl numbers.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:58 -08:00
Eric W. Biederman 4ed075e93b [PATCH] sysctl: C99 convert ctl_tables in NTFS and remove sys_sysctl support
Putting ntfs-debug under FS_NRINODE was not a kosher thing to do so don't give
it any binary number.

[akpm@osdl.org: build fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:58 -08:00
Eric W. Biederman fd6065b4fd [PATCH] sysctl: C99 convert coda ctl_tables and remove binary sysctls
Will converting the coda sysctl initializers I discovered that it is yet
another user of sysctl that was stomping CTL_KERN.  So off with it's
sys_sysctl support since it wasn't done in a supportable way.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:58 -08:00
Tim Schmielau cd354f1ae7 [PATCH] remove many unneeded #includes of sched.h
After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there.  Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.

To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.

Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm.  I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:54 -08:00
NeilBrown af6a4e280e [PATCH] knfsd: add some new fsid types
Add support for using a filesystem UUID to identify and export point in the
filehandle.

For NFSv2, this UUID is xor-ed down to 4 or 8 bytes so that it doesn't take up
too much room.  For NFSv3+, we use the full 16 bytes, and possibly also a
64bit inode number for exports beneath the root of a filesystem.

When generating an fsid to return in 'stat' information, use the UUID (hashed
down to size) if it is available and a small 'fsid' was not specifically
provided.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:53 -08:00
NeilBrown 982aedfd09 [PATCH] knfsd: tidy up choice of filesystem-identifier when creating a filehandle
If we are using the same version/fsid as a current filehandle, then there is
no need to verify the the numbers are valid for this export, and they must be
(we used them to find this export).

This allows us to simplify the fsid selection code.

Also change "ref_fh_version" and "ref_fh_fsid_type" to "version" and
"fsid_type", as the important thing isn't that they are the version/type of
the reference filehandle, but they are the chosen type for the new filehandle.

And tidy up some indenting.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:53 -08:00
NeilBrown 8971a1016b [PATCH] knfsd: fix return value for writes to some files in 'nfsd' filesystem
Most files in the 'nfsd' filesystem are transactional.  When you write, a
reply is generated that can be read back only on the same 'file'.

If the reply has zero length, the 'write' will incorrectly return a value of
'0' instead of the length that was written.  This causes 'rpc.nfsd' to give an
annoying warning.

This patch fixes the test.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:53 -08:00
Trond Myklebust ac98695d6c Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ 2007-02-13 22:02:32 -08:00
Linus Torvalds 9468482bd4 Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] on reconnect to Samba - reset the unix capabilities
  [CIFS] Allow update of EOF on remote extend of file
  [CIFS] POSIX CIFS Extensions (continued) - POSIX Open
  [CIFS] Additional POSIX CIFS Extensions infolevels
2007-02-13 21:15:42 -08:00
Steve French 8af1897158 [CIFS] on reconnect to Samba - reset the unix capabilities
After temporary server or network failure and reconneciton, we were not
resending the unix capabilities via SetFSInfo - which confused Samba posix
byte range locking code.

Discovered by jra

Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-02-14 04:42:51 +00:00
Linus Torvalds 552ce544ed Revert "[PATCH] Fix d_path for lazy unmounts"
This reverts commit eb3dfb0cb1.

It causes some strange Gnome problem with dbus-daemon getting stuck, so
we'll revert it until that problem is understood.

Reported by both walt and Greg KH, who both independently git-bisected
the problem to this commit.

Andreas is looking at it.

Reported-by: walt <wa1ter@myrealbox.com>
Reported-by: Greg KH <greg@kroah.com>
Acked-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-13 12:08:18 -08:00
Andi Kleen 9fbbd4dd17 [PATCH] x86: Don't require the vDSO for handling a.out signals
and in other strange binfmts. vDSO is not necessarily mapped there.

Signed-off-by: Andi Kleen <ak@suse.de>
2007-02-13 13:26:26 +01:00
Trond Myklebust d9bc125caf Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	net/sunrpc/auth_gss/gss_krb5_crypto.c
	net/sunrpc/auth_gss/gss_spkm3_token.c
	net/sunrpc/clnt.c

Merge with mainline and fix conflicts.
2007-02-12 22:43:25 -08:00
Chuck Lever 43d78ef2ba NFS: disconnect before retrying NFSv4 requests over TCP
RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
twice on the same connection unless it is the NULL procedure.  Section
3.1.1 suggests that the client should disconnect and reconnect if it
wants to retry a request.

Implement this by adding an rpc_clnt flag that an ULP can use to
specify that the underlying transport should be disconnected on a
major timeout.  The NFSv4 client asserts this new flag, and requests
no retries after a minor retransmit timeout.

Note that disconnecting on a retransmit is in general not safe to do
if the RPC client does not reuse the TCP port number when reconnecting.

See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-12 22:40:45 -08:00
Trond Myklebust a301b77771 NFS: Don't use ClearPageUptodate() when writeback fails
ClearPageUptodate() will just cause races here. What we really want to do
is to invalidate the page cache.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-12 22:40:38 -08:00
Trond Myklebust b0c4fddca2 NFS: Cleanup - avoid rereading 'jiffies' more than once in the same routine
Micro-optimisations for nfs_fhget() and nfs_wcc_update_inode().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-12 22:40:30 -08:00
Trond Myklebust 3e7d950a52 NFS: Fix a wraparound issue with nfsi->cache_change_attribute
Fix wraparound issue with nfsi->cache_change_attribute. If it is found
to lie in the future, then update it to lie in the past. Patch based on
a suggestion by Neil Brown.

..and minor micro-optimisation: avoid reading 'jiffies' more than once in
nfs_update_inode().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-12 22:40:22 -08:00
Josef 'Jeff' Sipek ee9b6d61a2 [PATCH] Mark struct super_operations const
This patch is inspired by Arjan's "Patch series to mark struct
file_operations and struct inode_operations const".

Compile tested with gcc & sparse.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:47 -08:00
Arjan van de Ven c5ef1c42c5 [PATCH] mark struct inode_operations const 3
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Arjan van de Ven 92e1d5be91 [PATCH] mark struct inode_operations const 2
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Arjan van de Ven 754661f143 [PATCH] mark struct inode_operations const 1
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Arjan van de Ven 00977a59b9 [PATCH] mark struct file_operations const 6
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:45 -08:00
Evgeniy Dushistov 54fb996ac1 [PATCH] ufs2 write: block allocation update
Patch adds ability to work with 64bit metadata, this made by replacing work
with 32bit pointers by inline functions.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:40 -08:00
Evgeniy Dushistov 3313e29267 [PATCH] ufs2 write: inodes write
This patch adds into write inode path function to write UFS2 inode, and
modifys allocate inode path to allocate and init additional inode chunks.

Also some cleanups:
- remove not used parameters in some functions
- remove i_gen field from ufs_inode_info structure,
there is i_generation in inode structure with same purposes.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:40 -08:00
Evgeniy Dushistov cbcae39fa1 [PATCH] ufs2 write: mount as rw
These series of patches add UFS2 write-support.  UFS2 - is default file system
for recent versions of FreeBSD.

The main differences from UFS1 from write support point of view
are:
1)Not all inodes are allocated during formatation of disk.
2)All meta-data(pointer to data blocks) are 64bit(in UFS1 they
are 32bit).

So patch series consist of
1)make possible mount UFS2 in read-write mode
2)code to write ufs2 inodes and code to initialize inodes chunks.
3)work with 64bit meta-data

I made simple testing like create/deleting/writing/reading/truncating, also I
ran fsx-linux and untar and build kernel on UFS1 and UFS2, after that FreeBSD
fsck do not find any errors in fs.

This patch makes possible to mount ufs2 "rw", and updates UFS2 documentation:
remove note about bug(it fixed by reallocate blocks on the fly patch) and add
me in the list of people who want receive bug reports.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:40 -08:00
Michael Halcrow 0a9ac38246 [PATCH] eCryptfs: add flush_dcache_page() calls
Call flush_dcache_page() after modifying a pagecache by hand.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:37 -08:00
Michael Halcrow e2bd99ec5c [PATCH] eCryptfs: open-code flag checking and manipulation
Open-code flag checking and manipulation.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Trevor Highland <tshighla@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:37 -08:00
Michael Halcrow 9d8b8ce556 [PATCH] eCryptfs: convert kmap() to kmap_atomic()
Replace kmap() with kmap_atomic().  Reduce the amount of time that mappings
are held.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Trevor Highland <tshighla@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:37 -08:00
Michael Halcrow 70456600f4 [PATCH] eCryptfs: convert f_op->write() to vfs_write()
sys_write() takes a local copy of f_pos and writes that back
into the struct file. It does this so that two concurrent write()
callers don't make a mess of f_pos, and of the file contents.

ecryptfs should be calling vfs_write().  That way we also get the fsnotify
notifications, which ecryptfs presently appears to have subverted.

Convert direct calls to f_op->write() into calls to vfs_write().

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:37 -08:00
Michael Halcrow e77a56ddce [PATCH] eCryptfs: Encrypted passthrough
Provide an option to provide a view of the encrypted files such that the
metadata is always in the header of the files, regardless of whether the
metadata is actually in the header or in the extended attribute.  This mode of
operation is useful for applications like incremental backup utilities that do
not preserve the extended attributes when directly accessing the lower files.

With this option enabled, the files under the eCryptfs mount point will be
read-only.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00
Michael Halcrow dd2a3b7ad9 [PATCH] eCryptfs: Generalize metadata read/write
Generalize the metadata reading and writing mechanisms, with two targets for
now: metadata in file header and metadata in the user.ecryptfs xattr of the
lower file.

[akpm@osdl.org: printk warning fix]
[bunk@stusta.de: make some needlessly global code static]
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00
Michael Halcrow 17398957aa [PATCH] eCryptfs: xattr flags and mount options
This patch set introduces the ability to store cryptographic metadata into an
lower file extended attribute rather than the lower file header region.

This patch set implements two new mount options:

ecryptfs_xattr_metadata
 - When set, newly created files will have their cryptographic
   metadata stored in the extended attribute region of the file rather
   than the header.

   When storing the data in the file header, there is a minimum of 8KB
   reserved for the header information for each file, making each file at
   least 12KB in size.  This can take up a lot of extra disk space if the user
   creates a lot of small files.  By storing the data in the extended
   attribute, each file will only occupy at least of 4KB of space.

   As the eCryptfs metadata set becomes larger with new features such as
   multi-key associations, most popular filesystems will not be able to store
   all of the information in the xattr region in some cases due to space
   constraints.  However, the majority of users will only ever associate one
   key per file, so most users will be okay with storing their data in the
   xattr region.

   This option should be used with caution.  I want to emphasize that the
   xattr must be maintained under all circumstances, or the file will be
   rendered permanently unrecoverable.  The last thing I want is for a user to
   forget to set an xattr flag in a backup utility, only to later discover
   that their backups are worthless.

ecryptfs_encrypted_view
 - When set, this option causes eCryptfs to present applications a
   view of encrypted files as if the cryptographic metadata were
   stored in the file header, whether the metadata is actually stored
   in the header or in the extended attributes.

   No matter what eCryptfs winds up doing in the lower filesystem, I want
   to preserve a baseline format compatibility for the encrypted files.  As of
   right now, the metadata may be in the file header or in an xattr.  There is
   no reason why the metadata could not be put in a separate file in future
   versions.

   Without the compatibility mode, backup utilities would have to know to
   back up the metadata file along with the files.  The semantics of eCryptfs
   have always been that the lower files are self-contained units of encrypted
   data, and the only additional information required to decrypt any given
   eCryptfs file is the key.  That is what has always been emphasized about
   eCryptfs lower files, and that is what users expect.  Providing the
   encrypted view option will provide a way to userspace applications wherein
   they can always get to the same old familiar eCryptfs encrypted files,
   regardless of what eCryptfs winds up doing with the metadata behind the
   scenes.

This patch:

Add extended attribute support to version bit vector, flags to indicate when
xattr or encrypted view modes are enabled, and support for the new mount
options.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00
Michael Halcrow dddfa461fc [PATCH] eCryptfs: Public key; packet management
Public key support code.  This reads and writes packets in the header that
contain public key encrypted file keys.  It calls the messaging code in the
previous patch to send and receive encryption and decryption request
packets from the userspace daemon.

[akpm@osdl.org: cleab fix]
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00
Michael Halcrow 88b4a07e66 [PATCH] eCryptfs: Public key transport mechanism
This is the transport code for public key functionality in eCryptfs.  It
manages encryption/decryption request queues with a transport mechanism.
Currently, netlink is the only implemented transport.

Each inode has a unique File Encryption Key (FEK).  Under passphrase, a File
Encryption Key Encryption Key (FEKEK) is generated from a salt/passphrase
combo on mount.  This FEKEK encrypts each FEK and writes it into the header of
each file using the packet format specified in RFC 2440.  This is all
symmetric key encryption, so it can all be done via the kernel crypto API.

These new patches introduce public key encryption of the FEK.  There is no
asymmetric key encryption support in the kernel crypto API, so eCryptfs pushes
the FEK encryption and decryption out to a userspace daemon.  After
considering our requirements and determining the complexity of using various
transport mechanisms, we settled on netlink for this communication.

eCryptfs stores authentication tokens into the kernel keyring.  These tokens
correlate with individual keys.  For passphrase mode of operation, the
authentication token contains the symmetric FEKEK.  For public key, the
authentication token contains a PKI type and an opaque data blob managed by
individual PKI modules in userspace.

Each user who opens a file under an eCryptfs partition mounted in public key
mode must be running a daemon.  That daemon has the user's credentials and has
access to all of the keys to which the user should have access.  The daemon,
when started, initializes the pluggable PKI modules available on the system
and registers itself with the eCryptfs kernel module.  Userspace utilities
register public key authentication tokens into the user session keyring.
These authentication tokens correlate key signatures with PKI modules and PKI
blobs.  The PKI blobs contain PKI-specific information necessary for the PKI
module to carry out asymmetric key encryption and decryption.

When the eCryptfs module parses the header of an existing file and finds a Tag
1 (Public Key) packet (see RFC 2440), it reads in the public key identifier
(signature).  The asymmetrically encrypted FEK is in the Tag 1 packet;
eCryptfs puts together a decrypt request packet containing the signature and
the encrypted FEK, then it passes it to the daemon registered for the
current->euid via a netlink unicast to the PID of the daemon, which was
registered at the time the daemon was started by the user.

The daemon actually just makes calls to libecryptfs, which implements request
packet parsing and manages PKI modules.  libecryptfs grabs the public key
authentication token for the given signature from the user session keyring.
This auth tok tells libecryptfs which PKI module should receive the request.
libecryptfs then makes a decrypt() call to the PKI module, and it passes along
the PKI block from the auth tok.  The PKI uses the blob to figure out how it
should decrypt the data passed to it; it performs the decryption and passes
the decrypted data back to libecryptfs.  libecryptfs then puts together a
reply packet with the decrypted FEK and passes that back to the eCryptfs
module.

The eCryptfs module manages these request callouts to userspace code via
message context structs.  The module maintains an array of message context
structs and places the elements of the array on two lists: a free and an
allocated list.  When eCryptfs wants to make a request, it moves a msg ctx
from the free list to the allocated list, sets its state to pending, and fires
off the message to the user's registered daemon.

When eCryptfs receives a netlink message (via the callback), it correlates the
msg ctx struct in the alloc list with the data in the message itself.  The
msg->index contains the offset of the array of msg ctx structs.  It verifies
that the registered daemon PID is the same as the PID of the process that sent
the message.  It also validates a sequence number between the received packet
and the msg ctx.  Then, it copies the contents of the message (the reply
packet) into the msg ctx struct, sets the state in the msg ctx to done, and
wakes up the process that was sleeping while waiting for the reply.

The sleeping process was whatever was performing the sys_open().  This process
originally called ecryptfs_send_message(); it is now in
ecryptfs_wait_for_response().  When it wakes up and sees that the msg ctx
state was set to done, it returns a pointer to the message contents (the reply
packet) and returns.  If all went well, this packet contains the decrypted
FEK, which is then copied into the crypt_stat struct, and life continues as
normal.

The case for creation of a new file is very similar, only instead of a decrypt
request, eCryptfs sends out an encrypt request.

> - We have a great clod of key mangement code in-kernel.  Why is that
>   not suitable (or growable) for public key management?

eCryptfs uses Howells' keyring to store persistent key data and PKI state
information.  It defers public key cryptographic transformations to userspace
code.  The userspace data manipulation request really is orthogonal to key
management in and of itself.  What eCryptfs basically needs is a secure way to
communicate with a particular daemon for a particular task doing a syscall,
based on the UID.  Nothing running under another UID should be able to access
that channel of communication.

> - Is it appropriate that new infrastructure for public key
> management be private to a particular fs?

The messaging.c file contains a lot of code that, perhaps, could be extracted
into a separate kernel service.  In essence, this would be a sort of
request/reply mechanism that would involve a userspace daemon.  I am not aware
of anything that does quite what eCryptfs does, so I was not aware of any
existing tools to do just what we wanted.

>   What happens if one of these daemons exits without sending a quit
>   message?

There is a stale uid<->pid association in the hash table for that user.  When
the user registers a new daemon, eCryptfs cleans up the old association and
generates a new one.  See ecryptfs_process_helo().

> - _why_ does it use netlink?

Netlink provides the transport mechanism that would minimize the complexity of
the implementation, given that we can have multiple daemons (one per user).  I
explored the possibility of using relayfs, but that would involve having to
introduce control channels and a protocol for creating and tearing down
channels for the daemons.  We do not have to worry about any of that with
netlink.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:36 -08:00