17511 Commits

Author SHA1 Message Date
Pavel Shilovsky
810627a013 [CIFS] Add mmap for direct, nobrl cifs mount types
without mmap functions in file_ops OpenOffice can't save changes in
existing document. The same situation you can see with gedit. Also, a.out
format of files can't be executed without mmap.

Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-03-27 02:00:49 +00:00
Linus Torvalds
e4d50423d7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix imperfect completion wait in nilfs_wait_on_logs
  nilfs2: fix hang-up of cleaner after log writer returned with error
  nilfs2: fix duplicate call to nilfs_segctor_cancel_freev
2010-03-26 15:14:29 -07:00
Linus Torvalds
e0df9c0b42 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  Restore LOOKUP_DIRECTORY hint handling in final lookup on open()
2010-03-26 15:06:02 -07:00
Al Viro
3e297b6134 Restore LOOKUP_DIRECTORY hint handling in final lookup on open()
Lose want_dir argument, while we are at it - since now
nd->flags & LOOKUP_DIRECTORY is equivalent to it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-26 12:41:05 -04:00
Linus Torvalds
39f1cd635c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Fixed inode allocator to correctly track a flex_bg's used_dirs
  ext4: Don't use delayed allocation by default when used instead of ext3
  ext4: Fix spelling of CONTIG_FS_EXT3 to CONFIG_FS_EXT3
  ext4: Fix estimate of # of blocks needed to write indirect-mapped files
2010-03-25 14:10:53 -07:00
Linus Torvalds
6c75969e22 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: don't try to decode GETATTR if DELEGRETURN returned error
  sunrpc: handle allocation errors from __rpc_lookup_create()
  SUNRPC: Fix the return value of rpc_run_bc_task()
  SUNRPC: Fix a use after free bug with the NFSv4.1 backchannel
  SUNRPC: Fix a potential memory leak in auth_gss
  NFS: Prevent another deadlock in nfs_release_page()
2010-03-24 16:50:46 -07:00
Dan Carpenter
1147d0f915 fscache: add missing unlock
Sparse complained about this missing spin_unlock()

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:49:21 -07:00
David Howells
61964eba5c do_sync_read/write() should set kiocb.ki_nbytes to be consistent
do_sync_read/write() should set kiocb.ki_nbytes to be consistent with
do_sync_readv_writev().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:43:29 -07:00
David Howells
47568d4c56 FDPIC: For-loop in elf_core_vma_data_size() is incorrect
Fix an incorrect for-loop in elf_core_vma_data_size().  The advance-pointer
statement lacks an assignment:

	  CC      fs/binfmt_elf_fdpic.o
	fs/binfmt_elf_fdpic.c: In function 'elf_core_vma_data_size':
	fs/binfmt_elf_fdpic.c:1593: warning: statement with no effect

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:43:29 -07:00
OGAWA Hirofumi
8e0cc811e0 fs/partition/msdos: fix unusable extended partition for > 512B sector
Smaller size than a minimum blocksize can't be used, after all it's
handled like 0 size.

For extended partition itself, this makes sure to use bigger size than one
logical sector size at least.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Daniel Taylor <Daniel.Taylor@wdc.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:31:22 -07:00
Daniel Taylor
3fbf586cf7 fs/partitions/msdos: add support for large disks
In order to use disks larger than 2TiB on Windows XP, it is necessary to
use 4096-byte logical sectors in an MBR.

Although the kernel storage and functions called from msdos.c used
"sector_t" internally, msdos.c still used u32 variables, which results in
the ability to handle XP-compatible large disks.

This patch changes the internal variables to "sector_t".

Daniel said: "In the near future, WD will be releasing products that need
this patch".

[hirofumi@mail.parknet.co.jp: tweaks and fix]
Signed-off-by: Daniel Taylor <daniel.taylor@wdc.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:31:22 -07:00
Dan Carpenter
4fd2c20d96 kcore: fix test for end of list
"m" is never NULL here.  We need a different test for the end of list
condition.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:31:22 -07:00
Jeff Mahoney
3f8b5ee332 reiserfs: properly honor read-only devices
The reiserfs journal behaves inconsistently when determining whether to
allow a mount of a read-only device.

This is due to the use of the continue_replay variable to short circuit
the journal scanning.  If it's set, it's assumed that there are
transactions to replay, but there may not be.  If it's unset, it's assumed
that there aren't any, and that may not be the case either.

I've observed two failure cases:
1) Where a clean file system on a read-only device refuses to mount
2) Where a clean file system on a read-only device passes the
   optimization and then tries writing the journal header to update
   the latest mount id.

The former is easily observable by using a freshly created file system on
a read-only loopback device.

This patch moves the check into journal_read_transaction, where it can
bail out before it's about to replay a transaction.  That way it can go
through and skip transactions where appropriate, yet still refuse to mount
a file system with outstanding transactions.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:31:21 -07:00
Jeff Mahoney
6cb4aff0a7 reiserfs: fix oops while creating privroot with selinux enabled
Commit 57fe60df ("reiserfs: add atomic addition of selinux attributes
during inode creation") contains a bug that will cause it to oops when
mounting a file system that didn't previously contain extended attributes
on a system using security.* xattrs.

The issue is that while creating the privroot during mount
reiserfs_security_init calls reiserfs_xattr_jcreate_nblocks which
dereferences the xattr root.  The xattr root doesn't exist, so we get an
oops.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=15309

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:31:21 -07:00
Borislav Petkov
7731d9a5d4 fs/binfmt_aout.c: fix pointer warnings
fs/binfmt_aout.c: In function `aout_core_dump':
fs/binfmt_aout.c:125: warning: passing argument 2 of `dump_write' makes pointer from integer without a cast
include/linux/coredump.h:12: note: expected `const void *' but argument is of type `long unsigned int'
fs/binfmt_aout.c:132: warning: passing argument 2 of `dump_write' makes pointer from integer without a cast
include/linux/coredump.h:12: note: expected `const void *' but argument is of type `long unsigned int'

due to dump_write() expecting a user void *.  Fold casts into the
START_DATA/START_STACK macros and shut up the warnings.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Cc: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-24 16:31:19 -07:00
Eric Sandeen
c4caae2518 ext4: Fixed inode allocator to correctly track a flex_bg's used_dirs
When used_dirs was introduced for the flex_groups struct, it looks
like the accounting was not put into place properly, in some places
manipulating free_inodes rather than used_dirs.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-23 21:32:00 -04:00
Jan Kara
ba69f9ab7d ext4: Don't use delayed allocation by default when used instead of ext3
When ext4 driver is used to mount a filesystem instead of the ext3 file
system driver (through CONFIG_EXT4_USE_FOR_EXT23), do not enable delayed
allocation by default since some ext3 users and application writers have
developed unfortunate expectations about the safety of writing files on
systems subject to sudden and violent death without using fsync().

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-24 20:18:37 -04:00
Theodore Ts'o
37f328eb60 ext4: Fix spelling of CONTIG_FS_EXT3 to CONFIG_FS_EXT3
Oops.  (Blush.)

Thanks to Sedat Dilek for pointing this out.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-03-24 20:06:41 -04:00
Srinivas Eeda
14741472a0 ocfs2: Fix a race in o2dlm lockres mastery
In o2dlm, the master of a lock resource keeps a map of all interested
nodes.  This prevents the master from purging the resource before an
interested node can create a lock.

A race between the mastery thread and the mastery handler allowed an
interested node to discover who the master is without informing the
master directly.  This is easily fixed by holding the dlm spinlock a
little longer in the mastery handler.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-23 18:22:59 -07:00
Tristan Ye
b54c2ca475 Ocfs2: Handle deletion of reflinked oprhan inodes correctly.
The rule is that all inodes in the orphan dir have ORPHANED_FL,
otherwise we treated it as an ERROR.  This rule works well except
for some rare cases of reflink operation:

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1215

The problem is caused by how reflink and our orphan_scan thread
interact.

 * The orphan scan pulls the orphans into a queue first, then runs the
   queue at a later time.  We only hold the orphan_dir's lock
   during scanning.

 * Reflink create a oprhaned target in orphan_dir as its first step.
   It removes the target and clears the flag as the final step.
   These two steps take the orphan_dir's lock, but it is not held for
   the duration.

Based on the above semantics, a reflink inode can be moved out of the
orphan dir and have its ORPHANED_FL cleared before the queue of orphans
is run.  This leads to a ERROR in ocfs2_query_wipde_inode().

This patch teaches ocfs2_query_wipe_inode() to detect previously
orphaned reflink targets.  If a reflink fails or a crash occurs during
the relfink operation, the inode will retain ORPHANED_FL and will be
properly wiped.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-23 18:22:55 -07:00
Tristan Ye
3939fda4b3 Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir.
Currently, some callers were missing to journal the dirty inode after
adding it to orphan dir.

Now we're going to journal such modifications within the ocfs2_orphan_add()
itself, It's safe to do so, though some existing caller may duplicate this,
and it makes the logic look more straightforward anyway.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-23 18:22:51 -07:00
Mark Fasheh
b4414eea0e ocfs2: Clear undo bits when local alloc is freed
When the local alloc file changes windows, unused bits are freed back to the
global bitmap. By defnition, those bits can not be in use by any file. Also,
the local alloc will never have been able to allocate those bits if they
were part of a previous truncate. Therefore it makes sense that we should
clear unused local alloc bits in the undo buffer so that they can be used
immediatly.

[ Modified to call it ocfs2_release_clusters() -- Joel ]

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-23 18:22:40 -07:00
Tyler Hicks
f4e60e6b30 eCryptfs: Strip metadata in xattr flag in encrypted view
The ecryptfs_encrypted_view mount option provides a unified way of
viewing encrypted eCryptfs files.  If the metadata is stored in a xattr,
the metadata is moved to the file header when the file is read inside
the eCryptfs mount.  Because of this, we should strip the
ECRYPTFS_METADATA_IN_XATTR flag from the header's flag section.  This
allows eCryptfs to treat the file as an eCryptfs file with a header
at the front.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-03-23 12:31:35 -05:00
Tyler Hicks
1984c23f9e eCryptfs: Clear buffer before reading in metadata xattr
We initially read in the first PAGE_CACHE_SIZE of a file to if the
eCryptfs header marker can be found.  If it isn't found and
ecryptfs_xattr_metadata was given as a mount option, then the
user.ecryptfs xattr is read into the same buffer.  Since the data from
the first page of the file wasn't cleared, it is possible that we think
we've found a second tag 3 or tag 1 packet and then error out after the
packet contents aren't as expected.  This patch clears the buffer before
filling it with metadata from the user.ecryptfs xattr.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-03-23 12:31:09 -05:00
Tyler Hicks
fa3ef1cb4e eCryptfs: Rename ecryptfs_crypt_stat.num_header_bytes_at_front
This patch renames the num_header_bytes_at_front variable to
metadata_size since it now contains the max size of the metadata.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-03-23 12:30:41 -05:00
Tyler Hicks
157f107135 eCryptfs: Fix metadata in xattr feature regression
Fixes regression in 8faece5f906725c10e7a1f6caf84452abadbdc7b

When using the ecryptfs_xattr_metadata mount option, eCryptfs stores the
metadata (normally stored at the front of the file) in the user.ecryptfs
xattr.  This causes ecryptfs_crypt_stat.num_header_bytes_at_front to be
0, since there is no header data at the front of the file.  This results
in too much memory being requested and ENOMEM being returned from
ecryptfs_write_metadata().

This patch fixes the problem by using the num_header_bytes_at_front
variable for specifying the max size of the metadata, despite whether it
is stored in the header or xattr.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2010-03-23 12:29:49 -05:00
Ryusuke Konishi
d067633b44 nilfs2: fix imperfect completion wait in nilfs_wait_on_logs
nilfs_wait_on_logs has a potential to slip out before completion of
all bio requests when it met an error.  This synchronization fault may
cause unexpected results, for instance, violative access to freed
segment buffers from an end-bio callback routine.

This fixes the issue by ensuring that nilfs_wait_on_logs waits all
given logs.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2010-03-24 01:17:20 +09:00
Ryusuke Konishi
110d735a0a nilfs2: fix hang-up of cleaner after log writer returned with error
According to the report from Andreas Beckmann (Message-ID:
<4BA54677.3090902@abeckmann.de>), nilfs in 2.6.33 kernel got stuck
after a disk full error.

This turned out to be a regression by log writer updates merged at
kernel 2.6.33.  nilfs_segctor_abort_construction, which is a cleanup
function for erroneous cases, was skipping writeback completion for
some logs.

This fixes the bug and would resolve the hang issue.

Reported-by: Andreas Beckmann <debian@abeckmann.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: stable <stable@kernel.org>                     [2.6.33.x]
2010-03-24 00:03:06 +09:00
Sage Weil
393f662096 ceph: fix possible double-free of mds request reference
Clear pointer to mds request after dropping the reference to
ensure we don't drop it again, as there is at least one error
path through this function that does not reset fi->last_readdir
to a new value.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:06 -07:00
Sage Weil
d96d60498f ceph: fix session check on mds reply
Fix a broken check that a reply came back from the same MDS we sent the
request to.  I don't think a case that actually triggers this would ever
come up in practice, but it's clearly wrong and easy to fix.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:05 -07:00
Dan Carpenter
4736b009b8 ceph: handle kmalloc() failure
Return ERR_PTR(-ENOMEM) if kmalloc() fails.  We handle allocation
failures the same way later in the function.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:04 -07:00
Sage Weil
9c423956b8 ceph: propagate mds session allocation failures to caller
Return error to original caller if register_session() fails.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:04 -07:00
Sage Weil
8f883c24de ceph: make write_begin wait propagate ERESTARTSYS
Currently, if the wait_event_interruptible is interrupted, we
return EAGAIN unconditionally and loop, such that we aren't, in
fact, interruptible.  So, propagate ERESTARTSYS if we get it.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:03 -07:00
Sage Weil
ec4318bcb4 ceph: fix snap rebuild condition
We were rebuilding the snap context when it was not necessary
(i.e. when the realm seq hadn't changed _and_ the parent seq
was still older), which caused page snapc pointers to not match
the realm's snapc pointer (even though the snap context itself
was identical).  This confused begin_write and put it into an
endless loop.

The correct logic is: rebuild snapc if _my_ realm seq changed, or
if my parent realm's seq is newer than mine (and thus mine needs
to be rebuilt too).

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:02 -07:00
Sage Weil
87b315a5b5 ceph: avoid reopening osd connections when address hasn't changed
We get a fault callback on _every_ tcp connection fault.  Normally, we
want to reopen the connection when that happens.  If the address we have
is bad, however, and connection attempts always result in a connection
refused or similar error, explicitly closing and reopening the msgr
connection just prevents the messenger's backoff logic from kicking in.
The result can be a console full of

[ 3974.417106] ceph: osd11 10.3.14.138:6800 connection failed
[ 3974.423295] ceph: osd11 10.3.14.138:6800 connection failed
[ 3974.429709] ceph: osd11 10.3.14.138:6800 connection failed

Instead, if we get a fault, and have outstanding requests, but the osd
address hasn't changed and the connection never successfully connected in
the first place, do nothing to the osd connection.  The messenger layer
will back off and retry periodically, because we never connected and thus
the lossy bit is not set.

Instead, touch each request's r_stamp so that handle_timeout can tell the
request is still alive and kicking.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:01 -07:00
Sage Weil
3dd72fc0e6 ceph: rename r_sent_stamp r_stamp
Make variable name slightly more generic, since it will (soon)
reflect either the time the request was sent OR the time it was
last determined to be still retrying.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:47:00 -07:00
Sage Weil
3c3f2e32ef ceph: fix connection fault con_work reentrancy problem
The messenger fault was clearing the BUSY bit, for reasons unclear.  This
made it possible for the con->ops->fault function to reopen the connection,
and requeue work in the workqueue--even though the current thread was
already in con_work.

This avoids a problem where the client busy loops with connection failures
on an unreachable OSD, but doesn't address the root cause of that problem.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:59 -07:00
Sage Weil
e4cb4cb8a0 ceph: prevent dup stale messages to console for restarting mds
Prevent duplicate 'mds0 caps stale' message from spamming the console every
few seconds while the MDS restarts.  Set s_renew_requested earlier, so that
we only print the message once, even if we don't send an actual request.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:58 -07:00
Sage Weil
efd7576b23 ceph: fix pg pool decoding from incremental osdmap update
The incremental map decoding of pg pool updates wasn't skipping
the snaps and removed_snaps vectors.  This caused osd requests
to stall when pool snapshots were created or fs snapshots were
deleted.  Use a common helper for full and incremental map
decoders that decodes pools properly.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:57 -07:00
Sage Weil
80fc7314a7 ceph: fix mds sync() race with completing requests
The wait_unsafe_requests() helper dropped the mdsc mutex to wait
for each request to complete, and then examined r_node to get the
next request after retaking the lock.  But the request completion
removes the request from the tree, so r_node was always undefined
at this point.  Since it's a small race, it usually led to a
valid request, but not always.  The result was an occasional
crash in rb_next() while dereferencing node->rb_left.

Fix this by clearing the rb_node when removing the request from
the request tree, and not walking off into the weeds when we
are done waiting for a request.  Since the request we waited on
will _always_ be out of the request tree, take a ref on the next
request, in the hopes that it won't be.  But if it is, it's ok:
we can start over from the beginning (and traverse over older read
requests again).

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:56 -07:00
Sage Weil
916623da10 ceph: only release unused caps with mds requests
We were releasing used caps (e.g. FILE_CACHE) from encode_inode_release
with MDS requests (e.g. setattr).  We don't carry refs on most caps, so
this code worked most of the time, but for setattr (utimes) we try to
drop Fscr.

This causes cap state to get slightly out of sync with reality, and may
result in subsequent mds revoke messages getting ignored.

Fix by only releasing unused caps.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:55 -07:00
Sage Weil
15637c8b12 ceph: clean up handle_cap_grant, handle_caps wrt session mutex
Drop session mutex unconditionally in handle_cap_grant, and do the
check_caps from the handle_cap_grant helper.  This avoids using a magic
return value.

Also avoid using a flag variable in the IMPORT case and call
check_caps at the appropriate point.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:54 -07:00
Sage Weil
cdc2ce056a ceph: fix session locking in handle_caps, ceph_check_caps
Passing a session pointer to ceph_check_caps() used to mean it would leave
the session mutex locked.  That wasn't always possible if it wasn't passed
CHECK_CAPS_AUTHONLY.   If could unlock the passed session and lock a
differet session mutex, which was clearly wrong, and also emitted a
warning when it a racing CPU retook it and we did an unlock from the wrong
context.

This was only a problem when there was more than one MDS.

First, make ceph_check_caps unconditionally drop the session mutex, so that
it is free to lock other sessions as needed.  Then adjust the one caller
that passes in a session (handle_cap_grant) accordingly.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:53 -07:00
Sage Weil
4ea0043a29 ceph: drop unnecessary WARN_ON in caps migration
If we don't have the exported cap it's because we already released it. No
need to WARN.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:52 -07:00
Sage Weil
12eadc1900 ceph: fix null pointer deref of r_osd in debug output
This causes an oops when debug output is enabled and we kick
an osd request with no current r_osd (sometime after an osd
failure).  Check the pointer before dereferencing.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:51 -07:00
Sage Weil
0a990e7093 ceph: clean up service ticket decoding
Previously we would decode state directly into our current ticket_handler.
This is problematic if for some reason we fail to decode, because we end
up with half new state and half old state.

We are probably already in bad shape if we get an update we can't decode,
but we may as well be tidy anyway.  Decode into new_* temporaries and
update the ticket_handler only on success.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-23 07:46:47 -07:00
Dan Carpenter
99b437a925 AFS: Potential null dereference
It seems clear from the surrounding code that xpermits is allowed to be
NULL here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-22 09:57:19 -07:00
Jeff Layton
556ae3bb32 NFS: don't try to decode GETATTR if DELEGRETURN returned error
The reply parsing code attempts to decode the GETATTR response even if
the DELEGRETURN portion of the compound returned an error. The GETATTR
response won't actually exist if that's the case and we're asking the
parser to read past the end of the response.

This bug is fairly benign. The parser catches this without reading past
the end of the response and decode_getfattr returns -EIO. Earlier
kernels however had decode_op_hdr using the READ_BUF macro, and this
bug would make this printk pop any time the client got an error from
a delegreturn:

kernel: decode_op_hdr: reply buffer overflowed in line XXXX

More recent kernels seem to have replaced this printk with a dprintk.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-22 05:34:13 -04:00
Ryusuke Konishi
2d8428acae nilfs2: fix duplicate call to nilfs_segctor_cancel_freev
Andreas Beckmann gave me a report that nilfs logged the following
warnings when it got a disk full:

  nilfs_sufile_do_cancel_free: segment 0 must be clean
  nilfs_sufile_do_cancel_free: segment 1 must be clean

These arise from a duplicate call to nilfs_segctor_cancel_freev in an
error path of log writer.  This will fix the issue.

Reported-by: Andreas Beckmann <debian@abeckmann.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2010-03-22 14:41:07 +09:00
Sage Weil
5b3dbb44ab ceph: release old ticket_blob buffer
Release the old ticket_blob buffer when we get an updated service ticket
from the monitor.  Previously these were getting leaked.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-20 21:33:11 -07:00