Commit Graph

7154 Commits

Author SHA1 Message Date
Steve French
2b83457bde [CIFS] Fix check after use error in ACL code
Spotted by the coverity scanner.

CC: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-25 10:01:00 +00:00
Steve French
058250a0d5 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2007-11-25 09:53:27 +00:00
Jeff Layton
cea218054a [CIFS] Fix potential data corruption when writing out cached dirty pages
Fix RedHat bug 329431

The idea here is separate "conscious" from "unconscious" flushes.
Conscious flushes are those due to a fsync() or close(). Unconscious
ones are flushes that occur as a side effect of some other operation or
due to memory pressure.

Currently, when an error occurs during an unconscious flush (ENOSPC or
EIO), we toss out the page and don't preserve that error to report to
the user when a conscious flush occurs. If after the unconscious flush,
there are no more dirty pages for the inode, the conscious flush will
simply return success even though there were previous errors when writing
out pages. This can lead to data corruption.

The easiest way to reproduce this is to mount up a CIFS share that's
very close to being full or where the user is very close to quota. mv
a file to the share that's slightly larger than the quota allows. The
writes will all succeed (since they go to pagecache). The mv will do a
setattr to set the new file's attributes. This calls
filemap_write_and_wait,
which will return an error since all of the pages can't be written out.
Then later, when the flush and release ops occur, there are no more
dirty pages in pagecache for the file and those operations return 0. mv
then assumes that the file was written out correctly and deletes the
original.

CIFS already has a write_behind_rc variable where it stores the results
from earlier flushes, but that value is only reported in cifs_close.
Since the VFS ignores the return value from the release operation, this
isn't helpful. We should be reporting this error during the flush
operation.

This patch does the following:

1) changes cifs_fsync to use filemap_write_and_wait and cifs_flush and also
sync to check its return code. If it returns successful, they then check
the value of write_behind_rc to see if an earlier flush had reported any
errors. If so, they return that error and clear write_behind_rc.

2) sets write_behind_rc in a few other places where pages are written
out as a side effect of other operations and the code waits on them.

3) changes cifs_setattr to only call filemap_write_and_wait for
ATTR_SIZE changes.

4) makes cifs_writepages accurately distinguish between EIO and ENOSPC
errors when writing out pages.

Some simple testing indicates that the patch works as expected and that
it fixes the reproduceable known problem.

Acked-by: Dave Kleikamp <shaggy@austin.rr.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-20 23:19:03 +00:00
Petr Tesarik
2a97468024 [CIFS] Fix spurious reconnect on 2nd peek from read of SMB length
When retrying kernel_recvmsg() because of a short read, check returned
length against the remaining length, not against total length. This
avoids unneeded session reconnects which would otherwise occur when
kernel_recvmsg() finally returns zero when asked to read zero bytes.

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-20 02:24:08 +00:00
Steve French
f7a44eadd5 [CIFS] remove build warning
CC: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-17 00:01:51 +00:00
Steve French
2442421b17 [CIFS] Have CIFS_SessSetup build correct SPNEGO SessionSetup request
Have CIFS_SessSetup call cifs_get_spnego_key when Kerberos is
negotiated. Use the info in the key payload to build a session
setup request packet. Also clean up how the request buffer in
the function is freed on error.

With appropriate user space helper (in samba/source/client). Kerberos
support (secure session establishment can be done now via Kerberos,
previously users would have to use NTLMv2 instead for more secure
session setup).

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-16 23:37:35 +00:00
Steve French
8840dee9dc [CIFS] minor checkpatch cleanup
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-16 23:05:52 +00:00
Jeff Layton
d6c2e4d02b [CIFS] have cifs_get_spnego_key get the hostname from TCP_Server_Info
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-16 22:23:17 +00:00
Jeff Layton
c359cf3c61 [CIFS] add hostname field to TCP_Server_Info struct
...and populate it with the hostname portion of the UNC string.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-16 22:22:06 +00:00
Jeff Layton
70fe7dc055 [CIFS] clean up error handling in cifs_mount
Move all of the kfree's sprinkled in the middle of the function to the
end, and have the code set rc and just goto there on error. Also zero
out the password string before freeing it. Looks like this should also
fix a potential memory leak of the prepath string if an error occurs
near the end of the function.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-16 22:21:07 +00:00
Steve French
68bf728a22 [CIFS] add ver= prefix to upcall format version
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Igor Mammedov <niallan@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-16 18:32:52 +00:00
Jan Kara
7c06a8dc64 Fix 64KB blocksize in ext3 directories
With 64KB blocksize, a directory entry can have size 64KB which does not
fit into 16 bits we have for entry lenght.  So we store 0xffff instead and
convert value when read from / written to disk.  The patch also converts
some places to use ext3_next_entry() when we are changing them anyway.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:43 -08:00
Jeff Layton
dbaf4c024a smbfs: fix debug builds
Fix some warnings with SMBFS_DEBUG_* builds.  This patch makes it so that
builds with -Werror don't fail.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:43 -08:00
Arjan van de Ven
cb51f973bc mark sys_open/sys_read exports unused
sys_open / sys_read were used in the early 1.2 days to load firmware from
disk inside drivers.  Since 2.0 or so this was deprecated behavior, but
several drivers still were using this.  Since a few years we have a
request_firmware() API that implements this in a nice, consistent way.
Only some old ISA sound drivers (pre-ALSA) still straggled along for some
time....  however with commit c2b1239a9f the
last user is now gone.

This is a good thing, since using sys_open / sys_read etc for firmware is a
very buggy to dangerous thing to do; these operations put an fd in the
process file descriptor table....  which then can be tampered with from
other threads for example.  For those who don't want the firmware loader,
filp_open()/vfs_read are the better APIs to use, without this security
issue.

The patch below marks sys_open and sys_read unused now that they're
really not used anymore, and for deletion in the 2.6.25 timeframe.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:42 -08:00
Eric W. Biederman
9fcc2d15b1 proc: simplify and correct proc_flush_task
Currently we special case when we have only the initial pid namespace.
Unfortunately in doing so the copied case for the other namespaces was
broken so we don't properly flush the thread directories :(

So this patch removes the unnecessary special case (removing a usage of
proc_mnt) and corrects the flushing of the thread directories.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:42 -08:00
Adrian Bunk
8744969a81 fuse_file_alloc(): fix NULL dereferences
Fix obvious NULL dereferences spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:42 -08:00
Fengguang Wu
c06a018fa5 reiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file
This is not a new problem in 2.6.23-git17.  2.6.22/2.6.23 is buggy in the
same way.

Reiserfs could accumulate dirty sub-page-size files until umount time.
They cannot be synced to disk by pdflush routines or explicit `sync'
commands.  Only `umount' can do the trick.

The direct cause is: the dirty page's PG_dirty is wrongly _cleared_.
Call trace:
	 [<ffffffff8027e920>] cancel_dirty_page+0xd0/0xf0
	 [<ffffffff8816d470>] :reiserfs:reiserfs_cut_from_item+0x660/0x710
	 [<ffffffff8816d791>] :reiserfs:reiserfs_do_truncate+0x271/0x530
	 [<ffffffff8815872d>] :reiserfs:reiserfs_truncate_file+0xfd/0x3b0
	 [<ffffffff8815d3d0>] :reiserfs:reiserfs_file_release+0x1e0/0x340
	 [<ffffffff802a187c>] __fput+0xcc/0x1b0
	 [<ffffffff802a1ba6>] fput+0x16/0x20
	 [<ffffffff8029e676>] filp_close+0x56/0x90
	 [<ffffffff8029fe0d>] sys_close+0xad/0x110
	 [<ffffffff8020c41e>] system_call+0x7e/0x83

Fix the bug by removing the cancel_dirty_page() call. Tests show that
it causes no bad behaviors on various write sizes.

=== for the patient ===
Here are more detailed demonstrations of the problem.

1) the page has both PG_dirty(D)/PAGECACHE_TAG_DIRTY(d) after being written to;
   and then only PAGECACHE_TAG_DIRTY(d) remains after the file is closed.

------------------------------ screen 0 ------------------------------
[T0] root /home/wfg# cat > /test/tiny
[T1] hi
[T2] root /home/wfg#

------------------------------ screen 1 ------------------------------
[T1] root /home/wfg# echo /test/tiny > /proc/filecache
[T1] root /home/wfg# cat /proc/filecache
     # file /test/tiny
     # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
     # idx   len     state   refcnt
     0       1       ___UD__Bd_      2
[T2] root /home/wfg# cat /proc/filecache
     # file /test/tiny
     # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
     # idx   len     state   refcnt
     0       1       ___U___Bd_      2

2) note the non-zero 'cancelled_write_bytes' after /tmp/hi is copied.

------------------------------ screen 0 ------------------------------
[T0] root /home/wfg# echo hi > /tmp/hi
[T1] root /home/wfg# cp /tmp/hi /dev/stdin /test
[T2] hi
[T3] root /home/wfg#

------------------------------ screen 1 ------------------------------
[T1] root /proc/4397# cd /proc/`pidof cp`
[T1] root /proc/4713# cat io
     rchar: 8396
     wchar: 3
     syscr: 20
     syscw: 1
     read_bytes: 0
     write_bytes: 20480
     cancelled_write_bytes: 4096
[T2] root /proc/4713# cat io
     rchar: 8399
     wchar: 6
     syscr: 21
     syscw: 2
     read_bytes: 0
     write_bytes: 24576
     cancelled_write_bytes: 4096

//Question: the 'write_bytes' is a bit more than expected ;-)

Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Reviewed-by: Chris Mason <chris.mason@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:41 -08:00
Dmitri Vorobiev
f433dc5634 Fixes to the BFS filesystem driver
I found a few bugs in the BFS driver.  Detailed description of the bugs as
well as the steps to reproduce the errors are given in the kernel bugzilla.
 Please follow these links for more information:

http://bugzilla.kernel.org/show_bug.cgi?id=9363
http://bugzilla.kernel.org/show_bug.cgi?id=9364
http://bugzilla.kernel.org/show_bug.cgi?id=9365
http://bugzilla.kernel.org/show_bug.cgi?id=9366

This patch fixes the bugs described above.  Besides, the patch introduces
coding style changes to make the BFS driver conform to the requirements
specified for Linux kernel code.  Finally, I made a few cosmetic changes
such as removal of trivial debug output.

Also, the patch removes the fields `si_lf_ioff' and `si_lf_sblk' of the
in-core superblock structure.  These fields are initialized but never
actually used.

If you are wondering why I need BFS, here is the answer: I am using this
driver in the context of Linux kernel classes I am teaching in the Moscow
State University and in the International Institute of Information
Technology in Pune, India.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Cc: Tigran Aivazian <tigran@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:40 -08:00
Adam Litke
9a119c056d hugetlb: allow bulk updating in hugetlb_*_quota()
Add a second parameter 'delta' to hugetlb_get_quota and hugetlb_put_quota to
allow bulk updating of the sbinfo->free_blocks counter.  This will be used by
the next patch in the series.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <hermes@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:40 -08:00
Adam Litke
c79fb75e5a hugetlb: fix quota management for private mappings
The hugetlbfs quota management system was never taught to handle MAP_PRIVATE
mappings when that support was added.  Currently, quota is debited at page
instantiation and credited at file truncation.  This approach works correctly
for shared pages but is incomplete for private pages.  In addition to
hugetlb_no_page(), private pages can be instantiated by hugetlb_cow(); but
this function does not respect quotas.

Private huge pages are treated very much like normal, anonymous pages.  They
are not "backed" by the hugetlbfs file and are not stored in the mapping's
radix tree.  This means that private pages are invisible to
truncate_hugepages() so that function will not credit the quota.

This patch (based on a prototype provided by Ken Chen) moves quota crediting
for all pages into free_huge_page().  page->private is used to store a pointer
to the mapping to which this page belongs.  This is used to credit quota on
the appropriate hugetlbfs instance.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <hermes@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:40 -08:00
Eric W. Biederman
e1a1c997af proc: fix proc_kill_inodes to kill dentries on all proc superblocks
It appears we overlooked support for removing generic proc files
when we added support for multiple proc super blocks.  Handle
that now.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:38 -08:00
Jan Kara
e47776a0a4 Forbid user to change file flags on quota files
Forbid user from changing file flags on quota files.  User has no bussiness
in playing with these flags when quota is on.  Furthermore there is a
remote possibility of deadlock due to a lock inversion between quota file's
i_mutex and transaction's start (i_mutex for quota file is locked only when
trasaction is started in quota operations) in ext3 and ext4.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: LIOU Payphone <lioupayphone@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:38 -08:00
Michael Halcrow
8a146a2b0d eCryptfs: cast page->index to loff_t instead of off_t
page->index should be cast to loff_t instead of off_t.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:36 -08:00
Steve French
133672efbc [CIFS] Fix buffer overflow if server sends corrupt response to small
request

In SendReceive() function in transport.c - it memcpy's
message payload into a buffer passed via out_buf param. The function
assumes that all buffers are of size (CIFSMaxBufSize +
MAX_CIFS_HDR_SIZE) , unfortunately it is also called with smaller
(MAX_CIFS_SMALL_BUFFER_SIZE) buffers.  There are eight callers
(SMB worker functions) which are primarily affected by this change:

TreeDisconnect, uLogoff, Close, findClose, SetFileSize, SetFileTimes,
Lock and PosixLock

CC: Dave Kleikamp <shaggy@austin.ibm.com>
CC: Przemyslaw Wegrzyn <czajnik@czajsoft.pl>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-13 22:41:37 +00:00
Linus Torvalds
31083eba37 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits)
  [NETFILTER]: xt_time should not assume CONFIG_KTIME_SCALAR
  [NET]: Move unneeded data to initdata section.
  [NET]: Cleanup pernet operation without CONFIG_NET_NS
  [TEHUTI]: Fix incorrect usage of strncat in bdx_get_drvinfo()
  [MYRI_SBUS]: Prevent that myri_do_handshake lies about ticks.
  [NETFILTER]: bridge: fix double POSTROUTING hook invocation
  [NETFILTER]: Consolidate nf_sockopt and compat_nf_sockopt
  [NETFILTER]: nf_nat: fix memset error
  [INET]: Use list_head-s in inetpeer.c
  [IPVS]: Remove unused exports.
  [NET]: Unexport sysctl_{r,w}mem_max.
  [TG3]: Update version to 3.86
  [TG3]: MII => TP
  [TG3]: Add A1 revs
  [TG3]: Increase the PCI MRRS
  [TG3]: Prescaler fix
  [TG3]: Limit 5784 / 5764 to MAC LED mode
  [TG3]: Disable GPHY autopowerdown
  [TG3]: CPMU adjustments for loopback tests
  [TG3]: Fix nvram selftest failures
  ...
2007-11-13 09:04:48 -08:00
Linus Torvalds
0b832a4b93 Revert "ext2/ext3/ext4: add block bitmap validation"
This reverts commit 7c9e69faa2, fixing up
conflicts in fs/ext4/balloc.c manually.

The cost of doing the bitmap validation on each lookup - even when the
bitmap is cached - is absolutely prohibitive.  We could, and probably
should, do it only when adding the bitmap to the buffer cache.  However,
right now we are better off just reverting it.

Peter Zijlstra measured the cost of this extra validation as a 85%
decrease in cached iozone, and while I had a patch that took it down to
just 17% by not being _quite_ so stupid in the validation, it was still
a big slowdown that could have been avoided by just doing it right.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andreas Dilger <adilger@clusterfs.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-13 08:09:11 -08:00
Denis V. Lunev
022cbae611 [NET]: Move unneeded data to initdata section.
This patch reverts Eric's commit 2b008b0a8e

It diets .text & .data section of the kernel if CONFIG_NET_NS is not set.
This is safe after list operations cleanup.

Signed-of-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-13 03:23:50 -08:00
Trond Myklebust
91cf45f02a [NET]: Add the helper kernel_sock_shutdown()
...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers.

Looking at the sock->op->shutdown() handlers, it looks as if all of them
take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the
RCV_SHUTDOWN/SEND_SHUTDOWN arguments.
Add a helper, and then define the SHUT_* enum to ensure that kernel users
of shutdown() don't get confused.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 18:10:39 -08:00
J. Bruce Fields
6fa02839bf nfsd4: recheck for secure ports in fh_verify
As with commit 7fc90ec93a ("knfsd: nfsd:
call nfsd_setuser() on fh_compose(), fix nfsd4 permissions problem")
this is a case where we need to redo a security check in fh_verify()
even though the filehandle already has an associated dentry--if the
filehandle was created by fh_compose() in an earlier operation of the
nfsv4 compound, then we may not have done these checks yet.

Without this fix it is possible, for example, to traverse from an export
without the secure ports requirement to one with it in a single
compound, and bypass the secure port check on the new export.

While we're here, fix up some minor style problems and change a printk()
to a dprintk(), to make it harder for random unprivileged users to spam
the logs.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-By: NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-12 14:28:08 -08:00
J. Bruce Fields
ac8587dcb5 knfsd: fix spurious EINVAL errors on first access of new filesystem
The v2/v3 acl code in nfsd is translating any return from fh_verify() to
nfserr_inval.  This is particularly unfortunate in the case of an
nfserr_dropit return, which is an internal error meant to indicate to
callers that this request has been deferred and should just be dropped
pending the results of an upcall to mountd.

Thanks to Roland <devzero@web.de> for bug report and data collection.

Cc: Roland <devzero@web.de>
Acked-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Reviewed-By: NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-12 14:28:08 -08:00
Linus Torvalds
46015977e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (21 commits)
  [CIFS] fix oops on second mount to same server when null auth is used
  [CIFS] Fix stale mode after readdir when cifsacl specified
  [CIFS] add mode to acl conversion helper function
  [CIFS] Fix incorrect mode when ACL had deny access control entries
  [CIFS] Add uid to key description so krb can handle user mounts
  [CIFS] Fix walking out end of cifs dacl
  [CIFS] Add upcall files for cifs to use spnego/kerberos
  [CIFS] add OIDs for KRB5 and MSKRB5 to ASN1 parsing routines
  [CIFS] Register and unregister cifs_spnego_key_type on module init/exit
  [CIFS] implement upcalls for SPNEGO blob via keyctl API
  [CIFS] allow cifs_calc_signature2 to deal with a zero length iovec
  [CIFS] If no Access Control Entries, set mode perm bits to zero
  [CIFS] when mount helper missing fix slash wrong direction in share
  [CIFS] Don't request too much permission when reading an ACL
  [CIFS] enable get mode from ACL when cifsacl mount option specified
  [CIFS] ACL support part 8
  [CIFS] acl support part 7
  [CIFS] acl support part 6
  [CIFS] acl support part 6
  [CIFS] remove unused funtion compile warning when experimental off
  ...
2007-11-12 11:11:39 -08:00
Roland McGrath
00ec99da43 core dump: remain dumpable
The coredump code always calls set_dumpable(0) when it starts (even
if RLIMIT_CORE prevents any core from being dumped).  The effect of
this (via task_dumpable) is to make /proc/pid/* files owned by root
instead of the user, so the user can no longer examine his own
process--in a case where there was never any privileged data to
protect.  This affects e.g. auxv, environ, fd; in Fedora (execshield)
kernels, also maps.  In practice, you can only notice this when a
debugger has requested PTRACE_EVENT_EXIT tracing.

set_dumpable was only used in do_coredump for synchronization and not
intended for any security purpose.  (It doesn't secure anything that wasn't
already unsecured when a process dies by SIGTERM instead of SIGQUIT.)

This changes do_coredump to check the core_waiters count as the means of
synchronization, which is sufficient.  Now we leave the "dumpable" bits alone.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-12 10:32:29 -08:00
Jeff Layton
9b8f5f5737 [CIFS] fix oops on second mount to same server when null auth is used
When a share is mounted using no username, cifs_mount sets
volume_info.username as a NULL pointer, and the sesInfo userName as an
empty string. The volume_info.username is passed to a couple of other
functions to see if there is an existing unc or tcp connection that can
be used. These functions assume that the username will be a valid
string that can be passed to strncmp. If the pointer is NULL, then the
kernel will oops if there's an existing session to which the string
can be compared.

This patch changes cifs_mount to set volume_info.username to an empty
string in this situation, which prevents the oops and should make it
so that the comparison to other null auth sessions match.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-09 23:25:04 +00:00
Linus Torvalds
4c31c30302 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Add UNPLUG traces to all appropriate places
  block: fix requeue handling in blk_queue_invalidate_tags()
  mmc: Fix sg helper copy-and-paste error
  pktcdvd: fix BUG caused by sysfs module reference semantics change
  ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting
  cfq_idle_class_timer: add paranoid checks for jiffies overflow
  cfq: fix IOPRIO_CLASS_IDLE delays
  cfq: fix IOPRIO_CLASS_IDLE accounting
2007-11-09 15:17:49 -08:00
Linus Torvalds
e36aeee65d Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: fix rename vs unlink race
  [PATCH] Fix possibly too long write in o2hb_setup_one_bio()
  ocfs2: fix write() performance regression
  ocfs2: Commit journal on sync writes
  ocfs2: Re-order iput in ocfs2_drop_dentry_lock
  ocfs2: Create locks at initially requested level
  [PATCH] Fix priority mistakes in fs/ocfs2/{alloc.c, dlmglue.c}
  [2.6 patch] make ocfs2_find_entry_el() static
2007-11-09 15:11:58 -08:00
Steve French
a6f8de3d9b [CIFS] Fix stale mode after readdir when cifsacl specified
When mounted with cifsacl mount option, readdir can not
instantiate the inode with the estimated mode based on the ACL
for each file since we have not queried for the ACL for
each of these files yet.  So set the refresh time to zero
for these inodes so that the next stat will cause the client
to go to the server for the ACL info so we can build the estimated
mode (this means we also will issue an extra QueryPathInfo if
the stat happens within 1 second, but this is trivial compared to
the time required to open/getacl/close for each).

ls -l is slower when cifsacl mount option is specified, but
displays correct mode information.

Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-08 23:10:32 +00:00
Steve French
ce06c9f025 [CIFS] add mode to acl conversion helper function
Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-08 21:12:01 +00:00
Steve French
15b0395911 [CIFS] Fix incorrect mode when ACL had deny access control entries
When mounted with the cifsacl mount option, we were
treating any deny ACEs found like allow ACEs and it turns out for
SFU and SUA Windows set these type of access control entries often.
The order of ACEs is important too.  The canonical order that most
ACL tools and Windows explorer consruct ACLs with is to begin with
DENY entries then follow with ALLOW, otherwise an allow entry
could be encountered first, making the subsequent deny entry like "dead
code which would be superflous since Windows stops when a match is
made for the operation you are trying to perform for your user

We start with no permissions in the mode and build up as we find
permissions (ie allow ACEs).  This fixes deny ACEs so they affect
the mask used to set the subsequent allow ACEs.

Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com>
CC: Alexander Bokovoy <ab@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-08 17:57:40 +00:00
Igor Mammedov
9eae8a8903 [CIFS] Add uid to key description so krb can handle user mounts
Adds uid to key description fro supporting user mounts
and minor formating changes

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2007-11-08 16:13:31 +00:00
Jens Axboe
8ec680e4c3 ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting
Normally io priorities follow the CPU nice, unless a specific scheduling
class has been set. Once that is set, there's no way to reset the
behaviour to 'none' so that it follows CPU nice again.

Currently passing in 0 as the ioprio class/value will return -1/EINVAL,
change that to allow resetting of a set scheduling class.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-07 13:54:07 +01:00
David S. Miller
df61c95262 [DLM] lowcomms: Do not muck with sysctl_rmem_max.
Use SO_RCVBUFFORCE instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:11:42 -08:00
David S. Miller
44656ba128 [NET]: Kill proc_net_create()
There are no more users.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:10:52 -08:00
Srinivas Eeda
e325a88f17 ocfs2: fix rename vs unlink race
If another node unlinks the destination while ocfs2_rename() is waiting on a
cluster lock, ocfs2_rename() simply logs an error and continues. This causes
a crash because the renaming node is now trying to delete a non-existent
inode. The correct solution is to return -ENOENT.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:35:40 -08:00
Jan Kara
bc7e97cbdd [PATCH] Fix possibly too long write in o2hb_setup_one_bio()
We should subtract start of our IO from PAGE_CACHE_SIZE to get the right
length of the write we want to perform.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:35:35 -08:00
Mark Fasheh
4e9563fd55 ocfs2: fix write() performance regression
On file systems which don't support sparse files, Ocfs2_map_page_blocks()
was reading blocks on appending writes. This caused write performance to
suffer dramatically. Fix this by detecting an appending write on a nonsparse
fs and skipping the read.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:35:29 -08:00
Mark Fasheh
9ea2d32f40 ocfs2: Commit journal on sync writes
We're missing a meta data commit for extending sync writes. In thoery, write
could return with the meta data required to read the data uncommitted to
disk. Fix that by detecting an allocating write and forcing a journal commit
in the sync case.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:32:00 -08:00
Mark Fasheh
9f70968af3 ocfs2: Re-order iput in ocfs2_drop_dentry_lock
Do this to avoid a theoretical (I haven't seen this in practice) race where
the downconvert thread might drop the dentry lock, allowing a remote unlink
to proceed before dropping the inode locks. This could bounce access to the
orphan dir between nodes.

There doesn't seem to be a need to do the same in ocfs2_dentry_iput() as
that's never called for the last ref drop from the downconvert thread.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:31:52 -08:00
Mark Fasheh
019d1b2247 ocfs2: Create locks at initially requested level
If we have not yet created a cluster lock, ocfs2_cluster_lock() will
first create it at NLMODE, and then convert the lock to either PRMODE or
EXMODE (whichever is requested).

Change ocfs2_cluster_lock() to just create the lock at the initially
requested level. ocfs2_locking_ast() handles this case fine, so the only
update required was in setup of locking state. This should reduce the number
of network messages required for a new lock by one, providing an incremental
performance enhancement.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:31:45 -08:00
Roel Kluin
3cf0c507dd [PATCH] Fix priority mistakes in fs/ocfs2/{alloc.c, dlmglue.c}
Fixes priority mistakes similar to '!x & y'

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:31:39 -08:00
Adrian Bunk
0af4bd3887 [2.6 patch] make ocfs2_find_entry_el() static
ocfs2_find_entry_el() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2007-11-06 15:31:06 -08:00