All callers to __d_path pass the dentry and vfsmount of a struct path to
__d_path. Pass the struct path directly, instead.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In nearly all cases the set_fs_{root,pwd}() calls work on a struct
path. Change the function to reflect this and use path_get() here.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Use struct path in fs_struct.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This introduces the symmetric function to path_put() for getting a reference
to the dentry and vfsmount of a struct path in the right order.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use path_put() in a few places instead of {mnt,d}put()
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Add path_put() functions for releasing a reference to the dentry and
vfsmount of a struct path in the right order
* Switch from path_release(nd) to path_put(&nd->path)
* Rename dput_path() to path_put_conditional()
[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.
Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
<dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
struct path in every place where the stack can be traversed
- it reduces the overall code size:
without patch series:
text data bss dec hex filename
5321639 858418 715768 6895825 6938d1 vmlinux
with patch series:
text data bss dec hex filename
5320026 858418 715768 6894212 693284 vmlinux
This patch:
Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
path_release_on_umount() should only be called from sys_umount(). I merged the
function into sys_umount() instead of having in in namei.c.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The warning issued by fs/binfmt_flat.c when the format handler is given a
non-FLAT and non-script executable is annoying to say the least when working
with FDPIC ELF objects. If you build a kernel that supports both FLAT and
FDPIC ELFs on no-mmu, every time you execute an FDPIC ELF, the kernel spits
out this message. While I understand a lot of newcomers to the no-mmu world
screw up generation of FLAT binaries, this warning is not usable for systems
that support more than just FLAT.
Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Bernd Schmidt <bernds_cb1@t-online.de>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
inotify_max_user_instances, inotify_max_user_watches,
inotify_max_queued_events can all be made static.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The error field in nfs_readdir_descriptor_t is never used outside of the
function in which it is set. Remove the field and change the place that
does use it to use an existing local variable.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The warning message for a v4 server returning various bad sequence-ids is
missing spaces.
Signed-off-by: Dan Muntz <dmuntz@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The compat_sys_mount() system call throws EINVAL for text-based NFSv4
mounts.
The text-based mount interface assumes that any mount option blob that
doesn't set the version field to "1" is a C string (ie not a legacy
mount request). The compat_sys_mount() call treats blobs that don't
set the version field to "1" as an error. We just relax the check in
compat_sys_mount() a bit to allow C strings to be passed down to the NFSv4
client.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The reference counting for the NFSv4 callback thread stays artificially
high. When this thread comes down, it doesn't properly tear down the
svc_serv, causing a memory leak. In my testing on an older kernel on
x86_64, memory would leak out of the 8k kmalloc slab. So, we're leaking
at least a page of memory every time the thread comes down.
svc_create() creates the svc_serv with a sv_nrthreads count of 1, and
then svc_create_thread() increments that count. Whenever the callback
thread is started it has a sv_nrthreads count of 2. When coming down, it
calls svc_exit_thread() which decrements that count and if it hits 0, it
tears everything down. That never happens here since the count is always
at 2 when the thread exits.
The problem is that nfs_callback_up() should be calling svc_destroy() on
the svc_serv on both success and failure. This is how lockd_up_proto()
handles the reference counting, and doing that here fixes the leak.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In commit 742ba02a51 (udf: create common
function for changing free space counter) by accident I reversed safety
condition which lead to null pointer dereference in case of media error and
wrong counting of free space in normal situation
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch cleaning up UDF directory offset handling missed modifications in dir.c
(because I've submitted an old version :(). Fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Tested-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the warning message regarding smbfs to
"smbfs is deprecated and will be removed from the 2.6.27 kernel. Please migrate to cifs"
instead of
"smbfs is deprecated and will be removedfrom the 2.6.27 kernel. Please migrate to cifs"
Signed-off-by: Sergio Luis <sergio@uece.br>
Screwed-up-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
remove beX_add functions and replace all uses with beX_add_cpu
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Dave Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://linux-nfs.org/~bfields/linux:
SUNPRC: Fix printk format warning
nfsd: clean up svc_reserve_auth()
NLM: don't requeue block if it was invalidated while GRANT_MSG was in flight
NLM: don't reattempt GRANT_MSG when there is already an RPC in flight
NLM: have server-side RPC clients default to soft RPC tasks
NLM: set RPC_CLNT_CREATE_NOPING for NLM RPC clients
It's possible for lockd to catch a SIGKILL while a GRANT_MSG callback
is in flight. If this happens we don't want lockd to insert the block
back into the nlm_blocked list.
This helps that situation, but there's still a possible race. Fixing
that will mean adding real locking for nlm_blocked.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
With the current scheme in nlmsvc_grant_blocked, we can end up with more
than one GRANT_MSG callback for a block in flight. Right now, we requeue
the block unconditionally so that a GRANT_MSG callback is done again in
30s. If the client is unresponsive, it can take more than 30s for the
call already in flight to time out.
There's no benefit to having more than one GRANT_MSG RPC queued up at a
time, so put it on the list with a timeout of NLM_NEVER before doing the
RPC call. If the RPC call submission fails, we requeue it with a short
timeout. If it works, then nlmsvc_grant_callback will end up requeueing
it with a shorter timeout after it completes.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Now that it no longer does an RPC ping, lockd always ends up queueing
an RPC task for the GRANT_MSG callback. But, it also requeues the block
for later attempts. Since these are hard RPC tasks, if the client we're
calling back goes unresponsive the GRANT_MSG callbacks can stack up in
the RPC queue.
Fix this by making server-side RPC clients default to soft RPC tasks.
lockd requeues the block anyway, so this should be OK.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
It's currently possible for an unresponsive NLM client to completely
lock up a server's lockd. The scenario is something like this:
1) client1 (or a process on the server) takes a lock on a file
2) client2 tries to take a blocking lock on the same file and
awaits the callback
3) client2 goes unresponsive (plug pulled, network partition, etc)
4) client1 releases the lock
...at that point the server's lockd will try to queue up a GRANT_MSG
callback for client2, but first it requeues the block with a timeout of
30s. nlm_async_call will attempt to bind the RPC client to client2 and
will call rpc_ping. rpc_ping entails a sync RPC call and if client2 is
unresponsive it will take around 60s for that to time out. Once it times
out, it's already time to retry the block and the whole process repeats.
Once in this situation, nlmsvc_retry_blocked will never return until
the host starts responding again. lockd won't service new calls.
Fix this by skipping the RPC ping on NLM RPC clients. This makes
nlm_async_call return quickly when called.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Commit 8811930dc7 ("splice: missing user
pointer access verification") added the proper access_ok() calls to
copy_from_user_mmap_sem() which ensures we can copy the struct iovecs
from userspace to the kernel.
But we also must check whether we can access the actual memory region
pointed to by the struct iovec to fix the access checks properly.
Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Oliver Pinter <oliver.pntr@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This flag is simply a generic "this is a crash/burn test filesystem"
marker. If it is set, then filesystem code which is "in development"
will be allowed to mount the filesystem. Filesystem code which is not
considered ready for prime-time will check for this flag, and if it is
not set, it will refuse to touch the filesystem.
As we start rolling ext4 out to distro's like Fedora, et. al, this makes
it less likely that a user might accidentally start using ext4 on a
production filesystem; a bad thing, since that will essentially make it
be unfsckable until e2fsprogs catches up.
Signed-off-by: Theodore Tso <tytso@MIT.EDU>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Multiblock allocator calls BUG_ON in many case if the free and used
blocks count obtained looking at the bitmap is different from what
the allocator internally accounted for. Use ext4_error in such case
and don't panic the system.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
struct ext4_allocation_context is rather large, and this bloats
the stack of many functions which use it. Allocating it from
a named slab cache will alleviate this.
For example, with this change (on top of the noinline patch sent earlier):
-ext4_mb_new_blocks 200
+ext4_mb_new_blocks 40
-ext4_mb_free_blocks 344
+ext4_mb_free_blocks 168
-ext4_mb_release_inode_pa 216
+ext4_mb_release_inode_pa 40
-ext4_mb_release_group_pa 192
+ext4_mb_release_group_pa 24
Most of these stack-allocated structs are actually used only for
mballoc history; and in those cases often a smaller struct would do.
So changing that may be another way around it, at least for those
functions, if preferred. For now, in those cases where the ac
is only for history, an allocation failure simply skips the history
recording, and does not cause any other failures.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In JBD2 jbd2_journal_write_commit_record(), clear the buffer_ordered
flag for the bh after barried IO has succeed. This prevents later, if
the same buffer head were submitted to the underlying device, which has
been reconfigured to not support barrier request, the JBD2 commit code
could treat it as a normal IO (without barrier).
This is a port from JBD/ext3 fix from Neil Brown.
More details from Neil:
Some devices - notably dm and md - can change their behaviour in
response to BIO_RW_BARRIER requests. They might start out accepting
such requests but on reconfiguration, they find out that they cannot
any more. JBD2 deal with this by always testing if BIO_RW_BARRIER
requests fail with EOPNOTSUPP, and retrying the write
requests without the barrier (probably after waiting for any pending
writes to complete).
However there is a bug in the handling this in JBD2 for ext4 .
When ext4/JBD2 to submit a BIO_RW_BARRIER request,
it sets the buffer_ordered flag on the buffer head.
If the request completes successfully, the flag STAYS SET.
Other code might then write the same buffer_head after the device has
been reconfigured to not accept barriers. This write will then fail,
but the "other code" is not ready to handle EOPNOTSUPP errors and the
error will be treated as fatal.
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We cannot start transaction in ext4_direct_IO() and just let it last
during the whole write because dio_get_page() acquires mmap_sem which
ranks above transaction start (e.g. because we have dependency chain
mmap_sem->PageLock->journal_start, or because we update atime while
holding mmap_sem) and thus deadlocks could happen. We solve the problem
by starting a transaction separately for each ext4_get_block() call.
We *could* have a problem that we allocate a block and before its data
are written out the machine crashes and thus we expose stale data. But
that does not happen because for hole-filling generic code falls back to
buffered writes and for file extension, we add inode to orphan list and
thus in case of crash, journal replay will truncate inode back to the
original size.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In order to prevent a circular locking dependency when an unlink
operation is racing with an ext4 migration, we delay taking i_data_sem
until just before switch the inode format, and use i_mutex to prevent
writes and truncates during the first part of the migration operation.
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
When xfs_file_readdir() exactly fills a buffer, it can move it's index
past the end of the buffer and dereference it even though the result of
the dereference is never used. On some platforms this causes an oops.
SGI-PV: 976923
SGI-Modid: xfs-linux-melb:xfs-kern:30458a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
The only caller (xfs_fs_fill_super) can simplify call igrab on the root
inode.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30393a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
To get the read-only bind mounts in -mm to work correctly with XFS we need
to call the drop_nlink and inc_nlink helpers to monitor the link count.
Add calls to these to xfs_bumplink and xfs_droplink and stop copying over
di_nlink to i_nlink in xfs_validate_fields and vn_revalidate.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30392a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
The VFS doesn't use i_blocks, it's only used by generic_fillattr and the
generic quota code which XFS doesn't use. In XFS there is one use to check
whether we have an inline or out of line sumlink, but we can replace that
with a check of the XFS_IFINLINE inode flag.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30391a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Checking the entire AIL on every insert and remove is prohibitively
expensive - the sustained sequntial create rate on a single disk drops
from about 1800/s to 60/s because of this checking resulting in the
xfslogd becoming cpu bound.
By default on debug builds, only check the next and previous entries in
the list to ensure they are ordered correctly. If you really want, define
XFS_TRANS_DEBUG to use the old behaviour.
SGI-PV: 972759
SGI-Modid: xfs-linux-melb:xfs-kern:30372a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
When many hundreds to thousands of threads all try to do simultaneous
transactions and the log is in a tail-pushing situation (i.e. full), we
can get multiple threads walking the AIL list and contending on the AIL
lock.
The AIL push is, in effect, a simple I/O dispatch algorithm complicated by
the ordering constraints placed on it by the transaction subsystem. It
really does not need multiple threads to push on it - even when only a
single CPU is pushing the AIL, it can push the I/O out far faster that
pretty much any disk subsystem can handle.
So, to avoid contention problems stemming from multiple list walkers, move
the list walk off into another thread and simply provide a "target" to
push to. When a thread requires a push, it sets the target and wakes the
push thread, then goes to sleep waiting for the required amount of space
to become available in the log.
This mechanism should also be a lot fairer under heavy load as the waiters
will queue in arrival order, rather than queuing in "who completed a push
first" order.
Also, by moving the pushing to a separate thread we can do more
effectively overload detection and prevention as we can keep context from
loop iteration to loop iteration. That is, we can push only part of the
list each loop and not have to loop back to the start of the list every
time we run. This should also help by reducing the number of items we try
to lock and/or push items that we cannot move.
Note that this patch is not intended to solve the inefficiencies in the
AIL structure and the associated issues with extremely large list
contents. That needs to be addresses separately; parallel access would
cause problems to any new structure as well, so I'm only aiming to isolate
the structure from unbounded parallelism here.
SGI-PV: 972759
SGI-Modid: xfs-linux-melb:xfs-kern:30371a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Now that all direct caller of xfs_iaccess are gone we can kill xfs_iaccess
and xfs_access and just use generic_permission with a check_acl callback.
This is required for the per-mount read-only patchset in -mm to work
properly with XFS.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30370a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
xfs_swapext should simplify check if we have a writeable file descriptor
instead of re-checking the permissions using xfs_iaccess. Add an
additional check to refuse O_APPEND file descriptors because swapext is
not an append-only write operation.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30369a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Both callers of xfs_change_file_space alreaedy do the file->f_mode &
FMODE_WRITE check to ensure we have a file descriptor that has been opened
for write mode, so there is no need to re-check that with xfs_iaccess.
Especially as the later might wrongly deny it for corner cases like file
descriptor passing through unix domain sockets.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30365a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
A problem was reported where a system panicked in log recovery due to a
corrupt log record. The cause of the corruption is not known but this
change will at least prevent a crash for this specific scenario. Log
recovery definitely needs some more work in this area.
SGI-PV: 974151
SGI-Modid: xfs-linux-melb:xfs-kern:30318a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
- merge xfs_fid2 into it's only caller xfs_dm_inode_to_fh.
- remove xfs_vget and opencode it in the two callers, simplifying
both of them by avoiding the awkward calling convetion.
- assign directly to the dm_fid_t members in various places in the
dmapi code instead of casting them to xfs_fid_t first (which
is identical to dm_fid_t)
SGI-PV: 974747
SGI-Modid: xfs-linux-melb:xfs-kern:30258a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
xfs_lowbit64 was broken on 32 bit platforms in a recent cleanup of the xfs
bitops. Fix it back up again.
SGI-PV: 974005
SGI-Modid: xfs-linux-melb:xfs-kern:30202a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Currently XFS_IFORK_* and XFS_DFORK* are implemented by means of
XFS_CFORK* macros. But given that XFS_IFORK_* operates on an xfs_inode
that embedds and xfs_icdinode_core and XFS_DFORK_* operates on an
xfs_dinode that embedds a xfs_dinode_core one will have to do endian
swapping while the other doesn't. Instead of having the current mess with
the CFORK macros that have byteswapping and non-byteswapping version
(which are inconsistantly named while we're at it) just define each family
of the macros to stand by itself and simplify the whole matter.
A few direct references to the CFORK variants were cleaned up to use IFORK
or DFORK to make this possible.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30163a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
There is no need to lock any page in xfs_buf.c because we operate on our
own address_space and all locking is covered by the buffer semaphore. If
we ever switch back to main blockdeive address_space as suggested e.g. for
fsblock with a similar scheme the locking will have to be totally revised
anyway because the current scheme is neither correct nor coherent with
itself.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30156a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30098a
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
The BPCSHIFT based macros, btoc*, ctob*, offtoc* and ctooff are either not
used or don't need to be used. The NDPP, NDPP, NBBY macros don't need to
be used but instead are replaced directly by PAGE_SIZE and PAGE_CACHE_SIZE
where appropriate. Initial patch and motivation from Nicolas Kaiser.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30096a
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
This assert is bogus. We can have a forced shutdown occur
between the check for the XLOG_FORCED_SHUTDOWN and the ASSERT. Also, the
logging system shouldn't care about the state of XFS_FORCED_SHUTDOWN, it
should only check XLOG_FORCED_SHUTDOWN. The logging system has it's own
forced shutdown flag so, for the case of a forced shutdown that's not due
to a logging error, we can flush the log.
SGI-PV: 972985
SGI-Modid: xfs-linux-melb:xfs-kern:30029a
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Use XFS_IS_REALTIME_INODE in more places, and #define it to 0 if
CONFIG_XFS_RT is off. This should be safe because mount checks in
xfs_rtmount_init:
so if we get mounted w/o CONFIG_XFS_RT, no realtime inodes should be
encountered after that.
Defining XFS_IS_REALTIME_INODE to 0 saves a bit of stack space,
presumeably gcc can optimize around the various "if (0)" type checks:
xfs_alloc_file_space -8 xfs_bmap_adjacent -16 xfs_bmapi -8
xfs_bmap_rtalloc -16 xfs_bunmapi -28 xfs_free_file_space -64 xfs_imap +8
<-- ? hmm. xfs_iomap_write_direct -12 xfs_qm_dqusage_adjust -4
xfs_qm_vop_chown_reserve -4
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30014a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Mount option parsing is platform specific. Move it out of core code into
the platform specific superblock operation file.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30012a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Implement the new generic callout for file preallocation. Atomically
change the file size if requested.
SGI-PV: 972756
SGI-Modid: xfs-linux-melb:xfs-kern:30009a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
The log force added in xfs_iget_core() has been a performance issue since
it was introduced for tight loops that allocate then unlink a single file.
under heavy writeback, this can introduce unnecessary latency due tothe
log I/o getting stuck behind bulk data writes.
Fix this latency problem by avoinding the need for the log force by moving
the place we mark linux inode dirty to the transaction commit rather than
on transaction completion.
This also closes a potential hole in the sync code where a linux inode is
not dirty between the time it is modified and the time the log buffer has
been written to disk.
SGI-PV: 972753
SGI-Modid: xfs-linux-melb:xfs-kern:30007a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Prevent transaction overrun in xfs_iomap_write_allocate() if we race with
a truncate that overlaps the delalloc range we were planning to allocate.
If we race, we may allocate into a hole and that requires block
allocation. At this point in time we don't have a reservation for block
allocation (apart from metadata blocks) and so allocating into a hole
rather than a delalloc region results in overflowing the transaction block
reservation.
Fix it by only allowing a single extent to be allocated at a time.
SGI-PV: 972757
SGI-Modid: xfs-linux-melb:xfs-kern:30005a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
There are several mount options that don't show up in /proc/mounts. Add
them in and clean up the showargs code at the same time.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30004a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Sparse trips over the locking order in xlog_recover_do_efd_trans() when
xfs_trans_delete_ail() drops the ail lock. Because the unlock is
conditional, we need to either annotate with a "fake unlock" or change the
structure of the code so sparse thinks the function always unlocks.
Reordering the code makes it simpler, so do that.
SGI-PV: 972755
SGI-Modid: xfs-linux-melb:xfs-kern:30003a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
These are mostly locking annotations, marking things static, casts where
needed and declaring stuff in header files.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30002a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
We need xfs_bulkstat() to report inode stat for inodes with link count
zero but reference count non zero.
The fix here:
http://oss.sgi.com/archives/xfs/2007-09/msg00266.html
changed this behavior and made xfs_bulkstat() to filter all unlinked
inodes including those that are not destroyed yet but held by reference.
The attached patch returns back to the original behavior by marking the
on-disk inode buffer "dirty" when di_mode is cleared (at that time both
inode link and reference counter are zero).
SGI-PV: 972004
SGI-Modid: xfs-linux-melb:xfs-kern:29914a
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
XFS_IOC_GETVERSION, XFS_IOC_GETXFLAGS and XFS_IOC_SETXFLAGS all take a
"long" which changes size between 32 and 64 bit platforms.
So, the ioctl cmds that come in from a 32-bit app aren't as expected, for
example on GETXFLAGS,
unknown cmd fd(3) cmd(80046601){t:'f';sz:4}
due to the size mismatch.
So, use instead the 32-bit version of the commands for compat ioctls, and
other than that it doesn't take any more manipulation.
Also, for both native and compat versions, just define them to the values
as defined in fs.h
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29849a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
No need for xfs to have its own hex dumping routine now that the kernel
has one.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29847a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
This macro is unused an all other acros in this family operate on native
types, so we most likely won't grow a user either.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29846a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
There is no need to lock any page in xfs_buf.c because we operate on our
own address_space and all locking is covered by the buffer semaphore. If
we ever switch back to main blockdeive address_space as suggested e.g. for
fsblock with a similar scheme the locking will have to be totally revised
anyway because the current scheme is neither correct nor coherent with
itself.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29845a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Refactoring xfs_mountfs() to call sub-functions for logical chunks can
help save a bit of stack, and can make it easier to read this long
function.
The mount path is one of the longest common callchains, easily getting to
within a few bytes of the end of a 4k stack when over lvm, quotas are
enabled, and quotacheck must be done.
With this change on top of the other stack-related changes I've sent, I
can get xfs to survive a normal xfsqa run on 4k stacks over lvm.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29834a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Mostly trivial conversion with one exceptions: h_num_logops was kept in
native endian previously and only converted to big endian in xlog_sync,
but we always keep it big endian now. With todays cpus fast byteswap
instructions that's not an issue but the new variant keeps the code clean
and maintainable.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29821a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
- the various assign lsn macros are replaced by a single inline,
xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except
for a more sane calling convention. ASSIGN_LSN_DISK is replaced
by xlog_assign_lsn and a manual bytespap, and ASSIGN_LSN by the same,
except we pass the cycle and block arguments explicitly instead of a
log paramter. The latter two variants only had 2, respectively one
user anyway.
- the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the
same calling conventions.
- GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away
the unused arch argument. Instead of conditional defintions
depending on host endianess we now do an unconditional swap and shift
then, which generates equal code.
- the unused XLOG_SET macro is removed.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29820a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
- the various assign lsn macros are replaced by a single inline,
xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except
for a more sane calling convention. ASSIGN_LSN_DISK is replaced
by xlog_assign_lsn and a manual bytespap, and ASSIGN_LSN by the same,
except we pass the cycle and block arguments explicitly instead of a
log paramter. The latter two variants only had 2, respectively one
user anyway.
- the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the
same calling conventions.
- GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away
the unused arch argument. Instead of conditional defintions
depending on host endianess we now do an unconditional swap and shift
then, which generates equal code.
- the unused XLOG_SET macro is removed.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29819a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
No need to have a wrapper just two call two more functions.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29816a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Get rid of vnode useage in xfs_iget.c and pass Linux inode / xfs_inode
where apropinquate. And kill some useless helpers while we're at it.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29808a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
xfs_ioctl.c passes around vnode pointers quite a lot, but all places
already have the Linux inode which is identical to the vnode these days.
Clean the code up to always use the Linux inode.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29807a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
We were already filling the Linux struct statfs anyway, and doing this
trivial task directly in xfs_fs_statfs makes the code quite a bit cleaner.
While I was at it I also moved copying attributes that don't change over
the lifetime of the filesystem outside the superblock lock.
xfs_fs_fill_super used to get the magic number and blocksize through
xfs_statvfs, but assigning them directly is a lot cleaner and will save
some stack space during mount.
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29802a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Just fill in struct kstat directly from the xfs_inode instead of doing a
detour through a bhv_vattr_t and xfs_getattr.
SGI-PV: 970980
SGI-Modid: xfs-linux-melb:xfs-kern:29770a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
xfs_iocore_t is a structure embedded in xfs_inode. Except for one field it
just duplicates fields already in xfs_inode, and there is nothing this
abstraction buys us on XFS/Linux. This patch removes it and shrinks source
and binary size of xfs aswell as shrinking the size of xfs_inode by 60/44
bytes in debug/non-debug builds.
SGI-PV: 970852
SGI-Modid: xfs-linux-melb:xfs-kern:29754a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
remove spinlock init abstraction macro in spin.h, remove the callers, and
remove the file. Move no-op spinlock_destroy to xfs_linux.h Cleanup
spinlock locals in xfs_mount.c
SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29751a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Un-obfuscate XFS_SB_LOCK, remove XFS_SB_LOCK->mutex_lock->spin_lock
macros, call spin_lock directly, remove extraneous cookie holdover from
old xfs code, and change lock type to spinlock_t.
SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29746a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Un-obfuscate dabuf_global_lock, remove mutex_lock->spin_lock macros, call
spin_lock directly, remove extraneous cookie holdover from old xfs code,
and change lock type to spinlock_t.
SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29744a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Un-obfuscate pagb_lock, remove mutex_lock->spin_lock macros, call
spin_lock directly, remove extraneous cookie holdover from old xfs code,
and change lock type to spinlock_t.
SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29743a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Un-obfuscate DQ_PINLOCK, remove DQ_PINLOCK->mutex_lock->spin_lock macros,
call spin_lock directly, remove extraneous cookie holdover from old xfs
code, and change lock type to spinlock_t.
SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29742a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Un-obfuscate GRANT_LOCK, remove GRANT_LOCK->mutex_lock->spin_lock macros,
call spin_lock directly, remove extraneous cookie holdover from old xfs
code, and change lock type to spinlock_t.
SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29741a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Un-obfuscate LOG_LOCK, remove LOG_LOCK->mutex_lock->spin_lock macros, call
spin_lock directly, remove extraneous cookie holdover from old xfs code,
and change lock type to spinlock_t.
SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29740a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Currently there is an indirection called ioops in the XFS data I/O path.
Various functions are called by functions pointers, but there is no
coherence in what this is for, and of course for XFS itself it's entirely
unused. This patch removes it instead and significantly reduces source and
binary size of XFS while making maintaince easier.
SGI-PV: 970841
SGI-Modid: xfs-linux-melb:xfs-kern:29737a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
No need to allocate a bhv_vattr_t on stack and call xfs_getattr to update
a few fields in the Linux inode from the XFS inode, just do it directly.
And yes, this function is in dire need of a better name and prototype,
I'll do in a separate patch, though.
SGI-PV: 970705
SGI-Modid: xfs-linux-melb:xfs-kern:29713a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
There is no reason to go through xfs_iomap for the BMAPI_UNWRITTEN because
it has nothing in common with the other cases. Instead check for the
shutdown filesystem in xfs_end_bio_unwritten and perform a direct call to
xfs_iomap_write_unwritten (which should be renamed to something more
sensible one day)
SGI-PV: 970241
SGI-Modid: xfs-linux-melb:xfs-kern:29681a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
There is no reason to go into the iomap machinery just to get the right
block device for an inode. Instead look at the realtime flag in the inode
and grab the right device from the mount structure.
I created a new helper, xfs_find_bdev_for_inode instead of opencoding it
because I plan to use it in other places in the future.
SGI-PV: 970240
SGI-Modid: xfs-linux-melb:xfs-kern:29680a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Simplify vnode tracing calls by embedding function name & return addr in
the calling macro.
Also do a lot of vnode->inode renaming for consistency, while we're at it.
SGI-PV: 970335
SGI-Modid: xfs-linux-melb:xfs-kern:29650a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
A large part of xfs_sync_inodes is conditional on the SYNC_BDFLUSH which
is never passed to it. This patch removes it and adds an assert that
triggers in case some new code tries to pass SYNC_BDFLUSH to it.
SGI-PV: 970242
SGI-Modid: xfs-linux-melb:xfs-kern:29630a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
The ext3 root inode was treated specially with respect
to in-inode extended attributes, for reasons detailed
in the removed comment below. The first mkfs-created
inodes would not get extra_i_size or the EXT3_STATE_XATTR
flag set in ext3_read_inode, which disallowed reading or
setting in-inode EAs on the root.
However, in ext4, ext4_mark_inode_dirty calls
ext4_expand_extra_isize for all inodes; once this is done
EAs may be placed in the root ext4 inode body.
But for reasons above, it won't be found after a reboot.
testcase:
setfattr -n user.name -v value mntpt/
setfattr -n user.name2 -v value2 mntpt/
umount mntpt/; remount mntpt/
getfattr -d mntpt/
name2/value2 has gone missing; debugfs shows it in the
inode body, but it is not found there by getattr.
The following fixes it up; newer mkfs appears to properly
zero the inodes, so this workaround isn't needed for ext4.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Repoted by Adrian Bunk <bunk@kernel.org>:
The Coverity checker spotted the following NULL dereference:
static int ext4_mb_mark_diskspace_used
{
...
if (!bitmap_bh)
goto out_err;
...
out_err:
sb->s_dirt = 1;
put_bh(bitmap_bh);
...
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
smbfs is a bit buggy and has no maintainer. Change it to shout at the user on
the first five mount attempts - tell them to switch to CIFS.
Come December we'll mark it BROKEN and see what happens.
[olecom@flower.upol.cz: documentation update]
Cc: Urban Widmark <urban@teststation.com>
Acked-by: Steven French <sfrench@us.ibm.com>
Signed-off-by: Oleg Verych <olecom@flower.upol.cz>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To convert from tv_nsec to tv_usec, one needs to divide by 1000, not multiply.
Signed-off-by: Dominique Quatravaux <dominique@quatravaux.org>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch supports legacy (32-bit) capability userspace, and where possible
translates 32-bit capabilities to/from userspace and the VFS to 64-bit
kernel space capabilities. If a capability set cannot be compressed into
32-bits for consumption by user space, the system call fails, with -ERANGE.
FWIW libcap-2.00 supports this change (and earlier capability formats)
http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/
[akpm@linux-foundation.org: coding-syle fixes]
[akpm@linux-foundation.org: use get_task_comm()]
[ezk@cs.sunysb.edu: build fix]
[akpm@linux-foundation.org: do not initialise statics to 0 or NULL]
[akpm@linux-foundation.org: unused var]
[serue@us.ibm.com: export __cap_ symbols]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Originally vfs_getxattr would pull the security xattr variable using
the inode getxattr handle and then proceed to clobber it with a subsequent call
to the LSM.
This patch reorders the two operations such that when the xattr requested is
in the security namespace it first attempts to grab the value from the LSM
directly.
If it fails to obtain the value because there is no module present or the
module does not support the operation it will fall back to using the inode
getxattr operation.
In the event that both are inaccessible it returns EOPNOTSUPP.
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>