As the first step towards eliminating the ref->next_phys member and saving
memory by using an _array_ of struct jffs2_raw_node_ref per eraseblock,
stop the write functions from allocating their own refs; have them just
_reserve_ the appropriate number instead. Then jffs2_link_node_ref() can
just fill them in.
Use a linked list of pre-allocated refs in the superblock, for now. Once
we switch to an array, it'll just be a case of extending that array.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Else a subsequent bio_clone might make a mess.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: "Don Dupuis" <dondster@gmail.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Both cause the 'entries' count in the export cache to be non-zero at module
removal time, so unregistering that cache fails and results in an oops.
1/ exp_pseudoroot (used for NFSv4 only) leaks a reference to an export
entry.
2/ sunrpc_cache_update doesn't increment the entries count when it adds
an entry.
Thanks to "david m. richter" <richterd@citi.umich.edu> for triggering the
problem and finding one of the bugs.
Cc: "david m. richter" <richterd@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
MTD clients are agnostic of FLASH which needs ECC suppport.
Remove the functions and fixup the callers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The writev based write buffer implementation was far to complex as
in most use cases the write buffer had to be handled anyway.
Simplify the write buffer handling and use mtd->write instead.
From extensive testing no performance impact has been noted.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We don't need the upper layers to deal with the physical offset. It's
_always_ c->nextblock->offset + c->sector_size - c->nextblock->free_size
so we might as well just let the actual write functions deal with that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
o Add a flag MTD_BIT_WRITEABLE for devices that allow single bits to be
cleared.
o Replace MTD_PROGRAM_REGIONS with a cleared MTD_BIT_WRITEABLE flag for
STMicro and Intel Sibley flashes with internal ECC. Those flashes
disallow clearing of single bits, unlike regular NOR flashes, so the
new flag models their behaviour better.
o Remove MTD_ECC. After the STMicro/Sibley merge, this flag is only set
and never checked.
Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
In 2002, STMicro started producing NOR flashes with internal ECC protection
for small blocks (8 or 16 bytes). Support for those flashes was added by me.
In 2005, Intel Sibley flashes copied this strategy and Nico added support for
those. Merge the code for both.
Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
At least two flashes exists that have the concept of a minimum write unit,
similar to NAND pages, but no other NAND characteristics. Therefore, rename
the minimum write unit to "writesize" for all flashes, including NAND.
Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
We'll be using a proper list of nodes in the jffs2_xattr_datum and
jffs2_xattr_ref structures, because the existing code to overwrite
them is just broken. Put it in the common part at the front of the
structure which is shared with the jffs2_inode_cache, so that the
jffs2_link_node_ref() function can do the right thing.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
In a couple of places, we assume that what's at the end of the
->next_in_ino list is a struct jffs2_inode_cache. Let's check
for that, since we expect it to change soon.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Let's avoid the potential for forgetting to set ref->next_in_ino, by doing
it within jffs2_link_node_ref() instead.
This highlights the ugliness of what we're currently doing with
xattr_datum and xattr_ref structures -- we should find a nicer way of
dealing with that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
When filing REF_OBSOLETE nodes, we'd add their size to the global
'dirty_size' count, but then to the eraseblock's 'used_size' count.
That's not clever.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Don't reassign to watch. If idr_find() returns NULL, then
put_inotify_watch() will choke.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While doing some inotify stress testing, I hit the following race. In
inotify_release(), it's possible for a watch to be removed from the lists
in between dropping dev->mutex and taking inode->inotify_mutex. The
reference we hold prevents the watch from being freed, but not from being
removed.
Checking the dev's idr mapping will prevent a double list_del of the
same watch.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rml@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Bernd Schmidt points out that binfmt_flat is now leaving the exec file open
while the application runs. This offsets all the application's fd numbers.
We should have closed the file within exec(), not at exit()-time.
But there doesn't seem to be a lot of point in doing all this just to avoid
going over RLIMIT_NOFILE by one fd for a few microseconds. So take the EMFILE
checking out again. This will cause binfmt_flat to again fail LTP's
exec-should-return-EMFILE-when-fdtable-is-full test. That test appears to be
wrong anyway - Open Group specs say nothing about exec() returning EMFILE.
Cc: Bernd Schmidt <bernd.schmidt@analog.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Assigning the result of posix_acl_to_xattr() to an unsigned data type
(size/size_t) obscures possible errors.
Coverity CID: 1206.
Signed-off-by: Florin Malita <fmalita@gmail.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Address a problem found when a Linux NFS server uses the "subtree_check"
export option.
The "subtree_check" NFS export option was designed to prohibit a client
from using a file handle for which it should not have permission. The
algorithm used is to ensure that the entire path to the file being
referenced is accessible to the user attempting to use the file handle. If
some part of the path is not accessible, then the operation is aborted and
the appropriate version of ESTALE is returned to the NFS client.
The error, ESTALE, is unfortunate in that it causes NFS clients to make
certain assumptions about the continued existence of the file. They assume
that the file no longer exists and refuse to attempt to access it again.
In this case, the file really does exist, but access was denied by the
server for a particular user.
A better error to return would be an EACCES sort of error. This would
inform the client that the particular operation that it was attempting was
not allowed, without the nasty side effects of the ESTALE error.
Signed-off-by: Peter Staubach <staubach@redhat.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Functions compat_nfs_svc_trans, compat_nfs_clnt_trans,
compat_nfs_exp_trans, compat_nfs_getfd_trans and compat_nfs_getfs_trans,
which are called by compat_sys_nfsservctl(fs/compat.c), don't handle the
return value of access_ok properly. access_ok return 1 when the addr is
valid, and 0 when it's not, but these functions have the reversed
understanding. When the address is valid, they always return -EFAULT to
compat_sys_nfsservctl.
An example is to run /usr/sbin/rpc.nfsd(32bit program on Power5). It
doesn't function as expected. strace showes that nfsservctl returns
-EFAULT.
The patch fixes this by correcting the error handling on the return value
of access_ok in the five functions.
Signed-off-by: Lin Feng Shen <shenlinf@cn.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Well, almost. We'll actually keep a 'TEST_TOTLEN' macro set for now, and keep
doing some paranoia checks to make sure it's all working correctly. But if
TEST_TOTLEN is unset, the size of struct jffs2_raw_node_ref drops from 16
bytes to 12 on 32-bit machines. That's a saving of about half a megabyte of
memory on the OLPC prototype board, with 125K or so nodes in its 512MiB of
flash.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We can't use jffs2_scan_dirty_space() because it doesn't do any locking; it's
only for use at scan time -- hence the 'scan' in the name.
Also, don't allocate refs while we have c->erase_completion_lock held.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We don't allocate this locally any more -- it's given to us and owner by
our caller. Also improve the debug messages a little.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Next step in ongoing campaign to file a struct jffs2_raw_node_ref for every
piece of dirty space in the system, so that __totlen can be killed off....
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
If __totlen is going away, we need to pass the length in separately.
Also stop callers from needlessly setting ref->next_phys to NULL,
since that's done for them... and since that'll also be going away soon.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Make sure we allocate a ref for any dirty space which exists between nodes
which we find in an eraseblock summary.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The incoming ref_totlen() calculation is going to rely on the existence
of nodes which cover all dirty space. We can't just tweak the accounting
data any more; we have to call jffs2_scan_dirty_space() to do it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
To eliminate the __totlen field from struct jffs2_raw_node_ref, we need
to allocate nodes for dirty space instead of just tweaking the accounting
data. Introduce jffs2_scan_dirty_space() in preparation for that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
For RWCOMPAT and ROCOMPAT nodes, we should still allow the mount to
succeed. Just abandon the summary and fall through to the full scan.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
If we had to allocate extra space for the summary node, we weren't
correctly freeing it when jffs2_sum_scan_sumnode() returned nonzero --
which is both the success and the failure case. Only when it returned
zero, which means fall through to the full scan, were we correctly freeing
the buffer.
Document the meaning of those return codes while we're at it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We should preserve these when we come to garbage collect them, not let
them get erased. Use jffs2_garbage_collect_pristine() for this, and make
sure the summary code copes -- just refrain from writing a summary for any
block which contains a node we don't understand.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The same sequence of code was repeated in many places, to add a new
struct jffs2_raw_node_ref to an eraseblock and adjust the space accounting
accordingly. Move it out-of-line.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We were calling ref_totlen() 18 times. Even before that becomes a real
function rather than just a dereference, apparently some compilers still
suck anyway. It'll _certainly_ suck after ref_totlen() becomes more
complicated, so calculate it once and don't rely on CSE.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This improves the time to mount 512MiB of NAND flash on my OLPC prototype
by about 4%. We used to read the last page of the eraseblock twice -- once
to find the offset of the summary node, and again to actually _read_ the
summary node. Now we read the last page only once, and read more only if
we need to.
We also don't allocate a new buffer just for the summary code -- we use
the buffer which was already allocated for the scan. Better still, if the
'buffer' for the scan is actually just a pointer directly into NOR flash,
we use that too, avoiding the memcpy() which we used to do.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Remove forgotten lines from jffs2_scan_eraseblock() which
were unnecessary and may cause problem in some environments.
Thanks to Alexander Belyakov <alexander.belyakov@intel.com>.
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Device node major/minor numbers are just stored in the payload of a single
data node. Just extend that to 4 bytes and use new_encode_dev() for it.
We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo).
This preserves backwards compatibility with older code as much as
possible. If we do make devices with major or minor numbers above 255, and
then mount the file system with the old code, it'll just read the first
two bytes and get the numbers wrong. If it comes to garbage-collect it,
it'll then write back those wrong numbers. But that's about the best we
can expect.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
configfs_init() needs to be called first to register configfs before anyconsumers try to access it. Move up configfs in fs/Makefile to make
sure it is initialized early.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
If configfs_mkdir() errored in certain ways after the parent<->child
linkage was already created, it would not undo the linkage. Also,
comment the reference counting for clarity.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
configfs_mkdir() failed to release the working parent reference in most
exit paths. Also changed the exit path for readability.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We were using GFP_KERNEL in a handful of places which really wanted
GFP_NOFS. Fix this.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Temporarily take the meta data lock in ocfs2_file_aio_read() to allow us to
update our inode fields.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We need to take a data lock around extends to protect the pages that
ocfs2_zero_extend is going to be pulling into the page cache. Otherwise an
extend on one node might populate the page cache with data pages that have
no lock coverage.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
fs/jffs2/summary.c: In function ‘jffs2_sum_write_data’:
fs/jffs2/summary.c:658: warning: format ‘%zd’ expects type ‘signed size_t’, but argument 4 has type ‘uint32_t’
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Mark certain functions with __init and __exit appropriately.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>