Commit 1f36f774b2 broke FS_REVAL_DOT semantics.
In particular, before this patch, the command
ls -l
in an NFS mounted directory would always check if the directory on the server
had changed and if so would flush and refill the pagecache for the dir.
After this patch, the same "ls -l" will repeatedly return stale date until
the cached attributes for the directory time out.
The following patch fixes this by ensuring the d_revalidate is called by
do_last when "." is being looked-up.
link_path_walk has already called d_revalidate, but in that case LOOKUP_OPEN
is not set so nfs_lookup_verify_inode chooses not to do any validation.
The following patch restores the original behaviour.
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
update the mnt of the path when it is not equal to the new one.
Signed-off-by: Huang Shijie <shijie8@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1) i_flags simply doesn't work for mount/unlink race prevention;
we may have many links to file and rm on one of those obviously
shouldn't prevent bind on top of another later on. To fix it
right way we need to mark _dentry_ as unsuitable for mounting
upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
i_mutex on the inode in question. Set it (with dont_mount(dentry))
in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
in namespace.c that used to check for S_DEAD. Setting S_DEAD
is still needed in places where we used to set it (for directories
getting killed), since we rely on it for readdir/rmdir race
prevention.
2) rename()/mount() protection has another bogosity - we unhash
the target before we'd checked that it's not a mountpoint. Fixed.
3) ancient bogosity in pivot_root() - we locked i_mutex on the
right directory, but checked S_DEAD on the different (and wrong)
one. Noticed and fixed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
According to specification
mkdir d; ln -s d a; open("a/", O_NOFOLLOW | O_RDONLY)
should return success but currently it returns ELOOP. This is a
regression caused by path lookup cleanup patch series.
Fix the code to ignore O_NOFOLLOW in case the provided path has trailing
slashes.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de>
Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lose want_dir argument, while we are at it - since now
nd->flags & LOOKUP_DIRECTORY is equivalent to it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We managed to lose O_DIRECTORY testing due to a stupid typo in commit
1f36f774b2 ("Switch !O_CREAT case to use of do_last()")
Reported-by: Walter Sheets <w41ter@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
quota: stop using QUOTA_OK / NO_QUOTA
dquot: cleanup dquot initialize routine
dquot: move dquot initialization responsibility into the filesystem
dquot: cleanup dquot drop routine
dquot: move dquot drop responsibility into the filesystem
dquot: cleanup dquot transfer routine
dquot: move dquot transfer responsibility into the filesystem
dquot: cleanup inode allocation / freeing routines
dquot: cleanup space allocation / freeing routines
ext3: add writepage sanity checks
ext3: Truncate allocated blocks if direct IO write fails to update i_size
quota: Properly invalidate caches even for filesystems with blocksize < pagesize
quota: generalize quota transfer interface
quota: sb_quota state flags cleanup
jbd: Delay discarding buffers in journal_unmap_buffer
ext3: quota_write cross block boundary behaviour
quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
quota: split out compat_sys_quotactl support from quota.c
quota: split out netlink notification support from quota.c
quota: remove invalid optimization from quota_sync_all
...
Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
Now that nd->last stays around until ->put_link() is called, we can
just postpone that ->put_link() in do_filp_open() a bit and don't
bother with copying.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If we'd passed through 32 trailing symlinks already, there's
no sense following the 33rd - we'll bail out anyway. Better
bugger off earlier.
It *does* change behaviour, after a fashion - if the 33rd happens
to be a procfs-style symlink, original code *would* allow it.
This one will not. Cry me a river if that hurts you. Please, do.
And post a video of that, while you are at it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Since do_last() doesn't mangle nd->last_name, we can safely postpone
__putname() done in handling of trailing symlinks until after the
call of do_last()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Brute-force separation of stuff reachable from do_last: with
the exception of do_link:; just take all that crap to a helper
function as-is and have it tell the caller if it has to go
to do_link.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
That's going to be a long and painful series. The first step:
take the stuff reachable from 'ok' label in do_filp_open() into
a new helper (finish_open()).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Currently various places in the VFS call vfs_dq_init directly. This means
we tie the quota code into the VFS. Get rid of that and make the
filesystem responsible for the initialization. For most metadata operations
this is a straight forward move into the methods, but for truncate and
open it's a bit more complicated.
For truncate we currently only call vfs_dq_init for the sys_truncate case
because open already takes care of it for ftruncate and open(O_TRUNC) - the
new code causes an additional vfs_dq_init for those which is harmless.
For open the initialization is moved from do_filp_open into the open method,
which means it happens slightly earlier now, and only for regular files.
The latter is fine because we don't need to initialize it for operations
on special files, and we already do it as part of the namespace operations
for directories.
Add a dquot_file_open helper that filesystems that support generic quotas
can use to fill in ->open.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Make sure that automount "symlinks" are followed regardless of LOOKUP_FOLLOW;
it should have no effect on them.
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
ima_path_check actually deals with files! call it ima_file_check instead.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The "Untangling ima mess, part 2 with counters" patch messed
up the counters. Based on conversations with Al Viro, this patch
streamlines ima_path_check() by removing the counter maintaince.
The counters are now updated independently, from measuring the file,
in __dentry_open() and alloc_file() by calling ima_counts_get().
ima_path_check() is called from nfsd and do_filp_open().
It also did not measure all files that should have been measured.
Reason: ima_path_check() got bogus value passed as mask.
[AV: mea culpa]
[AV: add missing nfsd bits]
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Some comments misspell "truly"; this fixes them. No code changes.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Instead of playing sick games with path saving, cleanups, just retry
the entire thing once with LOOKUP_REVAL added. Post-.34 we'll convert
all -ESTALE handling in there to that style, rather than playing with
many retry loops deep in the call chain.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
commit 5300990c03 had stepped on a rather
nasty mess: definitions of ACC_MODE used to be different. Fixed the
resulting breakage, converting them to variant that takes O_... value;
all callers have that and it actually simplifies life (see tomoyo part
of changes).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We end up trying to kfree() nd.last.name on open("/mnt/tmp", O_CREAT)
if /mnt/tmp is an autofs direct mount. The reason is that nd.last_type
is bogus here; we want LAST_BIND for everything of that kind and we
get LAST_NORM left over from finding parent directory.
So make sure that it *is* set properly; set to LAST_BIND before
doing ->follow_link() - for normal symlinks it will be changed
by __vfs_follow_link() and everything else needs it set that way.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
generic_permission was refusing CAP_DAC_READ_SEARCH-enabled
processes from opening DAC-protected files read-only, because
do_filp_open adds MAY_OPEN to the open mask.
Ignore MAY_OPEN. After this patch, CAP_DAC_READ_SEARCH is
again sufficient to open(fname, O_RDONLY) on a file to which
DAC otherwise refuses us read permission.
Reported-by: Mike Kazantsev <mk.fraggod@gmail.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Mike Kazantsev <mk.fraggod@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pull ACC_MODE to fs.h; we have several copies all over the place
* nightmarish expression calculating f_mode by f_flags deserves a helper
too (OPEN_FMODE(flags))
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Just set f_flags when shoving struct file into nameidata; don't
postpone that until __dentry_open(). do_filp_open() has correct
value; lookup_instantiate_filp() doesn't - we lose the difference
between O_RDWR and 3 by that point.
We still set .intent.open.flags, so no fs code needs to be changed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We can't get to this point unless it's a valid pointer.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
procfs-style symlinks return a last_type of LAST_BIND without an actual
path string. This causes __follow_link to skip calling __vfs_follow_link
and so the dentry isn't revalidated.
This is a problem when the link target sits on NFSv4 as it depends on
the VFS to revalidate the dentry before using it on an open call. Ensure
that this occurs by forcing a revalidation of the target dentry of
LAST_BIND symlinks.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Kill the 'update' argument of ima_path_check(), kill
dead code in ima.
Current rules: ima counters are bumped at the same time
when the file switches from put_filp() fodder to fput()
one. Which happens exactly in two places - alloc_file()
and __dentry_open(). Nothing else needs to do that at
all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* do ima_get_count() in __dentry_open()
* stop doing that in followups
* move ima_path_check() to right after nameidata_to_filp()
* don't bump counters on it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* take truncate logics into a helper (handle_truncate())
* rip it out of may_open()
* call it from the only caller of may_open() that might pass
O_TRUNC
* and do that after we'd finished with opening.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
All users outside of fs/ of get_empty_filp() have been removed. This patch
moves the definition from the include/ directory to internal.h so no new
users crop up and removes the EXPORT_SYMBOL. I'd love to see open intents
stop using it too, but that's a problem for another day and a smarter
developer!
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Use the sucker in other places in pathname resolution
that check MAY_EXEC for directories; lose the _lite
from name, it's equivalent of full-blown inode_permission()
for its callers (albeit still lighter, since large parts
of generic_permission() do not apply for pure MAY_EXEC).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>