When an event (such as an interrupt) is injected, and the stack is
shadowed (and therefore write protected), the guest will exit. The
current code will see that the stack is shadowed and emulate a few
instructions, each time postponing the injection. Eventually the
injection may succeed, but at that time the guest may be unwilling
to accept the interrupt (for example, the TPR may have changed).
This occurs every once in a while during a Windows 2008 boot.
Fix by unshadowing the fault address if the fault was due to an event
injection.
Signed-off-by: Avi Kivity <avi@qumranet.com>
There is no guarantee that the old TSS descriptor in the GDT contains
the proper base address. This is the case for Windows installation's
reboot-via-triplefault.
Use guest registers instead. Also translate the address properly.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The segment base is always a linear address, so translate before
accessing guest memory.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
If NPT is enabled after loading both KVM modules on AMD and it should be
disabled, both KVM modules must be reloaded. If only the architecture module is
reloaded the behavior is undefined. With this patch it is possible to disable
NPT only by reloading the kvm_amd module.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files
mlx4_core: Add VLAN tag field to WQE control segment struct
RDMA/nes: CM connection setup/teardown rework
IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG
IPoIB/cm: Connected mode is no longer EXPERIMENTAL
RDMA/ucm: BKL is not needed for ib_ucm_open()
RDMA/ucma: BKL is not needed for ucma_open()
The fix NULLed a pointer without freeing it.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Reported-by: Juha Motorsportcom <juha_motorsportcom@luukku.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.infradead.org/mtd-2.6: (57 commits)
[MTD] [NAND] subpage read feature as a way to increase performance.
CPUFREQ: S3C24XX NAND driver frequency scaling support.
[MTD][NAND] au1550nd: remove unused variable
[MTD] jedec_probe: Fix SST 16-bit chip detection
[MTD][MTDPART] Fix a division by zero bug
[MTD][MTDPART] Cleanup and document the erase region handling
[MTD][MTDPART] Handle most checkpatch findings
[MTD][MTDPART] Seperate main loop from per-partition code in add_mtd_partition
[MTD] physmap: resume already suspended chips on failure to suspend
[MTD] physmap: Fix suspend/resume/shutdown bugs.
[MTD] [NOR] Fix -ETIMEO errors in CFI driver
[MTD] [NAND] fsl_elbc_nand: fix section mismatch with CONFIG_MTD_OF_PARTS=y
[JFFS2] Use .unlocked_ioctl
[MTD] Fix const assignment in the MTD command line partitioning driver
[MTD] [NOR] gen_probe: No debug message when debugging is disabled
[MTD] [NAND] remove __PPC__ hardcoded address from DiskOnChip drivers
[MTD] [MAPS] Remove the bast-flash driver.
[MTD] [NAND] fsl_elbc_nand: ecclayout cleanups
[MTD] [NAND] fsl_elbc_nand: implement support for flash-based BBT
[MTD] [NAND] fsl_elbc_nand: fix OOB workability for large page NAND chips
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
atmel-mci: debugfs support
mmc: Add per-card debugfs support
mmc: Export internal host state through debugfs
imxmmc: fix crash when no platform data is provided
imxmmc: fix platform resources
imxmmc: remove DEBUG definition
mmc_spi: put signals to low power off fix
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits)
[PATCH] fix RLIM_NOFILE handling
[PATCH] get rid of corner case in dup3() entirely
[PATCH] remove remaining namei_{32,64}.h crap
[PATCH] get rid of indirect users of namei.h
[PATCH] get rid of __user_path_lookup_open
[PATCH] f_count may wrap around
[PATCH] dup3 fix
[PATCH] don't pass nameidata to __ncp_lookup_validate()
[PATCH] don't pass nameidata to gfs2_lookupi()
[PATCH] new (local) helper: user_path_parent()
[PATCH] sanitize __user_walk_fd() et.al.
[PATCH] preparation to __user_walk_fd cleanup
[PATCH] kill nameidata passing to permission(), rename to inode_permission()
[PATCH] take noexec checks to very few callers that care
Re: [PATCH 3/6] vfs: open_exec cleanup
[patch 4/4] vfs: immutable inode checking cleanup
[patch 3/4] fat: dont call notify_change
[patch 2/4] vfs: utimes cleanup
[patch 1/4] vfs: utimes: move owner check into inode_change_ok()
[PATCH] vfs: use kstrdup() and check failing allocation
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netns: fix ip_rt_frag_needed rt_is_expired
netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences
netfilter: fix double-free and use-after free
netfilter: arptables in netns for real
netfilter: ip{,6}tables_security: fix future section mismatch
selinux: use nf_register_hooks()
netfilter: ebtables: use nf_register_hooks()
Revert "pkt_sched: sch_sfq: dump a real number of flows"
qeth: use dev->ml_priv instead of dev->priv
syncookies: Make sure ECN is disabled
net: drop unused BUG_TRAP()
net: convert BUG_TRAP to generic WARN_ON
drivers/net: convert BUG_TRAP to generic WARN_ON
Fix firmware/memmap printk format warnings:
drivers/firmware/memmap.c:156: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
drivers/firmware/memmap.c:161: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Bernhard Walle <bwalle@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/util.c: In function 'arch_pick_mmap_layout':
mm/util.c:144: error: dereferencing pointer to incomplete type
mm/util.c:145: error: 'arch_get_unmapped_area' undeclared (first use in this function)
mm/util.c:145: error: (Each undeclared identifier is reported only once
mm/util.c:145: error: for each function it appears in.)
mm/util.c:146: error: 'arch_unmap_area' undeclared (first use in this function)
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the following warning when CONFIG_HUGETLB_PAGE is not set:
ipc/shm.c: In function `shm_get_stat':
ipc/shm.c:565: warning: unused variable `h'
[akpm@linux-foundation.org: use tabs, not spaces]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov points out that we should check that the task is still alive
before we iterate over the threads. This patch includes a fixup for this.
Also simplify do_io_accounting() implementation.
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes an off-by-one error in a comment.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* dup2() should return -EBADF on exceeded sysctl_nr_open
* dup() should *not* return -EINVAL even if you have rlimit set to 0;
it should get -EMFILE instead.
Check for orig_start exceeding rlimit taken to sys_fcntl().
Failing expand_files() in dup{2,3}() now gets -EMFILE remapped to -EBADF.
Consequently, remaining checks for rlimit are taken to expand_files().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Since Ulrich is OK with getting rid of dup3(fd, fd, flags) completely,
to hell the damn thing goes. Corner case for dup2() is handled in
sys_dup2() (complete with -EBADF if dup2(fd, fd) is called with fd
that is not open), the rest is done in dup3().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all.
Several places in the tree needed direct include.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
make it atomic_long_t; while we are at it, get rid of useless checks in affs,
hfs and hpfs - ->open() always has it equal to 1, ->release() - to 0.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro notice one cornercase that the new dup3() code. The dup2()
function, as a special case, handles dup-ing to the same file
descriptor. In this case the current dup3() code does nothing at
all. I.e., it ingnores the flags parameter. This shouldn't happen,
the close-on-exec flag should be set if requested.
In case the O_CLOEXEC bit in the flags parameter is not set the
dup3() function should behave in this respect identical to dup2().
This means dup3(fd, fd, 0) should not actively reset the c-o-e
flag.
The patch below implements this minor change.
[AV: credits to Artur Grabowski for bringing that up as potential subtle point
in dup2() behaviour]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* do not pass nameidata; struct path is all the callers want.
* switch to new helpers:
user_path_at(dfd, pathname, flags, &path)
user_path(pathname, &path)
user_lpath(pathname, &path)
user_path_dir(pathname, &path) (fail if not a directory)
The last 3 are trivial macro wrappers for the first one.
* remove nameidata in callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Almost all users __user_walk_fd() and friends care only about struct path.
Get rid of the few that do not.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
On Mon, May 19, 2008 at 12:01:49AM +0200, Marcin Slusarz wrote:
> open_exec is needlessly indented, calls ERR_PTR with 0 argument
> (which is not valid errno) and jumps into middle of function
> just to return value.
> So clean it up a bit.
Still looks rather messy. See below for a better version.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Move the immutable and append-only checks from chmod, chown and utimes
into notify_change(). Checks for immutable and append-only files are
always performed by the VFS and not by the filesystem (see
permission() and may_...() in namei.c), so these belong in
notify_change(), and not in inode_change_ok().
This should be completely equivalent.
CC: Ulrich Drepper <drepper@redhat.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The FAT_IOCTL_SET_ATTRIBUTES ioctl() calls notify_change() to change
the file mode before changing the inode attributes. Replace with
explicit calls to security_inode_setattr(), fat_setattr() and
fsnotify_change().
This is equivalent to the original. The reason it is needed, is that
later in the series we move the immutable check into notify_change().
That would break the FAT_IOCTL_SET_ATTRIBUTES ioctl, as it needs to
perform the mode change regardless of the immutability of the file.
[Fix error if fat is built as a module. Thanks to OGAWA Hirofumi for
noticing.]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Untange the mess that is do_utimes(). Add kerneldoc comment to
do_utimes().
CC: Ulrich Drepper <drepper@redhat.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add a new ia_valid flag: ATTR_TIMES_SET, to handle the
UTIMES_OMIT/UTIMES_NOW and UTIMES_NOW/UTIMES_OMIT cases. In these
cases neither ATTR_MTIME_SET nor ATTR_ATIME_SET is in the flags, yet
the POSIX draft specifies that permission checking is performed the
same way as if one or both of the times was explicitly set to a
timestamp.
See the path "vfs: utimensat(): fix error checking for
{UTIME_NOW,UTIME_OMIT} case" by Michael Kerrisk for the patch
introducing this behavior.
This is a cleanup, as well as allowing filesystems (NFS/fuse/...) to
perform their own permission checking instead of the default.
CC: Ulrich Drepper <drepper@redhat.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
- use kstrdup() instead of kmalloc() + memcpy()
- return NULL if allocating ->mnt_devname failed
- mnt_devname should be const
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS
* MAY_ACCESS on fuse should affect only the last step of pathname resolution
* fchdir() and chroot() should pass MAY_ACCESS, for the same reason why
chdir() needs that.
* now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be
removed; it has no business being in nameidata.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... so we ought to pass MAY_CHDIR to vfs_permission() instead of having
it triggered on every step of preceding pathname resolution. LOOKUP_CHDIR
is killed by that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Remove the unused mode parameter from vfs_symlink and callers.
Thanks to Tetsuo Handa for noticing.
CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Why not reuse "inode" which is assigned as
struct inode *inode = old_dentry->d_inode;
in the beginning of vfs_link() ?
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
All calls to remove_suid() are made with a file pointer, because
(similarly to file_update_time) it is called when the file is written.
Clean up callers by passing in a file instead of a dentry.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
vfs_permission(MAY_WRITE) already checked for the inode being
immutable, so no need to repeat it.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Christoph Hellwig <hch@infradead.org>
* kill nameidata * argument; map the 3 bits in ->flags anybody cares
about to new MAY_... ones and pass with the mask.
* kill redundant gfs2_iop_permission()
* sanitize ecryptfs_permission()
* fix remaining places where ->permission() instances might barf on new
MAY_... found in mask.
The obvious next target in that direction is permission(9)
folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
hpfs_unlink() calls permission() prior to truncating the file. HPFS
doesn't define a .permission method, so replace with explicit call to
generic_permission().
This is equivalent, except that devcgroup_inode_permission() and
security_inode_permission() are not called.
The truncation is just an implementation detail of the unlink, so
these security checks are unnecessary.
I suspect that even calling generic_permission() is unnecessary, since
we shouldn't mind if the file isn't writable. But I leave that to the
maintainer to decide.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
* keep references to ctl_table_head and ctl_table in /proc/sys inodes
* grab the former during operations, use the latter for access to
entry if that succeeds
* have ->d_compare() check if table should be seen for one who does lookup;
that allows us to avoid flipping inodes - if we have the same name resolve
to different things, we'll just keep several dentries and ->d_compare()
will reject the wrong ones.
* have ->lookup() and ->readdir() scan the table of our inode first, then
walk all ctl_table_header and scan ->attached_by for those that are
attached to our directory.
* implement ->getattr().
* get rid of insane amounts of tree-walking
* get rid of the need to know dentry in ->permission() and of the contortions
induced by that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>