Xiaohui Xin and some other folks at Intel have been looking into what's
behind the performance hit of paravirt_ops when running native.
It appears that the hit is entirely due to the paravirtualized
spinlocks introduced by:
| commit 8efcbab674
| Date: Mon Jul 7 12:07:51 2008 -0700
|
| paravirt: introduce a "lock-byte" spinlock implementation
The extra call/return in the spinlock path is somehow
causing an increase in the cycles/instruction of somewhere around 2-7%
(seems to vary quite a lot from test to test). The working theory is
that the CPU's pipeline is getting upset about the
call->call->locked-op->return->return, and seems to be failing to
speculate (though I haven't seen anything definitive about the precise
reasons). This doesn't entirely make sense, because the performance
hit is also visible on unlock and other operations which don't involve
locked instructions. But spinlock operations clearly swamp all the
other pvops operations, even though I can't imagine that they're
nearly as common (there's only a .05% increase in instructions
executed).
If I disable just the pv-spinlock calls, my tests show that pvops is
identical to non-pvops performance on native (my measurements show that
it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
Summary of results, averaging 10 runs of the "mmperf" test, using a
no-pvops build as baseline:
nopv Pv-nospin Pv-spin
CPU cycles 100.00% 99.89% 102.18%
instructions 100.00% 100.10% 100.15%
CPI 100.00% 99.79% 102.03%
cache ref 100.00% 100.84% 100.28%
cache miss 100.00% 90.47% 88.56%
cache miss rate 100.00% 89.72% 88.31%
branches 100.00% 99.93% 100.04%
branch miss 100.00% 103.66% 107.72%
branch miss rt 100.00% 103.73% 107.67%
wallclock 100.00% 99.90% 102.20%
The clear effect here is that the 2% increase in CPI is
directly reflected in the final wallclock time.
(The other interesting effect is that the more ops are
out of line calls via pvops, the lower the cache access
and miss rates. Not too surprising, but it suggests that
the non-pvops kernel is over-inlined. On the flipside,
the branch misses go up correspondingly...)
So, what's the fix?
Paravirt patching turns all the pvops calls into direct calls, so
_spin_lock etc do end up having direct calls. For example, the compiler
generated code for paravirtualized _spin_lock is:
<_spin_lock+0>: mov %gs:0xb4c8,%rax
<_spin_lock+9>: incl 0xffffffffffffe044(%rax)
<_spin_lock+15>: callq *0xffffffff805a5b30
<_spin_lock+22>: retq
The indirect call will get patched to:
<_spin_lock+0>: mov %gs:0xb4c8,%rax
<_spin_lock+9>: incl 0xffffffffffffe044(%rax)
<_spin_lock+15>: callq <__ticket_spin_lock>
<_spin_lock+20>: nop; nop /* or whatever 2-byte nop */
<_spin_lock+22>: retq
One possibility is to inline _spin_lock, etc, when building an
optimised kernel (ie, when there's no spinlock/preempt
instrumentation/debugging enabled). That will remove the outer
call/return pair, returning the instruction stream to a single
call/return, which will presumably execute the same as the non-pvops
case. The downsides arel 1) it will replicate the
preempt_disable/enable code at eack lock/unlock callsite; this code is
fairly small, but not nothing; and 2) the spinlock definitions are
already a very heavily tangled mass of #ifdefs and other preprocessor
magic, and making any changes will be non-trivial.
The other obvious answer is to disable pv-spinlocks. Making them a
separate config option is fairly easy, and it would be trivial to
enable them only when Xen is enabled (as the only non-default user).
But it doesn't really address the common case of a distro build which
is going to have Xen support enabled, and leaves the open question of
whether the native performance cost of pv-spinlocks is worth the
performance improvement on a loaded Xen system (10% saving of overall
system CPU when guests block rather than spin). Still it is a
reasonable short-term workaround.
[ Impact: fix pvops performance regression when running native ]
Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com>
Analysed-by: "Li Xin" <xin.li@intel.com>
Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4A0B62F7.5030802@goop.org>
[ fixed the help text ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes the following regression that occurred during the
scsi_dma_map()/unmap()
changes when compiling with CONFIG_DMA_API_DEBUG=y :
WARNING: at lib/dma-debug.c:496 check_unmap+0x142/0x542()
Hardware name:
3w-xxxx 0000:02:02.0: DMA-API: device driver tries to free DMA memory
it has not allocated [device address=0x0000000000000000] [size=36
bytes]
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch fixes the following regression the occurred during the
scsi_dma_map()/unmap() changes:
3w-9xxx 0001:45:00.0: DMA-API: device driver tries to free DMA memory
it has not allocated [device address=0x0000000000000000] [size=36
bytes]
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We use the name provided by SES to name objects. An empty name is
legal in SES but causes problems in our generic device hierarchy. Fix
this by falling back to a number if the name is either NULL or empty.
Also fix a secondary bug spotted in that dev_set_name(dev, name) uses
a string format and so would go wrong if name contained a '%'.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andrew Vasquez wrote:
> fc-transport: Close state transition-window during rport deletion.
>
> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> but, prior to making the upcall to 'block' the scsi-target
> associated with an rport, queued commands can recycle and
> ultimately run out of retries causing failures to propagate to
> upper-level drivers. Close this transition-window by returning
> the non-'retries' modifying DID_IMM_RETRY status for submitted
> I/Os.
The same can happen for iscsi when transitioning from logged in
to failed and blocking the sdevs.
This patch converts iscsi and fc's transitions back to use DID_IMM_RETRY
instead of DID_TRANSPORT_DISRUPTED which has a limited number of retries
that we do not want to use for handling this race.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
[Addition of iscsi and fc port online devloss case conversion by Mike Christie]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Fix race in ext4_inode_info.i_cached_extent
ext4: Clear the unwritten buffer_head flag after the extent is initialized
ext4: Use a fake block number for delayed new buffer_head
ext4: Fix sub-block zeroing for writes into preallocated extents
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
kgdb: gdb documentation fix
kgdb,i386: use address that SP register points to in the exception frame
sysrq, intel_fb: fix sysrq g collision
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Revert "mm: add /proc controls for pdflush threads"
viocd: needs to depend on BLOCK
block: fix the bio_vec array index out-of-bounds test
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix PCI ROM access
powerpc/pseries: Really fix the oprofile CPU type on pseries
serial/nwpserial: Fix wrong register read address and add interrupt acknowledge.
powerpc/cell: Make ptcal more reliable
powerpc: Allow mem=x cmdline to work with 4G+
powerpc/mpic: Fix incorrect allocation of interrupt rev-map
powerpc: Fix oprofile sampling of marked events on POWER7
powerpc/iseries: Fix pci breakage due to bad dma_data initialization
powerpc: Fix mktree build error on Mac OS X host
powerpc/virtex: Fix duplicate level irq events.
powerpc/virtex: Add uImage to the default images list
powerpc/boot: add simpleImage.* to clean-files list
powerpc/8xx: Update defconfigs
powerpc/embedded6xx: Update defconfigs
powerpc/86xx: Update defconfigs
powerpc/85xx: Update defconfigs
powerpc/83xx: Update defconfigs
powerpc/fsl_soc: Remove mpc83xx_wdt_init, again
devpts_get_sb() calls memset(0) to clear mount options and calls
parse_mount_options() if user specified any mount options.
The memset(0) is bogus since the 'mode' and 'ptmxmode' options are
non-zero by default. parse_mount_options() restores options to default
anyway and can properly deal with NULL mount options.
So in devpts_get_sb() remove memset(0) and call parse_mount_options() even
for NULL mount options.
Bug reported by Eric Paris: http://lkml.org/lkml/2009/5/7/448.
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Reported-by: Eric Paris <eparis@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Reviewed-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If two CPU's simultaneously call ext4_ext_get_blocks() at the same
time, there is nothing protecting the i_cached_extent structure from
being used and updated at the same time. This could potentially cause
the wrong location on disk to be read or written to, including
potentially causing the corruption of the block group descriptors
and/or inode table.
This bug has been in the ext4 code since almost the very beginning of
ext4's development. Fortunately once the data is stored in the page
cache cache, ext4_get_blocks() doesn't need to be called, so trying to
replicate this problem to the point where we could identify its root
cause was *extremely* difficult. Many thanks to Kevin Shanahan for
working over several months to be able to reproduce this easily so we
could finally nail down the cause of the corruption.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
gdb command "set remote debug 1" is not valid, change to correct command.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
The treatment of the SP register is different on x86_64 and i386.
This is a regression fix that lived outside the mainline kernel from
2.6.27 to now. The regression was a result of the original merge
consolidation of the i386 and x86_64 archs to x86.
The incorrectly reported SP on i386 prevented stack tracebacks from
working correctly in gdb.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Commit 79e539453b introduced a
regression where you cannot use sysrq 'g' to enter kgdb. The solution
is to move the intel fb sysrq over to V for video instead of G for
graphics. The SMP VOYAGER code to register for the sysrq-v is not
anywhere to be found in the mainline kernel, so the comments in the
code were cleaned up as well.
This patch also cleans up the sysrq definitions for kgdb to make it
generic for the kernel debugger, such that the sysrq 'g' can be used
in the future to enter a gdbstub or another kernel debugger.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This reverts commit fafd688e4c.
Work is progressing to switch away from pdflush as the process backing
for flushing out dirty data. So it seems pointless to add more knobs
to control pdflush threads. The original author of the patch did not
have any specific use cases for adding the knobs, so we can easily
revert this before 2.6.30 to avoid having to maintain this API
forever.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This is a build fix, resyncing the DaVinci EVM ASoC board code
with the version in the DaVinci tree. That resync includes
support for the DM355 EVM, although that board isn't yet in
mainline.
(NOTE: also includes a bugfix to the platform_add_resources
call, recently sent by Chaithrika U S <chaithrika@ti.com> but
not yet merged into the DaVinci tree.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
This resyncs the DaVinci I2S code with the version in the DaVinci
tree. The behavioral change uses updated clock interfaces which
recently merged to mainline. Two other changes include adding a
comment on the ASP/McBSP/McASP confusion, and dropping pdev->id in
order to support more boards than just the DM644x EVM.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
This is a buildfix for the DaVinci PCM code, resyncing it with
the version in the DaVinci tree. The notable change is using
current EDMA interfaces, which recently merged to mainline.
(The older interfaces never made it into mainline.)
NOTE: open issue, the DMA should be to/from SRAM; see chip
errata for more info. The artifacts are extremely easy to
hear on DM355 hardware (not yet supported in mainline), but
don't seem as audible on DM6446 hardwaare (which does have
mainline support).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
A couple of issues crept in since about 2.6.27 related to accessing PCI
device ROMs on various powerpc machines.
First, historically, we don't allocate the ROM resource in the resource
tree. I'm not entirely certain of why, I susepct they often contained
garbage on x86 but it's hard to tell. This causes the current generic
code to always call pci_assign_resource() when trying to access the said
ROM from sysfs, which will try to re-assign some new address regardless
of what the ROM BAR was already set to at boot time. This can be a
problem on hypervisor platforms like pSeries where we aren't supposed
to move PCI devices around (and in fact probably can't).
Second, our code that generates the PCI tree from the OF device-tree
(instead of doing config space probing) which we mostly use on pseries
at the moment, didn't set the (new) flag IORESOURCE_SIZEALIGN on any
resource. That means that any attempt at re-assigning such a resource
with pci_assign_resource() would fail due to resource_alignment()
returning 0.
This fixes this by doing these two things:
- The code that calculates resource flags based on the OF device-node
is improved to set IORESOURCE_SIZEALIGN on any valid BAR, and while at
it also set IORESOURCE_READONLY for ROMs since we were lacking that too
- We now allocate ROM resources as part of the resource tree. However
to limit the chances of nasty conflicts due to busted firmwares, we
only do it on the second pass of our two-passes allocation scheme,
so that all valid and enabled BARs get precedence.
This brings pSeries back the ability to access PCI ROMs via sysfs (and
thus initialize various video cards from X etc...).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
My previous pach for fixing the oprofile CPU type got somewhat mismerged
(by my fault) when it collided with another related patch. This should
finally (fingers crossed) fix the whole thing.
We make sure we keep the -old- oprofile type and CPU type whenever
one of them was specified in the first pass through the function.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The receive interrupt routine checks the wrong register if the
receive fifo is empty. Further an explicit interrupt acknowledge
write is introduced. In some circumstances another interrupt was
issued.
Signed-off-by: Benjamin Krill <ben@codiert.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There have been a series of checkstops on QS21 related to
ptcal being set up incorrectly. On systems that only
have memory on a single node, ptcal fails when it gets
a pointer to memory on the remote node.
Moreover, agressive prefetching in memcpy and other
functions may accidentally touch the first cache line
of the page that we reserve for ptcal, which causes
an ECC checkstop.
We now allocate pages only from the specified node, moves the
ptcal area into the middle of the allocated page to avoid
potential prefetch problems and prints the address of the
ptcal area to facilitate diagnostics.
Signed-off-by: Gerhard Stenzel <gerhard.stenzel@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We're currently choking on mem=4g (and above) due to memory_limit
being specified as an unsigned long. Make memory_limit
phys_addr_t to fix this.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Before when we were setting up the irq host map for mpic we passed in
just isu_size for the size of the linear map. However, for a number of
mpic implementations we have no isu (thus pass in 0) and will end up
with a no linear map (size = 0). This causes us to always call
irq_find_mapping() from mpic_get_irq().
By moving the allocation of the host map to after we've determined the
number of sources we can actually benefit from having a linear map for
the non-isu users that covers all the interrupt sources.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Description
-----------
Change ppc64 oprofile kernel driver to use the SLOT bits (MMCRA[37:39]only on
older processors where those bits are defined.
Background
----------
The performance monitor unit of the 64-bit POWER processor family has the
ability to collect accurate instruction-level samples when profiling on marked
events (i.e., "PM_MRK_<event-name>"). In processors prior to POWER6, the MMCRA
register contained "slot information" that the oprofile kernel driver used to
adjust the value latched in the SIAR at the time of a PMU interrupt. But as of
POWER6, these slot bits in MMCRA are no longer necessary for oprofile to use,
since the SIAR itself holds the accurate sampled instruction address. With
POWER6, these MMCRA slot bits were zero'ed out by hardware so oprofile's use of
these slot bits was, in effect, a NOP. But with POWER7, these bits are no
longer zero'ed out; however, they serve some other purpose rather than slot
information. Thus, using these bits on POWER7 to adjust the SIAR value results
in samples being attributed to the wrong instructions. The attached patch
changes the oprofile kernel driver to ignore these slot bits on all newer
processors starting with POWER6.
Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 4fc665b88a "powerpc: Merge 32 and
64-bit dma code" made changes to the PCI initialisation code that added
an assignment to archdata.dma_data but only for 32 bit code. Commit
7eef440a54 "powerpc/pci: Cosmetic cleanups
of pci-common.c" removed the conditional compilation. Unfortunately,
the iSeries code setup the archdata.dma_data before that assignment was
done - effectively overwriting the dma_data with NULL.
Fix this up by moving the iSeries setup of dma_data into a
pci_dma_dev_setup callback.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The mktree utility defines some variables as "uint", although this is not a
standard C type, and so cross-compiling on Mac OS X fails. Change this to
"unsigned int".
Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (38 commits)
MIPS: Sibyte: Fix locking in set_irq_affinity
MIPS: Use force_sig when handling address errors.
MIPS: Cavium: Add struct clocksource * argument to octeon_cvmcount_read()
MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0.
MIPS: Fix highmem.
MIPS: Fix sign-extension bug in 32-bit kernel on 32-bit hardware.
MIPS: MSP71xx: Remove the RAMROOT functions
MIPS: Use -mno-check-zero-division
MIPS: Set compiler options only after the compiler prefix has ben set.
MIPS: IP27: Get rid of #ident. Gcc 4.4.0 doesn't like it.
MIPS: uaccess: Switch lock annotations to might_fault().
MIPS: MSP71xx: Resolve use of non-existent GPIO routines in msp71xx reset
MIPS: MSP71xx: Resolve multiple definition of plat_timer_setup
MIPS: Make uaccess.h slightly more sparse friendly.
MIPS: Make access_ok() sideeffect proof.
MIPS: IP27: Fix clash with NMI_OFFSET from hardirq.h
MIPS: Alchemy: Timer build fix
MIPS: Kconfig: Delete duplicate definition of RWSEM_GENERIC_SPINLOCK.
MIPS: Cavium: Add support for 8k and 32k page sizes.
MIPS: TXx9: Fix possible overflow in clock calculations
...
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: Spelling fix in btrfs_lookup_first_block_group comments
Btrfs: make show_options result match actual option names
Btrfs: remove outdated comment in btrfs_ioctl_resize()
Btrfs: remove some WARN_ONs in the IO failure path
Btrfs: Don't loop forever on metadata IO failures
Btrfs: init inode ordered_data_close flag properly
This allows userlevel code to discover the pipe number corresponding
to a given CRTC ID. This is necessary for doing pipe-specific
operations such as waiting for vblank on a given CRTC. Failure to use
the right pipe mapping can result in GPU hangs, or at least failure
to actually sync to vblank.
Signed-off-by: Carl Worth <cworth@cworth.org>
[anholt: Style touchups from review]
Signed-off-by: Eric Anholt <eric@anholt.net>
We detect HDMI output connection status by writing to HOT Plug Interrupt
Detect Enable bit in PORT_HOTPLUG_EN. The behavior will generate a specified
interrupt, which is caught by audio driver, but during one detection driver
set all Detect Enable bits of HDMIB, HDMIC HDMID, and generate wrong
interrupt signals for current output, according to the signals audio driver
misunderstand device status. The patch intends to handle corresponding
output precisely.
It fixed freedesktop.org bug #21371
Signed-off-by: Ma Ling <ling.ma@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
It fixed bug #21659
Signed-off-by: Ma Ling <ling.ma@intel.com>
[anholt: hand-applied because git-am is too picky]
Signed-off-by: Eric Anholt <eric@anholt.net>
Although spec say CRT_HOTPLUG_ACTIVATION_PERIOD_64 is only useful for
mobile platform, it is also required to detect vga on G4x desktops correctly.
Tested on G45/G43/Q45 platforms with no regressions.
It fixed freedesktop.org bug #21120 and part of bug #21210
Signed-off-by: Ma Ling <ling.ma@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
There are a number of small form factor desktop systems with Intel mobile
graphics chips that lie and say they have an LVDS. With kernel mode-setting,
this becomes a problem, and makes native resolution boot go haywire -- for
example, my Dell Studio Hybrid, hooked to a 1920x1080 display claims to
have a 1024x768 LVDS, and the resulting graphical boot on the 1920x1080
display uses only the top left 1024x768, and auto-configured X will end
up only 1024x768 as well. With this change, graphical boot and X
both do 1920x1080 as expected.
Note that we're simply embracing and extending the early bail-out code
in place for the Mac Mini here. The xorg intel driver uses pci subsystem
device and vendor id for matching, while we're using dmi lookups here.
The MSI addition is courtesy of and tested by Bill Nottingham.
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Bill Nottingham <notting@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
We might sleep here anyway so I hope an extra uncached read is ok to
add.
In #20896 we found that vbetool clobbers the IER. In KMS mode this is
particularly bad since we don't set the interrupt regs late (in
EnterVT), so we'd fail to get *any* interrupts at all after X started
(since some distros have scripts that call vbetool at X startup
apparently).
So this patch checks IER at wait_request time, and re-enables
interrupts if it's been clobbered. In a proper config this check
should never be triggered.
This is really a distro issue, but having a sanity check is nice, as
long as it doesn't have a real performance hit.
Tested-by: Mateusz Kaduk <mateusz.kaduk@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[anholt: Moved the check inside of the sleeping case to avoid perf cost]
Signed-off-by: Eric Anholt <eric@anholt.net>
In IGD, DPCUNIT_CLOCK_GATE_DISABLE bit should be set, otherwise i2c
access will be wrong.
v2: Disable CLOCK_GATE_DISABLE bit after bit bashing as suggested by Eric.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
This should avoid a class of bugs where the hardware prefetches past the
end of the object, and walks into unallocated memory when the object is
bound to the last page of the aperture.
fd.o bug #21488
Signed-off-by: Eric Anholt <eric@anholt.net>
This patch initializes the max_target_blocked field of a scsi target
structure so that a queuecommand return value of
SCSI_MLQUEUE_TARGET_BUSY will actually result in having the
scsi_queue_insert blocking the device queue before requeuing the
command and running the queue. Otherwise, can and does cause livelock
on single CPU configurations if/when open-iSCSI software initiator's
command PDU window fills.
Signed-off-by: Ed Goggin <egoggin@vmware.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The BH_Unwritten flag indicates that the buffer is allocated on disk
but has not been written; that is, the disk was part of a persistent
preallocation area. That flag should only be set when a get_blocks()
function is looking up a inode's logical to physical block mapping.
When ext4_get_blocks_wrap() is called with create=1, the uninitialized
extent is converted into an initialized one, so the BH_Unwritten flag
is no longer appropriate. Hence, we need to make sure the
BH_Unwritten is not left set, since the combination of BH_Mapped and
BH_Unwritten is not allowed; among other things, it will result ext4's
get_block() to be called over and over again during the write_begin
phase of write(2).
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The notreelog and flushoncommit mount options were being printed slightly
differently.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
In Li Zefan's commit dae7b665cf,
a combination call of kmalloc() and copy_from_user() is replaced by
memdup_user(). So btrfs_ioctl_resize() doesn't use GFP_NOFS any more.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
These debugging WARN_ONs make too much console noise during regular
IO failures. An IO failure will still generate a number of messages
as we verify checksums etc, but these two are not needed.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When a btrfs metadata read fails, the first thing we try to do is find
a good copy on another mirror of the block. If this fails, read_tree_block()
ends up returning a buffer that isn't up to date.
The btrfs btree reading code was reworked to drop locks and repeat
the search when IO was done, but the changes didn't add a check for failed
reads. The end result was looping forever on buffers that were never
going to become up to date.
Signed-off-by: Chris Mason <chris.mason@oracle.com>