Commit Graph

18 Commits

Author SHA1 Message Date
Tamas Vincze
118f3e1afd edac: i5000_edac critical fix panic out of bounds
EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
Kernel panic - not syncing: EDAC MC0: Uncorrected Error  (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

This happens because FERR_NF_FBD bit 28 is not updated on i5000.  Due to
that, both bits 28 and 29 may be equal to one, returning channel = 3.  As
this value is invalid, EDAC core generates the panic.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14568

Signed-off-by: Tamas Vincze <tom@vincze.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16 12:15:38 -08:00
Keith Mannthey
c2494ace99 edac: i5100 fix initialization code
Allow csrows to properly initialize when the topology only has active
channels on 2 and 3.  This new check allows proper detection and
initialization in this topology.  Only checking the first mrt that
represented channels 0 and 1 is not sufficient.

I also fixed up the related debug information path.  I can submit as a 2nd
patch if needed.

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Acked-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:30 -07:00
Darrick J. Wong
f0f7e0dc73 i5000-edac: hold reference to mci kobject
It turns out that edac_mc_del_mc will kobject_put the last kref on the
mci object.

If the timing is just right, that means that the mci object is freed
before before i5000_remove_one has a chance to free the resources
associated with it, causing a null pointer exceptions when unloading the
driver.  Insert a kobject_{get,put} pair so that this doesn't happen.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-12 17:17:16 -08:00
Aristeu Rozanski
8360e81b5d edac i5000: fix thermal issues
Make the Thermal messages (temperature got past Tmid) be displayed only
once because:

1) it's the BIOS job to configure and handle the memory throttling
2) if the BIOS is broken or is aware about the condition, flooding the
   system logs won't help anything.
3) According to the specification update for Intel 5000 MCHs, all the
   revisions of this MCH have problems on the thermal sensors, making
   not automatic (a.k.a. intelligent thermal throttling) impossible.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:48 -07:00
Aristeu Rozanski
c066740739 edac i5000: fix error messages
Update the i5000_edac messages, making everything pass through the EDAC
(so the log controls will work) and being more specific about the errors.
Also, it makes the miscellaneous errors optional and disabled by default.

As I didn't found anywhere information about M23ERR-M26ERR
(FERR_NF_THERMAL) on FERR_NF_FBD, I'm removing them.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:48 -07:00
Hitoshi Mitake
c3c52bce69 edac: fix module initialization on several modules 2nd time
I implemented opstate_init() as a inline function in linux/edac.h.

added calling opstate_init() to:
	i82443bxgx_edac.c
	i82860_edac.c
	i82875p_edac.c
	i82975x_edac.c

I wrote a fixed patch of
edac-fix-module-initialization-on-several-modules.patch,
and tested building 2.6.25-rc7 with applying this. It was succeed.
I think the patch is now correct.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:26 -07:00
Joe Perches
6f042b50e0 drivers/edac/: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:12:34 +02:00
Darrick J. Wong
57510c2f93 i5000_edac: no need to __stringify() KBUILD_BASENAME
The i5000_edac driver's PCI registration structure has the name
""i5000_edac"" (with extra set of double-quotes) which is probably not
intentional.  Get rid of __stringify.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:41 -08:00
Doug Thompson
b8f6f97552 drivers/edac: fix edac_mc init apis
Refactoring of sysfs code necessitated the refactoring of the edac_mc_alloc()
and edac_mc_add_mc() apis, of moving the index value to the alloc() function.
This patch alters the in tree drivers to utilize this new api signature.

Having the index value performed later created a chicken-and-the-egg issue.
Moving it to the alloc() function allows for creating the necessary sysfs
entries with the proper index number

Cc: Alan Cox alan@lxorguk.ukuu.org.uk
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:57 -07:00
Douglas Thompson
b2ccaecad2 drivers/edac: i5000 code tidying
Various code style conformance patches on the i5000 driver

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Marisuz Kozlowski
977c76bd68 drivers/edac: i5000 define typo
Found a typo in one of the #defines in the driver

MTR_DIM_RANKS --> MTR_DIMM_RANK

Signed-off-by: Marisuz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
052dfb45cc drivers/edac: cleanup spaces-gotos after Lindent messup
This patch fixes some remnant spaces inserted by the use of Lindent.
Seems Lindent adds some spaces when it shoulded. These have been fixed.
In addition, goto targets have issues, these have been fixed
in this patch.

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Dave Jiang
456a2f9552 drivers/edac: drivers to use new PCI operation
Move x86 drivers to new pci controller setup

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:55 -07:00
Douglas Thompson
f4aff42653 drivers/edac: Lindent i5000
Ran e752x_edac.c file through Lindent for cleanup

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:54 -07:00
Dave Jiang
c4192705fe drivers/edac: add dev_name getter function
Move dev_name() macro to a more generic interface since it's not possible
to determine whether a device is pci, platform, or of_device easily.

Now each low level driver sets the name into the control structure, and
the EDAC core references the control structure for the information.

Better abstraction.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Douglas Thompson
20bcb7a81d drivers/edac: mod use edac_core.h
In the refactoring of edac_mc.c into several subsystem files,
the header file edac_mc.h became meaningless. A new header file
edac_core.h was created. All the files that previously included
"edac_mc.h" are changed to include "edac_core.h".

Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Dave Jiang
c0d1217202 drivers/edac: add new nmi rescan
Provides a way for NMI reported errors on x86 to notify the EDAC
subsystem pending ECC errors by writing to a software state variable.

Here's the reworked patch. I added an EDAC stub to the kernel so we can
have variables that are in the kernel even if EDAC is a module. I also
implemented the idea of using the chip driver to select error detection
mode via module parameter and eliminate the kernel compile option.
Please review/test. Thx!

Also, I only made changes to some of the chipset drivers since I am
unfamiliar with the other ones. We can add similar changes as we go.

Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Douglas Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Eric Wollesen
eb60705ac5 drivers/edac: new intel 5000 MC driver
Eric Wollesen ported the Bluesmoke Memory Controller driver (written by Doug
Thompson) for the Intel 5000X/V/P (Blackford/Greencreek) chipset to the in
kernel EDAC model.

This patch incorporates the module for the 5000X/V/P chipset family

[m.kozlowski@tuxland.pl: edac i5000 parenthesis balance fix]
Signed-off-by: Eric Wollesen <ericw@xmtp.net>
Signed-off-by: Doug Thompson <norsk5@xmission.com>
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00