Commit Graph

34 Commits

Author SHA1 Message Date
Darrick J. Wong a9344e68ac [SCSI] libsas: Add an LU reset mechanism to the error handler
After discussion with andmike and dougg, it seems that the purpose of
eh_device_reset_handler is to issue LU resets, and that
eh_bus_reset_handler would be a more appropriate place for a phy reset.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-02-03 08:15:55 -06:00
Darrick J. Wong 423f7cf467 [SCSI] libsas: Don't BUG when connecting two expanders via wide port
libsas: Don't BUG when connecting two expanders via wide port

When a device is connected to an expander, the discovery process goes through
sas_ex_discover_dev to figure out what's attached to the phy.  If it is the
case that the phy being discovered happens to be the second phy of a wide link
to an expander, that discover_dev function will incorrectly call
sas_ex_discover_expander, which creates another sas_port and tries to attach the
other sas_phys to the new port, thus triggering a BUG.  The correct thing to do is
to check the other ex_phys of the expander to see if there's a sas_port for this
sas_phy, and attach the sas_phy to the existing sas_port.

This is easily triggered if one enables the phys of a wide port between
expanders one by one.

This second version of the patch fixes a small regression in the case where
all the phys show up at once and we accidentally try to attach to a port
that hasn't been created yet.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-02-03 08:15:15 -06:00
FUJITA Tomonori 63bb1bf040 [SCSI] libsas: fix task attribute
Why TASK_ATTR_HOQ?

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-30 10:34:54 -06:00
Darrick J. Wong f27708fc75 [SCSI] libsas: Enable automatic spin-up of SAS disks
Set allow_restart=1 for all SAS disks so that they are spun up when needed.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-27 10:07:21 -06:00
Darrick J. Wong ad689233be [SCSI] libsas: Handle SCSI commands that complete with failure codes
This patch moves the code that handles SAS failures out of the main EH
function and into a separate function.  It also detects commands that have
no sas_task (i.e. they completed, but with error data) and sends them into
scsi_error for processing.  This allows us to handle SCSI errors (and
enables auto-spinup as a side effect) instead of dropping them on the
floor and falling into an infinite loop.  It also requires the
implementation of a device reset function, which the SAS failure code has
been modified to employ for REQ_DEVICE_RESET.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-27 10:06:51 -06:00
Darrick J. Wong 6f63caae21 [SCSI] libsas: Clean up discovery failure handler code
sas_rphy_delete does two things: it removes the sas_rphy from the transport
layer and frees the sas_rphy.  This can be broken down into two functions,
sas_rphy_remove and sas_rphy_free; sas_rphy_remove is of interest to
sas_discover_root_expander because it calls functions that require
sas_rphy_add as a prerequisite and can fail (namely sas_discover_expander).
In that case, sas_discover_root_expander needs to be able to undo the effects
of sas_rphy_add yet leave the job of freeing the sas_rphy to the caller of
sas_discover_root_expander.

This patch also removes some unnecessary code from sas_discover_end_dev
to eliminate an unnecessary cycle of sas_notify_lldd_gone/found for SAS
devices, thus eliminating a sas_rphy_remove call (and fixing a race condition
where a SCSI target scan can come in between the gone and found call).
It also moves the sas_rphy_free calls into sas_discover_domain and
sas_ex_discover_end_dev to complement the sas_rphy_allocation via
sas_get_port_device.

This patch does not change the semantics of sas_rphy_delete.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-27 10:05:15 -06:00
Darrick J. Wong 3b6e9fafc4 [SCSI] libsas: Fix incorrect sas_port deformation in sas_form_port
Currently, sas_form_port checks the given asd_sas_phy's sas_phy to see if
there's already a port attached.  If so, the SAS addresses of the port and
the phy are compared to determine if we need to detach from the port
because the addresses don't match or if we can stop; the SAS address stored
in the sas_port reflects whatever device _was_ attached to the port/phy, and
the SAS address stored in the sas_port reflects whatever device we just
discovered.  As written, the code detaches from the port if the addresses
_do_ match, and prints an error if they do _not_ match.  I believe this to
be incorrect, as it seems more logical to keep the port if the addresses
match (i.e. the phy was reset but the device didn't change), and detach it
they do not (i.e. the device changed).

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-27 10:04:58 -06:00
Darrick J. Wong 02cd743bb3 [SCSI] libsas: Start I_T recovery if ABORT TASK fails
The EH should fall into I_T recovery (and potentially stronger
remedies) if ABORT TASK fails.

Signed-off-by: Alexis Bruemmer <alexisb@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:23:40 -06:00
Darrick J. Wong 6b0efb8516 [SCSI] libsas: Add SAS_HA state flags to avoid queueing events while unloading
Track sas_ha_struct state so that we ignore events that come in while
we're shutting things down.

Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:21:53 -06:00
Darrick J. Wong 980fa2f9d6 [SCSI] libsas: phy port lock needs irq spinlocks
Convert the phy port locks to use irq spinlocks.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:20:46 -06:00
Darrick J. Wong 396819fba8 [SCSI] libsas: Delay issuing ABORT TASK TMF until the error handler
sas_task_abort() should simply abort the upper-level SCSI command and wait
until the error handler to send the actual ABORT TASK command.  By
deferring things to the EH we simplify the concurrency coordination and
eliminate some race conditions.  Note that sas_task_abort has a few hooks
to handle libsas internal commands properly too.

Also rename do_sas_task_abort to __sas_task_abort just in case we really
want to abort the task *right now* and we don't have a scsi_cmnd attached
to the command.  This is a hook for libata internal commands to abort.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:18:06 -06:00
Darrick J. Wong 3ebf6922b0 [SCSI] libsas: Enable the EH strategy handler to reset a phy after a command
When a SAS LLDD needs to request a device port reset, it needs to have all
commands aborted before it can reset the port.  Since commands are put on
the EH's list in the order that they were queued, the LLDD can set a "need
reset" flag in the last task to be aborted so that the EH can reset the
port after all commands are aborted.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:17:27 -06:00
Darrick J. Wong 37958fb040 [SCSI] libsas: Remove SAS_TASK_INITIATOR_ABORTED flag
This flag is no longer necessary because we push tasks to be aborted into
the EH as soon as we possibly can, and let the SCSI EH code take care of
the coordination for which this flag was used.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:17:04 -06:00
Darrick J. Wong cde3f74bac [SCSI] libsas: Destroy the task collector thread after releasing ports
If we use task collector mode, we can end up destroying the task collector
thread before we release the ports, which is bad if a port release causes
a disk I/O (such as cache flushing).

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:15:27 -06:00
Darrick J. Wong 6d4dcd4dae [SCSI] libsas: Reset timer on taskless scsi_cmnds in sas_scsi_timed_out
Every so often, a scsi_cmnd will time out, and the libsas timeout handler
will discover that the scsi_cmnd does not have a sas_task attached to it.
This can happen in two cases: (1) the scsi_cmnd actually made it through
libsas to the HBA and is now going through scsi_done, or (2) the
scsi_cmnd has been held up (host lock, slab alloc, etc) and libsas has
not yet attached a sas_task.  In both cases, it is safe to ask SCSI for
more time to process the command via EH_RESET_TIMER; we cannot blindly
return EH_HANDLED because if (2) happens, we could end up calling
scsi_done while another CPU is heading towards sas_queuecommand, which
causes slab corruption when sas_task_done updates the freed scsi_cmnd.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:13:38 -06:00
Darrick J. Wong acbf167d4a [SCSI] libsas: Add a sysfs knob to enable/disable a phy
This patch lets a user arbitrarily enable or disable a phy via sysfs.
Potential applications include shutting down a phy to replace one
lane of wide port, and (more importantly) providing a method for the
libata SATL to control the phy.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:13:00 -06:00
Darrick J. Wong b218a0d8e2 [SCSI] libsas: Don't give scsi_cmnds to the EH if they never made it to the SAS LLDD or have already returned
On a system with many SAS targets, it appears possible that a scsi_cmnd
can time out without ever making it to the SAS LLDD or at the same time
that a completion is occurring.  In both of these cases, telling the
LLDD to abort the sas_task makes no sense because the LLDD won't know
about the sas_task; what we really want to do is to increase the timer.
Note that this involves creating another sas_task bit to indicate
whether or not the task has been sent to the LLDD; I could have
implemented this by slightly redefining SAS_TASK_STATE_PENDING, but
this way seems cleaner.

This second version amends the aic94xx portion to set the
TASK_AT_INITIATOR flag for all sas_tasks that were passed to
lldd_execute_task.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 16:12:39 -06:00
Darrick J. Wong bf45120751 [SCSI] libsas: Clean up rphys/port dev list after a discovery error on an expander
sas_get_port_device assigns a rphy to a domain device in anticipation
of finding a disk.  When a discovery error occurs in
sas_discover_{sata,sas,expander}*, however, we need to clean up that
rphy and the port device list so that we don't GPF.  In addition, we
need to check the result of the second sas_notify_lldd_dev_found.
This patch seems ok on a x260, x366 and x206m.

This patch fixes up sas_expander.c separately because jejb has some
cleanup patches of his own that are a prerequisite.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 14:24:25 -06:00
Darrick J. Wong 8880839815 [SCSI] libsas: Clean up rphys/port dev list after a discovery error.
sas_get_port_device assigns a rphy to a domain device in anticipation
of finding a disk.  When a discovery error occurs in
sas_discover_{sata,sas,expander}*, however, we need to clean up that
rphy and the port device list so that we don't GPF.  In addition, we
need to check the result of the second sas_notify_lldd_dev_found.
This patch seems ok on a x260, x366 and x206m.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-01-13 14:23:36 -06:00
Christoph Lameter e18b890bb0 [PATCH] slab: remove kmem_cache_t
Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:39:25 -08:00
David Howells 06328b4f79 Actually update the fixed up compile failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-06 15:02:26 +00:00
David Howells 4796b71fbb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/pcmcia/ds.c

Fix up merge failures with Linus's head and fix new compile failures.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-06 15:01:18 +00:00
James Bottomley 024879ead9 [SCSI] libsas: better error handling in sas_expander.c
With async scanning, we're now tripping the BUG_ON in
sas_ex_discover_end_dev(), so make the error handling here correct.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-12-03 12:29:46 -06:00
James Bottomley 0bd2af4683 Merge ../scsi-rc-fixes-2.6 2006-11-22 12:06:44 -06:00
Darrick J. Wong dea2221479 [PATCH] aic94xx: handle REQ_DEVICE_RESET
This patch implements a REQ_DEVICE_RESET handler for the aic94xx
driver.  Like the earlier REQ_TASK_ABORT patch, this patch defers the
device reset to the Scsi_Host's workqueue, which has the added benefit
of ensuring that the device reset does not happen at the same time
that the abort tmfs are being processed.  After the phy reset, the
busted drive should go away and be re-detected later, which is indeed
what I've seen on both a x260 and a x206m.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-11-22 11:05:59 -06:00
David Howells c4028958b6 WorkStruct: make allyesconfig
Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-22 14:57:56 +00:00
Darrick J. Wong 79a5eb609b [SCSI] libsas: add sas_abort_task
This patch adds an external function, sas_abort_task, to enable LLDDs
to abort sas_tasks.  It also adds a work_struct so that the actual
work of aborting a task can be shifted from tasklet context (in the
LLDD) onto the scsi_host's workqueue.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-11-15 12:27:50 -06:00
Darrick J. Wong f456393e19 [SCSI] libsas: modify error handler to use scsi_eh_* functions
This patch adds an EH done queue to sas_ha, converts the error handling
strategy function and the sas_scsi_task_done functions in libsas to use
the scsi_eh_* commands for error'd commands, and adds checks for the
INITIATOR_ABORTED flag so that we do the right thing if a sas_task has
been aborted by the initiator.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-11-15 12:14:16 -06:00
malahal@us.ibm.com 42961ee8fc [SCSI] aic94xx SCSI timeout fix: SMP retry fix.
Updating DDB0 inside aic94xx driver itself caused SMP command timeout. I
hit this SMP timeout problem twice but I am not able to reproduce it since
then. Here is a fix that retries an SMP command.

Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-11-09 14:27:53 +09:00
Jeff Garzik 1bdfd554be [PATCH] SCSI: fix request flag-related build breakage
The ->flags in struct request was split into two variables, in a recent
changeset.  The merge of this change forgot to update SCSI's libsas,
probably because libsas was a very recent merge.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-30 19:33:43 -07:00
Al Viro 3cc27547d6 [PATCH] SCSI gfp_t annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-24 20:07:49 -07:00
James Bottomley a01e70e570 [SCSI] aci94xx: implement link rate setting
This patch implements the ability to set the minimum and maximum
linkrates for both libsas (for expanders) and aic94xx (for the host
phys).  It also tidies up the setting of the hardware min and max to
make sure they're updated when the expander emits a change broadcast.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-09-07 15:20:23 -05:00
James Bottomley 88edf74610 [SCSI] SAS: consolidate linkspeed definitions
At the moment we have two separate linkspeed enumerations covering
roughly the same values.  This patch consolidates on a single one enum
sas_linkspeed in scsi_transport_sas.h and uses it everywhere in the
aic94xx driver.  Eventually I'll get around to removing the duplicated
fields in asd_sas_phy and sas_phy ...

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-09-07 12:41:16 -05:00
James Bottomley 2908d778ab [SCSI] aic94xx: new driver
This is the end point of the separate aic94xx driver based on the
original driver and transport class from Luben Tuikov
<ltuikov@yahoo.com>

The log of the separate development is:

Alexis Bruemmer:
  o aic94xx: fix hotplug/unplug for expanderless systems
  o aic94xx: disable split completion timer/setting by default
  o aic94xx: wide port off expander support
  o aic94xx: remove various inline functions
  o aic94xx: use bitops
  o aic94xx: remove queue comment
  o aic94xx: remove sas_common.c
  o aic94xx: sas remove depot's
  o aic94xx: use available list_for_each_entry_safe_reverse()
  o aic94xx: sas header file merge

James Bottomley:
  o aic94xx: fix TF_TMF_NO_CTX processing
  o aic94xx: convert to request_firmware interface
  o aic94xx: fix hotplug/unplug
  o aic94xx: add link error counts to the expander phys
  o aic94xx: add transport class phy reset capability
  o aic94xx: remove local_attached flag
  o Remove README
  o Fixup Makefile variable for libsas rename
  o Rename sas->libsas
  o aic94xx: correct return code for sas_discover_event
  o aic94xx: use parent backlink port
  o aic94xx: remove channel abstraction
  o aic94xx: fix routing algorithms
  o aic94xx: add backlink port
  o aic94xx: fix cascaded expander properties
  o aic94xx: fix sleep under lock
  o aic94xx: fix panic on module removal in complex topology
  o aic94xx: make use of the new sas_port
  o rename sas_port to asd_sas_port
  o Fix for eh_strategy_handler move
  o aic94xx: move entirely over to correct transport class formulation
  o remove last vestages of sas_rphy_alloc()
  o update for eh_timed_out move
  o Preliminary expander support for aic94xx
  o sas: remove event thread
  o minor warning cleanups
  o remove last vestiges of id mapping arrays
  o Further updates
  o Convert aic94xx over entirely to the transport class end device and
  o update aic94xx/sas to use the new sas transport class end device
  o [PATCH] aic94xx: attaching to the sas transport class
  o Add missing completion removal from prior patch
  o [PATCH] aic94xx: attaching to the sas transport class
  o Build fixes from akpm

Jeff Garzik:
  o [scsi aic94xx] Remove ->owner from PCI info table

Luben Tuikov:
  o initial aic94xx driver

Mike Anderson:
  o aic94xx: fix panic on module insertion
  o aic94xx: stub out SATA_DEV case
  o aic94xx: compile warning cleanups
  o aic94xx: sas_alloc_task
  o aic94xx: ref count update
  o aic94xx nexus loss time value
  o [PATCH] aic94xx: driver assertion in non-x86 BIOS env

Randy Dunlap:
  o libsas: externs not needed

Robert Tarte:
  o aic94xx: sequence patch - fixes SATA support

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-08-29 09:52:29 -05:00