linux

Commit Graph

Author	SHA1	Message	Date
Brian King	e555db930f	[SCSI] use sysfs configured timeout for EH Start Unit timeout Use the sysfs configurable timeout when issuing a START_UNIT command from the scsi error handler. This is needed for devices which take longer than thirty seconds to respond to the start unit. The problem was observed when sending a start unit to a disk array device in an ipr RAID adapter, which results in the adapter firmware sending potentially multiple commands to physical devices as a result of this command, which ended up timing out sometimes. This patch does not change the default value used for this command. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-05-06 09:33:12 -05:00
Linus Torvalds	4f7a307dc6	Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (87 commits) [SCSI] fusion: fix domain validation loops [SCSI] qla2xxx: fix regression on sparc64 [SCSI] modalias for scsi devices [SCSI] sg: cap reserved_size values at max_sectors [SCSI] BusLogic: stop using check_region [SCSI] tgt: fix rdma transfer bugs [SCSI] aacraid: fix aacraid not finding device [SCSI] aacraid: Correct SMC products in aacraid.txt [SCSI] scsi_error.c: Add EH Start Unit retry [SCSI] aacraid: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test. [SCSI] ipr: Driver version to 2.3.2 [SCSI] ipr: Faster sg list fetch [SCSI] ipr: Return better qc_issue errors [SCSI] ipr: Disrupt device error [SCSI] ipr: Improve async error logging level control [SCSI] ipr: PCI unblock config access fix [SCSI] ipr: Fix for oops following SATA request sense [SCSI] ipr: Log error for SAS dual path switch [SCSI] ipr: Enable logging of debug error data for all devices [SCSI] ipr: Add new PCI-E IDs to device table ...	2007-05-05 13:30:44 -07:00
Brian King	ed773e6648	[SCSI] scsi_error.c: Add EH Start Unit retry Currently, the scsi error handler will issue a START_UNIT command if the drive indicates it needs its motor started and the allow_restart flag is set in the scsi_device. If, after the scsi error handler invokes a host adapter reset due to error recovery, a device is in a unit attention state AND also needs a START_UNIT, that device will be placed offline. The disk array devices on an ipr RAID adapter will do exactly this when in a dual initiator configuration. This patch adds a single retry to the EH initiated START_UNIT. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Patch modified and Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-04-17 17:55:36 -04:00
David S. Miller	8cc574a3c5	[SCSI]: Fix scsi_send_eh_cmnd scatterlist handling This fixes a regression caused by commit: `2dc611de5a` The sense buffer code in scsi_send_eh_cmnd was changed to use alloc_page() and a scatter list, but the sense data copy was not updated to match so what we actually get in the sense buffer is total garbage starting with the kernel address of the struct page we got. Basically the stack frame of scsi_send_eh_cmnd() is what ends up in the sense buffer. Depending upon how pointers look on a given platform, you can end up getting sr_ioctl.c errors when you mount a cdrom. If the CDROM gives a check condition for GPCMD_GET_CONFIGURATION issued by drivers/cdrom/cdrom.c:cdrom_mmc_profile(), sr_ioctl will spit out this error message in sr_do_ioctl() with the way pointers are on sparc64: default: printk(KERN_ERR "%s: CDROM (ioctl) error, command: ", cd->cdi.name); __scsi_print_command(cgc->cmd); scsi_print_sense_hdr("sr", &sshdr); err = -EIO; This is the error Tom Callaway reported in: http://marc.info/?l=linux-sparc&m=117407453208101&w=2 Anyways, fix this by using page_address(sgl.page) which is OK because we know this is low-mem due to GFP_ATOMIC. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Christoph Hellwig <hch@lst.de>	2007-04-02 14:26:22 -07:00
James Bottomley	6c5f8ce1fb	[SCSI] expose eh_timed_out to the host template It looks like megaraid_sas at least needs this to throttle its commands as they begin to time out. The code keeps the existing transport template use of eh_timed_out (and allows the transport to override the host if they both have this callback). Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-03-20 10:56:49 -05:00
Brian King	292148f8bb	[SCSI] scsi_error: Fix lost EH commands If an EH command times out today, the LLDD's abort handler will be called to abort the command. It is assumed that this completes successfully, which can result in the command getting completed later resulting in an oops. Improve the current implementation by escalating all the way to host reset if necessary in order to clean up the EH command. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-02-03 08:32:10 -06:00
Luben Tuikov	fd1b494d4a	[SCSI] Fix sense key MEDIUM ERROR processing and retry 1) If the device reports an uncorrectable MEDIUM ERROR, such as SK MEDIUM ERROR, ASC UNRECOVERED READ ERR, AMNF DATA FIELD or RECORD NOT FOUND, then: In scsi_check_sense() return SUCCESS so as to not retry -- the error is uncorrectable -- this speeds up total processing time. Signed-off-by: Luben Tuikov <ltuikov@yahoo.com> Extracted the MEDIUM ERROR piece and Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-01-31 12:18:52 -06:00
Darrick J. Wong	dca84e4694	[SCSI] scsi_error.c: Export some scsi_eh_* functions Export a couple of functions from scsi_error that are needed to handle failed SCSI commands from the SAS EH. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> make exports GPL and Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-01-27 10:06:34 -06:00
Christoph Hellwig	2dc611de5a	[SCSI] use one-element sg list in scsi_send_eh_cmnd scsi_send_eh_cmnd is the last user of non-sg commands currently. This patch switches it to a one-element SG list. Also updates the kerneldoc comment for scsi_send_eh_cmnd to reflect reality while we're at it. Test on my mptsas card, but this should get testing with as many drivers as possible. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-11-15 12:55:52 -06:00
Al Viro	fa1f5ea860	[PATCH] gfp annotations: scsi_error Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-10 15:37:23 -07:00
Stefan Richter	7fbb36451a	[PATCH] SCSI: lockdep annotation in scsi_send_eh_cmnd Fixup for lockdep enabled kernels: Annotate an on-stack completion. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-13 07:32:14 -07:00
Mike Christie	0db99e3359	[SCSI] fix scsi_send_eh_cmnd regression The callers of scsi_send_eh_cmnd are setting the cmnd buffer, and then scsi_send_eh_cmnd is copying that updated buffer to the old_cmnd variable. Then after the command runs, we end up copying that old_cmnd var which has the new cmnd to the scsi command buffer. When this command gets resent, all types of fun things happen like getting TUR or START_STOP commands with data and scatterlists. This patch made against scsi-rc-fixes, has the callers of scsi_send_eh_cmnd pass in the command so scsi_send_eh_cmnd can do the right thing. This should go into 2.6.18 since this fixes a regression added when we removed some of the scsi_cmnd fields and replaced them with local variables. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-08-26 10:03:14 -05:00
Christoph Hellwig	631c228cd0	[SCSI] hide EH backup data outside the scsi_cmnd Currently struct scsi_cmnd has various fields that are used to backup original data after the corresponding fields have been overridden for EH commands. This means drivers can easily get at it and misuse it. Due to the old_ naming this doesn't happen for most of them, but two that have different names have been used wrong a lot (see previous patch). Another downside is that they unnecessarily bloat the scsi_cmnd size. This patch moves them onstack in scsi_send_eh_cmnd to fix those two issues as well as allowing future EH fixes like moving the EH command submissions to use SG lists like everything else. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-07-09 11:56:44 -05:00
James Smart	d7a1bb0a04	[SCSI] Block I/O while SG reset operation in progress - the midlayer patch The scsi midlayer portion of the patch Signed-off-by: James Smart <James.Smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-06-27 10:48:11 -05:00
Jeff Garzik	71d530cd1b	Merge branch 'master' into upstream Conflicts: drivers/scsi/libata-core.c drivers/scsi/libata-scsi.c include/linux/pci_ids.h	2006-06-22 22:11:56 -04:00
Christoph Hellwig	8d7feac3c7	[SCSI] remove RQ_SCSI_* flags The RQ_SCSI_* flags are a vestige of a long past history. The EH code still sets them but we never make use of that information. The other user is pluto.c which never had a chance to work but needs to be kept compiling to keep Davem happy, so copy over the definition there. We could probably get rid of RQ_ACTIVE/RQ_INACTIVE as well with some work, there's only two more or less bogus looking uses in ubd and scsi. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-06-10 16:25:21 -05:00
Christoph Hellwig	beb4048750	[SCSI] remove scsi_request infrastructure With Achim's patch the last user (gdth) is switched away from scsi_request so we can kill it now. Also disables some code in i2o_scsi that was broken since the sg driver stopped using scsi_requests. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-06-10 16:24:40 -05:00
Tejun Heo	f8bbfc247e	[PATCH] SCSI: make scsi_implement_eh() generic API for SCSI transports libata implemented a feature to schedule EH without an associated EH by manipulating shost->host_eh_scheduled in ata_scsi_schedule_eh() directly. Move this function to scsi_error.c and rename it to scsi_schedule_eh(). It is now an exported API for SCSI transports and exported via new header file drivers/scsi/scsi_transport_api.h This patch also de-export scsi_eh_wakeup() which was exported specifically for ata_scsi_schedule_eh(). Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-05-20 00:39:08 -04:00
Tejun Heo	ee7863bc68	[PATCH] SCSI: implement shost->host_eh_scheduled libata needs to invoke EH without scmd. This patch adds shost->host_eh_scheduled to implement such behavior. Currently the only user of this feature is libata and no general interface is defined. This patch simply adds handling for host_eh_scheduled where needed and exports scsi_eh_wakeup() to modules. The rest is up to libata. This is the result of the following discussion. http://thread.gmane.org/gmane.linux.scsi/23853/focus=9760 In short, SCSI host is not supposed to know about exceptions unrelated to specific device or command. Such exceptions should be handled by transport layer proper. However, the distinction is not essential to ATA and libata is planning to depart from SCSI, so, for the time being, libata will be using SCSI EH to handle such exceptions. Signed-off-by: Tejun Heo <htejun@gmail.com>	2006-05-15 20:57:20 +09:00
Christoph Hellwig	9227c33de8	[PATCH] move ->eh_strategy_handler to the transport class Overriding the whole EH code is a per-transport, not per-host thing. Move ->eh_strategy_handler to the transport class, same as ->eh_timed_out. Downside is that scsi_host_alloc can't check for the total lack of EH anymore, but the transition period from old EH where we needed it is long gone already. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-10 14:15:47 -04:00
James Bottomley	d04cdb6421	Merge ../linux-2.6	2006-03-21 13:05:45 -06:00
James Bottomley	f33b5d783b	Merge ../linux-2.6	2006-03-14 14:18:01 -06:00
James Smart	c829c39416	[SCSI] FC transport : Avoid device offline cases by stalling aborts until device unblocked This moves the eh_timed_out functionality from the scsi_host_template to the transport_template. Given that this is now a transport function, the EH_RESET_TIMER case no longer caps the timer reschedulings. The transport guarantees that this is not an infinite condition. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-03-13 08:58:58 -06:00
Jeff Garzik	d2dbaad855	Merge branch 'master'	2006-03-01 14:45:47 -05:00
Brian King	8884efab15	[SCSI] scsi: scsi command retries off by one fix Fix up an off by one error in calculating retries for scsi commands. This bug was discovered when an SG_IO request was sent to scsi core with retries = 0, causing the overall timeout check to go off in scsi_softirq_done. Signed-off-by: Brian King <brking@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-02-27 21:38:39 -06:00
Jeff Garzik	18ee361004	Merge branch 'master'	2006-02-02 01:12:54 -05:00
Tejun Heo	041c5fc33c	[PATCH] SCSI: export scsi_eh_finish_cmd() and scsi_eh_flush_done_q() Export two SCSI EH command handling functions. To be used by libata EH. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>	2006-01-26 22:36:28 -05:00
brking@us.ibm.com	bb1d1073a1	[SCSI] Prevent scsi_execute_async from guessing cdb length When the scsi_execute_async interface was added it ended up reducing the flexibility of userspace to send arbitrary scsi commands through sg using SG_IO. The SG_IO interface allows userspace to specify the CDB length. This is now ignored in scsi_execute_async and it is guessed using the COMMAND_SIZE macro, which is not always correct, particularly for vendor specific commands. This patch adds a cmd_len parameter to the scsi_execute_async interface to allow the caller to specify the length of the CDB. Signed-off-by: Brian King <brking@us.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2006-01-26 15:13:50 -05:00
James Bottomley	2a1e1379ba	Merge by hand (conflicts in scsi_lib.c) This merge is pretty extensive. The conflict is over the new req->retries parameter, so I had to change the prototype to scsi_setup_blk_pc_cmnd() and the usage in sd, sr and st. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-15 17:35:24 -06:00
Mike Christie	6e68af666f	[SCSI] Convert SCSI mid-layer to scsi_execute_async Add scsi helpers to create really-large-requests and convert scsi-ml to scsi_execute_async(). Per Jens's previous comments, I placed this function in scsi_lib.c. I made it follow all the queue's limits - I think I did at least :), so I removed the warning on the function header. I think the scsi_execute_* functions should eventually take a request_queue and be placed some place where the dm-multipath hw_handler can use them if that failover code is going to stay in the kernel. That conversion patch will be sent in another mail though. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-14 19:03:35 -08:00
Michael Reed	85631672e6	[SCSI] fix OOPS due to clearing eh_action prior to aborting eh command The eh_action semaphore in scsi_eh_send_command is cleared after a command timeout. The command is subsequently aborted and the abort will try to call scsi_done() on it. Unfortunately, the scsi_eh_done() routine unconditionally completes the semaphore (which is now null). Fix this race by making the scsi_eh_done() routine check that the semaphore is non null before completing it (mirroring the ordinary command done/timeout logic). Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-08 09:13:29 -05:00
Christoph Hellwig	7dfdc9a52b	[SCSI] use a completion in scsi_send_eh_cmnd scsi_send_eh_cmnd currently uses a semaphore and an overload of eh_timer to either get a completion for a command or a timeout. Switch to using a completion and wait_for_completion_timeout to simplify the code and not have to deal with the races ourselves. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-11-06 12:49:36 -06:00
Christoph Hellwig	474838d5e5	[SCSI] remove Scsi_Host.eh_active now that the abuse in qla2xxx is gone this field can be removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-11-06 12:44:44 -06:00
Christoph Hellwig	ad42eb1b77	[SCSI] tidy up scsi_error_handler adjust comments, remove a useless cast and remove a write-only variable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-11-06 12:43:26 -06:00
Jeff Garzik	422c0d61d5	[SCSI] use scmd_id(), scmd_channel() throughout code Wrap a highly common idiom. Makes the code easier to read, helps pave the way for sdev->{id,channel} removal, and adds a token that can easily be grepped-for in the future. There are a couple sdev_id() and scmd_printk() updates thrown in as well. Rejections fixed up and Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-10-28 21:10:16 -05:00
Jeff Garzik	3bf743e7c8	[SCSI] use {sdev,scmd,starget,shost}_printk in generic code rejections fixed and Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-10-28 20:52:11 -05:00
James Bottomley	9ccfc756a7	[SCSI] move the mid-layer printk's over to shost/starget/sdev_printk This should eliminate the need (at least in the mid layer) to make numeric assumptions about any of the enumeration variables. As a side effect, it will also make all the messages consistent and line us up nicely for the error logging strategy (if it ever shows itself again). Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-10-28 14:23:02 -05:00
Steven Rostedt	461a0ffbec	[PATCH] scsi_error thread exits in TASK_INTERRUPTIBLE state. Found in the -rt patch set. The scsi_error thread likely will be in the TASK_INTERRUPTIBLE state upon exit. This patch fixes this bug. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-19 23:16:21 -07:00
James Bottomley	3ed7a4704b	[SCSI] Fix thread termination for the SCSI error handle From: Alan Stern <stern@rowland.harvard.edu> This patch (as561) fixes the error handler's thread-exit code. The kthread_stop call won't wake the thread from a down_interruptible, so the patch gets rid of the semaphore and simply does set_current_state(TASK_INTERRUPTIBLE); Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Modified to simplify the termination loop and correct the sleep condition. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-09-19 09:50:04 -05:00
James Bottomley	939647ee30	[SCSI] fix oops on usb storage device disconnect We fix the oops by enforcing the host state model. There have also been two extra states added: SHOST_CANCEL_RECOVERY and SHOST_DEL_RECOVERY so we can take the model through host removal while the recovery thread is active. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-09-19 09:24:52 -05:00
James Bottomley	17fa53da12	Merge by hand (conflicts in sd.c)	2005-09-06 17:52:54 -05:00
Christoph Hellwig	fe1b2d544d	[SCSI] unexport scsi_add_timer/scsi_delete_timer Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-09-06 17:26:37 -05:00
Christoph Hellwig	c5478def7a	[SCSI] switch EH thread startup to the kthread API Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-09-06 17:26:06 -05:00
Alan Stern	e47373ec1c	[SCSI] return success after retries in scsi_eh_tur The problem lies in the way the error handler uses TEST UNIT READY to tell whether error recovery has succeeded. The scsi_eh_tur function gives up after one round of retrying; after that it decides that more error recovery is needed. However TUR is liable to report sense data indicating a retry is needed when in fact error recovery has succeeded. A typical example might be SK=2, ASC=4, ASCQ=1 (Logical unit in process of becoming ready). The mere fact that we were able to get a sensible reply to the TUR should indicate that the device is working well enough to stop error recovery. I ran across a case back in January where this happened. A CD-ROM drive timed out the INQUIRY command, and a device reset fixed the blockage. But then the drive kept responding with 2/4/1 -- because it was spinning up I suppose -- until the error handler gave up and placed it offline. If the initial INQUIRY had received the 2/4/1 instead, everything would have worked okay. It doesn't seem reasonable for things to fail just because the error handler had started running. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-09-06 17:19:23 -05:00
James Bottomley	33aa687db9	[SCSI] convert SPI transport class to scsi_execute This one's slightly more difficult. The transport class uses REQ_FAILFAST, so another interface (scsi_execute) had to be invented to take the extra flag. Also, the sense functions are shifted around to allow spi_execute to place data directly into a struct scsi_sense_hdr. With this change, there's probably a lot of unnecessary sense buffer allocation going on which we can fix later. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-08-28 11:31:14 -05:00
Mike Anderson	d330187408	[SCSI] host state model update: replace old host bitmap state Migrate the current SCSI host state model to a model like SCSI device is using. Signed-off-by: Mike Anderson <andmike@us.ibm.com> Rejections fixed up and Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-07-30 11:10:24 -05:00
Christoph Hellwig	937abeaadf	[SCSI] use list_for_each_entry_safe in scsi_error.c Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-06-26 12:20:42 -05:00
Christoph Hellwig	3111b0d164	[SCSI] remove scsi_eh_eflags_ macros Just opencoded access to eh_eflags, it's much more readable anyway. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-06-26 12:17:24 -05:00
Christoph Hellwig	8d115f845a	[SCSI] remove scsi_cmnd->state We never look at it except for the old megaraid driver that abuses it for sending internal commands. That usage can be fixed easily because those internal commands are single-threaded by a mutex and we can easily use a completion there. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-06-26 12:16:24 -05:00
Christoph Hellwig	b4edcbcafd	[SCSI] remove scsi_cmnd->owner never checked anywhere Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-06-26 12:15:28 -05:00


65 Commits