For disks with 4KB sectors, report the correct block size and alignment
when filling out the READ CAPACITY(16) response.
This patch is based upon code from Matthew Wilcox' 4KB ATA tree. I
fixed the bug I reported a while back caused by ATA and SCSI using
different approaches to describing the alignment.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
With recent unification of fields, it's now guaranteed that
rq->data_len always equals blk_rq_bytes(). Convert all non-IDE direct
users to accessors. IDE will be converted in a separate patch.
Boaz: spotted incorrect data_len/resid_len conversion in osd.
[ Impact: convert direct rq->data_len usages to blk_rq_bytes() ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The legacy old IDE ioctl API for this is a bit primitive so we try
and map stuff sensibly onto it.
- Set PIO over DMA devices to report 32bit
- Add ability to change the PIO32 settings if the controller permits it
- Add that functionality into the sff drivers
- Add that functionality into the VLB legacy driver
- Turn on the 32bit PIO on the ninja32 and add support there
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix libata kernel-doc warnings:
Warning(linux-next-20090120//drivers/ata/libata-core.c:4720): Excess function parameter 'dev' description in 'ata_qc_new'
Warning(linux-next-20090120//drivers/ata/libata-scsi.c:428): No description found for parameter 'ap'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Introduce new libata flags ATA_FLAG_NO_POWEROFF_SPINDOWN and
ATA_FLAG_NO_HIBERNATE_SPINDOWN that, if set, will prevent disks from
being spun off during system power off and hibernation, respectively
(to handle the hibernation case we need the new system state
SYSTEM_HIBERNATE_ENTER that can be checked against by libata, in
analogy with SYSTEM_POWER_OFF).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
for SAS drivers.
Caught by Ke Wei (and team?) at Marvell.
Also, move the ata_scsi_ioctl export to libata-scsi.c, as that seems to be the
general trend.
Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
scsi_execute() and scsi_execute_req() discard the residual length
information. Some callers need it. This adds residual argument
(optional) to scsi_execute and scsi_execute_req.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There currently are the following looping constructs.
* __ata_port_for_each_link() for all available links
* ata_port_for_each_link() for edge links
* ata_link_for_each_dev() for all devices
* ata_link_for_each_dev_reverse() for all devices in reverse order
Now there's a need for looping construct which is similar to
__ata_port_for_each_link() but iterates over PMP links before the host
link. Instead of adding another one with long name, do the following
cleanup.
* Implement and export ata_link_next() and ata_dev_next() which take
@mode parameter and can be used to build custom loop.
* Implement ata_for_each_link() and ata_for_each_dev() which take
looping mode explicitly.
The following iteration modes are implemented.
* ATA_LITER_EDGE : loop over edge links
* ATA_LITER_HOST_FIRST : loop over all links, host link first
* ATA_LITER_PMP_FIRST : loop over all links, PMP links first
* ATA_DITER_ENABLED : loop over enabled devices
* ATA_DITER_ENABLED_REVERSE : loop over enabled devices in reverse order
* ATA_DITER_ALL : loop over all devices
* ATA_DITER_ALL_REVERSE : loop over all devices in reverse order
This change removes exlicit device enabledness checks from many loops
and makes it clear which ones are iterated over in which direction.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch reverts the following three commits which convert libata to
use block layer tagging.
43a49cbdf3e013e13bf62fca5ccf97
Although using block layer tagging is the right direction, due to the
tight coupling among tag number, data structure allocation and
hardware command slot allocation, libata doesn't work correctly with
the current conversion.
The biggest problem is guaranteeing that tag 0 is always used for
non-NCQ commands. Due to the way blk-tag is implemented and how SCSI
starts and finishes requests, such guarantee can't be made. I'm not
sure whether this would actually break any low level driver but it
doesn't look like a good idea to break such assumption given the
frailty of ATA controllers.
So, for the time being, keep using the old dumb in-libata qc
allocation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axobe <jens.axboe@oracle.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Moulder has pointed out that there is a slight chance that a
negative value might be passed to jiffies_to_msecs() in
ata_scsi_park_show(). This is fixed by saving the value of jiffies in a
local variable, thus also reducing code since the volatile variable
jiffies is accessed only once.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Tejun Heo <tj.kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
For devices behind sata port multipliers, we have to make sure that
they share a tag map since all tags for that PMP must be unique.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The recent commit 2fca5ccf97 ("libata:
switch to using block layer tagging support") to enable support for
block layer tagging in libata was broken for non-NCQ devices
The block layer initializes the tag field to -1 to detect invalid uses
of a tag, and if the libata devices does NOT support NCQ, we just used
that field to index the internal command list. So we need to check for
-1 first and only use the tag field if it's valid.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Paul Mundt <lethal@linux-sh.org>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
libata currently has a pretty dumb ATA_MAX_QUEUE loop for finding
a free tag to use. Instead of fixing that up, convert libata to
using block layer tagging - gets rid of code in libata, and is also
much faster.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-2.6.28' of git://git.kernel.dk/linux-2.6-block: (132 commits)
doc/cdrom: Trvial documentation error, file not present
block_dev: fix kernel-doc in new functions
block: add some comments around the bio read-write flags
block: mark bio_split_pool static
block: Find bio sector offset given idx and offset
block: gendisk integrity wrapper
block: Switch blk_integrity_compare from bdev to gendisk
block: Fix double put in blk_integrity_unregister
block: Introduce integrity data ownership flag
block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1
bio.h: Remove unused conditional code
block: remove end_{queued|dequeued}_request()
block: change elevator to use __blk_end_request()
gdrom: change to use __blk_end_request()
memstick: change to use __blk_end_request()
virtio_blk: change to use __blk_end_request()
blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structure
block: add lld busy state exporting interface
block: Fix blk_start_queueing() to not kick a stopped queue
include blktrace_api.h in headers_install
...
SSD devices should give an RPM setting of 1 in word 217 of the ID
page. If we see such a device, tell the block layer about it.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
On user request (through sysfs), the IDLE IMMEDIATE command with UNLOAD
FEATURE as specified in ATA-7 is issued to the device and processing of
the request queue is stopped thereafter until the specified timeout
expires or user space asks to resume normal operation. This is supposed
to prevent the heads of a hard drive from accidentally crashing onto the
platter when a heavy shock is anticipated (like a falling laptop
expected to hit the floor). In fact, the whole port stops processing
commands until the timeout has expired in order to avoid any resets due
to failed commands on another device.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Global and per-LLD ATAPI disable checks were done in the command issue
path probably because it was left out during EH conversion. On
affected machines, this can cause lots of warning messages. Move them
to where they belong - the probing path.
Reported by Chunbo Luo.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Chunbo Luo <chunbo.luo@windriver.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
AHCI: Remove an unnecessary flush from ahci_qc_issue
AHCI: speed up resume
[libata] Add support for VPD page b1
ata: endianness annotations in pata drivers
libata-eh: update atapi_eh_request_sense() to take @dev instead of @qc
[libata] sata_svw: update code comments relating to data corruption
libata/ahci: enclosure management support
libata: improve EH internal command timeout handling
libata: use ULONG_MAX to terminate reset timeout table
libata: improve EH retry delay handling
libata: consistently use msecs for time durations
SCSI VPD page b1 reports the nominal rotation speed and physical size
of the device. Devices that conform to ATA-8 can return this information
in words 217 and 168 of the identify data.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add Enclosure Management support to libata and ahci.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This adds blk_queue_update_dma_pad to prevent LLDs from overwriting
the dma pad mask wrongly (we added blk_queue_update_dma_alignment due
to the same reason).
This also converts libata to use blk_queue_update_dma_pad instead of
blk_queue_dma_pad.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
There's no reason to check whether to use DMA or not for no data
commands. Don't do it. While at it, make local variable using_pio in
atapi_xlat() set iff ATAPI_PROT_PIO is going to be used and rename
ata_check_atapi_dma() to atapi_check_dma() for consistency.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix libata-scsi kernel-doc notation:
Warning(linux-2.6.25-git15//drivers/ata/libata-scsi.c:1659): No description found for parameter 'cmd'
Warning(linux-2.6.25-git15//drivers/ata/libata-scsi.c:1971): No description found for parameter 'buf'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
I was hoping ATA_HORKAGE_NODMA | ATA_HORKAGE_SKIP_PM could keep it
happy but no even this doesn't work under certain configurations and
it's not like we can do anything useful with the cofig device anyway.
Replace ATA_HORKAGE_SKIP_PM with ATA_HORKAGE_DISABLE and use it for
the config device. This makes the device completely ignored by
libata.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Buffer length handling in simulated commands is error-prone and full
of bugs. There are a number of places where necessary length checks
are missing and if the output buffer is passed in as sglist, nothing
works.
This patch adds a static buffer ata_scsi_rbuf which is sufficiently
large to handle the larges output from simulated commands (4k
currently), let all simulte functions write to the buffer and removes
all length checks as we know that there always is enough buffer space.
Copying in (for ATAPI inquiry fix up) and out are handled by
sg_copy_to/from_buffer() behind ata_scsi_rbuf_get/put() interface
which handles sglist properly.
This patch is inspired from buffer length check fix patch from Petr
Vandrovec.
Updated to use sg_copy_to/from_buffer() as suggested by FUJITA
Tomonori.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Petr Vandrovec <petr@vmware.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* make ata_scsiop_*() static
* make ata_scsi_set_sense() static and move it above its users
* make ata_scsi_rbuf_fill() static
* kill unused ata_scsi_badcmd()
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
It's big, but there doesn't seem to be a way to split it up smaller...
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (137 commits)
[SCSI] iscsi: bidi support for iscsi_tcp
[SCSI] iscsi: bidi support at the generic libiscsi level
[SCSI] iscsi: extended cdb support
[SCSI] zfcp: Fix error handling for blocked unit for send FCP command
[SCSI] zfcp: Remove zfcp_erp_wait from slave destory handler to fix deadlock
[SCSI] zfcp: fix 31 bit compile warnings
[SCSI] bsg: no need to set BSG_F_BLOCK bit in bsg_complete_all_commands
[SCSI] bsg: remove minor in struct bsg_device
[SCSI] bsg: use better helper list functions
[SCSI] bsg: replace kobject_get with blk_get_queue
[SCSI] bsg: takes a ref to struct device in fops->open
[SCSI] qla1280: remove version check
[SCSI] libsas: fix endianness bug in sas_ata
[SCSI] zfcp: fix compiler warning caused by poking inside new semaphore (linux-next)
[SCSI] aacraid: Do not describe check_reset parameter with its value
[SCSI] aacraid: Fix down_interruptible() to check the return value
[SCSI] sun3_scsi_vme: add MODULE_LICENSE
[SCSI] st: rename flush_write_buffer()
[SCSI] tgt: use KMEM_CACHE macro
[SCSI] initio: fix big endian problems for auto request sense
...
The kernel now panics reliably on boot if you have a SATAPI device
connected.
The problem was introduced by the libata merge trying to pull out all
the SFF code into a separate module. Unfortunately, if you're a satapi
device you usually need to call atapi_request_sense, which has a bare
invocation of a SFF callback which is NULL on non-SFF HBAs. Fix this by
making the call conditional.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement helpers to test whether PMP is supported, attached and
determine pmp number to use when issuing SRST to a link. While at it,
move ata_is_host_link() so that it's together with the two new PMP
helpers.
This change simplifies LLDs and helps making PMP support optional.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Now that SFF support is completely separated out from the core layer,
it can be made optional. Add CONFIG_ATA_SFF and let SFF drivers
depend on it. If CONFIG_ATA_SFF isn't set, all codes in libata-sff.c
and data structures for SFF support are disabled. This saves good
number of bytes for small systems.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Add sff_ prefix to SFF specific port ops.
This rename is in preparation of separating SFF support out of libata
core layer. This patch strictly renames ops and doesn't introduce any
behavior difference.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_ehi_schedule_probe() was created to hide details of link-resuming
reset magic. Now that all the softreset workarounds are gone,
scheduling probe is very simple - set probe_mask and request RESET.
Kill ata_ehi_schedule_probe() and open code it. This also increases
consistency as ata_ehi_schedule_probe() couldn't cover individual
device probings so they were open-coded even when the helper existed.
While at it, define ATA_ALL_DEVICES as mask of all possible devices on
a link and always use it when requesting probe on link level for
simplicity and consistency. Setting extra bits in the probe_mask
doesn't hurt anybody.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ATA_EHI_RESUME_LINK has two functions - promote reset to hardreset if
ATA_LFLAG_HRST_TO_RESUME is set and preventing EH from shortcutting
reset action when probing is requested. The former is gone now and
the latter can easily be achieved by making EH to perform at least one
reset if reset is requested, which also makes more sense than
depending on RESUME_LINK flag.
As ATA_EHI_RESUME_LINK was the only EHI reset modifier, this also
kills reset modifier handling.
Signed-off-by: Tejun Heo <htejun@gmail.com>
When both soft and hard resets are available, libata preferred
softreset till now. The logic behind it was to be softer to devices;
however, this doesn't really help much. Rationales for the change:
* BIOS may freeze lock certain things during boot and softreset can't
unlock those. This by itself is okay but during operation PHY event
or other error conditions can trigger hardreset and the device may
end up with different configuration.
For example, after a hardreset, previously unlockable HPA can be
unlocked resulting in different device size and thus revalidation
failure. Similar condition can occur during or after resume.
* Certain ATAPI devices require hardreset to recover after certain
error conditions. On PATA, this is done by issuing the DEVICE RESET
command. On SATA, COMRESET has equivalent effect. The problem is
that DEVICE RESET needs its own execution protocol.
For SFF controllers with bare TF access, it can be easily
implemented but more advanced controllers (e.g. ahci and sata_sil24)
require specialized implementations. Simply using hardreset solves
the problem nicely.
* COMRESET initialization sequence is the norm in SATA land and many
SATA devices don't work properly if only SRST is used. For example,
some PMPs behave this way and libata works around by always issuing
hardreset if the host supports PMP.
Like the above example, libata has developed a number of mechanisms
aiming to promote softreset to hardreset if softreset is not going
to work. This approach is time consuming and error prone.
Also, note that, dependingon how you read the specs, it could be
argued that PMP fan-out ports require COMRESET to start operation.
In fact, all the PMPs on the market except one don't work properly
if COMRESET is not issued to fan-out ports after PMP reset.
* COMRESET is an integral part of SATA connection and any working
device should be able to handle COMRESET properly. After all, it's
the way to signal hardreset during reboot. This is the most used
and recommended (at least by the ahci spec) method of resetting
devices.
So, this patch makes libata prefer hardreset over softreset by making
the following changes.
* Rename ATA_EH_RESET_MASK to ATA_EH_RESET and use it whereever
ATA_EH_{SOFT|HARD}RESET used to be used. ATA_EH_{SOFT|HARD}RESET is
now only used to tell prereset whether soft or hard reset will be
issued.
* Strip out now unneeded promote-to-hardreset logics from
ata_eh_reset(), ata_std_prereset(), sata_pmp_std_prereset() and
other places.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement ata_qc_raw_nbytes() which determines the raw user-requested
size of a PC command.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Block layer alignment was used for two different purposes - memory
alignment and padding. This causes problems in lower layers because
drivers which only require memory alignment ends up with adjusted
rq->data_len. Separate out padding such that padding occurs iff
driver explicitly requests it.
Tomo: restorethe code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbd
according to padding alignment.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and
bsg. This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).
This patch also removes the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbd.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
Interrupts must be disabled if using kmap_atomic(KM_IRQ0), but that was
not the case in a few code paths coming directly from ATA driver
interrupt handlers (which use spin_lock rather than spin_lock_irqsave).
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Back in 2.6.17-rc2, a libata module parameter was added for atapi_dmadir.
That's nice, but most SATA devices which need it will tell us about it
in their IDENTIFY PACKET response, as bit-15 of word-62 of the
returned data (as per ATA7, ATA8 specifications).
So for those which specify it, we should automatically use the DMADIR bit.
Otherwise, disc writing will fail by default on many SATA-ATAPI drives.
This patch adds ATA_DFLAG_DMADIR and make ata_dev_configure() set it
if atapi_dmadir is set or identify data indicates DMADIR is necessary.
atapi_xlat() is converted to check ATA_DFLAG_DMADIR before setting
DMADIR.
Original patch is from Mark Lord.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix libata kernel-doc parameter:
Warning(linux-2.6.25-rc2-git3//drivers/ata/libata-scsi.c:845): No description found for parameter 'rq'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This just updates the libata slave configure routine to take advantage
of the block layer drain buffers. It also adjusts the size lengths in
the atapi code to add the drain buffer to the DMA length so the driver
knows it can rely on it.
I suspect I should also be checking for AHCI as well as ATA_DEV_ATAPI,
but I couldn't see how to do that easily.
tj: * atapi_drain_needed() added such that draining is applied to only
misc ATAPI commands.
* q->bounce_gfp used when allocating drain buffer.
* Now duplicate ATAPI PIO drain logic dropped.
* ata_dev_printk() used instead of sdev_printk().
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
that provided by the block layer
ATA requires that all DMA transfers begin and end on word boundaries.
Because of this, a large amount of machinery grew up in ide to adjust
scatterlists on this basis. However, as of 2.5, the block layer has a
dma_alignment variable which ensures both the beginning and length of a
DMA transfer are aligned on the dma_alignment boundary. Although the
block layer does adjust the beginning of the transfer to ensure this
happens, it doesn't actually adjust the length, it merely makes sure
that space is allocated for transfers beyond the declared length. The
upshot of this is that scatterlists may be padded to any size between
the actual length and the length adjusted to the dma_alignment safely
knowing that memory is allocated in this region.
Right at the moment, SCSI takes the default dma_aligment which is on a
512 byte boundary. Note that this aligment only applies to transfers
coming in from user space. However, since all kernel allocations are
automatically aligned on a minimum of 32 byte boundaries, it is safe to
adjust them in this manner as well.
tj: * Adjusting sg after padding is done in block layer. Make libata
set queue alignment correctly for ATAPI devices and drop broken
sg mangling from ata_sg_setup().
* Use request->raw_data_len for ATAPI transfer chunk size.
* Killed qc->raw_nbytes.
* Separated out killing qc->n_iter.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits)
[SCSI] usbstorage: use last_sector_bug flag universally
[SCSI] libsas: abstract STP task status into a function
[SCSI] ultrastor: clean up inline asm warnings
[SCSI] aic7xxx: fix firmware build
[SCSI] aacraid: fib context lock for management ioctls
[SCSI] ch: remove forward declarations
[SCSI] ch: fix device minor number management bug
[SCSI] ch: handle class_device_create failure properly
[SCSI] NCR5380: fix section mismatch
[SCSI] sg: fix /proc/scsi/sg/devices when no SCSI devices
[SCSI] IB/iSER: add logical unit reset support
[SCSI] don't use __GFP_DMA for sense buffers if not required
[SCSI] use dynamically allocated sense buffer
[SCSI] scsi.h: add macro for enclosure bit of inquiry data
[SCSI] sd: add fix for devices with last sector access problems
[SCSI] fix pcmcia compile problem
[SCSI] aacraid: add Voodoo Lite class of cards.
[SCSI] aacraid: add new driver features flags
[SCSI] qla2xxx: Update version number to 8.02.00-k7.
[SCSI] qla2xxx: Issue correct MBC_INITIALIZE_FIRMWARE command.
...
Hugh Dickens noticed that SMART commands issued from user space can
end up corupting memory. The problem occurs if the buffer used to
read data spans two pages. The reason is that the PIO sector routines
in libata are expecting physically contiguous pages when they do
sector operations, so the left overs on the second page go into the
next physically adjacent page rather than the next page in the sg
mapping.
Fix this by enforcing strict 512 byte alignment on all buffers from
userspace.
Acked-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata used private sg iterator to handle padding sg. Now that sg can
be chained, padding can be handled using standard sg ops. Convert to
chained sg.
* s/qc->__sg/qc->sg/
* s/qc->pad_sgent/qc->extra_sg[]/. Because chaining consumes one sg
entry. There need to be two extra sg entries. The renaming is also
for future addition of other extra sg entries.
* Padding setup is moved into ata_sg_setup_extra() which is organized
in a way that future addition of other extra sg entries is easy.
* qc->orig_n_elem is unused and removed.
* qc->n_elem now contains the number of sg entries that LLDs should
map. qc->mapped_n_elem is added to carry the original number of
mapped sgs for unmapping.
* The last sg of the original sg list is used to chain to extra sg
list. The original last sg is pointed to by qc->last_sg and the
content is stored in qc->saved_last_sg. It's restored during
ata_sg_clean().
* All sg walking code has been updated. Unnecessary assertions and
checks for conditions the core layer already guarantees are removed.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_PROT_ATAPI_* are ugly and naming schemes between ATA_PROT_* and
ATA_PROT_ATAPI_* are inconsistent causing confusion. Rename them to
ATAPI_PROT_* and make them consistent with ATA counterpart.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
atapi_request_sense() is now the only left user of ata_sg_init_one().
Convert it to use sg interface.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Rusty Russel <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Historically word 48 in the identify data was used to mean 32bit I/O
was supported for VLB IDE etc. ATA8 reassigns this word to the Trusted
Computing Group, where it is used for TCG features. This means that
an ATA8 TCG drive is going to trigger 32bit I/O on some systems which
will be funny.
Anyway we need to sort this out ready for ATA8 so:
- Reorder the ata.h header a bit so the ata_version function occurs early
in it
- Make dword_io check the ATA version
- Add an ATA8 version checking TCG presence test
While we are at it the current drafts have a flaw where it may not be
possible to disable TCG features at boot (and opt out of the trusted
model) as TCG intends because it relies on presence of a different
optional feature (DCS). Handle this in software by refusing the TCG
commands if libata.allow_tpm is not set. (We must make it possible
as some environments such as proprietary VDR devices will doubtless
want to use it to lock up content)
Finally as with CPRM print a warning so that the user knows they may
not be able to full access and use the device.
Signed-off-by: Alan Cox <alan@redhat.com>
After 9b8e8de7, manage_start_stop configuration depends on valid ATA
device. Move it into ata_scsi_dev_config(). This was detected by the
coverity checker.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch relaxes the default SCSI DMA alignment from 512 bytes to 4
bytes. I remember from previous discussions that usb and firewire have
sector size alignment requirements, so I upped their alignments in the
respective slave allocs.
The reason for doing this is so that we don't get such a huge amount of
copy overhead in bio_copy_user() for udev. (basically all inquiries it
issues can now be directly mapped).
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
None of the drives I have follows what the standard says about
transfer chunk size. Of the four SATA and six PATA ATAPI devices
tested, four ignore transfer chunk size completely and the ones which
honor it don't behave according to the spec when it's odd.
According to the spec, transfer chunk size can be odd if the amount of
data to transfer equals or is smaller than the chunk size and the
device can indicate the same odd number and transfer the whole thing
at one go with a pad byte appended. However, in reality, none of the
drives I have does that. They all indicate and transfer even number
of bytes one byte shorter than the chunk size first; then indicate and
transfer two bytes, which is clearly out of spec.
In addition to unnecessary second PIO data phase, this also creates a
weird problem when combined with SATA controllers which perform PIO
via DMA. Some of these controllers use actualy number of bytes
received to update DMA pointer so chunks which are sized 4n + 2 makes
DMA pointer off by two bytes. This causes data corruption and buffer
overruns.
This patch rounds nbytes up to the nearest even number such that ATAPI
devices don't split data transfer for the last odd byte. This
shouldn't confuse controllers which depend on transfer chunk size as
devices will report the rounded-up number, actually transfer that much
and padding buffer is there to receive them.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Sebastian Kemper reported that issuing CD/DVD commands under libata is
not fully compatible with ide-scsi. In particular, the
GPCMD_SET_STREAMING was being rejected at the host level in some
instances.
The reason is that libata-scsi insists upon the cmd_len field exactly
matching the SCSI opcode being issued, whereas ide-scsi tolerates
12-byte commands contained within a 16-byte (cmd_len) CDB.
There doesn't seem to be a good reason for us to not be compatible
there, so here is a patch to fix libata-scsi to permit SCSI opcodes so
long as they fit within whatever size CDB is provided.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement ATA_QCFLAG_QUIET which indicates that there's no need to
report if the command fails with AC_ERR_DEV and set it for passthrough
commands.
Combined with previous changes, this now makes device errors for all
direct commands reported directly to the issuer without going through
EH actions and reporting.
Note that EH is still invoked after non-IO device errors to determine
the nature of the error and resume command execution (some controller
requires special care after error to continue). It just performs
default maintenance after error, examines what's going on, realizes
that it's none of its business and reports the command failure without
logging any error messages.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_QCFLAG_IO is used to mark commands which are used to perform
regluar IO transfers via block layer. These commands are assumed to
be valid and taken more seriously during error handling. Cache flush
is used by regular IO path and necessary for data integrity. Mark it
with ATA_QCFLAG_IO.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Device Initiated Power Management, which is defined
in SATA 2.5 can be enabled for disks which support it.
This patch enables DIPM when the user sets the link
power management policy to "min_power".
Additionally, libata drivers can define a function
(enable_pm) that will perform hardware specific actions to
enable whatever power management policy the user set up
for Host Initiated Power management (HIPM).
This power management policy will be activated after all
disks have been enumerated and intialized. Drivers should
also define disable_pm, which will turn off link power
management, but not change link power management policy.
Documentation/scsi/link_power_management_policy.txt has additional
information.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Some commands need post-processing after successful completion. This
was done in ata_scsi_qc_complete() till now but this has the following
problems.
* Post-command processing gets executed when qc is completed from EH.
Some qc's are retried from EH with zero err_mask and thus triggers
unnecessary/incorrect post-command processing.
* Command post processing doesn't belong to SAT layer.
* Link-wide revalidation was scheduled where device revalidation
suffices.
This patch moves post-command processing to success completion path of
ata_qc_complete() which is travelled iff the command is going to be
completed without passing through EH and updates post-command
processing such that device-specific action is used. While at it,
restructure code a bit for readability.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tackle the relatively sane complaints of checkpatch --file.
The vast majority is indentation and whitespace changes, the rest are
* #include fixes
* printk KERN_xxx prefix addition
* BSS/initializer cleanups
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
LIBATA_MAX_PRD is the maximum number of DMA scatter/gather elements
permitted by the HBA's DMA engine. It's properly set to
q->max_hw_segments via the sg_tablesize parameter.
libata shouldn't call blk_queue_max_phys_segments. Now LIBATA_MAX_PRD
is equal to SCSI_MAX_PHYS_SEGMENTS by default (both is 128), so
everything is fine. But if they are changed, some code (like the scsi
mid layer, sg chaining, etc) might not work properly.
(Addition from Jens) The basic issue is that the physical segment
setting is purely a driver issue. And since SCSI is managing the sglist,
libata has no business changing the setting. All libata should care
about is the hw segment setting.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fix libata docbook warnings.
Warning(linux-2.6.23-git8//drivers/ata/libata-scsi.c:3251): No description found for parameter 'dev'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After commands which can change device configuration, EH is scheduled
to revalidate and reconfigure the device. Host link was incorrectly
used unconditionally when scheduling EH action. This resulted in
bogus revalidation request and mismatched configuration between device
and driver. Fix it.
This bug was reported by Igor Durdanovic.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Igor Durdanovic <idurdanovic@comcast.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some controller variants snoop the ATAPI length value for Packet
transfers to do state machine and FIFO management. Thus we want to
set it properly, even for cases where it is otherwise meaningless.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
AN serves multiple purposes. For ATAPI, it's used for media change
notification. For PMP, for downstream PHY status change notification.
Implement sata_async_notification() which demultiplexes AN.
To avoid unnecessary port events, ATAPI AN is not enabled if PMP is
attached but SNTF is not available.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kriten Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some pseudo devices fail PM commands unnecessarily aborting system
suspend. Implement ATA_HORKAGE_SKIP_PM which makes libata skip PM
commands for these devices.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Controllers which support PMP have various restrictions on which
combinations of commands are allowed to what number of devices
concurrently. This patch implements ops->qc_defer() which determines
whether a qc can be issued at the moment or should be deferred.
If the function returns ATA_DEFER_LINK, the qc will be deferred until
a qc completes on the link. If ATA_DEFER_PORT, until a qc completes
on any link. The defer conditions are advisory and in general
ATA_DEFER_LINK can be considered as lower priority deferring than
ATA_DEFER_PORT.
ops->qc_defer() replaces fixed ata_scmd_need_defer(). For standard
NCQ/non-NCQ exclusion, ata_std_qc_defer() is implemented. ahci and
sata_sil24 are converted to use ata_std_qc_defer().
ops->qc_defer() is heavier than the original mechanism because full qc
is prepped before determining to defer it, but various information is
needed to determine defer conditinos and fully translating a qc is the
only way to supply such information in generic manner.
IMHO, this shouldn't cause any noticeable performance issues as
* for most cases deferring occurs rarely (except for NCQ-aware
cmd-switching PMP)
* translation itself isn't that expensive
* once deferred the command won't be repeated until another command
completes which usually is a very long time cpu-wise.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update AN support in preparation of PMP support.
* s/ata_id_has_AN/ata_id_has_atapi_AN/
* add AN enabled reporting during configuration
* add err_mask to AN configuration failure reporting
* update LOCKING comment for ata_scsi_media_change_notify()
* check whether ATA dev is attached to SCSI dev ata_scsi_media_change_notify()
* set ATA_FLAG_AN in ahci and sata_sil24
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Kriten Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Clear ARRE, we don't do auto-reallocation on reads, just on writes.
Also, hardcode the size of the array using RW_RECOVERY_MPAGE_LEN,
following the style of the surrounding code.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* SAT specifies that FORMAT UNIT should be translated into a series
of READ and WRITE commands that zero the ATA device. That is far too
cumbersome to bother with.
Since we don't actually format the device, the old behavior of
always returning success was inaccurate. Change FORMAT UNIT from
returning success immediately (old behavior) to always returning
an error (new behavior).
* Add some comments around SYNCHRONIZE CACHE
* Shuffle scsi command code around a bit, so that things are close
to alphabetic order.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
A few pedantic apps care about missing or lame "mandatory" SCSI
commands, so
REQUEST SENSE -- as we autosense, R.S. just returns zeroes
SEND DIAGNOSTIC -- our default (no-op) self-test succeeds, all
other requests for testing fail.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
simple search-and-replace of direct scsi_cmnd access to
use the data buffer accessors.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This is a minimal patch needed to remove use of !use_sg
but it is not a complete clean up of the !use_sg paths.
Libata-core still has the qc->flags & ATA_QCFLAG_SG
and !qc->n_elem code paths. Perhaps an ata maintainer
would have a go at it.
- TODO: further cleanup of qc->flags & ATA_QCFLAG_SG
and !qc->n_elem code paths in libata-core
- TODO: Use scsi_dma_{map,unmap} where applicable.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When we get an SDB FIS with the 'N' bit set, we should send
an event to user space to indicate that there has been a
media change. This will be done via the scsi device.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add support for issuing ATA_16 passthru commands to ATAPI devices
managed by libata. It requires the previous CDB length fix patch.
A boot/module parameter, "atapi_passthru16=0" can be used to globally
disable this feature, if ever desired.
tj: restructured __ata_scsi_queuecmd() according to Jeff's suggestion.
Signed-off-by: Mark Lord <liml@rtr.ca>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update hotplug to handle PMP links. When PMP is attached, the PMP
number corresponds to C of SCSI H:C:I:L. While at it, change argument
to ata_find_dev() to @devno from @id to avoid confusion with SCSI
device ID.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Multiple links and different number of devices per link should be
considered to iterate over links and devices. This patch implements
and uses link and device iterators - ata_port_for_each_link() and
ata_link_for_each_dev() - and ata_link_max_devices().
This change makes a lot of functions iterate over only possible
devices instead of from dev 0 to dev ATA_MAX_DEVICES. All such
changes have been examined and nothing should be broken.
While at it, add a separating comment before device helpers to
distinguish them better from link helpers and others.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Introduce ata_link. It abstracts PHY and sits between ata_port and
ata_device. This new level of abstraction is necessary to support
SATA Port Multiplier, which basically adds a bunch of links (PHYs) to
a ATA host port. Fields related to command execution, spd_limit and
EH are per-link and thus moved to ata_link.
This patch only defines the host link. Multiple link handling will be
added later. Also, a lot of ap->link derefences are added but many of
them will be removed as each part is converted to deal directly with
ata_link instead of ata_port.
This patch introduces no behavior change.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
SCSI scan may fail due to memory allocation failure even if EH is not
in progress. Due to use of GFP_ATOMIC in SCSI scan path, allocation
failure isn't too rare especially while probing multiple devices at
once which is the case when a bunch of devices are connected to PMP.
This patch moves SCSI scan failure detetion logic from
ata_scsi_hotplug() to ata_scsi_scan_host() and implement synchronous
scan behavior. The synchronous path sleeps briefly and repeats SCSI
scan if some devices aren't attached properly. It contains robust
retry loop to minimize the chance of device misdetection during boot
and falls back to async retry if everything fails.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add ata_dumb_qc_prep and supporting logic so that a driver can just
specify it needs to be helped in this area. 64K entries are split
as with drivers/ide.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
In atapi_xlat(), prepare qc better before calling
ata_check_atapi_dma() such that ata_check_atapi_dma() can use info
from qc. While at it, reformat weird looking if/else block in the
function.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
INIT_DEV_PARAMS and SET_MULTI_MODE change the device parameters cached
by libata. Re-read IDENTIFY DEVICE info and update the cached device
paramters when seeing these commands.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Always enforce correct DEV bit since we know which drive the command
is targeted. SAT demands to ignore the DEV bit, too.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update the ATA passthru protocol numbers according to the new spec.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
With STANDBYDOWN tracking added, libata.spindown_compat isn't
necessary anymore. If userspace shutdown(8) issues STANDBYNOW, libata
warns. If userspace shutdown(8) doesn't issue STANDBYNOW, libata does
the right thing. Userspace can tell whether kernel supports spindown
by testing whether sysfs node manage_start_stop exists as before.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Our assumption that most distros issue STANDBYNOW seems wrong. The
upstream sysvinit and thus many distros including gentoo and opensuse
don't take any action for libata disks on spindown. We can skip
compat handling for these distros so that they don't need to update
anything to take advantage of kernel-side shutdown.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Unlocking ap->lock and ssleeping don't work because SCSI commands can
be issued from completion path without context. Reimplement delayed
completion by allowing translation functions to override
qc->scsidone(), storing the original completion function to
scmd->scsi_done() and overriding qc->scsidone() with a function which
schedules delayed invocation of scmd->scsi_done().
This isn't pretty at all but all the ugly parts are thankfully
contained in the stop translation path where the compat feature is
implemented.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Now that libata uses sd->manage_start_stop, libata spins down disk on
shutdown. In an attempt to compensate libata's previous shortcoming,
some distros sync and spin down disks attached via libata in their
shutdown(8). Some disks spin back up just to spin down again on
STANDBYNOW1 if the command is issued when the disk is spun down, so
this double spinning down causes problem.
This patch implements module parameter libata.spindown_compat which,
when set to one (default value), prevents libata from spinning down
disks on shutdown thus avoiding double spinning down. Note that
libata spins down disks for suspend to mem and disk, so with
libata.spindown_compat set to one, disks should be properly spun down
in all cases without modifying shutdown(8).
shutdown(8) should be fixed eventually. Some drive do spin up on
SYNCHRONZE_CACHE even when their cache is clean. Those disks
currently spin up briefly when sd tries to shutdown the device and
then the machine powers off immediately, which can't be good for the
head. We can't skip SYNCHRONIZE_CACHE during shudown as it can be
dangerous data integrity-wise.
So, this spindown_compat parameter is already scheduled for removal by
the end of the next year and here's what shutdown(8) should do.
* Check whether /sys/modules/libata/parameters/spindown_compat
exists. If it does, write 0 to it.
* For each libata harddisk {
* Check whether /sys/class/scsi_disk/h:c:i:l/manage_start_stop
exists. Iff it doesn't, synchronize cache and spin the disk
down as before.
}
The above procedure will make shutdown(8) work properly with kernels
before this change, ones with this workaround and later ones without
it.
To accelerate shutdown(8) updates, if the compat mode is in use, this
patch prints BIG FAT warning for five seconds during shutdown (the
optimal interval to annoy the user just the right amount discovered by
hours of tireless usability testing).
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reimplement suspend/resume support using sdev->manage_start_stop.
* Device suspend/resume is now SCSI layer's responsibility and the
code is simplified a lot.
* DPM is dropped. This also simplifies code a lot. Suspend/resume
status is port-wide now.
* ata_scsi_device_suspend/resume() and ata_dev_ready() removed.
* Resume now has to wait for disk to spin up before proceeding. I
couldn't find easy way out as libata is in EH waiting for the
disk to be ready and sd is waiting for EH to complete to issue
START_STOP.
* sdev->manage_start_stop is set to 1 in ata_scsi_slave_config().
This fixes spindown on shutdown and suspend-to-disk.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reorganize ata_host_alloc() and its subroutines into the following
three functions.
* ata_host_alloc() : allocates host and its ports. shost is not
registered automatically.
* ata_scsi_add_hosts() : allocates and adds shosts associated with an
ATA host. Used by ata_host_register().
* ata_host_register() : takes a fully initialized ata_host structure
and registers it to libata layer and probes it.
Only ata_host_alloc() and ata_host_register() are exported.
ata_device_add() is rewritten using the above functions. This patch
does not introduce any observable behavior change. Things worth
mentioning.
* print_id is assigned at registration time and LLDs are allowed to
overallocate ports and reduce host->n_ports during initialization.
ata_host_register() will throw away unused ports automatically.
* All SCSI host initialization stuff now resides in
ata_scsi_add_hosts() in libata-scsi.c, where it should be.
* ipr is now the only user of ata_host_init(). Either kill it by
converting ipr to use ata_host_alloc() and friends or rename and
move it to libata-scsi.c
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The READ/WRITE LONG commands are theoretically obsolete,
but the majority of drives in existance still implement them.
The WRITE_LONG and WRITE_LONG_ONCE commands are of particular
interest for fault injection testing -- eg. creating "media errors"
at specific locations on a disk.
The fussy bit is that these commands require a non-standard
sector size, usually 520 bytes instead of 512.
This patch adds support to libata for READ/WRITE LONG commands
issued via SG_IO/ATA_16.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Resending, with s/printk/DPRINTK/ as pointed out by Alan.
Fix libata to perform CDB len validation per device
rather than per host. This way, validation still works
when we have a mix of 12-byte and 16-byte devices on
a common host interface.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Preserve the LBA bit in the DevSel/Head register for HDIO_DRIVE_TASK.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conditionalize all PM related stuff in libata core layer using
CONFIG_PM.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_port has two different id fields - id and port_no. id is
system-wide 1-based unique id for the port while port_no is 0-based
host-wide port number. The former is primarily used to identify the
ATA port to the user in printk messages while the latter is used in
various places in libata core and LLDs to index the port inside the
host.
The two fields feel quite similar and sometimes ap->id is used in
place of ap->port_no, which is very difficult to spot. This patch
renames ap->id to ap->print_id to reduce the possibility of such bugs.
Some printk messages are adjusted such that id string (ata%u[.%u])
isn't printed twice and/or to use ata_*_printk() instead of hardcoded
id format.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix ata_scsi_change_queue_depth() such that...
* NCQ on/off is exactly determined using the same logic as the issue path.
* queue depth is adjusted to 1 if NCQ is not enabled.
* -EINVAL is returned if requested action is ignored due to limitations.
This fixes the bug which allows queue depth to be increased on
blacklisted NCQ hosts/devices.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix ata_scmd_need_defer() such that...
* whether NCQ is used or not is exactly determined using the same
criteria as the issue path.
* defer-check is performed in all cases.
This fixes race condition where turning off NCQ on the fly causes
non-NCQ commands sneak into NCQ phase.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_probe_ent_alloc() had a temporary hack such that devm_kzalloc()
was used for allocation if devres had been previously initialized on
the device; otherwise, plain kzalloc() was used. This was to make the
code useable from both the old and devres-aware libata drivers during
transition. This hack made ata_sas_port_alloc() unable to determine
how the probe_ent is allocated, causing double free in some cases.
Remove the now-unneeded hack and make ata_sas_port_alloc() use
devm_kfree().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
I was trying to use HDIO_DRIVE_TASK for something today,
and discovered that the libata implementation does not copy
over the upper four LBA bits from args[6].
This is serious, as any tools using this ioctl would have their
commands applied to the wrong sectors on the drive, possibly resulting
in disk corruption.
Ideally, newer apps should use SG_IO/ATA_16 directly,
avoiding this bug. But with libata poised to displace drivers/ide,
better compatibility here is a must.
This patch fixes libata to use the upper four LBA bits passed
in from the ioctl.
The original drivers/ide implementation copies over all bits
except for the master/slave select bit. With this patch,
libata will copy only the four high-order LBA bits,
just in case there are assumptions elsewhere in libata (?).
Signed-Off-By: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update libata core layer to use devres.
* ata_device_add() acquires all resources in managed mode.
* ata_host is allocated as devres associated with ata_host_release.
* Port attached status is handled as devres associated with
ata_host_attach_release().
* Initialization failure and host removal is handedl by releasing
devres group.
* Except for ata_scsi_release() removal, LLD interface remains the
same. Some functions use hacky is_managed test to support both
managed and unmanaged devices. These will go away once all LLDs are
updated to use devres.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
'hdparm -I' doesn't work with ATAPI devices and sg_sat is not widely
spread yet leaving no easy way to access ATAPI IDENTIFY data.
Implement HDIO_GET_IDENTITY such that at least 'hdparm -i' works.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata used two separate sets of variables to record request size and
current offset for ATA and ATAPI. This is confusing and fragile.
This patch replaces qc->nsect/cursect with qc->nbytes/curbytes and
kills them. Also, ata_pio_sector() is updated to use bytes for
qc->cursg_ofs instead of sectors. The field used to be used in bytes
for ATAPI and in sectors for ATA.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* Kill _OFS suffixes in ATA_ID_{SERNO|FW_REV|PROD}_OFS for consistency
with other ATA_ID_* constants.
* Kill ATA_SERNO_LEN
* Add and use ATA_ID_SERNO_LEN, ATA_ID_FW_REV_LEN and ATA_ID_PROD_LEN.
This change also makes ata_device_blacklisted() use proper length
for fwrev.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch makes the following needlessly global functions static:
- libata-core.c: ata_qc_complete_internal()
- libata-scsi.c: ata_scsi_qc_new()
- libata-scsi.c: ata_dump_status()
- libata-scsi.c: ata_to_sense_error()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch introduces users of the round_jiffies() function: ATA subsystem
This delayed work is of the "about once a second" variety and can be rounded
to coincide with other wakers.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata's SCSI translation for the SCSI START STOP UNIT command with the
START bit clear (i.e. stopping the drive) appears to be incorrect. It
sends an ATA STANDBY command with the time period set to 0, which the code
comment says means "now", but the ATA standard says this means disable the
standby timer, which effectively does nothing. Change this to issue a
STANDBY IMMEDIATE command which will actually spin the drive down. The SAT
(SCSI/ATA Translation) standard revision 9 concurs with this choice.
Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
User applications using the HDIO_DRIVE_TASK ioctl through libata expect
specific ATA registers to be returned to userspace. Verified that
ata_task_ioctl correctly returns register values to the smartctl
application.
Signed-off-by: David Milburn <dmilburn@redhat.com>
Acked-by: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fixup the inialization of qc->n_elem. It currently gets
initialized to 1 for commands that do not transfer any data.
Fix this by initializing n_elem to 0 and only setting to 1
in ata_scsi_qc_new when there is data to transfer. This fixes
some problems seen with SATA devices attached to ipr adapters.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata depended on SCSI command to have the correct length when
tranlating it into an ATA command. This generally worked for commands
issued by SCSI HLD but user could issue arbitrary broken command using
sg interface.
Also, when building ATAPI command, full command size was always
copied. Because some ATAPI devices needs bytes after CDB cleared, if
upper layer doesn't clear bytes after CDB, such devices will
malfunction. This necessiated recent clear-garbage-after-CDB fix in
sg interfaces. However, scsi_execute() isn't fixed yet and HL-DT-ST
DVD-RAM GSA-H30N malfunctions on initialization commands issued from
SCSI.
This patch makes xlat functions always consider SCSI cmd_len. Each
translation function checks for proper cmd_len and ATAPI translaation
clears bytes after CDB.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
xlat function will be updated to consider qc->scsicmd->cmd_len and
many xlat functions deference qc->scsicmd already. It doesn't make
sense to pass qc->scsicmd->cmnd as @cdb separately. Kill the
argument.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Variable names in xlat functions are quite confusing now. 'scsicmd'
is used for CDB while qc->scsicmd points to struct scsi_cmnd while
'cmd' is used for struct scsi_cmnd.
This patch cleans up variable names in xlat functions such that 'scmd'
is used for struct scsi_cmnd and 'cdb' for CDB. Also, 'scmd' local
variable is added if qc->scsicmd is used multiple times.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conflicts:
drivers/ata/libata-scsi.c
include/linux/libata.h
Futher merge of Linus's head and compilation fixups.
Signed-Off-By: David Howells <dhowells@redhat.com>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
Separate out rw ATA taskfile building from ata_scsi_rw_xlat() into
ata_build_rw_tf(). This will be used to improve media error handling.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_scsi_dev_rescan() doesn't synchronize against SCSI device detach
and the target sdev might go away in the middle. Fix it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
* READ CAPACITY (16) implementation fixed. Result was shifted by two
bytes. Carlos Pardo spotted this problem and submitted preliminary
patch. Capacity => 2TB is handled correctly now. (verifid w/ fake
capacity)
* Use dev->n_sectors instead of re-reading directly from ID data.
* Define and use ATA_SCSI_RBUF_SET() which considers rbuf length.
This should be done for all simulation functions. Userland can
issue any simulated command with arbitrary buffer length.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Carlos Pardo <Carlos.Pardo@siliconimage.com>
Update ata_gen_ata_sense() to use desc format sense data to report the
first failed block. The first failed block is read from result_tf
using ata_tf_read_block() which can handle all three address formats.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* s/ata_gen_ata_desc_sense/ata_gen_passthru_sense/
* s/ata_gen_fixed_sense/ata_gen_ata_sense/
* make both functions static
* neither function has locking requirement, change it to None.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sb[7] should contain the length of whole information sense data
descriptor while desc[1] should contain the number of following bytes
in the descriptor. ie. 14 for sb[7] but 12 for desc[1].
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Many drives support LBA48 even when its capacity is smaller than
1<<28, as LBA48 is required for many functionalities. FLUSH_EXT is
mandatory for drives w/ LBA48 support.
Interestingly, at least one of such drives (ST960812A) has problems
dealing with FLUSH_EXT. It eventually completes the command but takes
around 7 seconds to finish in many cases thus drastically slowing down
IO transactions. This seems to be a firmware bug which sneaked into
production probably because no other ATA driver including linux IDE
issues FLUSH_EXT to drives which report support for LBA48 & FLUSH_EXT
but is smaller than 1<<28 blocks.
This patch adds ATA_DFLAG_FLUSH_EXT which is set iff the drive
supports LBA48 & FLUSH_EXT and is larger than LBA28 limit. Both cache
flush paths are updated to issue FLUSH_EXT only when the flag is set.
Note that the changed behavior is more inline with the rest of libata.
libata prefers shorter commands whenever possible.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Danny Kukawka <dkukawka@novell.com>
Cc: Stefan Seyfried <seife@novell.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Move dev->max_sectors configuration from ata_scsi_dev_config() to
ata_dev_configure().
* more consistent.
* allows LLDs to peek at the default dev->max_sectors value.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make user scan wait for scan to complete. This way user can wait for
warm plug request to complete and is prevented from causing EH event
storm by repetitively issuing scan request while EH is in progress.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Fajun Chen <fajunchen@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fixes ata_sas_queuecmd to properly handle a failure from
__ata_scsi_queuecmd.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Do not schedule EH for revalidation on wcache on/off if old EH. Old
EH cannot handle it and will result in WARN_ON()'s and oops.
This closes bug #7412.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.
For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.
To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.
Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated.. This is a
problem if the auxiliary data is taken away (as done by the last patch).
However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().
In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).
Signed-Off-By: David Howells <dhowells@redhat.com>
A curious thing happens, however, when ata_qc_new_init fails to get
an ata_queued_cmd:
First, ata_qc_new_init handles the failure like this:
cmd->result = (DID_OK << 16) | (QUEUE_FULL << 1);
done(cmd);
Then, we return to ata_scsi_translate and do this:
err_mem:
cmd->result = (DID_ERROR << 16);
done(cmd);
It appears to me that first we set a status code indicating that we're
ok but the device queue is full and finish the command, but then
we blow away that status code and replace it with an error flag and
finish the command a second time! That does not seem to be desirable
behavior since we merely want the I/O to wait until a command slot
frees up, not send errors up the block layer.
In the err_mem case, we should simply exit out of ata_scsi_translate
instead.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make the HDIO_DRIVE_CMD ioctl in libata (ATA command pass through) return a
few ATA registers to userspace, following the same convention as the
drivers/ide implementation of the same ioctl. This is needed to support ATA
commands like CHECK POWER MODE, which return information in nsectors.
This fixes "hdparm -C" on SATA drives.
Forcing the sense data read via the cc flag causes spurious check conditions,
so we filter these out (following the ATA command pass-through specification
T10/04-262r7).
Signed-off-by: Eran Tromer <eran@tromer.org>
Acked-by: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The biggest change is that ata_host_set is renamed to ata_host.
* ata_host_set => ata_host
* ata_probe_ent->host_flags => ata_probe_ent->port_flags
* ata_probe_ent->host_set_flags => ata_probe_ent->_host_flags
* ata_host_stats => ata_port_stats
* ata_port->host => ata_port->scsi_host
* ata_port->host_set => ata_port->host
* ata_port_info->host_flags => ata_port_info->flags
* ata_(.*)host_set(.*)\(\) => ata_\1host\2()
The leading underscore in ata_probe_ent->_host_flags is to avoid
reusing ->host_flags for different purpose. Currently, the only user
of the field is libata-bmdma.c and probe_ent itself is scheduled to be
removed.
ata_port->host is reused for different purpose but this field is used
inside libata core proper and of different type.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>