Commit Graph

540 Commits

Author SHA1 Message Date
John Snow
1cbdd96813 ahci: Fix FIS decomposition
This patch introduces a few changes to how FIS packets are
deciphered in the AHCI virtual device. The summary of
changes can be grouped into two pieces:

[A] Changes to how we apply a preliminary sieve to FISes,
[B] Changes in how we internalize a decomposed FIS.

== Changes to how we apply a preliminary sieve to FISes ==

(1) Packets may now either update the Control register or
    the Command register, but not both. This is according
    to the SATA 3.2 specification which states:
    "...the device either initiates processing of the command
    indicated in the Command register or initiates processing
    of the control request indicated [...] depending on the
    state of the C bit in the FIS."

    See SATA 3.2 section 10.5.5.4, "Reception" in the 10.5.5
    "Register Host to Device FIS" section.

    This change accounts for the first two regions of change
    within the diff. All other changes belong to the following
    changes.

== Changes in how we internalize a decomposed FIS ==

(2) Instead of trying to extract the sector number out of the
    FIS from bytes 4-10 and setting it with ide_set_sector,
    we set the appropriate IDEState registers and trust that
    ide_get_sector can retrieve the correct sector later.

    By "constructing" the sector for use with ide_set_sector,
    we are duplicating the mechanisms of ide_get_sector.
    This change makes the FIS decomposition more obvious.

    SATA 3.2 as a specification does not make the legacy
    register mapping with respect to the D2H FIS obvious.
    However, SATA 3.2 section 10.5.5.1 "Register Host to
    Device FIS layout" describes all of the "cmd_fis"
    bytes:

    0 - FIS Type (0x27)
    1 - Port Multiplier Port and Command Update flag
    2 - ATA Command
    3 - Features_Low
    4 - LBA 7:0
    5 - LBA 15:8
    6 - LBA 23:16
    7 - Device, AKA "Drive Select."
    8 - LBA 31:24
    9 - LBA 39:32
    10 - LBA 47:40
    11 - Features_High
    12 - Count Low
    13 - Count High
    14 - ICC
    15 - Control
    16-19 - Auxiliary (for NCQ, defined per-command)

    Most of these registers map to existing IDEState registers
    in obvious ways, especially features, select, hob_features,
    and nsector (count). ICC is reserved in older specifications
    but is not supported in our implementation, and remains
    unused here. The Control register is not valid for a command
    that is trying to update the command register and is to be
    considered reserved at this point.

    What is not obvious is the LBA register mappings, but SATA 1.0
    can help inform of us legacy device support, see SATA 1.0 section
    8.5.2 "Register - Host to Device."

    LBA 7:0   - Sector Number    (sector)
    LBA 15:8  - Cyl Low          (lcyl)
    LBA 23:16 - Cyl High         (hcyl)
    LBA 31:24 - Sector Num Exp.  (hob_sector)
    LBA 39:32 - Cyl Low Exp.     (hob_lcyl)
    LBA 47:40 - Cyl High Exp.    (hob_hcyl)

    These mappings help guide which registers the FIS should be decomposed
    into/towards for CHS, LBA28 and LBA48 commands.

    As a note: The prior confusion that can be seen in the documentation
    arises from the fact that CHS and LBA28 commands use the low nybble
    of the drive select register to store LBA 27:24, whereas LNA48 commands
    use the hob_sector, hob_lcyl and hob_hcyl registers as explained above.

    The decomposition as it stands now will correctly decompose CHS, LBA28
    and LBA48 commands into their appropriate registers where the core
    IDE/ATAPI layers can deal with them correctly.

    See the below point for more information.

(3) We save cmd_fis[7] as ide_state->select, which informs
    decisions about if we are using LBA or CHS.
    This corrects a bug in AHCI wherein we attempt to set and/or
    retrieve the sector number by using ide_set_sector and
    ide_get_sector, which depend on the select register to
    determine if we are using LBA or CHS.

    Without this adjustment, LBA48 read/writes are currently
    broken. Thanks to Eniac Zheng @ HP for pointing this out.

(4) Save cmd_fis[11] as ide_state->hob_feature, as defined in SATA 3.2.

(5) For several ATA commands, the sector count register set to 0
    is a magic number that means 256 sectors. For LBA48 commands,
    this means 65,536 sectors. We drop the magic sector correction
    here, and trust the ide core layer to handle the conversion
    appropriately, in ide_cmd_lba48_transform(). As it stands,
    the current AHCI code is only compliant with LBA28 commands.
    By simply removing the magic, it will work with LBA28 and LBA48.

(6) We expand FIS decomposition to include both ATAPI and IDE devices.
    We leave the logic of determining if the fields are valid or not
    to the respective layers.

    This change intends to make it clearer that AHCI is only a
    composition mechanism for the FIS packets: the meanings of
    the registers is best left to the implementation layers for
    those devices.

(7) Forcefully setting the feature, hcyl and lcyl registers for ATAPI
    commands is removed.
    - The hcyl and lcyl magic present here is valid at boot only,
      and should not be overridden for every PACKET command.
    - The feature register is defined as valid for the PACKET command,
      so we should not suppress it. The ATAPI layer does not even
      currently depend on or require 0x01 as mandatory.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
72a065dbb1 ahci: add is_ncq predicate helper
A small helper to determine which S/ATA commands
are destined to be routed to the NCQ pathways.

This references SATA 3.2 section 13.6,
Native Command Queueing. See sections 13.6.4,
13.6.5, 13.6.6, 13.6.7 and 13.6.8 for all
SATA commands considered to be part of the
NCQ feature set. This is summarized in a small
list in section 13.6.3.1 and again in 13.6.3.2.

Not all of these NCQ commands are currently supported,
so the error pathways are adjusted slightly to be more
informative in the case they are encountered.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1415058979-16604-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
3251bdcf1c ide: Correct handling of malformed/short PRDTs
This impacts both BMDMA and AHCI HBA interfaces for IDE.
Currently, we confuse the difference between a PRDT having
"0 bytes" and a PRDT having "0 complete sectors."

When we receive an incomplete sector, inconsistent error checking
leads to an infinite loop wherein the call succeeds, but it
didn't give us enough bytes -- leading us to re-call the
DMA chain over and over again. This leads to, in the BMDMA case,
leaked memory for short PRDTs, and infinite loops and resource
usage in the AHCI case.

The .prepare_buf() callback is reworked to return the number of
bytes that it successfully prepared. 0 is a valid, non-error
answer that means the table was empty and described no bytes.
-1 indicates an error.

Our current implementation uses the io_buffer in IDEState to
ultimately describe the size of a prepared scatter-gather list.
Even though the AHCI PRDT/SGList can be as large as 256GiB, the
AHCI command header limits transactions to just 4GiB. ATA8-ACS3,
however, defines the largest transaction to be an LBA48 command
that transfers 65,536 sectors. With a 512 byte sector size, this
is just 32MiB.

Since our current state structures use the int type to describe
the size of the buffer, and this state is migrated as int32, we
are limited to describing 2GiB buffer sizes unless we change the
migration protocol.

For this reason, this patch begins to unify the assertions in the
IDE pathways that the scatter-gather list provided by either the
AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum,
2GiB. This should be resilient enough unless we need a sector
size that exceeds 32KiB.

Further, the likelihood of any guest operating system actually
attempting to transfer this much data in a single operation is
very slim.

To this end, the IDEState variables have been updated to more
explicitly clarify our maximum supported size. Callers to the
prepare_buf callback have been reworked to understand the new
return code, and all versions of the prepare_buf callback have
been adjusted accordingly.

Lastly, the ahci_populate_sglist helper, relied upon by the
AHCI implementation of .prepare_buf() as well as the PCI
implementation of the callback have had overflow assertions
added to help make clear the reasonings behind the various
type changes.

[Added %d -> %"PRId64" fix John sent because off_pos changed from int to
int64_t.
--Stefan]

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:35 +00:00
John Snow
bef1301acb ahci: unify sglist preparation
The intent of this patch is to further unify the creation and
deletion of the sglist used for all AHCI transfers, including
emulated PIO, ATAPI R/W, and native DMA R/W.

By replacing ahci_start_transfer's call to ahci_populate_sglist
with ahci_dma_prepare_buf, we reduce the number of direct calls
where we manipulate the scatter-gather list in the AHCI code.

To make this switch, the constant "0" passed as an offset
in ahci_dma_prepare_buf is adjusted to use io_buffer_offset.

For DMA pathways, this has no effect: io_buffer_offset is always
updated to 0 at the beginning of a DMA transfer loop regardless.
DMA pathways through ide_dma_cb() update the io_buffer_offset
accordingly, and for circumstances where we might make several
trips through this loop, this may actually correct a design flaw.

For PIO pathways, the newly updated ahci_dma_prepare_buf will
now prepare the sglist at the correct offset. It will also set
io_buffer_size, but this is not used in the cmd_read_pio or
cmd_write_pio pathways.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:34 +00:00
John Snow
36334faf35 ide: repair PIO transfers for cases where nsector > 1
Currently, for emulated PIO transfers through the AHCI device,
any attempt made to request more than a single sector's worth
of data will result in the same sector being transferred over
and over.

For example, if we request 8 sectors via PIO READ SECTORS, the
AHCI device will give us the same sector eight times.

This patch adds offset tracking into the PIO pathways so that
we can fulfill these requests appropriately.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:34 +00:00
John Snow
a395f3fa2f ahci: Fix byte count regression for ATAPI/PIO
This patch fixes a regression caused by commit
659142ecf7.
The problem occurs when we wish to return early
from the ahci_start_transfer function, but are now
updating the transferred byte count in the AHCI
command header via ahci_commit_buf.

This will cause problems in the Windows 8 installer.

Don't update the byte count in the command header
for the transmission of ATAPI packets: These commands
will distort the final byte count of the actual data
payload.

The call to ahci_commit_buf remains in the "out"
portion of the call in order to clean up the sglist.
The byte count is maintained by forcing size to be 0.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-14 09:20:34 +00:00
John Snow
54a7f8f38d ahci: Fix SDB FIS Construction
The SDB FIS creation was mangled;
We were writing the error byte to byte 0,
and omitting the SDB FIS magic byte.

Though the SDB packet layout states that:
byte 0: Must be 0xA1 to indicate SDB FIS.
byte 1: Port multiplier select & other flags
byte 2: status byte.
byte 3: error byte.

This patch adds an SDB FIS structure with
human-readable names, and ensures that we
are filling the structure appropriately.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412204151-18117-7-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
John Snow
659142ecf7 ahci: Update byte count after DMA completion
Currently, DMA read/write operations neglect to update
the byte count after a successful transfer like ATAPI
DMA read or PIO read/write operations do.

We correct this oversight by adding another callback into
the IDEDMAOps structure. The commit callback is called
whenever we are cleaning up a scatter-gather list.
AHCI can register this callback in order to update post-
transfer information such as byte count updates.

We use this callback in AHCI to consolidate where we delete
the SGlist as generated from the PRDT, as well as update the
byte count after the transfer is complete.

The QEMUSGList structure has an init flag added to it in order
to make qemu_sglist_destroy a nop if it is called when
there is no sglist, which simplifies cleanup and error paths.

This patch fixes several AHCI problems, notably Non-NCQ modes
of operation for Windows 7 as well as Hibernate support for Windows 7.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412204151-18117-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
John Snow
7b8bad1b6a ahci: Correct PIO/D2H FIS responses
Currently, the D2H FIS packets AHCI generates simply parrot back
the LBA that the guest sent to us in the cmd_fis. However, some
commands (like READ NATIVE MAX) modify the LBA registers as a
return value, through which the AHCI D2H FIS is the only response
mechanism. Thus, the D2H response should use the current register
values, not the initial ones.

This patch adjusts the LBA and drive select register responses for
PIO Setup and D2H FIS response packets.

Additionally, the PIO and D2H FIS responses copy too many bytes
from the command FIS that it is being generated from. Specifically,
byte 11 which is the Features(15:8) field for Register Host to
Device FIS packets, is instead reserved for the PIO Setup FIS and
should always be 0.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412204151-18117-2-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 09:48:41 +00:00
James Harper
d4f9e806c2 fix off-by-one error in pci_piix3_xen_ide_unplug
Fix off-by-one error when unplugging disks, which would otherwise leave the last ATA disk plugged, with obvious consequences. Also rewrite loop to be more readable.

Signed-off-by: James Harper <james.harper@ejbdigital.com.au>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2014-10-30 14:16:39 +00:00
Markus Armbruster
7797a73947 hmp: Remove "info pcmcia"
This command lists PCMCIA sockets and cards.  Only a few ARM boards
have sockets (akita, borzoi, connex, mainstone, spitz, terrier, tosa,
verdex, z2), the only card is the DSCM-1xxxx Hitachi Microdrive (qdev
"microdrive"), and it is only inserted during machine init, if ever.
So this command doesn't really tell anybody anything new so far.

Moreover, pcmcia_socket_unregister() has a use-after-free bug, flagged
by Coverity.  Has never been used, because there has never been code
to eject a PCMCIA card.

Not worth fixing & converting to QMP.  Remove it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Acked-by: Andreas Färber <afaerber@suse.de>
Message-id: 1411144812-22958-1-git-send-email-armbru@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-10-24 12:19:11 +01:00
Markus Armbruster
26f8b3a847 blockdev: Fix blockdev-add not to create DriveInfo
blockdev_init() always creates a DriveInfo, but only drive_new() fills
it in.  qmp_blockdev_add() leaves it blank.  This results in a drive
with type = IF_IDE, bus = 0, unit = 0.  Screwed up in commit ee13ed1c.

Board initialization code looking for IDE drive (0,0) can pick up one
of these bogus drives.  The QMP command has to execute really early to
be visible.  Not sure how likely that is in practice.

Fix by creating DriveInfo in drive_new().  Block backends created by
blockdev-add don't get one.

Breaks the test for "has been created by qmp_blockdev_add()" in
blockdev_mark_auto_del() and do_drive_del(), because it changes the
value of dinfo && !dinfo->enable_auto_del from true to false.  Simply
test !dinfo instead.

Leaves DriveInfo member enable_auto_del unused.  Drop it.

A few places assume a block backend always has a DriveInfo.  Fix them
up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
a987ee1f1b ide: Complete conversion from BlockDriverState to BlockBackend
Add a BlockBackend member to TrimAIOCB, so ide_issue_trim_cb() can use
blk_aio_discard() instead of bdrv_aio_discard().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:03:50 +02:00
Markus Armbruster
4be746345f hw: Convert from BlockDriverState to BlockBackend, mostly
Device models should access their block backends only through the
block-backend.h API.  Convert them, and drop direct includes of
inappropriate headers.

Just four uses of BlockDriverState are left:

* The Xen paravirtual block device backend (xen_disk.c) opens images
  itself when set up via xenbus, bypassing blockdev.c.  I figure it
  should go through qmp_blockdev_add() instead.

* Device model "usb-storage" prompts for keys.  No other device model
  does, and this one probably shouldn't do it, either.

* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
  blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
  which has only the BlockDriverState.

* PC87312State has an unused BlockDriverState[] member.

The next two commits take care of the latter two.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:02:25 +02:00
Markus Armbruster
097310b53e block: Rename BlockDriverCompletionFunc to BlockCompletionFunc
I'll use it with block backends shortly, and the name is going to fit
badly there.  It's a block layer thing anyway, not just a block driver
thing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:27 +02:00
Markus Armbruster
7c84b1b831 block: Rename BlockDriverAIOCB* to BlockAIOCB*
I'll use BlockDriverAIOCB with block backends shortly, and the name is
going to fit badly there.  It's a block layer thing anyway, not just a
block driver thing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:27 +02:00
Markus Armbruster
fa1d36df74 block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()
The patch is big, but all it really does is replacing

    dinfo->bdrv

by

    blk_bs(blk_by_legacy_dinfo(dinfo))

The replacement is repetitive, but the conversion of device models to
BlockBackend is imminent, and will shorten it to just
blk_legacy_dinfo(dinfo).

Line wrapping muddies the waters a bit.  I also omit tests whether
dinfo->bdrv is null, because it never is.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:27 +02:00
Markus Armbruster
b9fe8a7a12 blockdev: Eliminate drive_del()
drive_del() has become a trivial wrapper around blk_unref().  Get rid
of it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Gonglei
d2b186f96d ide: add calling add_boot_device_patch in bootindex setter function
On this way, we can assure the new bootindex take effect
during vm rebooting. Meanwhile set the initial value of
bootindex to -1.

Because ide devcies's unit property maybe
do not initialize when set_bootindex function is called,
so that we don't know its suffix. So we have to save the
call add_boot_device_path() on ide realize/init function.
When we want to change bootindex during vm rebooting, we
can call it in setter function.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 10:45:51 +02:00
Gonglei
4556363087 ide: add bootindex to qom property
Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2014-10-15 09:52:55 +02:00
John Snow
d93162e13c q35/ahci: Pick up -cdrom and -hda options
This patch implements the backend for the Q35 board
for us to be able to pick up and use drives defined
by the -cdrom, -hda, or -drive if=ide shorthand options.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412187569-23452-7-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
d8f94e1bb2 ide: Update ide_drive_get to be HBA agnostic
Instead of duplicating the logic for the if_ide
(bus,unit) mappings, rely on the blockdev layer
for managing those mappings for us, and use the
drive_get_by_index call instead.

This allows ide_drive_get to work for AHCI HBAs
as well, and can be used in the Q35 initialization.

Lastly, change the nature of the argument to
ide_drive_get so that represents the number of
total drives we can support, and not the total
number of buses. This will prevent array overflows
if the units-per-default-bus property ever needs
to be adjusted for compatibility reasons.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1412187569-23452-5-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-03 10:30:33 +01:00
John Snow
fac7aa7fc2 ahci: properly shadow the TFD register
In a real AHCI device, several S/ATA registers are mirrored or shadowed
within the AHCI register set. These registers are not updated
synchronously for each read access, but are instead updated after a
Device-to-Host Register FIS packet is received. The D2H FIS contains
the values from these registers on the device.

In QEMU, by reaching directly into the device to grab these bits before
they are "sent," we may introduce race conditions where unexpected
values are present "before they are sent" which could cause issues for
some guests, particularly if an attempt is made to read the PxTFD
register prior to enabling the port, where incorrect values will be read.

This patch also addresses the boot-time values for the PxTFD and PxSIG
registers to bring them in line with the AHCI 1.3 specification.

Lastly, several fields (PxTFD, PxSIG and PxSACT) are read-only,
and any attempts to write to them should be ignored.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-6-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:41 +01:00
John Snow
c8b5b20f81 ahci: MSI capability should be at 0x80, not 0x50.
In the Intel ICH9 data sheet, the MSI capability offset
in the PCI configuration space for ICH9 AHCI devices is
specified to be 0x80.

Further, the PCI capability pointer should always point
to 0x80 in ICH9 devices, despite the fact that AHCI 1.3
specifies that it should be pointing to PMCAP (Which in
this instance would be 0x70) to maintain adherence to
the Intel data sheet specifications and real observed behavior.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1408643079-30675-3-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:39 +01:00
Fam Zheng
8007429a99 block: Rename qemu_aio_release -> qemu_aio_unref
Suggested-by: Benoît Canet <benoit.canet@irqsave.net>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:17 +01:00
Fam Zheng
e551c999bc ide: Convert trim_aiocb_info.cancel to .cancel_async
We know that either bh is scheduled or ide_issue_trim_cb will be called
again, so we just set i, j and ret to the right values. In both cases,
ide_trim_bh_cb will be called.

Also forward the cancellation to the iocb->aiocb which we get from
bdrv_aio_discard.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:16 +01:00
Fam Zheng
0d910cfeaf ide/ahci: Check for -ECANCELED in aio callbacks
Before, bdrv_aio_cancel will either complete the request (like normal)
and call CB with an actual return code, or skip calling the request (for
example when the IO req is not submitted by thread pool yet).

We will change bdrv_aio_cancel to do it differently: always call CB
before return, with either [1] a normal req completion ret code, or [2]
ret == -ECANCELED. So the callers' callback must accept both cases. The
existing logic works with case [1], but not [2].

The simplest transition of callback code is do nothing in case [2], just
as if the CB is not called by the bdrv_aio_cancel() call.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:55 +01:00
John Snow
d735b620b5 ide/atapi: Mark non-data commands as complete
When the command completion code in IDE and AHCI
was unified to put all command completion inside
of a callback, "cmd_done," we neglected to
ensure that all AHCI/ATAPI command paths would
eventually register as finished. for the PCI
interface to IDE this is not a problem because
cmd_done is a nop, but the AHCI implementation
needs to send a D2H_REG_FIS and interrupt back
to the guest to inform of completion.

This patch adds calls to ide_stop_transfer,
which calls ide_cmd_done, inside of
ide_atapi_cmd_ok and ide_atapi_cmd_error.

This fixes regressions observed by trying to boot QEMU
with a Fedora 20 live CD under Q35/AHCI, which uses
ATAPI command 0x00, which is a status check that may
cause a hang because we never complete, and ATAPI
command 0x56, which is unsupported by our current
implementation and results in an error that we never
report back to the guest.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:38:44 +01:00
Benoît Canet
5366d0c8bc block: Make the block accounting functions operate on BlockAcctStats
This is the next step for decoupling block accounting functions from
BlockDriverState.
In a future commit the BlockAcctStats structure will be moved from
BlockDriverState to the device models structures.

Note that bdrv_get_stats was introduced so device models can retrieve the
BlockAcctStats structure of a BlockDriverState without being aware of it's
layout.
This function should go away when BlockAcctStats will be embedded in the device
models structures.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Michael Tokarev <mjt@tls.msk.ru>
CC: John Snow <jsnow@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>
CC: Max Reitz <mreitz@redhat.com>

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Benoît Canet
28298fd3d9 block: rename BlockAcctType members to start with BLOCK_ instead of BDRV_
The middle term goal is to move the BlockAcctStats structure in the device models.
(Capturing I/O accounting statistics in the device models is good for billing)
This patch make a small step in this direction by removing a reference to BDRV.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>i

Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Valentin Manea
1a7044bb62 IDE: MMIO IDE device control should be little endian
Set the IDE MMIO memory type to little endian. The ATA specs identify
words part of the control commands encoded as little endian.
While this has no impact on little endian systems, it's required for big
endian systems(eg OpenRisc).

Signed-off-by: Valentin Manea <valentin.manea@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
Markus Armbruster
7ca9b7c035 xen: Drop redundant bdrv_close() from pci_piix3_xen_ide_unplug()
drive_del() closes just fine.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-10 10:41:29 +02:00
John Snow
01ce352e62 ide: Add resize callback to ide/core
Currently, if the block device backing the IDE drive is resized,
the information about the device as cached inside of the IDEState
structure is not updated, thus when a guest OS re-queries the drive,
it is unable to see the expanded size.

This patch adds a resize callback that updates the IDENTIFY data
buffer in order to correct this.

Lastly, a Linux guest as-is cannot resize a libata drive while in-use,
but it can see the expanded size as part of a bus rescan event.
This patch also allows guests such as Linux to see the new drive size
after a soft reboot event, without having to exit the QEMU process.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
4bf6637d35 IDE: Fill the IDENTIFY request consistently
IDE-HD, IDE-ATAPI and IDE-CFATA all fill the
identify buffer in slightly different ways,
this is a relatively minor patch to make them
uniform, to emphasize that:

(1) We build the s->identify_data cache first, then
(2) We copy it to s->io_buffer to fulfill the request.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
John Snow
c5fe97e359 ide: Add wwn support to IDE-ATAPI drive
Although it is possible to specify the wwn
property for cdrom devices on the command line,
the underlying driver fails to relay this information
to the guest operating system via IDENTIFY.

This is a simple patch to correct that.

See ATA8-ACS, Table 22 parts 5, 6, and 9.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-08 11:12:44 +01:00
Markus Armbruster
28fa7133b8 ide: Fix bootindex for bus_id > 9
We identify devices by their Open Firmware device paths.  The encoding
of bus numbers is incorrect: idebus_get_fw_dev_path() formats them in
decimal, while SeaBIOS uses hexadecimal.  With bus number > 9, SeaBIOS
will miss the bootindex (lucky case), or apply it to another device
(unlucky case).

Bug can't bite right now: ich9-ahci has six ports, and the sysbus-ahci
created by Calxeda Highbank has just one.

Fix it anyway, by changing %d to %x.

I couldn't find an Open Firmware spec covering this.  For what it's
worth, OVMF agrees with SeaBIOS.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:57 +01:00
Peter Maydell
a6aebb38ba SCSI patches include bug fixes from Fam and Peter, improved error
reporting from Fam and a fix for DPRINTF bitrot.  Memory patches try
 again to initialize name from the QOM name.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJT/zhXAAoJEBvWZb6bTYby/UMP/jyhrHAaCgPgrvM4bXzMXBoZ
 l4UQXSTAmhlpr/OUI9pgT/392IGNTkDZ3mJi0sqgG6p6egWUT6a4+lzJjlhExTJy
 K5GRkbgfj83nVI1Jr0uxs58dwM527IFc5RD2Fzz0QXJIMA+HDseLfSYfa3gxbdTU
 iU8fK4PG1usb8FMR+Rd7SzGgGbGhgs6KOar98izH9C+SsPtCIGEu86KW9EkCh2dZ
 t7RI9PAJZUA1Ci2GuFISAuxl08ZkfKo29fXfM0DsovbaQda2dph7j1y6sYqXYQBA
 jZW0BEedpC1PfbWEODU81PG5t4AH5AUNmrNIsG04NiDwcRzWQckA5/x6qHsdA/33
 N/GGzqfmLknvIPecuzmwmgRBVMzf+K0xXd+StSFJWR9dwP09Y0UfhdkuBsiqdd1a
 H+xtsDyBl9pR9VgqtetIq8uQf3fpiHSRUnh++YYU8V/uK2C8ZyTmYYBJNwk2FK6l
 2PBTD1Jsl0WYCZBScM0IK+BnDNDwrygdfAa2CF1KcfNHQSiHvjHIQwsVo6Lwev7i
 1eE6/0zQ7Yumi3LOSyUx3v6JwdKH1zsN3uIjlrg4SxpmgzK/vhwrtuJti4W7HpLd
 eHLHfszCWAmNO6zx0/bH44lPZNDBBFZeaZ+NVjW0nv7y3pAeLP3qyuY3pUsr+Suh
 0xRPwhfmoSz9CZ+5mAIX
 =iLTP
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

SCSI patches include bug fixes from Fam and Peter, improved error
reporting from Fam and a fix for DPRINTF bitrot.  Memory patches try
again to initialize name from the QOM name.

# gpg: Signature made Thu 28 Aug 2014 15:10:31 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  memory: Lazy init name from QOM name as needed
  xen: hvm: Abstract away memory region name ref
  xen-hvm: Constify string
  virtio-scsi: Report error if num_queues is 0 or too large
  scsi-generic: remove superfluous DPRINTF avoid to break compiling
  block/iscsi: fix memory corruption on iscsi resize
  scsi-bus: Convert DeviceClass init to realize
  block: Pass errp in blkconf_geometry

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-28 17:08:13 +01:00
Fam Zheng
5ff5efb46c block: Pass errp in blkconf_geometry
This allows us to pass error information to caller.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-26 13:20:44 +02:00
Markus Armbruster
d4df3dbc02 block: Drop some superfluous casts from void *
They clutter the code.  Unfortunately, I can't figure out how to make
Coccinelle drop all of them, so I have to settle for common special
cases:

    @@
    type T;
    T *pt;
    void *pv;
    @@
    - pt = (T *)pv;
    + pt = pv;
    @@
    type T;
    @@
    - (T *)
      (\(g_malloc\|g_malloc0\|g_realloc\|g_new\|g_new0\|g_renew\|
	 g_try_malloc\|g_try_malloc0\|g_try_realloc\|
	 g_try_new\|g_try_new0\|g_try_renew\)(...))

Topped off with minor manual style cleanups.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Markus Armbruster
5839e53bbc block: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
for two reasons.  One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.

Patch created with Coccinelle, with two manual changes on top:

* Add const to bdrv_iterate_format() to keep the types straight

* Convert the allocation in bdrv_drop_intermediate(), which Coccinelle
  inexplicably misses

Coccinelle semantic patch:

    @@
    type T;
    @@
    -g_malloc(sizeof(T))
    +g_new(T, 1)
    @@
    type T;
    @@
    -g_try_malloc(sizeof(T))
    +g_try_new(T, 1)
    @@
    type T;
    @@
    -g_malloc0(sizeof(T))
    +g_new0(T, 1)
    @@
    type T;
    @@
    -g_try_malloc0(sizeof(T))
    +g_try_new0(T, 1)
    @@
    type T;
    expression n;
    @@
    -g_malloc(sizeof(T) * (n))
    +g_new(T, n)
    @@
    type T;
    expression n;
    @@
    -g_try_malloc(sizeof(T) * (n))
    +g_try_new(T, n)
    @@
    type T;
    expression n;
    @@
    -g_malloc0(sizeof(T) * (n))
    +g_new0(T, n)
    @@
    type T;
    expression n;
    @@
    -g_try_malloc0(sizeof(T) * (n))
    +g_try_new0(T, n)
    @@
    type T;
    expression p, n;
    @@
    -g_realloc(p, sizeof(T) * (n))
    +g_renew(T, p, n)
    @@
    type T;
    expression p, n;
    @@
    -g_try_realloc(p, sizeof(T) * (n))
    +g_try_renew(T, p, n)

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Peter Maydell
0e4a773705 SCSI changes that enable sending vendor-specific commands via virtio-scsi.
Memory changes for QOMification and automatic tracking of MR lifetime.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJT8et9AAoJEBvWZb6bTYbyIJAQAI3AlLSe27xWoUGfQUgWH30z
 Rt/pShHz3BJMfQpD79JfTH8u6uBpkQmKtflerNT7FhXN9ULDzNq+b/jRtke8nkuy
 ctCt05FhhK00rfWpUoRue4XiCuvbizBU7MK0DI3yCyNdXQyYnFvgnvsJtlqox8Zh
 J5HZcBJEmdCiWBxq7UPk0qBitp4PqNoy7jlD/Ex3m7fJN5WK2cyspQIT9zmhehVn
 B8Nwp+RitDDbXbwm0r18col5rFr/6Nj6+dW1gr+7sVJDLNsmJEqC2l3Kgk0wbPkG
 Uqwbih29me9PC9/L1VLGHY0ApKDQ8JGE0GrYgEg162hbhoxEHkjjoHMhDUfV6Pj8
 NkqcjjWl11UUhgkNqrGafayXbBVnOiEglxy8uXCeq14y9Xd/gjK9Fz6MQvRSOjms
 PFmaKknhdmpxh0DuZmTix7WBmKim8zOiCE0/vrAPvwx5L+d1bn5xh6yQvtVjBMpU
 Sru3Mhdm9bL9dUDBgOM/G6WCxSTVLBlExOblcYkQh03MfabD7bfplcrKYPXt5ull
 Y8YLjqkoIfoy5t0ErvtlpdBJjeEz99JXU+wLQ6NYHnzwzTV+oUtSaEph14mAFOcY
 XkFKdoPDI9PnyEfvy4193du8z/dSbhu7sWgHWbTCQyrcaNnSaVhlH43NUC+p23YN
 8vfEsVLd1X7MFkDBUmWp
 =M+/m
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

SCSI changes that enable sending vendor-specific commands via virtio-scsi.

Memory changes for QOMification and automatic tracking of MR lifetime.

# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  mtree: remove write-only field
  memory: Use canonical path component as the name
  memory: Use memory_region_name for name access
  memory: constify memory_region_name
  exec: Abstract away ref to memory region names
  loader: Abstract away ref to memory region names
  tpm_tis: remove instance_finalize callback
  memory: remove memory_region_destroy
  memory: convert memory_region_destroy to object_unparent
  ioport: split deletion and destruction
  nic: do not destroy memory regions in cleanup functions
  vga: do not dynamically allocate chain4_alias
  sysbus: remove unused function sysbus_del_io
  qom: object: move unparenting to the child property's release callback
  qom: object: delete properties before calling instance_finalize
  virtio-scsi: implement parse_cdb
  scsi-block, scsi-generic: implement parse_cdb
  scsi-block: extract scsi_block_is_passthrough
  scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
  scsi-bus: prepare scsi_req_new for introduction of parse_cdb

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 13:00:57 +01:00
Paolo Bonzini
469b046ead memory: remove memory_region_destroy
The function is empty after the previous patch, so remove it.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:21 +02:00
Michael Tokarev
d66168ed68 ide: only constrain read/write requests to drive size, not other types
Commit 58ac321135 introduced a check to ide dma processing which
constrains all requests to drive size.  However, apparently, some
valid requests (like TRIM) does not fit in this constraint, and
fails in 2.1.  So check the range only for reads and writes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:14 +01:00
Kevin Wolf
f7f3ff1da0 ide: Fix segfault when flushing a device that doesn't exist
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
271dddd133 cmd646: synchronise UDMA interrupt status with DMA interrupt status
Make sure that both registers are synchronised when being accessed through
PCI configuration space.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
1d113ef874 cmd646: allow MRDMODE interrupt status bits clearing from PCI config space
Make sure that we also update the normal DMA interrupt status bits at the
same time, and alter the IRQ if being cleared accordingly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
dab91a1e13 cmd646: switch cmd646_update_irq() to accept PCIDevice instead of PCIIDEState
This is in preparation for adding configuration space accessors which accept
PCIDevice as a parameter.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
5bbc0a703d cmd646: synchronise DMA interrupt status with UDMA interrupt status
Make sure that the standard DMA interrupt status bits reflect any changes made
to the UDMA interrupt status bits. The CMD646U2 datasheet claims that these
bits are equivalent, and they must be synchronised for guests that manipulate
both registers.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Mark Cave-Ayland
58f16a7b47 cmd646: add constants for CNTRL register access
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:13 +01:00
Paolo Bonzini
088415202b ahci: construct PIO Setup FIS for PIO commands
PIO commands should put a PIO Setup FIS in the receive area when data
transfer ends.  Currently QEMU does not do this and only places the
D2H FIS at the end of the operation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 18:03:12 +01:00