qemu-e2k/hw/net
Sameeh Jubran 82342e91b6 e1000e: Fix ICR "Other" causes clear logic
This commit fixes a bug which causes the guest to hang. The bug was
observed upon a "receive overrun" (bit #6 of the ICR register)
interrupt which could be triggered post migration in a heavy traffic
environment. Even though the "receive overrun" bit (#6) is masked out
by the IMS register (refer to the log below), the driver still receives
an interrupt, because the "receive overrun" bit (#6) causes the "Other"
bit (#24 of the ICR register) to be set, as documented below. The
driver handles the interrupt and clears the "Other" bit (#24) but
doesn't clear the "receive overrun" bit (#6), which leads to an
infinite loop. Apparently the Windows driver expects the "receive
overrun" bit and the other cause bits documented below to be cleared
when the "Other" bit (#24) is cleared.

So to sum that up:
1. Bit #6 of the ICR register is set by heavy traffic
2. As a result of setting bit #6, bit #24 is set
3. The driver receives an interrupt for bit #24 (it doesn't receive an
   interrupt for bit #6 as it is masked out by IMS)
4. The driver handles and clears the interrupt of bit #24
5. Bit #6 is still set.
6. Step 2 happens all over again

The Interrupt Cause Read - ICR register:

The ICR has the "Other" bit - bit #24 - that is set when one or more
of the following ICR register's bits are set:

LSC - bit #2, RXO - bit #6, MDAC - bit #9, SRPD - bit #16, ACK - bit
#17, MNG - bit #18
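
The relationship between "Other" and its cause bits, and the loop
summarised above, can be sketched as a small standalone model. This is
an illustration only, not the code changed by this commit: the macro
and function names are hypothetical, the bit positions are the ones
listed above, and the model assumes "Other" is re-derived from the
cause bits after every write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as listed in the commit message; names are illustrative. */
#define ICR_LSC   (1u << 2)
#define ICR_RXO   (1u << 6)
#define ICR_MDAC  (1u << 9)
#define ICR_SRPD  (1u << 16)
#define ICR_ACK   (1u << 17)
#define ICR_MNG   (1u << 18)
#define ICR_OTHER (1u << 24)

#define ICR_OTHER_CAUSES \
    (ICR_LSC | ICR_RXO | ICR_MDAC | ICR_SRPD | ICR_ACK | ICR_MNG)

/* Assumption of this model: "Other" is raised while any cause bit is set. */
static uint32_t icr_refresh_other(uint32_t icr)
{
    return (icr & ICR_OTHER_CAUSES) ? (icr | ICR_OTHER) : icr;
}

/*
 * Guest write that clears the bits given in val. With clear_causes
 * false only those bits are cleared (the behaviour described as
 * looping); with clear_causes true, clearing "Other" also clears
 * every cause bit behind it.
 */
static uint32_t icr_write_clear(uint32_t icr, uint32_t val, bool clear_causes)
{
    icr &= ~val;
    if (clear_causes && (val & ICR_OTHER)) {
        icr &= ~ICR_OTHER_CAUSES;
    }
    return icr_refresh_other(icr);
}

/*
 * Count how often the driver has to clear "Other" before no unmasked
 * cause is pending, giving up after limit rounds.
 */
static int other_interrupts_until_quiet(uint32_t ims, bool clear_causes,
                                        int limit)
{
    uint32_t icr = icr_refresh_other(ICR_RXO);   /* steps 1 and 2 */
    int rounds = 0;

    assert(limit > 0);
    while ((icr & ims) && rounds < limit) {      /* step 3 */
        icr = icr_write_clear(icr, ICR_OTHER, clear_causes); /* step 4 */
        rounds++;                                /* steps 5 and 6 */
    }
    return rounds;
}
```

With only "Other" unmasked in IMS, calling
other_interrupts_until_quiet(ICR_OTHER, false, 100) runs into the limit,
while the same call with clear_causes set to true settles after a
single round.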

This bug can occur with any of these bits, depending on the driver's
behaviour and the way it configures the device. However, attempts to
reproduce it with any bit other than RXO failed, as the drivers don't
implement most of these bits. Reproducing it with the LSC (Link Status
Change - bit #2) bit didn't succeed either, as it seems that Windows
handles this bit differently.

Log sample of the storm:

27563@1494850819.411877:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
27563@1494850819.411900:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.411915:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412380:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412395:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412436:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412441:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412998:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
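
The values in the log can be checked by hand: the "ICR PENDING" field
equals ICR masked by IMS. The sketch below redoes that arithmetic for
the two ICR/IMS pairs that appear in the sample. The helper names are
hypothetical and are not QEMU functions; the bit positions are the ones
given in this commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ICR_RXO   (1u << 6)    /* "receive overrun" */
#define ICR_OTHER (1u << 24)   /* "Other" */

/* Pending causes are those raised in ICR and enabled in IMS. */
static uint32_t pending_causes(uint32_t icr, uint32_t ims)
{
    return icr & ims;
}

/* True when RXO is raised in ICR but masked out by IMS. */
static bool rxo_raised_but_masked(uint32_t icr, uint32_t ims)
{
    return (icr & ICR_RXO) && !(ims & ICR_RXO);
}

/* Re-derive the "ICR PENDING" values of the log sample. */
void check_log_sample(void)
{
    const uint32_t icr = 0x815000c2u;

    /* IMS with "Other" enabled: only bit #24 is pending. */
    assert(pending_causes(icr, 0x1a00004u) == ICR_OTHER);
    /* IMS with "Other" disabled: nothing is pending. */
    assert(pending_causes(icr, 0xa00004u) == 0u);
    /* RXO stays raised in ICR the whole time, yet is never unmasked. */
    assert(rxo_raised_but_masked(icr, 0x1a00004u));
}
```

ICR stays at 0x815000c2 on every line, so bit #6 is never cleared, and
"Other" becomes pending again as soon as IMS re-enables bit #24.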

* This behaviour wasn't observed with the Linux driver.

This commit solves:
https://bugzilla.redhat.com/show_bug.cgi?id=1447935
https://bugzilla.redhat.com/show_bug.cgi?id=1449490

Cc: qemu-stable@nongnu.org
Signed-off-by: Sameeh Jubran <sjubran@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-05-23 10:10:38 +08:00
..
fsl_etsec sysbus: Set user_creatable=false by default on TYPE_SYS_BUS_DEVICE 2017-05-17 10:37:01 -03:00
rocker pci: Convert msix_init() to Error and fix callers 2017-02-01 03:37:18 +02:00
allwinner_emac.c
cadence_gem.c cadence_gem: Make the revision a property 2017-04-20 17:39:17 +01:00
dp8393x.c qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable 2017-05-17 10:37:00 -03:00
e1000_regs.h
e1000.c e1000: disable debug by default 2017-03-31 08:48:13 +08:00
e1000e_core.c e1000e: Fix ICR "Other" causes clear logic 2017-05-23 10:10:38 +08:00
e1000e_core.h
e1000e.c e1000e: correctly tear down MSI-X memory regions 2017-03-14 15:39:55 +08:00
e1000x_common.c
e1000x_common.h
eepro100.c
etraxfs_eth.c qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable 2017-05-17 10:37:00 -03:00
ftgmac100.c net/ftgmac100: add a 'aspeed' property 2017-04-25 19:17:25 +08:00
imx_fec.c net: imx: limit buffer descriptor count 2017-02-15 11:18:57 +08:00
lan9118.c
lance.c qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable 2017-05-17 10:37:00 -03:00
Makefile.objs net: add FTGMAC100 support 2017-04-24 11:30:03 +08:00
mcf_fec.c hw/net: implement MIB counters in mcf_fec driver 2017-03-14 15:39:55 +08:00
milkymist-minimac2.c
mipsnet.c
ne2000-isa.c
ne2000.c
ne2000.h
net_rx_pkt.c NetRxPkt: Remove code duplication in net_rx_pkt_pull_data() 2017-03-06 11:46:02 +08:00
net_rx_pkt.h
net_tx_pkt.c
net_tx_pkt.h
opencores_eth.c
pcnet-pci.c
pcnet.c
pcnet.h
rtl8139.c rtl8139: correctly handle PHY reset 2017-01-06 10:38:05 +08:00
smc91c111.c
spapr_llan.c hw/net/spapr_llan: 6 byte mac address device tree entry 2017-02-22 14:28:53 +11:00
stellaris_enet.c arm: stellaris: make MII accesses complete immediately 2017-01-27 15:29:08 +00:00
trace-events trace: clean up trace-events files 2017-01-31 17:12:15 +00:00
vhost_net.c vhost_net: device IOTLB support 2017-01-18 22:59:53 +02:00
virtio-net.c virtio-net: fix wild pointer when remove virtio-net queues 2017-05-23 10:10:38 +08:00
vmware_utils.h
vmxnet3.c vmxnet3: VMStatify rx/tx q_descr and int_state 2017-03-06 11:46:02 +08:00
vmxnet3.h
vmxnet_debug.h
xen_nic.c xen: Rename xen_be_send_notify 2016-10-28 17:54:21 -07:00
xgmac.c
xilinx_axienet.c
xilinx_ethlite.c