List:       linux-usb
Subject:    Re: xhci streams bug
From:       Gerd Hoffmann <kraxel () redhat ! com>
Date:       2013-01-31 14:34:30
Message-ID: 510A80F6.7080607 () redhat ! com

  Hi,

> I think it's because xhci doesn't manage the trb_address_map radix tree
> correctly.  I can only find a single radix_tree_insert() call in the
> code, and that one is for the initial segment.  But nobody seems to
> update the radix tree when linking the next segment ...
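
A minimal sketch of the suspected failure mode, not the xhci driver's code: a
toy lookup table stands in for the radix tree, keyed by the segment's DMA
address shifted right by the segment size. All names here (seg_map,
seg_map_insert, seg_map_lookup) and the 1 KiB segment size are assumptions
made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 64 TRBs of 16 bytes each, so a segment is 1024 bytes
 * and 1024-byte aligned. */
#define SEG_SIZE  1024u
#define SEG_SHIFT 10
#define MAP_SLOTS 16

struct ring { int stream_id; };

/* Toy stand-in for the radix tree: key is (segment DMA address >> SEG_SHIFT). */
struct seg_map {
    uint64_t     key[MAP_SLOTS];
    struct ring *val[MAP_SLOTS];
    size_t       used;
};

/* Register one ring segment so events pointing into it can be resolved. */
void seg_map_insert(struct seg_map *m, uint64_t seg_dma, struct ring *r)
{
    assert(m->used < MAP_SLOTS);
    assert((seg_dma & (SEG_SIZE - 1)) == 0);   /* must be segment aligned */
    m->key[m->used] = seg_dma >> SEG_SHIFT;
    m->val[m->used] = r;
    m->used++;
}

/* Return the ring owning the segment that contains trb_dma, or NULL. */
struct ring *seg_map_lookup(const struct seg_map *m, uint64_t trb_dma)
{
    uint64_t key = trb_dma >> SEG_SHIFT;

    for (size_t i = 0; i < m->used; i++)
        if (m->key[i] == key)
            return m->val[i];
    return NULL;
}
```

With only the first segment registered (say at the made-up address 0x1000),
a lookup for a TRB at 0x1010 finds the ring, while a lookup for a TRB in a
second segment at 0x1400 returns NULL until that segment is inserted too.
That is the situation the quoted text describes once the ring advances past
the link TRB.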

There seems to be something more fishy going on: a device reset doesn't
bring the device back online.

[  117.169453] scsi3 : uas
[  117.171072] usbcore: registered new interface driver uas
[  117.175060] scsi 3:0:0:0: Direct-Access     QEMU     QEMU HARDDISK
 1.3. PQ: 0 ANSI: 5
[  117.195589] sd 3:0:0:0: Attached scsi generic sg1 type 0
[  117.206834] sd 3:0:0:0: [sdb] 2097152 512-byte logical blocks: (1.07
GB/1.00 GiB)
[  117.223331] sd 3:0:0:0: [sdb] Write Protect is off
[  117.236356] sd 3:0:0:0: [sdb] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
[  117.251144]  sdb: sdb1
[  117.266808] sd 3:0:0:0: [sdb] Attached SCSI disk

All fine so far.

[  117.324571] xhci_hcd 0000:00:0f.0: ERROR Transfer event for disabled
endpoint or incorrect stream ring
[  117.325543] xhci_hcd 0000:00:0f.0: @000000003c348550 3c8a8800
00000000 0d000060 01058000

Hitting stream ring link bug, status pipe stops working.
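
For reading the dump above, here is a small decoding sketch. It assumes the
four words after the "@" address are the event TRB's words in order
(parameter low, parameter high, status, control) and uses the Transfer Event
field layout from the xHCI spec. The struct and function names are made up
for this example.

```c
#include <stdint.h>

/* Decoded fields of a Transfer Event TRB (names are illustrative). */
struct xfer_event {
    uint64_t trb_ptr;          /* DMA address of the TRB that completed */
    unsigned completion_code;  /* status bits 31:24 */
    unsigned residual;         /* status bits 23:0, bytes not transferred */
    unsigned trb_type;         /* control bits 15:10 */
    unsigned endpoint_id;      /* control bits 20:16 */
    unsigned slot_id;          /* control bits 31:24 */
};

struct xfer_event decode_xfer_event(uint32_t lo, uint32_t hi,
                                    uint32_t status, uint32_t control)
{
    struct xfer_event ev;

    ev.trb_ptr         = ((uint64_t)hi << 32) | lo;
    ev.completion_code = status >> 24;
    ev.residual        = status & 0xffffff;
    ev.trb_type        = (control >> 10) & 0x3f;
    ev.endpoint_id     = (control >> 16) & 0x1f;
    ev.slot_id         = control >> 24;
    return ev;
}
```

Fed with 3c8a8800 00000000 0d000060 01058000 this gives TRB pointer
0x3c8a8800, completion code 13 (Short Packet), residual 96, TRB type 32
(Transfer Event), endpoint ID 5 and slot ID 1. Endpoint ID 5 would be
endpoint 2 IN, which fits the status pipe. The pointer is 1024-byte aligned,
so if segments are 1 KiB it is the first TRB of a segment.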

[  177.760380] sd 3:0:0:0: [sdb] uas_eh_abort_handler ffff88003c8ef600
tag 0, inflight: CMD
[  180.769264] scsi host3: uas_eh_task_mgmt: ABORT TASK timed out
[  180.778724] sd 3:0:0:0: [sdb] uas_eh_abort_handler ffff8800350f6d00
tag 1, inflight: CMD
[  183.780182] scsi host3: uas_eh_task_mgmt: ABORT TASK timed out
[  183.790096] sd 3:0:0:0: uas_eh_abort_handler ffff88002e859100 tag 2,
inflight: CMD
[  186.796318] scsi host3: uas_eh_task_mgmt: ABORT TASK timed out
[  186.799973] sd 3:0:0:0: uas_eh_device_reset_handler
[  189.805352] scsi host3: uas_eh_task_mgmt: LOGICAL UNIT RESET timed out

scsi / uas tries to recover via task management, which fails,
probably due to the status pipe being hosed.
Could also be a bug in qemu's tmf code though.

[  189.815757] usb 7-3: URB BAD STATUS -2
[  189.819638] usb 7-3: URB BAD STATUS -2
[  189.822628] usb 7-3: URB BAD STATUS -2
[  189.826979] usb 7-3: URB BAD STATUS -2
[  189.829789] usb 7-3: URB BAD STATUS -2
[  189.830699] usb 7-3: URB BAD STATUS -2

uas canceled inflight urbs here via usb_kill_anchored_urbs()

[  189.936982] usb 7-3: reset SuperSpeed USB device number 2 using xhci_hcd
[  189.956674] usb 7-3: Parent hub missing LPM exit latency info.  Power
management will be impacted.
[  189.958721] xhci_hcd 0000:00:0f.0: xHCI xhci_drop_endpoint called
with disabled ep ffff88003d043700
[  189.964337] xhci_hcd 0000:00:0f.0: xHCI xhci_drop_endpoint called
with disabled ep ffff88003d043740
[  189.968832] xhci_hcd 0000:00:0f.0: xHCI xhci_drop_endpoint called
with disabled ep ffff88003d043780
[  189.970601] xhci_hcd 0000:00:0f.0: xHCI xhci_drop_endpoint called
with disabled ep ffff88003d0437c0

uas resets device.

[  189.974161] scsi host3: uas_eh_bus_reset_handler success

uas thinks we are fine again ...

[  189.978617] scsi host3: sense urb submission failure

... but we are not, sense pipe still broken.

[  199.979340] sd 3:0:0:0: uas_eh_abort_handler ffff88002e859100 tag 2,
inflight: s-st a-cmd s-cmd
[  199.991331] sd 3:0:0:0: abort completed
[  199.995088] sd 3:0:0:0: Device offlined - not ready after error recovery

scsi layer decides to take the device offline as the request (test
unit ready probably) didn't work.

[  199.999020] sd 3:0:0:0: Device offlined - not ready after error recovery
[  200.001558] sd 3:0:0:0: Device offlined - not ready after error recovery
[  200.003413] sd 3:0:0:0: [sdb] Unhandled error code
[  200.006862] sd 3:0:0:0: [sdb]
[  200.007532] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
[  200.011701] sd 3:0:0:0: [sdb] CDB:
[  200.013864] Read(10): 28 00 00 00 08 00 00 00 08 00
[  200.015128] end_request: I/O error, dev sdb, sector 2048
[  200.016244] Buffer I/O error on device sdb1, logical block 0
[  200.017412] sd 3:0:0:0: [sdb] Unhandled error code
[  200.018384] sd 3:0:0:0: [sdb]
[  200.019023] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
[  200.020161] sd 3:0:0:0: [sdb] CDB:
[  200.020846] Read(10): 28 00 00 00 01 68 00 00 08 00
[  200.023422] end_request: I/O error, dev sdb, sector 360
[  200.024636] Buffer I/O error on device sdb, logical block 45

scsi layer finally throws an I/O error.
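
As a cross-check of the CDB bytes against the reported sectors, a small
sketch that pulls the LBA and transfer length out of a Read(10) CDB (bytes
2..5 and 7..8, big-endian). The helper names are invented for this example.

```c
#include <stdint.h>

/* Logical block address: CDB bytes 2..5, big-endian. */
uint32_t read10_lba(const uint8_t cdb[10])
{
    return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
           ((uint32_t)cdb[4] << 8)  |  (uint32_t)cdb[5];
}

/* Transfer length in logical blocks: CDB bytes 7..8, big-endian. */
uint16_t read10_blocks(const uint8_t cdb[10])
{
    return (uint16_t)(((unsigned)cdb[7] << 8) | cdb[8]);
}
```

The first CDB (28 00 00 00 08 00 00 00 08 00) decodes to LBA 2048, 8 blocks,
and the second (28 00 00 00 01 68 00 00 08 00) to LBA 360, 8 blocks. Both
match the sector numbers in the end_request lines, and 360 / 8 = 45 matches
"logical block 45" if the buffer layer uses 4096-byte blocks.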

But, hey, at least the machine is still fine, uas didn't crash in the
process ...

[ all still in qemu, will cross-check on real hardware ]

cheers,
  Gerd