Author Topic: Debugging new USB Hub design and suspicious device driver.  (Read 24948 times)

guscrown

  • Member
  • ***
  • Posts: 18
Debugging new USB Hub design and suspicious device driver.
« on: February 08, 2014, 02:39:49 pm »
EE here. I was tasked with building a USB Hub onto one of our boards, and I chose the TI part TUSB2077A. Everything seems to work properly except when I disconnect one of our devices, and even then the fault does not occur every time; I can safely say that it happens in about 5% of the disconnections.

My initial assumption was that it had to do with improper layout, ESD, surge or any other electrical problem you can think up. I got all my toys out and started measuring, scoping, cutting traces, adding components, removing components, but no matter how much I did, I would always be able to duplicate the problem with the same frequency of occurrence.

I decided to sniff the USB traffic and see what I could find, but device drivers are completely outside my expertise and I am unable to interpret the logs. I have been googling many of the things I see, and the differences between the two logs are more than obvious, but I am still unable to determine if this is a faulty device driver, a faulty Windows usbhub.sys driver, or an error within the TI chip itself. Out of all the possibilities, I believe the faulty device driver is the most likely, but before I contact my device provider, I wanted to understand better what is going on.

Can any USB expert out there take a look at these logs and tell me what you can interpret from them? I captured the full USB stack, so there are a lot of Bulk transfer messages, but if you scroll down about 1.8 seconds in you will see the magic happen.

https://www.dropbox.com/s/epdhsf9lcox5ci7/Full%20Stack%20Logs.zip


MORE DETAILS

The device in question is a custom-made thermal ticket reader that uses the Thesycon usbio generic USB driver. I did not design this device, nor the device driver. It has been in use by our company for about 10 years, but this would be the first time it is connected to a USB Hub; until now it was always connected to the Root Hub in a Host PC running Windows XP. I only designed the USB Hub, using the above-mentioned TI part.

The problem:

When unplugging the device from the Hub, the hub will reset and re-enumerate itself and therefore the devices attached to it. I have checked the reset line on the hub and it is not asserted, nor does VCC drop to any level that would cause an electrical malfunction.

I have also checked for ESD, surge, overcurrent, overvoltage and undervoltage problems and have found nothing. I have protection devices on my board, as well as EMI filters for the data pairs. All other devices work normally, and if I do not unplug the device, all my software tools and our commercial software operate without faults. It is only when you unplug (about 5% of the unplugs) that we get this fault.

I can see from the logs that the "Host" is sending the "RESET_PORT" command/call (or whatever it is called) after it tried SYNC_RESET_PIPE_AND_CLEAR_STALL with unsuccessful results. I notice from the logs that this call is not made in the OK log, which simply calls for a device removal. The RESET_PORT call is issued on the HUB, which makes me think that the issue arises in the device driver.

The HUB IC is a standard TI part, and is already in mass production. My schematic follows their guidelines, and our board layout was made by a design house in Southern California that is well aware of the critical importance of good practice for USB design.

Take a look at the following section of the 2 logs:

good disconnection:



and the bad disconnection:



The 10th column from left to right is the Endpoint column. On the good disconnection, the bulk transfer of 64 bytes has an Endpoint address of 01:00:82, that is the Device's IN endpoint address. But on the bad disconnection, this is sent to FF:FF:8F, and I really don't know who or what that is.
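
For anyone reading the C:I:E column, here is a minimal sketch of how those values might be decoded. It assumes C:I:E means Configuration:Interface:Endpoint in hex, which is my reading of the column and not something confirmed in this thread. The endpoint byte follows the standard bEndpointAddress layout (bit 7 is the direction, bits 3..0 are the endpoint number). The function names are made up for illustration.

```python
def decode_endpoint_address(addr: int) -> dict:
    """Split a bEndpointAddress byte into direction and endpoint number."""
    return {
        "direction": "IN" if addr & 0x80 else "OUT",  # bit 7 set = IN (device to host)
        "number": addr & 0x0F,                        # bits 3..0 = endpoint number
    }


def parse_cie(text: str) -> dict:
    """Parse a 'CC:II:EE' hex triple (assumed Configuration:Interface:Endpoint)."""
    config, interface, endpoint = (int(part, 16) for part in text.split(":"))
    return {
        "configuration": config,
        "interface": interface,
        **decode_endpoint_address(endpoint),
    }


if __name__ == "__main__":
    for cie in ("01:00:82", "FF:FF:8F"):
        print(cie, parse_cie(cie))
```

Under that reading, 01:00:82 is IN endpoint 2 on configuration 1, interface 0, while FF:FF:8F decodes to IN endpoint 15 with 0xFF in the other two fields, which is not an endpoint the reader is described as having.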

My question basically is: Who is messing this up? The TI Hub? The Host? The device driver?

I did not design the device nor the device driver; I did the hardware design of the USB Hub using the TI part, and this prototype is working on a PCB we had manufactured. I would like to add that all other devices work fine, and I can hot-plug them with no issues; only this device is giving us this problem.

The thing about this device is that it has two stages. In stage one it identifies itself to the host as a bcdUSB 0x0110 device with one endpoint. Once our application is loaded, the application loads the firmware onto the device on-the-fly and the device is re-enumerated, but this time it fails the WHQL by using a bcdUSB of 0x0101, and it now possesses two endpoints, one IN and one OUT; in the logs you will probably see the addresses as 0x02 and 0x82, that is the device.

TI believes that the device driver is not handling the addressing topology of a USB Hub correctly, but I am unfamiliar with device driver design and do not know if this could be the actual problem.

Edit:

The other thing that I noticed is that after that last Bulk Transfer (64 bytes on the GOOD disconnection, 1024 bytes on the BAD disconnection, with different endpoints), there are 6 Internal USB get port status requests on the GOOD disconnection, one of them on the USB hub, but on the BAD disconnection no request is made on the hub whatsoever.
« Last Edit: February 08, 2014, 04:09:37 pm by guscrown »

Jan Axelson

  • Administrator
  • Frequent Contributor
  • *****
  • Posts: 3033
    • Lakeview Research
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #1 on: February 08, 2014, 05:22:20 pm »
I may have other comments later, but for now, is the device firmware returning bcdUSB 0x0101 on purpose or is it unknown why you are seeing that value:

>Stage one it identifies it the host as a bcdUSB 0x0110 device with one end point. One our application is loaded, the application loads the firmware onto the device on-the-fly and the device is re-enumerated, but this time it fails the WHQL by using a bcdUSB of 0x0101

There is no USB version 1.01; the device should return 0x0110 (or 0x0200).
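
As a rough illustration of why 0x0101 stands out: bcdUSB is binary-coded decimal in the form 0xJJMN (major, minor, sub-minor). The sketch below decodes it and compares against the two values named above; the helper name and the EXPECTED set are only for illustration and are not an exhaustive list of valid releases.

```python
# Values this reply says the device should report; not an exhaustive list.
EXPECTED = {0x0110, 0x0200}


def decode_bcd_usb(bcd: int) -> str:
    """Render a bcdUSB value (0xJJMN, binary-coded decimal) as 'J.MN'."""
    major = ((bcd >> 12) & 0xF) * 10 + ((bcd >> 8) & 0xF)
    minor = (bcd >> 4) & 0xF
    sub_minor = bcd & 0xF
    return f"{major}.{minor}{sub_minor}"


if __name__ == "__main__":
    for value in (0x0110, 0x0101, 0x0200):
        status = "expected" if value in EXPECTED else "unexpected"
        print(f"0x{value:04X} -> USB {decode_bcd_usb(value)} ({status})")
```

So 0x0110 reads as 1.10 (USB 1.1), while 0x0101 reads as 1.01, which does not correspond to a released version.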

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #2 on: February 08, 2014, 06:03:47 pm »
Jan,

I do not know why the device returns that value. It has been working on two more of our products for about 10 years, and at this time and stage I do not know if that was a design decision or simply an error. It was introduced before I joined the company.

It should be noted that this is the first time the device will be connected to a hub, so we were unaware of this potential issue.
« Last Edit: February 08, 2014, 06:15:23 pm by guscrown »

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #3 on: February 08, 2014, 07:29:44 pm »
The HUB has a pre-built state machine with all the USB hub stack in it, so no firmware is necessary on it. It is basically a drop-in solution.

I am going to make this very uneducated guess:

On the good disconnection, we see the last bulk transfer using the correct IN endpoint for the reader 0x82:



whilst on the bad disconnection, we see this strange C:I:E values FF:FF:8F



I am going to venture and say this actually crashes the USB Hub, hence we don't see a Get Port Status on the Hub, and later on we see the host trying to clear a stall on the hub:


« Last Edit: February 08, 2014, 07:32:45 pm by guscrown »

Jan Axelson

  • Administrator
  • Frequent Contributor
  • *****
  • Posts: 3033
    • Lakeview Research
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #4 on: February 09, 2014, 12:37:04 pm »
Is this an embedded hub, and if so is the disconnect a soft (firmware-controlled) disconnect for re-enumerating rather than physically removing the plug?

If not, do you see this problem with other hubs?

A hardware analyzer would show more detail about what is happening on the bus, but I'm guessing you don't have one available to you.
« Last Edit: February 09, 2014, 12:44:17 pm by Jan Axelson »

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #6 on: February 09, 2014, 12:56:25 pm »
Jan,

Yes, this is an embedded hub built into one of our boards; we used the TUSB2077A TI part and their reference schematic. The reset is a soft reset: electrically, nothing is happening that would cause the IC to reset, so I am left to assume it has something to do with the OS or the device driver.

You are correct, I do not possess a hardware analyzer, unfortunately. I will see if I can convince my manager to purchase the Beagle USB analyzer.

tsybezoff

  • Member
  • ***
  • Posts: 10
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #7 on: February 10, 2014, 07:06:33 am »
I suspect this problem lies with the TI hub driver. I doubt very much that the Windows driver is at fault. I have worked with TI USB controllers (the Luminary series) and always had problems with hot-plugging, because their developers had bugs in the power functions. Eventually I pushed TI support to change the USB power functions and develop a USB hub driver for the LM96xxx.

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #8 on: February 10, 2014, 01:54:10 pm »
Jan,

I managed to get my hands on an Ellisys USB Analyzer. I am attaching the 2 logs that I took. Maybe this will give you a better look at what is happening. I am attaching the original ufo files used by their Visual USB analysis tool.

0001.ufo = Good disconnection
0002.ufo = bad disconnection

https://www.dropbox.com/s/guakbe2dxemf3hu/Hardware%20Analyzer%20Logs.zip

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #9 on: February 10, 2014, 03:52:10 pm »
Things that I have noticed:

On the "good" disconnection, the Incomplete IN transaction has an interface value of "0", whilst on the "bad" disconnection, the Incomplete IN Transaction has no "interface" value. Does this mean anything to you?

On the good disconnection I can see 6 "Incomplete IN Transaction" attempts, and this is repeatable with all "good" disconnections. On the "bad" disconnection I can see only one attempt, after which it immediately goes into the reboot process.

Jan Axelson

  • Administrator
  • Frequent Contributor
  • *****
  • Posts: 3033
    • Lakeview Research
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #10 on: February 10, 2014, 06:10:43 pm »
I have no answers, but here is what I see, which I think agrees with what you've reported:

Device address 2 is the device that attaches to the hub and device address 1 is the hub.

On the good disconnect, at 3.756, the host sends six IN tokens to the device with no response, then quits attempting to access the device and requests the status of port 4 from the hub. The hub reports that no device is present on port 4. The host makes no more attempts to access the device.

On the bad disconnect, the host sends one IN token to the device with no response, then resets the hub's upstream bus segment. A host can do whatever it wants, but a single incomplete transaction is a minor event that normally wouldn't lead to resetting the hub.
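
To make the comparison concrete, here is a small sketch that counts how many unanswered IN tokens occur before the host does something else. The event names and tuples are hypothetical stand-ins typed in by hand to mirror the description above; they are not the analyzer's export format and not real log data.

```python
def summarize_disconnect(events):
    """Count leading unanswered IN tokens, then report the host's next action.

    events: list of (kind, device_address) tuples in bus order.
    The kind strings are hypothetical labels, not analyzer output.
    """
    retries = 0
    for kind, _address in events:
        if kind == "IN_NO_RESPONSE":
            retries += 1
        else:
            return retries, kind
    return retries, None


# Hand-written sequences mirroring the two cases described above
# (device address 2 = attached device, address 1 = hub).
good = [("IN_NO_RESPONSE", 2)] * 6 + [("GET_PORT_STATUS", 1)]
bad = [("IN_NO_RESPONSE", 2), ("BUS_RESET", None)]

if __name__ == "__main__":
    print("good:", summarize_disconnect(good))
    print("bad: ", summarize_disconnect(bad))
```

Run over an export of many unplug cycles, a count like this might show whether the single-attempt case always precedes the reset.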

When you emulate disconnect, you're switching out the pull-up on D+ (or D- if low speed) from the device? Or is the device high speed? And this has worked fine when attached to a root hub?

Re your comment - IN token packets specify a device address and endpoint number but not an interface number (not needed). I'm not sure where you're seeing an interface number.
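
For illustration, the addressing part of a token packet can be sketched as an 11-bit field: a 7-bit device address followed by a 4-bit endpoint number (the PID and CRC5 are left out here). The function names are made up, and the bit order assumes the common convention of placing the address in the low bits.

```python
def pack_token_fields(address: int, endpoint: int) -> int:
    """Pack a 7-bit device address and 4-bit endpoint number into 11 bits."""
    if not 0 <= address <= 0x7F:
        raise ValueError("device address must fit in 7 bits")
    if not 0 <= endpoint <= 0x0F:
        raise ValueError("endpoint number must fit in 4 bits")
    return address | (endpoint << 7)


def unpack_token_fields(field: int) -> tuple:
    """Return (device_address, endpoint_number) from the 11-bit field."""
    return field & 0x7F, (field >> 7) & 0x0F


if __name__ == "__main__":
    # Device address 2, endpoint number 2 (the IN direction comes from the PID).
    field = pack_token_fields(2, 2)
    print(hex(field), unpack_token_fields(field))
```

There is simply no room for an interface number in that field; any interface shown by a tool has to be inferred from the descriptors.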

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #11 on: February 10, 2014, 06:20:50 pm »
Jan,

I was using the interface number provided by Ellisys Visual USB app, it is column 4.

The device is a full speed device. I am not emulating a disconnection, I am physically disconnecting the cable.

This command to reset the Hub, when you say "the host", is that actually the USB driver on the OS? or would that be the device driver for the device running on the OS?

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #12 on: February 10, 2014, 08:46:06 pm »
It is incredible how after this much analysis I am still nowhere near understanding what is going on. The USB Protocol is in fact too complex.  ???

Today I tried some off-the-shelf hubs and they all showed a similar issue: they cannot load the firmware to the device after a reconnection. I do not know if they enumerate incorrectly or what, but I do not have the time to debug so many hubs at the same time.

The other thing I noticed: my hub and the off-the-shelf hubs will occasionally give a BSoD when reconnecting the device, or a little after doing it. The usual culprit is usbhub.sys. I should note that Windows 7 never gave me any BSoD; this only happens in Win XP, which is very odd, since the device driver was designed for Windows XP.

As of now I have several hypotheses:

1.- The device driver. Since the problem only happens when the device enumerates with the secondary firmware, I can assume that the problem resides in the device. I still haven't been able to duplicate the problem if I uninstall the device driver from the OS.

2.- The hub: According to the USB spec, USB hubs should be able to handle any incomplete or erroneous data packets without crashing. Based on the hardware logs, nothing is happening on the bus with that device, just a bunch of IN tokens and NAKs. A disconnection should cause no issue.

I guess tomorrow will be another day of logging and trying to see if I can find something I haven't seen. I will also try to get those other hubs to reload the firmware so I can actually do a disconnection test, with results that would help in determining if all hubs behave the same way.

Jan Axelson

  • Administrator
  • Frequent Contributor
  • *****
  • Posts: 3033
    • Lakeview Research
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #13 on: February 10, 2014, 08:50:51 pm »
In the Ellisys captures you linked to, I don't see anything in the interface column in either capture.

I had thought the hub and device were on the same PC board so yes, testing with conventional external hubs is a good thing to try.

Do the two different firmwares use different Product IDs in the device descriptor? Posting the descriptor sets might be helpful.

guscrown

  • Member
  • ***
  • Posts: 18
Re: Debugging new USB Hub design and suspicious device driver.
« Reply #14 on: February 10, 2014, 09:01:29 pm »
Jan,

I believe they use the same PID and VID, but they have different descriptors. The first stage has 1 endpoint, I guess it is to wait for the software to load the firmware. The secondary stage has two endpoints, I guess this is because the reader has a built in thermal printer for branding tickets. I will post all descriptors tomorrow morning as soon as I get to the office. For now I can upload the secondary firmware descriptors:

https://www.dropbox.com/s/r7hfi30puuy0ncn/OMR%20Reader%20Enumeration%2014.html

Sorry if I wasn't clear, I guess it is difficult to explain over the internet. The hub I designed and the PC do not reside on the same board. The PC is an off-the-shelf all-in-one PC that we use for several products. From there I drop 1 USB to the upstream of my hub board, which also provides power to several of our peripherals (24V, 12V, 5V). This board has been operational for about 1 year before the addition of the hub circuit; it is FCC certified and everything. I have checked the power line quality and I haven't seen any real source of noise. Of course there's always the chance of EMI being a potential problem, but in this case I do not see it as a clear reason for the problems I am facing.