Topic: Mass Storage Write (10) hangs at very last EP OUT data transfer

Barry Twycross

« Reply #15 on: April 01, 2015, 03:55:00 pm »
Quote
Trying to sift thru Apple's usbtracer command-line software looked like hell, and there's no manual I could find on it to make it usable.
Tracer really needs to be used in conjunction with the source code. Apple used to provide the source, but I'm not sure whether it still does for the very latest OS. I have a head start there, as I wrote some of the source.
Quote
since Total Phase doesn't provide Mass Storage "convenience" unless you fork lots of money for the higher-end models, I had to manually inspect the raw packet's CBW and CSW's
That's the way I wrote and debugged my device: with just the packet traces and no interpretation. Any USB analyzer is 100% better than no USB analyzer, though I did miss the packet interpretation I'd had with other analyzers. I eventually got approval to buy something better, but not before I'd finished the job without that help. As Total Phase doesn't provide a software upgrade to add packet interpretation, I went and bought a CATC/LeCroy/Teledyne Mercury T2 instead. I prefer the CATC software; its packet interpretation has made my life easier ever since.
Quote
The problem lies in my driver, because the USB bus does indeed see the correct amount of packets.
A. That's why I always recommend a hardware analyzer. Doing USB without one is much like banging your head against a wall, only less rewarding.

B. So the packet appears on the bus, but you never see it in your firmware? That sounds like a classic case of data toggle mismatch. The correct behavior in response to a data toggle mismatch is to receive the packet, ACK it and then discard it. Any time someone says anything like "I see it on the bus but not in my software" I say data toggle.
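
The receive, ACK, discard rule described above can be sketched in C. This is only a software model under stated assumptions: the names `out_ep_t`, `out_ep_receive` and `out_ep_reset_toggle` are hypothetical, and on many controllers the toggle check happens in hardware, so firmware only sees its effect (a packet on the bus that never reaches the firmware).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EP_MAX_PACKET 64

/* Hypothetical receive state for one OUT endpoint. */
typedef struct {
    uint8_t expected_toggle;        /* 0 = DATA0, 1 = DATA1 */
    uint8_t buf[EP_MAX_PACKET];
    size_t  len;                    /* length of the last accepted packet */
} out_ep_t;

typedef struct {
    bool ack;       /* handshake sent back to the host */
    bool accepted;  /* payload was handed to the firmware */
} out_result_t;

/* Model of what a receiver does with one error-free OUT data packet. */
out_result_t out_ep_receive(out_ep_t *ep, uint8_t toggle,
                            const uint8_t *data, size_t len)
{
    out_result_t r = { true, false };

    assert(len <= EP_MAX_PACKET);
    if (toggle != ep->expected_toggle) {
        /* Toggle mismatch: treat it as a retry of a packet already
         * received. ACK it so the host moves on, but drop the payload. */
        return r;
    }
    memcpy(ep->buf, data, len);
    ep->len = len;
    ep->expected_toggle = (uint8_t)(ep->expected_toggle ^ 1u);
    r.accepted = true;
    return r;
}

/* Clearing an endpoint halt or setting a configuration restarts at DATA0. */
void out_ep_reset_toggle(out_ep_t *ep)
{
    ep->expected_toggle = 0;
}
```

If the device and host ever disagree about the toggle (for example after a stall is cleared on one side only), every second packet is silently dropped in exactly this way.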



bazz

« Reply #16 on: April 02, 2015, 01:55:21 am »
I've made lots of progress.
On OSX, the drive can now "come online": I can view and edit a sample text file I created from the device side, and I can unmount the drive.

You might be wondering what I did to fix it. Although your data-toggle idea is sound, that wasn't it. The problem was with the unique aspects of my USB controller, the PDIUSBD12. As you know, when studying a datasheet it takes time to fully comprehend how everything connects, and sometimes the datasheet says things that mislead you into thinking "OK, I don't need to address XYZ" when you really do. That's what happened to me, and right now my solution is sub-optimal. Let me expound:

PDIUSBD12

Endpoint 2
This endpoint is different from the other endpoints, which have 16-byte single buffers. EP2 has a 64-byte buffer and is double buffered. The datasheet advertises the double buffering as transparent, so I just operated on EP2 the same way I had on EP0 while enumerating the device. WRONG. The more I looked through the datasheet, the more I realized the potential importance of this double-buffering behaviour, especially when I noticed a max read of 130 bytes (2 extra to account for certain peculiarities of the device listed in the datasheet). I felt like the packets were in the controller and I just didn't know how to access them. Then I noticed a command that tells you whether one of the 2 buffers is full. Transparent my ass... Just kidding :)

The firmware example uses DMA with EP2, which I cannot do, so until I look at it I can only wonder whether it will shed light on how to handle EP2 properly. For now, on a hunch, I moved all of my logic over to the smaller endpoint 1, changed some code to adapt to the smaller EP size, and voilà, it started working! That's how I've fixed it so far :) I hope to return to EP2's bigger buffer in the future. I can probably figure out how to use it, given instincts and possible firmware documentation [as long as DMA doesn't auto-handle what I will need to handle manually].
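
One plausible reading of the double-buffering problem is that a single interrupt can leave two packets waiting, so the handler has to keep reading until the controller reports no full buffer. The sketch below models that idea against a simulated endpoint. Every name in it (`dbuf_ep_t`, `ep_buffer_full`, `ep_read_and_clear`, `ep_drain`) is a hypothetical stand-in, and it has not been checked against the PDIUSBD12 datasheet or its command set.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EP2_PACKET_SIZE 64
#define EP2_NUM_BUFFERS 2

/* Simulated double-buffered OUT endpoint. On real hardware this state
 * lives inside the controller and is reached through its command
 * interface; the fields here are illustrative only. */
typedef struct {
    uint8_t data[EP2_NUM_BUFFERS][EP2_PACKET_SIZE];
    size_t  len[EP2_NUM_BUFFERS];
    int     full[EP2_NUM_BUFFERS];
    int     read_idx;               /* buffer the firmware reads next */
} dbuf_ep_t;

/* Stand-in for a "is a buffer waiting?" status query. */
static int ep_buffer_full(const dbuf_ep_t *ep)
{
    return ep->full[ep->read_idx];
}

/* Stand-in for "read buffer" followed by "clear buffer". */
static size_t ep_read_and_clear(dbuf_ep_t *ep, uint8_t *dst)
{
    int i = ep->read_idx;
    size_t n = ep->len[i];

    memcpy(dst, ep->data[i], n);
    ep->full[i] = 0;                            /* hand the buffer back */
    ep->read_idx = (i + 1) % EP2_NUM_BUFFERS;   /* next read uses the other one */
    return n;
}

/* Drain every pending packet, not just the first one. Stops early if the
 * destination cannot hold another full-size packet. */
size_t ep_drain(dbuf_ep_t *ep, uint8_t *dst, size_t cap)
{
    size_t total = 0;

    while (ep_buffer_full(ep) && cap - total >= EP2_PACKET_SIZE) {
        total += ep_read_and_clear(ep, dst + total);
    }
    return total;
}
```

A handler that reads only one buffer per interrupt would leave the second packet sitting in the controller, which would match the feeling that the data was there but unreachable.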


And I learned a couple things at this point:

Point 1
I need a more dynamic response to TEST UNIT READY, instead of just passing good CSWs, if I want my "drive" to come back online after a dismount or during the OSX format process, which still doesn't complete because of this and possibly other roadblocks still to come.

Point 2
If I remount the drive after unmounting it, the drive does not display the new files I may have added, and the one pre-existing text file is listed as having 0 bytes, although its data and the other files' raw data can be found in a raw dump of the partition. I didn't inspect the FAT itself to see whether it had become corrupted. I suspect this is connected to my observation that when OSX mounts my drive, it writes some data to the partition. Off the top of my head, I loosely recall it writing "MAC OSX" somewhere after the 1st or 2nd sector, and a serial number too.

Summary on these points
I'm not too concerned with these observations or with fixing them directly, since I already know I'm not properly implementing the whole required SCSI command set. So here's what I've done to remedy that:

Learning from a Real Flash Drive
I immediately ran into a problem: I am implementing a Full Speed product, but most flash drives are high-speed. Purchasing a full-speed drive seemed pretty lame, as they are rare and shipping is expensive, plus the sloppy USB advertising makes it hard to guarantee I'd get the operation I'm after. Luckily, I have some USB sticks here, and I hoped one of them would support Full Speed and that I could change the bus speed from the OS side.

Reducing 2.0 Bus Speed to Full Speed
I dabbled with virtualization (VirtualBox and Windows XP), but to no avail. It is said you might be able to disable the virtual USB 2.0 (EHCI) controller and get it working, but that didn't work for me, even with the VirtualBox extensions installed and the virtual USB 2.0 controller enabled in the VM settings. Luckily, I found a solution.

I booted into my Ubuntu 12.04 partition through rEFInd [love it] and found some instructions on how to [maybe] change the bus and port speed. I was warned that the hardware has to be old enough for this to work. Anyway, the instructions are here, and they worked : ) -- http://lists.en.qi-hardware.com/pipermail/discussion/2011-August/008508.html -- Note to self: using dmesg to find the USB bus and port was more intuitive than reading the output of lsusb, which shows a device number that I originally mistook for the port number.

I got the Total Phase USB driver working through the udev instructions found here: http://www.totalphase.com/support/articles/200472426/#s4.3.1 -- the USB drivers themselves are easily found on Total Phase's website.

Analyzing USB Stick at Full Speed
I proceeded to take a number of "snapshots" of different events. Right now, I have the following:

Code:
1_enumerate_and_preload_drive.tdc                      1.1M
2_launch_gparted.tdc                                   765K
3_select_and_unmount_drive.tdc                         11M
4_erase_partition_to_unallocated.tdc                   3.0M
5_create_msdos_partition_table.tdc                     3.6M
6_create_fat16_partition.tdc                           6.8M
7_exit_gparted.tdc                                     908K
8_mount_open_drive_create_file_write_file_unmount.tdc  281K
9_connect_open_drive_open_file_close_file.tdc          1.4M

To print that listing, I worked out my own way to do it:
Code:
ls -lh | awk '{print $9,"   ",$5}' | column -t
« Last Edit: April 02, 2015, 01:57:42 am by bazz »

bazz

« Reply #17 on: April 02, 2015, 06:23:51 am »
Another thing I'm curious about: when a stall condition arises, the host sends a CLEAR FEATURE request to my device. I'm not sure I'm responding to it correctly, or maybe I'm mis-parsing it, because communication then goes silent for a good number of seconds before it resumes. I'm not sure whether this is normal behavior, and I'm looking for clarification. GET STATUS is also a potential culprit.

[too lazy to post source code, hoping for high-level advice]
« Last Edit: April 02, 2015, 06:47:00 am by bazz »

bazz

« Reply #18 on: April 02, 2015, 10:54:19 am »
Mode Sense (6) Response -- Question

Note: I expect that the behavior I'm about to describe is a consequence of the "Allocation Length" fields in SCSI Primary Commands, which, when present, allow the CSW residue to be set to 0 even when less data is sent than specified in the CBW.

This is from my live capture of a USB flash drive operating in Full Speed, with 64-byte bulk endpoints:

At "USB enumeration", the Linux host eventually sends its first MODE SENSE (6) command, asking for all supported pages. The CBW specifies a dCBWDataTransferLength of 0xC0 (192). The same length is also specified inside the CBWCB.
The device replies with an IN packet of only 0x23 bytes, containing 2 pages: Caching and Informational Exceptions Control. No block descriptors were sent. Immediately following this transmission, a CSW is sent with good status and no residue.
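
To make the two lengths in this capture concrete, here is a rough sketch of decoding a CBW and pulling the MODE SENSE (6) allocation length out of the CBWCB. The field offsets follow the Bulk-Only Transport CBW layout (31 bytes, little-endian, signature "USBC"); the type and function names (`cbw_t`, `cbw_parse`, `mode_sense6_alloc_len`) are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CBW_SIGNATURE 0x43425355u   /* bytes 55 53 42 43 ("USBC") on the wire */
#define CBW_LENGTH    31
#define SCSI_MODE_SENSE_6 0x1A

/* Decoded Command Block Wrapper (illustrative type, not from any library). */
typedef struct {
    uint32_t tag;                   /* dCBWTag, echoed back in the CSW */
    uint32_t data_transfer_length;  /* dCBWDataTransferLength */
    bool     data_in;               /* bmCBWFlags bit 7: 1 = device to host */
    uint8_t  lun;                   /* bCBWLUN */
    uint8_t  cb_length;             /* bCBWCBLength, 1..16 */
    uint8_t  cb[16];                /* CBWCB, the SCSI command block */
} cbw_t;

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns false if the packet is not a valid-looking CBW. */
bool cbw_parse(const uint8_t *raw, size_t len, cbw_t *out)
{
    if (len != CBW_LENGTH || le32(raw) != CBW_SIGNATURE) {
        return false;
    }
    out->tag = le32(raw + 4);
    out->data_transfer_length = le32(raw + 8);
    out->data_in = (raw[12] & 0x80) != 0;
    out->lun = raw[13] & 0x0F;
    out->cb_length = raw[14] & 0x1F;
    if (out->cb_length < 1 || out->cb_length > 16) {
        return false;
    }
    memcpy(out->cb, raw + 15, sizeof out->cb);
    return true;
}

/* MODE SENSE (6) carries its one-byte allocation length in CDB byte 4. */
uint8_t mode_sense6_alloc_len(const cbw_t *cbw)
{
    return cbw->cb[4];
}
```

For the capture described here, both `data_transfer_length` and the allocation length would decode to 0xC0, while the device actually returned 0x23 bytes.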

The 13 Cases would have dictated other behavior, such as sending padded data or stalling the IN endpoint, but that doesn't happen here. Is this proper?

The USB operations happily proceed. I see neither a STALL condition nor a CLEAR FEATURE request. So I'm wondering why the residue wasn't set and the stall didn't occur, as the 13 cases specify, given that the device sends less data than specified in the CBW. Is this a special case? How come I did not read about such behavior, even in Jan's book or firmware implementation, which we can both agree was incomplete, the mode sense implementation specifically? [no offense]

But since this behavior appears non-standard and I haven't seen it anywhere else yet, I'm wondering whether it will pop up in other commands I should know about. Could someone please shed light on this? Thank you [can't have the answers soon enough! hehe.. Fortunately I'll be spending the next few days on break, to give you some time :P ]  But please answer quickly :D


EDIT: Am I missing some documentation? I have the USB 2.0 spec, the Mass Storage spec, and the SCSI command sets, but I feel a little in the dark about exactly how each SCSI command is treated in the context of CBWs and CSWs. Is there a doc on that?
« Last Edit: April 02, 2015, 11:42:21 am by bazz »

Barry Twycross

« Reply #19 on: April 02, 2015, 05:26:56 pm »
Quote
I've made lots of progress.

I'm glad things are working out for you.

Quote
Point 1
I need a more dynamic response to TEST UNIT READY instead of just passing good CSW's if I want my "drive" to come back online after a dismount.. or during the OSX format process, which still doesn't complete due to this and possibly other future road blocks.

Point 2
If I remount the drive after unmount, the drive will not display the new files I may have added,

Does your device declare itself as a removable device? That is, does it set the RMB bit in the inquiry data? That's the easiest way of doing this. You need to keep track of whether your "media" is loaded or not. It can start out loaded, but you should get an eject command (Start/Stop Unit with LoEj=1), which causes you to become not loaded. Then you return a sense of Not Ready, no media, to just about anything the host tries to do (including Test Unit Ready). When you want to remount the media, you start returning good status to Test Unit Ready.

If the host thinks the media was ejected, it invalidates all caches and rereads the media and can find new files. This method works in our device and in about 150 million iPods out there.
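
The loaded / not-loaded bookkeeping described above might look roughly like this in C. It is a sketch under assumptions: `lun_state_t`, `lun_handle_command` and `lun_insert_media` are hypothetical names, only the CSW status and sense data are modelled (no data stage), and the spin-up/spin-down state of a Start/Stop Unit with LoEj=0 is deliberately ignored. The opcode, sense key and ASC values are the standard SCSI ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCSI_TEST_UNIT_READY   0x00
#define SCSI_REQUEST_SENSE     0x03
#define SCSI_INQUIRY           0x12
#define SCSI_START_STOP_UNIT   0x1B

#define SENSE_KEY_NO_SENSE     0x00
#define SENSE_KEY_NOT_READY    0x02
#define ASC_MEDIUM_NOT_PRESENT 0x3A

#define CSW_STATUS_PASSED      0x00
#define CSW_STATUS_FAILED      0x01

/* Hypothetical per-LUN state kept by the firmware. */
typedef struct {
    bool    media_loaded;
    uint8_t sense_key;   /* reported by the next REQUEST SENSE */
    uint8_t asc;
} lun_state_t;

/* Returns the bCSWStatus to report for this command block. */
uint8_t lun_handle_command(lun_state_t *lun, const uint8_t *cdb)
{
    assert(lun != NULL && cdb != NULL);

    switch (cdb[0]) {
    case SCSI_INQUIRY:
    case SCSI_REQUEST_SENSE:
        /* Answered whether or not media is present; sense is left
         * untouched so REQUEST SENSE can report it. */
        return CSW_STATUS_PASSED;

    case SCSI_START_STOP_UNIT: {
        bool start = (cdb[4] & 0x01) != 0;   /* START bit */
        bool loej  = (cdb[4] & 0x02) != 0;   /* LoEj bit  */
        if (loej) {
            lun->media_loaded = start;       /* START=0 ejects, START=1 loads */
        }
        /* With LoEj=0 the media stays where it is in this sketch. */
        return CSW_STATUS_PASSED;
    }

    default:
        if (!lun->media_loaded) {
            lun->sense_key = SENSE_KEY_NOT_READY;
            lun->asc = ASC_MEDIUM_NOT_PRESENT;
            return CSW_STATUS_FAILED;
        }
        lun->sense_key = SENSE_KEY_NO_SENSE;
        lun->asc = 0x00;
        return CSW_STATUS_PASSED;
    }
}

/* Device-side "remount": Test Unit Ready starts passing again. */
void lun_insert_media(lun_state_t *lun)
{
    lun->media_loaded = true;
}
```

Starting with `media_loaded = true`, an eject makes Test Unit Ready fail with Not Ready / medium not present until `lun_insert_media` is called, at which point the host should re-read the media.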

Quote
Summary on these points
I'm not too concerned with these observations or fixing them directly since I already know I'm not properly implementing the whole required-SCSI command set.. So here's what I've done to admonish that:
I'd suggest you implement at least the RBC command set. ftp://ftp.t10.org/t10/document.97/97-260r2.pdf

And maybe the bootability set: http://www.usb.org/developers/docs/devclass_docs/usb_msc_boot_1.0.pdf

Quote
I am implementing a Full Speed product, but most flash drives are high-speed.
The easiest way to make something full speed is to attach it via a full-speed hub. You can still buy hubs which are advertised as full speed; we did for this very sort of reason.

It used to be that you could remove the Mac's EHCI driver and that would disable high speed. That hasn't worked since about the 2010 Macs though. Warning: Don't do this on the wrong machine, it will disable USB entirely.

If your Mac has slots, you may be able to find a UHCI or OHCI plug-in card for it. That will be full speed only.

Barry Twycross

« Reply #20 on: April 02, 2015, 05:28:56 pm »
Quote
Another thing I'm curious of is that when a stall condition arises.. The host sends a CLEAR FEATURE request to my device.. I'm not sure I'm responding correctly to it, or maybe mis-parsing it, because communication then goes silent for a good amount of seconds before it resumes again.. I'm not sure if this is normal behavior. I'm looking for clarification on this. GET STATUS is also a potential culprit.
Are there NAKs on the bus during this quiet period? You may be ignoring them. If there are, the host is expecting you to do something that you aren't doing. A SETUP command will typically time out after 5 sec, which sounds like what you're describing.
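
For reference, here is a hedged sketch of recognising CLEAR_FEATURE(ENDPOINT_HALT) in a SETUP packet and of the bookkeeping that usually goes with it. The SETUP layout and request codes come from chapter 9 of the USB 2.0 spec; the endpoint table, the index mapping and the function names (`ep_ctl_t`, `usb_parse_setup`, `usb_handle_clear_halt`) are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 8-byte SETUP packet, multi-byte fields little-endian. */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_t;

#define REQTYPE_STD_OUT_ENDPOINT 0x02    /* host-to-device, standard, endpoint */
#define REQ_CLEAR_FEATURE        0x01
#define FEATURE_ENDPOINT_HALT    0x0000

/* Hypothetical per-endpoint control state. */
typedef struct {
    bool    halted;
    uint8_t toggle;     /* 0 = DATA0, 1 = DATA1 */
} ep_ctl_t;

usb_setup_t usb_parse_setup(const uint8_t raw[8])
{
    usb_setup_t s;
    s.bmRequestType = raw[0];
    s.bRequest      = raw[1];
    s.wValue  = (uint16_t)(raw[2] | (raw[3] << 8));
    s.wIndex  = (uint16_t)(raw[4] | (raw[5] << 8));
    s.wLength = (uint16_t)(raw[6] | (raw[7] << 8));
    return s;
}

/* Illustrative mapping: endpoint number * 2, plus 1 for the IN direction. */
static unsigned ep_index(uint8_t ep_addr)
{
    return (ep_addr & 0x0Fu) * 2u + (ep_addr >> 7);
}

/* Returns true if the request was a valid clear-halt and was applied; the
 * caller then completes the status stage with a zero-length IN packet on
 * EP0. Returns false if the request should be handled elsewhere or stalled. */
bool usb_handle_clear_halt(ep_ctl_t eps[], unsigned n_eps, const uint8_t raw[8])
{
    usb_setup_t s;
    unsigned i;

    assert(eps != NULL && raw != NULL);
    s = usb_parse_setup(raw);
    if (s.bmRequestType != REQTYPE_STD_OUT_ENDPOINT ||
        s.bRequest != REQ_CLEAR_FEATURE ||
        s.wValue != FEATURE_ENDPOINT_HALT ||
        s.wLength != 0) {
        return false;
    }
    i = ep_index((uint8_t)(s.wIndex & 0xFF));   /* wIndex low byte = endpoint address */
    if (i >= n_eps) {
        return false;
    }
    eps[i].halted = false;
    eps[i].toggle = 0;      /* clearing a halt restarts the endpoint at DATA0 */
    return true;
}
```

Two things worth checking against a trace when a quiet period follows this request: whether the zero-length status packet on EP0 actually went out, and whether the bulk endpoint has something queued afterwards (in Bulk-Only Transport the host normally goes on to read the CSW once the halt is cleared). Both are suggestions to verify, not a diagnosis.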

Barry Twycross

« Reply #21 on: April 02, 2015, 05:38:27 pm »
Quote
Mode Sense (6) Response -- Question

Note: It is my expectation that the behavior I'm about to describe is a case of the "Allocation Length" fields in SCSI Primary Commands, which when present allow the CSW residue to be set to 0 even when less data is sent than specified in the CBW.
I'm not sure of what your question is exactly, and I can't be bothered to look up the 13 cases to check.

The first point is that any random USB stick is likely to be very badly implemented. As one commentator noted, their adherence to specs is "coincidental at best". They're likely to do just enough to enumerate successfully on Windows, so the behavior of Windows is the de facto standard for this.

At least for mode sense and inquiry, if the SCSI command specifies an allocation length, you can respond with less. No one cares about that. I don't know if it's actually standard or just customary.
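
As a small worked example of the lengths involved: SPC describes the allocation length as the maximum the host has set aside for the reply, and the Bulk-Only Transport spec defines dCSWDataResidue as the difference between dCBWDataTransferLength and what was actually transferred. Under that reading, a device could compute both numbers as below. The helper names are hypothetical, and this is a sketch of the arithmetic only, not a statement about what any particular host requires.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Bytes to send for a data-in command that carries an allocation length:
 * never more than the device has, the CDB allows, or the CBW expects. */
uint32_t data_in_length(uint32_t cbw_transfer_len, uint32_t alloc_len,
                        uint32_t available)
{
    return min_u32(available, min_u32(alloc_len, cbw_transfer_len));
}

/* dCSWDataResidue: what the host expected minus what was actually sent. */
uint32_t csw_residue(uint32_t cbw_transfer_len, uint32_t sent)
{
    assert(sent <= cbw_transfer_len);
    return cbw_transfer_len - sent;
}
```

With the numbers from the capture (CBW length 0xC0, allocation length 0xC0, 0x23 bytes of mode data), this gives 0x23 bytes sent and a residue of 157, whereas the captured stick reported a residue of 0.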

Barry Twycross

« Reply #22 on: April 02, 2015, 05:48:30 pm »
Quote
It used to be that you could remove the Mac's EHCI driver and that would disable high speed. That hasn't worked since about the 2010 Macs though. Warning: Don't do this on the wrong machine, it will disable USB entirely.
I see you say you have a 2009 MacBook, so this might work for you. But do it at your own risk, and have a backup plan in place in case it doesn't work. If you have another machine handy, you can set up screen sharing or remote terminal access. This lets you control a machine with dead USB well enough to put it back together. The other backup plan is to have a different boot partition installed, so you can boot to the other partition and recover things (maybe).

From the terminal:

Code:
sudo mv /System/Library/Extensions/IOUSBFamily.kext/Contents/PlugIns/AppleUSBEHCI.kext /
sudo touch /System/Library/Extensions

Then reboot. That moves your EHCI driver out from where it lives and tells IOKit to rebuild its extension cache (now without the EHCI driver). Without the EHCI driver, high-speed USB no longer works, but with the right sort of machine (pre-2010) the full-speed controllers will still work.

Recovery is to move it back to the proper place and touch the extensions folder.

I used to do this all the time when I needed a full-speed host. There are some machines it doesn't work on, so always have a backup plan, and don't blame me if it hoses your machine.

bazz

« Reply #23 on: April 02, 2015, 06:57:01 pm »
hehe, I've read all your responses. Thanks for the tip, but it sounds pretty risky, and even though I have a triple-boot machine and a separate external hard drive with OSX installed, I think I'll pass.

EDIT: Regarding RBC, you must not be talking about stating support for RBC in the inquiry response, right? I actually went out of my way NOT to do that. In Jan Axelson's Mass Storage book, she explains that PCs don't have RBC drivers natively, and that it'd be easier to create one's own vendor firmware at that point. Or are you talking about implementing the RBC commands while declaring SBC in the inquiry response?

EDIT 2: I have chosen to state in the inquiry response that I support SPC-2 / SBC, even though there are much newer versions of SPC and SBC. Maybe this is something I can upgrade once I feel more comfortable, after supporting the older versions, which I can reference from the USB stick I have. I doubt much would change in the implementation anyway.
« Last Edit: April 02, 2015, 07:04:28 pm by bazz »

Barry Twycross

« Reply #24 on: April 03, 2015, 06:23:21 pm »
You're right, Windows doesn't support RBC as declared in the inquiry. So we just declare Zero, whatever that is.

I was thinking about what commands actually to support. RBC and Bootability make a good subset, it works with all hosts I've tried it on so far. I don't support Format or Verify, but that's never been a problem.

bazz

« Reply #25 on: April 07, 2015, 07:10:16 am »
Quote
hehe, I've read all your responses.. Thanks for the tip, it sounds pretty risky, and even tho I have a triple boot machine and separate external hard drive with OSX installed, I think I'll pass.

EDIT: Regarding RBC, you must not be talking about stating support for RBC in the inquiry response, right? I actually personally went out of my way NOT to -- well in Jan Axelson's Mass Storage book, she went on to explain that PC's don't have RBC drivers natively, and that it'd be easier to create one's own vendor firmware at that point... Or are you talking about implementing RBC in the context of having selected SBC in the inquiry response?

EDIT 2: I have chosen for heaven's sake to respond to inquiry that I support SPC-2 / SBC, even though there are much newer versions of SPC and SBC.. Maybe this is something I can upgrade when I feel more comfortable after having supported the older one which I can reference from the USB stick I have. I doubt much would change in the impl anyways.
I tried this on my laptop and it worked, but with the following caveat: the Beagle must be running in high-speed mode to function correctly. On Linux I was able to toggle high/full speed per bus, so that I could have the Beagle running at high speed and the USB flash drive running at full speed. Can I do this on OS X? If not, you suggested I buy a full-speed USB hub. Do you have any suggestions about what debugging a device through a hub is like? Will it make the log file more confusing to read?

EDIT
Quote
You're right Windows doesn't support RBC as declared in the enquiry. So we just declare Zero, whatever that is.
That would be SBC.
« Last Edit: April 07, 2015, 08:04:02 am by bazz »

bazz

« Reply #26 on: April 07, 2015, 12:46:14 pm »

Quote
Does your device declare itself as a removable device? That is it sets the RMB bit in the inquiry data. That's the easiest way of doing this. You need to keep state of whether your "media" is loaded or not. It can start out loaded, but you should get an eject command (Start/Stop unit with LoEj=1) which causes you to become not loaded. Then you return a sense of Not ready, no media to just about anything the host tries to do (including test unit ready). When you want to remount the media, you start returning good status to the test unit ready.

If the host thinks the media was ejected, it invalidates all caches and rereads the media and can find new files. This method works in our device and about 150 million iPods out there

OK, I'm coding this part of the driver, and I have some questions.
What if I receive a START/STOP with LoEj != 1? What kind of circumstance does that put me in? Can I just ignore LoEj and base all my logic on START/STOP?

Also, if SCSI commands that have a data stage are sent while in the stopped state, what is the correct response? Send junk data of the requested length, followed by a CSW with status 1?

bazz

« Reply #27 on: April 08, 2015, 12:27:59 pm »
I fixed a dire mistake in my code and now I can basically operate the drive, format it, update it, everything :D. There are still some bugs in my code, but they are personal bugs I need to find.

Although my previous posts are still pending answers, I'd love to talk about the super-bug-fix I found today :D

Arg, so it all boils down to this:
&BulkXfer.dataBuffer vs. &BulkXfer.dataBuffer[0] or some prefer just BulkXfer.dataBuffer

Oops. It was causing the Write(10) to not actually write to the proper buffer, and THAT'S why everything wasn't working correctly: formatting got stuck at "waiting for disks to reappear" on Mac OSX, on Linux the drive appeared unformatted after ejecting and re-inserting, and on OSX the drive did not display file contents after ejecting and re-inserting.
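
The post doesn't show how `BulkXfer` is declared, so the following is only a guess at why the two expressions behaved differently. If `dataBuffer` is a pointer member, `&BulkXfer.dataBuffer` is the address of the pointer variable itself, not of the data it points to. If it is an array member, `&BulkXfer.dataBuffer` and `&BulkXfer.dataBuffer[0]` are the same address and differ only in type. The struct names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two hypothetical shapes the transfer struct could have. */
typedef struct { uint8_t *dataBuffer; }   xfer_with_pointer_t;
typedef struct { uint8_t dataBuffer[64]; } xfer_with_array_t;

/* Pointer member: &x->dataBuffer is where the pointer is stored, while
 * x->dataBuffer is where the data lives. Returns 1 if they coincide. */
int pointer_member_same_address(const xfer_with_pointer_t *x)
{
    assert(x != NULL);
    return (const void *)&x->dataBuffer == (const void *)x->dataBuffer;
}

/* Array member: &x->dataBuffer and &x->dataBuffer[0] are the same address
 * (the types differ: pointer-to-array versus pointer-to-element). */
int array_member_same_address(const xfer_with_array_t *x)
{
    assert(x != NULL);
    return (const void *)&x->dataBuffer == (const void *)&x->dataBuffer[0];
}
```

With a pointer member backed by separate storage, the first function returns 0, so a copy aimed at `&BulkXfer.dataBuffer` would overwrite the pointer and whatever follows it in the struct instead of filling the buffer. With an array member the second function returns 1, and the mistake would only matter where the pointer type is used, such as pointer arithmetic.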

But now it works!!

Pending Issues
I still get a 5-second hang after Clear Feature, but I don't know why. Everything checks out fine: I unstall the EP, then I send an empty data response packet. Any tips? I'll try to capture that event on my USB sniffer, but as my code gets more accurate, there are fewer and fewer stalled-endpoint events :)

Also, sometimes the device just stops handling USB traffic. Argh, that's on my end.
« Last Edit: April 08, 2015, 12:30:18 pm by bazz »

Jan Axelson

« Reply #28 on: April 08, 2015, 02:01:12 pm »
As Barry asked, is your device responding to the Clear Feature request? Does the request complete?

Barry Twycross

« Reply #29 on: April 09, 2015, 05:04:06 pm »
Quote
On Linux I was able to toggle high/full speed by the bus, so that I could have the beagle running High speed and the USB flash drive running in full speed. can I do this on my OS X? If not, you suggested I buy a full-speed USB hub. Do you have any suggestions for what debugging a device over a hub is like? Will it make the log file more confusing to read?
The idea of moving the EHCI driver aside kills all high speed in the system, so you probably can't have different busses running at different speeds.

By log file, do you mean the Beagle trace? A hub doesn't add significantly to the log's complexity. It's particularly easy if you can hide traffic to the hub; I'm not sure if the Beagle does that or not. If not, there is very little traffic to the hub, and you only see the traffic to the hub, not the replies, if you position the analyzer below the hub.

One other thought, does your device chip have a high speed mode bit? Most chips I've worked with allow you to respond to the chirp or not. If you turn that off the device becomes full speed even on a high speed capable bus.