Differential D20293

ucom(4): synchronously execute param changes
AbandonedPublic
Actions

Authored by kevans on May 17 2019, 5:46 PM.

Details

Reviewers

• hselasky
• ian
trasz

Summary

Param change command is queued into a separate thread for completion and control immediately returns to ucom_param to schedule the transfer enable and line state change commands, along with a quick return to the tty layer. As as side effect, tcsetattr (for instance) may return long before params reach the hardware and things can go a bit belly up without introducing artificial delays.

While here, push the knowledge of what gets executed synchronously out of ucom_queue_command. While it's likely an OK assumption that ucom_cfg_close should always be executed synchronously, there's only one caller for it and this makes it a bit easier to determine at the call-site whether the command will be executed asynchronously or not.

I also tend to wonder if ucom_cfg_open and/or ucom_cfg_line_state should also be synchronous changes, but I don't have as strong of feelings about these.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 24284

Event Timeline

kevans created this revision.May 17 2019, 5:46 PM

Herald added a subscriber: imp. · View Herald TranscriptMay 17 2019, 5:46 PM

Harbormaster completed remote builds in B24284: Diff 57492.May 17 2019, 5:46 PM

Just make sure that ucom_param() won't race when you drop the lock.

This revision is now accepted and ready to land.May 18 2019, 10:25 AM

In D20293#437614, @hselasky wrote:

Just make sure that ucom_param() won't race when you drop the lock.

Can you elaborate on which lock you're talking about? My understanding is that ucom_param is serialized by the ucom mtx, but only locking change here should be that the usb process lock may (will) get dropped while we wait for the queue to drain.

What happens if there are two threads calling the function where you drop the lock?

In D20293#440034, @hselasky wrote:

What happens if there are two threads calling the function where you drop the lock?

As far as I can tell, nothing- we're still holding the ucom lock and we drop the process lock so the proc thread can drain the queue. ucom lock effectively stops everything else that could queue and screw up our ordering.

The part that runs off the USB process threads is fine. But what contexts can ucom_param() be called from. Is there any chance of race here?

In D20293#440078, @hselasky wrote:

The part that runs off the USB process threads is fine. But what contexts can ucom_param() be called from. Is there any chance of race here?

Ah, yeah, no- it can only be called with the ucom mtx held and we don't drop that one.

I was going through the code putting "I think this one needs to be sync too" comments here and there. Then I realized I was pretty tagging every call to ucom_queue_command() as "needs to be sync".

So I guess that leads to the base question: why is ANYTHING done async here? In pretty much every case where a call to a tty-layer function results in a call down into this code, this code should not return to the tty layer until the requested action is completed. That's especially true of things like line state and param changes... if you configure a baud rate then send data, you expect that new data to go out at the new line speed.

sys/dev/usb/serial/usb_serial.c
744	Does anybody know what this 100ms wait is for?
798	This comment says to queue the transfer enable last, but that's not what happens... the following 4 function calls also queue async calls to the lower layer. I think that means transfers get enabled before the line state has been properly programmed in the hardware.
810	IMO, there needs to be some kind of wait here so that we don't return to the tty layer until all the things related to processing the open call have completed. Maybe that means every one of the preceeding queue calls need to use the sync flag, or maybe there's some value in queueing them all async and then doing one wait at the end? I don't know, since I've never understood the purpose of doing anything async with multiple threads in this code (there's a comment somewhere that says the cfg names are run on a different thread, but I've never seen a comment that says why that is).
1274	Another unexplained 100ms wait... why?

I understand you want synchronous behaviour, but how about implementing a drain command, which is called unlocked from the TTY layer, just before the end of the IOCTL return. Won't that fix the problems you see?

sys/dev/usb/serial/usb_serial.c
1274	Sometimes if you program values in a loop, the firmware in the USB chip might overflow or choke. Make sure we nice programming a bit.

In D20293#443792, @hselasky wrote:

I understand you want synchronous behaviour, but how about implementing a drain command, which is called unlocked from the TTY layer, just before the end of the IOCTL return. Won't that fix the problems you see?

I don't understand what you're saying at all. It's just a normal sane expectation that open/close/read/write/ioctl calls all behave synchronously, and it's especially imporant in serial comms where there is interaction between the read/write data and the other calls that configure the comms or manipulate line state. Also, I've never seen anything documented in relation to ttydevsw that even hints that it's okay to return before the action is complete.

sys/dev/usb/serial/usb_serial.c
1274	Then this should be in the lower-layer driver of a chip that has that problem. It's crazy to sleep in open() long enough to open/close the device hundreds of times if the sleep weren't there.

In D20293#443794, @ian wrote:

In D20293#443792, @hselasky wrote:

I understand you want synchronous behaviour, but how about implementing a drain command, which is called unlocked from the TTY layer, just before the end of the IOCTL return. Won't that fix the problems you see?

I don't understand what you're saying at all. It's just a normal sane expectation that open/close/read/write/ioctl calls all behave synchronously, and it's especially imporant in serial comms where there is interaction between the read/write data and the other calls that configure the comms or manipulate line state. Also, I've never seen anything documented in relation to ttydevsw that even hints that it's okay to return before the action is complete.

I agree about that, but as you know the TTY layer holds a mutex while calling us, so sleeping is not allowed or am I wrong? You don't solve mutex problems by just dropping the TTY lock, that leaves races wide open.

I suggest a new method in the ttydevsw which drain any pending commands off the underlying layer, which can be called outside the tty_lock() so we don't mess with the TTY's internal state.

• hselasky added inline comments.Jun 6 2019, 6:53 PM

sys/dev/usb/serial/usb_serial.c
744	You can try removing the delay if you want.

In D20293#443814, @hselasky wrote:

I agree about that, but as you know the TTY layer holds a mutex while calling us, so sleeping is not allowed or am I wrong? You don't solve mutex problems by just dropping the TTY lock, that leaves races wide open.

I suggest a new method in the ttydevsw which drain any pending commands off the underlying layer, which can be called outside the tty_lock() so we don't mess with the TTY's internal state.

Oh right, I had forgotten about tty_lock() being held during calls to ttydevsw stuff. Now I remember looking into all this once before, a few years ago. What I discovered (and re-discovered yesterday) is that lots of places now are dropping and reacquiring the tty lock, unsafely. At a minimum, you have to check tty_gone() after recaquiring it. In addition you have to consider the other parts of tty state that the lock protects and what might have changed there, and that's hard to do because what the lock protects is not documented.

But now at least I have a better idea of what kind of trouble to look for in evaluating these changes.

In D20293#444094, @ian wrote:

In D20293#443814, @hselasky wrote:

I agree about that, but as you know the TTY layer holds a mutex while calling us, so sleeping is not allowed or am I wrong? You don't solve mutex problems by just dropping the TTY lock, that leaves races wide open.

I suggest a new method in the ttydevsw which drain any pending commands off the underlying layer, which can be called outside the tty_lock() so we don't mess with the TTY's internal state.

Oh right, I had forgotten about tty_lock() being held during calls to ttydevsw stuff. Now I remember looking into all this once before, a few years ago. What I discovered (and re-discovered yesterday) is that lots of places now are dropping and reacquiring the tty lock, unsafely. At a minimum, you have to check tty_gone() after recaquiring it. In addition you have to consider the other parts of tty state that the lock protects and what might have changed there, and that's hard to do because what the lock protects is not documented.

We've learned a long time ago that this sort of locking is bad for performance, so we do things like reference counting, epoch guarantees, etc to ensure proper lifetime and concurrent protection. It's long past time to revisit the tty_lock(), but I fear there's too little RoI on that for someone to do it on their own.

But now at least I have a better idea of what kind of trouble to look for in evaluating these changes.

Yeah, so, looking at this again- I've walked through a couple of different approaches, and none of them really work because we have to drop the tty/usb lock.

Ultimately, why does this stuff need to be done asynchronously in a different process? I would understand if bits outside of dev/usb/serial/* were submitting commands to this process as well, but that's simply not the case as far as I can tell -- it looks like it was mostly done like this to match the structure of usb_ethernet, which is not a solid reason.

I want to propose we do two things:

Kick out usb_proc/ucom_queue_command, because they don't seem to really solve any problems for the added complexity, and
Separate out the tty lock from the USB lock -- let the tty layer allocate it. If we have a 1:n mapping of usb driver to ttys, I can't see a reason immediately that we *need* a lock with that large of scope. It seems like the n tty's should be able to operate independently, as long as usb operations are serialized appropriately.

Ultimately, why does this stuff need to be done asynchronously in a different process? I would understand if bits outside of dev/usb/serial/* were submitting commands to this process as well, but that's simply not the case as far as I can tell -- it looks like it was mostly done like this to match the structure of usb_ethernet, which is not a solid reason.

USB commands sleep, and if the device has crashed can take several seconds and then you don't want to block the caller.
How about the following:

Put a msleep() call with some predefined time like 10ms after the IOCTLs in question?
Or record the time of the last call, and if too tight, sleep a bit?

You either want to return success synchronously, or return failure and not change things... so long as there are those semantics, a timeout is fine.

Yes, that's the simplest.

I think it is important we still keep on using a USB task/process queue.

Multiple threads may be exercising the API simultaneously, likely causing USB low level commands out of order!

There may be more than one atomic USB request involved!

Use a SX lock perhaps in TTY layers?

In D20293#479628, @hselasky wrote:

Yes, that's the simplest.

I think it is important we still keep on using a USB task/process queue.

Multiple threads may be exercising the API simultaneously, likely causing USB low level commands out of order!

There may be more than one atomic USB request involved!

Use a SX lock perhaps in TTY layers?

Ok, these reasons make more sense to me, but I don't see what wE're gaining if I'm going to throw a dummy command into the usbproc queue to get back the semantics that TTY consumers expect.

I don't want to ask, but do we know what other OSen do for USBTTY drivers?

Abandoning this, because it's wrong and we likely need finer-grained (and different) locking at the TTY layer to compensate.

Revision Contents
Changeset List

Path

Size

sys/

dev/

usb/

serial/

usb_serial.c

26 lines

Diff 57492

View Options

sys/dev/usb/serial/usb_serial.c

ucom(4): synchronously execute param changesAbandonedPublicActions