Page MenuHomeFreeBSD

cam: Permit non-pollable sims.
ClosedPublic

Authored by jhb on Feb 1 2021, 11:32 PM.
Tags
None
Referenced Files
Unknown Object (File)
Oct 3 2024, 1:28 PM
Unknown Object (File)
Oct 1 2024, 3:17 AM
Unknown Object (File)
Sep 27 2024, 3:46 AM
Unknown Object (File)
Sep 26 2024, 9:21 PM
Unknown Object (File)
Sep 26 2024, 9:21 PM
Unknown Object (File)
Sep 26 2024, 9:21 PM
Unknown Object (File)
Sep 26 2024, 9:20 PM
Unknown Object (File)
Sep 26 2024, 9:10 PM
Subscribers

Details

Summary

Some CAM sim drivers do not support polling (notably iscsi(4)).
Rather than using a no-op poll routine that always times out requests,
permit a SIM to set a NULL poll callback. xpt_poll_setup() and
xpt_poll_timewait() fail requests on non-pollable sims immediately as
if they had timed out.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jhb requested review of this revision.Feb 1 2021, 11:32 PM
This revision is now accepted and ready to land.Feb 1 2021, 11:41 PM
sys/cam/cam_xpt.c
3241

It might make sense to have a KASSERT sim->poll is != NULL in xpt_sim_poll. This will guard against future introduction of paths that get there w/o it being set. I agree that it will just change the panic there, but I think it's useful enough info to let people know WTF is up when they get a crash there (even if the KASSERTS aren't active).

3248

So all calls for xpt_pollwait() from non-polled sims always return CAM_CMD_TIMEOUT error?

There's places that we send commands to the disk in polled mode that aren't the dumps (when we're shutting down).

Also, would the non-polling SIMs be allowed to asynchronously cahnge the ccb's status? If so, we should at least check here to see if the ccb is completed before failing.

sys/cam/cam_xpt.c
3248

So all calls for xpt_pollwait() from non-polled sims always return CAM_CMD_TIMEOUT error?

There's places that we send commands to the disk in polled mode that aren't the dumps (when we're shutting down).

The condition in cam_periph.c is:

	must_poll = dumping || SCHEDULER_STOPPED();

SCHEDULER_STOPPED() (which is curthread->td_stopsched) is only set to true during a panic (vpanic()) or while stopped in the kernel debugger (kdb_trap()). It is not stopped during shutdown, so normal shutdowns should still send commands to polled sims.

Also, would the non-polling SIMs be allowed to asynchronously cahnge the ccb's status? If so, we should at least check here to see if the ccb is completed before failing.

In practice, the sim will never see the ccb. When cam_periph_runccb() wants to send a polled ccb, it first calls xpt_poll_setup(). That will always fail returning 0. As a result, cam_periph_runccb() will never call xpt_action() and thus never invoke the SIM's action routine.

I would be happy using some other error than EBUSY and CAM_RESRC_UNAVAIL which we could do by having cam_periph_runccb() check cam_sim_pollable() directly and fail the request with the different errors instead. In fact, if we took that approach, we could perhaps assert cam_sim_pollable() in xpt_poll*() instead of modifying those functions to handle non-pollable sims.

sys/cam/cam_xpt.c
3248

So all calls for xpt_pollwait() from non-polled sims always return CAM_CMD_TIMEOUT error?

There's places that we send commands to the disk in polled mode that aren't the dumps (when we're shutting down).

The condition in cam_periph.c is:

	must_poll = dumping || SCHEDULER_STOPPED();

SCHEDULER_STOPPED() (which is curthread->td_stopsched) is only set to true during a panic (vpanic()) or while stopped in the kernel debugger (kdb_trap()). It is not stopped during shutdown, so normal shutdowns should still send commands to polled sims.

OK.

Also, would the non-polling SIMs be allowed to asynchronously cahnge the ccb's status? If so, we should at least check here to see if the ccb is completed before failing.

In practice, the sim will never see the ccb. When cam_periph_runccb() wants to send a polled ccb, it first calls xpt_poll_setup(). That will always fail returning 0. As a result, cam_periph_runccb() will never call xpt_action() and thus never invoke the SIM's action routine.

I would be happy using some other error than EBUSY and CAM_RESRC_UNAVAIL which we could do by having cam_periph_runccb() check cam_sim_pollable() directly and fail the request with the different errors instead. In fact, if we took that approach, we could perhaps assert cam_sim_pollable() in xpt_poll*() instead of modifying those functions to handle non-pollable sims.

what you are describing sounds better, yes.

  • Pull the cam_sim_pollable logic up into cam_periph_runccb.
This revision now requires review to proceed.Feb 11 2021, 1:50 AM

This is the alternate version that pulls the real pollable check up into cam_periph_runccb() and just asserts in the xpt_poll_*() routines. It took me a while to post it as I hadn't triggered another panic with an active ISCSI initiator with this change applied to test it until just now.

sys/cam/cam_periph.c
1254

I didn't bother with a different error value here as I wasn't really sure that any of the other existing CCB status values make sense. I'm open to suggestions if we think a different CCB status and/or error value would be better.

This looks much better... I'm neutral on a different return value as well...

This revision is now accepted and ready to land.Feb 11 2021, 5:18 AM
This revision was automatically updated to reflect the committed changes.