mpi3mr: Block I/Os While Task Management is in Progress
ClosedPublic
Actions

Authored by chandrakanth.patil_broadcom.com on Apr 9 2025, 7:35 PM.

Details

Reviewers

imp
sumit.saxena_broadcom.com

Commits

rG9980fb4d87ae: mpi3mr: Block I/Os While Task Management is in Progress
rG49445b54638f: mpi3mr: Block I/Os While Task Management is in Progress

Summary

The driver previously blocked I/Os only for OS-initiated task
management commands. This patch extends the behavior to also
block I/Os during application-initiated task management
operations (excluding Task Abort).

Before submitting such commands to the firmware, I/O
submissions are paused for the respective device. Once the
command completes, I/O operations are resumed.

This ensures safe and consistent task management handling.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

chandrakanth.patil_broadcom.com requested review of this revision.Apr 9 2025, 7:35 PM

chandrakanth.patil_broadcom.com created this revision.

This change blocks new I/Os from being submitted while a task management command is in progress, but wouldn't you need to also wait for any in-progress I/Os to complete before submitting the task management command? I think you would need to wait for those if your goal is to avoid having the firmware processing I/O commands and task management commands together (which is what this change appears to be trying to do).

In D49749#1135018, @chs wrote:

This change blocks new I/Os from being submitted while a task management command is in progress, but wouldn't you need to also wait for any in-progress I/Os to complete before submitting the task management command? I think you would need to wait for those if your goal is to avoid having the firmware processing I/O commands and task management commands together (which is what this change appears to be trying to do).

When a TM (e.g., target reset) is issued, it indicates that a number of IOs have not been completed on the target, suggesting an issue with the firmware, hardware, or device. Since waiting for the in-flight IOs to complete offers no guarantee and only delays the process, a target reset is initiated to recover the lost IOs. Before the reset, further IO submissions are blocked to prevent additional issues

Ok, I didn't realize that "task management" always meant "target reset" here. Not waiting for in-progress commands makes sense in this context. Could you add a comment to the code explaining this?

Another thing that I noticed while looking at this patch was that all the other places that decrement block_io do it in this style:

if (mpi3mr_atomic_read(&tgtdev->block_io) > 0)
        mpi3mr_atomic_dec(&tgtdev->block_io);

whereas this one you've added does not check the current value before decrementing. is there some reason why this new instance doesn't follow the pattern of the others?

note that the pattern of the others also seems wrong... the check of the current value is not atomic with the decrement, so it looks like one thread could check the value and see that it is not zero, another thread could execute one of the paths that sets the current value to zero, and then the first thread could proceed to decrement the value and cause the value to become -1. is there something that prevents this scenario from occurring?

In D49749#1135942, @chs wrote:
Ok, I didn't realize that "task management" always meant "target reset" here. Not waiting for in-progress commands makes sense in this context. Could you add a comment to the code explaining this?

Another thing that I noticed while looking at this patch was that all the other places that decrement block_io do it in this style:
if (mpi3mr_atomic_read(&tgtdev->block_io) > 0)
        mpi3mr_atomic_dec(&tgtdev->block_io);

[ckp] Agreed, we need to check the block_io > 0 before decrementing block_io, otherwise if other thread has already decremented then the counter may go negative. I will add new patch for that. Thank you.

whereas this one you've added does not check the current value before decrementing.  is there some reason why this new instance doesn't follow the pattern of the others?

note that the pattern of the others also seems wrong... the check of the current value is not atomic with the decrement, so it looks like one thread could check the value and see that it is not zero, another thread could execute one of the paths that sets the current value to zero, and then the first thread could proceed to decrement the value and cause the value to become -1.  is there something that prevents this scenario from occurring?

[ckp] I didnt get how two thread can access the atomic variable? Is this because the v (variable) is non atomic? but atomic_subtract_int() api should take care of atomicity right? if not then please let us know any other way of making the variable and operation atomic. do we need to use this atomic_fetchadd_int() API to achieve atomicity?

#define mpi3mr_atomic_dec(v) atomic_subtract_int(&(v)->val, 1)

In D49749#1136554, @chandrakanth.patil_broadcom.com wrote:
In D49749#1135942, @chs wrote:
Ok, I didn't realize that "task management" always meant "target reset" here. Not waiting for in-progress commands makes sense in this context. Could you add a comment to the code explaining this?

Another thing that I noticed while looking at this patch was that all the other places that decrement block_io do it in this style:
if (mpi3mr_atomic_read(&tgtdev->block_io) > 0)
        mpi3mr_atomic_dec(&tgtdev->block_io);
[ckp] Agreed, we need to check the block_io > 0 before decrementing block_io, otherwise if other thread has already decremented then the counter may go negative. I will add new patch for that. Thank you.
whereas this one you've added does not check the current value before decrementing.  is there some reason why this new instance doesn't follow the pattern of the others?

note that the pattern of the others also seems wrong... the check of the current value is not atomic with the decrement, so it looks like one thread could check the value and see that it is not zero, another thread could execute one of the paths that sets the current value to zero, and then the first thread could proceed to decrement the value and cause the value to become -1.  is there something that prevents this scenario from occurring?
[ckp] I didnt get how two thread can access the atomic variable? Is this because the v (variable) is non atomic? but atomic_subtract_int() api should take care of atomicity right? if not then please let us know any other way of making the variable and operation atomic. do we need to use this atomic_fetchadd_int() API to achieve atomicity?

#define mpi3mr_atomic_dec(v) atomic_subtract_int(&(v)->val, 1)

The atomic_subtract_int() API makes the subtraction itself atomic, but the preceeding read and compare with zero are not part of the atomic subtraction. The read itself is atomic, but then another thread can modify the counter in memory before the atomic subtraction begins, so that the value that the atomic subtract is subtracting from is not necessarily the same value that we used in the comparison with zero that is done to decide whether or not to do the atomic subtraction. To make this work, you need to use a loop like this:

unsigned int count;

count = mpi3mr_atomic_read(&tgtdev->block_io);
for (;;) {
        if (count == 0)
                break;
        if (atomic_fcmpset_int(&tgtdev->block_io.val, &count, count - 1))
                break;
}

the atomic compare-and-set makes the subtraction conditional on the value in memory remaining unchanged from the value from which we subtract one. if another thread changes the value between the read and the atomic compare-and-set, then the atomic compare-and-set will fail, and the loop will recheck the new value to make sure that it is still not zero before trying the atomic compare-and-set again.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 28 2025, 3:25 AM

Closed by commit rG49445b54638f: mpi3mr: Block I/Os While Task Management is in Progress (authored by chandrakanth.patil_broadcom.com, committed by imp). · Explain Why

This revision was automatically updated to reflect the committed changes.

imp added a commit: rG49445b54638f: mpi3mr: Block I/Os While Task Management is in Progress.

imp added a commit: rG9980fb4d87ae: mpi3mr: Block I/Os While Task Management is in Progress.Apr 30 2025, 5:22 PM

chandrakanth.patil_broadcom.com mentioned this in D50134: mpi3mr: Add atomic decrement-if-not-zero helper.May 3 2025, 9:05 AM