Page MenuHomeFreeBSD

CID 1332078: Logically dead code in sys/dev/pms/freebsd/driver/ini/src/osapi.c
AcceptedPublic

Authored by dab on Feb 12 2019, 7:27 PM.

Details

Summary

An omitted brace caused an else if statement to be nested
inside the if that tested the inverse of the else if condition,
thus rendering the else if statement to be dead code. Fix by
correctly terminating the body of the if statement so the else if
becomes viable code.

Test Plan

Code inspection.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 22469
Build 21624: arc lint + arc unit

Event Timeline

dab created this revision.Feb 12 2019, 7:27 PM
dab updated this revision to Diff 53848.Feb 12 2019, 7:29 PM

Remove stray file from this review.

imp added a comment.Feb 12 2019, 7:30 PM

do you have the ability to test this change?

imp added a comment.Feb 12 2019, 7:30 PM

I ask because it's basically vendor code.

dab added a comment.Feb 12 2019, 7:33 PM

I can test on hardware at $WORK. As to it being vendor code, although the driver was imported from PMC-Sierra 3 years ago, it doesn't look to me like any updates have come from them since. If you know of a source of updated vendor code, can you point me to it?

imp added a comment.Feb 12 2019, 9:47 PM
In D19168#410022, @dab wrote:

I can test on hardware at $WORK. As to it being vendor code, although the driver was imported from PMC-Sierra 3 years ago, it doesn't look to me like any updates have come from them since. If you know of a source of updated vendor code, can you point me to it?

You changed the logic here. We need to make sure that it still works. This code is pseudo-vendor supplied code (but more like abandonware), so we need to have a good paper-trail should they suddenly pop up with a new version to make sure the change is (a) important and (b) doesn't get lost. Having done a regression test of sorts will help with that.

dab added a comment.Feb 12 2019, 10:35 PM

Yes, I did change the logic. This turned out to be a more interesting issue than I had initially thought. At $WORK we also have bug fix in this area, but it differs from mine. I'm working on getting a good review, reconciling the changes, and adequate testing. So, this shouldn't be thought of as a finished work (yet).

The change by itself, LGTM. As David described, there's a larger fix here of which this is basically a subset. The handling in general of timed out commands, and aborting of commands, is pretty broken in this driver, with this mis-bracket being the very first step in fixing. We did add some sysctls that can be used to simulate timed out commands which may be helpful to upstream to help test - basically tweaking the timeout time within the command struct such that we'll consider an IO timed out immediately the next pass of the functionality that checks for timed out IOs. It's in my experience not 100% equivalent to a drive accepting commands and going unresponsive, since those 'really failing' drives often then fail the 'next step' in their recovery in terms of aborting the IO, but ensuring as a first step we can abort an IO to an otherwise functional drive that we think timed out is a good test for this specific case.

From my discussions with the vendor, and the fact I don't believe they even have a committer on-staff, I'd be pretty skeptical that they will have any future updates to the driver to reconcile. In this case, I already submitted our larger fix back to the vendor, which will be the case for most fixes we had on this driver, so I suspect it exists in whatever their internal repo is (we had/have a ticket system with them I have access to which largely was used to 'hand back' fixes I/we did at $WORK). There's a broader discussion to be had on that topic but our goal I think should be to get as many of the generic fixes we have into upstream as possible, and if anything, ask the vendor to then rebase/align to what's in upstream.

imp accepted this revision.Feb 13 2019, 7:56 PM

OK. I'm happy with the due diligence that's been done, and look forward to better timeout handling.
In general, our error recovery from timeout sucks, so I wouldn't be too hard on the vendor.

This revision is now accepted and ready to land.Feb 13 2019, 7:56 PM