Page MenuHomeFreeBSD

Fix various Coverity-detected errors in nvme driver
ClosedPublic

Authored by dab on Apr 22 2020, 1:49 PM.

Details

Summary

This fixes several Coverity-detected errors in the nvme
driver.

CIDs addressed: 1008344, 1009377, 1009380, 1193740, 1305470, 1403975,
1403980

Test Plan

Run some I/O tests with nvmecontrol and diskinfo

# diskinfo -ciStw /dev/nvme0ns1
/dev/nvme0ns1
	512         	# sectorsize
	2147483648  	# mediasize in bytes (2.0G)
	4194304     	# mediasize in sectors
	0           	# stripesize
	0           	# stripeoffset
	No          	# TRIM/UNMAP support
	Unknown     	# Rotation rate in RPM

I/O command overhead:
	time to read 10MB block      0.019035 sec	=    0.001 msec/sector
	time to read 20480 sectors   0.977857 sec	=    0.048 msec/sector
	calculated command overhead			=    0.047 msec/sector

Seek times:
	Full stroke:	  250 iter in   0.024211 sec =    0.097 msec
	Half stroke:	  250 iter in   0.023469 sec =    0.094 msec
	Quarter stroke:	  500 iter in   0.039334 sec =    0.079 msec
	Short forward:	  400 iter in   0.034748 sec =    0.087 msec
	Short backward:	  400 iter in   0.031097 sec =    0.078 msec
	Seq outer:	 2048 iter in   0.108007 sec =    0.053 msec
	Seq inner:	 2048 iter in   0.105674 sec =    0.052 msec

Transfer rates:
	outside:       102400 kbytes in   0.097164 sec =  1053888 kbytes/sec
	middle:        102400 kbytes in   0.097458 sec =  1050709 kbytes/sec
	inside:        102400 kbytes in   0.097434 sec =  1050968 kbytes/sec

Asynchronous random reads:
	sectorsize:     56278 ops in    3.000200 sec =    18758 IOPS
	4 kbytes:       57708 ops in    3.000118 sec =    19235 IOPS
	32 kbytes:      52476 ops in    3.000122 sec =    17491 IOPS
	128 kbytes:  diskinfo: aio_read: Operation not supported

# nvmecontrol perftest -p -n 5 -s 512 -t 120 -o read nvme0ns1
Threads:  5 Size:    512  READ Time: 120 IO/s:   22143 MB/s:   10
	  0:     5692 IO/s
	  1:     5606 IO/s
	  2:     5342 IO/s
	  3:        0 IO/s
	  4:     5501 IO/s
# nvmecontrol perftest -p -n 5 -s 512 -t 120 -o write nvme0ns1
Threads:  5 Size:    512 WRITE Time: 120 IO/s:   17934 MB/s:    8
	  0:     2270 IO/s
	  1:     4350 IO/s
	  2:     4507 IO/s
	  3:     2270 IO/s
	  4:     4535 IO/s

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sys/dev/nvme/nvme_ctrlr.c
1591 ↗(On Diff #70874)

There is some precedent for this in the kernel, but not a whole lot. It does make clear to both the human and machine (e.g., Coverity) reader that ignoring the return value is intentional.

sys/dev/nvme/nvme_sysctl.c
149–160 ↗(On Diff #70874)

Coverity complained here that in the old code oldval was restored in one path but not in another. This revised code does have the advantage (besides quieting Coverity) that at no time is an invalid value for the timeout stored in the sysctl variable.

sys/dev/nvme/nvme_test.c
103 ↗(On Diff #70874)

There was a possibility here that the (un-casted) expression idx * 2048 could overflow the 32-bits of that expression.

dab edited the test plan for this revision. (Show Details)
dab retitled this revision from Fix various Coverity-detected errors in nvmecontrol to Fix various Coverity-detected errors in nvme driver.Apr 23 2020, 3:34 PM
sys/dev/nvme/nvme_ctrlr.c
1300 ↗(On Diff #70874)

strncpy ensures all the bytes in the returned structure are initialized. strlcpy only ensures that the string is NUL terminated.

sys/dev/nvme/nvme_ns.c
88 ↗(On Diff #70874)

Same as above. I think strncpy is more correct.

sys/dev/nvme/nvme_sysctl.c
149–160 ↗(On Diff #70874)

Coverity was wrong to complain about that. The old code was correct. The new code is more correct, though, for the reason you noted. We use the timeout_period for new I/Os. It's unlikely to make any difference for really long timeouts and nvme drives are fast enough it would be hard to set one too slow, but it's better to have the code be more correct.

sys/dev/nvme/nvme_test.c
125 ↗(On Diff #70874)

Why is this needed? If csw is NULL, that should be a panic...

sys/dev/nvme/nvme_ctrlr.c
1300 ↗(On Diff #70874)

True. Coverity was complaining because strncpy() doesn't guarantee NUL termination. I thought I had seen it treated as a NUL terminated string in other parts of the code, so I thought strlcpy() would be appropriate. Upon rechecking, I see that it is used as a string within nvmecontrol, but it is copied with a strndup(), thus allowing for the possibility of non-NUL termination. Restoring the strncpy() and explicitly storing a NUL in the last byte would both quiet Coverity and be correct, as the field is over-allocated by 1 byte.

Would you agree?

sys/dev/nvme/nvme_test.c
125 ↗(On Diff #70874)

The csw is assigned to the return from dev_refthread() on line 119. dev_refthread() can return NULL and in fact that is checked for in other places within the kernel, so Coverity kindly suggested that the it should also be done here.

I could explicitly panic after that assignment if the value is NULL. I know some prefer to panic on the dereference rather than put in the extra code of a test and explicit panic.

Thoughts?

Changes based on some review comments.

dab marked 4 inline comments as done.Apr 30 2020, 11:22 PM
sys/dev/nvme/nvme_ctrlr.c
1302 ↗(On Diff #71215)

Missing closing paren.

sys/dev/nvme/nvme_ns.c
90 ↗(On Diff #71215)

Missing closing paren.

sys/dev/nvme/nvme_sysctl.c
149 ↗(On Diff #71215)

Space before =.

FreeBSD style doesn't initialize variables in their declaration, but I won't complain, because I disagree with that. :)

152 ↗(On Diff #71215)

You should also return here if req->newptr == NULL. In that case, the caller is reading the current value, but not setting a new one. For that to work, initialize newval with ctrlr->timeout_period, as it was before.prefer this.

sys/dev/nvme/nvme_test.c
127 ↗(On Diff #71215)

If you keep the panic above, don't test for NULL here.

I believe Coverity runs on builds with INVARIANTS and therefore KASSERTs. Maybe KASSERT(csw != NULL, ("insert message")); would appease Coverity while keeping the code clean (i.e. simply panic on dereference).

dab marked 5 inline comments as done.May 1 2020, 2:37 AM
dab added inline comments.
sys/dev/nvme/nvme_ctrlr.c
1302 ↗(On Diff #71215)

Also not the only mistake there on that line. Careless.

sys/dev/nvme/nvme_sysctl.c
149 ↗(On Diff #71215)

My excuse is that the old code also initialized the variable in the declaration. ;-)

sys/dev/nvme/nvme_test.c
127 ↗(On Diff #71215)

Errggghhh. Forgot to take that test out.

I didn't think Coverity ran with INVARIANTS; I'm almost positive I've seen issues flagged that would not have been had a KASSERT been in effect. Although, that might be a difference between Coverity runs internal to $JOB and the FreeBSD scans.

dab marked 3 inline comments as done.

Changes based on some review comments.

sys/dev/nvme/nvme_test.c
103 ↗(On Diff #70874)

You can get the same effect by add ull to 2048.

sys/dev/nvme/nvme_test.c
103 ↗(On Diff #70874)

True. I thought the cast to uint64_t was perhaps more easily understood since the declaration of offset is just a few lines above and one wouldn't have to do the mental equivalence check of ULL == uint64_t. Is there a preference for one method over the other?

sys/dev/nvme/nvme_test.c
103 ↗(On Diff #70874)

ull is generally preferred because it's less invasive than the cast.

Switch from a (uint64_t) cast to specifying ULL on a numeric constant
to widen the data type of an expression (suggested by imp@).

dab marked 2 inline comments as done.May 1 2020, 9:26 PM
This revision is now accepted and ready to land.May 1 2020, 9:28 PM
This revision was automatically updated to reflect the committed changes.