Page MenuHomeFreeBSD

libcasper/libnv: various error handling changes
Needs ReviewPublic

Authored by kevans on Thu, Aug 8, 9:27 PM.

Details

Reviewers
markj
oshogbo
Summary

casper(3):
If nvlout is an error state before we attempt to send it, rebuild with just the final error to be passed back. This can happen for various reasons, but most of them seem to be either EMFILE/ENFILE/ENOMEM issues that would naturally clear up by creating a more minimal nvlist with a single nvpair. Regardless, we must try- libnv will not send out an nvlist that's in an error state.

cap_fileargs(3):
Attempt to handle EMFILE/ENFILE style errors in fileargs_add_cache. In one error case, the nvlist_add_nvlist(nvlout, fname, new) was setting EMFILE on nvlout and causing the send to later fail.

libnv(3):
recvmsg can kick back EMSGSIZE if the SCM_RIGHTS transfer can't happen due to EMFILE/ENFILE. Drain the recv buf if we've hit EMSGSIZE so that subsequent transfers can work as expected, rather than the current domino effect.

I'm not sure any of it's the most correct; the libnv(3) change could/should probably only drain enough bytes to hold the cmsg that can't happen (nfds * CMSG_SPACE(sizeof(int)) + sizeof(struct msghdr).

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 25760

Event Timeline

kevans created this revision.Thu, Aug 8, 9:27 PM
kevans added a comment.Thu, Aug 8, 9:37 PM

the libnv(3) change could/should probably only drain enough bytes to hold the cmsg that can't happen (nfds * CMSG_SPACE(sizeof(int)) + sizeof(struct msghdr).

I say that, but the subsequent buf_drain only ever reads one byte from the socket after the resulting EMSGSIZE.

kevans updated this revision to Diff 60605.Fri, Aug 9, 12:20 AM

Plugs a potential nvlist/fd leak in fileargs_add_cache. The fd is moved into new if it succeeds, and new is cloned into nvlout with the fd dup'd. Destroy new after nvlist_add whether it failed or not to plug it.

Leaving buf_drain alone for now -- it seems like we might periodically have more than one message queued that needs to be flushed, but it's not immediately clear to me. libnv has a somewhat rigid protocol to follow with this socket, so the current approach should generally be safe.