libcasper/libnv: various error handling changes
Needs ReviewPublic
Actions

Authored by kevans on Aug 8 2019, 9:27 PM.

Details

Reviewers

markj
oshogbo

Summary

casper(3):
If nvlout is an error state before we attempt to send it, rebuild with just the final error to be passed back. This can happen for various reasons, but most of them seem to be either EMFILE/ENFILE/ENOMEM issues that would naturally clear up by creating a more minimal nvlist with a single nvpair. Regardless, we must try- libnv will not send out an nvlist that's in an error state.

cap_fileargs(3):
Attempt to handle EMFILE/ENFILE style errors in fileargs_add_cache. In one error case, the nvlist_add_nvlist(nvlout, fname, new) was setting EMFILE on nvlout and causing the send to later fail.

libnv(3):
recvmsg can kick back EMSGSIZE if the SCM_RIGHTS transfer can't happen due to EMFILE/ENFILE. Drain the recv buf if we've hit EMSGSIZE so that subsequent transfers can work as expected, rather than the current domino effect.

I'm not sure any of it's the most correct; the libnv(3) change could/should probably only drain enough bytes to hold the cmsg that can't happen (nfds * CMSG_SPACE(sizeof(int)) + sizeof(struct msghdr).

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 25760

Event Timeline

kevans created this revision.Aug 8 2019, 9:27 PM

Herald added a subscriber: imp. · View Herald TranscriptAug 8 2019, 9:27 PM

Harbormaster completed remote builds in B25756: Diff 60600.Aug 8 2019, 9:27 PM

the libnv(3) change could/should probably only drain enough bytes to hold the cmsg that can't happen (nfds * CMSG_SPACE(sizeof(int)) + sizeof(struct msghdr).

I say that, but the subsequent buf_drain only ever reads one byte from the socket after the resulting EMSGSIZE.

Plugs a potential nvlist/fd leak in fileargs_add_cache. The fd is moved into new if it succeeds, and new is cloned into nvlout with the fd dup'd. Destroy new after nvlist_add whether it failed or not to plug it.

Leaving buf_drain alone for now -- it seems like we might periodically have more than one message queued that needs to be flushed, but it's not immediately clear to me. libnv has a somewhat rigid protocol to follow with this socket, so the current approach should generally be safe.

Harbormaster completed remote builds in B25760: Diff 60605.Aug 9 2019, 12:20 AM

pstef added a subscriber: pstef.Aug 15 2021, 7:02 PM