Still need to handle some smalls details, e.g. credentials and go through some more careful testing. Basic testing didn't show any issue FWIW. Would like to get another pair of eyes on this in order to understand if I did something completely wrong.
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
I do not know the audit code. From what I see in the diff and from the limited reading of the audit_pipe.c, the changes are fine in general.
It does not make much sense to leave the assertions that ap != NULL if devfs_get_cdevpriv(9) calls succeeded.
Should ap_open member of struct audit_pipe be removed ?
Why do you left the asignment to si_drv1 in audit_pipe_open() ? All uses of si_drv1 should be converted to devfs_get_cdevpriv(), and it would be actually good for debugging to leave si_drv1 as NULL to catch bugs.
sys/security/audit/audit_pipe.c | ||
---|---|---|
712 | Why is this assignment left ? |
Overall, seems plausible to me.
Could you clarify the 'XXXDAVIDE: credentials' comment: what's missing here?
Could you change the panic() on make_dev() failure to be "Can't make_dev audit-pipe device"?
There are a number of places where you check devfs_get_cdevpriv() for failure; under what circumstances might these happen and is returning an error or panicking more suitable? (This might be a Kostik question). Previously, it was an invariant that you could never enter these file-descriptor-based functions without valid state.
Nothing changed there. Most of the existing in-tree code does check for an error from devfs_get_cdevpriv() in cdevsw methods and returns error, like the patch does. Some minority of users ignore the error
As I understand your question, the use of panic on error from devfs_get_cdevpriv() comes from the desire to declare absence of the cdevpriv data the programming error. I would prefer prefer to not go with this route. E.g., I remember that the inability of usermode to operate on the not fully initialized filedescriptor still in open(2) was fixed relatively recently.
Also, the handling of error, instead of panicing, may allow to extend the KPI in future, by e.g. standard facilities to prevent device fd use from other threads etc.
I.e., I think that the used pattern is fine and it should be continued to be used.
Updated after comments from Kostik/Robert.
Peter, do you mind to give this a spin before it gets in? I tested that a bit and found no issue -- but it would be great if the patch could be stress tested a bit more.