Page MenuHomeFreeBSD

alq(9): Record any write failures and return the last in alq_close(9).
AcceptedPublic

Authored by bdrewery on Mar 22 2016, 8:34 PM.

Details

Reviewers
kib
markj
lstewart
Group Reviewers
manpages

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 3048
Build 3080: arc lint + arc unit

Event Timeline

bdrewery updated this revision to Diff 14530.Mar 22 2016, 8:34 PM
bdrewery retitled this revision from to alq(9): Record any write failures and return the last in alq_close(9)..
bdrewery updated this object.
bdrewery edited the test plan for this revision. (Show Details)
markj added inline comments.Mar 23 2016, 12:03 AM
sys/kern/kern_alq.c
390

Should we overwrite previous errors?

483

This comment is misleading - there can't be any pending I/O since no one has a reference to this alq yet. All this does is close the vnode.

kib added inline comments.Mar 23 2016, 6:27 AM
sys/kern/kern_alq.c
297

Please explain this change.

381–382

If mac_vnode_check_write() failed, should we return this error ? Otherwise code returns zero, with no write.

bdrewery added inline comments.Mar 23 2016, 4:28 PM
sys/kern/kern_alq.c
381–382

Ah true, I missed this.

390

I don't see a problem with it. It depends on how complicated the API should become and what behavior should happen when a write fails. Should the entire log be closed and reject further writes? Should the error be returned from alq_flush() as well? Should flushing it cause the error to reset? I have no opinion on any of those, I'll leave it to Lawrence who maintains this more. My only interest is determining if any write failed so I can explore using this elsewhere.

483

Yes you're right. I didn't read the code here, I just moved the call to the parent since I wanted it out of alq_destroy().

bdrewery marked 4 inline comments as done.Mar 23 2016, 6:06 PM
bdrewery added inline comments.
sys/kern/kern_alq.c
297

alq_shutdown() flushes pending IO before destroy. I need to flush any pending IO in alq_close() before freeing the alq pointer so that alq->error can be retrieved. Calling something like alq_flush() here instead would just be redundant since alq_shutdown() would then try flushing again.

bdrewery updated this revision to Diff 14552.Mar 23 2016, 6:12 PM

Fix findings

kib added inline comments.Mar 24 2016, 9:39 AM
sys/kern/kern_alq.c
297

Then, could you assert that no penind io is queued, there ?

381–382

It is much easier to read restructured code as jhb prefers it

#ifdef MAC
    error = mac_vnode_check_write();
    if (error == 0)
#endif
         error = VOP_WRITE();

and also it shows cleanly that the addition of 'error = 0;' assignment above is redundant.

bdrewery updated this revision to Diff 14691.Mar 28 2016, 7:28 PM
bdrewery edited edge metadata.
  • Fix unrelated memory leak in ald_shutdown
  • Add more asserts
  • Retrieve error from vn_close
  • Return error rather than 0 from alq_close() on shutting down, which seems to be only possible with an invalid pointer anyhow
bdrewery marked 4 inline comments as done.Mar 28 2016, 7:29 PM
bdrewery added inline comments.
sys/kern/kern_alq.c
245

FYI this is an unrelated memory leak fix that can happen when unloading the module.

markj accepted this revision.Mar 28 2016, 7:37 PM
markj edited edge metadata.
markj added inline comments.
sys/kern/kern_alq.c
390

Hm, I'm not suggesting any API changes - just that we might wish to save the first write error rather than the most recent error:

if (error != 0 && alq->aq_error == 0)
        alq->aq_error = error;

I think this is generally more useful for debugging in the face of cascading errors. But I can't think of a specific alq failure mode where this will make a difference.

This revision is now accepted and ready to land.Mar 28 2016, 7:37 PM
kib accepted this revision.Mar 29 2016, 9:45 AM
kib edited edge metadata.

It looks fine to me from the patch, but I did not read the alq code.

share/man/man9/alq.9
422

I do not think that this is either correct or useful explanation. E.g. it does not mention mac checks.

It would be more useful to say that errors encountered while final file operations are returned, without descending into specific ops.

lstewart edited edge metadata.Mar 30 2016, 1:09 AM

Apologies for the delay in getting to this, still heads down wrapping up my PhD thesis. The comments from Kib and Mark all appear to have been addressed and the changes look good. I don't really have an opinion on the "report first or most recent" error issue.

sys/kern/kern_alq.c
245

Good catch.

390

No strong opinion either way from me - there are reasonable arguments for both approaches.

934

This conflates two unrelated sources of "error". Any reason not to return alq->aq_error here even though it's possible the value may change by the time the shutdown actually completes (which this change doesn't currently capture anyway)

Note also that my use case will require an alq_open_file (as in struct file*) or alq_open_fd or alq_open_vnode which I have a separate patch coming for soon. Currently I am using alq_open_vnode for testing.

sys/kern/kern_alq.c
390

I am leaning towards changing it to the first error as Mark suggested.

934

Agreed about conflating. I wavered on this line a lot. I changed it probably 5 times before submitting because I was unsure what to do with it.

As far as I can tell it is only possible with an invalid pointer (which is documented as "undefined behavior"). This only returned EBUSY if the module is unloading which is only possible to get into if the alq_queues is empty which presumably means that this pointer passed in alq_close is invalid. If somehow the pointer passed in is "valid" but just not in the queue anymore (thus allowing shutdown to start since the queue is empty), then the pointer will be made invalid soon anyhow since it will be freed soon by the "unrelated memory leak" fix above. I might be missing something.