Page MenuHomeFreeBSD

ng_socket: Treat EEXIST from kern_kldload() as success in case there is a race condition
ClosedPublic

Authored by zlei on Apr 4 2024, 1:25 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 9, 11:37 PM
Unknown Object (File)
Thu, Jan 9, 11:31 PM
Unknown Object (File)
Thu, Jan 9, 2:47 PM
Unknown Object (File)
Fri, Dec 27, 8:06 PM
Unknown Object (File)
Dec 19 2024, 2:26 AM
Unknown Object (File)
Dec 9 2024, 4:13 PM
Unknown Object (File)
Dec 2 2024, 2:01 AM
Unknown Object (File)
Dec 2 2024, 12:14 AM

Details

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

zlei requested review of this revision.Apr 4 2024, 1:25 AM
zlei edited the test plan for this revision. (Show Details)

@olce
As a side effect, the error from kern_kldload() is lost, so I think you are right , we should emit logs from kernel about the error, to help diagnosing.

Why do we really want to lose a meaningful error code and reduce all errors into one value? IMHO, it is counterproductive. And if we got a new error code type, why can't we propagate it to userland, too?

Why do we really want to lose a meaningful error code and reduce all errors into one value? IMHO, it is counterproductive. And if we got a new error code type, why can't we propagate it to userland, too?

I agree with you that user want meaningful error code. But for the underlaying error (for this case errors from kernel linkers) would make more confusion rather than be meaningful to users, especially the newbies.

I'm not going to argue how to classify experienced users, and I do not want to argue how experienced users will do when they reach misleading error code. From my option those users who are familiar / or able to be familiar with kernel parts are developers rather than plain users, so most time we are making plain users happy.

But for the underlaying error (for this case errors from kernel linkers) would make more confusion rather than be meaningful to users, especially the newbies.

Why ?

The root cause is:

I would conclude, an error in domain A has the context, so when it goes cross to another domain B the context get lost, then probably it should be translated (to a proper error for domain B).

For this case, how can plain users distinguish errors from kernel linkers and those from netgraph ? And if we're going to document errors of ngctl mkpeer, should we document all errors from kernel linkers ? I think both answers are NO.

FYI, D42327 and D44552 are good examples which do the translating for underlaying errors.

And if we got a new error code type, why can't we propagate it to userland, too?

That is another question. If the new error code type ( EUNSDEP in D44581 ) is not kernel-only, then Yes it can be propagate to userland, but still NO for netgraph.

Reducing multiple error codes to one is losing information and it is confusing regardless of how experienced a user is. Even in case of the least experienced user, they would file a problem report or ask on mailing lists, forums, or whatever, they will come asking for help with reduced information. First thing to help such a user would be to write a dtrace script or ask him to patch kernel with printfs to get the actual error.

If the whole problem is with EUNSDEP, there are two ways to solve it:

  1. Give it a positive value and move it from _KERNEL to _BSD_VISIBLE.
  2. Single out EUNSDEP in ng_socket and convert it to ENXIO (not the best choice imho) or anything else.

Reducing multiple error codes to one is losing information and it is confusing regardless of how experienced a user is. Even in case of the least experienced user, they would file a problem report or ask on mailing lists, forums, or whatever, they will come asking for help with reduced information. First thing to help such a user would be to write a dtrace script or ask him to patch kernel with printfs to get the actual error.

If the concern is losing information then we can fulfill by emitting logs either from the caller ngc_send() or from the callee kern_kldload() / linker_load_module ().

If the whole problem is with EUNSDEP, there are two ways to solve it:

  1. Give it a positive value and move it from _KERNEL to _BSD_VISIBLE.
  2. Single out EUNSDEP in ng_socket and convert it to ENXIO (not the best choice imho) or anything else.

This is not all about EUNSDEP. But the introduction of kernel-only EUNSDEP reveals this problem.

zlei retitled this revision from ng_socket: Consistently use ENXIO as return value of ngc_send() to ng_socket: Treat EEXIST from kern_kldload() as success in case there is a race condition.
zlei edited the summary of this revision. (Show Details)
zlei edited the test plan for this revision. (Show Details)

Bite out the translation for errors from kernel linker. If that is needed, do it in separate CR.

This revision is now accepted and ready to land.Apr 9 2024, 5:14 AM