
Compute 'maxproc'/'maxfiles' from memory amount; Expand/fix comments
ClosedPublic

Authored by olce on Fri, May 2, 8:32 PM.
Details

Summary

Change the formulae so that 'maxproc' and 'maxfiles' are computed from the
amount of physical memory rather than from the corresponding number of
pages. Make sure that the practical results are unchanged on all
architectures (which is possible since all of them have 4096 as their
PAGE_SIZE). Even so, this change makes it clearer how many units of these
values are granted per MB, which is easier on readers.

Change the comments accordingly and expand them to indicate which parts
of the computations are actually not used by default. In passing, fix the
comments about default values and limits for 'maxfiles' (they were off).

No functional change (intended).

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

olce requested review of this revision.Fri, May 2, 8:32 PM

I object. If you have some problem with some app, fix the app, up to the level of making the app self-limit by changing the resource.
Not to mention that files are not only vnodes, and a single vnode can also be referenced by many files. But global limits are not the facility for tuning a system for the safety of a single app.

In D50124#1143468, @kib wrote:

I object. If you have some problem with some app, fix the app, up to the level of making the app self-limit by changing the resource.
Not to mention that files are not only vnodes, and a single vnode can also be referenced by many files. But global limits are not the facility for tuning a system for the safety of a single app.

I mishandled tabs; the answer should go into D50125.

Aarch64 can have 16k pages. There are experimental patches for amd64 to have 32k pages. I'm not sure it matters, though.

In D50124#1143473, @imp wrote:

Aarch64 can have 16k pages. There are experimental patches for amd64 to have 32k pages. I'm not sure it matters, though.

Don't know, but what matters here is the value PAGE_SIZE is set to. AFAICT, in-tree, it is always defined as 4096, even though I see, e.g., PAGE_SIZE_16K in <machine/param.h> for arm64 (nothing seems to use it except arm64 bhyve). I wasn't aware of a 32k-page experiment.

In any case, I don't think we want to scale these kinds of quantities with the number of pages of PAGE_SIZE; at least, there doesn't appear to have been any intention to do so so far. Rather, the intent has been to base these constants on the amount of available memory, KVA, etc.

sys/kern/subr_param.c
304

Regarding "kmap entries cannot be exhausted by processes": I do not understand this.

316

Can we remove the maxproc limiting at all? It is something set by the user, and if the user wants this value, I do not see a reason to deny it.

329–331

Please remove these {}. If you did it to have the variable declared after the opening block, just move it to the start of the function and remove const.

But again, why not remove the clamp for user value at all?

olce marked 3 inline comments as done.Tue, May 6, 9:08 AM
olce added inline comments.
sys/kern/subr_param.c
304

I didn't add that comment; it has been there since 2003, introduced by commit 77a7d074e4cc27 (r91780). It probably does not apply today. Would you like me to remove it? Do you see any other specific limitations? After a quick code review, it seems that all memory allocated during a fork comes from UMA.

316

I could remove it. I just think, in general, that we must prevent the admin from shooting himself in the foot too easily. So the question becomes: what happens when some user tries to create an additional process but the system can't accommodate it? If the answer is that the fork will just fail (preferably with ENOMEM or similar), as I suspect, then I'm happy to remove this clamping.

329–331

Please remove these {}. If you did it to have the variable declared after the opening block, just move it to the start of the function and remove const.

I usually strive to declare variables with the minimal possible scope, and as const unless they really can't be. In any case, I'll see to removing the clamping in subsequent commits, so the braces should disappear. Out of curiosity, do you ask for aesthetic/readability reasons?

But again, why not remove the clamp for user value at all?

As for maxproc, I'm fine with doing that as long as it doesn't open up obvious foot-shooting. With maxfiles too high, we have the deadlock problem (with one candidate remediation in D50125), but the clamping here does not prevent it in any way. I do not see any reason why removing this clamping could open up new system failures, so I'll prepare a commit to remove it.

olce marked 3 inline comments as done.Tue, May 6, 9:41 AM

Well, I commented too fast. Removing the clamping in both cases does open up new system failures for very high values: the impossibility, even for root, of creating new processes or new file descriptors once memory is exhausted, rendering the current hardcoded leeways (10 and 5%, respectively) useless. And this will materialize again as a deadlock.

The general guidelines really should be: we must never deadlock on resource exhaustion, but fail gracefully. And we should reserve resources for emergency situations (processes/file descriptors/files reserved for use by root only; and in other situations as well, e.g., swapping must continue to work under memory shortage) so that we can actually recover from them.

Clamping is a kind of last resort, but it's probably better than nothing.

Are you facing a practical situation where they are a problem? We could have other knob names for the same variables, such as raw.kern.maxfiles, which would allow changing a variable without any safety checks/clamping, with the benefit that admins tweaking them would at least *know* that bad things can happen.

sys/kern/subr_param.c
316

This is an unrelated question. If the admin wants to increase maxproc, why limit the admin?

And yes, if the system cannot create a new process, fork() must fail gracefully with ENOMEM or such. A very long time ago I did the cleanup to make thr_new() fail this way, and did something similar for fork; see 89b57fcf01d895a82. I believe stress2 does not allow this work to rot, at least for thr_new().

olce marked an inline comment as done.Wed, May 7, 5:27 PM
olce added inline comments.
sys/kern/subr_param.c
316

This is an unrelated question. If the admin wants to increase maxproc, why limit the admin?

Because we should protect the admin against foot-shooting: he does not always know the consequences of seemingly innocuous actions such as bumping a limit (he's not always a developer, and even then, mistakes happen). I agree we should in general stay liberal and give admins maximum freedom, but I don't think we should allow them to change internals in ways that are likely to cause problems that are not obvious to them.

What disturbs me in this case is that, if maxproc is set too high, the admin loses the admittedly very brittle safety net of 10 "reserved" processes. These processes are not in fact "reserved": the number is just a small bypass of an (arbitrary) limit, and it may well happen that there is not enough memory to actually create them. Of course, this usually doesn't depend on maxproc alone, but maxproc is part of the equation, and in the case of a fork bomb it is actually the only part that matters. It is probably very hard to predict the hard limit we should allow maxproc to be set to, let alone keep it consistent with other evolutions of the system. So the current clamp is what it is: a very brittle safety net that only applies in some circumstances, and also a gross hack that very imperfectly compensates for the absence of more solid reservation mechanisms. Despite all that, my point is that it is still better than nothing.

And I could reverse the argument: why would one want to raise that limit? For which practical case?

This specific discussion, however interesting, is logically unrelated to the change proposed here, which I'd like to land in 14.3 ASAP in conjunction with clamping if possible, so I'd rather drop it for now and forget about removing the limit as part of this revision.

This revision was not accepted when it landed; it landed in state Needs Review.Thu, May 15, 2:23 PM
This revision was automatically updated to reflect the committed changes.
olce marked an inline comment as done.