Page MenuHomeFreeBSD

(1/2) threads: introduce a limit for total number
ClosedPublic

Authored by mjg on Nov 5 2020, 6:44 AM.
Tags
None
Referenced Files
F103515268: D27100.id.diff
Mon, Nov 25, 11:48 PM
Unknown Object (File)
Tue, Nov 19, 8:23 AM
Unknown Object (File)
Wed, Nov 13, 8:34 AM
Unknown Object (File)
Thu, Nov 7, 10:51 PM
Unknown Object (File)
Tue, Nov 5, 3:04 PM
Unknown Object (File)
Oct 24 2024, 2:15 AM
Unknown Object (File)
Oct 11 2024, 3:03 AM
Unknown Object (File)
Oct 1 2024, 6:47 AM
Subscribers

Details

Summary

The intent is to replace the current id allocation method and a known upper bound will be useful.

Test Plan

tested by pho (both patches)

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

mjg requested review of this revision.Nov 5 2020, 6:44 AM
sys/kern/kern_thread.c
59 ↗(On Diff #79202)

priv.h sorts after pmckern.h.

257 ↗(On Diff #79202)

Why did you move the eventhandler call sites?

sys/kern/kern_thread.c
257 ↗(On Diff #79202)

The original place was a little iffy in that it was invoking very early/very late, meaning relevant part of the struct might have not been inititalized yet or got destroyed. iow the move alone improves the hook in my opinion.

The patch also moved around tid alloc/free and I find a necessity that a valid id is assigned before we call into into ctor and is not yet freed (and possibly reused) when calling into dtor.

sys/kern/kern_thread.c
147 ↗(On Diff #79211)

Why is it RD and not RW ?

347 ↗(On Diff #79211)

This should go in some header, perhaps proc.h.

359 ↗(On Diff #79211)

{} are not needed.

There is already somewhat implicit limit on total number of threads, imposed by KVA limits.
Currently I believe we die around 150000 threads on 64 bit, and several tens of thousands on 32bit (i386 might have slightly larger slack than e.g. arm). I suggest you to make the cap arch-depended (ILP32/LP64 differentiator should be enough), around that limits, unless you recheck.

Also please notify pho, he has tests that exercise max allowable number of threads and check that kstack allocation fails gracefully. He would need to tune this limit for tests to be relevant.

448 ↗(On Diff #79211)

{} are not needed.

sys/kern/kern_thread.c
257 ↗(On Diff #79202)

You can pass the tid to the ctor using uma_zalloc_arg().

sys/kern/kern_thread.c
147 ↗(On Diff #79211)

This is stemming from the other patch -- there is no code implemented to resize the bitmap. I'll add it later.

347 ↗(On Diff #79211)

I think the split between kern_thr.c and kern_thread.c should be resolved in some manner and it would also take care of this var. The change at hand is bare minimum which moves things forward until someone refactors the area.

359 ↗(On Diff #79211)

{} are allowed by style and I prefer them as they don't disrupt git blame if body gets modified (single<->multiline)

On flix1 (488GB RAM) I get to 1 mln threads very easily, I have not checked what the real bound might be.

The program is:

#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

static void *
worker(void *arg)
{

        pause();
        return (NULL);
}

int
main(void)
{
        pthread_t *threads;
        int i, error;

        threads = malloc(20000000 * sizeof(*threads));
        for (i = 0;; i++) {
                if ((i % 1000) == 0) {
                        printf("%i...", i);
                        fflush(stdout);
                }
                if (i == 99999) {
                        printf("got to 99999, waiting....\n");
                        getchar();
                }
                if ((error = pthread_create(&threads[i], NULL, worker, NULL)) != 0) {
                        printf("failed at %i with %d\n", i, error);
                        pause();
                        return (1);
                }
        }
}

Singular instances stop get an error at 100k threads, but spawning additional one works fine.

That said, threads can be created but they are all parked in pause. I don't know how usable the system might be if they actually run.

I'll put a much lower limit on 32-bit.

257 ↗(On Diff #79202)

The current code is acting like this and I think this is a highly pessimal behavior. For one it forces UMA to provide memory to run the constructor even ID allocation is bound to fail due to the limit. I can concede though that with arbitrary limits some elements may be free in per-CPU buckets, but the point stands.

mjg edited the test plan for this revision. (Show Details)
  • different limits for 32/64
sys/kern/kern_thread.c
257 ↗(On Diff #79202)

I'm saying, why not allocate the TID before calling into UMA, and use uma_zalloc_arg() to pass the tid as the third argument to the constructor? Then you don't need to move the eventhandlers. If you hit the thread limit, then there's no need to call into UMA.

sys/kern/kern_thread.c
257 ↗(On Diff #79202)

I see I misread, I have plans of removing the constructor/destructor anyway. Regardless, I don't see what's to be won for calling eventhandlers from inside of uma. Changed places as it is should be a nop for the hooks in base at least.

kib added inline comments.
sys/kern/kern_thread.c
59 ↗(On Diff #79202)

This is not handled.

This revision is now accepted and ready to land.Nov 9 2020, 6:31 PM
markj added inline comments.
sys/kern/kern_thread.c
257 ↗(On Diff #79202)

It's changing the thread initialization order, and it's hard to predict how that might affect consumers. It's also just extra churn, and confusing to readers since the UMA constructor only does an arbitrary subset of the work.

This revision was automatically updated to reflect the committed changes.