jeff (Jeffrey Roberson)
User

Projects

User Details

User Since
Aug 6 2017, 12:45 AM (59 w, 4 h)

Recent Activity

Mon, Aug 27

jeff added inline comments to D16191: Fix vm_waitpfault on numa.
Mon, Aug 27, 8:35 PM
jeff updated the diff for D16191: Fix vm_waitpfault on numa.

Fix the lock leak. I would like to get this committed to 12. I have not yet had feedback on the low domain avoidance code in the iterator. How do people feel about this?

Mon, Aug 27, 8:35 PM
jeff closed D15977: Make the partpopq LRU sloppy to reduce contention.
Mon, Aug 27, 8:18 PM

Aug 22 2018

jeff added inline comments to D16836: Allow empty NUMA domains to support AMD 2990WX.
Aug 22 2018, 5:57 PM

Aug 20 2018

jeff updated the diff for D16191: Fix vm_waitpfault on numa.

This uses a more correct filter for the vm_wait_severe in vm_glue.c.

Aug 20 2018, 7:46 AM
jeff added inline comments to D16667: Extend uma_reclaim() to permit different reclamation targets..
Aug 20 2018, 7:15 AM

Aug 19 2018

jeff added a comment to D16799: Eliminate the unused arena parameter from kmem_alloc_contig().
In D16799#357567, @alc wrote:
In D16799#357566, @jeff wrote:
In D16799#357563, @alc wrote:
In D16799#357540, @jeff wrote:

I did not do this before because I felt the ROI was low compared to the churn in ports. Have we contacted ports maintainers? The nvidia and virtualbox ports are both going to need #ifdefs.

I will follow up with the ports maintainers. I had two reasons for this. Primarily, I want to implement stronger segregation for physical memory allocations that are permanent, e.g., physical pages backing UMA_ZONE_NOFREE. Specifically, I don't want to have to modify the kmem callers to specify a different arena if that arena is simply a placeholder. Secondarily, we use the arenas somewhat differently in older branches. So, mechanical MFCs may not do the correct thing anyway.

Do you intend to do kmem_malloc as well?

Yes, and kmem_free().

At this point in the release we should possibly also consult re@

Sure, and I'll make it clear that this is not a functional change.

Aug 19 2018, 7:55 PM
jeff added a comment to D16799: Eliminate the unused arena parameter from kmem_alloc_contig().
In D16799#357563, @alc wrote:
In D16799#357540, @jeff wrote:

I did not do this before because I felt the ROI was low compared to the churn in ports. Have we contacted ports maintainers? The nvidia and virtualbox ports are both going to need #ifdefs.

I will follow up with the ports maintainers. I had two reasons for this. Primarily, I want to implement stronger segregation for physical memory allocations that are permanent, e.g., physical pages backing UMA_ZONE_NOFREE. Specifically, I don't want to have to modify the kmem callers to specify a different arena if that arena is simply a placeholder. Secondarily, we use the arenas somewhat differently in older branches. So, mechanical MFCs may not do the correct thing anyway.

Aug 19 2018, 7:46 PM
jeff added inline comments to D16666: Add some accounting to the per-domain full bucket caches..
Aug 19 2018, 7:18 PM
jeff added inline comments to D16667: Extend uma_reclaim() to permit different reclamation targets..
Aug 19 2018, 7:09 PM
jeff added a comment to D16799: Eliminate the unused arena parameter from kmem_alloc_contig().

I did not do this before because I felt the ROI was low compared to the churn in ports. Have we contacted ports maintainers? The nvidia and virtualbox ports are both going to need #ifdefs.

Aug 19 2018, 7:00 PM

Jul 10 2018

jeff added a comment to D16191: Fix vm_waitpfault on numa.
In D16191#343791, @alc wrote:
In D16191#343789, @jeff wrote:
In D16191#343788, @kib wrote:
In D16191#343719, @kib wrote:

I do not think that we have a mechanism that would allow us to migrate the pages to other domains in this situation.

I think vm_page_reclaim_contig_domain() provides most of the machinery needed to implement such a mechanism, FWIW. If there is a NUMA allocation policy which only permits allocations from a specific domain set, we would also need a mechanism to indicate that a given page is "pinned" to that set, and cannot be relocated.

Do we need to stop forking if there is a severe domain? IMO if there is one non-severe domain then we can allow the fork to proceed. The process with a non-fitting policy would be stopped waiting for a free page in the severely depleted domain anyway. I think this check is more about preventing the kernel allocators from blocking on fork.

I still think the better question is, why are we allowing a domain preference to push into severe when another domain is completely unused? That's why I think the more general solution is on the allocator side.

For a single page it makes sense to look at the specific domains we may allocate from. But when we fork we have no idea what objects and policies may be involved. So I'm more reluctant to change that.

This actually strikes me as a scheduling problem. The forking thread should be temporarily migrated to an underutilized domain. That said, the right time to do that migration may be execve(), not fork().

Jul 10 2018, 5:59 PM
jeff added a comment to D16191: Fix vm_waitpfault on numa.
In D16191#343788, @kib wrote:
In D16191#343719, @kib wrote:

I do not think that we have a mechanism that would allow us to migrate the pages to other domains in this situation.

I think vm_page_reclaim_contig_domain() provides most of the machinery needed to implement such a mechanism, FWIW. If there is a NUMA allocation policy which only permits allocations from a specific domain set, we would also need a mechanism to indicate that a given page is "pinned" to that set, and cannot be relocated.

Do we need to stop forking if there is a severe domain? IMO if there is one non-severe domain then we can allow the fork to proceed. The process with a non-fitting policy would be stopped waiting for a free page in the severely depleted domain anyway. I think this check is more about preventing the kernel allocators from blocking on fork.

Jul 10 2018, 5:32 PM

Jul 9 2018

jeff added a reviewer for D16191: Fix vm_waitpfault on numa: alc.

The basic issue is that there are a handful of places where we test for 'any domain' in min or severe and it may need to be 'every domain we try to allocate from'.
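
The distinction between the two tests can be sketched in plain C. This is an illustrative userland model, not the code in D16191: the `domset_mask_t` type and both function names are hypothetical, and a 64-bit mask (bit i set means domain i) stands in for the kernel's domain sets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a domain set: bit i set means domain i. */
typedef uint64_t domset_mask_t;

/* "Any domain" test: true if at least one domain anywhere is severe. */
static bool
any_domain_severe(domset_mask_t severe)
{
	return (severe != 0);
}

/*
 * "Every domain we try to allocate from" test: true only if none of the
 * domains in the caller's allowed set is still non-severe.
 */
static bool
all_allowed_domains_severe(domset_mask_t severe, domset_mask_t allowed)
{
	assert(allowed != 0);	/* an empty policy is a caller bug here */
	return ((allowed & ~severe) == 0);
}
```

With domain 0 severe and a policy that allows domains 0 and 1, the first test says "wait" while the second still lets the allocation proceed on domain 1.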

Jul 9 2018, 9:48 PM
jeff added a comment to D16191: Fix vm_waitpfault on numa.
In D16191#343542, @mjg wrote:

yes, GB. is there a problem reproducing the bug?

Jul 9 2018, 9:46 PM
jeff added a comment to D16191: Fix vm_waitpfault on numa.
In D16191#343412, @mjg wrote:

This does not fix the problem for me -- now things start wedging on 'vmwait'.

The affected machine has 512GB of ram and 4 nodes. Using the prog below with 256MB passed (./a.out 262144) reproduces it. mmacy can be prodded to test on the box.

Jul 9 2018, 6:26 PM
jeff created D16191: Fix vm_waitpfault on numa.
Jul 9 2018, 5:00 AM

Jul 7 2018

jeff committed rS336055: Use the ticks since the last update to reduce hysteresis in the partpopq and.
Jul 7 2018, 1:54 AM

Jul 6 2018

jeff accepted D15933: Back pcpu zone with domain correct pages.

I would like to keep all cores set in cpuset_domain[0] in the !NUMA case so there are no surprises. Other than that this looks good to go.

Jul 6 2018, 1:04 AM

Jul 3 2018

jeff added inline comments to D15933: Back pcpu zone with domain correct pages.
Jul 3 2018, 11:52 PM
jeff added inline comments to D15933: Back pcpu zone with domain correct pages.
Jul 3 2018, 5:01 AM
jeff accepted D16078: make critical_{enter, exit} inline.
Jul 3 2018, 12:51 AM

Jul 1 2018

jeff added a comment to D16078: make critical_{enter, exit} inline.
In D16078#340860, @kib wrote:

Type info can be recovered from the .o compiled with -g, using dwarf dump utilities, which we do not have in base.

When we discussed inlining critical_enter(9) with mjg, my opinion was that struct thread_lite only adds complications. It is good enough to have only the offsets to the members auto-generated, and to manually calculate the addresses of td_critnest and td_owepreempt. It seems that I am the only one who thinks so; everybody else prefers thread_lite. genoffset.h is a good illustration of what I mean. BTW, what are the restrictions on the structure definitions which are processed by the script?

Jul 1 2018, 8:06 PM
jeff added a comment to D16078: make critical_{enter, exit} inline.
In D16078#340968, @imp wrote:

so you have both thread_lite and offset generator... Why?

Jul 1 2018, 8:02 PM
jeff added a comment to D15933: Back pcpu zone with domain correct pages.

Some cosmetic stuff but I'm happy with this patch. If you fix those issues I approve.

Jul 1 2018, 12:31 AM

Jun 30 2018

jeff added inline comments to D16078: make critical_{enter, exit} inline.
Jun 30 2018, 10:39 PM

Jun 26 2018

jeff added a comment to D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.
In D15985#339132, @avg wrote:

IMO, it would be better to do changes this way. Thank you!

Jun 26 2018, 7:42 PM
jeff added a comment to D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.
In D15985#339080, @avg wrote:
In D15985#338992, @jeff wrote:

I had just forgotten about IPI_AST but I like the way I have implemented it here better. It gives the remote scheduler a chance to look at what's going on and make a new decision.

I just hoped that maybe a smaller change could fix the problem.
Honestly, to me this change looks overly specialized towards the problem it solves (rather than a general improvement of the scheduling logic).
Also, I don't quite like that _sched_shouldpreempt_ which was a pure function now becomes a function with non-obvious side effects.

Jun 26 2018, 7:53 AM
jeff added a comment to D15531: Inline critical_enter/exit for amd64.

genassym already generates these and many other offsets. It would be better to do that than manual constants. There would be a little bit of extra build work to make this happen but it should be trivial. We would also need to assert that sizes and types are correct somewhere. The generator could generate both offset and type information for required fields however.

Jun 26 2018, 12:31 AM

Jun 25 2018

jeff added a comment to D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.

I think what I would like to do is commit this with the timeshare preempt delta disabled until I get more experience with it and see if I can reason out a better algorithm.

Jun 25 2018, 10:08 PM
jeff added a comment to D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.
In D15985#338891, @avg wrote:

Hmm, now I see a difference between 4BSD and ULE.
If kick_other_cpu does not preempt the remote CPU, then it does this:

pcpu->pc_curthread->td_flags |= TDF_NEEDRESCHED;
ipi_cpu(cpuid, IPI_AST);

On the other hand, tdq_notify either preempts or does nothing at all.

Jun 25 2018, 9:24 PM
jeff added a comment to D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.
In D15985#338884, @avg wrote:

@jeff I do not completely understand why in this scenario cksum runs behind the loop. Does cksum get a priority worse than the loop?
Otherwise, I would expect that even without preemption TDF_NEEDRESCHED would produce a similar effect.

Jun 25 2018, 9:21 PM

Jun 24 2018

jeff added inline comments to D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.
Jun 24 2018, 12:23 AM
jeff added inline comments to D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.
Jun 24 2018, 12:14 AM
jeff created D15985: Reduce unnecessary preemption, add a preemption knob for timeshare, fix missing NEEDRESCHED.
Jun 24 2018, 12:08 AM

Jun 23 2018

jeff abandoned D13308: pageout fix.

Closing as it is already fixed.

Jun 23 2018, 11:51 PM
jeff commandeered D13308: pageout fix.

This has been resolved in current with a more complete fix.

Jun 23 2018, 11:50 PM
jeff added inline comments to D15976: Change vm_page_import() to avoid physical memory fragmentation.
Jun 23 2018, 10:11 PM
jeff added a comment to D15977: Make the partpopq LRU sloppy to reduce contention.
In D15977#338278, @kib wrote:
In D15977#338276, @jeff wrote:

The other thing to consider is how accurate it needs to be and already is. Which thread was scheduled in the last tick is just as arbitrary as this LRU. You just want to replace something that hasn't been used in a long time. I would guess we're more often looking at things that haven't been touched in seconds or at least hundreds of milliseconds than within a few ticks. I can measure that but it will of course be grossly dependent on the workload and amount of memory.

If we do not need an LRU, maybe we do not need the partpopq list at all? E.g., keeping bins of generations per tick, up to some limited number of bins.

Jun 23 2018, 10:37 AM
jeff added inline comments to D15976: Change vm_page_import() to avoid physical memory fragmentation.
Jun 23 2018, 10:13 AM
jeff added a comment to D15977: Make the partpopq LRU sloppy to reduce contention.
In D15977#338275, @kib wrote:

How random does the order of the partpopq become? Is there any way to evaluate it?

I mean, a tick is a lot, so instead of only doing it at a tick, delegate the limited (?) sorting of the rvq_partpop queues by lasttick to a daemon.

Jun 23 2018, 10:04 AM
jeff added a comment to D15975: eliminate global serialization points in swap reserve & mmap.

Some great stuff in here. Let's peel off parts while we perfect the rest.

Jun 23 2018, 9:28 AM
jeff created D15977: Make the partpopq LRU sloppy to reduce contention.
Jun 23 2018, 8:54 AM
jeff committed rS335579: Sort uma_zone fields according to 64 byte cache line with adjacent line.
Jun 23 2018, 8:10 AM

Jun 15 2018

jeff added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.

kqueue and select both use fget_unlocked. If you want to propose files without references for single threaded programs you are free to do so. You should raise it on arch@ as there is no real owner in this area. This patch further reduces differences between select and poll and reduces the number of atomics used in select which I would argue is the more frequently used of the pair.

Jun 15 2018, 10:55 PM

Jun 14 2018

jeff added inline comments to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
Jun 14 2018, 8:37 PM
jeff added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
In D15799#334063, @mjg wrote:
In D15799#334059, @jeff wrote:

I understand but you can't guarantee the thread is the only thing which is accessing these file descriptors. Off the top of my head, the unix domain socket gc thread does fdrops() on a task queue. It _may_ be possible to start to work around these things but it becomes incredibly hard to reason about. And you'd have to audit everything else in the kernel that uses a file * to understand whether it imposes restrictions on things you can do single threaded.

None of this is of any concern.

If the process is single threaded and the file descriptor table is not shared, it is the only entity which can modify its own fd table.

So in particular if it has a file installed, it holds a reference to keep it alive. Also nothing but curthread can drop it.

Let's say the same file object is being inspected by the unix gc thread - it is of no significance for this process. Let's say it fdrops. Does not matter, the process at hand still has its own ref.

The optimisation of not refing/unrefing files in single-threaded processes is implemented in Linux for all syscalls translating fd -> file.

The only caveat here is that you have to remember whether you grabbed the reference or not, since after you got one the other thread/whatever can disappear and you may transition to being single-threaded.

So the idiom is of this sort:
fp = fd2fp(fd, &need_fdrop);
....
fdrop_cond(fp, need_fdrop);

I would say pretty much no obfuscation in the caller and possibly beneficial to apply globally, not only here.
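
For readers unfamiliar with the idiom, here is a minimal userland sketch of it. The names `fd2fp` and `fdrop_cond` come from the snippet above; everything else (`struct proc_model`, its fields, the fixed-size table) is a hypothetical model, and a plain `int` counter stands in for the atomic reference count a kernel would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	NFILES	8

struct file {
	int f_count;	/* plain int here; a kernel would use atomics */
};

/* Hypothetical model of the process state the check depends on. */
struct proc_model {
	bool single_threaded;
	bool fdtable_shared;
	struct file *files[NFILES];
};

/*
 * Translate fd to a file pointer.  A reference is taken only when another
 * thread or another sharer of the table could close the descriptor under
 * us.  The decision is recorded in *need_fdrop so that the matching release
 * stays correct even if the process becomes single threaded later.
 */
static struct file *
fd2fp(struct proc_model *p, int fd, bool *need_fdrop)
{
	struct file *fp;

	*need_fdrop = false;
	if (fd < 0 || fd >= NFILES || (fp = p->files[fd]) == NULL)
		return (NULL);
	if (!p->single_threaded || p->fdtable_shared) {
		fp->f_count++;
		*need_fdrop = true;
	}
	return (fp);
}

/* Drop the reference only if fd2fp() actually took one. */
static void
fdrop_cond(struct file *fp, bool need_fdrop)
{
	if (need_fdrop) {
		assert(fp->f_count > 1);	/* the table still holds its own */
		fp->f_count--;
	}
}
```

The flag travels with the pointer, so the caller never has to re-evaluate the single-threaded condition at release time.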

Jun 14 2018, 8:34 PM
jeff added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.

as an aside, this is what select already does anyway. Poll was still using the big slock but select was using the lockless fd support.

Jun 14 2018, 6:58 AM
jeff added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
In D15799#334058, @mjg wrote:
In D15799#334056, @jeff wrote:
In D15799#334039, @mjg wrote:

this avoidably pessimizes the common case of single-threaded execution by adding atomic op pair for each fd. the code can check if both the process is single threaded and the fd table is not shared, in which case there is no need to grab a ref on files. this will end up being a minor pessimization for the multithreaded (and presumably rare) case while being a win for singlethreaded one.

on my machine it takes less than 40 clock cycles or 11ns to do an atomic_add, atomic_fetchadd pair on a line that is in cache. I would really prefer that we did not obfuscate the code with fragile exceptions for a tiny bit of performance. There are far more profitable ways to improve our single threaded perf in poll.

This patch converts a per-call lock/unlock pair into a ref/unref pair for each passed fd, so it does matter. More importantly, the vast majority of poll users are single-threaded, so this patch as presented is pessimal for real uses. I don't see how the proposal obfuscates the code in any significant way.

I definitely agree there are plenty of wins to get in this code regardless of the above.

Jun 14 2018, 6:49 AM
jeff added inline comments to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
Jun 14 2018, 6:43 AM
jeff added a comment to D15799: use fget_unlocked in pollscan, defer fdrop to sel/pollrescan and seltdclear.
In D15799#334039, @mjg wrote:

this avoidably pessimizes the common case of single-threaded execution by adding atomic op pair for each fd. the code can check if both the process is single threaded and the fd table is not shared, in which case there is no need to grab a ref on files. this will end up being a minor pessimization for the multithreaded (and presumably rare) case while being a win for singlethreaded one.

Jun 14 2018, 6:38 AM

Jun 11 2018

jeff added a comment to D15736: Implement fast path for malloc and free.

Does this not include the malloc.h changes for M_ZERO?

Jun 11 2018, 1:30 AM

May 31 2018

jeff accepted D15491: Eliminate the "pass" variable in the page daemon control loop..

I just realized that this change has the same goal as D13644. The discussion there is still relevant; in particular, with the PID controller change it now doesn't make sense for v_free_target to be set as high as it is: the controller will produce a positive output as soon as v_free_count < v_free_target, and the page daemon runs the controller once every 100ms. In other words, we start freeing pages more or less immediately after v_free_count drops below v_free_target. I plan to address that, but in a separate change.
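
To make the remark about the controller output concrete, here is a generic PID step in C. It is only a sketch under stated assumptions: `struct pid_sketch`, `pid_sketch_step()`, and the gain fields are hypothetical and are not the kernel's controller. The setpoint plays the role of v_free_target and the input the role of v_free_count.

```c
#include <assert.h>

/* Hypothetical integer PID state; gains are illustrative only. */
struct pid_sketch {
	int setpoint;	/* stands in for v_free_target */
	int integral;	/* accumulated error, clamped at zero */
	int prev_error;	/* error seen on the previous step */
	int kp;		/* proportional gain (multiplier) */
	int ki_div;	/* integral gain expressed as a divisor */
	int kd_div;	/* derivative gain expressed as a divisor */
};

/*
 * Run one controller step, e.g. once per 100ms interval.  Returns how
 * many pages to free; zero when the input is at or above the setpoint.
 */
static int
pid_sketch_step(struct pid_sketch *pc, int input)
{
	int derivative, error, output;

	assert(pc->ki_div > 0 && pc->kd_div > 0);
	error = pc->setpoint - input;
	pc->integral += error;
	if (pc->integral < 0)
		pc->integral = 0;	/* do not bank credit for surplus */
	derivative = error - pc->prev_error;
	pc->prev_error = error;
	output = error * pc->kp + pc->integral / pc->ki_div +
	    derivative / pc->kd_div;
	return (output > 0 ? output : 0);
}
```

With a proportional gain of at least 1, the output turns positive on the first step where the input falls below the setpoint, which is the behaviour described above.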

May 31 2018, 11:36 PM

May 30 2018

jeff added a comment to D15526: reduce overhead of entropy collection.

Given that there is trivially little, if any, entropy coming from mbufs, is there a reason we're leaving this callsite at all? Has anyone from secteam commented?

May 30 2018, 11:05 PM

May 24 2018

jeff committed rS334127: Merge from head.
May 24 2018, 3:47 AM

May 17 2018

jeff accepted D15462: Fix a race in vm_page_pagequeue_lockptr()..
May 17 2018, 4:15 AM

May 13 2018

jeff added inline comments to D15365: simple preempt safe epoch API.
May 13 2018, 1:04 AM
jeff added a comment to D15010: add white listing for ZFS locking pairs that WITNESS can't report accurately and enable WITNESS by default in ZFS.
In D15010#316190, @mav wrote:

I am not closely familiar with WITNESS, so just a feeling: the long lists of blessed locks and their combinations promise high chances for them to be forgotten on following ZFS updates.

That's actually true of all documented lock orders. I don't have a good fix for that. However, the cost of lookup can be further reduced by putting the names in a red-black tree, reducing the overhead to O(log N).

At the very least it would be good to document how those new mechanisms should be used.

Yup. Seems pretty self-explanatory apart from the separate lists used for expediting negative lookups.
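
The O(log N) lookup mentioned above can be illustrated without a tree. The sketch below swaps in a sorted array searched with the standard `bsearch()`, which has the same logarithmic lookup cost for a static list; it is not the WITNESS code, and the `blessed_pair` type, the table contents, and the lock names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct blessed_pair {
	const char *first;
	const char *second;
};

/* Must stay sorted by (first, second); the names are made up. */
static const struct blessed_pair blessed[] = {
	{ "lock_a", "lock_b" },
	{ "lock_a", "lock_c" },
	{ "lock_b", "lock_d" },
};

/* Order pairs by first name, then by second name. */
static int
pair_cmp(const void *a, const void *b)
{
	const struct blessed_pair *pa = a, *pb = b;
	int c;

	c = strcmp(pa->first, pb->first);
	return (c != 0 ? c : strcmp(pa->second, pb->second));
}

/* Binary search the sorted table: O(log N) string comparisons. */
static bool
pair_is_blessed(const char *first, const char *second)
{
	struct blessed_pair key = { first, second };

	assert(first != NULL && second != NULL);
	return (bsearch(&key, blessed, sizeof(blessed) / sizeof(blessed[0]),
	    sizeof(blessed[0]), pair_cmp) != NULL);
}
```

A tree would be preferable if pairs were added at run time; for a compile-time list the sorted array is simpler and just as fast to query.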

May 13 2018, 12:43 AM
jeff added a comment to D15275: Feature enhancements to pmcstat.
In D15275#322358, @kib wrote:
In D15275#322071, @kib wrote:

How is an event description from the JSON tables matched against the index from pmc_events.h?

It works so long as the FreeBSD version was named correctly. I have an aliases table in pmu_utils.c for things like UNHALTED_CORE_CYCLES and LLC_MISSES. If the table lookup fails it will just use the default sampling rate that is used on HEAD. Ultimately, on supported architectures I'd like to switch from using the ad hoc defines in pmc_events.h to using the JSON tables from Intel, IBM, and Cavium.

So did you verify that the names match? What is the plan for non-matching names?

Mostly the names match. Switching to using the Intel names in the tables will fix it for good.

Also I think that importing tables in userspace is really a half measure. Right now we must update both kernel and userspace to get new event added, and in the course of it we have to break pmc(4) ABI. IMO the tables should live in the kernel, it is not a problem when hwpmc(4) is a module, or hwpmc(4) can be a minimal core, with microarch-specific submodules loaded as needed. Userspace should fetch the table from kernel and use kernel handles for events.

I don't agree and neither does Intel (see Andi's mail). We have a table of bits we can pass to the kernel as an ioctl. One could claim that defining them in the kernel buys some safety or compatibility guarantees, but that doesn't actually hold water in practice. Being able to add newly exposed PMCs without having to modify the kernel is a step forward, not a half measure. There are dozens if not hundreds of non-public PMCs on Zen processors. With this mechanism we could simply add a new table, as opposed to laboriously copying from the docs in an error-prone fashion.

May 13 2018, 12:19 AM
jeff added a comment to D15337: Add support for higher resolution timestamps.

My feeling is that ticks is unlikely to go any faster on general purpose kernels and some technique like this is inevitable as we continue to scale link performance. Some slight extra CPU time is a good trade-off for also eliminating weird rounding conditions and scaling factors. Overall I support this work going forward.

May 13 2018, 12:16 AM

May 11 2018

jeff added a comment to D15055: Map constant zero page on read faults which touch non-existing anon page..
In D15055#324256, @alc wrote:
In D15055#323469, @kib wrote:

In fact I started with ft.A.x when I did the testing, but there it was even less interesting than for ft.C.x. The counter's increment was about 1 or 2. This is why I changed to C and also asked about tuning.

I can re-test but I do not see the point.

I agree.

May 11 2018, 9:27 AM
jeff accepted D14917: Detect reads from the hole..

It would be nice to implement it in other filesystems that support sparse files.

May 11 2018, 9:23 AM
jeff added inline comments to D15155: Make pmclog buffer pcpu and update constants.
May 11 2018, 9:21 AM

Apr 30 2018

jeff added a comment to D15233: make ucred thread private.
In D15233#321371, @mjg wrote:

I mean is there any good reason to do this per-uid swap accounting to begin with? By default overcommit flags are 0, which in particular means the limit is not enforced whatsoever. I think it would be acceptable for the time being to flip overcommit to be a boot-time tunable and only play around with accounting if it got enabled.

The general point here is that in the normal case this is just a pessimization and fixing it requires quite some care, all while more pressing issues are here and the 12.0 releng process is around the corner.

Apr 30 2018, 9:45 PM
jeff added a comment to D15233: make ucred thread private.
In D15233#321171, @mjg wrote:
Apr 30 2018, 9:25 AM

Apr 29 2018

jeff added a comment to D15055: Map constant zero page on read faults which touch non-existing anon page..
In D15055#320936, @alc wrote:
In D15055#320927, @kib wrote:
In D15055#320924, @alc wrote:

Has anyone actually measured how often this optimization gets triggered? I'm just curious.

Even a plain multiuser boot does trigger this code several times; it was me who was sloppy with the testing of the last version.

In hindsight, the question that I should have asked is "How often does pmap_remove() encounter the zero page in the page table?" pmap_remove_pages() won't encounter the zero page because it's not mapped as a managed mapping. For "normal", i.e., writeable, virtual memory, I fear that this change is a pessimization. Without this change, on first touch, regardless of whether the access is a write, we will allocate a physical page and map it for write access. And so, this change would only increase the number of page faults. Moreover, in a multithreaded program, those page faults are going to have to perform a TLB shootdown, because we're changing the physical page being mapped. The cost of these additional page faults would have to be outweighed by the savings in the cases where pmap_remove() encountered a mapping to the zero page.

That said, I can see a variant of this change being an optimization for a more restricted set of cases, e.g., a read-only mapping of a file.

The optimization was requested by Jeff for a very specific benchmark, since Linux also does the same trick and apparently FreeBSD loses a lot due to this. See also related D14917.
I think actual numbers will be provided when Jeff returns.

Apr 29 2018, 2:47 AM

Apr 8 2018

jeff added a comment to D14917: Detect reads from the hole..

We should think about what other filesystems could be trivially converted to this interface.

Apr 8 2018, 8:06 PM

Apr 7 2018

jeff added inline comments to D14994: Update zfs_arc_free_target after r329882..
Apr 7 2018, 7:52 PM

Apr 4 2018

jeff accepted D14893: VM page queue batching.
Apr 4 2018, 6:05 PM

Apr 3 2018

jeff added inline comments to D14893: VM page queue batching.
Apr 3 2018, 10:42 PM
jeff added a comment to D14891: msetdomain prototype (similar to mbind()).

I intend to commit this next week. I will denote that it is experimental and API may change in a man page and in comments. I think we're going to need more burn-in time with applications before 12.0 settles. I have a commitment from Netflix to sponsor that work.

Apr 3 2018, 10:34 PM

Apr 1 2018

jeff added inline comments to D14893: VM page queue batching.
Apr 1 2018, 8:41 PM
jeff committed rS331863: Add a uma cache of free pages in the DEFAULT freepool. This gives us.
Apr 1 2018, 4:50 AM
jeff closed D14905: per-cpu free page caching.
Apr 1 2018, 4:50 AM
jeff committed rS331862: Add the flag ZONE_NOBUCKETCACHE. This flag instructions UMA not to keep.
Apr 1 2018, 4:47 AM
jeff added a comment to D14917: Detect reads from the hole..

This version is much slower than the other version but still much faster than head. I think there is some bug though because after a while it started consuming space.

Apr 1 2018, 4:22 AM
jeff committed rS331861: Experimental support for msetdomain() a syscall similar to linux's mbind().
Apr 1 2018, 4:12 AM

Mar 31 2018

jeff added a comment to D14917: Detect reads from the hole..

For what it's worth, I did test this with my sparse file dd test and we well exceed the performance of linux at this benchmark now that we're using the same technique. Unfortunately it defeats a convenient way to create a lot of paging traffic.

Mar 31 2018, 1:42 PM
jeff added inline comments to D14917: Detect reads from the hole..
Mar 31 2018, 1:20 PM

Mar 30 2018

jeff added inline comments to D14905: per-cpu free page caching.
Mar 30 2018, 11:24 PM
jeff updated the diff for D14891: msetdomain prototype (similar to mbind()).

I addressed review feedback.

Mar 30 2018, 8:35 AM
jeff added inline comments to D14891: msetdomain prototype (similar to mbind()).
Mar 30 2018, 5:44 AM
jeff added inline comments to D14893: VM page queue batching.
Mar 30 2018, 5:06 AM
jeff created D14905: per-cpu free page caching.
Mar 30 2018, 3:47 AM
jeff committed rS331754: Re-implement the page free cache with UMA. Change the limits so the import….
Mar 30 2018, 1:33 AM
jeff committed rS331753: Fix a couple of pageout control issues. Reset pass after we meet our target..
Mar 30 2018, 1:31 AM

Mar 29 2018

jeff added inline comments to D14891: msetdomain prototype (similar to mbind()).
Mar 29 2018, 11:17 PM
jeff added inline comments to D14891: msetdomain prototype (similar to mbind()).
Mar 29 2018, 8:59 PM
jeff committed rS331748: Merge from head.
Mar 29 2018, 8:40 PM
jeff added inline comments to D14891: msetdomain prototype (similar to mbind()).
Mar 29 2018, 6:03 AM
jeff created D14891: msetdomain prototype (similar to mbind()).
Mar 29 2018, 6:01 AM
jeff committed rS331723: Implement several enhancements to NUMA policies..
Mar 29 2018, 2:55 AM
jeff closed D14839: NUMA policy enhancements.
Mar 29 2018, 2:55 AM

Mar 28 2018

jeff added inline comments to D14839: NUMA policy enhancements.
Mar 28 2018, 7:19 PM
jeff committed rS331698: Restore r331606 with a bugfix to setup cpuset_domain[] earlier on all.
Mar 28 2018, 6:47 PM

Mar 27 2018

jeff committed rS331610: Backout r331606 until I can identify why it does not boot on some.
Mar 27 2018, 10:21 AM
jeff committed rS331606: Only use CPUs in the domain the device is attached to for default.
Mar 27 2018, 3:37 AM
jeff closed D14838: By default bind interrupts to the set of CPUs in the domain they are connected to.
Mar 27 2018, 3:37 AM
jeff committed rS331605: Move vm_ndomains to vm.h where it can be used with a single header include.
Mar 27 2018, 3:27 AM

Mar 26 2018

jeff committed rS331561: Fix a bug introduced in r329612 that slowly invalidates all clean bufs..
Mar 26 2018, 6:36 PM