This change reduces the number of PCTRIE lookups performed by swap_pager_haspage().
Since swap_pager_haspage() is called frequently during heavy swap-in and swap-out activity, avoiding the per-page lookups helps a lot.
For example, on i386, SWB_NPAGES is 32 and each swap_pager_haspage() call may invoke
PCTRIE_LOOKUP up to 65 times. The new code needs only 9 lookups for the same
window, roughly a 7x (700%) reduction.
On amd64, where SWB_NPAGES is 64, the reduction is about twice as large again.
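To make the mechanism concrete, here is a minimal self-contained sketch of the idea, not the patch itself: buckets in the style of struct swblk cover SWAP_META_PAGES consecutive page indices, so one PCTRIE lookup fetches a whole bucket, and neighbouring pages are then examined by plain array indexing. The bucket type, bucket_lookup(), and run_after() below are illustrative stand-ins, not the real kernel interfaces.

/*
 * Userspace model: one "trie lookup" per bucket of SWAP_META_PAGES slots,
 * then array scans for the neighbours.  bucket_lookup() stands in for the
 * PCTRIE lookup in sys/vm/swap_pager.c.
 */
#include <stdio.h>
#include <stdint.h>

#define SWAP_META_PAGES	32		/* matches the i386 value cited above */
#define SWAPBLK_NONE	((int64_t)-1)

struct bucket {
	uint64_t p;			/* first page index covered */
	int64_t  d[SWAP_META_PAGES];	/* swap block per page, or SWAPBLK_NONE */
};

static int trie_lookups;		/* counts the "PCTRIE lookups" we did */

static struct bucket *
bucket_lookup(struct bucket *b, uint64_t pindex)
{
	trie_lookups++;
	return (pindex >= b->p && pindex < b->p + SWAP_META_PAGES ? b : NULL);
}

/* Length of the contiguous run starting at pindex, within one bucket. */
static int
run_after(struct bucket *b, uint64_t pindex)
{
	struct bucket *sb;
	int i, n;

	sb = bucket_lookup(b, pindex);	/* one lookup covers the whole bucket */
	if (sb == NULL)
		return (0);
	n = 0;
	for (i = (int)(pindex - sb->p); i < SWAP_META_PAGES; i++) {
		if (sb->d[i] == SWAPBLK_NONE)
			break;
		n++;			/* neighbours come from the array */
	}
	return (n);
}

int
main(void)
{
	struct bucket b = { .p = 0 };
	int i;

	for (i = 0; i < SWAP_META_PAGES; i++)
		b.d[i] = (i < 10) ? 1000 + i : SWAPBLK_NONE;

	printf("run length from pindex 3: %d\n", run_after(&b, 3));
	printf("trie lookups used: %d\n", trie_lookups);
	return (0);
}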
Details
This changeset contains self-validation code as a test driver.
I have not seen any behavioral difference between the old and new implementations.
I will upload a version without the self-test later.
make buildworld -j <high number> causes a lot of page-in/page-out activity and is a
good test case.
I analyzed the difference with the dtrace code below:
fbt::swap_pager_haspage:entry
{
	self->ts = timestamp;
}

fbt::swap_pager_haspage:return
/self->ts/
{
	@count["swap_pager_haspage count"] = count();
	@avg["swap_pager_haspage average time"] = avg(timestamp - self->ts);
	@sum["swap_pager_haspage total time"] = sum(timestamp - self->ts);
	self->ts = 0;
}
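(A script of this shape is run in the usual way, e.g. dtrace -s haspage.d, where the file name is just a placeholder; the aggregations are printed when dtrace exits.)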
I wanted to measure swp_pager_meta_lookup() directly, but somehow dtrace -l did not
list a probe for it. So I measured swap_pager_haspage(), which calls
swp_pager_meta_lookup().
The results below are from running buildworld on two kernels built from the same
base, with and without the patch.
The system was set up in a Parallels VM with 1.5 GB of physical memory and 10 GB of
swap space on an SSD.
r357119M: /usr/src/usr/obj/i386.i386/sys/GENERIC-NODEBUG i386 (without the patch)
	swap_pager_haspage count		33807456
	swap_pager_haspage average time		12867
	swap_pager_haspage total time		435000565713

r357119M: /usr/obj/mnt/sys/swp_meta_lookup/i386.i386/sys/GENERIC-NODEBUG i386 (with the patch)
	swap_pager_haspage count		88970
	swap_pager_haspage average time		8397
	swap_pager_haspage total time		747098759
The average time per call is about 50% faster (12867 ns vs. 8397 ns; dtrace timestamps are in nanoseconds).
The number of calls is reduced by a factor of about 380 (33807456 vs. 88970).
The total time is reduced by a factor of roughly 580 (435000565713 ns vs. 747098759 ns).
A roughly 50% speedup per call sounds reasonable.
However, I am not sure why the total number of calls is reduced so much.
The challenge of running dtrace for swap-in/swap-out workloads is that
the OOM killer frequently killed dtrace when memory ran low.
My dtrace was killed nearly fifty times over the past couple
of months of trying to measure the performance gain.
Event Timeline
Dropped self-test.
Updating D22409: swp_pager_meta_lookup() to find the swap address and additionally search for the number of contiguous blocks.
sys/vm/swap_pager.c
	2131-2132: I need to restore this KASSERT.
sys/vm/swap_pager.c
	1072: These swp_pager_meta_lookup() calls here and below are merged into a single swp_pager_meta_lookup() call. For ranges up to SWAP_META_PAGES, swp_pager_meta_lookup() can reach adjacent entries via array indexing, avoiding most of the PCTRIE lookups. (A rough userspace model of this interface follows below.)
	2218: The argument is an int while vm_pindex_t is unsigned long long; I wanted to check for negative values earlier to avoid comparing these two different types.
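To sketch the merged-lookup interface mentioned in the 1072 comment above (the signature in the actual patch may differ), here is a small self-contained userspace model in which a single call returns the swap address for pindex and fills in how far the contiguous run extends before and after it; meta_lookup() and the fixed-size blocks[] array are illustrative only.

#include <stdio.h>
#include <stdint.h>

#define NPAGES		32
#define SWAPBLK_NONE	((int64_t)-1)

static int64_t blocks[NPAGES];		/* one bucket's worth of swap addresses */

/* Hypothetical merged lookup: address for pindex plus run lengths. */
static int64_t
meta_lookup(uint64_t pindex, int *before, int *after)
{
	int i;

	if (pindex >= NPAGES || blocks[pindex] == SWAPBLK_NONE) {
		*before = *after = 0;	/* zero the run counts on SWAPBLK_NONE */
		return (SWAPBLK_NONE);
	}
	for (i = 0; (int64_t)pindex - i - 1 >= 0 &&
	    blocks[pindex - i - 1] != SWAPBLK_NONE; i++)
		;
	*before = i;
	for (i = 0; pindex + i + 1 < NPAGES &&
	    blocks[pindex + i + 1] != SWAPBLK_NONE; i++)
		;
	*after = i;
	return (blocks[pindex]);
}

int
main(void)
{
	int64_t blk;
	int before, after, i;

	for (i = 0; i < NPAGES; i++)
		blocks[i] = (i >= 4 && i <= 12) ? 500 + i : SWAPBLK_NONE;
	blk = meta_lookup(8, &before, &after);
	printf("blk %jd before %d after %d\n", (intmax_t)blk, before, after);
	return (0);
}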
sys/vm/swap_pager.c
	1056: This is changing the values of maxahead and maxbehind passed to swap_pager_getpages(), so the caller's hints may not be respected.
	2225: This comment seems a little misleading? vm_pindex_t is unsigned.
	2274: I find this function quite hard to read: there are many local variables and adjustments by one. Is it possible to simplify this at all?
sys/vm/swap_pager.c
	1056: Existing code has a limit which looks like:
sys/vm/swap_pager.c
	2225: The problem cases were when pindex was very small, like 1, and *before was larger, like 31: the subtraction wrapped around to a huge unsigned value. I think the comment is now accurate. (A tiny standalone illustration of the wrap follows below.)
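To make the wrap concrete, here is a tiny standalone illustration using the values mentioned above; uint64_t stands in for the unsigned vm_pindex_t.

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	uint64_t pindex = 1;	/* vm_pindex_t is unsigned */
	int before = 31;

	/* 1 - 31 does not go negative; it wraps to 18446744073709551586. */
	printf("%ju\n", (uintmax_t)(pindex - before));
	return (0);
}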
Fixed style, moved the zero assignment to *before and *after into the SWAPBLK_NONE case,
and adjusted the if/else statement for the *after case.
I think I have addressed all of the feedback so far.
I would appreciate it if someone could take a look.