When r107913 added blist_fill() to the kernel, the swap pager used a different scheme for striping the allocation of swap space across multiple devices. Although blist_fill() was intended to support fill operations with arbitrarily large counts, the old scheme never performed a fill larger than the stripe size. Consequently, a misplaced sanity check in blst_meta_fill() went undetected. Later, r118390 introduced a new striping scheme that maintains a blist allocator per device, but, as noted in r318995, swapoff_one() was not fully and correctly converted to the new scheme. This change completes what was started in r318995 by fixing the underlying bug in blst_meta_fill() that prevents swapoff_one() from simply performing a single blist_fill() operation.
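To make the goal concrete, here is a minimal user-space sketch, not the kernel code. It assumes a toy one-byte-per-block allocator in place of the real radix tree, and every `toy_` name and `TOY_NBLKS` are hypothetical. It contrasts a caller that limits each fill to one stripe with a caller that issues a single fill for the whole device. The return value (number of blocks that were free before the call) is modelled on my understanding of the blist_fill() contract.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NBLKS 64		/* hypothetical device size, in blocks */

/* Toy stand-in for a per-device allocator: one byte per block, 1 = allocated. */
struct toy_blist {
	uint8_t used[TOY_NBLKS];
};

/*
 * Mark [blkno, blkno + count) as allocated, ignoring existing allocations,
 * and return how many of those blocks were free beforehand.
 */
static int
toy_fill(struct toy_blist *bl, int blkno, int count)
{
	int i, nfilled = 0;

	assert(blkno >= 0 && count >= 0 && blkno + count <= TOY_NBLKS);
	for (i = blkno; i < blkno + count; i++) {
		if (!bl->used[i]) {
			bl->used[i] = 1;
			nfilled++;
		}
	}
	return (nfilled);
}

/* Chunked caller: never asks for more than one stripe per fill. */
static int
toy_fill_by_stripe(struct toy_blist *bl, int nblks, int stripe)
{
	int base, n, total = 0;

	for (base = 0; base < nblks; base += stripe) {
		n = nblks - base < stripe ? nblks - base : stripe;
		total += toy_fill(bl, base, n);
	}
	return (total);
}

/* Single-call caller: one fill covering the whole device. */
static int
toy_fill_whole(struct toy_blist *bl, int nblks)
{
	return (toy_fill(bl, 0, nblks));
}
```

With a correct fill routine, both callers leave the allocator in the same state and report the same count, so the chunked loop only adds complexity. The single call is safe only if the fill routine handles counts larger than a stripe.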
Details
Diff Detail
- Repository: rS FreeBSD src repository - subversion
- Lint: Lint Not Applicable
- Unit: Tests Not Applicable
Event Timeline
It looks fine to me as well.
I would ask why we don't replace blist(9) with vmem(9), but the reason is probably the guarantee that blist(9) does not allocate memory after creation. I wonder if vmem(9) can be similarly enhanced for specific allocations.
I asked myself the same question about a week ago. Here is what "myself" concluded. :-)
blist(9) is more space efficient than vmem(9) when the managed space becomes fragmented. I believe that the worst case for vmem(9) would be alternating allocated and free blocks. vmem(9) would need a boundary tag for every free block. In contrast, blist(9) will use roughly 2 bits per block. That said, the best case for vmem(9) is going to be better.
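As a rough back-of-the-envelope check of that worst case, the sketch below compares the two overheads for a fully fragmented space. The 2 bits per block figure comes from the paragraph above. The boundary tag size `TOY_BT_BYTES` is a placeholder I picked for illustration, not a measured size of the real structure, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BT_BYTES 64		/* ASSUMED boundary tag size; not measured */

/* blist(9): roughly 2 bits of metadata per managed block. */
static uint64_t
toy_blist_overhead_bytes(uint64_t nblks)
{
	assert(nblks <= UINT64_MAX / 2);	/* keep nblks * 2 from overflowing */
	return ((nblks * 2 + 7) / 8);
}

/*
 * vmem(9) worst case: alternating allocated and free blocks, so half of
 * the blocks are free and each one needs its own boundary tag.
 */
static uint64_t
toy_vmem_worst_case_bytes(uint64_t nblks)
{
	assert(nblks / 2 <= UINT64_MAX / TOY_BT_BYTES);
	return ((nblks / 2) * TOY_BT_BYTES);
}
```

Under these assumptions, 2^20 blocks cost 262144 bytes with the bitmap scheme and 33554432 bytes with one tag per free block. The exact ratio depends entirely on the assumed tag size, but the gap stays large for any realistic value.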
However, the real reason that I decided to invest time into blist(9) was that vmem(9)'s supported allocation policies are no better than the current blist(9). I think that first-fit and best-fit policies make no sense for managing disk- (or even SSD-)based storage.
I'll also mention that there is an accounting bug in swaponsomething(). When it updates "swap_pager_avail", it fails to account for the 2 blocks that are reserved for disk labels. Consequently, "swap_pager_avail" can never be 0. If we fix this bug, we may change the behavior of vm_pageout_oom().
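The effect of that accounting bug can be illustrated with a small user-space model. This is a sketch of the arithmetic only, not the swaponsomething() code: the `toy_` names and the `account_for_reserved` switch are hypothetical, and `avail` merely stands in for "swap_pager_avail".

```c
#include <assert.h>

#define TOY_RESERVED 2		/* blocks reserved for disk labels, per the comment above */

struct toy_swap {
	long avail;		/* counter modelled on swap_pager_avail */
	long usable;		/* blocks that can actually be handed out */
};

/* Add a device of nblks blocks; optionally account for the reserved blocks. */
static void
toy_swapon(struct toy_swap *s, long nblks, int account_for_reserved)
{
	assert(nblks > TOY_RESERVED);
	s->usable += nblks - TOY_RESERVED;
	s->avail += account_for_reserved ? nblks - TOY_RESERVED : nblks;
}

/* Allocate n blocks; returns 0 on success, -1 if the space is exhausted. */
static int
toy_alloc(struct toy_swap *s, long n)
{
	if (n > s->usable)
		return (-1);
	s->usable -= n;
	s->avail -= n;
	return (0);
}
```

In this model, adding a 100-block device without the adjustment and then allocating all 98 usable blocks leaves `avail` at 2 even though no further allocation can succeed. With the adjustment, `avail` reaches 0 at the same point, which is the state any code that tests the counter against zero would then start to observe.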