
Simplify anonymous memory handling with an OBJ_ANON flag. This eliminates


That would have the advantage of doing less work for transient conditions, such as rapidly forking and exiting processes. It would be hard to do the scan efficiently, but with a sort of ranged mark/sweep you would be able to plug all holes from a central algorithm and simplify normal operation.

In D22423#490916, @kib wrote:

This should be fine IMO.

The only purpose of the shadow_list seems to be to allow calling vm_object_collapse() when refcount == 1 in vm_object_deallocate(). It might be that doing more collapses in other places, or following the whole shadow chain on collapse, makes this location unimportant. We might then be able to remove shadow_list altogether.

Address both pieces of review feedback.

Re-add a missing lock around the vnode size.

We run all of our paging threads constantly now. I would prefer not to disable creating cache buckets from pageproc unless we're in a low-memory situation. Even then, it may be preferable to simply flush the buckets at the end of paging.

I don't see the lock leak? Both again and continue expect the lock to be held.

I am ok with this, but I would definitely like to see a UMA daemon thread that handles timeouts and wss trimming proactively and regularly. Reviewing the mechanism, it seems that we might need to work a little to limit the cost of more frequent processing, but doing it regularly should make it less impactful when it does run.

I used 'tid' for debugging in the past. You can get the whole tid in. We can do somewhat more meaningful asserts with that, but we'd have to deal with I/O à la BUF_KERNPROC.

Taking this over to remind myself to commit it

Are there any further comments? This passed pho testing as well as some other ad-hoc workload testing. I would like to commit soon. I have a number of smaller object lock contention reduction patches I would like to move forward with.

Handle more cases. Fix reservations limit. Other feedback.

Replace OBJ_MIGHTBEDIRTY with a system using atomics. Remove the TMPFS_DIRTY

Use atomics and a shared object lock to protect the object reference count.
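
As an illustration of the general technique named in that title, here is a minimal userland sketch of an atomically managed reference count in C11. It is not the committed FreeBSD code: the struct and function names (obj, obj_init, obj_ref, obj_ref_if_nonzero, obj_rel) are hypothetical, and the sketch does not model the shared object lock mentioned in the title.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with an atomically updated reference count. */
struct obj {
	atomic_uint ref_count;
};

/* Start with a single reference owned by the creator. */
static void
obj_init(struct obj *o)
{
	atomic_init(&o->ref_count, 1);
}

/* Take an extra reference; the caller must already hold one. */
static void
obj_ref(struct obj *o)
{
	atomic_fetch_add_explicit(&o->ref_count, 1, memory_order_relaxed);
}

/*
 * Take a reference only if the count has not already reached zero.
 * Returns true on success.
 */
static bool
obj_ref_if_nonzero(struct obj *o)
{
	unsigned int old;

	old = atomic_load_explicit(&o->ref_count, memory_order_relaxed);
	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit(&o->ref_count,
		    &old, old + 1, memory_order_relaxed, memory_order_relaxed))
			return (true);
	}
	return (false);
}

/*
 * Drop a reference.  Returns true if this was the last one, in which
 * case the caller is responsible for tearing the object down.
 */
static bool
obj_rel(struct obj *o)
{
	unsigned int old;

	old = atomic_fetch_sub_explicit(&o->ref_count, 1,
	    memory_order_release);
	assert(old != 0);	/* catch underflow */
	if (old == 1) {
		/* Order earlier accesses before the teardown. */
		atomic_thread_fence(memory_order_acquire);
		return (true);
	}
	return (false);
}
```

The release ordering on the decrement, paired with the acquire fence on the final drop, is the usual way to make sure all prior accesses to the object are visible to whichever thread ends up destroying it.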

Drop the object lock earlier in fault and don't relock it after pmap_enter().

Drop the object lock in vfs_bio and cluster where it is now safe to do so.

It is really your call, Alan. I was trying to unify the flags among the various non-anonymous swap object users. We also saw with tmpfs that it performs much better with reservations enabled. I can defer this part of the patch or I can fix the above code.

We have run into issues with the auto-tuning here. It all needs to do better. We should be able to set max cache based on total memory. I'm not opposed to this patch as an intermediate step though.

One last point; it would be trivial to turn generation into a timestamp so that it could be used for more precise mtime without needing periodic polling.

I can do arm this afternoon.

Any comment on the MIGHTBEDIRTY check?

On the other hand, this means you'd need 1TB of RAM to run with 256 processor cores/threads, which is a possible amount for a two-socket system but still a somewhat unlikely amount of RAM. Netflix ran into this on some of their test systems and had to disable the check.

(6/6) Convert pmap to expect busy in write related operations now that all

(5/6) Move the VPO_NOSYNC to PGA_NOSYNC to eliminate the dependency on the

(4/6) Protect page valid with the busy lock.

(3/6) Add a shared object busy synchronization mechanism that blocks new page

(2/6) Don't release xbusy in vm_page_remove(), defer to vm_page_free_prep().

(1/6) Replace busy checks with acquires where it is trivial to do so.

Rewrote the vm_page.h locking description.

Add missing barrier.

Fixed wired/busy check order.

In D21596#474569, @kib wrote:

Is there any use of the PGA_WRITEABLE flag left after the patch? pmap_page_is_write_mapped() takes the pv lock, while the PGA_WRITEABLE check is essentially free.

In D21594#474564, @kib wrote:

What do you mean by the note that pageout clears the valid state of the page? I thought that pageout launders the page, and a clean page might be freed for reuse. In other words, valid bits can only be trimmed by truncation of either the vnode or the swap backing OBJ_NOSPLIT object.

In D21592#473738, @kib wrote:

In D21592#473571, @jeff wrote:

In D21592#473568, @markj wrote:

I think I've brought this up before, but I would like it if the VM had a generic per-2MB page structure. We already have several in vm_reserv and the pmap, and IMO it would be a good place to maintain a "compound" busy state, rather than in the object. I worry that a mechanism to block the busying of all pages in an object will inhibit concurrency and lead to transient latency spikes. I don't object to the current approach though.

Right now the object lock is the mechanism that blocks busying of all pages and creates transient latency spikes. This at least narrows it a level and allows other object operations to proceed. If you had a per-superpage object, you would have to be able to look it up very quickly and acquire it in tryxbusy/sbusy.

Ultimately I would like to see a generic mechanism that can treat pages as groups with a single set of state. A variable page size.

But isn't vm_reserv an adequate object? psind is set to 1 iff the reserv is fully populated. Why can't we lock vm_reserv around pmap_enter() in the psind==1 case for fast_soft?

Update: when I wrote that, I was thinking about locking the reserv and, e.g., busying the first page in the range. Incompatible manipulation of any page in the superpage would need to break the reservation. But this is not enough, indeed.

In D21592#473568, @markj wrote:

I think I've brought this up before, but I would like it if the VM had a generic per-2MB page structure. We already have several in vm_reserv and the pmap, and IMO it would be a good place to maintain a "compound" busy state, rather than in the object. I worry that a mechanism to block the busying of all pages in an object will inhibit concurrency and lead to transient latency spikes. I don't object to the current approach though.

My only real feedback is on the naming.


Nov 19 2019

Nov 18 2019

Nov 17 2019

Nov 16 2019

Nov 11 2019

Nov 9 2019

Nov 4 2019

Oct 30 2019

Oct 29 2019

Oct 26 2019

Oct 24 2019

Oct 23 2019

Oct 22 2019

Oct 21 2019

Oct 16 2019

Oct 15 2019

Oct 10 2019

Oct 8 2019

Oct 6 2019

Oct 5 2019

Oct 4 2019

Sep 29 2019

Sep 28 2019

Sep 25 2019

Sep 23 2019

Sep 19 2019

Sep 18 2019

Sep 17 2019

Sep 16 2019
