This is the least elegant of the object concurrency patches. The basic notion is that the object can become busy after which new pages can not become busy but existing pages may persist in being busy. This permits the existing style of only checking, and not acquiring, busy for dedicated blocks of code.
Here with object busy we're synchronizing vm_fault_soft_fast against pmap_remove* or changes in valid bits on any constituent page in the super page set. We could simply busy every page in the set and then unbusy at the end as we would've for 512 individual mappings. This would be my preference, but there was concern over the cost. So instead we use the atomic object busy counter to emulate the old behavior of a busy check being sufficient.
The implementation does allow for pages to transiently become busy, notice the object busy count, and drop busy. These will revert back to single page mappings if you lose the race. I don't see a reasonable way around this.