Oct 16 2019
Handle arm v4.
Rework patch as discussed above, replacing most of the sfence uses with a locked op.
Alan, Mark, could you please look at the sys/vm part of the patch? It was changed, and I think it is simpler than the original version. The change now is that the vm_fault() path can take the vnode lock earlier, compared with the previous, more delicate (and apparently buggy) move of the size check.
So, can we come to some conclusion here, please? I see two action items:
- Addition of SFENCE. I now tend to think that SFENCE should be moved to the else part of 'oldpmap == pmap' in pmap_activate_sw(). Intel has changed its description several times, so I believe this is the safest way.
- From the discussion, I believe that the SFENCEs which brace CLFLUSH{OPT} for Intel could be replaced by a locked atomic, i.e. atomic_thread_fence_seq_cst().
- Additionally, I will try to make a query about this stuff through the FF/Intel technical contact.
Oct 15 2019
In D22007#481450, @alc wrote: In D22007#481438, @kib wrote: BTW, I did try to find a reference in SDM vol. 3 that would definitively confirm that interrupts and exceptions are serializing, and failed. I know for sure that sysenter is not serializing.
If an application performs a system call in the middle of, for example, a sequence of non-temporal stores, without having performed an sfence first, I would say that the application is broken. :-)
In D22041#481451, @alc wrote: If my recollection of the sparc64 pmap is correct, it will have the same issue.
In D22041#481448, @alc wrote: Why not use PMAP_ENTER_NOSLEEP instead of introducing a new flag?
BTW, I did try to find a reference in SDM vol. 3 that would definitively confirm that interrupts and exceptions are serializing, and failed. I know for sure that sysenter is not serializing.
Really update used var.
Update flags arg name for booke.
Assert that the object is locked if !quick.
What is the purpose?
This version passes stress2, according to Peter's report.
Rebase.
Disable assert for ZFS, I do not intend to fix it now.
I remembered why I did not want to put the SFENCE instruction in pmap_activate_sw(). The descriptions of, e.g., CLFLUSHOPT (as well as AMD CLZERO) explicitly state that SFENCE is required; they do not mention serializing instructions.
Oct 14 2019
In D22007#481114, @alc wrote: I'm confused as to why this is necessary. pmap_activate_sw() performs a serializing instruction, specifically, a move to cr3. And, Section 8.2.5 of Volume 3 says,
"Program synchronization can also be carried out with serializing instructions (see Section 8.3). These instructions are typically used at critical procedure or task boundaries to force completion of all previous instructions before a jump to a new section of code or a context switch occurs. Like the I/O and locking instructions, the processor waits until all previous instructions have been completed and all buffered writes have been drained to memory before executing the serializing instruction."
And, the next paragraph discusses the use of fences, e.g., sfence, instead of serializing instructions, e.g., cpuid.
In D22007#481089, @jhb wrote: Does i386 require a similar fix (or do we not care enough about i386 to bother?)
Oct 13 2019
What do you mean by idle map entries? The deferred list?
Rebase.
Oct 12 2019
Stop using td in vm_fault_trap(), use curthread.
Add comment trying to explain convoluted ncl_pager_setsize() interface.
Oct 11 2019
Instead of unconditionally moving the object size check for vnodes, ensure that the vnode is locked at the place of the existing check. Only do the locking for filesystems that require it, as indicated by a flag on the object.
This cannot go in without an explanation, in the comment above, of why mtx_owner is checked. But it should not go in regardless: if such a check is useful, perhaps it should be added to VI_TRYLOCK or mtx_trylock, or another version of VI_TRYLOCK should be created.
It is relative, of course. SFENCE might not flush store buffers faster than possible, but it does guarantee that code after it sees store buffers flushed.
Oct 10 2019
Oct 9 2019
In D21883#479553, @markj wrote: In D21883#479546, @kib wrote: In D21883#479469, @markj wrote: I think the fault handler change is ok. I read through the NFS changes but am not very confident in that area.
Are you fine with both vm change commits carrying 'Reviewed by: you'?
Yes.
I was somewhat surprised to see that vm_page_alloc() and its callees do not verify that the requested pindex is < object->size. Indeed, vm_reserv_alloc_page() even handles the case where pindex >= object->size.
This is because we actually have pages beyond EOF on the object's queue. The main offender is UFS, where negative logical block numbers are used for indirect blocks and extended data blocks. The negative block indexes are then converted into very large pindexes, which cannot be equal to the pindex of any data page, due to sign extension.
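The sign-extension effect can be sketched in plain C. This is a minimal illustration, not the kernel's actual conversion code: the type names lbn_t and pindex_t, the constants, and the helper lbn_to_pindex() are all hypothetical, and the 32 KB block size and 4 KB page size are assumed values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types (assumptions). */
typedef int64_t  lbn_t;     /* signed logical block number */
typedef uint64_t pindex_t;  /* unsigned page index within an object */

#define EX_BLOCK_SIZE  32768  /* assumed filesystem block size, bytes */
#define EX_PAGE_SIZE   4096   /* assumed page size, bytes */

/*
 * Convert a logical block number to a page index: compute the signed
 * byte offset, divide by the page size, then reinterpret as unsigned.
 * A negative block number yields a negative page number, which becomes
 * a value near UINT64_MAX after the conversion to the unsigned type.
 */
pindex_t
lbn_to_pindex(lbn_t lbn)
{
	int64_t offset = lbn * EX_BLOCK_SIZE;

	assert(EX_BLOCK_SIZE % EX_PAGE_SIZE == 0);
	return ((pindex_t)(offset / EX_PAGE_SIZE));
}
```

With these assumed sizes, block 3 maps to page index 24, while block -1 maps to UINT64_MAX - 7, which is larger than the page index of any byte offset that fits in a signed 64-bit file size.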
I see, thanks. I knew about the negative block numbers but did not remember that they are used in VMIO.
Peter, could you please test this? A whole test suite run is required; just the NFS part is not enough.
Add comment in vm_fault.c.
In D21883#479469, @markj wrote: I think the fault handler change is ok. I read through the NFS changes but am not very confident in that area.
Are you fine with both vm change commits carrying 'Reviewed by: you'?
Oct 8 2019
Am I right that the total wall clock time for buildworld is the same, while the system time is slightly reduced?
Oct 7 2019
In D21921#478894, @kevans wrote: dev_clone(9) does certainly document it as being invoked for all name lookups; I'm thinking passing the flags along would be more appropriate for this.
Oct 6 2019
Well, this relies on the consistency of the pv chunks list. In other words, if any CPU in the system modifies the pv LRU list, minidump would hang or crash.
Moreover, you are locking the mutex, which is either a no-op or hangs as well.
It is impossible to turn logsigexit back to default with proccontrol(1), but this is arguably a wart in the current structure of the arguments parser.
Second LK_UPGRADE should have been LK_TRYUPGRADE.
Oct 5 2019
Why did you leave i386 out?
You may note in the commit message that this is a revert of r244643.
In D21883#478403, @markj wrote: I have a tangential question regarding the vnode size: in vn_lseek()'s L_XTND case, we use VOP_GETATTR to get the file size. Why is it not sufficient to use the size from the pager field, under the shared vnode lock?
You need to handle compat32.