Index: lib/libc/sys/_umtx_op.2
===================================================================
--- lib/libc/sys/_umtx_op.2
+++ lib/libc/sys/_umtx_op.2
@@ -28,7 +28,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd May 5, 2016
+.Dd May 17, 2016
 .Dt _UMTX_OP 2
 .Os
 .Sh NAME
@@ -85,6 +85,7 @@
 	volatile lwpid_t m_owner;
 	uint32_t m_flags;
 	uint32_t m_ceilings[2];
+	uintptr_t m_rb_lnk;
 };
 .Ed
 .Pp
@@ -95,18 +96,24 @@
 locked state, or zero when the lock is unowned.
 The highest bit set indicates that there is contention on the lock.
 The constants are defined for special values:
-.Bl -tag -width "Dv UMUTEX_CONTESTED"
+.Bl -tag -width "Dv UMUTEX_RB_OWNERDEAD"
 .It Dv UMUTEX_UNOWNED
 Zero, the value stored in the unowned lock.
 .It Dv UMUTEX_CONTESTED
 The contenion indicator.
+.It Dv UMUTEX_RB_OWNERDEAD
+A thread owning the robust mutex terminated.
+The mutex is in the unlocked state.
+.It Dv UMUTEX_RB_NOTRECOV
+The robust mutex is in the non-recoverable state.
+It cannot be locked until reinitialized.
 .El
 .Pp
 The
 .Dv m_flags
 field may contain the following umutex-specific flags,
 in addition to the common flags:
-.Bl -tag -width "Dv UMUTEX_PRIO_INHERIT"
+.Bl -tag -width "Dv UMUTEX_NONCONSISTENT"
 .It Dv UMUTEX_PRIO_INHERIT
 Mutex implements
 .Em Priority Inheritance
@@ -115,6 +122,11 @@
 Mutex implements
 .Em Priority Protection
 protocol.
+.It Dv UMUTEX_ROBUST
+Mutex is robust.
+.It Dv UMUTEX_NONCONSISTENT
+Robust mutex is in a transient non-consistent state.
+Not used by the kernel.
 .El
 .Pp
 In the manual page, mutexes not having
@@ -417,6 +429,69 @@
 When waking up a limited number of threads from a given sleep queue,
 the highest priority threads that have been blocked for the longest on
 the queue are selected.
+.Ss ROBUST UMUTEXES
+The
+.Em robust umutexes
+are provided as a substrate for the userspace library to implement
+POSIX robust mutexes.
+A robust umutex must have the
+.Dv UMUTEX_ROBUST
+flag set.
+.Pp
+On thread termination, the kernel walks two lists of mutexes.
+The addresses of the list heads must be provided by a prior
+.Dv UMTX_OP_ROBUST_LISTS
+request.
+The lists are singly linked; the link to the next element is provided by the
+.Dv m_rb_lnk
+member of the
+.Vt struct umutex .
+.Pp
+If the kernel encounters any of the following while walking a list:
+.Bl -dash -offset indent -compact
+.It
+a mutex without the
+.Dv UMUTEX_ROBUST
+flag set
+.It
+a mutex not owned by the current thread, unless it is pointed to
+by the
+.Dv robust_inact_offset
+member of the
+.Vt struct umtx_robust_lists_params ,
+registered for the current thread by the
+.Dv UMTX_OP_ROBUST_LISTS
+request
+.It
+a mutex with an invalid combination of flags
+.It
+an invalid next pointer
+.It
+the list length limit, see
+.Xr libthr 3
+.El
+the robust list processing is aborted.
+.Pp
+On each iteration, the mutex found is unlocked as if the
+.Dv UMTX_OP_MUTEX_UNLOCK
+request is performed on it, but instead of the
+.Dv UMUTEX_UNOWNED
+value, the
+.Dv m_owner
+field is written with the
+.Dv UMUTEX_RB_OWNERDEAD
+value.
+When such a mutex is subsequently locked, the lock is granted, with the error
+.Er EOWNERDEAD
+returned.
+.Pp
+The kernel also specially interprets the
+.Dv UMUTEX_RB_NOTRECOV
+value of the
+.Dv m_owner
+field, always returning
+.Er ENOTRECOVERABLE
+for lock attempts.
 .Ss OPERATIONS
 The following operations, requested by the
 .Fa op
@@ -582,12 +657,12 @@
 Pointer to the umutex.
 .It Fa val
 New ceiling value.
-.It Fa uaddr1
+.It Fa uaddr
 Address of a variable of type
 .Vt uint32_t .
 If not NULL, after the successful update the previous ceiling value is
 written to the location pointed to by
-.Fa uaddr1 .
+.Fa uaddr .
 .El
 .Pp
 The request locks the umutex pointed to by the
@@ -614,7 +689,7 @@
 .Vt struct ucond .
 .It Fa val
 Request flags, see below.
-.It Fa uaddr1
+.It Fa uaddr
 Pointer to the umutex.
 .It Fa uaddr2
 Optional pointer to a
@@ -624,7 +699,7 @@
 .Pp
 The request must be issued by the thread owning the mutex pointed to
 by the
-.Fa uaddr1
+.Fa uaddr
 argument.
 The
 .Dv c_hash_waiters
@@ -633,7 +708,7 @@
 pointed to by the
 .Fa obj
 argument, is set to an arbitrary non-zero value, after which the
-.Fa uaddr1
+.Fa uaddr
 mutex is unlocked (following the appropriate protocol), and
 the current thread is put to sleep on the sleep queue keyed by
 the
@@ -651,7 +726,7 @@
 .Dv c_hash_waiters
 member is cleared.
 After wakeup, the
-.Fa uaddr1
+.Fa uaddr
 umutex is not relocked.
 .Pp
 The following flags are defined:
@@ -1084,6 +1159,49 @@
 argument specifies the virtual address, which backing physical memory
 byte identity is used as a key for the anonymous shared object
 creation or lookup.
+.It Dv UMTX_OP_ROBUST_LISTS
+Register the robust list heads for the current thread.
+The arguments to the request are:
+.Bl -tag -width "It Fa obj"
+.It Fa val
+Size of the structure passed in the
+.Fa uaddr
+argument.
+.It Fa uaddr
+Pointer to the structure of type
+.Vt struct umtx_robust_lists_params .
+.El
+.Pp
+The structure is defined as
+.Bd -literal
+struct umtx_robust_lists_params {
+	uintptr_t robust_list_offset;
+	uintptr_t robust_priv_list_offset;
+	uintptr_t robust_inact_offset;
+};
+.Ed
+.Pp
+The
+.Dv robust_list_offset
+member contains the address of the first element in the shared robust
+locked mutexes list.
+The
+.Dv robust_priv_list_offset
+member contains the address of the first element in the private robust
+locked mutexes list.
+The private and shared robust locked lists are split to allow fast
+termination of the shared list on fork, in the child.
+.Pp
+The
+.Dv robust_inact_offset
+member contains a pointer to the mutex which might be locked in the near future,
+or might have been just unlocked.
+It is typically set by the lock or unlock mutex implementation code
+around the whole operation, since the lists can only be changed race-free
+when the thread owns the mutex.
+The kernel inspects the
+.Dv robust_inact_offset
+in addition to walking the shared and private lists.
 .El
 .Sh RETURN VALUES
 If successful,
@@ -1106,7 +1224,7 @@
 The
 .Fn _umtx_op
 operations will return the following errors:
-.Bl -tag -width Er
+.Bl -tag -width "Bq Er ENOTRECOVERABLE"
 .It Bq Er EFAULT
 One of the arguments point to invalid memory.
 .It Bq Er EINVAL
@@ -1145,7 +1263,7 @@
 argument specifies invalid operation.
 .It Bq Er EINVAL
 The
-.Fa uaddr1
+.Fa uaddr
 argument for the
 .Dv UMTX_OP_SHM
 request specifies invalid operation.
@@ -1162,6 +1280,21 @@
 .Dv RTP_PRIO_MAX .
 .It Bq Er EPERM
 Unlock attempted on an object not owned by the current thread.
+.It Bq Er EOWNERDEAD
+The lock was requested on a umutex whose
+.Dv m_owner
+field was set to the
+.Dv UMUTEX_RB_OWNERDEAD
+value, indicating that the owner of the robust mutex terminated.
+The lock was granted to the caller, so this error in fact
+indicates success with additional conditions.
+.It Bq Er ENOTRECOVERABLE
+The lock was requested on a umutex whose
+.Dv m_owner
+field is equal to the
+.Dv UMUTEX_RB_NOTRECOV
+value, indicating a robust mutex abandoned after termination of its owner.
+The lock was not granted to the caller.
 .It Bq Er ENOTTY
 The shared memory object, associated with the address passed to the
 .Dv UMTX_SHM_ALIVE
@@ -1197,7 +1330,7 @@
 A try mutex lock operation was not able to obtain the lock.
 .It Bq Er ETIMEDOUT
 The request specified a timeout in the
-.Fa uaddr1
+.Fa uaddr
 and
 .Fa uaddr2
 arguments, and timed out before obtaining the lock or being woken up.
@@ -1211,6 +1344,27 @@
 The error is typically not returned to userspace code, restart
 is handled by usual adjustment of the instruction counter.
 .El
+.Sh BUGS
+A window between unlocking a robust mutex and resetting the pointer in the
+.Dv robust_inact_offset
+member of the registered
+.Vt struct umtx_robust_lists_params
+allows another thread to destroy the mutex, thus making the kernel inspect
+freed or reused memory.
+The
+.Li libthr
+implementation is only vulnerable to this race when operating on
+a shared mutex.
+A possible fix for the current implementation is to strengthen the checks
+for shared mutexes before terminating them, in particular, verifying
+that the mutex memory is mapped from the POSIX shared object, allocated
+by the
+.Dv UMTX_OP_SHM
+request.
+This is not done because it is believed that the race is adequately
+covered by other consistency checks, while adding the check would
+prevent alternative implementations of
+.Li libpthread .
 .Sh SEE ALSO
 .Xr clock_gettime 2 ,
 .Xr mmap 2 ,
Index: lib/libthr/libthr.3
===================================================================
--- lib/libthr/libthr.3
+++ lib/libthr/libthr.3
@@ -29,7 +29,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd February 12, 2015
+.Dd May 17, 2016
 .Dt LIBTHR 3
 .Os
 .Sh NAME
@@ -167,7 +167,7 @@
 The following environment variables are recognized by
 .Nm
 and adjust the operation of the library at run-time:
-.Bl -tag -width LIBPTHREAD_SPLITSTACK_MAIN
+.Bl -tag -width "Ev LIBPTHREAD_SPLITSTACK_MAIN"
 .It Ev LIBPTHREAD_BIGSTACK_MAIN
 Disables the reduction of the initial thread stack enabled by
 .Ev LIBPTHREAD_SPLITSTACK_MAIN .
@@ -198,7 +198,36 @@
 threads are inserted at the head of the sleep queue, instead of its tail.
 Bigger values reduce the frequency of the FIFO discipline.
 The value must be between 0 and 255.
+.El
+.Pp
+The following
+.Dv sysctl
+MIBs affect the operation of the library:
+.Bl -tag -width "Dv debug.umtx.robust_faults_verbose"
+.It Dv kern.ipc.umtx_vnode_persistent
+By default, a shared lock backed by a memory-mapped file is
+automatically destroyed on the last unmap of the corresponding file's page.
+Setting the sysctl to 1 makes such a shared lock object persist until
+the vnode is recycled by the Virtual File System.
+Note that if the file is neither opened nor mapped, the kernel might
+recycle it at any moment, making this sysctl less useful than it sounds.
+.It Dv kern.ipc.umtx_max_robust
+The maximum number of robust mutexes allowed for one thread.
+The kernel will not unlock more mutexes than specified, see
+.Xr _umtx_op 2
+for more details.
+The default value is large enough for most useful applications.
+.It Dv debug.umtx.robust_faults_verbose
+A non-zero value makes the kernel emit diagnostics when the robust
+mutex unlock was prematurely aborted due to the detection of
+inconsistencies, as a measure to prevent memory corruption.
 .El
+.Pp
+The
+.Dv RLIMIT_UMTXP
+limit (see
+.Xr getrlimit 2 )
+defines how many shared locks a given user may create simultaneously.
 .Sh INTERACTION WITH RUN-TIME LINKER
 On load,
 .Nm
@@ -236,6 +265,12 @@
 .Xr ld-elf.so.1 1 ,
 .Xr getrlimit 2 ,
 .Xr errno 2 ,
+.Xr thr_exit 2 ,
+.Xr thr_kill 2 ,
+.Xr thr_kill2 2 ,
+.Xr thr_new 2 ,
+.Xr thr_self 2 ,
+.Xr thr_set_name 2 ,
 .Xr _umtx_op 2 ,
 .Xr dlclose 3 ,
 .Xr dlopen 3 ,
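
Illustration for the ROBUST UMUTEXES text added to _umtx_op.2: the list walk performed on thread termination can be approximated by the userspace simulation below. This is only a sketch under stated assumptions, not the kernel code. The sim_ and SIM_ names, the numeric values chosen for the owner-dead marker and the robust flag, and the limit constant are all hypothetical stand-ins; only the m_owner, m_flags and m_rb_lnk member names and the use of the highest bit of m_owner as the contention indicator come from the manual page text. The validity checks on the flags combination and on the next pointer are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real constants are defined by the system headers. */
#define SIM_CONTESTED    0x80000000u /* highest bit: contention indicator */
#define SIM_RB_OWNERDEAD 0x40000000u /* assumed value, illustration only */
#define SIM_ROBUST       0x0010u     /* assumed flag bit, illustration only */
#define SIM_MAX_ROBUST   1000        /* plays the role of kern.ipc.umtx_max_robust */

/* Reduced model of struct umutex: only the members used by the walk. */
struct sim_umutex {
	uint32_t  m_owner;
	uint32_t  m_flags;
	uintptr_t m_rb_lnk; /* address of the next mutex, 0 ends the list */
};

/*
 * Walk the singly linked list starting at head on behalf of the
 * terminating thread tid.  Each consistent mutex is marked owner-dead.
 * The walk is aborted at the first mutex that is not robust, is not
 * owned by tid, or lies beyond the length limit.
 * Returns the number of mutexes marked.
 */
static int
sim_walk_robust_list(uintptr_t head, uint32_t tid, int limit)
{
	int marked = 0;

	assert(limit >= 0);
	while (head != 0) {
		struct sim_umutex *m = (struct sim_umutex *)head;

		if (marked >= limit)
			break;	/* list length limit reached */
		if ((m->m_flags & SIM_ROBUST) == 0)
			break;	/* robust flag not set */
		if ((m->m_owner & ~SIM_CONTESTED) != tid)
			break;	/* not owned by the terminating thread */

		head = m->m_rb_lnk;		/* fetch the link before releasing */
		m->m_owner = SIM_RB_OWNERDEAD;	/* written instead of the unowned value */
		marked++;
	}
	return (marked);
}
```

For a list of three mutexes owned by thread 42 where the third lacks the robust flag, the function marks the first two and leaves the third untouched, mirroring the rule that processing is aborted rather than skipped past an inconsistent entry.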