lib/libc/x86/sys/pkru.3
- This file was added.
.\" Copyright (c) 2019 The FreeBSD Foundation, Inc. | |||||
.\" All rights reserved. | |||||
.\" | |||||
0mp: //All rights reserved// is usually omitted in recent commits. | |||||
.\" This documentation was written by | |||||
.\" Konstantin Belousov <kib@FreeBSD.org> under sponsorship | |||||
.\" from the FreeBSD Foundation. | |||||
.\" | |||||
.\" Redistribution and use in source and binary forms, with or without | |||||
.\" modification, are permitted provided that the following conditions | |||||
.\" are met: | |||||
.\" 1. Redistributions of source code must retain the above copyright | |||||
.\" notice, this list of conditions and the following disclaimer. | |||||
.\" 2. Redistributions in binary form must reproduce the above copyright | |||||
.\" notice, this list of conditions and the following disclaimer in the | |||||
.\" documentation and/or other materials provided with the distribution. | |||||
.\" | |||||
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND | |||||
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | |||||
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE | |||||
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE | |||||
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | |||||
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS | |||||
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) | |||||
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | |||||
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY | |||||
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF | |||||
.\" SUCH DAMAGE. | |||||
.\" | |||||
.\" $FreeBSD$ | |||||
.\" | |||||
.Dd February 16, 2019 | |||||
.Dt PKRU 3 | |||||
.Os | |||||
.Sh NAME | |||||
.Nm Protection Key Rights for User pages | |||||
.Nd provide fast user-managed key-based access control for pages | |||||
Done Inline ActionsIn this place, the 's' on the end of 'Keys' makes my ears hurt. :-) "Protection Key Rights" would be better. alc: In this place, the 's' on the end of 'Keys' makes my ears hurt. :-) "Protection Key Rights"… | |||||
.Sh LIBRARY | |||||
.Lb libc | |||||
.Sh SYNOPSIS | |||||
.In machine/sysarch.h | |||||
.Ft int | |||||
.Fn x86_pkru_get_perm "unsigned int keyidx" "int *access" "int *modify" | |||||
.Ft int | |||||
.Fn x86_pkru_set_perm "unsigned int keyidx" "int access" "int modify" | |||||
.Ft int | |||||
.Fo x86_pkru_protect_range | |||||
.Fa "void *addr" | |||||
.Fa "unsigned long len" | |||||
.Fa "unsigned int keyidx" | |||||
.Fa "int flags"
.Fc | |||||
.Ft int | |||||
.Fn x86_pkru_unprotect_range "void *addr" "unsigned long len" | |||||
.Sh DESCRIPTION | |||||
The protection keys feature provides an additional mechanism, besides the | |||||
normal page permissions as established by | |||||
Done Inline ActionsThe dash here isn't needed. markj: The dash here isn't needed. | |||||
.Xr mmap 2 | |||||
and | |||||
.Xr mprotect 2 , | |||||
to control access to user-mode addresses. | |||||
The mechanism gives safety measures which can be used to avoid | |||||
Done Inline Actionsaccess pho: access | |||||
incidental read or modification of sensitive memory, | |||||
Done Inline ActionsEnabled by what? markj: Enabled by what? | |||||
Done Inline ActionsBy user. More precisely, 'not disabled'. I removed the 'enabled' part. kib: By user. More precisely, 'not disabled'. I removed the 'enabled' part. | |||||
or as a debugging feature. | |||||
Done Inline Actionsan associated 4-bit protection key jilles: **an** associated 4-bit protection key | |||||
It cannot guard against conscious accesses since permissions | |||||
Done Inline ActionsA new per-thread PKRU hardware register determines jilles: **A** new per-thread PKRU hardware register determines | |||||
are user-controllable. | |||||
Done Inline Actionsfor each protection key, whether user-mode addresses with that protection key jilles: for each protection **key**, whether user-mode addresses with that protection **key** | |||||
Done Inline ActionsNo need for a comma after "key". markj: No need for a comma after "key". | |||||
.Pp | |||||
If supported by hardware, each mapped user linear address | |||||
has an associated 4-bit protection key. | |||||
A new per-thread PKRU hardware register determines, for each protection | |||||
key, whether user-mode addresses with that protection key may be | |||||
read or written. | |||||
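The per-key rights held in the PKRU register can be illustrated with a small decoder. This is only a sketch: the bit layout (bit 2*k disables access, bit 2*k+1 disables writes for key k) is taken from the Intel Software Developer's Manual rather than from this page, and `pkru_decode` is a hypothetical helper, not a libc function. It reports `modify` only when access is also permitted, which is a choice made for this sketch.

```c
#include <stdint.h>

/* A 4-bit key index gives 16 protection keys. */
#define PKRU_NKEYS 16u

/*
 * Decode the rights for one key from a raw PKRU value.
 * Assumed layout (Intel SDM): bit 2*k is access-disable,
 * bit 2*k+1 is write-disable.
 * Hypothetical helper for illustration; not part of libc.
 * Returns 0 on success, -1 if keyidx is out of range.
 */
static int
pkru_decode(uint32_t pkru, unsigned int keyidx, int *access, int *modify)
{
	uint32_t ad, wd;

	if (keyidx >= PKRU_NKEYS)
		return (-1);
	ad = (pkru >> (2 * keyidx)) & 1u;
	wd = (pkru >> (2 * keyidx + 1)) & 1u;
	*access = (ad == 0);			/* reads allowed */
	*modify = (ad == 0 && wd == 0);		/* writes allowed */
	return (0);
}
```

With a PKRU value of zero every key permits both reads and writes; setting only the write-disable bit for a key leaves its pages readable but not writable.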
.Pp | |||||
Only one key may apply to a given range at a time. | |||||
The default protection key index is zero; it is used even if no key
Done Inline Actions"... it returns the EFAULT..." markj: "... it returns the EFAULT..." | |||||
was explicitly assigned to the address, or if the key was removed. | |||||
.Pp | |||||
Not Done Inline ActionsPerhaps it should be mentioned that this EFAULT is not always a "proper" error in the sense that a failed system call may have been partially successful. For example, if sigaction(2) could not write to *oact, it will return the EFAULT error code even if the signal disposition was updated. It is best for applications to pass pointers that may cause protection faults only for data buffers for read, write and the like, where the error can be properly reported (either as immediate EFAULT or as an incomplete transfer). If the EFAULT is used as a trigger to abort the process, the above is not a problem. jilles: Perhaps it should be mentioned that this `EFAULT` is not always a "proper" error in the sense… | |||||
Done Inline ActionsI am not sure what to do about this note and even how to interpret it. If taken literally, I have to note that sigaction(2) would fail in the same manner if oact is invalid but non-NULL pointer. So really the behavior of the syscalls is same as if passed pointer was invalid (but not NULL) or valid but readonly. In the corner case, it might be even a race where initially usable pointer now points to unusable (e.g. not writable) page. So I do not think that this additional permission check changes potential syscalls outcome. What it changes is that it makes easier to make useracc(9) useless, but this falls under the already acceptable behavior. kib: I am not sure what to do about this note and even how to interpret it. If taken literally, I… | |||||
Done Inline ActionsMy point is that while you can fix up userspace accesses by catching SIGSEGV, remapping the faulting address and returning from the signal handler, there is no way to do something similar for system calls in the general case. Reading the standards strictly, it is undefined behaviour to pass a pointer to a system function that does not permit the required access (where NULL is also valid if documented for the function). PKRU does not change this, but makes it more likely to trigger. Perhaps a sentence "Note that some side effects may have occurred if this error is reported." could be added. jilles: My point is that while you can fix up userspace accesses by catching `SIGSEGV`, remapping the… | |||||
Not Done Inline ActionsI wanted to highlight this sentence. My actual question is in my next comment. alc: I wanted to highlight this sentence. My actual question is in my next comment. | |||||
The protection prevents the system, as well as user applications, from
accessing protected user addresses.
Done Inline Actions"that the system" markj: "that the system" | |||||
When a system call is unable to read or write user memory due to key
Done Inline Actions"only available on amd64 systems"? markj: "only available on amd64 systems"? | |||||
Done Inline ActionsI mean that user might run i386 on the same machine. kib: I mean that user might run i386 on the same machine. | |||||
protection, it returns the | |||||
Done Inline ActionsUsually written "64-bit" and "32-bit". markj: Usually written "64-bit" and "32-bit". | |||||
Done Inline ActionsI think that 'AKA' is too informal for man pages. alc: I think that 'AKA' is too informal for man pages. | |||||
.Er EFAULT | |||||
error code. | |||||
Note that some side effects may have occurred if this error is reported. | |||||
.Pp | |||||
Done Inline Actions"The key indexes" markj: "The key indexes" | |||||
Done Inline ActionsShould be "Developer's" markj: Should be "Developer's" | |||||
Protection keys require that the system uses 4-level paging | |||||
(also called long mode), | |||||
which means that they are only available on amd64 systems.
Done Inline Actions"managed using the user-mode instructions" markj: "managed using the user-mode instructions" | |||||
Both 64-bit and 32-bit applications can use protection keys. | |||||
More information about the hardware feature is provided in the
Intel 64 and IA-32 Architectures Software Developer's Manual
published by Intel Corp.
.Pp | |||||
Done Inline ActionsThe system provides jilles: **The** system provides | |||||
The key indexes written into the page table entries are managed by the | |||||
.Xr sysarch 2
Done Inline Actions"for both" markj: "for both" | |||||
syscall. | |||||
Per-key permissions are managed using the user-mode instructions | |||||
Done Inline Actions"The x86_pkru_protect_range function assigns..." markj: "The x86_pkru_protect_range function assigns..." | |||||
.Em RDPKRU | |||||
and | |||||
.Em WRPKRU .
Done Inline Actions"assigns key keyidx to..." markj: "assigns key keyidx to..." | |||||
The system provides convenient library helpers for both the syscall and | |||||
the instructions, described below. | |||||
Done Inline Actions"having length len." markj: "having length len." | |||||
.Pp | |||||
Done Inline Actions.Fa len . (missing space) 0mp: `.Fa len .` (missing space) | |||||
The | |||||
.Fn x86_pkru_protect_range | |||||
Done Inline ActionsIt should probably be explained that only one key may apply to a given range at a time. If _EXCL is not specified, x86_pkru_protect_range() will replace any existing key. markj: It should probably be explained that only one key may apply to a given range at a time. If… | |||||
function assigns key | |||||
.Fa keyidx | |||||
to the range starting at | |||||
.Fa addr | |||||
and having length | |||||
.Fa len . | |||||
The starting address is truncated to the page start,
Done Inline Actions"the zero key" markj: "the zero key" | |||||
and the end is rounded up to the end of the page. | |||||
Done Inline Actions"You must first remove any existing key with x86_pkru_unprotect_range() in order for this request to succeed. markj: "You must first remove any existing key with x86_pkru_unprotect_range() in order for this… | |||||
After a successful call, the range has the specified key assigned,
alcUnsubmitted Not Done Inline Actions"After a successful call, ..." alc: "After a successful call, ..." | |||||
even if the key is zero and the call did not change the page table entries.
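The rounding described above can be sketched as follows. The 4096-byte page size and the names `SKETCH_PAGE_SIZE`, `struct page_range` and `round_to_pages` are assumptions made for illustration; real code would query the page size, for example with getpagesize(3), and the kernel performs this rounding itself.

```c
#include <stdint.h>

/* Assumed page size for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Page-aligned range; end is exclusive. */
struct page_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Truncate the start to its page boundary and round the end
 * up to the next page boundary.
 * Hypothetical helper, not part of the library.
 */
static struct page_range
round_to_pages(uintptr_t addr, unsigned long len)
{
	uintptr_t mask = (uintptr_t)SKETCH_PAGE_SIZE - 1;
	struct page_range r;

	r.start = addr & ~mask;
	r.end = (addr + len + mask) & ~mask;
	return (r);
}
```

A request that starts or ends in the middle of a page therefore affects the whole of that page, so unrelated data sharing the page receives the same key.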
alcUnsubmitted Not Done Inline Actions"changed" -> "change" alc: "changed" -> "change" | |||||
.Pp | |||||
The | |||||
.Fa flags | |||||
argument takes the logical OR of the following values: | |||||
.Bl -tag -width AMD64_PKRU_PERSIST
Done Inline Actions"You must use a x86_pkru_unprotect_range call" markj: "You must use a x86_pkru_unprotect_range call" | |||||
.It Bq Va AMD64_PKRU_EXCL | |||||
Not Done Inline ActionsThe earlier sentence that I highlighted could be interpreted as any address range has a initial, default key value of zero. But AMD64_PKRU_EXCL fails even for a zero key, so such an operation could never succeed. I assume that this is not the intended interpretation, i.e., that there can be address ranges with no associated key. Yes? alc: The earlier sentence that I highlighted could be interpreted as any address range has a initial… | |||||
Done Inline ActionsWhen PKRU is enabled, hardware requires that any valid pte has a key assigned. By default, the key zero is installed. There is a separate, software managed key, that can be in two states: assigned for a range, or not. If assigned, it can be removed, and then the state would be not assigned (in ptes, zero key is installed then). For _EXCL to succeed, there must be no sw key assigned to any page in the range. kib: When PKRU is enabled, hardware requires that any valid pte has a key assigned. By default, the… | |||||
Not Done Inline ActionsThis distinction between the hardware keys and the software managed keys doesn't really appear in the text, at least explicitly. Can you please add some text? alc: This distinction between the hardware keys and the software managed keys doesn't really appear… | |||||
Only assign the key if the range does not have any other keys assigned | |||||
(including the zero key). | |||||
You must first remove any existing key with | |||||
.Fn x86_pkru_unprotect_range | |||||
in order for this request to succeed. | |||||
If the | |||||
Done Inline ActionsMissing "function" after the function name. markj: Missing "function" after the function name. | |||||
.Va AMD64_PKRU_EXCL | |||||
flag is not specified, | |||||
.Fn x86_pkru_protect_range | |||||
replaces any existing key. | |||||
.It Bq Va AMD64_PKRU_PERSIST | |||||
The keys assigned to the range are persistent. | |||||
They are re-established when the current mapping is destroyed | |||||
and a new mapping is created in any sub-range of the specified range. | |||||
You must use a | |||||
.Fn x86_pkru_unprotect_range | |||||
call to forget the key. | |||||
.El | |||||
Done Inline Actionss/call/function/ markj: s/call/function/ | |||||
.Pp | |||||
The | |||||
.Fn x86_pkru_unprotect_range | |||||
Done Inline ActionsI think you can s/at least the //. markj: I think you can s/at least the //. | |||||
function removes any keys assigned to the specified range. | |||||
Done Inline Actions"the variable" markj: "the variable" | |||||
Existing mappings are changed to use key index zero in page table entries. | |||||
Keys are no longer considered installed for all mappings in the range, | |||||
Done Inline Actions"indicates that write access" markj: "indicates that write access" | |||||
for the purposes of | |||||
.Fn x86_pkru_protect_range | |||||
with the | |||||
.Va AMD64_PKRU_EXCL | |||||
flag. | |||||
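The interaction between exclusive assignment and key removal can be pictured with a toy user-space model. It is not the kernel implementation: the page table is a small array, `MODEL_EXCL` is a made-up stand-in for `AMD64_PKRU_EXCL`, and the functions return the error number directly, whereas the library functions return -1 and set errno. The model keeps an "assigned" state separate from the key index, following the explanation given in the review comments.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES	8
#define MODEL_EXCL	0x1	/* made-up value, stands in for AMD64_PKRU_EXCL */

/* Per-page software state: whether a key was assigned, and which one. */
struct page_key {
	bool assigned;
	unsigned int keyidx;
};

static struct page_key model[MODEL_NPAGES];

/* Toy counterpart of x86_pkru_protect_range(); returns 0 or an errno value. */
static int
model_protect(size_t first, size_t npages, unsigned int keyidx, int flags)
{
	size_t i;

	if (keyidx > 15)
		return (EINVAL);
	if (first > MODEL_NPAGES || npages > MODEL_NPAGES - first)
		return (EFAULT);
	if ((flags & MODEL_EXCL) != 0) {
		/* Any assigned key, including key zero, blocks the request. */
		for (i = first; i < first + npages; i++)
			if (model[i].assigned)
				return (EBUSY);
	}
	for (i = first; i < first + npages; i++) {
		model[i].assigned = true;
		model[i].keyidx = keyidx;
	}
	return (0);
}

/* Toy counterpart of x86_pkru_unprotect_range(). */
static int
model_unprotect(size_t first, size_t npages)
{
	size_t i;

	if (first > MODEL_NPAGES || npages > MODEL_NPAGES - first)
		return (EFAULT);
	for (i = first; i < first + npages; i++) {
		model[i].assigned = false;
		model[i].keyidx = 0;	/* page table entries fall back to key zero */
	}
	return (0);
}
```

In this model an explicit assignment of key zero still makes a later exclusive request fail, while a range that was never assigned, or was unprotected, accepts it.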
.Pp | |||||
The | |||||
.Fn x86_pkru_get_perm | |||||
function returns access rights for the key specified by the | |||||
Done Inline Actions"Conversely, ..." alc: "Conversely, ..." | |||||
.Fa keyidx
argument. | |||||
If the value pointed to by | |||||
.Fa access | |||||
is zero after the call, no read or write permissions are granted for
mappings which are assigned the key
.Fa keyidx .
If | |||||
.Fa access | |||||
is not zero, read access is permitted. | |||||
A non-zero value of the variable pointed to by the
.Fa modify | |||||
argument indicates that write access is permitted. | |||||
.Pp | |||||
Conversely, the | |||||
Done Inline Actions"completion of the operation," or "the operation's completion" Is it still possible for these calls to fail with ENOMEM? markj: "completion of the operation," or "the operation's completion"
Is it still possible for these… | |||||
Done Inline ActionsI want to document the interface. Not returning ENOMEM is actually an impl detail, e.g. we can start enforcing the limit (and probably should). kib: I want to document the interface. Not returning ENOMEM is actually an impl detail, e.g. we can… | |||||
Done Inline ActionsOk. markj: Ok. | |||||
.Fn x86_pkru_set_perm
function establishes the access and modify permissions for the given key index
as specified by its arguments. | |||||
.Sh RETURN VALUES | |||||
.Rv -std | |||||
.Sh ERRORS | |||||
.Bl -tag -width Er | |||||
.It Bq Er EOPNOTSUPP | |||||
The hardware does not support protection keys. | |||||
.It Bq Er EINVAL | |||||
The supplied key index is invalid (greater than 15). | |||||
.It Bq Er EINVAL | |||||
The supplied | |||||
.Fa flags | |||||
argument for | |||||
Done Inline ActionsThis expands to nothing, did you mean .Nm? markj: This expands to nothing, did you mean .Nm? | |||||
.Fn x86_pkru_protect_range | |||||
has reserved bits set. | |||||
.It Bq Er EFAULT | |||||
The supplied address range does not completely fit into the user-managed | |||||
address range. | |||||
.It Bq Er ENOMEM | |||||
A memory shortage prevented the completion of the operation.
.It Bq Er EBUSY | |||||
The | |||||
.Va AMD64_PKRU_EXCL | |||||
flag was specified for | |||||
.Fn x86_pkru_protect_range | |||||
and the range already has defined protection keys. | |||||
.El | |||||
.Sh SEE ALSO | |||||
.Xr mmap 2 , | |||||
.Xr mprotect 2 , | |||||
.Xr munmap 2 , | |||||
.Xr sysarch 2
.Sh STANDARDS | |||||
The | |||||
.Nm | |||||
functions are non-standard and first appeared in | |||||
.Fx 13.0 . |