Page MenuHomeFreeBSD

Include the psind in data returned by mincore(2).
ClosedPublic

Authored by markj on Aug 30 2020, 5:44 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mar 3 2024, 8:16 AM
Unknown Object (File)
Jan 13 2024, 12:27 PM
Unknown Object (File)
Jan 11 2024, 3:33 AM
Unknown Object (File)
Jan 6 2024, 6:28 PM
Unknown Object (File)
Jan 6 2024, 6:28 PM
Unknown Object (File)
Jan 6 2024, 6:28 PM
Unknown Object (File)
Jan 6 2024, 6:28 PM
Unknown Object (File)
Jan 6 2024, 6:28 PM

Details

Summary

Currently we use a single bit to indicate whether the virtual page is
part of a superpage. With the introduction of non-transparent 1GB
mappings I think it's useful to use some unused bits in the mincore()
vector to return more precise information. In particular, I wrote some
tests which aim to verify that the large page mapping does what it's
supposed to.

The change turns MINCORE_SUPER into a mask from which the corresponding
psind can be extracted. In this diff only two bits are reserved,
allowing for up to four page sizes, leaving one bit available. I am not
sure whether the last bit should be reserved for this purpose as well;
features like arm64's ATTR_CONTIGUOUS could result in extra page sizes
beyond the ones corresponding to page table levels.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 33242
Build 30581: arc lint + arc unit

Event Timeline

markj requested review of this revision.Aug 30 2020, 5:44 PM

I am not sure about the reuse of the existing MINCORE_SUPER bit for psind. It seemingly break ABI. Of course right now we do not have 1G superpage mappings in userspace, but it is natural to assume that MINCORE_SUPER indicates just superpage, so it should be set when 1G mappings appear. But since 1G psind is 2, we would have (current) MINCORE_SUPER bit clear.

I would take a "wait-and-see" approach to allocating the last available bit. Later adding that bit to the page size field isn't going to break any code written to this proposed 2-bit field.

This revision is now accepted and ready to land.Aug 30 2020, 6:45 PM
In D26238#583120, @kib wrote:

I am not sure about the reuse of the existing MINCORE_SUPER bit for psind. It seemingly break ABI. Of course right now we do not have 1G superpage mappings in userspace, but it is natural to assume that MINCORE_SUPER indicates just superpage, so it should be set when 1G mappings appear. But since 1G psind is 2, we would have (current) MINCORE_SUPER bit clear.

I think it is theoretically possible, but wouldn't such an application would have to both using the old ABI _and_ have some ability to create 1G mappings?

In D26238#583120, @kib wrote:

I am not sure about the reuse of the existing MINCORE_SUPER bit for psind. It seemingly break ABI. Of course right now we do not have 1G superpage mappings in userspace, but it is natural to assume that MINCORE_SUPER indicates just superpage, so it should be set when 1G mappings appear. But since 1G psind is 2, we would have (current) MINCORE_SUPER bit clear.

I would just limit this change to 13.x and not try to MFC it.

Make the static assert conditional on defined(MAXPAGESIZES) as well.

This revision now requires review to proceed.Aug 30 2020, 7:00 PM
lib/libc/sys/mincore.2
83–89

I would deprecate the use of MINCORE_SUPER and note that it is now a bit field.

lib/libc/sys/mincore.2
83–89

Why deprecate? (vec[i] & MINCORE_SUPER) != 0 still tells you that the mapping is part of a superpage. It's a sort of union of all values of MINCORE_PSIND(i) for page size indices i > 0.

Try to clarify the link between MINCORE_PSIND and MINCORE_SUPER.

lib/libc/sys/mincore.2
83–89

Consider (vec[i] & MINCORE_SUPER) == MINCORE_SUPER. While I think that it's more likely that someone writes (vec[i] & MINCORE_SUPER) != 0, the former is now (silently) broken.

In D26238#583120, @kib wrote:

I am not sure about the reuse of the existing MINCORE_SUPER bit for psind. It seemingly break ABI. Of course right now we do not have 1G superpage mappings in userspace, but it is natural to assume that MINCORE_SUPER indicates just superpage, so it should be set when 1G mappings appear. But since 1G psind is 2, we would have (current) MINCORE_SUPER bit clear.

I think it is theoretically possible, but wouldn't such an application would have to both using the old ABI _and_ have some ability to create 1G mappings?

It could be e.g. some old library that does mincore(2), and some other library or main binary that creates 1G superpage, both loaded into same process.

In D26238#583172, @kib wrote:
In D26238#583120, @kib wrote:

I am not sure about the reuse of the existing MINCORE_SUPER bit for psind. It seemingly break ABI. Of course right now we do not have 1G superpage mappings in userspace, but it is natural to assume that MINCORE_SUPER indicates just superpage, so it should be set when 1G mappings appear. But since 1G psind is 2, we would have (current) MINCORE_SUPER bit clear.

I think it is theoretically possible, but wouldn't such an application would have to both using the old ABI _and_ have some ability to create 1G mappings?

It could be e.g. some old library that does mincore(2), and some other library or main binary that creates 1G superpage, both loaded into same process.

I'm not sure this possibility is worth handling. Such a library would most likely be using MINCORE_SUPER (a FreeBSD-specific flag) for informational purposes. If it really seems worth handling I see three options:

  1. add a new mincore() system call, and preserve compatibility in the old one by setting the old MINCORE_SUPER if any bits in the new MINCORE_SUPER are set
  2. keep MINCORE_SUPER and use the remaining two bits for the psind, though this limits us to MAXPAGESIZES <= 4 forever
  3. drop the change

The existence of a library that does anything useful with MINCORE_SUPER suggests that MINCORE_PSIND() may be a useful feature, so I do not think it is a good argument for 3).

Lets make a deal: document in man page that previously MINCORE_SUPER was MINCORE_PSIND(1), and superpage with the index 2 is reported with the old MINCORE_SUPER clear.

sys/sys/mman.h
185

May be move it to vm_mmap.c ? It does not make sense to get 32 compiler asserts when doing parallel build.

markj marked an inline comment as done.
  • Move the static assert to vm_mmap.c.
  • Add a IMPLEMENTATION NOTES section describing the incompatibility.
kib added inline comments.
lib/libc/sys/mincore.2
119

.Dv MINCORE_SUPER

This revision is now accepted and ready to land.Aug 31 2020, 9:02 PM
This revision now requires review to proceed.Aug 31 2020, 10:42 PM
This revision is now accepted and ready to land.Sep 22 2020, 12:43 PM