
Add hw.machine_real_arch
Abandoned, Public

Authored by mjg on Mar 5 2022, 9:14 PM.

Details

Reviewers
None
Summary

For some reason the Go runtime insists on finding out the real architecture (that is, when running as a 32-bit process it still wants to know whether the kernel is 64-bit). The available sysctls lie about it, so it resorts to parsing kern.conftxt(!), see here:
https://github.com/golang/go/blob/master/src/syscall/route_freebsd.go#L10

Thus this patch adds hw.machine_real_arch which they can query instead.

I want to get this in time for 13.1.

I don't have strong opinions about the name, apart from it making clear that the result is not faked.

Test Plan

Got an i386 base.txz; the new sysctl reports amd64:

# sysctl hw. | grep machine
hw.machine: i386
hw.machine_arch: i386
hw.machine_real_arch: amd64

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

mjg requested review of this revision. Mar 5 2022, 9:14 PM
mjg edited the test plan for this revision. (Show Details)

I'd suggest the name 'kernel_arch' because that's what this is returning. But the current code is really returning the name of the kernel directory used to compile code with.

Also, qemu-bsd-user is going to have to tell the same 'story' as this.

sys/kern/kern_mib.c:322

MACHINE_ARCH is wrong here. It should be MACHINE if you want it to match what 'machine foo' gives you in the kern.conftxt.
This will be the same on amd64, but different on both riscv and aarch64.

Also, what will 'go' use on older systems if we rush this into 13.1?

But the root cause here seems to be go's desire to send routing socket messages, which it magically does weird things to get. FreeBSD's routing socket interface should do the right thing w/o go having to do https://github.com/golang/go/commit/69275eef to fake them up.
Won't this keep coming up with different interfaces? Shouldn't we fix the reason they want to know this instead and make routing sockets work properly in compat32 mode?

jrtc27 added inline comments.
sys/kern/kern_mib.c:322

Except that's pretty useless on PowerPC: MACHINE is always powerpc for both 32-bit and 64-bit, so a 32-bit process can't tell whether it's on a 32-bit kernel or on a 64-bit one via freebsd32. The same would be true if we had a riscv32 (both it and riscv64 would be MACHINE=riscv), and downstream for all CHERI architectures we leave MACHINE alone but change MACHINE_ARCH. The machine config directive takes both MACHINE and MACHINE_ARCH as arguments, with the latter defaulting to the former if not specified.
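
To make the ambiguity concrete, here is a minimal sketch in Go. The table is a partial, hand-written list of MACHINE/MACHINE_ARCH pairs for illustration only, and the names target, targets and ambiguousMachines are made up for this example. It reports which MACHINE values are shared by more than one MACHINE_ARCH, which is exactly the case where MACHINE alone cannot tell a 32-bit process what kernel it is on.

```go
package main

import (
	"fmt"
	"sort"
)

// target pairs a MACHINE value with one MACHINE_ARCH built for it.
type target struct{ machine, machineArch string }

// targets is a partial, illustrative list; it is not read from any system.
var targets = []target{
	{"amd64", "amd64"},
	{"i386", "i386"},
	{"arm64", "aarch64"},
	{"riscv", "riscv64"},
	{"powerpc", "powerpc"},
	{"powerpc", "powerpc64"},
}

// ambiguousMachines lists MACHINE values shared by more than one MACHINE_ARCH.
func ambiguousMachines(ts []target) []string {
	archs := make(map[string]map[string]bool)
	for _, t := range ts {
		if archs[t.machine] == nil {
			archs[t.machine] = make(map[string]bool)
		}
		archs[t.machine][t.machineArch] = true
	}
	var out []string
	for m, set := range archs {
		if len(set) > 1 {
			out = append(out, m)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(ambiguousMachines(targets))
}
```

With this table only powerpc is flagged; a hypothetical riscv32 entry would flag riscv as well.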

In D34457#780713, @imp wrote:

But the root cause here seems to be go's desire to send routing socket messages, which it magically does weird things to get. FreeBSD's routing socket interface should do the right thing w/o go having to do https://github.com/golang/go/commit/69275eef to fake them up.
Won't this keep coming up with different interfaces? Shouldn't we fix the reason they want to know this instead and make routing sockets work properly in compat32 mode?

Yes...

Looking at the go code, there are several problems:

  • It only ever removes the first whitespace character
  • It doesn't cope with powerpc and other architectures that have multiple values here
  • It only ever tests for amd64 (but arm/aarch64 likely has a similar problem)
  • It only uses it for routing sockets, which arguably shouldn't present different ABIs between 32-bit processes running on a 32-bit kernel and on a 64-bit kernel
  • kern.conftxt is optional. Kernels aren't required to have it and we have options to remove it.
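
For illustration, the first two parsing issues in the list above could be avoided with something like the following sketch. It is not the Go runtime's code; machineFromConftxt and the sample text are hypothetical. It splits the machine line on any run of whitespace, returns every value on that line, and returns nil when there is no such line (for example because kern.conftxt is absent or stripped).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// machineFromConftxt returns the values that follow the "machine" keyword in
// a kernel config text, e.g. ["amd64"] or ["powerpc", "powerpc64"].
// It returns nil when no machine line is present.
func machineFromConftxt(conftxt string) []string {
	sc := bufio.NewScanner(strings.NewReader(conftxt))
	for sc.Scan() {
		// Fields splits on any run of spaces or tabs, not just the first one.
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "machine" {
			return fields[1:]
		}
	}
	return nil
}

func main() {
	// Made-up sample input with mixed whitespace and two values.
	sample := "ident\tGENERIC\nmachine \t powerpc   powerpc64\n"
	fmt.Println(machineFromConftxt(sample))
}
```

This still leaves the caller guessing when the result is nil, which is the part no amount of parsing can fix.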

Would it make more sense to publish the alignment requirement more directly?

Not saying I'm opposed to these things, but once we have them, and once the outside world uses them for certain things, it's a lot harder to change our minds should they prove to be a worse mismatch.

sys/kern/kern_mib.c:322

This is a good point. I think we may need to publish both values. It isn't the 'real arch' per se, but the kernel and kernel arch, and we should let others use that to work around problems like the routing sockets having different ABIs between 32-bit on 32-bit and 32-bit on 64-bit. The current values aren't lies, but reflections of the current ABI. These would augment that when there's a bug like this.

The network problem of course has to be fixed, but what is going to happen next time around when they find there is a missing compat? As is, they will be forced to go back to parsing the config which is slow, nasty and not guaranteed to work. The sysctl at hand is supposed to provide a reliable way to find out if they need to hack around, hence I stand by the intent in this proposal.

Given the mentioned problems with the macro used, perhaps this can instead export just whether the kernel is 32/64/whatever bits? E.g., hw.machine_real_bits.

I strongly dislike 'kernel_arch' (or 'kernel_bits' for that matter) as it does not provide any reason why it can't be faked the way hw.machine already is. I don't insist on 'real' being somewhere in the name, but *something* indicating the value is not faked definitely has to be.

Short term I planned to simply provide them with a query of the proposed sysctl, and should it return ENOENT, falling back to what they have right now. Once the oldest supported release also has the sysctl, the fallback can get removed.
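
A rough sketch of that short-term plan, in Go. The lookup is passed in as a function so the sketch runs anywhere; on FreeBSD one would presumably pass something like syscall.Sysctl. The names sysctlFunc, kernelArch and the fake helpers are hypothetical, and hw.machine_real_arch is only the name proposed in this revision.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// sysctlFunc abstracts a string sysctl lookup by name.
type sysctlFunc func(name string) (string, error)

// kernelArch queries the proposed sysctl first. If the kernel does not know
// the name (ENOENT), it falls back to the legacy detection. Any other error
// is returned to the caller unchanged.
func kernelArch(lookup sysctlFunc, legacy func() (string, error)) (string, error) {
	arch, err := lookup("hw.machine_real_arch")
	if err == nil {
		return arch, nil
	}
	if errors.Is(err, syscall.ENOENT) {
		return legacy()
	}
	return "", err
}

// Fakes standing in for an old kernel, a patched kernel, and the legacy path.
func fakeOldKernel(string) (string, error) { return "", syscall.ENOENT }
func fakeNewKernel(string) (string, error) { return "amd64", nil }
func fakeLegacy() (string, error)          { return "amd64-from-conftxt", nil }

func main() {
	fmt.Println(kernelArch(fakeOldKernel, fakeLegacy))
	fmt.Println(kernelArch(fakeNewKernel, fakeLegacy))
}
```

Once the oldest supported release has the sysctl, the legacy argument and the ENOENT branch can simply be deleted.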

In D34457#780719, @mjg wrote:

The network problem of course has to be fixed, but what is going to happen next time around when they find there is a missing compat? As is, they will be forced to go back to parsing the config which is slow, nasty and not guaranteed to work. The sysctl at hand is supposed to provide a reliable way to find out if they need to hack around, hence I stand by the intent in this proposal.

Agreed.

Given the mentioned problems with the macro used, perhaps this can instead export just whether the kernel is 32/64/whatever bits? E.g., hw.machine_real_bits.

I'm not sure that provides enough information to be always useful, but it is an interesting notion.

I strongly dislike 'kernel_arch' (or 'kernel_bits' for that matter) as it does not provide any reason why it can't be faked the way hw.machine already is. I don't insist on 'real' being somewhere in the name, but *something* indicating the value is not faked definitely has to be.

I'd say it isn't faked today. It returns the proper value for the ABI for the process. That's its definition, one that was settled ages ago. It's what the vast majority of software expect for these values. The go folks misunderstand its definition, though.

I think hw.kernel.machine and hw.kernel.machine_arch are my recommendations for names. Their definition is for what the kernel we're running under is. Properly defining them would keep them from the alteration you are worried about: it's always the kernel's ABI as opposed to the process' ABI. And it would also cover all bases in the !x86 world should additional workarounds be needed there.

But since we're rolling a 'custom' thing for them, wouldn't it be better to also return the routing socket alignment requirements? If that's provided, use that; otherwise fall back to the heuristic that's being used now. That would solve the problem on all platforms, and would also allow us to fix the alignment bug for 32-on-64 and just change what this returns. I think we should also have the other two sysctls I'm recommending above for other workarounds not covered by that. The go fix could look for this new sysctl (or sysctls, since there's two places this test is made), use that directly and fall back to the heuristic they use now until the need for it ages out, like you suggest below.

Short term I planned to simply provide them with a query of the proposed sysctl, and should it return ENOENT, falling back to what they have right now. Once the oldest supported release also has the sysctl, the fallback can get removed.

I like that plan.
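
To show what publishing the alignment requirement would buy a consumer, here is a small Go sketch. It assumes routing messages pad each socket address up to a power-of-two alignment, with a zero-length address still occupying one alignment unit. No sysctl for this exists in the revision, so the published value is just a parameter; roundup and chooseAlign are hypothetical names.

```go
package main

import "fmt"

// roundup pads length n to a multiple of align, which must be a power of two.
// A zero-length address is assumed to still take up one alignment unit.
func roundup(n, align int) int {
	if n == 0 {
		return align
	}
	return (n + align - 1) &^ (align - 1)
}

// chooseAlign prefers an alignment published by the kernel and uses the
// guessed value only when nothing was published (published <= 0).
func chooseAlign(published, guessed int) int {
	if published > 0 {
		return published
	}
	return guessed
}

func main() {
	// A 20-byte address pads differently under 4-byte and 8-byte alignment.
	fmt.Println(roundup(20, chooseAlign(0, 4)), roundup(20, chooseAlign(8, 4)))
}
```

The point of the design is that the consumer stops inferring the alignment from the architecture name; if the kernel side is ever fixed, only the published number changes.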

In D34457#780720, @imp wrote:
In D34457#780719, @mjg wrote:

The network problem of course has to be fixed, but what is going to happen next time around when they find there is a missing compat? As is, they will be forced to go back to parsing the config which is slow, nasty and not guaranteed to work. The sysctl at hand is supposed to provide a reliable way to find out if they need to hack around, hence I stand by the intent in this proposal.

Agreed.

Given the mentioned problems with the macro used, perhaps this can instead export just whether the kernel is 32/64/whatever bits? E.g., hw.machine_real_bits.

I'm not sure that provides enough information to be always useful, but it is an interesting notion.

I strongly dislike 'kernel_arch' (or 'kernel_bits' for that matter) as it does not provide any reason why it can't be faked the way hw.machine already is. I don't insist on 'real' being somewhere in the name, but *something* indicating the value is not faked definitely has to be.

I'd say it isn't faked today. It returns the proper value for the ABI for the process. That's its definition, one that was settled ages ago. It's what the vast majority of software expect for these values. The go folks misunderstand its definition, though.

It does not matter what you call it; the crux is that it does not provide the "real" state of things inside the kernel.

I think hw.kernel.machine and hw.kernel.machine_arch are my recommendations for names. Their definition is for what the kernel we're running under is. Properly defining them would keep them from the alteration you are worried about: it's always the kernel's ABI as opposed to the process' ABI. And it would also cover all bases in the !x86 world should additional workarounds be needed there.

So how exactly would you define them so that it is apparent they are not "augmented" for the calling process?

But since we're rolling a 'custom' thing for them, wouldn't it be better to also return the routing socket alignment requirements? If that's provided, use that; otherwise fall back to the heuristic that's being used now. That would solve the problem on all platforms, and would also allow us to fix the alignment bug for 32-on-64 and just change what this returns. I think we should also have the other two sysctls I'm recommending above for other workarounds not covered by that. The go fix could look for this new sysctl (or sysctls, since there's two places this test is made), use that directly and fall back to the heuristic they use now until the need for it ages out, like you suggest below.

The routing socket stuff, if it ever gets fixed, will need new sysctls/ioctls to be used by 32-bit processes, as otherwise it would break existing binaries. Consequently I don't think exporting an alignment requirement specifically for it helps here.

Short term I planned to simply provide them with a query of the proposed sysctl, and should it return ENOENT, falling back to what they have right now. Once the oldest supported release also has the sysctl, the fallback can get removed.

I like that plan.

In D34457#782059, @mjg wrote:
In D34457#780720, @imp wrote:
In D34457#780719, @mjg wrote:

The network problem of course has to be fixed, but what is going to happen next time around when they find there is a missing compat? As is, they will be forced to go back to parsing the config which is slow, nasty and not guaranteed to work. The sysctl at hand is supposed to provide a reliable way to find out if they need to hack around, hence I stand by the intent in this proposal.

Agreed.

Given the mentioned problems with the macro used, perhaps this can instead export just whether the kernel is 32/64/whatever bits? E.g., hw.machine_real_bits.

I'm not sure that provides enough information to be always useful, but it is an interesting notion.

I strongly dislike 'kernel_arch' (or 'kernel_bits' for that matter) as it does not provide any reason why it can't be faked the way hw.machine already is. I don't insist on 'real' being somewhere in the name, but *something* indicating the value is not faked definitely has to be.

I'd say it isn't faked today. It returns the proper value for the ABI for the process. That's its definition, one that was settled ages ago. It's what the vast majority of software expect for these values. The go folks misunderstand its definition, though.

It does not matter what you call it; the crux is that it does not provide the "real" state of things inside the kernel.

It does matter. The ABI dictates these things, not the kernel it happens to run on. The real state of a QEMU user-mode kernel could be amd64 for an armv7 binary for such a guest. Does such a guest need to do the alignment changes to work around the bug or not?

I think hw.kernel.machine and hw.kernel.machine_arch are my recommendations for names. Their definition is for what the kernel we're running under is. Properly defining them would keep them from the alteration you are worried about: it's always the kernel's ABI as opposed to the process' ABI. And it would also cover all bases in the !x86 world should additional workarounds be needed there.

So how exactly would you define them so that it is apparent they are not "augmented" for the calling process?

The whole reason you need to know this is because the ABI isn't quite what the contract promised due to a bug in emulation and the program needs to second guess things to work around that bug. It is not a normal thing that needs to be known by a binary.

But since we're rolling a 'custom' thing for them, wouldn't it be better to also return the routing socket alignment requirements? If that's provided, use that; otherwise fall back to the heuristic that's being used now. That would solve the problem on all platforms, and would also allow us to fix the alignment bug for 32-on-64 and just change what this returns. I think we should also have the other two sysctls I'm recommending above for other workarounds not covered by that. The go fix could look for this new sysctl (or sysctls, since there's two places this test is made), use that directly and fall back to the heuristic they use now until the need for it ages out, like you suggest below.

The routing socket stuff, if it ever gets fixed, will need new sysctls/ioctls to be used by 32-bit processes, as otherwise it would break existing binaries. Consequently I don't think exporting an alignment requirement specifically for it helps here.

The current existing binaries are broken when run on 64-bit kernels. However, programs that work around them would break given how we implement compatibility shims in general. I will agree that future fixes are speculative in this area though. Having an explicit alignment requirement will provide for better workarounds than are possible with the current guessing setup, and the proposed improved guessing setup.

Short term I planned to simply provide them with a query of the proposed sysctl, and should it return ENOENT, falling back to what they have right now. Once the oldest supported release also has the sysctl, the fallback can get removed.

I like that plan.

Of course the current ABI is broken. The point is that binaries which put effort into working around it are going to stop working if it gets implemented the way it should have been from the get go, thus it can't be done in a way which affects them.

In D34457#782070, @imp wrote:
In D34457#782059, @mjg wrote:
In D34457#780720, @imp wrote:
In D34457#780719, @mjg wrote:

The network problem of course has to be fixed, but what is going to happen next time around when they find there is a missing compat? As is, they will be forced to go back to parsing the config which is slow, nasty and not guaranteed to work. The sysctl at hand is supposed to provide a reliable way to find out if they need to hack around, hence I stand by the intent in this proposal.

Agreed.

Given the mentioned problems with the macro used, perhaps this can instead export just whether the kernel is 32/64/whatever bits? E.g., hw.machine_real_bits.

I'm not sure that provides enough information to be always useful, but it is an interesting notion.

I strongly dislike 'kernel_arch' (or 'kernel_bits' for that matter) as it does not provide any reason why it can't be faked the way hw.machine already is. I don't insist on 'real' being somewhere in the name, but *something* indicating the value is not faked definitely has to be.

I'd say it isn't faked today. It returns the proper value for the ABI for the process. That's its definition, one that was settled ages ago. It's what the vast majority of software expect for these values. The go folks misunderstand its definition, though.

It does not matter what you call it; the crux is that it does not provide the "real" state of things inside the kernel.

It does matter. The ABI dictates these things, not the kernel it happens to run on. The real state of a QEMU user-mode kernel could be amd64 for an armv7 binary for such a guest. Does such a guest need to do the alignment changes to work around the bug or not?

I think hw.kernel.machine and hw.kernel.machine_arch are my recommendations for names. Their definition is for what the kernel we're running under is. Properly defining them would keep them from the alteration you are worried about: it's always the kernel's ABI as opposed to the process' ABI. And it would also cover all bases in the !x86 world should additional workarounds be needed there.

So how exactly would you define them so that it is apparent they are not "augmented" for the calling process?

The whole reason you need to know this is because the ABI isn't quite what the contract promised due to a bug in emulation and the program needs to second guess things to work around that bug. It is not a normal thing that needs to be known by a binary.

This does not answer the question. How do you make it clear that this is the internal ABI of the kernel and not something the kernel thinks you should see based on your own ABI?

But since we're rolling a 'custom' thing for them, wouldn't it be better to also return the routing socket alignment requirements? If that's provided, use that; otherwise fall back to the heuristic that's being used now. That would solve the problem on all platforms, and would also allow us to fix the alignment bug for 32-on-64 and just change what this returns. I think we should also have the other two sysctls I'm recommending above for other workarounds not covered by that. The go fix could look for this new sysctl (or sysctls, since there's two places this test is made), use that directly and fall back to the heuristic they use now until the need for it ages out, like you suggest below.

The routing socket stuff, if it ever gets fixed, will need new sysctls/ioctls to be used by 32-bit processes, as otherwise it would break existing binaries. Consequently I don't think exporting an alignment requirement specifically for it helps here.

The current existing binaries are broken when run on 64-bit kernels. However, programs that work around them would break given how we implement compatibility shims in general. I will agree that future fixes are speculative in this area though. Having an explicit alignment requirement will provide for better workarounds than are possible with the current guessing setup, and the proposed improved guessing setup.

Short term I planned to simply provide them with a query of the proposed sysctl, and should it return ENOENT, falling back to what they have right now. Once the oldest supported release also has the sysctl, the fallback can get removed.

I like that plan.

I'd say "kernel_machine_arch" is the more accurate name, though possibly you might think of it as "the MACHINE_ARCH of the default ABI (e.g. sys/kern/init_sysent.c)" (this is more accurate in CheriBSD where you can have what is in effect a 64-bit kernel with 128-bit user pointers and the routing messages would use the 128-bit ABI). Certainly this means you can get all sorts of weird things for qemu user mode. Maybe 'default_machine_arch' or 'native_machine_arch' though would be a better name?

sys/sys/sysctl.h:1044

New nodes should probably use OID_AUTO instead.