
cpuset: add local functions for copyin/copyout
ClosedPublic

Authored by alfredo on Aug 16 2022, 9:27 PM.

Details

Summary

At least powerpc64 and powerpc64le kernels panic when copyin/copyout
are called by external kernel modules (like pfsync). This patch works
around the panic while the root cause is under investigation.

The panic, exception 0x480 (instruction segment exception), occurs
when the functions are stored as pointers in the cpuset_copy_cb
struct. It does not occur when the functions are called directly.
These are ifunc'd functions, so the implementation is selected by a
resolver function at runtime.
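
A minimal userland sketch of the workaround idea, not the patch itself: instead of storing the address of the copy routine in the callback table, the table points at local wrappers that call the routine directly. The names fake_copyin, fake_copyout, struct copy_cb, local_copyin and local_copyout are hypothetical stand-ins; the real kernel copyin/copyout and the real layout of cpuset_copy_cb are not reproduced here.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userland stand-ins for the kernel's copyin()/copyout().
 * In the kernel these are ifunc'd; here they are plain functions.
 */
static int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL || kaddr == NULL)
		return (EFAULT);
	memcpy(kaddr, uaddr, len);
	return (0);
}

static int
fake_copyout(const void *kaddr, void *uaddr, size_t len)
{
	if (kaddr == NULL || uaddr == NULL)
		return (EFAULT);
	memcpy(uaddr, kaddr, len);
	return (0);
}

/* Callback table, loosely modelled on the cpuset_copy_cb struct (assumed layout). */
struct copy_cb {
	int (*cb_copyin)(const void *, void *, size_t);
	int (*cb_copyout)(const void *, void *, size_t);
};

/*
 * Local wrappers: each one makes a direct call, so the table below
 * never stores the address of the underlying copy routine itself.
 */
static int
local_copyin(const void *uaddr, void *kaddr, size_t len)
{
	return (fake_copyin(uaddr, kaddr, len));
}

static int
local_copyout(const void *kaddr, void *uaddr, size_t len)
{
	return (fake_copyout(kaddr, uaddr, len));
}

static const struct copy_cb copy_set = {
	.cb_copyin = local_copyin,
	.cb_copyout = local_copyout,
};
```

A caller would then go through the table, for example copy_set.cb_copyin(src, dst, len), and never take the address of the underlying routine.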

MFC after: 1 week

Test Plan

Run "kldload pfsync" on powerpc64 and powerpc64le and confirm that the kernel does not panic.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

alfredo added reviewers: jhibbits, PowerPC, markj, kevans.
alfredo added a project: PowerPC.

This is rather strange. The description makes it sound like there is a bug somewhere in powerpc relocation handling, nothing specifically to do with cpuset code.

Until the problem is better understood, I don't think it's appropriate to commit something like this - how do you know that the bug won't require another workaround next week?

> This is rather strange. The description makes it sound like there is a bug somewhere in powerpc relocation handling, nothing specifically to do with cpuset code.
> Until the problem is better understood, I don't think it's appropriate to commit something like this - how do you know that the bug won't require another workaround next week?

Yes, it's a workaround. It appears to be a corner-case bug involving the clang compiler and the ifunc resolver when an ifunc'd function is used as a pointer in a struct; the cpuset code looks sane. The regression appeared after 47a57144af25a7bd768b29272d50a36fdf2874ba, but there's nothing wrong with that change; it just exposed the problem.

It was cherry-picked to stable/13 in 72bc1e6806ccff0cc3e712c65090e59482b33357, so I'm suggesting this as a "hot fix/quick fix", since the powerpc team won't be able to track it down in a short time frame (stable/13 has been broken for two months already).

> This is rather strange. The description makes it sound like there is a bug somewhere in powerpc relocation handling, nothing specifically to do with cpuset code.

While working with a GSoC student on the ppc64 Linuxulator, we see the same relocation problem with copyin/copyout calls from loadable modules.

> Until the problem is better understood, I don't think it's appropriate to commit something like this - how do you know that the bug won't require another workaround next week?

Make this change powerpc* only in order to unbreak pfsync module and
the test infrastructure.

The issue is still being discussed with the LLVM community through
e-mail and in [1] and [2].

Bisecting LLVM, I found that the issue doesn't happen with LLD 9. In
LLD 10, the behavior change in [3] caused the problem, but it's still
unclear whether the crash is caused by an LLVM bug or by missing
handling in FreeBSD.

[1] https://github.com/llvm/llvm-project/issues/57851
[2] https://github.com/llvm/llvm-project/issues/57722
[3] https://reviews.llvm.org/rGdc06b0bc9ad055d06535462d91bfc2a744b2f589
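
As a rough illustration of restricting the workaround to powerpc*, a preprocessor guard can select a local wrapper on that architecture and the plain function pointer elsewhere. This is only a sketch under assumptions: generic_copy, wrapped_copy, copy_fn and copy_impl are hypothetical userland names, and the actual patch may be structured differently. The __powerpc__ macro is the compiler's predefined macro for powerpc targets.

```c
#include <stddef.h>
#include <string.h>

typedef int copy_fn(const void *, void *, size_t);

/* Hypothetical userland stand-in for an ifunc'd kernel copy routine. */
static int
generic_copy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
	return (0);
}

#ifdef __powerpc__
/*
 * powerpc* only: go through a local wrapper that calls the routine
 * directly, so its address is never stored in the pointer below.
 */
static int
wrapped_copy(const void *src, void *dst, size_t len)
{
	return (generic_copy(src, dst, len));
}
#define	COPY_IMPL	wrapped_copy
#else
/* Other architectures keep pointing at the routine itself. */
#define	COPY_IMPL	generic_copy
#endif

/* The pointer that callers (e.g. a loadable module) would use. */
static copy_fn *const copy_impl = COPY_IMPL;
```

Either way, callers see the same interface through copy_impl; only the target of the pointer differs by architecture.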

Considering PowerPC has been broken for quite some time, this seems a good approach while the real issue in LLD or the kernel linker remains unresolved.

This revision is now accepted and ready to land.
Sep 30 2022, 12:52 PM

> Make this change powerpc* only in order to unbreak pfsync module and
> the test infrastructure.

I'm also seeing this issue when zfs tries to mount my partition.