Page MenuHomeFreeBSD

(draft) if: Ensure additions to groups are visible to readers
Needs ReviewPublic

Authored by olce on Jun 26 2025, 4:16 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Oct 12, 6:45 AM
Unknown Object (File)
Sat, Oct 4, 4:56 AM
Unknown Object (File)
Sep 15 2025, 3:26 PM
Unknown Object (File)
Sep 13 2025, 1:25 PM
Unknown Object (File)
Sep 13 2025, 10:59 AM
Unknown Object (File)
Sep 8 2025, 1:24 AM
Unknown Object (File)
Aug 30 2025, 8:17 AM
Unknown Object (File)
Aug 29 2025, 9:37 PM

Details

Reviewers
bz
Summary

Adding a group to an interface is protected by the 'if_addr_lock' mutex,
but reading is by using epoch. It thus may be possible for
if_getgroup(), called under 'net_epoch_preempt', to observe a 'struct
ifg_list' holding a group added to an interface, but without that
structure being fully initialized (in particular, 'igl_group' may still
be observed as being NULL).

Reported by: bz
Fixes: d7c5a620e2b9 ("ifnet: Replace if_addr_lock rwlock with epoch + mutex")
MFC after: 1 week
Sponsored by: The FreeBSD Foundation

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 65105
Build 61988: arc lint + arc unit

Event Timeline

olce requested review of this revision.Jun 26 2025, 4:16 PM
olce retitled this revision from if: Ensure additions to groups are visible to readers to (draft) if: Ensure additions to groups are visible to readers.Jun 26 2025, 4:23 PM

That fence should be too much, and CK_STAILQ_INSERT_TAIL() is supposed to insert a proper store to store barrier, so this is probably a mistake, but if possible I'd like to know if that changes something in practice.

That fence should be too much, and CK_STAILQ_INSERT_TAIL() is supposed to insert a proper store to store barrier, so this is probably a mistake, but if possible I'd like to know if that changes something in practice.

I'm not sure, but it appears CK_STAILQ_FOREACH() lack proper load fence. The x86 has total store order ( TSO ) but aarch64 has much weaker memory order. The ck_pr_fence_load() should be redundant on x86 but should be relevant on aarch64.

I bet CK_STAILQ_FOREACH() want a load fence.

@bz Can you please try this patch ?

--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -1576,6 +1576,7 @@ if_getgroup(struct ifgroupreq *ifgr, struct ifnet *ifp)
                if (len < sizeof(ifgrq))
                        return (EINVAL);
                bzero(&ifgrq, sizeof ifgrq);
+               ck_pr_fence_load();
                strlcpy(ifgrq.ifgrq_group, ifgl->ifgl_group->ifg_group,
                    sizeof(ifgrq.ifgrq_group));
                if ((error = copyout(&ifgrq, ifgp, sizeof(struct ifg_req))))

I'm not sure, but it appears CK_STAILQ_FOREACH() lack proper load fence. The x86 has total store order ( TSO ) but aarch64 has much weaker memory order. The ck_pr_fence_load() should be redundant on x86 but should be relevant on aarch64.

I bet CK_STAILQ_FOREACH() want a load fence.

@bz Can you please try this patch ?

--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -1576,6 +1576,7 @@ if_getgroup(struct ifgroupreq *ifgr, struct ifnet *ifp)
                if (len < sizeof(ifgrq))
                        return (EINVAL);
                bzero(&ifgrq, sizeof ifgrq);
+               ck_pr_fence_load();
                strlcpy(ifgrq.ifgrq_group, ifgl->ifgl_group->ifg_group,
                    sizeof(ifgrq.ifgrq_group));
                if ((error = copyout(&ifgrq, ifgp, sizeof(struct ifg_req))))

This shouldn't make any difference, as this fence here prevents reordering of loads and the existing code already imposes an ordering (ifgl has to be loaded before ifgl_group can be, and then same for ifg_group).

That fence should be too much, and CK_STAILQ_INSERT_TAIL() is supposed to insert a proper store to store barrier, so this is probably a mistake, but if possible I'd like to know if that changes something in practice.

I suspect the arm64 barriers in the Concurrency Kit are just too weak. @bz: If my patch above fixes your problem, that would be a very strong hint in that direction, and I'll then propose a patch to CK.

I don't think we need the fence here. Assuming the compiler doesn't reorder any of the barriers and memory operations then as long as the store in CK_STAILQ_INSERT_TAIL has been observed on another CPU then the stores before CK_STAILQ_INSERT_TAIL will also be observed.

I don't think we need the fence here. Assuming the compiler doesn't reorder any of the barriers and memory operations then as long as the store in CK_STAILQ_INSERT_TAIL has been observed on another CPU then the stores before CK_STAILQ_INSERT_TAIL will also be observed.

In theory, I agree. But the symptoms described by @bz are exactly those we would see with a missing barrier, and I'm kind of suspicious about dmb ish*. According to what I've read here and there (see also next paragraph), ish means InnerShareable, apparently covering all PEs in the same cluster or core complex, whereas, e.g., osh means OuterShareable, apparently covering cores in other clusters/complexes. If I understand the difference correctly, we want at least osh in the general case. But there's an even stronger variant, sy, which means full system (my educated guess is that that shareability domain should be used for synchronization with peripherals, outside of the core complexes). Inserting atomic_thread_fence_seq_cst() forces the use of the strong variant, and for all {Load,Store}->{Load,Store} combinations. I really want to see if that makes a difference, because that's the line of inquiry that I find the most promising. And there aren't a lot of other possibilities. If inserting it doesn't make a difference, then there's most probably a bug in the epoch mechanism, which a priori I find unlikely (but not impossible).

The ARM Architecture Reference Manual does not seem to say anything about InnerShareable, OutShareable and FullSystem. If you have some insight on the matter, sharing it would be appreciated.

FreeBSD requires all CPUs to be in the same Inner Shareable domain so using ish is correct. This is the same requirement as Linux.

sy should probably only be needed for devices that may be outside the CPU inner shareable domain.

Something else funky is going on; please stop spending time on this until I can figure out I would say:
https://lists.freebsd.org/archives/freebsd-current/2025-June/007932.html

FreeBSD requires all CPUs to be in the same Inner Shareable domain so using ish is correct. This is the same requirement as Linux.

sy should probably only be needed for devices that may be outside the CPU inner shareable domain.

Point taken. That said, not using sy could be a problem if we are using CK to communicate with some other device (don't know if such cases exist; I'm specifically thinking about drivers that would be common between amd64 and arm64, because there is no such nuance on amd64 AFAIK).

(For the record, bz@ says in the mailing post that the change here doesn't solve his problem.)

In D51067#1165731, @bz wrote:

Something else funky is going on; please stop spending time on this until I can figure out I would say:
https://lists.freebsd.org/archives/freebsd-current/2025-June/007932.html

It's beyond funky I'd say... :-) Yes, I'll wait, and probably will close or repurpose this review depending on new elements.