Page MenuHomeFreeBSD

if_gre: make access to softc's data safe in network epoch
Needs ReviewPublic

Authored by ae on Fri, Feb 20, 2:42 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Feb 24, 5:13 PM
Unknown Object (File)
Mon, Feb 23, 4:04 PM

Details

Reviewers
pouria
zlei
Group Reviewers
network
Summary

Move all data required for data transfer into new struct gre_priv
and make it allocation and reclamation using network epoch.

This also should fix PR 275474.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 70818
Build 67701: arc lint + arc unit

Event Timeline

ae held this revision as a draft.
ae added reviewers: network, pouria, zlei.
ae published this revision for review.Fri, Feb 20, 2:53 PM

I don't understand why we need a separate data structure (priv).
I apologize if this seems like a stupid question, but wouldn't it be better to move ip6_gre.c and ip_gre.c into if_gre.c itself like other interfaces?

sys/net/if_gre.c
638

bcopy is deprecated.

993

I don't understand why we need a separate data structure (priv).

When you issue ioctl that does some changes in softc, another thread can do transmission in the same time and use some data from softc.
To avoid access to not yet fully modified or already freed data you need to block transmission.
Or you can allocate new gre_priv structure for each modification and free old one using NET_EPOCH_CALL(). This allows avoid blocking of transmit thread and it uses always stable data while it works in net_epoch.

I apologize if this seems like a stupid question, but wouldn't it be better to move ip6_gre.c and ip_gre.c into if_gre.c itself like other interfaces?

I think that's how it happened historically, for some interfaces we have different files of INET and INET6 implementation.

Ahh, I think I see what's happening. While some time ago I was trying to diagnose PR 275474, I observed UAF and got messed up in mind.

So if it is obvious that the fast path is still accessing some ifnet private data ( socket in gre softc ), I think the quick fix should be signaling the fast path that the interface is going to be re-configured ?

In D55398#1267129, @ae wrote:

When you issue ioctl that does some changes in softc, another thread can do transmission in the same time and use some data from softc.
To avoid access to not yet fully modified or already freed data you need to block transmission.
Or you can allocate new gre_priv structure for each modification and free old one using NET_EPOCH_CALL(). This allows avoid blocking of transmit thread and it uses always stable data while it works in net_epoch.

I understand, I mean, personaly, I'd prefer to block transmission since reconfiguration of gre interface is rare in comparison to transmission.
This is what all other interfaces do. IMHO, it's not worth the extra complexity.

I think that's how it happened historically, for some interfaces we have different files of INET and INET6 implementation.

When I was working on implementing netlink for GRE, I noticed that the GRE implementation has too many layers of indirection due to its historical context.
It's not important and just an idea, but would you like to merge them under sys/net? I can work on it if you want.

Anyway, this patch still results in a panic for my tests in D55363:
kgdb output:

kdb_enter (why=<optimized out>, msg=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:556
556                     kdb_why = KDB_WHY_UNSET;
(kgdb) bt
#0  kdb_enter (why=<optimized out>, msg=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:556
#1  0xffffffff80b9c8ff in vpanic (fmt=0xffffffff81343906 "%s: hashtbl %p not empty (malloc type %s)", ap=ap@entry=0xfffffe0069658d10) at /usr/src/sys/kern/kern_shutdown.c:962
#2  0xffffffff80b9c753 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:887
#3  0xffffffff80beca11 in hashdestroy (vhashtbl=0xffffffff812f131c, type=<optimized out>, hashmask=<optimized out>) at /usr/src/sys/kern/subr_hash.c:94
#4  0xffffffff80d1c944 in vnet_sysuninit () at /usr/src/sys/net/vnet.c:640
#5  vnet_destroy (vnet=0xfffff8000eeac280) at /usr/src/sys/net/vnet.c:295
#6  0xffffffff80b57495 in prison_deref (pr=<optimized out>, pr@entry=0xfffff8009633c000, flags=<optimized out>, flags@entry=85) at /usr/src/sys/kern/kern_jail.c:3742
#7  0xffffffff80b588f7 in prison_remove (pr=0xfffff8009633c000) at /usr/src/sys/kern/kern_jail.c:3040
#8  sys_jail_remove (td=0xfffff800b89c2000, uap=<optimized out>) at /usr/src/sys/kern/kern_jail.c:2996
#9  0xffffffff810dba45 in syscallenter (td=0xfffff800b89c2000) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:193
#10 amd64_syscall (td=0xfffff800b89c2000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1208
#11 <signal handler called>
#12 0x00001e141c54051a in ?? ()

Tests:

FreeBSD 16.0-CURRENT (GENERIC) #0 arcpatch-D55398-n284097-18c9d20f1dd0: Mon Feb 23 21:32:15 +0330 2026
# kyua test -k /usr/tests/sys/net/Kyuafile if_gre
if_gre:gre4_basic  ->  panic: hashdestroy: hashtbl 0xfffff800746c1800 not empty (malloc type ifaddr)
cpuid = 6
time = 1771869806
KDB: stack backtrace:
#0 0xffffffff80bedc2d at kdb_backtrace+0x5d
#1 0xffffffff80b9c8ce at vpanic+0x16e
#2 0xffffffff80b9c753 at panic+0x43
#3 0xffffffff80beca11 at hashdestroy+0x41
#4 0xffffffff80d1c944 at vnet_destroy+0x144
#5 0xffffffff80b57495 at prison_deref+0xa65
#6 0xffffffff80b588f7 at sys_jail_remove+0x127
#7 0xffffffff810dba45 at amd64_syscall+0x175
#8 0xffffffff810abe0b at fast_syscall_common+0xf8
KDB: enter: panic
KDB: reentering
KDB: stack backtrace:
#0 0xffffffff80bedc2d at kdb_backtrace+0x5d
#1 0xffffffff80bedeab at kdb_reenter+0x2b
#2 0xffffffff810ab518 at calltrap+0x8
#3 0xffffffff80ac1836 at gdb_trap+0x8f6
#4 0xffffffff80bee37f at kdb_trap+0x18f
#5 0xffffffff810da639 at trap+0x3f9
#6 0xffffffff810ab518 at calltrap+0x8
#7 0xffffffff80b9c753 at panic+0x43
#8 0xffffffff80beca11 at hashdestroy+0x41
#9 0xffffffff80d1c944 at vnet_destroy+0x144
#10 0xffffffff80b57495 at prison_deref+0xa65
#11 0xffffffff80b588f7 at sys_jail_remove+0x127
#12 0xffffffff810dba45 at amd64_syscall+0x175
#13 0xffffffff810abe0b at fast_syscall_common+0xf8

My first test is basically the same as PR 275474, as the final goal is to avoid kernel panics from PR 275474.