
Wireguard merge
Closed, Public

Authored by mmacy on Aug 20 2020, 6:42 PM.

Details

Summary

Basic ifconfig support, plus adding the module to the kernel build.

  • ifconfig is still feature-poor vis-à-vis wg(8)
ifconfig wg create listen-port 51820 private-key <your private key>
ifconfig wg0 peer public-key <peer's public key> endpoint 192.168.1.86:51820 allowed-ips 10.0.0.2/24
ifconfig wg0 10.0.0.2
route add -host 10.0.0.1 10.0.0.2

Event Timeline

When a peer has more than one AllowedIPs entry, dump_peer() (sbin/ifconfig/ifwg.c:270)
will print the address (but not the mask) of the first entry multiple times
because of sbin/ifconfig/ifwg.c:306, which should be:

sa = __DECONST(void *, &aips[i].a_addr);
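The indexing fix can be illustrated with a small userland sketch. It is not the ifwg.c code: struct aip_sketch, its a_mask field, and format_aip() are hypothetical names chosen for illustration (only a_addr appears in the comment above), and the sketch handles IPv4 only. It shows the two points raised: each entry is read through aips[i], and the prefix length is printed next to the address.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the driver's allowed-IP record (IPv4 only). */
struct aip_sketch {
	struct in_addr	a_addr;	/* network address */
	unsigned int	a_mask;	/* prefix length, 0..32 (assumed field) */
};

/*
 * Format entry i of aips[] as "addr/len" into buf.
 * The entry is selected with aips[i], never aips[0], and the mask is
 * printed together with the address.
 */
const char *
format_aip(const struct aip_sketch *aips, size_t i, char *buf, size_t len)
{
	char addr[INET_ADDRSTRLEN];

	assert(aips != NULL && buf != NULL);
	if (inet_ntop(AF_INET, &aips[i].a_addr, addr, sizeof(addr)) == NULL)
		return (NULL);
	snprintf(buf, len, "%s/%u", addr, aips[i].a_mask);
	return (buf);
}
```

A caller would loop i over the number of entries reported for the peer and print one formatted string per iteration.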
This revision now requires changes to proceed. (Oct 1 2020, 12:41 AM)

IMHO wg_get() (sys/dev/if_wg/module/module.c:526) should not expose the private key,
and wireguard_status() (sbin/ifconfig/ifwg.c:546) should not print it.

This might be out of scope for this review:
the WGC_SET ioctl is not priv(9) checked (and there is no PRIV_NET_WG entry in sys/priv.h).

  • rebase against master
  • don't print the first allowedip repeatedly
  • don't print the private key for unprivileged users
  • priv_check WGC_SET

I've addressed all of your comments in my latest update.

  • Don't advertise checksum offload

Perfect! The ssh to bsd22 issue is now resolved.

It seems that the new wg interface is not completely jail-ready yet. I'm exposing the wg interface in devfs.rules with:
[devfsrules_jail_wg=10]
add include $devfsrules_jail_vnet
add path 'wg*' unhide

Inside the jail I can create the wg interface. However, I'm not allowed to add peers.

ifconfig wg0 create .... gives: ifconfig: failed to install peer

wg setconf wg0 ... gives: Unable to modify interface: Operation not permitted

Could it be that the wg peer structures are not exposed to the jail?

Can you try this patch:

diff --git a/sys/net/if.c b/sys/net/if.c
index fb3e307ae5c..9ccffd1cf6d 100644
--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -2845,6 +2845,7 @@ ifhwioctl(u_long cmd, struct ifnet *ifp, caddr_t data, struct thread *td)
 #endif
        case SIOCSIFMEDIA:
        case SIOCSIFGENERIC:
+       case SIOCSDRVSPEC:
                error = priv_check(td, PRIV_NET_HWIOCTL);
                if (error)
                        return (error);
@@ -2864,6 +2865,7 @@ ifhwioctl(u_long cmd, struct ifnet *ifp, caddr_t data, struct thread *td)
        case SIOCGIFRSSKEY:
        case SIOCGIFRSSHASH:
        case SIOCGIFDOWNREASON:
+       case SIOCGDRVSPEC:
                if (ifp->if_ioctl == NULL)
                        return (EOPNOTSUPP);
                error = (*ifp->if_ioctl)(ifp, cmd, data);

  • Fix run_send_keepalive panic

The sys/net/if.c patch above did not help. Same results as before.

In order to have working wg in VIMAGE jails:

diff --git a/sys/dev/if_wg/module/module.c b/sys/dev/if_wg/module/module.c
index 6aa4aa52c146..23fda51b4935 100644
--- a/sys/dev/if_wg/module/module.c
+++ b/sys/dev/if_wg/module/module.c
@@ -766,7 +766,7 @@ wg_priv_ioctl(if_ctx_t ctx, u_long command, caddr_t data)
                        return (wgc_get(sc, ifd));
                        break;
                case WGC_SET:
-                       if (priv_check(curthread, PRIV_DRIVER))
+                       if (priv_check(curthread, PRIV_NET_HWIOCTL))
                                return (EPERM);
                        return (wgc_set(sc, ifd));
                        break;

This is needed because PRIV_DRIVER is not among the privileges granted to prisons with a virtual network stack.
Alternatively, maybe create PRIV_NET_WG after all.
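The reasoning behind the privilege swap can be modelled in a few lines of userland C. This is only a sketch of the idea, not kernel code: the enum values, priv_check_sketch(), and wgc_set_gate_sketch() are hypothetical, and the model assumes the caller is root in both the host and the jail. It shows why a check against a privilege outside the VNET jail allowlist returns EPERM inside a VIMAGE jail while a check against an allowlisted one succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical privilege identifiers; the real ones live in sys/priv.h. */
enum priv_sketch {
	PRIV_DRIVER_SKETCH,
	PRIV_NET_HWIOCTL_SKETCH,
};

/*
 * Model of a jail-aware privilege check for a root caller: outside a jail
 * everything passes, inside a jail only allowlisted privileges pass, and
 * the network ones only when the jail has its own network stack.
 */
int
priv_check_sketch(bool jailed, bool vnet, enum priv_sketch priv)
{
	assert(priv == PRIV_DRIVER_SKETCH || priv == PRIV_NET_HWIOCTL_SKETCH);
	if (!jailed)
		return (0);
	switch (priv) {
	case PRIV_NET_HWIOCTL_SKETCH:
		return (vnet ? 0 : EPERM);
	default:
		return (EPERM);
	}
}

/* Gate in front of the set operation, mirroring the shape of the diff. */
int
wgc_set_gate_sketch(bool jailed, bool vnet, enum priv_sketch required)
{
	if (priv_check_sketch(jailed, vnet, required) != 0)
		return (EPERM);
	return (0);	/* the real driver would call wgc_set() here */
}
```

A dedicated PRIV_NET_WG would fit the same model, provided it were also added to the jail allowlist.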

Another kernel panic, triggered by interface destruction: incoming UDP traffic from the wg peer arrives in wg_input() after sc is already gone.

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff80cefaad
stack pointer           = 0x28:0xfffffe000eb13610
frame pointer           = 0x28:0xfffffe000eb13610
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq43: virtio_pci1)
trap number             = 9
panic: general protection fault
cpuid = 0
time = 1602732063
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000eb13320
vpanic() at vpanic+0x182/frame 0xfffffe000eb13370
panic() at panic+0x43/frame 0xfffffe000eb133d0
trap_fatal() at trap_fatal+0x387/frame 0xfffffe000eb13430
trap() at trap+0xa4/frame 0xfffffe000eb13540
calltrap() at calltrap+0x8/frame 0xfffffe000eb13540
--- trap 0x9, rip = 0xffffffff80cefaad, rsp = 0xfffffe000eb13610, rbp = 0xfffffe000eb13610 ---
if_inc_counter() at if_inc_counter+0xd/frame 0xfffffe000eb13610
wg_input() at wg_input+0xa3/frame 0xfffffe000eb13650
udp_append() at udp_append+0x81/frame 0xfffffe000eb136c0
udp_input() at udp_input+0xa2f/frame 0xfffffe000eb13790
ip_input() at ip_input+0x194/frame 0xfffffe000eb13820
netisr_dispatch_src() at netisr_dispatch_src+0xb1/frame 0xfffffe000eb13880
ether_demux() at ether_demux+0x16e/frame 0xfffffe000eb138b0
ether_nh_input() at ether_nh_input+0x408/frame 0xfffffe000eb13910
netisr_dispatch_src() at netisr_dispatch_src+0xb1/frame 0xfffffe000eb13970
ether_input() at ether_input+0xa1/frame 0xfffffe000eb139d0
vtnet_rxq_input() at vtnet_rxq_input+0x200/frame 0xfffffe000eb13a10
vtnet_rxq_eof() at vtnet_rxq_eof+0x63d/frame 0xfffffe000eb13ae0
vtnet_rx_vq_process() at vtnet_rx_vq_process+0x97/frame 0xfffffe000eb13b20
ithread_loop() at ithread_loop+0x279/frame 0xfffffe000eb13bb0
fork_exit() at fork_exit+0x80/frame 0xfffffe000eb13bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000eb13bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

The PRIV_NET_HWIOCTL diff for wg_priv_ioctl() above works. Both ifconfig create and wg setconf can now create a wg interface with peers.

  • rebase
  • fix WGC_SET priv_check to work in jails
  • mark link down before starting detach
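The "mark link down before starting detach" item can be pictured with the following userland sketch. It is an illustration under assumptions, not the driver's code: struct wg_softc_sketch, its fields, and both functions are hypothetical names. The input path checks a link-up flag before touching per-interface state, and detach clears that flag before doing anything else.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical softc: only the fields this sketch needs. */
struct wg_softc_sketch {
	atomic_bool	sc_link_up;	/* cleared first thing in detach */
	unsigned long	sc_ipackets;	/* stands in for the ifnet counters */
};

/* Input path: drop the packet instead of touching a dying interface. */
bool
wg_input_sketch(struct wg_softc_sketch *sc)
{
	assert(sc != NULL);
	if (!atomic_load(&sc->sc_link_up))
		return (false);		/* packet dropped */
	sc->sc_ipackets++;
	return (true);
}

/* Detach path: mark the link down before any teardown work starts. */
void
wg_detach_sketch(struct wg_softc_sketch *sc)
{
	atomic_store(&sc->sc_link_up, false);
	/* ... drain queues, close sockets, free peers ... */
}
```

The flag only narrows the window. A thread that passed the check just before detach cleared the flag is still inside the input path, so a real driver also has to wait for such in-flight users before freeing anything.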

Can you check if the latest change prevents the wg_input() panic on interface destruction reported above?

Compiling the latest commit fails. After some investigation I found that clone_setdefcallback(), which ifwg.c calls, has disappeared from ifconfig/ifclone.c; it seems to have been replaced by clone_setdefcallback_prefix().

The retirement of clone_setdefcallback() happened in r366917 due to https://reviews.freebsd.org/D26436.

With 78675 applied (and clone_setdefcallback changed to clone_setdefcallback_prefix in ifwg.c), I got another
kernel panic (on wg device destruction):

Fatal trap 12: page fault while in kernel mode
panic: sleepq_add: td 0xfffffe005bffd800 to sleep on wchan 0xfffff800049c6988 with sleeping prohibited
cpuid = 2
time = 1603766504
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe005edfe500
vpanic() at vpanic+0x182/frame 0xfffffe005edfe550
panic() at panic+0x43/frame 0xfffffe005edfe5b0
sleepq_add() at sleepq_add+0x359/frame 0xfffffe005edfe600
msleep_spin_sbt() at msleep_spin_sbt+0xda/frame 0xfffffe005edfe670
gtaskqueue_drain() at gtaskqueue_drain+0x97/frame 0xfffffe005edfe6b0
wg_peer_destroy() at wg_peer_destroy+0x1cb/frame 0xfffffe005edfe710
wg_peer_remove_all() at wg_peer_remove_all+0x6e/frame 0xfffffe005edfe760
wg_detach() at wg_detach+0x63/frame 0xfffffe005edfe790
device_detach() at device_detach+0x18e/frame 0xfffffe005edfe7d0
device_delete_child() at device_delete_child+0x15/frame 0xfffffe005edfe7f0
iflib_clone_destroy() at iflib_clone_destroy+0x8b/frame 0xfffffe005edfe820
if_clone_destroyif() at if_clone_destroyif+0x237/frame 0xfffffe005edfe870
if_clone_destroy() at if_clone_destroy+0x1f1/frame 0xfffffe005edfe8c0
ifioctl() at ifioctl+0x35b/frame 0xfffffe005edfe990
kern_ioctl() at kern_ioctl+0x28e/frame 0xfffffe005edfea00
sys_ioctl() at sys_ioctl+0x127/frame 0xfffffe005edfead0
amd64_syscall() at amd64_syscall+0x135/frame 0xfffffe005edfebf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe005edfebf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80042708a, rsp = 0x7fffffffe188, rbp = 0x7fffffffe1a0 ---
KDB: enter: panic
Uptime: 1h21m8s
Dumping 377 out of 8062 MB:..5%..13%..22%..34%..43%..51%..64%..73%..81%..94%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:394
#2  0xffffffff80be05b0 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:481
#3  0xffffffff80be09fa in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:912
#4  0xffffffff80be0763 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:838
#5  0xffffffff80c3bd29 in sleepq_add (wchan=0xfffff800049c6988, 
    lock=0xfffff8000347bc40, wmesg=0xffffffff8113f64b "gtq_drain", flags=0, 
    queue=0) at /usr/src/sys/kern/subr_sleepqueue.c:328
#6  0xffffffff80becbba in msleep_spin_sbt (ident=0xfffff800049c6988, 
    mtx=0xfffff8000347bc40, wmesg=0xffffffff8113f64b "gtq_drain", sbt=0, pr=0, 
    flags=256) at /usr/src/sys/kern/kern_synch.c:264
#7  0xffffffff80c29907 in TQ_SLEEP (tq=0xfffff8000347bc00, 
    p=0xfffff800049c6988, wm=<optimized out>)
    at /usr/src/sys/kern/subr_gtaskqueue.c:121
#8  gtaskqueue_drain_locked (queue=0xfffff8000347bc00, 
    gtask=0xfffff800049c6988) at /usr/src/sys/kern/subr_gtaskqueue.c:420
#9  gtaskqueue_drain (queue=0xfffff8000347bc00, gtask=0xfffff800049c6988)
    at /usr/src/sys/kern/subr_gtaskqueue.c:431
#10 0xffffffff8231289b in wg_peer_destroy (peer=0xfffff800049c6000)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:1377
#11 0xffffffff8231444e in wg_peer_remove_all (sc=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:2118
#12 0xffffffff82315103 in wg_detach (ctx=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/module.c:337
#13 0xffffffff80c1911e in DEVICE_DETACH (dev=0xfffff80004d8dd00)
    at ./device_if.h:234
#14 device_detach (dev=0xfffff80004d8dd00) at /usr/src/sys/kern/subr_bus.c:3014
#15 0xffffffff80c18de5 in device_delete_child (dev=0xfffff8000475a800, 
    child=0xfffff80004d8dd00) at /usr/src/sys/kern/subr_bus.c:1946
#16 0xffffffff80d1af2b in iflib_clone_destroy (ifp=<optimized out>)
    at /usr/src/sys/net/iflib_clone.c:241
#17 0xffffffff80cfdb77 in ifc_simple_destroy (ifc=0xfffff80004331b00, 
    ifp=<optimized out>) at /usr/src/sys/net/if_clone.c:741
#18 if_clone_destroyif (ifc=0xfffff80004331b00, ifp=<optimized out>)
    at /usr/src/sys/net/if_clone.c:335
#19 0xffffffff80cfd8c1 in if_clone_destroy (name=0xfffffe005edfea20 "wg0")
    at /usr/src/sys/net/if_clone.c:295
#20 0xffffffff80cfa79b in ifioctl (so=<optimized out>, cmd=<optimized out>, 
    data=0xfffffe005edfea20 "wg0", td=<optimized out>)
    at /usr/src/sys/net/if.c:3046
#21 0xffffffff80c540ae in fo_ioctl (fp=<optimized out>, com=2149607801, 
    data=<unavailable>, active_cred=<unavailable>, td=0xfffffe005bffd800)
    at /usr/src/sys/sys/file.h:343
#22 kern_ioctl (td=<optimized out>, fd=<optimized out>, com=<optimized out>, 
    data=<unavailable>) at /usr/src/sys/kern/sys_generic.c:802
#23 0xffffffff80c53d77 in sys_ioctl (td=0xfffffe005bffd800, 
    uap=0xfffffe005bffdbe8) at /usr/src/sys/kern/sys_generic.c:710
#24 0xffffffff8102bc85 in syscallenter (td=<optimized out>)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:176
#25 amd64_syscall (td=0xfffffe005bffd800, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1156
#26 <signal handler called>
#27 0x000000080042708a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffe188
(kgdb) 

------------------------------------------------------------------------

With the same setup (on FreeBSD: while true; do ifconfig wg0 create .....; ping -c 1 PEERIP; sleep 1; ifconfig wg0 destroy; done, and on the Linux peer: ping -f FreeBSDwgIP) I can also get a different panic. Here the gtaskqueue thread got to wg_deliver_in(...), but peer->p_sc->sc_socket->so_so4 is 0x0.

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xd8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff823123ef
stack pointer           = 0x28:0xfffffe004c8daa60
frame pointer           = 0x28:0xfffffe004c8dab00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_1)
trap number             = 12
panic: page fault
cpuid = 1
time = 1603769864
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004c8da710
vpanic() at vpanic+0x182/frame 0xfffffe004c8da760
panic() at panic+0x43/frame 0xfffffe004c8da7c0
trap_fatal() at trap_fatal+0x387/frame 0xfffffe004c8da820
trap_pfault() at trap_pfault+0x97/frame 0xfffffe004c8da880
trap() at trap+0x2ab/frame 0xfffffe004c8da990
calltrap() at calltrap+0x8/frame 0xfffffe004c8da990
--- trap 0xc, rip = 0xffffffff823123ef, rsp = 0xfffffe004c8daa60, rbp = 0xfffffe004c8dab00 ---
wg_deliver_in() at wg_deliver_in+0x24f/frame 0xfffffe004c8dab00
gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame 0xfffffe004c8dab80
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x94/frame 0xfffffe004c8dabb0
fork_exit() at fork_exit+0x80/frame 0xfffffe004c8dabf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004c8dabf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
Uptime: 32m27s
Dumping 555 out of 8062 MB:..3%..12%..21%..32%..41%..52%..61%..72%..81%..93%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:394
#2  0xffffffff80be05b0 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:481
#3  0xffffffff80be09fa in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:912
#4  0xffffffff80be0763 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:838
#5  0xffffffff8102b2b7 in trap_fatal (frame=0xfffffe004c8da9a0, eva=216)
    at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff8102b357 in trap_pfault (frame=0xfffffe004c8da9a0, 
    usermode=<optimized out>, signo=<optimized out>, ucode=<optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff8102a94b in trap (frame=0xfffffe004c8da9a0)
    at /usr/src/sys/amd64/amd64/trap.c:398
#8  <signal handler called>
#9  0xffffffff823123ef in wg_deliver_in (peer=0xfffff80164e98000)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:1572
#10 0xffffffff80c2aa27 in gtaskqueue_run_locked (queue=0xfffff8000347bc00)
    at /usr/src/sys/kern/subr_gtaskqueue.c:371
#11 0xffffffff80c2a824 in gtaskqueue_thread_loop (arg=<optimized out>)
    at /usr/src/sys/kern/subr_gtaskqueue.c:547
#12 0xffffffff80b9b9c0 in fork_exit (
    callout=0xffffffff80c2a790 <gtaskqueue_thread_loop>, 
    arg=0xfffffe004ca97020, frame=0xfffffe004c8dac00)
    at /usr/src/sys/kern/kern_fork.c:1052
#13 <signal handler called>
(kgdb) 

------------------------------------------------------------------------
sbin/ifconfig/ifwg.c:612

Since r366917, clone_setdefcallback() was renamed clone_setdefcallback_prefix().

  • fix ifwg.c compile
  • avoid enqueueing tasks when link is down
  • wait for tasks to complete before detach
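The ordering described by the last two items (stop accepting new work, let queued work finish, only then free) can be sketched in single-threaded userland C. Everything here is hypothetical: the fixed-size queue, the struct names, and the functions are illustrations, not the gtaskqueue API that appears in the backtraces above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	TQ_MAX_SKETCH	8

struct peer_sketch {
	int	p_handled;	/* tasks that ran against this peer */
	bool	p_freed;	/* set once the peer is torn down */
};

/* Hypothetical task queue; each slot holds the peer a pending task uses. */
struct taskq_sketch {
	struct peer_sketch	*tq_pending[TQ_MAX_SKETCH];
	size_t			 tq_count;
	bool			 tq_link_up;
};

/* Enqueue is refused once the link is down, so the queue can only shrink. */
bool
taskq_enqueue_sketch(struct taskq_sketch *tq, struct peer_sketch *peer)
{
	if (!tq->tq_link_up || tq->tq_count == TQ_MAX_SKETCH)
		return (false);
	tq->tq_pending[tq->tq_count++] = peer;
	return (true);
}

/* Run every pending task; a task must never see a freed peer. */
void
taskq_drain_sketch(struct taskq_sketch *tq)
{
	while (tq->tq_count > 0) {
		struct peer_sketch *peer = tq->tq_pending[--tq->tq_count];

		assert(!peer->p_freed);
		peer->p_handled++;
	}
}

/* Teardown order: stop new work, drain old work, then free. */
void
peer_destroy_sketch(struct taskq_sketch *tq, struct peer_sketch *peer)
{
	tq->tq_link_up = false;
	taskq_drain_sketch(tq);
	peer->p_freed = true;
}
```

In the kernel the drain step may sleep, so it has to run from a context where sleeping is allowed; the sleepq_add panic reported earlier in this thread came from a drain in a context where it was not.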

I got this error when compiling the latest commit against r367700:

/usr/src/sys/dev/if_wg/module/module.c:334:2: error: implicit declaration of function 'taskqgroup_drain_all' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        taskqgroup_drain_all(qgroup_if_io_tqg);
        ^
/usr/src/sys/dev/if_wg/module/module.c:334:2: note: did you mean 'taskqueue_drain_all'?
/usr/src/sys/sys/taskqueue.h:98:6: note: 'taskqueue_drain_all' declared here
void taskqueue_drain_all(struct taskqueue *queue);
     ^
1 error generated.
*** [module.o] Error code 1

make[4]: stopped in /usr/src/sys/modules/if_wg
1 error

Wouldn't it be better to commit this to -CURRENT and start over later on?

With 79581 applied, device destruction can still panic the kernel (same setup as before: a create, ping, sleep, destroy loop on FreeBSD, and a ping flood from the Linux peer to the wg address of the FreeBSD machine):

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff80bbbd05
stack pointer           = 0x28:0xfffffe005f204660
frame pointer           = 0x28:0xfffffe005f204670
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 669 (ifconfig)
trap number             = 9
panic: general protection fault
cpuid = 0
time = 1605842613
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe005f204370
vpanic() at vpanic+0x182/frame 0xfffffe005f2043c0
panic() at panic+0x43/frame 0xfffffe005f204420
trap_fatal() at trap_fatal+0x387/frame 0xfffffe005f204480
trap() at trap+0xa4/frame 0xfffffe005f204590
calltrap() at calltrap+0x8/frame 0xfffffe005f204590
--- trap 0x9, rip = 0xffffffff80bbbd05, rsp = 0xfffffe005f204660, rbp = 0xfffffe005f204670 ---
mb_free_extpg() at mb_free_extpg+0x35/frame 0xfffffe005f204670
m_free() at m_free+0xce/frame 0xfffffe005f2046a0
m_freem() at m_freem+0x28/frame 0xfffffe005f2046c0
wg_peer_destroy() at wg_peer_destroy+0x309/frame 0xfffffe005f204720
wg_peer_remove_all() at wg_peer_remove_all+0x3e/frame 0xfffffe005f204750
wg_detach() at wg_detach+0x6f/frame 0xfffffe005f204780
device_detach() at device_detach+0x18e/frame 0xfffffe005f2047c0
device_delete_child() at device_delete_child+0x15/frame 0xfffffe005f2047e0
iflib_clone_destroy() at iflib_clone_destroy+0x8b/frame 0xfffffe005f204810
if_clone_destroyif() at if_clone_destroyif+0x237/frame 0xfffffe005f204860
if_clone_destroy() at if_clone_destroy+0x1f1/frame 0xfffffe005f2048b0
ifioctl() at ifioctl+0x35b/frame 0xfffffe005f204980
kern_ioctl() at kern_ioctl+0x28e/frame 0xfffffe005f2049f0
sys_ioctl() at sys_ioctl+0x12a/frame 0xfffffe005f204ac0
amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe005f204bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe005f204bf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80042716a, rsp = 0x7fffffffe188, rbp = 0x7fffffffe1a0 ---
KDB: enter: panic
Uptime: 1m17s
Dumping 364 out of 8062 MB:..5%..14%..22%..31%..44%..53%..62%..71%..84%..93%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:394
#2  0xffffffff80be3220 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:481
#3  0xffffffff80be366a in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:912
#4  0xffffffff80be33d3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:838
#5  0xffffffff81030357 in trap_fatal (frame=0xfffffe005f2045a0, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff8102f7e4 in trap (frame=0xfffffe005f2045a0)
    at /usr/src/sys/amd64/amd64/trap.c:212
#7  <signal handler called>
#8  mb_free_extpg (m=0xfffff8000489db00) at /usr/src/sys/kern/kern_mbuf.c:1259
#9  0xffffffff80bba6ce in m_free (m=0xfffff8000489db00)
    at /usr/src/sys/sys/mbuf.h:1439
#10 0xffffffff80bbba18 in m_freem (mb=0xfffff8000489db00)
    at /usr/src/sys/kern/kern_mbuf.c:1525
#11 0xffffffff82312a19 in mbufq_drain (mq=<optimized out>)
    at /usr/src/sys/sys/mbuf.h:1513
#12 wg_queue_deinit (q=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:936
#13 wg_peer_destroy (peer=0xfffff80004261000)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:1392
#14 0xffffffff823144de in wg_peer_remove_all (sc=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:2133
#15 0xffffffff823151df in wg_detach (ctx=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/module.c:336
#16 0xffffffff80c1c57e in DEVICE_DETACH (dev=0xfffff80003748800)
    at ./device_if.h:234
#17 device_detach (dev=0xfffff80003748800)
    at /usr/src/sys/kern/subr_bus.c:3014
#18 0xffffffff80c1c245 in device_delete_child (dev=0xfffff80003748900, 
    child=0xfffff80003748800) at /usr/src/sys/kern/subr_bus.c:1946
#19 0xffffffff80d1e8bb in iflib_clone_destroy (ifp=<optimized out>)
    at /usr/src/sys/net/iflib_clone.c:241
#20 0xffffffff80d01637 in ifc_simple_destroy (ifc=0xfffff800040da280, 
    ifp=<optimized out>) at /usr/src/sys/net/if_clone.c:741
#21 if_clone_destroyif (ifc=0xfffff800040da280, ifp=<optimized out>)
    at /usr/src/sys/net/if_clone.c:335
#22 0xffffffff80d01381 in if_clone_destroy (name=0xfffffe005f204a10 "wg0")
    at /usr/src/sys/net/if_clone.c:295
#23 0xffffffff80cfe24b in ifioctl (so=<optimized out>, cmd=<optimized out>, 
    data=0xfffffe005f204a10 "wg0", td=<optimized out>)
    at /usr/src/sys/net/if.c:2976
#24 0xffffffff80c5784e in fo_ioctl (fp=<optimized out>, com=2149607801, 
    data=0x1c3, active_cred=0x0, td=0xfffffe005f03d500)
    at /usr/src/sys/sys/file.h:343
#25 kern_ioctl (td=<optimized out>, fd=<optimized out>, com=<optimized out>, 
    data=0x1c3 <error: Cannot access memory at address 0x1c3>)
    at /usr/src/sys/kern/sys_generic.c:801
#26 0xffffffff80c5750a in sys_ioctl (td=0xfffffe005f03d500, 
    uap=0xfffffe005f03d8e8) at /usr/src/sys/kern/sys_generic.c:709
#27 0xffffffff81030d1e in syscallenter (td=<optimized out>)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#28 amd64_syscall (td=0xfffffe005f03d500, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1156
#29 <signal handler called>
#30 0x000000080042716a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffe188
(kgdb)


I added some additional logic to avoid socket references when the link is down and then close the socket right away. This is what I'm doing to (try to) reproduce:

matt@BSD-UFS-0 [~|17:52|1] more wgsetup.sh
ifconfig wg create listen-port 51820 private-key  cDT7pVGJzATUrbvs0YuRRqXkD2kGfqOkDEIixys3JnU=
ifconfig wg0 peer public-key o38K4VAE1nO/jod+VjppIUMoTNSjYP5LSqcPc9vCCw8=  endpoint 192.168.7.116:51820 allowed-ips 10.0.0.2/24
ifconfig wg0 10.0.0.2
ping -c 1 10.0.0.1
sleep 1
ifconfig wg0 destroy

matt@BSD-UFS-0:~ % while ( 1 )
while? doas ./wgsetup.sh
while? end
 sudo ping -f 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
...............................................

It took me about 5 or 6 minutes to reproduce the first time. I then fixed a use-after-free. I've now let it run
for a comparable amount of time and am not seeing any issues on teardown.

  • fix BPF issue
  • avoid socket operations when link is down
  • fix use after free

Good work! This is the most resilient version AFAICT, but let's wait for the verdict from Stefan ;-) What is next? Any chance that we could have PresharedKey implemented?

I did not test Diff 21 79917 (Mon, Nov 23, 10:04 PM) since it does not seem to touch the code responsible for the device destruction panics.

While Diff 20 79843 (Sun, Nov 22, 2:20 AM) did improve the situation, the panics still happen; it just takes longer.

Another difference is that with 79843 the panic always happens when peer->p_encap_queue is drained.
In Diff 19 79581 (Mon, Nov 16, 3:57 AM) it was always the draining of peer->p_decap_queue that caused the panic.

vmcore.txt of the new panic:

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:


Fatal trap 9: general protection fault while in kernel mode
cpuid = 6; apic id = 06
instruction pointer     = 0x20:0xffffffff80bb1d05
stack pointer           = 0x28:0xfffffe00616c7660
frame pointer           = 0x28:0xfffffe00616c7670
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1212 (ifconfig)
trap number             = 9
panic: general protection fault
cpuid = 6
time = 1606172429
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00616c7370
vpanic() at vpanic+0x181/frame 0xfffffe00616c73c0
panic() at panic+0x43/frame 0xfffffe00616c7420
trap_fatal() at trap_fatal+0x387/frame 0xfffffe00616c7480
trap() at trap+0xa4/frame 0xfffffe00616c7590
calltrap() at calltrap+0x8/frame 0xfffffe00616c7590
--- trap 0x9, rip = 0xffffffff80bb1d05, rsp = 0xfffffe00616c7660, rbp = 0xfffffe00616c7670 ---
mb_free_extpg() at mb_free_extpg+0x35/frame 0xfffffe00616c7670
m_free() at m_free+0xce/frame 0xfffffe00616c76a0
m_freem() at m_freem+0x28/frame 0xfffffe00616c76c0
wg_peer_destroy() at wg_peer_destroy+0x289/frame 0xfffffe00616c7720
wg_peer_remove_all() at wg_peer_remove_all+0x3e/frame 0xfffffe00616c7750
wg_detach() at wg_detach+0x7b/frame 0xfffffe00616c7780
device_detach() at device_detach+0x18e/frame 0xfffffe00616c77c0
device_delete_child() at device_delete_child+0x15/frame 0xfffffe00616c77e0
iflib_clone_destroy() at iflib_clone_destroy+0x8b/frame 0xfffffe00616c7810
if_clone_destroyif() at if_clone_destroyif+0x237/frame 0xfffffe00616c7860
if_clone_destroy() at if_clone_destroy+0x1f1/frame 0xfffffe00616c78b0
ifioctl() at ifioctl+0x35b/frame 0xfffffe00616c7980
kern_ioctl() at kern_ioctl+0x289/frame 0xfffffe00616c79f0
sys_ioctl() at sys_ioctl+0x12a/frame 0xfffffe00616c7ac0
amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe00616c7bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00616c7bf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80042616a, rsp = 0x7fffffffd198, rbp = 0x7fffffffd1b0 ---
KDB: enter: panic
Uptime: 3m19s
Dumping 380 out of 8062 MB:..5%..13%..22%..34%..43%..51%..64%..72%..85%..93%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80bd9420 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80bd9880 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80bd95d3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff81026797 in trap_fatal (frame=0xfffffe00616c75a0, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff81025c24 in trap (frame=0xfffffe00616c75a0)
    at /usr/src/sys/amd64/amd64/trap.c:212
#7  <signal handler called>
#8  mb_free_extpg (m=0xfffff800042ae400) at /usr/src/sys/kern/kern_mbuf.c:1259
#9  0xffffffff80bb06ce in m_free (m=0xfffff800042ae400)
    at /usr/src/sys/sys/mbuf.h:1439
#10 0xffffffff80bb1a18 in m_freem (mb=0xfffff800042ae400)
    at /usr/src/sys/kern/kern_mbuf.c:1525
#11 0xffffffff823169f9 in mbufq_drain (mq=<optimized out>)
    at /usr/src/sys/sys/mbuf.h:1513
#12 wg_queue_deinit (q=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:749
#13 wg_peer_destroy (peer=0xfffff800bf0e6000)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:1204
#14 0xffffffff8231850e in wg_peer_remove_all (sc=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:1957
#15 0xffffffff823191fb in wg_detach (ctx=<optimized out>)
    at /usr/src/sys/dev/if_wg/module/module.c:337
#16 0xffffffff80c124be in DEVICE_DETACH (dev=0xfffff80004a16a00)
    at ./device_if.h:234
#17 device_detach (dev=0xfffff80004a16a00)
    at /usr/src/sys/kern/subr_bus.c:3014
#18 0xffffffff80c12185 in device_delete_child (dev=0xfffff80004a0fd00, 
    child=0xfffff80004a16a00) at /usr/src/sys/kern/subr_bus.c:1946
#19 0xffffffff80d14b5b in iflib_clone_destroy (ifp=<optimized out>)
    at /usr/src/sys/net/iflib_clone.c:241
#20 0xffffffff80cf78d7 in ifc_simple_destroy (ifc=0xfffff800044aaf00, 
    ifp=<optimized out>) at /usr/src/sys/net/if_clone.c:741
#21 if_clone_destroyif (ifc=0xfffff800044aaf00, ifp=<optimized out>)
    at /usr/src/sys/net/if_clone.c:335
#22 0xffffffff80cf7621 in if_clone_destroy (name=0xfffffe00616c7a10 "wg0")
    at /usr/src/sys/net/if_clone.c:295
#23 0xffffffff80cf44eb in ifioctl (so=<optimized out>, cmd=<optimized out>, 
    data=0xfffffe00616c7a10 "wg0", td=<optimized out>)
    at /usr/src/sys/net/if.c:2976
#24 0xffffffff80c4d859 in fo_ioctl (fp=<optimized out>, com=2149607801, 
    data=0x1c3, active_cred=0x0, td=0xfffffe004cd5c700)
    at /usr/src/sys/sys/file.h:343
#25 kern_ioctl (td=<optimized out>, fd=<optimized out>, com=<optimized out>, 
    data=0x1c3 <error: Cannot access memory at address 0x1c3>)
    at /usr/src/sys/kern/sys_generic.c:801
#26 0xffffffff80c4d51a in sys_ioctl (td=0xfffffe004cd5c700, 
    uap=0xfffffe004cd5cae8) at /usr/src/sys/kern/sys_generic.c:709
#27 0xffffffff8102715e in syscallenter (td=<optimized out>)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#28 amd64_syscall (td=0xfffffe004cd5c700, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1156
#29 <signal handler called>
#30 0x000000080042616a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffd198
(kgdb)

my test setup:

root@fbsd13:~ # cat ./wgsetup.sh 
#!/bin/sh
ifconfig wg0 create listen-port 51820 private-key INtbSa3CrE5JYg9UQBJJ/yv1p1sCWqpTLyE3swROGV4=
ifconfig wg0 peer public-key Pb8t0QZ3B7+Hy2NDbIKnfwadxDFf3I1cBjKAep7iwR8= endpoint 172.19.77.33:51820 allowed-ips 10.0.0.2/24
ifconfig wg0 10.0.0.2/24
ping -c 3 10.0.0.1
sleep 1
ifconfig wg0 destroy
root@fbsd13:~ # while (1)
while? sh /root/wgsetup.sh 
while? end

and:

[root@fedora33 ~]# wg
interface: wg0
  public key: Pb8t0QZ3B7+Hy2NDbIKnfwadxDFf3I1cBjKAep7iwR8=
  private key: (hidden)
  listening port: 51820

peer: INpyS8NnVTk/hH4ozOzNMsTKXQE8vTzRLmy4zT5CJ2Q=
  endpoint: 172.19.77.13:51820
  allowed ips: 10.0.0.2/32
  latest handshake: 39 minutes, 43 seconds ago
  transfer: 11.93 MiB received, 17.23 MiB sent
[root@fedora33 ~]# ping -f 10.0.0.2

fbsd13 and fedora33 are bhyve VMs connected to the same vale(4) switch

  • don't prematurely free in wg_encap

I don't have time to test right now, but this is a double-free fix in the wg_encap path, analogous to the one I did earlier in the wg_decap path.
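
For readers following along, the class of bug being fixed here can be sketched in userland C. This is not the if_wg code: `struct buf`, `struct bufq`, `buf_release`, `bufq_enqueue`, `bufq_drain`, `encap_buggy` and `encap_fixed` are hypothetical stand-ins for mbufs, the peer's encap queue and wg_encap, and the real error path is not shown in this thread. The sketch only assumes that once a buffer sits on a queue, the queue owns it, so an early release on a failure path leaves the later drain at teardown to release it a second time. A counter replaces the real free so the double release can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf; nfree counts releases instead of freeing. */
struct buf {
	struct buf *next;
	int nfree;
};

/* Hypothetical stand-in for a per-peer packet queue. */
struct bufq {
	struct buf *head;
	struct buf **tailp;
};

static void
bufq_init(struct bufq *q)
{
	q->head = NULL;
	q->tailp = &q->head;
}

/* A real allocator would corrupt memory or trap on the second call. */
static void
buf_release(struct buf *b)
{
	b->nfree++;
}

static void
bufq_enqueue(struct bufq *q, struct buf *b)
{
	b->next = NULL;
	*q->tailp = b;
	q->tailp = &b->next;
}

/* Teardown path: release everything still queued, return how many. */
static int
bufq_drain(struct bufq *q)
{
	struct buf *b, *next;
	int n = 0;

	for (b = q->head; b != NULL; b = next) {
		next = b->next;
		buf_release(b);
		n++;
	}
	bufq_init(q);
	return (n);
}

/* Buggy shape: the buffer is released although the queue still references it. */
static int
encap_buggy(struct bufq *q, struct buf *b, int fail)
{
	bufq_enqueue(q, b);
	if (fail) {
		buf_release(b);	/* premature: the queue still owns b */
		return (-1);
	}
	return (0);
}

/* Fixed shape: on failure, leave the buffer for its owner to release. */
static int
encap_fixed(struct bufq *q, struct buf *b, int fail)
{
	bufq_enqueue(q, b);
	if (fail)
		return (-1);
	return (0);
}
```

With the buggy variant a failed encap followed by a drain releases the same buffer twice, which matches the shape of the mbufq_drain() frames in the backtrace; with the fixed variant it is released exactly once.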

that did fix the panic on device destroy issue.

since the arc was unwilling to apply 79919 or 79917 here, i took 79843 and changed wg_encap().

the test setup is now running for 2hrs without a panic.

moved the test setup to a different machine and after 1 hour and 19 minutes of running the test setup i got a panic here:

Unread portion of the kernel message buffer:
panic: rw_rlock() of destroyed rwlock @ /usr/src/sys/dev/if_wg/module/if_wg_session.c:1829
cpuid = 14
time = 1606195619
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004ce36530
vpanic() at vpanic+0x181/frame 0xfffffe004ce36580
panic() at panic+0x43/frame 0xfffffe004ce365e0
__rw_rlock_int() at __rw_rlock_int+0xf7/frame 0xfffffe004ce36610
wg_input() at wg_input+0x22e/frame 0xfffffe004ce36650
udp_append() at udp_append+0x81/frame 0xfffffe004ce366c0
udp_input() at udp_input+0xa2f/frame 0xfffffe004ce36790
ip_input() at ip_input+0x194/frame 0xfffffe004ce36820
netisr_dispatch_src() at netisr_dispatch_src+0xb1/frame 0xfffffe004ce36880
ether_demux() at ether_demux+0x16e/frame 0xfffffe004ce368b0
ether_nh_input() at ether_nh_input+0x415/frame 0xfffffe004ce36910
netisr_dispatch_src() at netisr_dispatch_src+0xb1/frame 0xfffffe004ce36970
ether_input() at ether_input+0xa1/frame 0xfffffe004ce369d0
vtnet_rxq_input() at vtnet_rxq_input+0x200/frame 0xfffffe004ce36a10
vtnet_rxq_eof() at vtnet_rxq_eof+0x63d/frame 0xfffffe004ce36ae0
vtnet_rx_vq_process() at vtnet_rx_vq_process+0x97/frame 0xfffffe004ce36b20
ithread_loop() at ithread_loop+0x279/frame 0xfffffe004ce36bb0
fork_exit() at fork_exit+0x80/frame 0xfffffe004ce36bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004ce36bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
Uptime: 1h19m1s
Dumping 553 out of 8062 MB:..3%..12%..21%..32%..41%..53%..61%..73%..81%..93%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80bd94a0 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80bd9900 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80bd9653 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff80bd4d77 in __rw_rlock_int (rw=0xfffff801a9970b20, 
    file=0xffffffff8232f0fb "/usr/src/sys/dev/if_wg/module/if_wg_session.c", 
    line=1829) at /usr/src/sys/kern/kern_rwlock.c:672
#6  0xffffffff8231464e in wg_index_get (sc=0xfffff801a9970800, 
    key0=4171035521) at /usr/src/sys/dev/if_wg/module/if_wg_session.c:1829
#7  wg_input (m0=<optimized out>, offset=<optimized out>, 
    inpcb=<optimized out>, srcsa=<optimized out>, _sc=0xfffff801a9970800)
    at /usr/src/sys/dev/if_wg/module/if_wg_session.c:1916
#8  0xffffffff80dbf131 in udp_append (inp=0xfffff801470207a0, 
    ip=0xfffff8000465901a, n=0xfffff80004658600, off=20, 
    udp_in=0xfffffe004ce366d0) at /usr/src/sys/netinet/udp_usrreq.c:320
#9  0xffffffff80dbef0f in udp_input (mp=<optimized out>, 
    offp=<optimized out>, proto=<optimized out>)
    at /usr/src/sys/netinet/udp_usrreq.c:747
#10 0xffffffff80d8ce84 in ip_input (m=0x0)
    at /usr/src/sys/netinet/ip_input.c:833
#11 0xffffffff80d16b71 in netisr_dispatch_src (proto=1, source=0, 
    m=0xfffff80004658600) at /usr/src/sys/net/netisr.c:1143
#12 0xffffffff80cf9ade in ether_demux (ifp=0xfffff8000397e800, 
    m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:923
#13 0xffffffff80cfb185 in ether_input_internal (ifp=0xfffff8000397e800, 
    m=<unavailable>) at /usr/src/sys/net/if_ethersubr.c:709
#14 ether_nh_input (m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:739
#15 0xffffffff80d16b71 in netisr_dispatch_src (proto=5, source=0, 
    m=0xfffff80004658600) at /usr/src/sys/net/netisr.c:1143
#16 0xffffffff80cf9fd1 in ether_input (ifp=0xfffff8000397e800, 
    m=0xfffff80004658600) at /usr/src/sys/net/if_ethersubr.c:830
#17 0xffffffff80a15f90 in vtnet_rxq_input (rxq=<optimized out>, 
    m=0xfffff80004658600, hdr=0xfffffe004ce36a50)
    at /usr/src/sys/dev/virtio/network/if_vtnet.c:1779
#18 0xffffffff80a15cbd in vtnet_rxq_eof (rxq=0xfffff800037f0d00)
    at /usr/src/sys/dev/virtio/network/if_vtnet.c:1904
#19 0xffffffff80a155e7 in vtnet_rx_vq_process (rxq=0xfffff800037f0d00, 
    tries=<optimized out>) at /usr/src/sys/dev/virtio/network/if_vtnet.c:1968
#20 0xffffffff80b973f9 in intr_event_execute_handlers (p=<optimized out>, 
    ie=0xfffff80003828b00) at /usr/src/sys/kern/kern_intr.c:1168
#21 ithread_execute_handlers (p=<optimized out>, ie=0xfffff80003828b00)
    at /usr/src/sys/kern/kern_intr.c:1181
#22 ithread_loop (arg=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1269
#23 0xffffffff80b93ed0 in fork_exit (
    callout=0xffffffff80b97180 <ithread_loop>, arg=0xfffff8000386fdc0, 
    frame=0xfffffe004ce36c00) at /usr/src/sys/kern/kern_fork.c:1069
#24 <signal handler called>
(kgdb)
release/arm64/RPI3.conf
7 ↗(On Diff #79919)

Not sure this part is related to current work :-)

moved the test setup to a different machine and after 1 hour and 19 minutes of running the test setup i got a panic here:

Stefan, I'm on r367980 with diff 79843, and I manually removed the mfree line in wg_encap since the latest diff 79919 could not be applied. I've run your test for over 6 hours now without any panic. I even added iperf3 --udp and bombarded the server over the wg link for one hour. The only difference, as far as I understand, is that I'm on a bare-metal server and you are running in a bhyve/vale instance. Could what you are seeing now be an issue with the virtualization layer instead?

It's a real bug insofar as people will run this virtualized. It's probably not possible to reproduce this on physical hardware because the OS is delivering new packets to a closed socket almost half a second after the socket has been closed. This suggests that the bhyve thread was blocked from running after the packet arrived but before the socket was closed.

FreeBSD's ifnet is full of life cycle issues. Fixing them would take a fairly thorough revamp, eliminating the ifnet pointers floating around the stack, and no one is going to pay for that. So I'm loath to put too much additional time into this. The problem is compounded in a virtual environment, where vcpu scheduling can create a situation in which one vcpu destroys an object and another vcpu accesses it hundreds of milliseconds later. I've already added considerable overhead to the data path for things that no one would actually see in production, and I would at some point like to see that taken out.

Since we're just dealing with hypervisor scheduling at this point the only thing we can do here is add a delay:

diff --git a/sys/dev/if_wg/module/module.c b/sys/dev/if_wg/module/module.c
index 0d5aca904ec..cf2f10fb697 100644
--- a/sys/dev/if_wg/module/module.c
+++ b/sys/dev/if_wg/module/module.c
@@ -340,6 +340,7 @@ wg_detach(if_ctx_t ctx)
        taskqgroup_drain_all(qgroup_if_io_tqg);
        pause("link_down", hz/4);
        wg_peer_remove_all(sc);
+       pause("link_down", hz);
        mtx_destroy(&sc->sc_mtx);
        rw_destroy(&sc->sc_index_lock);
        taskqgroup_detach(qgroup_if_io_tqg, &sc->sc_handshake);
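
For comparison with the pause() workaround, here is a minimal userland sketch of one common alternative to a fixed delay: a closing flag plus an in-flight counter that the teardown path checks before destroying anything. This is not what the patch does and not an existing kernel API; `struct gate`, `gate_init`, `gate_enter`, `gate_exit`, `gate_close` and `gate_drained` are hypothetical names, and C11 atomics stand in for whatever primitive kernel code would use. It also adds work to every packet on the data path, which is the kind of overhead the comment above would like to avoid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical teardown gate; assumed to outlive every caller that can reach it. */
struct gate {
	atomic_int inflight;	/* readers currently inside */
	atomic_bool closing;	/* set once teardown has begun */
};

static void
gate_init(struct gate *g)
{
	atomic_init(&g->inflight, 0);
	atomic_init(&g->closing, false);
}

/*
 * Reader side (think of an input callback). The counter is bumped before
 * the flag is checked, so either the reader sees closing and backs out,
 * or teardown sees a non-zero counter and keeps waiting.
 */
static bool
gate_enter(struct gate *g)
{
	atomic_fetch_add(&g->inflight, 1);
	if (atomic_load(&g->closing)) {
		atomic_fetch_sub(&g->inflight, 1);
		return (false);
	}
	return (true);
}

static void
gate_exit(struct gate *g)
{
	atomic_fetch_sub(&g->inflight, 1);
}

/* Teardown side: refuse new readers from now on. */
static void
gate_close(struct gate *g)
{
	atomic_store(&g->closing, true);
}

/* Teardown polls (or sleeps on) this before destroying locks. */
static bool
gate_drained(struct gate *g)
{
	return (atomic_load(&g->inflight) == 0);
}
```

The sketch only helps if the gate itself stays valid for late callers. If it lives inside the object being freed, a reader that arrives after the free still touches dead memory, which is the ifnet life cycle problem described above.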

I'm OK with this going into CURRENT, with any remaining debugging to occur there.

(modulo cleaning up the diff).

I've been running this for a while now against OpenBSD and Ubuntu systems without issue.

  • garbage collect dead code
  • more dead code GC
  • add header licenses

Since 79919, and now in 80072, the files below are included in the diff. Why? They do not seem to be related to WireGuard:

release/arm64/RPI3.conf
sys/compat/freebsd32/freebsd32.h
sys/compat/freebsd32/freebsd32_misc.c
sys/dev/dwc/if_dwc.h
sys/dev/dwc/if_dwc.c
sys/dev/isp/isp.c
sys/dev/isp/isp_freebsd.h
sys/dev/isp/isp_freebsd.c
sys/dev/isp/isp_library.c
sys/dev/isp/isp_pci.c
sys/dev/isp/ispmbox.h
sys/dev/isp/ispreg.h
sys/dev/isp/ispvar.h
sys/kern/kern_descrip.c
sys/kern/kern_umtx.c
sys/net/route.h
sys/net/route.c
sys/net/route/route_ctl.h
sys/net/route/route_ctl.c
sys/net/route/route_helpers.c
sys/netinet/in_rmx.c
sys/netinet6/nd6_rtr.c
sys/sys/syscallsubr.h
sys/sys/umtx.h
tests/sys/kern/Makefile
tests/sys/kern/fdgrowtable_test.c
tools/tools/netmap/Makefile
tools/tools/netmap/bridge.c
tools/tools/netmap/lb.c
tools/tools/netmap/nmreplay.c
tools/tools/netmap/pkt-gen.c
tools/tools/netmap/pkt_hash.c
usr.sbin/valectl/Makefile
usr.sbin/valectl/valectl.c

I'm not seeing those files in 80072.

Maybe it was too early in the morning for me or for Phabricator ;-) but I could see all those files in the revision contents. Now they are gone; I just downloaded the diff again and its contents are now correct. Thanks.

Since we're just dealing with hypervisor scheduling at this point the only thing we can do here is add a delay:

diff --git a/sys/dev/if_wg/module/module.c b/sys/dev/if_wg/module/module.c
index 0d5aca904ec..cf2f10fb697 100644
--- a/sys/dev/if_wg/module/module.c
+++ b/sys/dev/if_wg/module/module.c
@@ -340,6 +340,7 @@ wg_detach(if_ctx_t ctx)
        taskqgroup_drain_all(qgroup_if_io_tqg);
        pause("link_down", hz/4);
        wg_peer_remove_all(sc);
+       pause("link_down", hz);
        mtx_destroy(&sc->sc_mtx);
        rw_destroy(&sc->sc_index_lock);
        taskqgroup_detach(qgroup_if_io_tqg, &sc->sc_handshake);

This delay took care of the rw_rlock() panic.

One can still crash the kernel with the test setup (after hours of running it in a vmm instance with 15 vCPUs), but the panic might not be related to the wireguard code and the effort (if any) to fix it can be coordinated somewhere else after the wg code is merged.

  • build fixes for tier 2 & 3 architectures

jrtc27 added inline comments.
sys/dev/if_wg/include/sys/support.h
300

This was copy/pasted from the OpenZFS code without changing the uniquifier (also repeated below).

@jhb wrote:

Can we store the crypto/zinc bits in sys/crypto/zinc if they are from a 3rd party? In particular, an existing chacha20+poly1305 AEAD cipher would be useful in OCF for use with both IPsec and KTLS.

On a larger scale, is adding another pile of crypto bits necessary? Would it be feasible to use the existing crypto framework instead? I've heard the Linux folks eventually forced the author to rewrite this zinc thing of his as a simple wrapper over the kernel crypto API. What's the situation in our case?

Thanks for in-kernel WireGuard. That's really great news before 13-STABLE is branched!
Everything works fine for me, allowing me to tunnel both legacy IP and IPv6 over a legacy IP link. I have not been able to use an IPv6 address as the tunnel endpoint so far. It failed with this error: "wg0: wg_peer_add bad length for endpoint 28". Will tunnelling over IPv6 be supported in the future?

It should be. That just sounds like a bad size check.
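
For context on the number in that error: 28 is sizeof(struct sockaddr_in6) on FreeBSD, whereas struct sockaddr_in is 16 bytes, so a check that accepts only the IPv4 size would reject every IPv6 endpoint. The actual check in wg_peer_add is not shown in this thread; the following is a minimal sketch of a family-aware length check, with `endpoint_len_ok` being a hypothetical helper rather than anything in if_wg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: accept an endpoint only if its length matches
 * the sockaddr type implied by its address family.
 */
static bool
endpoint_len_ok(const struct sockaddr *sa, size_t len)
{
	/* The family field must be readable before it is inspected. */
	if (sa == NULL || len < offsetof(struct sockaddr, sa_data))
		return (false);

	switch (sa->sa_family) {
	case AF_INET:
		return (len == sizeof(struct sockaddr_in));
	case AF_INET6:
		return (len == sizeof(struct sockaddr_in6));
	default:
		return (false);
	}
}
```

A caller would pass the endpoint as received from userland together with its declared length, and reject the request when the helper returns false.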

mmacy added inline comments.
sys/dev/if_wg/include/sys/support.h
300

This was copy/pasted from the OpenZFS code without changing the uniquifier (also repeated below).

Thanks.

This revision was not accepted when it landed; it landed in state Needs Review. Mar 12 2021, 8:38 PM
This revision was automatically updated to reflect the committed changes.
mmacy marked an inline comment as done.