malloc/zalloc M_NOWAIT failure injection
Needs Review (Public)

Authored by rlibby on Jun 20 2019, 11:10 PM.

Details

Reviewers
markj
rwatson
Summary

The MALLOC_MAKE_FAILURES kernel option can already be used to inject
failures for malloc(9) allocations that use the M_NOWAIT flag. This
change expands and enhances that facility.

  • Failures may now be injected for UMA zalloc instead of just malloc.
  • A fail(9) fail_point now controls injection instead of an ad hoc rate mechanism.
  • A whitelist and blacklist now allow specific malloc type and UMA zone names to be targeted or avoided.
  • Details about the last injection are now recorded to aid debugging.

This change still lacks a manual page, and the option is not yet
enabled in any kernel configuration.
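
To make the whitelist/blacklist item concrete, here is a minimal userland sketch of name-based filtering over a comma-separated list. It is not taken from the diff: the helper names `name_in_list` and `injection_eligible` are hypothetical, and the policy shown (blacklist takes precedence, an empty whitelist means every name is eligible) is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `name` appears as a whole entry
 * in the comma-separated `list`. Entries may contain spaces, so only
 * commas are treated as separators.
 */
static bool
name_in_list(const char *list, const char *name)
{
	size_t namelen = strlen(name);
	const char *p = list;

	while (*p != '\0') {
		const char *end = strchr(p, ',');
		size_t len = (end != NULL) ? (size_t)(end - p) : strlen(p);

		if (len == namelen && strncmp(p, name, len) == 0)
			return (true);
		if (end == NULL)
			break;
		p = end + 1;
	}
	return (false);
}

/*
 * Assumed policy, for illustration only: a blacklisted name is never
 * eligible; otherwise an empty whitelist admits every name, and a
 * non-empty whitelist admits only the names it lists.
 */
static bool
injection_eligible(const char *whitelist, const char *blacklist,
    const char *name)
{
	if (name_in_list(blacklist, name))
		return (false);
	if (whitelist[0] == '\0')
		return (true);
	return (name_in_list(whitelist, name));
}
```

Under these assumptions, entries match only as whole names, so a blacklist entry of `BUF TRIE` excludes `BUF TRIE` but not a zone named `BUF`.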

Test Plan

sysctl debug.mnowait_failure
sysctl debug.fail_point.mnowait="1%return"
kyua test -k /usr/tests/sys/Kyuafile

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 24967
Build 23688: arc lint + arc unit

Event Timeline

Here's an example of how I have been applying this. This one looks like a locking bug in an error path in in6_joingroup_locked. I'll submit a separate review for it.

vali# sysctl debug.fail_point.mnowait="1%return"
vali# sysctl debug.mnowait_failure.blacklist="$(sysctl -n debug.mnowait_failure.blacklist),RADIX NODE,vm pgcache"
debug.mnowait_failure.blacklist: ata_request,BUF TRIE,ifaddr,kobj,linker,pcb,sackhole,sctp_ifa,sctp_ifn,sctp_vrf -> ata_request,BUF TRIE,ifaddr,kobj,linker,pcb,sackhole,sctp_ifa,sctp_ifn,sctp_vrf,RADIX NODE,vm pgcache
vali# kyua test -k /usr/tests/sys/Kyuafile 
[...]
netipsec/tunnel/aes_gcm_128:v6  ->  panic: mutex if_addr_lock not owned at /usr/src/freebsd/sys/netinet6/in6_mcast.c:614
cpuid = 3
time = 1561183891
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00513a31b0
vpanic() at vpanic+0x19d/frame 0xfffffe00513a3200
panic() at panic+0x43/frame 0xfffffe00513a3260
__mtx_assert() at __mtx_assert+0xb4/frame 0xfffffe00513a3270
in6m_disconnect_locked() at in6m_disconnect_locked+0x62/frame 0xfffffe00513a32a0
in6_joingroup_locked() at in6_joingroup_locked+0x512/frame 0xfffffe00513a3350
in6_joingroup() at in6_joingroup+0x44/frame 0xfffffe00513a3380
in6_update_ifa() at in6_update_ifa+0x1882/frame 0xfffffe00513a3530
in6_ifattach() at in6_ifattach+0x558/frame 0xfffffe00513a3690
in6_if_up() at in6_if_up+0x80/frame 0xfffffe00513a36d0
if_up() at if_up+0x6a/frame 0xfffffe00513a3700
ifhwioctl() at ifhwioctl+0xc77/frame 0xfffffe00513a3780
ifioctl() at ifioctl+0x529/frame 0xfffffe00513a3850
kern_ioctl() at kern_ioctl+0x28a/frame 0xfffffe00513a38c0
sys_ioctl() at sys_ioctl+0x15d/frame 0xfffffe00513a3990
amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe00513a3ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00513a3ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80048831a, rsp = 0x7fffffffd298, rbp = 0x7fffffffd2f0 ---
KDB: enter: panic
[ thread pid 23880 tid 100268 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/s g_udnf_last_name
g_udnf_last_name:       mbuf
db> x/d g_udnf_last_tid 
g_udnf_last_tid:        100268
db> x/aS g_udnf_last_stack+0x8,0x12
g_udnf_last_stack+0x8:  uma_dbg_nowait_fail_record+0x31
g_udnf_last_stack+0x10: zalloc_inject_failure+0x4c
g_udnf_last_stack+0x18: uma_zalloc_arg+0xa98
g_udnf_last_stack+0x20: mld_v2_enqueue_group_record+0x709
g_udnf_last_stack+0x28: mld_change_state+0x5d1
g_udnf_last_stack+0x30: in6_joingroup_locked+0x4b6
g_udnf_last_stack+0x38: in6_joingroup+0x44
g_udnf_last_stack+0x40: in6_update_ifa+0x1882
g_udnf_last_stack+0x48: in6_ifattach+0x558
g_udnf_last_stack+0x50: in6_if_up+0x80
g_udnf_last_stack+0x58: if_up+0x6a
g_udnf_last_stack+0x60: ifhwioctl+0xc77
g_udnf_last_stack+0x68: ifioctl+0x529
g_udnf_last_stack+0x70: kern_ioctl+0x28a
g_udnf_last_stack+0x78: sys_ioctl+0x15d
g_udnf_last_stack+0x80: amd64_syscall+0x276
g_udnf_last_stack+0x88: fast_syscall_common+0x101
g_udnf_last_stack+0x90: 0
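
A note on the `+0x8` and `0x12` in the last ddb command, for anyone reproducing this. The sketch below assumes that `g_udnf_last_stack` is a stack(9) `struct stack`, laid out as an `int depth` followed by an array of 18 saved program counters. That layout is an assumption about the recorded variable, not something shown in this review; `struct stack_mirror` and `frame_offset` are hypothetical userland stand-ins, not kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define	STACK_MIRROR_MAX	18	/* assumed frame capacity (0x12) */

/* Userland mirror of the assumed layout; not the kernel header. */
struct stack_mirror {
	int		depth;				/* number of valid frames */
	uintptr_t	pcs[STACK_MIRROR_MAX];		/* saved return addresses */
};

/* Byte offset of frame `i` from the start of the structure. */
static size_t
frame_offset(size_t i)
{
	return (offsetof(struct stack_mirror, pcs) + i * sizeof(uintptr_t));
}
```

On an LP64 platform such as amd64, the first frame would sit at offset 0x8 because of alignment padding after `depth`. That would explain why the dump starts at `g_udnf_last_stack+0x8` and asks for 0x12 entries.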

Friendly ping. There's no particular rush, but I believe that the functionality here is useful. I have uncovered around a dozen bugs with it.

I'm sorry, I have been behind on reviews.

This looks fine to me.

sys/vm/uma_core.c
2344

Why is it useful to be able to ignore malloc zones?

sys/vm/uma_dbg.c
459

Sysctl description strings conventionally don't end in a period.

sys/vm/uma_core.c
2344

The main reason is that blacklisting malloc types doesn't work without it. Otherwise, an M_NOWAIT malloc for a blacklisted malloc type would be evaluated a second time in zalloc, this time against the UMA zone name, which is one of the malloc bucket zone names (see kmemzones). A failure could then be injected from zalloc even though the malloc type is blacklisted.

Less importantly, even without blacklisting it also keeps the failure injection rate more accurate (rolling the dice once per allocation, not twice for mallocs).

I suppose that might deserve a comment.
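
To illustrate the double evaluation described in this reply, here is a simplified userland model. It is not the code from the diff. The names `zone_model`, `should_fail`, `zalloc_would_fail`, and `malloc_would_fail` are hypothetical, the bucket zone name is illustrative, and the fail point is replaced by a deterministic stand-in that fails every allocation whose name is not blacklisted.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a UMA zone; only the fields needed here. */
struct zone_model {
	const char	*name;
	bool		 is_malloc_zone;	/* backs a malloc(9) bucket */
};

static int g_evaluations;	/* counts how often injection is evaluated */

/* Stand-in for the fail point: fail unless `name` is the blacklisted one. */
static bool
should_fail(const char *name, const char *blacklisted)
{
	g_evaluations++;
	return (strcmp(name, blacklisted) != 0);
}

/* zalloc-side check; optionally ignores zones that back malloc(9). */
static bool
zalloc_would_fail(const struct zone_model *z, const char *blacklisted,
    bool skip_malloc_zones)
{
	if (skip_malloc_zones && z->is_malloc_zone)
		return (false);
	return (should_fail(z->name, blacklisted));
}

/* malloc-side check by type name, then the bucket zone's own check. */
static bool
malloc_would_fail(const char *type_name, const struct zone_model *bucket,
    const char *blacklisted, bool skip_malloc_zones)
{
	if (should_fail(type_name, blacklisted))
		return (true);
	return (zalloc_would_fail(bucket, blacklisted, skip_malloc_zones));
}
```

In this model, a blacklisted type such as `ifaddr` still fails when malloc zones are not skipped, because the bucket zone's name is evaluated separately and is not on the list. With the skip in place, the allocation is evaluated exactly once and the blacklist is honored.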

sys/vm/uma_dbg.c
459

Will fix here and below. Any comment on the ones that are two sentences? Or are they just too long, and the desc should be moved to a man page?

sys/vm/uma_core.c
2344

I see. Yeah, I'd suggest adding a comment to that effect.

sys/vm/uma_dbg.c
459

I think we usually keep periods for multi-sentence descriptions, but yeah, I'd take this as a hint that the sysctls should be documented in a man page as well.