
ng_bridge: Use M_NOWAIT when allocating memory in the newhook routine
Closed, Public

Authored by markj on Aug 9 2021, 11:21 PM.

Details

Summary

It appears that newhook is (sometimes?) called by the ngthread, which is
not allowed to sleep. This causes occasional panics in CI runs, e.g.,
https://ci.freebsd.org/job/FreeBSD-main-aarch64-test/340/console
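The diff itself is not reproduced here; the pattern under discussion is roughly the following. This is a sketch with placeholder struct and field names, not the actual ng_bridge.c code:

#include <sys/param.h>
#include <sys/malloc.h>
#include <netgraph/ng_message.h>
#include <netgraph/netgraph.h>

struct example_link {
	int	dummy;		/* per-hook private state (placeholder) */
};

static int
example_newhook(node_p node, hook_p hook, const char *name)
{
	struct example_link *link;

	/* M_NOWAIT: the caller (possibly ngthread) must not sleep here. */
	link = malloc(sizeof(*link), M_NETGRAPH, M_NOWAIT | M_ZERO);
	if (link == NULL)
		return (ENOMEM);
	NG_HOOK_SET_PRIVATE(hook, link);
	return (0);
}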

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

markj requested review of this revision. Aug 9 2021, 11:21 PM

I think the problem is somewhat broader. Any message item can be added to the node's queue under load or if the queue is blocked, and that message item will then be processed by ngthread() inside the EPOCH section. There are many places in the nodes' code that perform actions prohibited in an EPOCH section.

sys/netgraph/ng_bridge.c
398–399

malloc(9) has a KASSERT for this: https://github.com/freebsd/freebsd-src/blob/main/sys/kern/kern_malloc.c#L542

What about uma_zalloc(9)? I think it should also be prohibited in the EPOCH section.

JFYI, the same situation. If you call "ngctl shutdown ngeth0:" under load:

panic: epoch_wait_preempt() called in the middle of an epoch section of the same epoch
cpuid = 0
time = 1600268186
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0154b9e880
vpanic() at vpanic+0x182/frame 0xfffffe0154b9e8d0
panic() at panic+0x43/frame 0xfffffe0154b9e930
epoch_wait_preempt() at epoch_wait_preempt+0x293/frame 0xfffffe0154b9e980
if_detach_internal() at if_detach_internal+0x1ca/frame 0xfffffe0154b9e9e0
if_detach() at if_detach+0x3d/frame 0xfffffe0154b9ea00
ng_eiface_rmnode() at ng_eiface_rmnode+0x55/frame 0xfffffe0154b9ea40
ng_rmnode() at ng_rmnode+0x188/frame 0xfffffe0154b9ea70
ng_mkpeer() at ng_mkpeer+0x7b/frame 0xfffffe0154b9eac0
ng_apply_item() at ng_apply_item+0x547/frame 0xfffffe0154b9eb40
ngthread() at ngthread+0x26e/frame 0xfffffe0154b9ebb0
fork_exit() at fork_exit+0x80/frame 0xfffffe0154b9ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0154b9ebf0

JFYI, the same situation. If you call "ngctl shutdown ngeth0:" under load:

Why does it have to be under load? I thought it was just stack usage that determined whether we queue messages for ngthread.

sys/netgraph/ng_bridge.c
398–399

Thanks, yes. UMA doesn't seem to have this assertion for some reason...

Allocate counters with M_NOWAIT as well.

The load simply allows this bug to manifest itself. Under load, the node's queue already contains items, so if a message item is sent at that moment (ngctl shutdown), it will also go to the queue and be processed by ngthread(). Without load, the queue is mostly empty and the message item is delivered/processed directly by the calling process's thread: ngctl -> syscall (sendto(2)) -> ngc_send (ng_socket(4)) -> ng_eiface(4), without entering the EPOCH section. Therefore, the bug does not manifest itself.

There are many conditions under which an ng_item (data or message) ends up in the queue: stack usage, a non-empty queue, node flags, etc.
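Roughly, the decision looks like this (a simplified sketch; the threshold and the helper functions are placeholders, not the real ng_base.c logic):

/* Same kernel headers as the earlier sketch are assumed. */
static bool
must_queue_item(node_p node, item_p item)
{
	size_t total, used;

	/* Deep kernel stack: hand the item off to ngthread(). */
	GET_STACK_USAGE(total, used);
	if (used > (total / 4) * 3)
		return (true);

	/*
	 * Other conditions, sketched via placeholder helpers: items already
	 * pending on the node's queue, or node/queue flags that force
	 * queued delivery of messages.
	 */
	if (node_has_pending_items(node) || node_forces_queued_delivery(node))
		return (true);

	return (false);
}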

The main problem is that we are processing message items in the EPOCH section, although the whole idea is that data items (the hot path) are processed in the EPOCH section and messages outside of it.

Each netgraph node implements the following interface:

struct ng_type {
	u_int32_t	version; 	/* must equal NG_API_VERSION */
	const char	*name;		/* Unique type name */
	modeventhand_t	mod_event;	/* Module event handler (optional) */
	ng_constructor_t *constructor;	/* Node constructor */
	ng_rcvmsg_t	*rcvmsg;	/* control messages come here */
	ng_close_t	*close;		/* warn about forthcoming shutdown */
	ng_shutdown_t	*shutdown;	/* reset, and free resources */
	ng_newhook_t	*newhook;	/* first notification of new hook */
	ng_findhook_t	*findhook;	/* only if you have lots of hooks */
	ng_connect_t	*connect;	/* final notification of new hook */
	ng_rcvdata_t	*rcvdata;	/* data comes here */
	ng_disconnect_t	*disconnect;	/* notify on disconnect */

	const struct	ng_cmdlist *cmdlist;	/* commands we can convert */

	/* R/W data private to the base netgraph code DON'T TOUCH! */
	LIST_ENTRY(ng_type) types;		/* linked list of all types */
	int		    refs;		/* number of instances */
};

I think only two methods should be restricted to execution in the EPOCH section: ng_rcvdata_t and ng_findhook_t.
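Hypothetically, that restriction could be expressed with assertions such as the following; these checks are not in the tree and only illustrate the idea:

static int
example_rcvdata(hook_p hook, item_p item)
{
	/* Data (the hot path) is expected to run inside the net epoch. */
	NET_EPOCH_ASSERT();
	/* ... forward the mbuf ... */
	return (0);
}

static int
example_rcvmsg(node_p node, item_p item, hook_p lasthook)
{
	/* Control messages should be free to sleep, i.e. not be in an epoch. */
	KASSERT(!in_epoch(net_epoch_preempt),
	    ("control message handled inside an epoch section"));
	/* ... handle the message, possibly with M_WAITOK allocations ... */
	return (0);
}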


Could be, yes. I don't know netgraph well enough to say much. I do think that the epoch_wait_preempt() in if_detach_internal() is a hack. If nothing else, it slows down ifnet destruction quite a lot. Ideally we could synchronously unlink the ifnet and use epoch_call() to handle teardown asynchronously, but that might have other ramifications.

The whole netgraph code predates any epoch, so the interactions are interesting.

Currently I do not have the energy to check in detail how such a call to ng_(add/rm)hook is possible, which of those situations are covered by epoch, which should be covered, and which cannot be.

Removing potentially complicated operations like malloc(M_WAITOK) is an acceptable compromise, but the issue should be raised for the whole netgraph framework.

sys/netgraph/ng_bridge.c
406–407

I'd like to see a goto nomem here, which is the common idiom for more complex failure handling.
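For reference, the idiom extends the earlier sketch roughly like this (placeholder names and fields; a sketch, not the committed ng_bridge.c diff):

/* <sys/counter.h> provides counter_u64_alloc()/counter_u64_free(). */
struct example_link {
	counter_u64_t	recv_pkts;
	counter_u64_t	xmit_pkts;
};

static int
example_newhook(node_p node, hook_p hook, const char *name)
{
	struct example_link *link;

	link = malloc(sizeof(*link), M_NETGRAPH, M_NOWAIT | M_ZERO);
	if (link == NULL)
		goto nomem;
	link->recv_pkts = counter_u64_alloc(M_NOWAIT);
	link->xmit_pkts = counter_u64_alloc(M_NOWAIT);
	if (link->recv_pkts == NULL || link->xmit_pkts == NULL)
		goto nomem;
	NG_HOOK_SET_PRIVATE(hook, link);
	return (0);

nomem:
	/* M_ZERO above guarantees unallocated counters are NULL. */
	if (link != NULL) {
		if (link->recv_pkts != NULL)
			counter_u64_free(link->recv_pkts);
		if (link->xmit_pkts != NULL)
			counter_u64_free(link->xmit_pkts);
		free(link, M_NETGRAPH);
	}
	return (ENOMEM);
}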

I looked at other nodes that implement the ng_newhook_t method. They already use M_NOWAIT, so I think this patch is good.

ng_eiface(4) issues should be resolved in a separate review.

This revision is now accepted and ready to land. Aug 12 2021, 7:25 AM
markj marked an inline comment as done.

Use a goto to handle failures.

This revision now requires review to proceed. Aug 12 2021, 2:30 PM
This revision is now accepted and ready to land. Aug 12 2021, 9:05 PM