
Add the netdump mbuf allocator.

Authored by markj on May 1 2018, 5:21 PM.



The aim here is to permit mbuf allocations after a panic without
calling into the page allocator, without imposing any runtime overhead
during regular operation of the system, and without modifying driver
code. The approach taken is to pre-allocate a number of mbufs and
clusters, storing them in linked lists. The lists back a set of UMA
cache zones which replace the regular mbuf/cluster UMA zones. At panic
time, the zone pointers are overwritten with those of the cache zones,
so m_get() and the related allocation routines return mbufs from the
linked lists.
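
The pointer swap described above can be illustrated with a small userspace sketch. This is not the code in the diff: struct buf, struct zone, dump_zone_init(), dump_zone_activate() and buf_get() are hypothetical stand-ins for the mbuf, the UMA zone, and the allocation path, and malloc() stands in for the regular allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf; only the free-list link matters. */
struct buf {
	struct buf *next;
	char data[256];
};

/* Hypothetical stand-in for a UMA zone: an allocation callback. */
struct zone {
	struct buf *(*alloc)(struct zone *);
	struct buf *freelist;	/* used only by the preallocated zone */
};

static struct buf *
regular_alloc(struct zone *z)
{
	(void)z;
	return (malloc(sizeof(struct buf)));	/* general-purpose allocator */
}

static struct buf *
prealloc_alloc(struct zone *z)
{
	struct buf *b = z->freelist;

	if (b != NULL)
		z->freelist = b->next;	/* pop from the list, no allocator call */
	return (b);
}

static struct zone regular_zone = { regular_alloc, NULL };
static struct zone dump_zone = { prealloc_alloc, NULL };
static struct zone *zone_buf = &regular_zone;	/* pointer the callers use */

/* Fill the list ahead of time; returns the number of buffers reserved. */
static int
dump_zone_init(int n)
{
	int i;

	for (i = 0; i < n; i++) {
		struct buf *b = malloc(sizeof(*b));

		if (b == NULL)
			break;
		b->next = dump_zone.freelist;
		dump_zone.freelist = b;
	}
	return (i);
}

/* At "panic" time: overwrite the zone pointer with the preallocated zone. */
static void
dump_zone_activate(void)
{
	zone_buf = &dump_zone;
}

/* Callers are unchanged; they always go through the current zone pointer. */
static struct buf *
buf_get(void)
{
	return (zone_buf->alloc(zone_buf));
}
```

Calling dump_zone_init(2), then dump_zone_activate(), makes the next two buf_get() calls return the reserved buffers and the third return NULL, without touching malloc(). Regular operation pays nothing extra because the indirection through zone_buf exists either way.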

The main complication is that a few drivers (cxgb and iflib) cache mbuf
zone pointers from m_getzone(), so they need some special handling.
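
To show why a cached zone pointer is a problem, here is a minimal userspace sketch. All names in it (struct drv_softc, drv_attach(), drv_dump_reinit(), dump_activate()) are hypothetical, and the re-read hook is only one possible form of special handling, not necessarily what the diff does for cxgb or iflib.

```c
#include <stddef.h>

/* Hypothetical stand-in for a UMA zone. */
struct zone {
	const char *name;
};

static struct zone regular_zone = { "regular" };
static struct zone dump_zone = { "dump" };
static struct zone *zone_buf = &regular_zone;	/* global zone pointer */

/* Hypothetical driver state that caches the zone pointer once. */
struct drv_softc {
	struct zone *cached_zone;
};

/* At attach time the driver saves whatever zone is current. */
static void
drv_attach(struct drv_softc *sc)
{
	sc->cached_zone = zone_buf;
}

/* Overwrite the global pointer, as done at panic time. */
static void
dump_activate(void)
{
	zone_buf = &dump_zone;
}

/*
 * Assumed fix for illustration: a hook the dump path calls so that the
 * driver refreshes its cached copy after the global pointer changed.
 */
static void
drv_dump_reinit(struct drv_softc *sc)
{
	sc->cached_zone = zone_buf;
}
```

After drv_attach() followed by dump_activate(), sc.cached_zone still points at regular_zone, so the driver would keep allocating from the regular zone. Only after drv_dump_reinit() does it see dump_zone.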

Diff Detail

rS FreeBSD src repository

Event Timeline

markj created this revision. May 1 2018, 5:21 PM
shurd added a subscriber: shurd. May 1 2018, 6:02 PM
sbruno accepted this revision. May 1 2018, 8:52 PM
This revision is now accepted and ready to land. May 1 2018, 8:52 PM
emaste added a subscriber: emaste. May 2 2018, 12:11 AM
emaste added inline comments.
871 ↗(On Diff #42038)

Just commit these independently first?

julian added a subscriber: julian. May 2 2018, 5:33 AM
julian added inline comments.
383 ↗(On Diff #42038)

How about a big comment here separating this code from 'regular code', so that the casual reader can instantly know what is going on?

I know it's ifdef'd, but take the message in this review and make it a comment.

496 ↗(On Diff #42038)

In some cases the system has been known to dump memory and then keep running.
Do you have a way to 'un-overwrite' the zone pointers?

julian accepted this revision. May 2 2018, 6:13 AM
markj added inline comments. May 2 2018, 2:15 PM
383 ↗(On Diff #42038)

netdump_mbuf_reinit() already has a similar comment; I'll move it.

496 ↗(On Diff #42038)

Not at the moment. I will look into the difficulty of getting that to work.

871 ↗(On Diff #42038)

Yeah, didn't mean to include these.

markj updated this revision to Diff 42155. May 4 2018, 7:53 PM
  • Move comments around.
This revision now requires review to proceed. May 4 2018, 7:53 PM
markj marked an inline comment as done. May 4 2018, 7:56 PM
markj added inline comments.
496 ↗(On Diff #42038)

For the initial revision I plan to just return an error if one attempts to netdump while panicstr == NULL. From DDB one can execute "panic" to get a dump in that case. It may be possible for the system to resume operation after a netdump, but I don't think it's very straightforward, so I will punt on that for now.
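
The guard described here could look roughly like the following userspace sketch. The names panicstr_sim and dump_start_sim() are hypothetical stand-ins, and EINVAL is an assumed error code; the comment above does not say which error is returned.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's panicstr; NULL means no panic. */
static const char *panicstr_sim = NULL;

/*
 * Refuse to start a dump unless the system has panicked, because the
 * zone pointers cannot be restored afterwards.  EINVAL is an assumption.
 */
static int
dump_start_sim(void)
{
	if (panicstr_sim == NULL)
		return (EINVAL);
	/* The zone pointer swap and the dump itself would happen here. */
	return (0);
}
```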

This revision was not accepted when it landed; it landed in state Needs Review. May 6 2018, 12:20 AM
This revision was automatically updated to reflect the committed changes.