Details

Reviewers

sjg
rrs
gnn
bz
rwatson

Group Reviewers

network
manpages
Contributor Reviewers (ports)

Commits

rS292484: Add a safety net to reclaim mbufs when one of the mbuf zones become

Summary

It is possible for a bug in the code (or, theoretically, even unusual network conditions) to exhaust all available mbufs or mbuf clusters. When this occurs, things can grind to a halt fairly quickly. However, we currently do not call mb_reclaim() unless the entire system is experiencing a low-memory condition.

While it is best to try to prevent exhaustion of one of the mbuf zones, it would also be useful to have a mechanism to attempt to recover from these situations by freeing "expendable" mbufs.

This patch makes two changes:

a) The patch adds a generic API to the UMA zone allocator to set a function that should be called when an allocation fails because the zone limit has been reached. Because of the way this function can be called, it really should do minimal work; it should be something akin to a signal handler.

b) The patch uses this API to try to free mbufs when an allocation fails from one of the mbuf zones because the zone limit has been reached. The function schedules a callout to run mb_reclaim() in the context of the callout softclock thread.
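The two changes above can be illustrated with a small userland model. This is only a sketch under stated assumptions, not the code in the patch: `struct zone`, `zone_set_maxaction()`, `zone_alloc()`, `note_exhaustion()` and `run_deferred()` are hypothetical names, and a boolean flag stands in for the scheduled callout. The point it shows is the split of work: the handler invoked on the allocation-failure path only records that a reclaim is wanted, and the actual freeing happens later in a separate context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model of a limited zone; not the UMA code. */
struct zone;
typedef void (*maxaction_fn)(struct zone *);

struct zone {
	int limit;              /* maximum number of live items */
	int in_use;             /* items currently allocated */
	int expendable;         /* items a reclaim pass could free */
	maxaction_fn maxaction; /* called when the limit is reached */
	bool reclaim_pending;   /* stands in for a scheduled callout */
};

static void
zone_set_maxaction(struct zone *z, maxaction_fn fn)
{
	z->maxaction = fn;
}

/* Returns true on success, false when the zone limit has been reached. */
static bool
zone_alloc(struct zone *z)
{
	if (z->in_use >= z->limit) {
		if (z->maxaction != NULL)
			z->maxaction(z);
		return (false);
	}
	z->in_use++;
	return (true);
}

/* Handler: does minimal work, it only notes that a reclaim is wanted. */
static void
note_exhaustion(struct zone *z)
{
	z->reclaim_pending = true;
}

/* Stand-in for the deferred context that performs the real reclaim. */
static void
run_deferred(struct zone *z)
{
	if (!z->reclaim_pending)
		return;
	z->reclaim_pending = false;
	z->in_use -= z->expendable;
	z->expendable = 0;
}

/* Exhaust the zone, run the deferred reclaim, then try to allocate again. */
static bool
exhaust_then_recover(void)
{
	struct zone z = { .limit = 4 };

	zone_set_maxaction(&z, note_exhaustion);
	while (zone_alloc(&z))
		;               /* the final, failing call runs the handler */
	assert(z.reclaim_pending);
	z.expendable = 2;       /* pretend two items are expendable */
	run_deferred(&z);
	return (zone_alloc(&z));
}
```

In this model `exhaust_then_recover()` succeeds only because the deferred step frees the expendable items; the handler itself never touches the zone's contents.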

Test Plan

With only the changes to the UMA system, I saw that the system behaved normally when I exhausted all mbuf clusters.

With the additional changes to run mb_reclaim() when an allocation fails from one of the mbuf zones because the zone limit has been reached, I exhausted all mbufs on a system and saw that mb_reclaim() freed them.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

jonlooney_gmail.com updated this revision to Diff 9291. · Oct 10 2015, 2:19 AM

jonlooney_gmail.com retitled this revision to "Add a safety net to reclaim mbufs when one of the mbuf zones become exhausted".

jonlooney_gmail.com updated this object.

jonlooney_gmail.com edited the test plan for this revision.

jonlooney_gmail.com set the repository for this revision to rS FreeBSD src repository - subversion.

jonlooney_gmail.com added a project: network.

jonlooney_gmail.com added a subscriber: network.

Herald added reviewers: manpages, Contributor Reviewers (ports). · Oct 10 2015, 2:19 AM

Herald added a subscriber: imp.

I am trying to start a proper reviewers list ... ;-)

Should not the callout have some sort of lock?

You declare it MP-SAFE but I see no lock as described in the man pages
for callout_reset and friends..

In D3864#81077, @rrs wrote:

> Should not the callout have some sort of lock?
>
> You declare it MP-SAFE but I see no lock as described in the man pages
> for callout_reset and friends..

My understanding - perhaps flawed - was that the callout would be mpsafe because the function the callout actually runs (mb_reclaim()) acquires the appropriate locks before doing any work.

However, I see that the man page specifically talks about the callout structure itself being mpsafe. It is easy enough to add such a lock, so I will do it.

Added a lock around the callout calls.

Herald edited edge metadata. · Oct 15 2015, 7:13 PM

"bump" :-)

If anyone has time to review this, I'd appreciate it.

Thanks!

jonlooney_gmail.com added reviewers: rrs, bz, network. · Oct 29 2015, 12:19 AM

jtl added a subscriber: sjg. · Nov 6 2015, 1:51 PM

fix my comment and I think you will be good to go

sys/kern/kern_mbuf.c
726 ↗	(On Diff #9431)	The early return inside the callout_pending/!callout_active check needs to release the lock that you took.
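The pattern this comment asks for can be sketched as follows. This is a hedged, single-threaded userland illustration, not the code at line 726: `struct toy_lock`, `struct reclaim_state` and `schedule_reclaim()` are hypothetical, and a single `scheduled` flag stands in for the callout state being tested. It shows only that every return path taken after the lock is acquired must release it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for a kernel lock. */
struct toy_lock {
	bool held;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);   /* trips if an earlier return leaked the lock */
	l->held = true;
}

static void
toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct reclaim_state {
	struct toy_lock lock;
	bool scheduled;     /* models "a reclaim is already scheduled" */
};

/*
 * Returns true if this call scheduled a reclaim, false if one was
 * already scheduled.
 */
static bool
schedule_reclaim(struct reclaim_state *s)
{
	toy_lock_acquire(&s->lock);
	if (s->scheduled) {
		/* Early return: drop the lock before leaving. */
		toy_lock_release(&s->lock);
		return (false);
	}
	s->scheduled = true;
	toy_lock_release(&s->lock);
	return (true);
}
```

If the release in the early-return branch were missing, the next call to `schedule_reclaim()` would trip the assertion in `toy_lock_acquire()`; with a real lock the equivalent failure would be a leaked lock.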

This revision now requires changes to proceed. · Nov 9 2015, 3:31 PM

jtl commandeered this revision. · Nov 15 2015, 12:30 AM

jtl added a reviewer: jonlooney_gmail.com.

Herald edited edge metadata. · Nov 15 2015, 12:30 AM

Addressed @rrs's comment.

Also, updated the date in the man page.

Herald edited edge metadata. · Nov 15 2015, 12:36 AM

jtl added inline comments. · Nov 15 2015, 12:38 AM

sys/kern/kern_mbuf.c
726 ↗	(On Diff #10185)	Doh! Thanks! Good catch! This should be fixed now.

jtl edited reviewers, added: sjg; removed: jonlooney_gmail.com. · Nov 15 2015, 12:39 AM

Switch the locking mechanism to a mutex.

Herald edited edge metadata. · Nov 20 2015, 4:18 PM

gnn accepted this revision. · Nov 23 2015, 2:59 PM

gnn edited edge metadata.

glebius added a subscriber: glebius. · Dec 17 2015, 9:45 PM

What about using taskqueue for memory reclaiming purposes? The struct taskqueue will live at the end of struct uma_zone. That would allow a less constrained KPI. uma(9) users would not need to do gymnastics with callout(9) and their maxaction function can go directly into freeing memory, since the context doesn't have any locks.
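A rough userland model of this suggestion, for illustration only. It does not use the taskqueue(9) API: `struct task`, `struct taskq`, `taskq_enqueue()`, `taskq_run()` and `struct zone_model` are hypothetical, and the task is embedded in the zone structure purely as an assumption about how the idea might be laid out. The sketch shows the property the comment relies on: the failing allocation path only enqueues, repeated enqueues collapse into one run, and the reclaim function later runs in a context where it can free memory directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical deferred task; repeated enqueues collapse into one run. */
struct task {
	void (*fn)(void *arg);
	void *arg;
	int pending;            /* enqueue requests since the last run */
	struct task *next;
};

struct taskq {
	struct task *head;
	struct task *tail;
};

static void
taskq_enqueue(struct taskq *q, struct task *t)
{
	assert(t->fn != NULL);
	if (t->pending++ > 0)
		return;         /* already queued: just count the request */
	t->next = NULL;
	if (q->tail != NULL)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
}

/* Runs every queued task once; returns the number of tasks run. */
static int
taskq_run(struct taskq *q)
{
	struct task *t;
	int ran = 0;

	while ((t = q->head) != NULL) {
		q->head = t->next;
		if (q->head == NULL)
			q->tail = NULL;
		t->pending = 0;
		t->fn(t->arg);
		ran++;
	}
	return (ran);
}

/* Hypothetical zone that carries its own reclaim task. */
struct zone_model {
	int in_use;
	int expendable;
	struct task reclaim_task;
};

/* Runs from the queue, so it can free expendable items directly. */
static void
zone_reclaim(void *arg)
{
	struct zone_model *z = arg;

	z->in_use -= z->expendable;
	z->expendable = 0;
}
```

In this model the allocation-failure path would call only `taskq_enqueue()`, which does no freeing itself and is cheap to repeat while a reclaim is already queued.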

I'm not trying to delay this revision, since I know other things depend on it! If you agree on my suggestion we can do it post-commit.

Hi Gleb,

In D3864#97313, @glebius wrote:

> What about using taskqueue for memory reclaiming purposes? The struct taskqueue will live at the end of struct uma_zone. That would allow a less constrained KPI. uma(9) users would not need to do gymnastics with callout(9) and their maxaction function can go directly into freeing memory, since the context doesn't have any locks.

I have been thinking about something similar. I also want to tighten up the locking model. I think task queues could help with that, too.

> I'm not trying to delay this revision, since I know other things depend on it! If you agree on my suggestion we can do it post-commit.

Sounds good. Let's plan on that.

Jonathan

Closed by commit rS292484: Add a safety net to reclaim mbufs when one of the mbuf zones become (authored by jtl). · Dec 20 2015, 2:05 AM

This revision was automatically updated to reflect the committed changes.

Add a safety net to reclaim mbufs when one of the mbuf zones become exhausted
Closed · Public

Revision Contents
Changeset List

Diff 11470

head/share/man/man9/Makefile

head/share/man/man9/zone.9

head/sys/kern/kern_mbuf.c

head/sys/vm/uma.h

head/sys/vm/uma_core.c

head/sys/vm/uma_int.h
