fix a use after free in nfscl_cleanupkext()
Closed · Public

Authored by rmacklem on Feb 22 2022, 12:21 AM.

Details

Reviewers

markj
ler

Commits

rG1cedb4ea1a79: nfscl: Fix a use after free in nfscl_cleanupkext()
rGdd08b84e35b6: nfscl: Fix a use after free in nfscl_cleanupkext()

Summary

ler@ and markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:

In nfscl_cleanup_common(), "own" is the owner string owp->nfsow_owner. If we free that particular owner structure, then in subsequent comparisons "own" will point to freed memory.

nfscl_cleanup_common() can apparently free more than one owner, so isn't the use of LIST_FOREACH_SAFE() in nfscl_cleanupkext() insufficient? That is, it looks like nfscl_cleanup_common() could free both "owp" and "nowp".

I also believe there is a third possible cause: nfscl_freeopenowner() or
nfscl_freelockowner() being called without the NFSCLSTATE mutex held.
This could happen when the exclusive lock is held on the client, such as
when delegations are being returned.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() to return whether or not a free was done.
    When a free was done, do a goto to restart the loop, instead of using FOREACH_SAFE,
    which was not safe in this case.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner() and nfscl_freelockowner(),
    if it is not already held. This serializes all of these calls with the ones done in
    nfscl_cleanup_common().
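Step 1 above can be illustrated with a small userland sketch. It is not the kernel code: the struct, the OWNERLEN value, and the helper names (owner_add, cleanup_common, cleanup_first) are hypothetical stand-ins, and a plain "next" pointer replaces the LIST_ macros. The point it tries to show is only that the key is copied to a local buffer before any element that holds it can be freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OWNERLEN 8	/* hypothetical length, not the kernel's value */

/* Toy owner record; a plain "next" pointer stands in for LIST_ENTRY(). */
struct owner {
	struct owner *next;
	unsigned char name[OWNERLEN];
};

/* Push a new owner with the given (zero-padded) name onto the list. */
static void
owner_add(struct owner **head, const char *name)
{
	struct owner *owp = calloc(1, sizeof(*owp));
	size_t n = strlen(name);

	assert(owp != NULL);
	if (n > OWNERLEN)
		n = OWNERLEN;
	memcpy(owp->name, name, n);
	owp->next = *head;
	*head = owp;
}

/* Free every owner whose name matches "own"; return how many were freed. */
static int
cleanup_common(struct owner **head, const unsigned char *own)
{
	struct owner **pp = head, *owp;
	int freed = 0;

	while ((owp = *pp) != NULL) {
		if (memcmp(owp->name, own, OWNERLEN) == 0) {
			*pp = owp->next;
			free(owp);
			freed++;
		} else
			pp = &owp->next;
	}
	return (freed);
}

/*
 * Clean up all owners matching the first element.  The name is copied
 * into the local "own" first, so cleanup_common() never compares against
 * memory that it has already freed.
 */
static int
cleanup_first(struct owner **head)
{
	unsigned char own[OWNERLEN];

	if (*head == NULL)
		return (0);
	memcpy(own, (*head)->name, OWNERLEN);
	return (cleanup_common(head, own));
}
```

Passing (*head)->name directly to cleanup_common() would reproduce the first reported cause in this toy model: after the first match is freed, later memcmp() calls would read the freed element.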

Test Plan

I am testing it and monitoring the number of openowners and
lockowners allocated in the client.

Hopefully ler@ can test it for their environment, to see if the
use after free still occurs?

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

rmacklem created this revision. Feb 22 2022, 12:21 AM

Herald added a subscriber: imp. Feb 22 2022, 12:21 AM

rmacklem requested review of this revision. Feb 22 2022, 12:21 AM

markj added inline comments. Feb 22 2022, 2:25 PM

sys/fs/nfsclient/nfs_clstate.c
1894	Why is a restart needed in this case?

rmacklem added inline comments. Feb 22 2022, 2:55 PM

sys/fs/nfsclient/nfs_clstate.c
1894	It is not. I just thought it would be weird to return true when freeing open owners, but not when freeing lock owners. I can change the function to only return true when open owners are free'd, if you think that is appropriate? (And comment it as such.)

markj accepted this revision. Feb 22 2022, 3:29 PM

markj added inline comments.

sys/fs/nfsclient/nfs_clstate.c
1894	I think it's logically fine as-is, but I do wonder if the extra restarts might make nfscl_cleanupkext() a lot more expensive in the face of a large number of delegations. Consider that nfscl_cleanupkext() holds all of the PID hash locks. I don't have much intuition for whether this is likely to be a problem in practice, though. Hmm, actually, isn't a restart indeed required for the second loop in nfscl_cleanupkext()? That is, nfscl_cleanup_common() can potentially free multiple lock owners for a given delegation.

This revision is now accepted and ready to land. Feb 22 2022, 3:29 PM

Instead of returning a boolean, nfscl_cleanup_common()
returns flag bits for which of open/lock owner(s) have
been free'd.

This way, the loops only restart when they need to, fixing
the inefficiency caused by returning true for both cases.

As suggested (or at least hinted) by markj@.
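The flag-bit idea in this update can be sketched roughly as follows. This is a userland toy, not the patch itself: the flag names FREED_OPEN and FREED_LOCK, the "busy" field (an assumed reason an owner might match but not be freed), and the rule that a negative pid marks an exited process are all hypothetical. The caller restarts its walk only when the list it is walking lost an element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define FREED_OPEN	0x1	/* hypothetical flag names */
#define FREED_LOCK	0x2

struct owner {
	struct owner *next;
	int pid;	/* stands in for the owner string */
	bool busy;	/* toy rule: busy owners are never freed */
};

struct client {
	struct owner *openowners;
	struct owner *lockowners;
};

static void
owner_add(struct owner **head, int pid, bool busy)
{
	struct owner *owp = calloc(1, sizeof(*owp));

	assert(owp != NULL);
	owp->pid = pid;
	owp->busy = busy;
	owp->next = *head;
	*head = owp;
}

/* Free all non-busy owners matching pid; report whether any was freed. */
static bool
free_matching(struct owner **head, int pid)
{
	struct owner **pp = head, *owp;
	bool freed = false;

	while ((owp = *pp) != NULL) {
		if (owp->pid == pid && !owp->busy) {
			*pp = owp->next;
			free(owp);
			freed = true;
		} else
			pp = &owp->next;
	}
	return (freed);
}

/* Return flag bits saying which kind of owner(s) were freed. */
static int
cleanup_common(struct client *clp, int pid)
{
	int ret = 0;

	if (free_matching(&clp->openowners, pid))
		ret |= FREED_OPEN;
	if (free_matching(&clp->lockowners, pid))
		ret |= FREED_LOCK;
	return (ret);
}

/* Walk the open owners; restart only if an open owner was freed. */
static int
cleanupkext(struct client *clp)
{
	struct owner *owp;
	int restarts = 0;

tryagain:
	for (owp = clp->openowners; owp != NULL; owp = owp->next) {
		if (owp->pid >= 0)	/* toy rule: negative pid == exited */
			continue;
		if ((cleanup_common(clp, owp->pid) & FREED_OPEN) != 0) {
			restarts++;
			goto tryagain;	/* owp may be gone; start over */
		}
	}
	return (restarts);
}
```

When only FREED_LOCK comes back, the open-owner list is unchanged, so the walk can continue from owp without a restart. Each restart still rescans the list from the head, so the cost grows with the number of owners freed.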

This revision now requires review to proceed. Feb 22 2022, 4:57 PM

markj accepted this revision. Feb 22 2022, 5:22 PM

This revision is now accepted and ready to land. Feb 22 2022, 5:22 PM

Closed by commit rGdd08b84e35b6: nfscl: Fix a use after free in nfscl_cleanupkext() (authored by rmacklem). Feb 22 2022, 10:24 PM

This revision was automatically updated to reflect the committed changes.

rmacklem added a commit: rGdd08b84e35b6: nfscl: Fix a use after free in nfscl_cleanupkext().

rmacklem added a reverting change: rG06148d225170: Revert "nfscl: Fix a use after free in nfscl_cleanupkext()". Feb 24 2022, 3:04 PM

cy@ reported via email that he had a problem when
running with the previous patch. He observed the mount
"come to a grinding halt" when under heavy load.
(He was doing "make -j16" builds.)
The CPU was busy, which would have indicated that
the renew thread was very busy.
The "goto tryagain(2)" was only done when an entry
was deleted, so it wasn't exactly an infinite loop, but
it appears that the overhead of repeating the loops
from the beginning was excessive when there were many
open owners.

I do not believe that there should ever be multiple elements
within a list with the same open/lock owner. There can be
multiple elements with the same open/lock owner (which
represents a process on the client), but they should always
be in different lists.
--> As such, #2 should be ok with the FOREACH_SAFE loops
in nfscl_cleanupkext().

This version of the patch retains the original code that uses
FOREACH_SAFE loops, but has "break"s added, to ensure that
nfscl_cleanup_common() only removes the first element that
matches.

This provides a "safety belt" (and optimization) to ensure
that nfscl_cleanup_common() only frees the first matching element.
If there ever is an additional element in the list (due to a bug?), it will
be handled on a subsequent call to nfscl_cleanupkext().
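The "break after the first match" approach can be sketched in userland C as below. This is an illustration under assumptions, not the committed code: the struct, the helper names, and the rule that a negative pid marks an exited process are hypothetical, and a hand-rolled saved-next loop stands in for LIST_FOREACH_SAFE.

```c
#include <assert.h>
#include <stdlib.h>

struct owner {
	struct owner *next;
	int pid;	/* stands in for the owner string */
};

static void
owner_add(struct owner **head, int pid)
{
	struct owner *owp = calloc(1, sizeof(*owp));

	assert(owp != NULL);
	owp->pid = pid;
	owp->next = *head;
	*head = owp;
}

/* Free only the first owner matching pid; return 1 if one was freed. */
static int
cleanup_common(struct owner **head, int pid)
{
	struct owner **pp, *owp;

	for (pp = head; (owp = *pp) != NULL; pp = &owp->next) {
		if (owp->pid == pid) {
			*pp = owp->next;
			free(owp);
			return (1);	/* the "break": stop at first match */
		}
	}
	return (0);
}

/* Single pass with a saved next pointer, in the style of FOREACH_SAFE. */
static int
cleanupkext(struct owner **head)
{
	struct owner *owp, *nowp;
	int freed = 0;

	for (owp = *head; owp != NULL; owp = nowp) {
		nowp = owp->next;	/* saved before owp can be freed */
		if (owp->pid < 0)	/* toy rule: negative pid == exited */
			freed += cleanup_common(head, owp->pid);
	}
	return (freed);
}
```

Because cleanup_common() stops after one free, and in this toy model the first match is always the element being visited, the saved "nowp" is never the element that gets freed. No restart from the head of the list is needed.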

cy@ has confirmed that this version of the patch works fine
for his test case.

The previous patch has been reverted from main.

markj added inline comments. Feb 25 2022, 12:25 AM

sys/fs/nfsclient/nfs_clstate.c
1900	These three lines and the last line of the loop could just become a LIST_FOREACH(), I believe.

rmacklem added inline comments. Feb 25 2022, 2:51 PM

sys/fs/nfsclient/nfs_clstate.c
1894	Yes, the second restart loop is for lock owners, so a return of "true" is needed for the free lock owner case. There is a certain amount of inefficiency caused by a boolean re
1900	Yes, there are still places in the NFS code where FOREACH_SAFE is not used. (This code is so old, some of it was written before FOREACH_SAFE existed in all the BSDen I was maintaining a port to.) I have replaced a bunch of them with FOREACH_SAFE and will probably do this one with a FOREACH soon (as you noted, with "break" added, it does not need to be FOREACH_SAFE), since I noticed this while doing the patch. However, changing it is just code cleanup, which I would rather do separately and later. (No need to get this into 13.1.)

rmacklem added a commit: rG1cedb4ea1a79: nfscl: Fix a use after free in nfscl_cleanupkext(). Feb 25 2022, 3:28 PM

Revision Contents
Changeset List

Path

Size

sys/

fs/

nfsclient/

	nfs_clstate.c
	nfs_clstate.c.freeown

46 lines

Diff 103209


sys/fs/nfsclient/nfs_clstate.c

Property	Old Value	New Value
fbsd:nokeywords	null	yes \ No newline at end of property
svn:eol-style	null	native \ No newline at end of property
svn:mime-type	null	text/plain \ No newline at end of property
