Page MenuHomeFreeBSD

Fix data race in jump cache read/update
ClosedPublic

Authored by smalukav_gmail.com on Aug 9 2021, 9:10 PM.
Tags
None
Referenced Files
F104087732: D31484.id93639.diff
Tue, Dec 3, 9:29 AM
Unknown Object (File)
Wed, Nov 27, 1:17 PM
Unknown Object (File)
Wed, Nov 27, 1:17 PM
Unknown Object (File)
Wed, Nov 27, 1:17 PM
Unknown Object (File)
Wed, Nov 27, 1:16 PM
Unknown Object (File)
Wed, Nov 27, 1:16 PM
Unknown Object (File)
Wed, Nov 27, 1:16 PM
Unknown Object (File)
Wed, Nov 27, 1:15 PM
Subscribers

Details

Summary

There is a data race between jump cache reading and updating

  1. Several threads execute ruleset simultaneously under RLOCK https://reviews.freebsd.org/source/src/browse/main/sys/netpfil/ipfw/ip_fw2.c$1785
  2. Let's say that two threads enter jump_fast, thread A stops there https://reviews.freebsd.org/source/src/browse/main/sys/netpfil/ipfw/ip_fw2.c$1258, thread B stops there https://reviews.freebsd.org/source/src/browse/main/sys/netpfil/ipfw/ip_fw2.c$1245
  3. Thread A writes correct id to f->cached_id
  4. Thread B reads f->cached_id, checks that it's correct, and reads f->cached_pos
  5. Thread A didn't write f->cached_pos, thread B returns incorrect rule position

To fix we need to ensure that cached_pos is updated before cached_id

  1. For 64-bit architectures, we write a 64-bit value composed of both cached_id and cached_pos
  2. For 32-bit ones, we enforce load/store ordering by adding release fence between writing pos and id, and acquire fence between reading id and pos

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/netpfil/ipfw/ip_fw2.c
1264

I'd avoid using atomics in the fast path.

For 64-bit architectures, we can simply write a 64-bit value composed of both cached_id and cached_pos.

For 32-bit ones, we can enforce load/store ordering by adding release fence between pos and id, and acquire fence between reading id and pos.

Generally LGTM, please see a comment on the code structure above.

sys/netpfil/ipfw/ip_fw2.c
1248

I'd restructure the current logic. We only need to update cache if it's not IP_FW_TARG.
With that in mind, current code looks like:

if (cache_correct) f_pos = cache_value
 else {
  f_pos = caculate_value
  update_cache
}
return f_pos

I'd suggest moving all cache fetch/retrieve logic to a separate (inlined) function, so the code looks like the following:

jump_fast() {
  if (num != IP_FW_TARG)
   return (jump_fast_cache(chain, f, num))
 # IP_FW_TARG handler logic
}

jump_fast_cache() {
#if __LP_64__
  if (__predict_true(cache_valid))
   return (value)
 else {
  update-and-return
 }
#else

#endif
}
sys/netpfil/ipfw/ip_fw_private.h
274

Mind naming it "value" or "raw_value" to make it a bit more self-descriptive?

smalukav_gmail.com edited the summary of this revision. (Show Details)
smalukav_gmail.com marked 2 inline comments as done.
This revision is now accepted and ready to land.Aug 13 2021, 10:35 AM
This revision now requires review to proceed.Aug 17 2021, 6:31 AM
This revision was not accepted when it landed; it landed in state Needs Review.Aug 17 2021, 8:35 AM
This revision was automatically updated to reflect the committed changes.