Page MenuHomeFreeBSD

Improve performance and functionality of the bitstring(3) api
ClosedPublic

Authored by asomers on Apr 19 2016, 4:55 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jun 24, 8:25 AM
Unknown Object (File)
Mon, Jun 17, 7:51 PM
Unknown Object (File)
Thu, Jun 13, 11:04 AM
Unknown Object (File)
Tue, Jun 11, 6:37 AM
Unknown Object (File)
Sat, Jun 8, 7:17 PM
Unknown Object (File)
May 21 2024, 6:57 PM
Unknown Object (File)
May 19 2024, 7:34 AM
Unknown Object (File)
May 9 2024, 1:22 PM
Subscribers

Details

Summary

Two new functions are provided, bit_ffs_at() and bit_ffc_at(), which allow
for efficient searching of set or cleared bits starting from any bit offset
within the bit string.

Performance is improved by operating on longs instead of bytes and using
ffsl() for searches within a long. ffsl() is a compiler builtin in both
clang and gcc for most architectures, converting what was a brute force
while loop search into a couple of instructions.

All of the bitstring(3) API continues to be contained in the header file.
Some of the functions are large enough that perhaps they should be uninlined
and moved to a library, but that is beyond the scope of this commit.

sys/sys/bitstring.h:
Convert the majority of the existing bit string implementation from
macros to inline functions.

Properly protect the implementation from inadvertant macro expansion
when included in a user's program by prefixing all private
macros/functions and local variables with '_'.

Add bit_ffs_at() and bit_ffc_at(). Implement bit_ffs() and
bit_ffc() in terms of their "at" counterparts.

Provide a kernel implementation of bit_alloc(), making the full API
usable in the kernel.

Improve code documenation.

share/man/man3/bitstring.3:
Add pre-exisiting API bit_ffc() to the synopsis.

Document new APIs.

Document the initialization state of the bit strings
allocated/declared by bit_alloc() and bit_decl().

Correct documenation for bitstr_size(). The original code comments
indicate the size is in bytes, not "elements of bitstr_t". The new
implementation follows this lead. Only hastd assumed "elements"
rather than bytes and it has been corrected.

etc/mtree/BSD.tests.dist:
tests/sys/Makefile:
tests/sys/sys/Makefile:
tests/sys/sys/bitstring.c:
Add tests for all existing and new functionality.

lib/libbluetooth/bluetooth.h:
usr.sbin/bluetooth/hccontrol/le.c:
Include bitstring.h instead of sys/bitstring.h.

sbin/hastd/activemap.c:
Correct usage of bitstr_size().

sys/kern/subr_unit.c:
Remove hard-coded assumption that sizeof(bitstr_t) is 1. Convert
struct unrb (a "unit number bitmap") from a struct containing a
u_char counter of set bits followed by a bit string bitmap, to a
union of a bit string bitmap and an array of u_char. Reduce the
number of bits used in the bitmap by 8 and use this area to store
the busy count.

sys/boot/common/bcache.c:
Include sys/bitstring.h instead of bitstring.h, so libc includes are
not pulled in.

sys/net/flowtable.c:
Use the new kernel implementation of bit-alloc, instead of hacking
the old libc-dependent macro.

Test Plan

Added ATF test case

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

asomers retitled this revision from to Improve performance and functionality of the bitstring(3) api.
asomers updated this object.
asomers edited the test plan for this revision. (Show Details)
asomers added reviewers: gibbs, dwmalone.
asomers edited edge metadata.

Add bcache.c, which was accidentally left out of the original CR. And bump
__FreeBSD_version to indicate availability of the new API.

Is this the same as D1408? I didn't push that revision because I wanted to update subr_unit.c to avoid the janky union stuff. The accessors should just use the last 8 bits in the bitmap for the count and assemble/decompose the value via mask and shift.

In D6004#128145, @gibbs wrote:

Is this the same as D1408? I didn't push that revision because I wanted to update subr_unit.c to avoid the janky union stuff. The accessors should just use the last 8 bits in the bitmap for the count and assemble/decompose the value via mask and shift.

Yes. I wasn't aware of that other review. And yes the structure of unrb is pretty weird. I'll see if I can easily straighten it out. But first I'll open a separate CR to convert phk's test program into ATF.

asomers edited edge metadata.

Add blkback.c, which was missing from the review.

@gibbs, it seems that unrb.busy is really just a count of the number of bits set in unrb.map. The only place it's ever used is in collapse_unr, which only cares if the value is 0 or len. How would you feel about removing the busy count altogether? All collapse_unr really has to do is check whether the bitstr is equal to 0 or (1 << len) - 1?

Given the small size of the bitmap, and the improved performance of the bitset_ffc/bitset_ffs calls you'd need, i think that would be fine for the non-diagnostic code. For the diganostic code (which isn't performance critical), you'd need to add a bitset_count() API. Doing a proper count of the bitmap seems a more robust check than relying on the locally maintained count anyway (no chance that two bugs in the subr_unit code cancel each other out and cause the diagnostic to fail to catch a bug).

In D6004#128383, @gibbs wrote:

Given the small size of the bitmap, and the improved performance of the bitset_ffc/bitset_ffs calls you'd need, i think that would be fine for the non-diagnostic code. For the diganostic code (which isn't performance critical), you'd need to add a bitset_count() API. Doing a proper count of the bitmap seems a more robust check than relying on the locally maintained count anyway (no chance that two bugs in the subr_unit code cancel each other out and cause the diagnostic to fail to catch a bug).

Except that there wouldn't _be_ anything for the diagnostic code to check. If I eliminate the busy field, then there would be no way for the structure to be inconsistent, and no need to do a bitset_count in check_unhdr.

check_unhdr() still must validate the count in the header. The current brute force "bitset_count" in check_unhdr() works, but bitset itself could do better (e.g. word at a time accumulation using mask, plus, and shift).

I'd talk with Paul Vixie about re-licensing/fixing the license from 3-clause BSD to 2-clause BSD. I doubt he'll object to the change.

share/man/man3/bitstring.3
30–34 ↗(On Diff #15391)

Why is this licensing tort getting tacked on?

sys/boot/common/bcache.c
38 ↗(On Diff #15391)

Can this be moved up?

sys/sys/bitstring.h
1–2 ↗(On Diff #15391)

This should be moved down below the licensing/RCS comments.

34–61 ↗(On Diff #15391)

I'm confused. Why is BSD 2 clause getting tacked on BSD 3 clause?

tests/sys/Makefile
19 ↗(On Diff #15391)

I actually put these in lib/libc/tests/sys (see queue_test.c). This may or may not be an intuitive location to put these header tests...

share/man/man3/bitstring.3
30–34 ↗(On Diff #15391)

@gibbs can you answer this question?

sys/boot/common/bcache.c
38 ↗(On Diff #15391)

No, because it must succeed stand.h. bcache is weird because it's part of neither the kernel nor userland. sys/bitstring.h requires calloc, which is declared in stand.h. sys/bitstring.h knows how to include the appropriate userland or kernel headers to get calloc, but it doesn't know about stand.h, and bcache's Makefile defines no preprocessor macro to indicate that we are using libstand.

However, Allan Jude just checked in a refactor of bcache.c that totally eliminates bitstring. So I won't actually end up committing this file.

tests/sys/Makefile
19 ↗(On Diff #15391)

Why there? These aren't really part of libc. I followed the convention of placing the test as close as possible to the code that it tests.

share/man/man3/bitstring.3
30–34 ↗(On Diff #15391)

How else would you suggest Spectra Logic document the license governing these changes? Note that the license terms are slightly different from the existing ones (e.g. no clause 3, and clause 2 only requires a substantially similar disclaimer (so disclaimer text can be merged by a distributor) without an individual copyright notice).

asomers edited edge metadata.

Update D6004 with:

  • subr_unit refactor to eliminate the busy counter in struct unrb. It's a few percent faster than before, and uses slighly less RAM. This change also eliminates some ugly type punning from @gibb's earlier change.
  • Add boilerplate to makefiles required by the pkg base initiative
  • Remove bcache.c, which no longer uses bitstring
asomers edited edge metadata.

Fix sys/sys/param.h which got lost in a merge, and move an ifdef guard below
the copyright block as suggested by ngie.

sys/kern/subr_unit.c
436 ↗(On Diff #15852)

This line was to tricky to find. The old code didn't actually initialize the entire bitstr, only the lowest us->len bits.

sys/sys/bitstring.h
34–61 ↗(On Diff #15391)

For the same reason that @gibbs described above. He wished to use the 2 clause license for his changes.

gibbs edited edge metadata.
gibbs added inline comments.
sys/kern/subr_unit.c
240–242 ↗(On Diff #15853)

It would be nice if this type of count function were included in the bitstring API. There are faster ways to do this (e.g. using "sideways addition" one bitstr_t at a time or utilizing hw instructions like popcnt), but that should only be done within the API implementation, not by an ouside caller.

258 ↗(On Diff #15853)

Just remove the variable names? They are easy enough to find by looking at the debug version.

440–443 ↗(On Diff #15853)

The typical convention is to use braces in all forks of an if sequence if any of them need braces. Here you could go either way - add or remove braces - depending on what is the prevailing style in the file.

Since this occurs in a few places, perhaps there should be a local or supported "bit_modify/bit_nmodify" API that takes an enum or bool to indicate whether you are setting or clearing? This would avoid typo bugs from duplicating the args.

449–453 ↗(On Diff #15853)

Not necessarily in this diff, but perhaps motivates some kind of bit_copy() addition to the API?

909 ↗(On Diff #15853)

Printing the count here could be useful for someone debugging the code. If you end up adding a bit_count() API, you should restore this functionality.

This revision is now accepted and ready to land.May 3 2016, 8:02 PM
sys/kern/subr_unit.c
240–242 ↗(On Diff #15853)

True, but its usefulness will be limited until somebody implements dynamic dispatch for bitcountl in the kernel. Currently, it will only use the popcnt instruction if you compile the kernel with the right -march option. Still, the fallback SWAR approach in sys/types.h should be faster than the naive approach above.

258 ↗(On Diff #15853)

Will do.

440–443 ↗(On Diff #15853)

Oops. I tried to fix those, but I guess I missed this one. I'll fix it when I commit this revision.

sys/net/flowtable.c
744–747 ↗(On Diff #15853)

Oy... seems like this should be committed separately?

sys/sys/bitstring.h
79–80 ↗(On Diff #15853)

Missing a newline here before the return.

87 ↗(On Diff #15853)

Missing a newline here before the return.

94 ↗(On Diff #15853)

Missing a newline here before the return.

100 ↗(On Diff #15853)

Missing a newline here before the return.

114 ↗(On Diff #15853)

Missing a newline here before the return.

117–120 ↗(On Diff #15853)

Should there be a corresponding _free function to wrap around free, such that you don't have to have 2 separate implementations?

120 ↗(On Diff #15853)

Missing a newline here before the return.

210–212 ↗(On Diff #15853)

Inconsistent braces on single line here and elsewhere.

tests/sys/Makefile
19 ↗(On Diff #15391)

I did that because it already existed and was the same location that the NetBSD tests live.

tests/sys/sys/Makefile
3–5 ↗(On Diff #15853)

This is going to change soon. Stay tuned..

tests/sys/sys/bitstring_test.c
32–37 ↗(On Diff #15853)

The sorting's off:

sys/param.h
bitstring.h
stdio.h
atf-c.h

243–244 ↗(On Diff #15853)
int found_clear_bit, i;

?

320 ↗(On Diff #15853)

Why does this need to be commented?

347–348 ↗(On Diff #15853)

Newline missing here.

357 ↗(On Diff #15853)
  • Newline missing here.
  • No ellipses around atf_no_error().
ngie removed a subscriber: ngie.
sys/sys/bitstring.h
79–80 ↗(On Diff #15853)

I thought I'd find something in style.9 that required a newline in cases of no local variables, but can't find it. An extra newline adds no value to a single line function (no improvement in readability).

210–212 ↗(On Diff #15853)

The style used here, as in many other places in FreeBSD, is that braces are used if any block in an if/else if/else requires them. Otherwise, a single line block doesn't get them. From my reading of style 9, this is allowed.

If the above style is violated in the file, feel free to flag those instances.

sys/net/flowtable.c
744–747 ↗(On Diff #15853)

No. The #define was necessary because the old bit_alloc function required calloc. The new bit_alloc function does not. So it's appropriate to commit it together.

tests/sys/Makefile
19 ↗(On Diff #15391)

Do you want me to move the bitstring tests? I prefer tests/sys/sys for consistency with how we do other tests for code that lives in sys.

tests/sys/sys/bitstring_test.c
320 ↗(On Diff #15853)

Those are the function arguments for the bit_nset_test function. The arguments are defined by the expansion of the BITSTRING_TC_DEFINE macro. Since they're not shown, it's helpful to include the comment.

This revision was automatically updated to reflect the committed changes.