Changeset View
Standalone View
sys/sys/bitset.h
Show First 20 Lines • Show All 265 Lines • ▼ Show 20 Lines | #define BIT_COUNT(_s, p) __extension__ ({ \ | ||||
long __count; \ | long __count; \ | ||||
\ | \ | ||||
__count = 0; \ | __count = 0; \ | ||||
for (__i = 0; __i < __bitset_words((_s)); __i++) \ | for (__i = 0; __i < __bitset_words((_s)); __i++) \ | ||||
__count += __bitcountl((p)->__bits[__i]); \ | __count += __bitcountl((p)->__bits[__i]); \ | ||||
__count; \ | __count; \ | ||||
}) | }) | ||||
/* Non-destructively loop over all set or clear bits in the set. */ | |||||
#define _BIT_FOREACH(_s, i, p, op) \ | |||||
for (__size_t __i = 0; __i < __bitset_words(_s); __i++) \ | |||||
for (long __j = op((p)->__bits[__i]), __b = ffsl(__j); \ | |||||
(i = (__b - 1) + __i * _BITSET_BITS), __j != 0; \ | |||||
__j &= ~(1l << i), __b = ffsl(__j)) | |||||
rlibby: This nested for loop construct has a booby trap, continue/break don't work, and it isn't… | |||||
cemUnsubmitted Not Done Inline ActionsNice catch. Re bitset size, we could static assert that? Unless we have a bitset using this construct that isn’t a multiple of word size. cem: Nice catch. Re bitset size, we could static assert that? Unless we have a bitset using this… | |||||
rlibbyUnsubmitted Not Done Inline Actions
I don't want to distract from the feature too much here since this is an existing issue, just wanted to raise awareness. We can take it up later. That said: Since not all bitsets have a statically defined size, I don't think it will be easy to enforce static rules about it. On the other hand, I think it's not too hard to fix and potentially has no extra cost for static size sets with nice sizes. rlibby: > Re bitset size, we could static assert that? Unless we have a bitset using this construct… | |||||
markjAuthorUnsubmitted Done Inline ActionsThanks for pointing this out, I feel quite dumb for not having thought about it. (Though, I think it is just "break" that is broken and not "continue"?) I've spent a fair bit of time trying to repair this. It's straightforward to eliminate the nesting if I use a statement expression which conditionally advances to the next non-zero limb of the bitmap. The problem is that clang and gcc both generate quite pessimal code no matter how I try to write it. As a result the new no-nesting macro ends up much slower than before if the bitmap is mostly populated. Most of the time it's slower than a simple loop over each bit[*]. The new macro is still much faster for sparse bitmaps, so maybe that's acceptable. Now I think it's possible to fix the problem without removing the nesting: if the outer loop additionally tests a condition which is false only if the inner loop terminates without clearing every bit in the copied limb, then "break" works and the loop runs basically as fast as before. So, is there anything other than "break" that I should consider? If I go ahead with this idea (which works with some basic tests at least), I would include some regression tests, probably using your tests as a starting point. \* With a full bitmap, testing each bit is faster than anything else I've implemented if I also disable loop unrolling. But with cpusets (size 256) I expect the compiler to do some unrolling anyway. markj: Thanks for pointing this out, I feel quite dumb for not having thought about it. (Though, I… | |||||
rlibbyUnsubmitted Not Done Inline ActionsI think it's right that it's just break that is broken and continue is fine. I think your suggestion to check an additional condition in the outer loop should work, I can't think of any other grammar problems. I wouldn't complain about that approach. You might try something like this if you haven't already: #define _BIT_FOREACH_ADVANCE(_s, i, p, op, __iter) __extension__ ({ \ int __found; \ for (;;) { \ if (__iter.v != 0) { \ int __b = ffsl(__iter.v) - 1; \ __iter.v &= ~(1l << __b); \ (i) = __iter.i * _BITSET_BITS + __b; \ __found = 1; \ break; \ } \ __iter.i++; \ if (__iter.i >= __bitset_words(_s)) { \ __found = 0; \ break; \ } \ __iter.v = op((p)->__bits[__iter.i]); \ } \ __found != 0; \ )} struct _bitset_iter { ssize_t i; long v; }; #define _BIT_FOREACH(_s, i, p, op) \ for (struct _bitset_iter __iter = { -1, 0 }; \ _BIT_FOREACH_ADVANCE(_s, i, p, op, __iter); ) (Untested, not compiled, just a sketch.) rlibby: I think it's right that it's just break that is broken and continue is fine.
I think your… | |||||
#define BIT_FOREACH_ISSET(_s, i, p) _BIT_FOREACH(_s, i, p, ) | |||||
#define BIT_FOREACH_ISCLR(_s, i, p) _BIT_FOREACH(_s, i, p, ~) | |||||
#define BITSET_T_INITIALIZER(x) \ | #define BITSET_T_INITIALIZER(x) \ | ||||
{ .__bits = { x } } | { .__bits = { x } } | ||||
#define BITSET_FSET(n) \ | #define BITSET_FSET(n) \ | ||||
[ 0 ... ((n) - 1) ] = (-1L) | [ 0 ... ((n) - 1) ] = (-1L) | ||||
#define BITSET_SIZE(_s) (__bitset_words((_s)) * sizeof(long)) | #define BITSET_SIZE(_s) (__bitset_words((_s)) * sizeof(long)) | ||||
/* | /* | ||||
* Dynamically allocate a bitset. | * Dynamically allocate a bitset. | ||||
*/ | */ | ||||
#define BITSET_ALLOC(_s, mt, mf) malloc(BITSET_SIZE((_s)), mt, (mf)) | #define BITSET_ALLOC(_s, mt, mf) malloc(BITSET_SIZE((_s)), mt, (mf)) | ||||
#endif /* !_SYS_BITSET_H_ */ | #endif /* !_SYS_BITSET_H_ */ |
This nested for loop construct has a booby trap, continue/break don't work, and it isn't documented. A way out of this is to flatten it into one loop, perhaps using helper functions or macros to make it readable.
This macro also has a problem (not new, like most of the others here) that if the bit set size is not a multiple of the word size, garbage will be reported at the end. This may be particularly acute for the BIT_FOREACH_ISCLR macro. I started a patch to fix that in the other macros but didn't finish it: https://github.com/rlibby/freebsd/commit/a205c7d865d3d826b96fe48d7011fd870f3cbb93
I also started and didn't finish some tests: https://github.com/rlibby/freebsd/commit/5dc206eb1b52aad570688b260a7f473b31bb0266