taskqgroup initialization was broken into two steps:
1. allocate the taskqgroup structure, at SI_SUB_TASKQ
2. initialize taskqueues, start taskqueue threads, enqueue "binder"
   tasks to bind threads to specific CPUs, at SI_SUB_SMP
There is some complexity to handle the case where code attaches tasks to
a queue before step 2, in which case step 2 must migrate tasks to their
intended queue. Moreover, tasks can't be enqueued until step 2 has
completed. This breaks setups where EARLY_AP_STARTUP is not defined
and an iflib-based driver is needed to mount a network root filesystem,
since mountroot happens before SI_SUB_SMP in this case.
If we do all initialization and thread creation in step 1, and only
defer CPU binding to SI_SUB_SMP, then things become a lot simpler and
the aforementioned problem is fixed. This does mean that there is a
window during which taskqueue threads may run on a CPU other than the
one to which they are supposed to be bound. I believe, but am not yet
100% certain, that this does not affect code correctness.
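
To illustrate the proposed ordering, here is a minimal userspace sketch in
C. It is only a model, not the kernel code: every name in it
(model_group, model_group_init, model_group_bind, model_enqueue,
MODEL_NO_CPU) is hypothetical and does not correspond to the taskqgroup
API. It assumes one queue per CPU and shows that a queue accepts tasks as
soon as the early stage has run, while the CPU binding stays unset until
the late stage.

```c
#include <stdbool.h>

#define MODEL_MAX_QUEUES 4
#define MODEL_NO_CPU (-1)

/* Hypothetical model of one queue in a task queue group. */
struct model_queue {
	bool running;   /* worker thread exists and can service tasks */
	int bound_cpu;  /* CPU the worker is pinned to, or MODEL_NO_CPU */
	int pending;    /* number of tasks waiting on this queue */
};

struct model_group {
	struct model_queue queues[MODEL_MAX_QUEUES];
	int nqueues;
};

/*
 * Early stage (SI_SUB_TASKQ in the description above): do all
 * initialization and "start" every worker, but leave it unbound.
 */
void
model_group_init(struct model_group *g, int nqueues)
{
	if (nqueues > MODEL_MAX_QUEUES)
		nqueues = MODEL_MAX_QUEUES;
	g->nqueues = nqueues;
	for (int i = 0; i < nqueues; i++) {
		g->queues[i].running = true;
		g->queues[i].bound_cpu = MODEL_NO_CPU;
		g->queues[i].pending = 0;
	}
}

/*
 * Late stage (SI_SUB_SMP): the only remaining work is to pin each
 * worker to its CPU. Assumption for this model: queue i maps to CPU i.
 */
void
model_group_bind(struct model_group *g)
{
	for (int i = 0; i < g->nqueues; i++)
		g->queues[i].bound_cpu = i;
}

/* Enqueue succeeds as soon as the queue is running, bound or not. */
bool
model_enqueue(struct model_group *g, int qidx)
{
	if (qidx < 0 || qidx >= g->nqueues || !g->queues[qidx].running)
		return (false);
	g->queues[qidx].pending++;
	return (true);
}
```

In this model, a caller that runs between model_group_init() and
model_group_bind() can enqueue work immediately, and no task migration is
needed later because the queue it used is already the final one. The
period in which bound_cpu is still MODEL_NO_CPU corresponds to the window
mentioned above.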
The change splits initialization as described above. It also removes
the code that handles re-adjustment of taskqueue CPU bindings, since
that code is unused. The change also lets us simplify iflib
initialization a bit.