Page MenuHomeFreeBSD

sound: Retire SND_MAXVCHANS
Needs ReviewPublic

Authored by christos on Sep 3 2024, 3:26 PM.
Tags
None
Referenced Files
F99029119: D46520.id142818.diff
Sat, Oct 5, 10:11 PM
Unknown Object (File)
Sat, Oct 5, 7:40 AM
Unknown Object (File)
Sat, Oct 5, 1:48 AM
Unknown Object (File)
Thu, Oct 3, 12:55 AM
Unknown Object (File)
Wed, Oct 2, 8:14 PM
Unknown Object (File)
Wed, Oct 2, 5:18 PM
Unknown Object (File)
Wed, Oct 2, 12:11 PM
Unknown Object (File)
Wed, Oct 2, 11:43 AM
Subscribers

Details

Summary

We have hw.snd.maxautovchans already, so use this as the upper VCHAN
limit instead of having a hardcoded, and by nowadays' standards low, cap
(SND_MAXVCHANS) compiled into the module.

Additionally, bump the default hw.snd.maxautovchans value to 256. The
current value of 16 is too low and can be easily reached (e.g with a
browser having more than 16 audio tabs open). The user can theoretically
increase the limit, but it is easier to just ship with a hard-to-reach
value, than let the user figure this out.

A side-effect of this change is that now a user with root access can
potentially set hw.snd.maxautovchans to a very large value, and freeze
the machine by spawning too many channels. However, this is not
something that sound(4) should prevent, as we are currently also
restricting the ability to spawn an arbitrary amount of channels, even
for machines that can handle it.

Sponsored by: The FreeBSD Foundation
MFC after: 2 days

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 59326
Build 56213: arc lint + arc unit

Event Timeline

I will also add a RELNOTES entry when I commit it.

olce added a subscriber: olce.

LGTM.

This revision is now accepted and ready to land.Sep 3 2024, 3:56 PM

We're changing semantics of hw.snd.maxautovchans here. Currently it's possible to override maxautovchans via the per pcm device vchans sysctl, with this patch it becomes a hard limit. This will break some use cases, e.g. turning off vchans by default through sysctl hw.snd.maxautovchans=0 and then turning it on for some specific device with sysctl dev.pcm.0.play.vchans=32. While that's also achievable the other way round, we're definitely going to break some people's workflows.

To be strictly backward compatible, we'd have to add a hw.snd.maxvchans sysctl and keep the semantics of hw.snd.maxautovchans. If we decide not to, we should at least make it obvious by renaming the sysctl to hw.snd.maxvchans and adjust the description in sound(4).

Somehow I don't get this reasoning from the commit description:

A side-effect of this change is that now a user with root access can
potentially set hw.snd.maxautovchans to a very large value, and freeze
the machine by spawning too many channels. However, this is not
something that sound(4) should prevent, as we are currently also
restricting the ability to spawn an arbitrary amount of channels, even
for machines that can handle it.

sys/dev/sound/pcm/vchan.c
945–946

I tripped over this assertion while testing, and I think it's bogus as it may be called with an arbitrary newcnt value by sysctl_dev_pcm_vchans(). It has no effect on releases, but easily panics CURRENT.

As a collateral it prevents me from testing a high number of vchans on CURRENT.

We're changing semantics of hw.snd.maxautovchans here. Currently it's possible to override maxautovchans via the per pcm device vchans sysctl, with this patch it becomes a hard limit.

sysctl_dev_pcm_vchans() already had a hard limit at SND_MAXVCHANS, which is the same as the new default hw.snd.maxautovchans value as of this patch (hence why I bumped it to 256). With the current patch we can at least increase (or decrease of course) the hard limit to whatever we want, whereas with SND_MAXVCHANS we hard-cap at 256 and one needs to recompile the module/kernel to set a different cap. If you keep hw.snd.maxautovchans at 256, there shouldn't be any difference in behavior.

This will break some use cases, e.g. turning off vchans by default through sysctl hw.snd.maxautovchans=0 and then turning it on for some specific device with sysctl dev.pcm.0.play.vchans=32. While that's also achievable the other way round, we're definitely going to break some people's workflows.

This can indeed break. However, I think it makes more intuitive sense to set dev.pcm.X.[play|rec].vchans to 0 to disable VCHANs, than hw.snd.maxautovchans. I know you said "by default", but still, that's just my humble opinion :)

To be strictly backward compatible, we'd have to add a hw.snd.maxvchans sysctl and keep the semantics of hw.snd.maxautovchans. If we decide not to, we should at least make it obvious by renaming the sysctl to hw.snd.maxvchans and adjust the description in sound(4).

An alternative is to just set SND_MAXVCHANS to INT_MAX or something, but I think it's cleaner to do away with the hard-caps in general.

Somehow I don't get this reasoning from the commit description:

A side-effect of this change is that now a user with root access can
potentially set hw.snd.maxautovchans to a very large value, and freeze
the machine by spawning too many channels. However, this is not
something that sound(4) should prevent, as we are currently also
restricting the ability to spawn an arbitrary amount of channels, even
for machines that can handle it.

That because SND_MAXVCHANS introduces an arbitrary/low VCHAN limit, we restrict use cases where someone wants to have more than 256 VCHANs. But at the same time, getting rid of any limit means that an attacker (for the sake of argument) with root access to the machine can just set hw.snd.maxautovchans (and spawn hw.snd.maxautovchans VCHANs) to something the machine cannot handle, and freeze the machine. It is a silly argument - all I am just saying is that, if we keep the upper boundary (i.e SND_MAXVCHAN) for the sake of preventing this kind of abusive behavior, this isn't something that the sound driver should be preventing, but the user themselves.

sys/dev/sound/pcm/vchan.c
945–946

I think I (or someone else, cannot remember...) had a similar issue with this KASSERT in the past, and was wondering whether it is useful to have it in the first place or not. Is this related to the patch?

We're changing semantics of hw.snd.maxautovchans here. Currently it's possible to override maxautovchans via the per pcm device vchans sysctl, with this patch it becomes a hard limit.

sysctl_dev_pcm_vchans() already had a hard limit at SND_MAXVCHANS, which is the same as the new default hw.snd.maxautovchans value as of this patch (hence why I bumped it to 256). With the current patch we can at least increase (or decrease of course) the hard limit to whatever we want, whereas with SND_MAXVCHANS we hard-cap at 256 and one needs to recompile the module/kernel to set a different cap. If you keep hw.snd.maxautovchans at 256, there shouldn't be any difference in behavior.

Understood, and I agree with the motivation for this change.

This will break some use cases, e.g. turning off vchans by default through sysctl hw.snd.maxautovchans=0 and then turning it on for some specific device with sysctl dev.pcm.0.play.vchans=32. While that's also achievable the other way round, we're definitely going to break some people's workflows.

This can indeed break. However, I think it makes more intuitive sense to set dev.pcm.X.[play|rec].vchans to 0 to disable VCHANs, than hw.snd.maxautovchans. I know you said "by default", but still, that's just my humble opinion :)

I'm fine with that opinion. Problem is, it's far easier to set hw.snd.maxautovchans=0 than going for the individual pcm devices in both directions. And it's suggested for exactly this purpose by sound(4):

hw.snd.maxautovchans
        Global VCHAN setting that only affects devices with at least one
        playback or recording channel available.  The sound system will
        dynamically create up to this many VCHANs.  Set to “0” if no
        VCHANs are desired.  Maximum value is 256.

Therefore people are currently using it, and it's not a good idea to silently break their workflow.

BTW, we could reduce the need to disable vchans in the first place, but that's a future topic.

To be strictly backward compatible, we'd have to add a hw.snd.maxvchans sysctl and keep the semantics of hw.snd.maxautovchans. If we decide not to, we should at least make it obvious by renaming the sysctl to hw.snd.maxvchans and adjust the description in sound(4).

An alternative is to just set SND_MAXVCHANS to INT_MAX or something, but I think it's cleaner to do away with the hard-caps in general.

Then you could just as well omit these checks. I'd prefer an explicit limit as stated above, my concern is how to communicate this change explicitly to the user.

Somehow I don't get this reasoning from the commit description:

A side-effect of this change is that now a user with root access can
potentially set hw.snd.maxautovchans to a very large value, and freeze
the machine by spawning too many channels. However, this is not
something that sound(4) should prevent, as we are currently also
restricting the ability to spawn an arbitrary amount of channels, even
for machines that can handle it.

That because SND_MAXVCHANS introduces an arbitrary/low VCHAN limit, we restrict use cases where someone wants to have more than 256 VCHANs. But at the same time, getting rid of any limit means that an attacker (for the sake of argument) with root access to the machine can just set hw.snd.maxautovchans (and spawn hw.snd.maxautovchans VCHANs) to something the machine cannot handle, and freeze the machine. It is a silly argument - all I am just saying is that, if we keep the upper boundary (i.e SND_MAXVCHAN) for the sake of preventing this kind of abusive behavior, this isn't something that the sound driver should be preventing, but the user themselves.

It's all clear up to "However, this is not something that sound(4) should prevent,", what came after that confused me. Anyway, not that important.

sys/dev/sound/pcm/vchan.c
945–946

I don't see any reasonable thing to check here in the general case when being called from sysctl_dev_pcm_vchans(), so I'd just remove it.

This will break some use cases, e.g. turning off vchans by default through sysctl hw.snd.maxautovchans=0 and then turning it on for some specific device with sysctl dev.pcm.0.play.vchans=32. While that's also achievable the other way round, we're definitely going to break some people's workflows.

This can indeed break. However, I think it makes more intuitive sense to set dev.pcm.X.[play|rec].vchans to 0 to disable VCHANs, than hw.snd.maxautovchans. I know you said "by default", but still, that's just my humble opinion :)

I'm fine with that opinion. Problem is, it's far easier to set hw.snd.maxautovchans=0 than going for the individual pcm devices in both directions. And it's suggested for exactly this purpose by sound(4):

[...]

Therefore people are currently using it, and it's not a good idea to silently break their workflow.

Yeah... I am trying to figure out a way to address this but also keep a clean interface, instead of introducing more sysctls and complexity. An alternative idea could be to remove the checks in vchan_setnew() and sysctl_dev_pcm_vchans(). This way we can still disable VCHANs through sysctl_hw_snd_maxautovchans() and override with sysctl_dev_pcm_vchans().

BTW, we could reduce the need to disable vchans in the first place, but that's a future topic.

Reduce the need by adding, say, some heuristics in sound(4) that will decide whether having VCHANs or not is optimal? Or make the VCHAN chain more modular (i.e editable) so we can make them faster on-demand? What do you have in mind?

sys/dev/sound/pcm/vchan.c
945–946

Remove snd_maxautovchans checks in sysctl_dev_pcm_vchans() and vchan_setnew()
to preserve previous behavior. This way we can still disable VCHANs globally
through hw.snd.maxautovchans and enable them on demand through
dev.pcm.X.[play|rec].vchans.

This revision now requires review to proceed.Sep 5 2024, 8:33 PM
share/man/man4/pcm.4
370

I bumped the manpage date afterwards, no point in updating the diff here.

This is a clever solution for the maxautovchans sysctl. There's some aftermath though. With lots of channels and sysctl hw.snd.verbose=2, sndstat_prepare_pcm() does some memory allocation in sbuf_printf() while holding a lock:

uma_zalloc_debug: zone "malloc-65536" with the following non-sleepable locks held:
exclusive sleep mutex pcm0:virtual_record:dsp0.vr216 (pcm virtual record channel) r = 0 (0xfffff800015a8460) locked @ /usr/src/sys/dev/sound/pcm/sndstat.c:1261
stack backtrace:
#0 0xffffffff80bc423c at witness_debugger+0x6c
#1 0xffffffff80bc5433 at witness_warn+0x403
#2 0xffffffff80ef2e44 at uma_zalloc_debug+0x34
#3 0xffffffff80ef2967 at uma_zalloc_arg+0x27
#4 0xffffffff80b21e8d at malloc+0x7d
#5 0xffffffff80bac3eb at sbuf_extend+0x7b
#6 0xffffffff80bac9aa at sbuf_put_byte+0xda
#7 0xffffffff80ba6318 at kvprintf+0xe68
#8 0xffffffff80bac7ed at sbuf_vprintf+0x6d
#9 0xffffffff80bac89f at sbuf_printf+0x3f
#10 0xffffffff808df4b4 at sndstat_read+0x7c4
#11 0xffffffff809d8fd4 at devfs_read_f+0xe4
#12 0xffffffff80bc8f40 at dofileread+0x80
#13 0xffffffff80bc8ac7 at sys_read+0xb7
#14 0xffffffff810779e8 at amd64_syscall+0x158
#15 0xffffffff81049c3b at fast_syscall_common+0xf8

I think we fixed a similar case for the nvlist sndstat interface.

Another consequence of "unlimited" vchans is that when setting a high value like sysctl dev.pcm.0.rec.vchans=100000, this keeps the sysctl process thread in the kernel for several minutes, and I wasn't able to kill it. That's because we preallocate the vchans directly in this sysctl. The same happens when you reduce the number of vchans again afterwards, for deallocation.

I don't see any security issue here as only root can set this sysctl. But I suppose it's more than just a nuisance. It blocks other sysctl requests, and if someone does sysctl fuzzing it may leave the fuzzer hanging for a very long time. @olce, what do you think?

This is a clever solution for the maxautovchans sysctl. There's some aftermath though. With lots of channels and sysctl hw.snd.verbose=2, sndstat_prepare_pcm() does some memory allocation in sbuf_printf() while holding a lock:

uma_zalloc_debug: zone "malloc-65536" with the following non-sleepable locks held:
exclusive sleep mutex pcm0:virtual_record:dsp0.vr216 (pcm virtual record channel) r = 0 (0xfffff800015a8460) locked @ /usr/src/sys/dev/sound/pcm/sndstat.c:1261
stack backtrace:
#0 0xffffffff80bc423c at witness_debugger+0x6c
#1 0xffffffff80bc5433 at witness_warn+0x403
#2 0xffffffff80ef2e44 at uma_zalloc_debug+0x34
#3 0xffffffff80ef2967 at uma_zalloc_arg+0x27
#4 0xffffffff80b21e8d at malloc+0x7d
#5 0xffffffff80bac3eb at sbuf_extend+0x7b
#6 0xffffffff80bac9aa at sbuf_put_byte+0xda
#7 0xffffffff80ba6318 at kvprintf+0xe68
#8 0xffffffff80bac7ed at sbuf_vprintf+0x6d
#9 0xffffffff80bac89f at sbuf_printf+0x3f
#10 0xffffffff808df4b4 at sndstat_read+0x7c4
#11 0xffffffff809d8fd4 at devfs_read_f+0xe4
#12 0xffffffff80bc8f40 at dofileread+0x80
#13 0xffffffff80bc8ac7 at sys_read+0xb7
#14 0xffffffff810779e8 at amd64_syscall+0x158
#15 0xffffffff81049c3b at fast_syscall_common+0xf8

I think we fixed a similar case for the nvlist sndstat interface.

I think that's the same problem with what we discussed in https://reviews.freebsd.org/D45982#1050109, but I haven't worked on it yet.

Another consequence of "unlimited" vchans is that when setting a high value like sysctl dev.pcm.0.rec.vchans=100000, this keeps the sysctl process thread in the kernel for several minutes, and I wasn't able to kill it. That's because we preallocate the vchans directly in this sysctl. The same happens when you reduce the number of vchans again afterwards, for deallocation.

I don't see any security issue here as only root can set this sysctl. But I suppose it's more than just a nuisance. It blocks other sysctl requests, and if someone does sysctl fuzzing it may leave the fuzzer hanging for a very long time. @olce, what do you think?

That would be an example of what I mention in the last paragraph of the commit message. :)

Another consequence of "unlimited" vchans is that when setting a high value like sysctl dev.pcm.0.rec.vchans=100000, this keeps the sysctl process thread in the kernel for several minutes, and I wasn't able to kill it. That's because we preallocate the vchans directly in this sysctl. The same happens when you reduce the number of vchans again afterwards, for deallocation.

I don't see any security issue here as only root can set this sysctl. But I suppose it's more than just a nuisance. It blocks other sysctl requests, and if someone does sysctl fuzzing it may leave the fuzzer hanging for a very long time. @olce, what do you think?

That would be an example of what I mention in the last paragraph of the commit message. :)

IMHO, these are not the same thing. My machine was actually capable of creating 100000 vchans (I didn't try to use them). The problem is that setting sysctl dev.pcm.0.rec.vchans=100000 runs in kernel for more than 5 minutes. That's at least non-customary for sysctls. And it blocks other sysctl requests while running. This can happen inadvertently, and as discussed before there is reason for people to increase this sysctl. People make mistakes, or they'll just overdo it because they are unaware of the consequences.

Which is why I would like to know the opinion of @markj, @emaste or @olce first.

Another consequence of "unlimited" vchans is that when setting a high value like sysctl dev.pcm.0.rec.vchans=100000, this keeps the sysctl process thread in the kernel for several minutes, and I wasn't able to kill it. That's because we preallocate the vchans directly in this sysctl. The same happens when you reduce the number of vchans again afterwards, for deallocation.

I don't see any security issue here as only root can set this sysctl. But I suppose it's more than just a nuisance. It blocks other sysctl requests, and if someone does sysctl fuzzing it may leave the fuzzer hanging for a very long time. @olce, what do you think?

That would be an example of what I mention in the last paragraph of the commit message. :)

IMHO, these are not the same thing. My machine was actually capable of creating 100000 vchans (I didn't try to use them). The problem is that setting sysctl dev.pcm.0.rec.vchans=100000 runs in kernel for more than 5 minutes. That's at least non-customary for sysctls. And it blocks other sysctl requests while running. This can happen inadvertently, and as discussed before there is reason for people to increase this sysctl. People make mistakes, or they'll just overdo it because they are unaware of the consequences.

Which is why I would like to know the opinion of @markj, @emaste or @olce first.

I am having a discussion with @olce now, and I would like to share my proposal. So, my opinion is that having the *vchans sysctls in the first place makes the experience quite complicated. I know they give us the ability to have very high control over the sound system, but I am still not convinced they really serve a good purpose. Their most typical use is to either disable/enable VCHANs, or garbage collect unused VCHANs.

I think we'd be better off scrapping all *vchans sysctls, and instead allocating vchans on demand (i.e allocating on dsp_open() and deallocating on dsp_close()). This would simplify both the code and the user experience significantly, since the only user-facing controls there will be is simply a knob to enable/disable vchans (we could even name it dev.pcm.X.[play|rec].autoconversion or something to make it more intuitive). The only obstacle to making this work properly is figuring out (and fixing) what takes so long to allocate the vchans, as you highlighted. What do you think?

I'm now thinking that we should probably keep some cap on the total number of system vchans, but set it to a very high value with respect to expected uses on current machines (e.g. ~50,000), just to avoid obvious footshooting.

However, doing this loses criticality if, as per @christos ' idea (discussed elsewhere), channels can be allocated on-demand (at DSP open) instead, when setting dev.pcm.X.rec.vchans.

In any case, figuring why allocating 100,000 channels takes more than a minute would be good because, a priori and without much knowledge of this subsystem, I would expect it to be almost unnoticeable, so there may be something wrong going on here.

IMHO, these are not the same thing. My machine was actually capable of creating 100000 vchans (I didn't try to use them). The problem is that setting sysctl dev.pcm.0.rec.vchans=100000 runs in kernel for more than 5 minutes. That's at least non-customary for sysctls. And it blocks other sysctl requests while running. This can happen inadvertently, and as discussed before there is reason for people to increase this sysctl. People make mistakes, or they'll just overdo it because they are unaware of the consequences.

Which is why I would like to know the opinion of @markj, @emaste or @olce first.

I am having a discussion with @olce now, and I would like to share my proposal. So, my opinion is that having the *vchans sysctls in the first place makes the experience quite complicated. I know they give us the ability to have very high control over the sound system, but I am still not convinced they really serve a good purpose. Their most typical use is to either disable/enable VCHANs, or garbage collect unused VCHANs.

I think we'd be better off scrapping all *vchans sysctls, and instead allocating vchans on demand (i.e allocating on dsp_open() and deallocating on dsp_close()). This would simplify both the code and the user experience significantly, since the only user-facing controls there will be is simply a knob to enable/disable vchans (we could even name it dev.pcm.X.[play|rec].autoconversion or something to make it more intuitive). The only obstacle to making this work properly is figuring out (and fixing) what takes so long to allocate the vchans, as you highlighted. What do you think?

Agreed. I can only think of two use cases where the current *vchan sysctl has an advantage over a simple on/off knob:

  • To preallocate a certain number of vchans.
  • To keep maxautovchans at a low, non-zero level, and allow more vchans for a specific device.

Preallocation should be irrelevant unless we have a huge performance problem in the allocation code. The latter is more a theoretical exercise, I don't expect anyone to do that.

All in all I think this is a good proposal, but I'd vote to keep the name of the *vchans as is and make its sysctl interface compatible with how users currently turn vchans on and off.

I'm now thinking that we should probably keep some cap on the total number of system vchans, but set it to a very high value with respect to expected uses on current machines (e.g. ~50,000), just to avoid obvious footshooting.

However, doing this loses criticality if, as per @christos ' idea (discussed elsewhere), channels can be allocated on-demand (at DSP open) instead, when setting dev.pcm.X.rec.vchans.

Also agreed.

In any case, figuring why allocating 100,000 channels takes more than a minute would be good because, a priori and without much knowledge of this subsystem, I would expect it to be almost unnoticeable, so there may be something wrong going on here.

I suspect the traversal of the channel list to have an impact. If we traverse it for every allocation, and the list getting longer, we get O(N^2) for N allocations. That should be easy to prove by doing a benchmark of allocations with few vchans and compare that to situations with 1'000, 10'000 and 100'000 preallocated vchans.

BTW, I'm on vacation this week, so can't run any tests or do research in the code.

I suspect the traversal of the channel list to have an impact. If we traverse it for every allocation, and the list getting longer, we get O(N^2) for N allocations. That should be easy to prove by doing a benchmark of allocations with few vchans and compare that to situations with 1'000, 10'000 and 100'000 preallocated vchans.

Some brief tests seem to support this theory, we have a scaling problem. Allocating 1000 vchans from 0 is almost instantaneous, while allocating from 100000 to 101000 takes about 10 seconds.

I think even without preallocation, we still need some hard upper bound for the number of vchans, like @olce proposed. 10ms per allocation can already be problematic for multimedia applications. And we should have a close look at how we deallocate the vchans again.

I suspect the traversal of the channel list to have an impact. If we traverse it for every allocation, and the list getting longer, we get O(N^2) for N allocations. That should be easy to prove by doing a benchmark of allocations with few vchans and compare that to situations with 1'000, 10'000 and 100'000 preallocated vchans.

Some brief tests seem to support this theory, we have a scaling problem. Allocating 1000 vchans from 0 is almost instantaneous, while allocating from 100000 to 101000 takes about 10 seconds.

I think even without preallocation, we still need some hard upper bound for the number of vchans, like @olce proposed. 10ms per allocation can already be problematic for multimedia applications. And we should have a close look at how we deallocate the vchans again.

I'll submit a patch that (de-)allocates vchans on demand (i.e on open() and close() respectively), and we can work from there.