
Enable VIMAGE by default
Closed, Public

Authored by bz on Oct 11 2017, 4:22 PM.

Details

Summary

Enable VIMAGE in GENERIC kernels and some others (where
GENERIC does not exist).

Disable building LINT-VIMAGE with VIMAGE being default.

This should give it a lot more exposure in the run-up to
12 to help us make the decision on whether to keep it on
by default or not.

Requested by: many
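
For reference, the per-config change is a single kernel option line; the QEMU hunk quoted at the end of this review shows the form it takes:

  options VIMAGE # Subsystem virtualization, e.g. VNET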

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

👍 (thumbs up) from me. I've applied this to my local test tree.

kp added a subscriber: kp.

Thumbs up. Let's do this.

This revision is now accepted and ready to land. Oct 11 2017, 9:31 PM

I remain concerned about the performance overhead of activating this where not needed. However, I have no performance information to either alleviate or confirm my fears. Has someone done the work to gather performance information to compare a non-VIMAGE kernel to a VIMAGE kernel with a single VNET?

In D12639#262566, @jtl wrote:

I remain concerned about the performance overhead of activating this where not needed. However, I have no performance information to either alleviate or confirm my fears. Has someone done the work to gather performance information to compare a non-VIMAGE kernel to a VIMAGE kernel with a single VNET?

About forwarding performance: cf. slides 36-38 of my EuroBSDcon talk.

  • Adding option VIMAGE (without any vnet jail) costs about 6% of inet4 small-packet forwarding performance (but gains about 2% for inet6). Compared to just using VLAN tagging (-17%, cf. slide 35), forwarding inet6 vs. inet4 (-16%, cf. slide 32), or the new iflib drivers (-17%), the cost/feature ratio of enabling VIMAGE is worth it.
  • And once VIMAGE is enabled, a vnet jail has the same forwarding speed as the host: no further performance impact, which is great! (A minimal vnet jail setup is sketched below.)
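
For readers who want to reproduce the host-vs-vnet-jail comparison, a minimal sketch follows; the jail name and the 192.0.2.0/24 addresses are placeholders, not taken from this review:

  # Create a persistent jail with its own virtualized network stack.
  jail -c name=vnettest path=/ vnet persist
  # Create an epair(4) link and move one end into the jail's vnet.
  ifconfig epair0 create
  ifconfig epair0b vnet vnettest
  # Configure both ends; addresses are illustrative (TEST-NET-1).
  ifconfig epair0a inet 192.0.2.1/24 up
  jexec vnettest ifconfig epair0b inet 192.0.2.2/24 up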

Should we enable it on ARMv4/ARMv5 (i.e. old ARM systems)? If so, you should add it to sys/arm/conf/std.arm.
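
If it were added there, the line would presumably just mirror what this diff adds to the other configs:

  options VIMAGE # Subsystem virtualization, e.g. VNET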

In D12639#262566, @jtl wrote:

I remain concerned about the performance overhead of activating this where not needed. However, I have no performance information to either alleviate or confirm my fears. Has someone done the work to gather performance information to compare a non-VIMAGE kernel to a VIMAGE kernel with a single VNET?

@jtl I am not aware of anyone having done an end-host (e.g., web server) kind of performance measurement. No one has volunteered so far. Hence my comment in the proposed commit message (get more people testing and see if it can stay on for 12). If a lot of people find they lose 20% in the next days, this will not go in; if they find out in two months, it can be reverted quite easily should the overhead prove irremovable. I think the point is: if we ever want to move forward with this, we need to. We solved the stability issues, let it sit like that for quite a while, and apart from the usual bugs we have everywhere it seems fine. If we get more confident with stability and performance, the KPI/KBI question is probably the last thing we need to decide on.

In D12639#262598, @bz wrote:

Hence my comment in the proposed commit message (get more people testing and see if it can stay on for 12). If a lot of people find they lose 20% in the next days, this will not go in; if they find out in two months, it can be reverted quite easily should the overhead prove irremovable.

Practice shows that real feedback from a sensible number of users will only arrive once 12.1 or even 12.2 is released. :)

In D12639#262598, @bz wrote:

@jtl I am not aware of anyone having done an end-host (e.g., web server) kind of performance measurement. No one has volunteered so far.

For what it's worth, I've not seen a performance difference on TCP loopback connections (at least, once I turned INVARIANTS off). Unfortunately I don't have the hardware setup to do a meaningful HTTP serving benchmark.

I found some crusty old hardware to run a test on. This is nginx serving its default index page (612 bytes of data).
Test client is wrk. I briefly played around with the number of connections and threads, but didn't see a major difference (in the non-vimage performance; I've not compared vimage there).

I figure the small file is best, because it maximises overhead per request, which is what we're interested in.
The compared value is requests per second so more is better.
This is IPv4.
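
The exact command line isn't given here; judging by the file names, a plausible wrk invocation would be something like this (the host name is a placeholder):

  # 2 threads, 10 connections, 60-second runs
  wrk -t2 -c10 -d60s http://testhost/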

x non-vimage-2-thread-10-connections-60seconds.txt
+ vimage-2-thread-10-connections-60seconds.txt
+--------------------------------------------------------------------------------+
|+   +                                                                      x    |
|++ ++ ++  ++                                                   x   xx  x xxx  xx|
| |__MA__|                                                          |____AM___|  |
+--------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10      22025.03      22070.39     22054.165     22050.851     14.726427
+  10      21840.55       21872.2      21852.61     21853.987     10.865634
Difference at 95.0% confidence
	-196.864 +/- 12.1591
	-0.892773% +/- 0.0548231%
	(Student's t, pooled s = 12.9408)

So there is a performance impact, but less than 1 percent seems like it'd be acceptable.
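
For reference, output in this form comes from ministat(1) in the FreeBSD base system; a plausible invocation for the data above (the -w flag only sets the plot width):

  ministat -w 80 non-vimage-2-thread-10-connections-60seconds.txt vimage-2-thread-10-connections-60seconds.txt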

With a larger file (102400 bytes) the difference indeed shrinks:

x non-vimage-2-thread-10-connections-60seconds_10240.txt
+ vimage-2-thread-10-connections-60seconds_10240.txt
+--------------------------------------------------------------------------------+
|                                  +     ++                                  x x |
|+     +                 +         +  +  ++                        x   x   x xxx |
|               |______________A_____M________|                         |___AM__||
+--------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10      11104.44      11109.82      11108.82     11108.324     1.7179781
+  10      11075.95      11093.78     11091.275     11088.768     6.4413953
Difference at 95.0% confidence
	-19.556 +/- 4.42923
	-0.176048% +/- 0.0398684%
	(Student's t, pooled s = 4.71397)

This is still IPv4. I'll kick off an IPv6 test too.

Similar results for v6:

x non-vimage-2-thread-10-connections-60seconds_v6.txt
+ vimage-2-thread-10-connections-60seconds_v6.txt
+--------------------------------------------------------------------------------+
|+  + +++ +++            ++                                 xx  x         xxx   x|
|  |_____M_A_______|                                           |______A___M___|  |
+--------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10      21504.47      21551.39      21537.77     21528.655     18.427621
+  10      21364.42      21424.46       21383.7     21388.061     19.988774
Difference at 95.0% confidence
	-140.594 +/- 18.0628
	-0.653055% +/- 0.0836501%
	(Student's t, pooled s = 19.2241)

A loss of 0.65% in requests per second by activating VIMAGE on IPv6. This is the small-file result again.

For what it's worth, here is my 2c on this.

I continue to have concerns about performance. But there are clearly people who think this is a very important feature, and it seems to have crossed the adoption threshold to the point where it should be tested thoroughly. And if it's an important feature, it should be tested in GENERIC in head the same way we add other performance-impacting things (WITNESS, INVARIANTS, etc.).

On the other hand, I think we need to perform thorough testing before we decide whether to retain it as part of GENERIC for releases.

Just my 2c.

I think it's important that we get some experience with it enabled in GENERIC so that we can make an informed decision sufficiently far in advance of 12.0. The benchmarks we do have so far show a rather small regression, and downstream FreeBSD consumers who have a need for the maximum possible performance can still disable it.

hiren added a subscriber: hiren.

The benchmarks we do have so far show a rather small regression

Can someone share the benchmarks?

I am okay having this in GENERIC for widespread testing. And IMO, anyone caring about perf is going to tweak GENERIC anyway.

Can someone share the benchmarks?

In previous comments on this review, @kristof has some results inline and @olivier has a link to the EuroBSDcon talk.

I am okay having this in GENERIC for widespread testing. And IMO, anyone caring about perf is going to tweak GENERIC anyway.

Indeed. For us to be able to ship this in 12.0 we need some experience with it turned on in -current first, and GENERIC is really as its name implies: a general-purpose kernel for a broad set of use cases. Someone building appliances or special-purpose servers can tweak options and other attributes as desired; a sketch of opting back out follows below.

We must still keep as a goal maintaining the smallest possible (and ideally no) performance regression.
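
For completeness, opting back out is a small custom kernel configuration; a sketch, where MYKERNEL is a placeholder name:

  # hypothetical sys/amd64/conf/MYKERNEL
  include GENERIC
  ident   MYKERNEL
  nooptions VIMAGE

With that, running "make buildkernel KERNCONF=MYKERNEL" builds a GENERIC-equivalent kernel without VIMAGE.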

This revision was automatically updated to reflect the committed changes.

On 21.10.2017 4:41, bz (Bjoern A. Zeeb) wrote:

This revision was automatically updated to reflect the committed changes.
Closed by commit rS324810: With r181803 on 2008-08-17 23:27:27Z the first VIMAGE commit went into (authored by bz, committed by ).

CHANGED PRIOR TO COMMIT

https://reviews.freebsd.org/D12639?vs=33888&id=34199#toc

REPOSITORY

rS FreeBSD src repository

CHANGES SINCE LAST UPDATE

https://reviews.freebsd.org/D12639?vs=33888&id=34199

REVISION DETAIL

https://reviews.freebsd.org/D12639

diff --git a/head/sys/mips/conf/QEMU b/head/sys/mips/conf/QEMU

--- a/head/sys/mips/conf/QEMU

+++ b/head/sys/mips/conf/QEMU
@@ -31,6 +31,7 @@
options KDB

options SCHED_4BSD #4BSD scheduler
+options VIMAGE # Subsystem virtualization, e.g. VNET
options INET #InterNETworking
options TCP_HHOOK # hhook(9) framework for TCP
options NFSCL #Network Filesystem Client
@@ -44,6 +45,11 @@
#options INVARIANT_SUPPORT #Extra sanity checks of internal structures, required by INVARIANTS
#options WITNESS #Enable checks to detect deadlocks and cycles
#options WITNESS_SKIPSPIN #Don't run witness on spinlocks for speed
+
+# The `bpf' device enables the Berkeley Packet Filter.
+# Be aware of the administrative consequences of enabling this!
+# Note that 'bpf' is required for DHCP.
+device bpf # Berkeley packet filter

device loop
device ether

Was this `bpf' chunk committed intentionally?

On 21.10.2017 4:41, bz (Bjoern A. Zeeb) wrote:

+# The `bpf' device enables the Berkeley Packet Filter.
+# Be aware of the administrative consequences of enabling this!
+# Note that 'bpf' is required for DHCP.
+device bpf # Berkeley packet filter

Was this `bpf' chunk committed intentionally?

@eugen_grosbein.net Yes, otherwise things won't compile. This is obviously another thing to fix. But even then, bpf is useful in QEMU.