Sun, Sep 15
Follow-up change after updating D20973.
Define mevent_add_disabled here.
Regarding mevent_add_disabled, I can defer that to the next patch.
Addressed jhb's comments.
Wed, Sep 4
Thanks a lot!
Tue, Sep 3
Mon, Aug 19
Aug 18 2019
Aug 17 2019
Aug 16 2019
The ptnet(4) case is a bit different, since the driver knows at attach time which if_capabilities are supported, and those are not going to change.
On the other hand, with these changes to tap(4) the user can change if_capabilities dynamically.
Added SIOCSIFCAP support. Addressed more comments.
The inline functions added to virtio_net.h were simply copy-pasted from if_vtnet.c to if_ptnet.c in the first place. And yes, I plan to also make if_vtnet.c use the same code in the future. I haven't done it yet because if_vtnet.c increments device-specific statistics counters.
Aug 14 2019
Aug 10 2019
Any opinions on this change?
Any additional opinions on this patch?
Any opinions on this patch?
Jul 24 2019
Thanks a lot for your effort!
To be honest I'm not sure why the throughput increases so much, since TSO (64KB unchecksummed packets) is being used in both cases.
The main difference is that with mergeable rx buffers there is less pressure on the guest memory allocators, since the driver can allocate 2K clusters, rather than bigger packets.
Also, with mergeable rx buffers bhyve may do a little more work, because it needs to call vq_getchain() 33 times in order to receive each packet.
In any case, your results look very good to me, and they also agree with mine (taken with a smaller and less powerful machine).
Jul 23 2019
Thanks. Did you notice any change in terms of performance?
Jul 22 2019
Fix issue identified by @aleksandr.fedorov_itglobal.com
Jul 20 2019
Jul 19 2019
Jul 18 2019
Removed trailing space.
Jul 16 2019
Upload the right patch
minimize patch by adding mevent_add_disabled().
Start the backends with receive enabled for the time being.
Jul 8 2019
Changed the patch to update the license identifiers only.
Jul 7 2019
Jul 4 2019
Jul 1 2019
Looks good, thanks.
The netmap unit tests and integration tests still pass with these changes.
Thanks. This also touches TAP setups, so we need tests with TAP.
Also, we would need to check that TAP+e1000 works.
Jun 30 2019
Thanks for the review. I actually don't have yet a real testbed for this (will get one soon), and I can only do functional tests in a nested KVM VM. Any test is welcome.
Addressed comments from reviewers.
Jun 27 2019
Thanks for the in depth review. Sorry about the style issues. Is there a FreeBSD clang-format file against which I can validate code?
Tried to address jhb's comments.
Jun 24 2019
Jun 22 2019
Jun 21 2019
Jun 16 2019
Jun 13 2019
Do you mean a common vq_notify callback for both transmit and receive queue? But pci_vtnet_ping_txq() and pci_vtnet_ping_rxq() perform different work ...
Removed "All rights reserved".
Jun 12 2019
Jun 11 2019
Jun 10 2019
Provide patch relative to /usr/src/.
Moved include <machine/atomic.h> to virtio.h.
This looks better, but the problem is that I was refactoring this file to separate virtio-specific code from the backends (netmap, tap) and to add support for offloads (including TSO); this patch creates many conflicts with my pending work, and IMHO it should be rebased on top of that work to avoid a total mess.
Jun 9 2019
Jun 7 2019
Sorry for the diff context, I forgot about that.
Thanks for the suggestions, they look good to me, and I'll implement those in short.
Btw I would like to refactor/improve vtnet/tap/netmap support, so more patches will follow.
Jun 6 2019
Jun 4 2019
May 27 2019
Yes, but I think the story is more complex than that. Guests can publish single-descriptor chains or multi-descriptor chains, depending on feature negotiation and driver implementation. With mergeable buffers enabled, or TSO features disabled, guests will normally submit single-descriptor chains (because it makes sense), but this is not mandatory. With TSO enabled and mergeable buffers disabled, guests will normally pass in multi-descriptor chains, each one describing 64K or 32K buffers.
It makes sense to add more vq methods to handle the mergeable rx bufs case, but I think they should be moved to usr.sbin/bhyve/virtio.c, so that they can be reused.
May 22 2019
Netmap usage by itself looks mostly ok, except where noted. We could improve the management of r->cur in certain cases, but it's not particularly important for now (for reference, please look at my QEMU implementation, for instance the transmit routine is here https://github.com/qemu/qemu/blob/master/net/netmap.c#L160-L225).
I still need to review your changes to the virtqueue processing. But why did you drop vq_getchain() and write a custom method? If another method is needed, it should be added to virtio.c IMHO.
May 17 2019
May 16 2019
Apr 9 2019
Mar 19 2019
Feb 18 2019
No worries, and thanks for testing!
Feb 14 2019
In my tests I can see that without the patch, two pkt-gen instances communicating over a VALE port will cause 9% CPU usage on a separate CPU for kqueue processing (even though kqueue is not actually used).
With the patch, the 9% goes away.
Feb 13 2019
We can't. We use a taskqueue there precisely because we cannot take the si->m lock (or we get LOR).
Feb 7 2019
It would probably help to have more lines of context in the patch.
If you use git, something like the following should work:
git diff -U30 [...]
If you use svn:
svn diff -x '-U30'
Jan 30 2019
Thanks for testing.
Jan 29 2019
It looks like reading the HEAD of the first TX ring works, while the other TX rings don't...