Page MenuHomeFreeBSD

RFC: netfront: multiqueue support
ClosedPublic

Authored by liuw_liuw.name on Nov 17 2015, 2:13 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Dec 12, 5:08 PM
Unknown Object (File)
Nov 10 2024, 7:56 PM
Unknown Object (File)
Nov 5 2024, 12:18 PM
Unknown Object (File)
Nov 5 2024, 6:59 AM
Unknown Object (File)
Nov 2 2024, 7:51 PM
Unknown Object (File)
Nov 1 2024, 11:50 AM
Unknown Object (File)
Nov 1 2024, 11:29 AM
Unknown Object (File)
Oct 6 2024, 1:24 AM
Subscribers

Details

Summary

Heavily refactor netfront driver: break out a bunch of helper
functions and different structures. Use threads to handle TX and RX.
Remove some dead code and fix quite a few bugs as I go along.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

liuw_liuw.name retitled this revision from to RFC: netfront: multiqueue support.
liuw_liuw.name updated this object.
liuw_liuw.name edited the test plan for this revision. (Show Details)
liuw_liuw.name added a reviewer: royger.

Thanks for this massive cleanup! I've made quite a lot of comment about coding style, but in general the approach looks fine to me, and I only have a couple of comments that are actually about the implementation itself.

sys/dev/xen/netfront/netfront.c
127 ↗(On Diff #10268)

2 spaces between int and xn_assemble_...

148 ↗(On Diff #10268)

You can also remove the comment while there :)

151 ↗(On Diff #10268)

xn_get_responses to match the rest of the prefixes?

187 ↗(On Diff #10268)

Probably a stupid question, but do you know why do we need the +1 here and above?

222 ↗(On Diff #10268)

A boolean_t might be more adequate type here. And setting it with TRUE or FALSE.

272 ↗(On Diff #10268)

Those macros are not specially helpful IMHO, so if you think removing them would make the code clearer that's fine for me.

478 ↗(On Diff #10268)

boolean_t hierachy

489 ↗(On Diff #10268)

Those KASSERT messages are not really helpful, IMHO the first one should be "Mismatch between RX and TX queue id", and the second one the text in the comment just below "Split event channels are not supported".

503 ↗(On Diff #10268)

err != 0. In FreeBSD coding style the construct you are using can only be applied to boolean types, everything else must be checked explicitly.

I know there are a lot of other instances of this construct in netfront, but we should try to at least avoid introducing new ones.

549 ↗(On Diff #10268)

err != 0 should be fine. Anything else different than 0 returned by xs_scanf is an error.

552 ↗(On Diff #10268)

Newline please.

567 ↗(On Diff #10268)

Align with 4 spaces after the newline.

569 ↗(On Diff #10268)

err != 0

573 ↗(On Diff #10268)

Align with 4 spaces after the newline.

574 ↗(On Diff #10268)

err != 0

581 ↗(On Diff #10268)

Align with 4 spaces.

583 ↗(On Diff #10268)

err != 0

617 ↗(On Diff #10268)

extra newline?

645 ↗(On Diff #10268)

XN_TX_LOCK_ASSERT(txq);

670 ↗(On Diff #10268)

Shouldn't this call xn_txq_start instead of open coding it?

677 ↗(On Diff #10268)

If a function has no local variables an empty newline should be left at the start (here and in the function below).

717 ↗(On Diff #10268)

Align using 4 spaces always after a newline is inserted (ie: don't align to the "(" of the previous line).

747 ↗(On Diff #10268)

The malloc at the start of the function uses M_WAITOK, so I guess this one should also use it.

759 ↗(On Diff #10268)

error != 0

771 ↗(On Diff #10268)

Can we use error labels here in order to prevent repeating and expanding the same cleanup in every fail case?

813 ↗(On Diff #10268)

I think this is missing a '&', shouldn't it be:

xen_intr_unbind(&txq->xen_intr_handle);

Isn't the compiler complaining about it?

And then the line below is not needed.

868 ↗(On Diff #10268)

Is there any reason to not use:

txq->mbufs[NET_TX_RING_SIZE] = NULL;
879 ↗(On Diff #10268)

The cast is not needed, and you are using malloc with M_WAITOK above in the same function, so this one should also be using M_WAITOK.

892 ↗(On Diff #10268)

error != 0

900 ↗(On Diff #10268)

M_WAITOK

913 ↗(On Diff #10268)

M_WAITOK

918 ↗(On Diff #10268)

Same comment here as in setup_rxqs I would rather use labels in order to avoid repeating the same cleanup each time.

943 ↗(On Diff #10268)

Since the interrupt handler just schedules a thread, I guess we could look into using a filter and then a fast taskqueue (see taskqueue_create_fast), it might prove the be faster.

974 ↗(On Diff #10268)

4 spaces to indent

1001 ↗(On Diff #10268)

Newline between return and out label, and I would rename the "out" label to "fail" to make it clear it's only taken on failure. Adding a KASSERT to the error path would also be nice:

KASSERT(error != 0, ("Error path taken without providing an error code"));
1320 ↗(On Diff #10268)

Since you are already touching the line, could you add a "XXX:" prefix to the comment to make it clear it's a doubt that we need to look into and not a clarification.

1487 ↗(On Diff #10268)

s/xennet/xn/ in order to keep the same prefix for the whole file?

liuw_liuw.name added inline comments.
sys/dev/xen/netfront/netfront.c
127 ↗(On Diff #10268)

Fixed.

148 ↗(On Diff #10268)

Done.

151 ↗(On Diff #10268)

Done, along with other xennet_ functions.

187 ↗(On Diff #10268)

I didn't notice this until you mentioned it.

I think this is because mbufs doubles as a free list so an extra element is required.

222 ↗(On Diff #10268)

Done.

272 ↗(On Diff #10268)

I will leave them as they are now. This patch is massive enough.

478 ↗(On Diff #10268)

Fixed.

489 ↗(On Diff #10268)

Done.

503 ↗(On Diff #10268)

All instances fixed.

549 ↗(On Diff #10268)

Done.

552 ↗(On Diff #10268)

Done.

567 ↗(On Diff #10268)

Done.

569 ↗(On Diff #10268)

Done.

573 ↗(On Diff #10268)

Done.

574 ↗(On Diff #10268)

Done.

581 ↗(On Diff #10268)

Done.

670 ↗(On Diff #10268)

Indeed. I've fixed it.

747 ↗(On Diff #10268)

Replaced all M_NOWAIT with M_WAITOK and deleted error handling code since M_WAITOK won't fail.

813 ↗(On Diff #10268)

Maybe I missed the warning. Anyway, I've fixed that.

Does xen_intr_unbind sets the handle to a known value? If not, txq->xen_intr_handle = 0 is still needed because reconnection routine needs that to determine if it needs to release that handle. It is not implemented at the moment though.

868 ↗(On Diff #10268)

The reason is that it actually is storing 0 semantically, because the array doubles as a free list chain. Using NULL is wrong.

liuw_liuw.name marked 20 inline comments as done.
liuw_liuw.name edited edge metadata.

Fix coding style problems.

Use _fast variants of taskqueue functions.

Use M_WAITOK instead of M_NOWAIT.

Use goto style error handling.

Thanks! This is looking good, I only have a couple of comments, and the only important one is about using a filter instead of an interrupt handler, which is trivial to fix.

BTW, do you have some figures regarding performance? Or is this focused as a cleanup rather than a performance improvement?

I'm also adding Colin, who can probably help with some testing? :)

sys/dev/xen/netfront/netfront.c
309 ↗(On Diff #11025)

I know this is not your fault, but according to the coding style(9) you should not use function calls in initializers:

"DO NOT use function calls in initializers."

I don't plan to block this change on this, it can always be fixed as a follow up patch.

795 ↗(On Diff #11025)

Yes, xen_intr_unbind sets it to NULL:

http://fxr.watson.org/fxr/source/x86/xen/xen_intr.c#L1496

852 ↗(On Diff #11025)

Ack, thanks for the clarification.

892 ↗(On Diff #11025)

AFAICT, you are using fast taskqueues now, so you should be using a filter instead of a handler, it should be:

xen_intr_alloc_and_bind_local_port(dev, xenbus_get_otherend_id(dev), xn_intr, NULL, ...

You will have to change xn_intr slightly, so that it now returns FILTER_HANDLED:

http://fxr.watson.org/fxr/source/sys/bus.h?im=3#L222
http://fxr.watson.org/fxr/source/dev/xen/timer/timer.c?im=3#L252

This should help reduce the latency.

1102 ↗(On Diff #11025)

The (void ) here is pointless AFAICT.

Addressed all comments.

As for numbers, I will try to get some when I get around.

Some naive test results below.

Original kernel in snapshot

iperf -c host -t 60 -i 5 : 5.88 Gbit/s

iperf -c host -t 60 -i 5 -P 4 : 5.40 Gbit/s

Multiple queue kernel

iperf -c host -t 60 -i 5 : 5.15 Gbit/s

iperf -c host -t 60 -i 5 -P 4 : 11.3 Gbit/s

royger edited edge metadata.

LGTM, I plan to commit this in a couple of days.

This revision is now accepted and ready to land.Jan 18 2016, 4:40 PM
liuw_liuw.name edited edge metadata.

Fix a minor problem (unused variable) to make tinderbox pass.

This revision now requires review to proceed.Jan 19 2016, 2:05 PM
This revision was automatically updated to reflect the committed changes.