Page MenuHomeFreeBSD

RFC: netfront: multiqueue support
ClosedPublic

Authored by liuw_liuw.name on Nov 17 2015, 2:13 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Dec 12, 5:08 PM
Unknown Object (File)
Nov 10 2024, 7:56 PM
Unknown Object (File)
Nov 5 2024, 12:18 PM
Unknown Object (File)
Nov 5 2024, 6:59 AM
Unknown Object (File)
Nov 2 2024, 7:51 PM
Unknown Object (File)
Nov 1 2024, 11:50 AM
Unknown Object (File)
Nov 1 2024, 11:29 AM
Unknown Object (File)
Oct 6 2024, 1:24 AM
Subscribers

Details

Summary

Heavily refactor netfront driver: break out a bunch of helper
functions and different structures. Use threads to handle TX and RX.
Remove some dead code and fix quite a few bugs as I go along.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 2196
Build 2205: arc lint + arc unit

Event Timeline

liuw_liuw.name retitled this revision from to RFC: netfront: multiqueue support.
liuw_liuw.name updated this object.
liuw_liuw.name edited the test plan for this revision. (Show Details)
liuw_liuw.name added a reviewer: royger.

Thanks for this massive cleanup! I've made quite a lot of comment about coding style, but in general the approach looks fine to me, and I only have a couple of comments that are actually about the implementation itself.

sys/dev/xen/netfront/netfront.c
124–125

2 spaces between int and xn_assemble_...

145–146

You can also remove the comment while there :)

148–150

xn_get_responses to match the rest of the prefixes?

227

Probably a stupid question, but do you know why do we need the +1 here and above?

262

A boolean_t might be more adequate type here. And setting it with TRUE or FALSE.

269

Those macros are not specially helpful IMHO, so if you think removing them would make the code clearer that's fine for me.

479

boolean_t hierachy

490

Those KASSERT messages are not really helpful, IMHO the first one should be "Mismatch between RX and TX queue id", and the second one the text in the comment just below "Split event channels are not supported".

504

err != 0. In FreeBSD coding style the construct you are using can only be applied to boolean types, everything else must be checked explicitly.

I know there are a lot of other instances of this construct in netfront, but we should try to at least avoid introducing new ones.

548

err != 0 should be fine. Anything else different than 0 returned by xs_scanf is an error.

551

Newline please.

568

Align with 4 spaces after the newline.

570

err != 0

574

Align with 4 spaces after the newline.

575

err != 0

582

Align with 4 spaces.

584

err != 0

616

extra newline?

643

XN_TX_LOCK_ASSERT(txq);

668

Shouldn't this call xn_txq_start instead of open coding it?

675

If a function has no local variables an empty newline should be left at the start (here and in the function below).

717

Align using 4 spaces always after a newline is inserted (ie: don't align to the "(" of the previous line).

766

The malloc at the start of the function uses M_WAITOK, so I guess this one should also use it.

778

error != 0

790

Can we use error labels here in order to prevent repeating and expanding the same cleanup in every fail case?

797

I think this is missing a '&', shouldn't it be:

xen_intr_unbind(&txq->xen_intr_handle);

Isn't the compiler complaining about it?

And then the line below is not needed.

907

The cast is not needed, and you are using malloc with M_WAITOK above in the same function, so this one should also be using M_WAITOK.

917

4 spaces to indent

920

error != 0

921

Is there any reason to not use:

txq->mbufs[NET_TX_RING_SIZE] = NULL;
928

M_WAITOK

941

M_WAITOK

944

Newline between return and out label, and I would rename the "out" label to "fail" to make it clear it's only taken on failure. Adding a KASSERT to the error path would also be nice:

KASSERT(error != 0, ("Error path taken without providing an error code"));
946

Same comment here as in setup_rxqs I would rather use labels in order to avoid repeating the same cleanup each time.

971

Since the interrupt handler just schedules a thread, I guess we could look into using a filter and then a fast taskqueue (see taskqueue_create_fast), it might prove the be faster.

1277

Since you are already touching the line, could you add a "XXX:" prefix to the comment to make it clear it's a doubt that we need to look into and not a clarification.

1445

s/xennet/xn/ in order to keep the same prefix for the whole file?

liuw_liuw.name added inline comments.
sys/dev/xen/netfront/netfront.c
124–125

Fixed.

145–146

Done.

148–150

Done, along with other xennet_ functions.

227

I didn't notice this until you mentioned it.

I think this is because mbufs doubles as a free list so an extra element is required.

262

Done.

269

I will leave them as they are now. This patch is massive enough.

479

Fixed.

490

Done.

504

All instances fixed.

548

Done.

551

Done.

568

Done.

570

Done.

574

Done.

575

Done.

582

Done.

668

Indeed. I've fixed it.

766

Replaced all M_NOWAIT with M_WAITOK and deleted error handling code since M_WAITOK won't fail.

797

Maybe I missed the warning. Anyway, I've fixed that.

Does xen_intr_unbind sets the handle to a known value? If not, txq->xen_intr_handle = 0 is still needed because reconnection routine needs that to determine if it needs to release that handle. It is not implemented at the moment though.

921

The reason is that it actually is storing 0 semantically, because the array doubles as a free list chain. Using NULL is wrong.

liuw_liuw.name marked 20 inline comments as done.
liuw_liuw.name edited edge metadata.

Fix coding style problems.

Use _fast variants of taskqueue functions.

Use M_WAITOK instead of M_NOWAIT.

Use goto style error handling.

Thanks! This is looking good, I only have a couple of comments, and the only important one is about using a filter instead of an interrupt handler, which is trivial to fix.

BTW, do you have some figures regarding performance? Or is this focused as a cleanup rather than a performance improvement?

I'm also adding Colin, who can probably help with some testing? :)

sys/dev/xen/netfront/netfront.c
309

I know this is not your fault, but according to the coding style(9) you should not use function calls in initializers:

"DO NOT use function calls in initializers."

I don't plan to block this change on this, it can always be fixed as a follow up patch.

797

Yes, xen_intr_unbind sets it to NULL:

http://fxr.watson.org/fxr/source/x86/xen/xen_intr.c#L1496

921

Ack, thanks for the clarification.

936

AFAICT, you are using fast taskqueues now, so you should be using a filter instead of a handler, it should be:

xen_intr_alloc_and_bind_local_port(dev, xenbus_get_otherend_id(dev), xn_intr, NULL, ...

You will have to change xn_intr slightly, so that it now returns FILTER_HANDLED:

http://fxr.watson.org/fxr/source/sys/bus.h?im=3#L222
http://fxr.watson.org/fxr/source/dev/xen/timer/timer.c?im=3#L252

This should help reduce the latency.

1103

The (void ) here is pointless AFAICT.

Addressed all comments.

As for numbers, I will try to get some when I get around.

Some naive test results below.

Original kernel in snapshot

iperf -c host -t 60 -i 5 : 5.88 Gbit/s

iperf -c host -t 60 -i 5 -P 4 : 5.40 Gbit/s

Multiple queue kernel

iperf -c host -t 60 -i 5 : 5.15 Gbit/s

iperf -c host -t 60 -i 5 -P 4 : 11.3 Gbit/s

royger edited edge metadata.

LGTM, I plan to commit this in a couple of days.

This revision is now accepted and ready to land.Jan 18 2016, 4:40 PM
liuw_liuw.name edited edge metadata.

Fix a minor problem (unused variable) to make tinderbox pass.

This revision now requires review to proceed.Jan 19 2016, 2:05 PM
This revision was automatically updated to reflect the committed changes.