Page MenuHomeFreeBSD

RFC: netfront: multiqueue support
ClosedPublic

Authored by liuw_liuw.name on Nov 17 2015, 2:13 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Dec 12, 5:08 PM
Unknown Object (File)
Nov 10 2024, 7:56 PM
Unknown Object (File)
Nov 5 2024, 12:18 PM
Unknown Object (File)
Nov 5 2024, 6:59 AM
Unknown Object (File)
Nov 2 2024, 7:51 PM
Unknown Object (File)
Nov 1 2024, 11:50 AM
Unknown Object (File)
Nov 1 2024, 11:29 AM
Unknown Object (File)
Oct 6 2024, 1:24 AM
Subscribers

Details

Summary

Heavily refactor netfront driver: break out a bunch of helper
functions and different structures. Use threads to handle TX and RX.
Remove some dead code and fix quite a few bugs as I go along.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 1547
Build 1553: arc lint + arc unit

Event Timeline

liuw_liuw.name retitled this revision from to RFC: netfront: multiqueue support.
liuw_liuw.name updated this object.
liuw_liuw.name edited the test plan for this revision. (Show Details)
liuw_liuw.name added a reviewer: royger.

Thanks for this massive cleanup! I've made quite a lot of comment about coding style, but in general the approach looks fine to me, and I only have a couple of comments that are actually about the implementation itself.

sys/dev/xen/netfront/netfront.c
124–125

2 spaces between int and xn_assemble_...

145–146

You can also remove the comment while there :)

148–150

xn_get_responses to match the rest of the prefixes?

181–182

Probably a stupid question, but do you know why do we need the +1 here and above?

216

A boolean_t might be more adequate type here. And setting it with TRUE or FALSE.

269

Those macros are not specially helpful IMHO, so if you think removing them would make the code clearer that's fine for me.

478

boolean_t hierachy

489

Those KASSERT messages are not really helpful, IMHO the first one should be "Mismatch between RX and TX queue id", and the second one the text in the comment just below "Split event channels are not supported".

503

err != 0. In FreeBSD coding style the construct you are using can only be applied to boolean types, everything else must be checked explicitly.

I know there are a lot of other instances of this construct in netfront, but we should try to at least avoid introducing new ones.

547

err != 0 should be fine. Anything else different than 0 returned by xs_scanf is an error.

550

Newline please.

567

Align with 4 spaces after the newline.

569

err != 0

573

Align with 4 spaces after the newline.

574

err != 0

581

Align with 4 spaces.

583

err != 0

615

extra newline?

642

XN_TX_LOCK_ASSERT(txq);

667

Shouldn't this call xn_txq_start instead of open coding it?

674

If a function has no local variables an empty newline should be left at the start (here and in the function below).

715

Align using 4 spaces always after a newline is inserted (ie: don't align to the "(" of the previous line).

745

The malloc at the start of the function uses M_WAITOK, so I guess this one should also use it.

757

error != 0

769

Can we use error labels here in order to prevent repeating and expanding the same cleanup in every fail case?

795

I think this is missing a '&', shouldn't it be:

xen_intr_unbind(&txq->xen_intr_handle);

Isn't the compiler complaining about it?

And then the line below is not needed.

852

Is there any reason to not use:

txq->mbufs[NET_TX_RING_SIZE] = NULL;
863

The cast is not needed, and you are using malloc with M_WAITOK above in the same function, so this one should also be using M_WAITOK.

876

error != 0

884

M_WAITOK

897

M_WAITOK

902

Same comment here as in setup_rxqs I would rather use labels in order to avoid repeating the same cleanup each time.

916

4 spaces to indent

927

Since the interrupt handler just schedules a thread, I guess we could look into using a filter and then a fast taskqueue (see taskqueue_create_fast), it might prove the be faster.

943

Newline between return and out label, and I would rename the "out" label to "fail" to make it clear it's only taken on failure. Adding a KASSERT to the error path would also be nice:

KASSERT(error != 0, ("Error path taken without providing an error code"));
1276

Since you are already touching the line, could you add a "XXX:" prefix to the comment to make it clear it's a doubt that we need to look into and not a clarification.

1442

s/xennet/xn/ in order to keep the same prefix for the whole file?

liuw_liuw.name added inline comments.
sys/dev/xen/netfront/netfront.c
124–125

Fixed.

145–146

Done.

148–150

Done, along with other xennet_ functions.

181–182

I didn't notice this until you mentioned it.

I think this is because mbufs doubles as a free list so an extra element is required.

216

Done.

269

I will leave them as they are now. This patch is massive enough.

478

Fixed.

489

Done.

503

All instances fixed.

547

Done.

550

Done.

567

Done.

569

Done.

573

Done.

574

Done.

581

Done.

667

Indeed. I've fixed it.

745

Replaced all M_NOWAIT with M_WAITOK and deleted error handling code since M_WAITOK won't fail.

795

Maybe I missed the warning. Anyway, I've fixed that.

Does xen_intr_unbind sets the handle to a known value? If not, txq->xen_intr_handle = 0 is still needed because reconnection routine needs that to determine if it needs to release that handle. It is not implemented at the moment though.

852

The reason is that it actually is storing 0 semantically, because the array doubles as a free list chain. Using NULL is wrong.

liuw_liuw.name marked 20 inline comments as done.
liuw_liuw.name edited edge metadata.

Fix coding style problems.

Use _fast variants of taskqueue functions.

Use M_WAITOK instead of M_NOWAIT.

Use goto style error handling.

Thanks! This is looking good, I only have a couple of comments, and the only important one is about using a filter instead of an interrupt handler, which is trivial to fix.

BTW, do you have some figures regarding performance? Or is this focused as a cleanup rather than a performance improvement?

I'm also adding Colin, who can probably help with some testing? :)

sys/dev/xen/netfront/netfront.c
309

I know this is not your fault, but according to the coding style(9) you should not use function calls in initializers:

"DO NOT use function calls in initializers."

I don't plan to block this change on this, it can always be fixed as a follow up patch.

795

Yes, xen_intr_unbind sets it to NULL:

http://fxr.watson.org/fxr/source/x86/xen/xen_intr.c#L1496

852

Ack, thanks for the clarification.

892

AFAICT, you are using fast taskqueues now, so you should be using a filter instead of a handler, it should be:

xen_intr_alloc_and_bind_local_port(dev, xenbus_get_otherend_id(dev), xn_intr, NULL, ...

You will have to change xn_intr slightly, so that it now returns FILTER_HANDLED:

http://fxr.watson.org/fxr/source/sys/bus.h?im=3#L222
http://fxr.watson.org/fxr/source/dev/xen/timer/timer.c?im=3#L252

This should help reduce the latency.

1102

The (void ) here is pointless AFAICT.

Addressed all comments.

As for numbers, I will try to get some when I get around.

Some naive test results below.

Original kernel in snapshot

iperf -c host -t 60 -i 5 : 5.88 Gbit/s

iperf -c host -t 60 -i 5 -P 4 : 5.40 Gbit/s

Multiple queue kernel

iperf -c host -t 60 -i 5 : 5.15 Gbit/s

iperf -c host -t 60 -i 5 -P 4 : 11.3 Gbit/s

royger edited edge metadata.

LGTM, I plan to commit this in a couple of days.

This revision is now accepted and ready to land.Jan 18 2016, 4:40 PM
liuw_liuw.name edited edge metadata.

Fix a minor problem (unused variable) to make tinderbox pass.

This revision now requires review to proceed.Jan 19 2016, 2:05 PM
This revision was automatically updated to reflect the committed changes.