Page MenuHomeFreeBSD

More than 65K connection from single application
Needs RevisionPublic

Authored by eri on Jan 21 2016, 2:20 PM.

Details

Reviewers
pkelsey
ae
julian
gnn
adrian
Group Reviewers
network
transport
Summary

The patch allows applications to not be limited to only 65k connections to different hosts, which is the limit of local port numbers, but calculate the limit by considering the destination ip as well.

Inspired by: julian
Sponsored by: Wheel Systems, http://wheelsystems.com

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 6876

Event Timeline

eri retitled this revision from to More than 65K connection from single application.
eri updated this object.
eri edited the test plan for this revision. (Show Details)
eri added a reviewer: network.
eri set the repository for this revision to rS FreeBSD src repository - subversion.
eri added a project: network.
eri added subscribers: pjd, rwatson, gnn.
eri edited edge metadata.

I'm a bit confused by this proposal. No protocol I know of has a port field larger than 16 bits. How and where might this be applied?

In D5017#106574, @gnn wrote:

I'm a bit confused by this proposal. No protocol I know of has a port field larger than 16 bits. How and where might this be applied?

Maybe the description is not detailed.

This patch does not add more port numbers.
The patch fixes the case when you have an application trying TCP/UDP connections to many hosts that are bigger than 65K(which is the uint16 limit on port number).
The patch allows an application to open more than 65K connections to hosts with different ip addresses.

The test for this is very easy by having a client and two servers echo process.
Start from the client opening more than 65K connections you will not be able in standard FreeBSD.
With this patch you will be able to perform such operation.

Oh i see. you're changing it so if you do an anon port bind, you can
use more than the 65535 port space as long as you have pcb matches to
multiple destination IPs. Yeah, that's actually a big deal - a lot of
places just don't bother with this and they do their own local port/IP
management because of this very specific issue.

I'm +1 for the idea. I'll review it first.

-a

I think this is fine, but I'd like a second set of eyeballs on this.

Any update on the following?

In D5017#106574, @gnn wrote:

I'm a bit confused by this proposal. No protocol I know of has a port field larger than 16 bits. How and where might this be applied?

the current limit would only be useful if you were talking to only one host. Don't know about you but my servers usually talk to more than one client. you can certainly use the same port number multiple times if it's not got the same remote address.

looks about right but I'm not current with all the internal functions. I also want more eyes, but I like it generally.

sys/netinet/in_pcb.c
473

can't you do this assign only in the clause that needs it?

just make sure you test the RSS/PCBGROUPS options too. :)

-a

just make sure you test the RSS/PCBGROUPS options too. :)

-a

I do not see the relation and how this correlate.
To my understanding this does not impact or change functionality with those options.

Just making sure you haven't broken any APIs that get used by the
RSS/PCBGROUP code.

(They use / implement some of the pcb hash table code..)

-a

ae added a subscriber: ae.

Can you link your patch with /head branch, so all context will be available?

In D5017#108466, @ae wrote:

Can you link your patch with /head branch, so all context will be available?

Not sure what you mean by this!
To me its against HEAD but i used the webinterface to submit this and i do not see an option to make that link by editing the revision.

It seems the only way to do this - upload the patch using arcanist from the command line.

In D5017#108469, @eri wrote:
In D5017#108466, @ae wrote:

Can you link your patch with /head branch, so all context will be available?

Not sure what you mean by this!
To me its against HEAD but i used the webinterface to submit this and i do not see an option to make that link by editing the revision.

Try using this command to generate the diff:
svn diff --diff-cmd=diff -x -U999999

(See https://wiki.freebsd.org/CodeReview .)

Is this still a "live" proposal? If so, can you update it to apply against the latest HEAD revision and generate the diff as described in the previous comment(s)?

Busy with other things.
Updating this to include context.

we should probably skip ports that have a listenig socket on them.. do we?

Added @pkelsey as a reviewer because I didn't realize it would *assign* to him, and rather wanted to invite his comments. Given the work he's done on promiscuous sockets, with a fairly widely-deployed codebase, I think his insights would be meaningful on these changes. The principle seems about right to me here, but I'm not sure about some of the specifics.

Anyone has objections so this can go in?

So i go and commit this in one week if no objections are raised.

gnn requested changes to this revision.Feb 5 2017, 9:35 PM
gnn added a reviewer: gnn.

This cannot be committed until the comments by julian (who you say proposed this) and others are addressed.

This revision now requires changes to proceed.Feb 5 2017, 9:35 PM
sys/netinet/in_pcb.c
473

There is no other place i can think of where this can be defered.

adrian added a reviewer: adrian.

i hope it doesn't break RSS multi-socket binding for incoming requests. :-)

julian edited edge metadata.

my only comment is that I ask you to just confirm that a listening socket can not collide with the socket we are allocating.
i don't think it does, but I may have missed something

who is this mysterious group reviewer "transport"? And where is their negative review? I only see gnn's

pkelsey requested changes to this revision.Feb 9 2017, 5:15 PM
pkelsey edited edge metadata.

This is the source comment that should have been submitted with my 'request for change' action, had I pressed the buttons in a way that didn't dispose of it instead.

sys/netinet/in_pcb.c
493

Why isn't this being implemented as a modification of the existing logic in in_pcblookup_local()? From a design standpoint, I think distributing the definition of 'is this local port available' to additional places outside of in_pcblookup_local() is a step in the wrong direction.

Bypassing in_pcblookup_local() and going straight to a non-wildcard in_pcblookup_hash_locked() lookup here creates a new behavior where an outbound connection can use a local address and port number that is being listened on. Currently, the logic in in_pcblookup_local() prevents such a thing. I am not sure if this is the 'collision' between a listening socket and the socket being allocated that julian@ was referring to, but certainly it is a change in behavior that is not necessary for the concept of this patch and one that I think should not be readily accepted as a side effect, and possibly is a change that is undesirable.

so what change did pkelsey request? I can't see any request to change anything.

This is the source comment that should have been submitted with my 'request for change' action, had I pressed the buttons in a way that didn't dispose of it instead.

yeah hence my previous comment. now I see your comment

I think that "collision" is ok.

Of course it IS a new behaviour, but that is what we are trying to get. new behaviour.

I think that "collision" is ok.

Of course it IS a new behaviour, but that is what we are trying to get. new behaviour.

I think @pkelsey is pretty persuasive, though, that it would be worth it to do and evaluate that separately, since it isn't a necessary component of the new behaviour in general. There's less consensus on the necessity of that than of the main portion of the change.

I think that "collision" is ok.

Of course it IS a new behaviour, but that is what we are trying to get. new behaviour.

My objection is that this doesn't seem to me to be the place to go into a technical dive on that behavior, particularly because it appears to exist only as a side effect of avoiding working with in_pcblookup_local(), which seems to be where the intended effects should be implemented.

Just some information to pkesley@ comment.

sys/netinet/in_pcb.c
493

It has not been done in in_pcb_lport since it means removing completely the logic on random port selection there which is used on anonymous binds.
Transfer this to context of hashes and to me seems reinventing infrastructure which i reused with this functions.
The code has already so many workarounds that would become a big hurdle and refactor that goes out of context of the change itself.

Added some comments regarding style(9)

sys/netinet/in_pcb.c
494

style(9): Line too long

593

style(9): Line too long

714

style(9): Line too long

sys/netinet6/in6_pcb.c
299

style(9): Line too long

sys/netinet6/udp6_usrreq.c
771

style(9): Bad indent

774

style(9): Line too long

AFAIU, the patch has been neither finished nor submitted to master.
The last activity was more than a year ago, so should it be considered as abandoned?
Are there any plans to add the feature into the kernel?

his "feature" is I believe still needed. I am keeping this alive to help me remember the details and to flag it to others.