More than 65K connection from single application
Needs RevisionPublic

Authored by eri on Jan 21 2016, 2:20 PM.

Details

Reviewers
pkelsey
ae
julian
gnn
adrian
Group Reviewers
network
transport
Summary

The patch allows applications to not be limited to only 65k connections to different hosts, which is the limit of local port numbers, but calculate the limit by considering the destination ip as well.

Inspired by: julian
Sponsored by: Wheel Systems, http://wheelsystems.com

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 6876
eri updated this revision to Diff 12550.Jan 21 2016, 2:20 PM
eri retitled this revision from to More than 65K connection from single application.
eri updated this object.
eri edited the test plan for this revision. (Show Details)
eri added a reviewer: network.
eri set the repository for this revision to rS FreeBSD src repository.
eri added a project: network.
eri added subscribers: pjd, rwatson, gnn.
eri updated this object.Jan 21 2016, 3:52 PM
eri edited edge metadata.
jch added a subscriber: jch.Jan 21 2016, 3:58 PM
gnn added a comment.Jan 21 2016, 5:03 PM

I'm a bit confused by this proposal. No protocol I know of has a port field larger than 16 bits. How and where might this be applied?

eri added a comment.Jan 21 2016, 5:27 PM
In D5017#106574, @gnn wrote:

I'm a bit confused by this proposal. No protocol I know of has a port field larger than 16 bits. How and where might this be applied?

Maybe the description is not detailed.

This patch does not add more port numbers.
The patch fixes the case when you have an application trying TCP/UDP connections to many hosts that are bigger than 65K(which is the uint16 limit on port number).
The patch allows an application to open more than 65K connections to hosts with different ip addresses.

The test for this is very easy by having a client and two servers echo process.
Start from the client opening more than 65K connections you will not be able in standard FreeBSD.
With this patch you will be able to perform such operation.

adrian added a subscriber: adrian.Jan 21 2016, 6:28 PM

Oh i see. you're changing it so if you do an anon port bind, you can
use more than the 65535 port space as long as you have pcb matches to
multiple destination IPs. Yeah, that's actually a big deal - a lot of
places just don't bother with this and they do their own local port/IP
management because of this very specific issue.

I'm +1 for the idea. I'll review it first.

-a

I think this is fine, but I'd like a second set of eyeballs on this.

eri added a subscriber: julian.Jan 25 2016, 1:16 PM

Any update on the following?

In D5017#106574, @gnn wrote:

I'm a bit confused by this proposal. No protocol I know of has a port field larger than 16 bits. How and where might this be applied?

the current limit would only be useful if you were talking to only one host. Don't know about you but my servers usually talk to more than one client. you can certainly use the same port number multiple times if it's not got the same remote address.

looks about right but I'm not current with all the internal functions. I also want more eyes, but I like it generally.

sys/netinet/in_pcb.c
473

can't you do this assign only in the clause that needs it?

just make sure you test the RSS/PCBGROUPS options too. :)

-a

eri added a comment.Jan 27 2016, 8:57 AM

just make sure you test the RSS/PCBGROUPS options too. :)

-a

I do not see the relation and how this correlate.
To my understanding this does not impact or change functionality with those options.

Just making sure you haven't broken any APIs that get used by the
RSS/PCBGROUP code.

(They use / implement some of the pcb hash table code..)

-a

ae removed rS FreeBSD src repository as the repository for this revision.Jan 28 2016, 8:36 AM
ae set the repository for this revision to rS FreeBSD src repository.Jan 28 2016, 8:40 AM
ae added a subscriber: ae.

Can you link your patch with /head branch, so all context will be available?

eri added a comment.Jan 28 2016, 9:13 AM
In D5017#108466, @ae wrote:

Can you link your patch with /head branch, so all context will be available?

Not sure what you mean by this!
To me its against HEAD but i used the webinterface to submit this and i do not see an option to make that link by editing the revision.

ae added a comment.Jan 28 2016, 9:34 AM

It seems the only way to do this - upload the patch using arcanist from the command line.

jtl added a subscriber: jtl.Jan 28 2016, 3:08 PM
In D5017#108469, @eri wrote:
In D5017#108466, @ae wrote:

Can you link your patch with /head branch, so all context will be available?

Not sure what you mean by this!
To me its against HEAD but i used the webinterface to submit this and i do not see an option to make that link by editing the revision.

Try using this command to generate the diff:
svn diff --diff-cmd=diff -x -U999999

(See https://wiki.freebsd.org/CodeReview .)

jtl added a comment.Apr 22 2016, 2:46 AM

Is this still a "live" proposal? If so, can you update it to apply against the latest HEAD revision and generate the diff as described in the previous comment(s)?

eri updated this revision to Diff 24184.Jan 19 2017, 5:12 AM

Busy with other things.
Updating this to include context.

we should probably skip ports that have a listenig socket on them.. do we?

jmallett added a subscriber: jmallett.

Added @pkelsey as a reviewer because I didn't realize it would *assign* to him, and rather wanted to invite his comments. Given the work he's done on promiscuous sockets, with a fairly widely-deployed codebase, I think his insights would be meaningful on these changes. The principle seems about right to me here, but I'm not sure about some of the specifics.

eri added a comment.Feb 1 2017, 1:04 AM

Anyone has objections so this can go in?

eri added a comment.Feb 5 2017, 9:08 PM

So i go and commit this in one week if no objections are raised.

gnn requested changes to this revision.Feb 5 2017, 9:35 PM
gnn added a reviewer: gnn.

This cannot be committed until the comments by julian (who you say proposed this) and others are addressed.

This revision now requires changes to proceed.Feb 5 2017, 9:35 PM
eri added inline comments.Feb 5 2017, 10:00 PM
sys/netinet/in_pcb.c
473

There is no other place i can think of where this can be defered.

eri added a reviewer: julian.Feb 9 2017, 6:29 AM
adrian accepted this revision.Feb 9 2017, 6:37 AM
adrian added a reviewer: adrian.

i hope it doesn't break RSS multi-socket binding for incoming requests. :-)

julian accepted this revision.Feb 9 2017, 7:57 AM
julian edited edge metadata.

my only comment is that I ask you to just confirm that a listening socket can not collide with the socket we are allocating.
i don't think it does, but I may have missed something

julian added a comment.Feb 9 2017, 7:59 AM

who is this mysterious group reviewer "transport"? And where is their negative review? I only see gnn's

pkelsey requested changes to this revision.Feb 9 2017, 5:15 PM
pkelsey edited edge metadata.

This is the source comment that should have been submitted with my 'request for change' action, had I pressed the buttons in a way that didn't dispose of it instead.

sys/netinet/in_pcb.c
493

Why isn't this being implemented as a modification of the existing logic in in_pcblookup_local()? From a design standpoint, I think distributing the definition of 'is this local port available' to additional places outside of in_pcblookup_local() is a step in the wrong direction.

Bypassing in_pcblookup_local() and going straight to a non-wildcard in_pcblookup_hash_locked() lookup here creates a new behavior where an outbound connection can use a local address and port number that is being listened on. Currently, the logic in in_pcblookup_local() prevents such a thing. I am not sure if this is the 'collision' between a listening socket and the socket being allocated that julian@ was referring to, but certainly it is a change in behavior that is not necessary for the concept of this patch and one that I think should not be readily accepted as a side effect, and possibly is a change that is undesirable.

julian added a comment.Feb 9 2017, 5:23 PM

so what change did pkelsey request? I can't see any request to change anything.

julian added a comment.Feb 9 2017, 5:25 PM

This is the source comment that should have been submitted with my 'request for change' action, had I pressed the buttons in a way that didn't dispose of it instead.

yeah hence my previous comment. now I see your comment

julian added a comment.Feb 9 2017, 5:48 PM

I think that "collision" is ok.

Of course it IS a new behaviour, but that is what we are trying to get. new behaviour.

I think that "collision" is ok.

Of course it IS a new behaviour, but that is what we are trying to get. new behaviour.

I think @pkelsey is pretty persuasive, though, that it would be worth it to do and evaluate that separately, since it isn't a necessary component of the new behaviour in general. There's less consensus on the necessity of that than of the main portion of the change.

I think that "collision" is ok.

Of course it IS a new behaviour, but that is what we are trying to get. new behaviour.

My objection is that this doesn't seem to me to be the place to go into a technical dive on that behavior, particularly because it appears to exist only as a side effect of avoiding working with in_pcblookup_local(), which seems to be where the intended effects should be implemented.

eri added a comment.Feb 10 2017, 3:20 AM

Just some information to pkesley@ comment.

sys/netinet/in_pcb.c
493

It has not been done in in_pcb_lport since it means removing completely the logic on random port selection there which is used on anonymous binds.
Transfer this to context of hashes and to me seems reinventing infrastructure which i reused with this functions.
The code has already so many workarounds that would become a big hurdle and refactor that goes out of context of the change itself.

garga added a subscriber: garga.Feb 13 2017, 2:57 PM

Added some comments regarding style(9)

sys/netinet/in_pcb.c
494

style(9): Line too long

593

style(9): Line too long

714

style(9): Line too long

sys/netinet6/in6_pcb.c
299

style(9): Line too long

sys/netinet6/udp6_usrreq.c
771

style(9): Bad indent

774

style(9): Line too long

eri added a reviewer: ae.Feb 14 2017, 1:23 AM

AFAIU, the patch has been neither finished nor submitted to master.
The last activity was more than a year ago, so should it be considered as abandoned?
Are there any plans to add the feature into the kernel?