Page MenuHomeFreeBSD

sendfile_test: Finish to implement linear probing to find a free port
Needs ReviewPublic

Authored by olivier on May 13 2020, 9:07 PM.

Details

Reviewers
ngie
asomers
lwhsu
Summary

I've noticed that sendfile_test is flaky using this simple shell script:

#!/bin/sh
i=1
while true; do
        echo -n "Run: $i, "
        if /usr/tests/lib/libc/sys/sendfile_test hdtr_positive_v6  >/dev/null 2>&1; then
                echo success
        else
                echo failed
                exit
        fi
        i=$((i + 1))
done

On my system it fail before 3000 runs with this error:

Generating a random port with seed=2
Port range lower bound: 10000
Port range upper bound: 65535
Random port generated: 21793
Will try to bind socket to host='127.0.0.1', address_family=2, socket_type=1
failed: /FreeBSD/lib/libc/tests/sys/sendfile_test.c:198: error != 0: bind failed: Address already in use

So I've implemented the code with comment 'XXX: use linear probing to find a free port'.

Test Plan

The loop shell script given in example should not failed after a large number of tests in loop (minimum 10 000).

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

olivier requested review of this revision.May 13 2020, 9:07 PM
olivier created this revision.
ngie requested changes to this revision.May 13 2020, 10:23 PM
ngie added inline comments.
lib/libc/tests/sys/sendfile_test.c
177

What if ports 6000-9000 are all consumed?

Also, style(9) bug:

for (i = 6000; i < 9000; i++) {
179

I don't think it's wise to try and resolve localhost 3000 times -- the value shouldn't change across iterations (recommendation: move name resolution out of the loop).

Slow name resolution could potentially result in failed tests due to timeouts or failures to query the name info between iterations; it would definitely impact testing on OneFS, since we don't run a standard name server and it can be unreachable at a particular point in time depending on cluster state.

181

nit: please fix indentation for lines 182 & 183 (second level indents are 4 spaces, not 2).

193

Please fix per style(9) and please fix res getting leaked on success:

freeaddrinfo(res);
if (error == 0)
    break;
765

style(9) bug: don't initialize the variable here. Do this instead (there's a similar issue on other touched lines that needs to be resolved):

int client_sock, error, fd, port, server_sock;
struct sf_hdtr *hdtr1, hdtr2, hdtr3;

port = 0;
This revision now requires changes to proceed.May 13 2020, 10:23 PM
rrs added a subscriber: rrs.May 14 2020, 11:17 AM
rrs added inline comments.
lib/libc/tests/sys/sendfile_test.c
177

I agree here we probably just need to use the anon port space even though it is
highly unlikely that 3000 ports will be used. If we want to get a bit more complilcated
sysctl can be used and the values in net.inet.ip.portrange can be helpful. :)

765

Style 9 does not prohibit the use of variable initializers. It says to be thoughtful about
its use and not obfuscate code. It does prohibit the use of function calls in variable
initializers though.

I do prefer that you keep the initialized variables on a separate line though. I.e.
int client_sock, error, fd, server_sock;
it port = 0;

olivier added inline comments.May 14 2020, 4:21 PM
lib/libc/tests/sys/sendfile_test.c
177

If ports from 6000-9000 are all consumed during a regression test, then yes having the test failing is a correct behavior to alert about something wrong.

179

Yes but because function resolve_localhost() is using the port, it's more complex than just moving the name resolution out of the loop.

olivier updated this revision to Diff 73195.Jun 16 2020, 9:20 PM

Fixing style and almost all other remarks.

ngie added inline comments.Jun 16 2020, 10:08 PM
lib/libc/tests/sys/sendfile_test.c
177

I completely disagree: there's no guarantee that a test system will have all of those ports available.

olivier added inline comments.Jun 17 2020, 6:55 AM
lib/libc/tests/sys/sendfile_test.c
177

If the code is using all ports from 6000 to 9000, yes I disagree too :-)
My goal here, was to find the first free port (but only one) into the 6000-9000 range, and not to consume all.
So I believed that the line 192 "if (error == 0) then freeaddrinfo(res) + break" was breaking the loop as soon as the first bind returned a successs.
Isn't the case ?

Why don't you just remove call to bind() and listen on unbound socket? Then you get its address with getsockname() and use it in the client side?

olivier updated this revision to Diff 74747.Tue, Jul 21, 4:16 PM

Following gleb's advice, use getsockname avoiding the random port generation part
I've still binded the port (I didn't understand the improvement brings by not binding it).