
Replace the single linked list of exported file system structures with a hash table of lists
Needs Review · Public

Authored by rmacklem on May 16 2019, 3:36 AM.

Details

Reviewers
cy
Summary

mountd.c uses a singly-linked list of "struct exportlist" structures, with one of these for each exported
file system on the NFS server. This list gets long when a large number of file systems are exported, and the list
must be searched for each line in the exports file(s) when SIGHUP causes the exports file(s) to be reloaded.
A simple benchmark that traverses SLIST() elements and compares two 32-bit fields in the structure for equality
(which is what the search does) appears to take a couple of nanoseconds per element. So, for a server with 72000
exported file systems, reloading the exports file(s) can take about 5 seconds.

By replacing the single linked list with a hash table sized for a target of 10 elements per list (what some call a
load factor of 10), the total search time should be reduced to something less than 1 msec.
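To make the structure concrete, here is a minimal sketch of the change: the one long SLIST becomes an array of SLISTs, with the array sized to roughly nexports / 10 to hit the target load factor. The structure and names here (`exphash`, `ex_lookup`, the two-word `fsid`) are simplified stand-ins for illustration, not the patch's actual code, and a trivial placeholder hash is used in place of the real hash function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Simplified stand-in for mountd's struct exportlist; the real
 * structure has many more fields.  fsid identifies the file system. */
struct exportlist {
	SLIST_ENTRY(exportlist) entries;
	uint32_t fsid[2];
};
SLIST_HEAD(exphead, exportlist);

/* The single list becomes an array of lists. */
static struct exphead *exphash;
static uint32_t exphashsize;

/* Size the table for ~10 elements per list (load factor of 10). */
static void
exphash_init(uint32_t nexports)
{
	uint32_t i;

	exphashsize = nexports / 10;
	if (exphashsize == 0)
		exphashsize = 1;
	exphash = calloc(exphashsize, sizeof(*exphash));
	if (exphash == NULL)
		abort();
	for (i = 0; i < exphashsize; i++)
		SLIST_INIT(&exphash[i]);
}

/* The search formerly walked one long list; now it walks only the
 * short list in the bucket the fsid hashes to.  A placeholder hash
 * is shown here; the patch itself uses fnv_32_buf(). */
static struct exportlist *
ex_lookup(uint32_t fsid0, uint32_t fsid1)
{
	struct exportlist *ep;
	uint32_t b = (fsid0 ^ fsid1) % exphashsize;

	SLIST_FOREACH(ep, &exphash[b], entries)
		if (ep->fsid[0] == fsid0 && ep->fsid[1] == fsid1)
			return (ep);
	return (NULL);
}
```

With 72000 exports this gives 7200 buckets of about 10 entries each, so each of the 72000 reload searches touches ~10 elements instead of ~72000.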

I had Peter Errikson (who has a server with 72000+ exported file systems) run a test program using 5 hash
functions to see how well they distributed the exports:
fnv_32_buf(…, 0)
fnv_32_buf(…, FNV1_32_INIT)
hash32_buf(…, 0)
hash32_buf(…, HASHINIT)
fsid.val[0]
The first three behaved about equally well, with the first one being slightly better than the others.
It had an average variation of about 4.5% around the target list length, and that is what this patch uses.
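For reference, the winning choice, fnv_32_buf(…, 0), is FreeBSD's FNV-1 hash from sys/fnv_hash.h with an initial basis of 0 instead of FNV1_32_INIT. The sketch below re-implements that routine inline so it is self-contained, and shows it mapping a two-word fsid to a bucket; the `struct fsid` type and `export_hash()` helper are hypothetical names for illustration, and `exphashsize` stands for however many buckets the table was created with.

```c
#include <stdint.h>
#include <stddef.h>

#define	FNV_32_PRIME	0x01000193u	/* 16777619 */
#define	FNV1_32_INIT	0x811c9dc5u	/* 2166136261 */

/* Minimal re-implementation of FreeBSD's fnv_32_buf() (sys/fnv_hash.h):
 * FNV-1 multiplies by the prime, then XORs in each byte. */
static uint32_t
fnv_32_buf(const void *buf, size_t len, uint32_t hval)
{
	const unsigned char *s = buf;

	while (len-- != 0) {
		hval *= FNV_32_PRIME;
		hval ^= *s++;
	}
	return (hval);
}

/* Hypothetical stand-in for the fsid stored in struct exportlist. */
struct fsid {
	uint32_t val[2];
};

/* Map an fsid to a bucket the way the patch does, i.e. with an
 * initial basis of 0 rather than FNV1_32_INIT. */
static uint32_t
export_hash(const struct fsid *fsid, uint32_t exphashsize)
{
	return (fnv_32_buf(fsid->val, sizeof(fsid->val), 0) % exphashsize);
}
```

Note that with a basis of 0, leading zero bytes of the buffer do not perturb the hash (0 * prime = 0, 0 ^ 0 = 0), which is one reason the FNV1_32_INIT variant is the textbook form; per the measurements above, the difference turned out not to matter much for fsid values.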

cem@ suggested that a load factor closer to 1 would be more typical, but since a reload of the exports file(s)
doesn't occur that frequently, getting the total search time for a reload below 1 msec seemed adequate to me.

Test Plan

I have tested this patch for an assortment of exports files and I am waiting for Peter Errikson to report on his
testing with a variety of file systems including one with 72000+ exported file systems.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 24252

Event Timeline

rmacklem created this revision. May 16 2019, 3:36 AM

Here are the results of a test run done by Peter for a hash table of the size this patch
creates and one ten times bigger:

“BigHash” (without “/ 10”)

May 18 15:06:56 filur00 mountd[66347]: time spent reading exports: 6.9s
May 18 15:06:56 filur00 mountd[66347]: time spent updating kernel: 16.4s
May 18 15:07:10 filur00 mountd[67659]: time spent reading exports: 902.5ms
May 18 15:07:10 filur00 mountd[67659]: time spent updating kernel: 17.9ms

“SmallHash” (with “/ 10”)

May 18 15:08:45 filur00 mountd[69272]: time spent reading exports: 8.9s
May 18 15:08:45 filur00 mountd[69272]: time spent updating kernel: 18.0s
May 18 15:08:59 filur00 mountd[69308]: time spent reading exports: 886.8ms
May 18 15:08:59 filur00 mountd[69308]: time spent updating kernel: 19.3ms

Note that the first two lines of each run are for the initial loading, which will only happen
when mountd is started (or for the "-I" option being worked on). Since this path does
not use the hash tables any more than the reloads in the last two lines do, I think the
differences in times are caused by delays doing the 40000+ nmount(2) and statfs(2) syscalls.

For the sum of the last two lines (which happen on each reload), the smaller
hash table size actually does better (by less than 1%), and there is probably
enough variance in the results that the difference is statistically insignificant.

Therefore, there does not seem to be a need for a larger hash table than the
patch currently generates.

cy added a reviewer: cy. May 21 2019, 3:39 AM