Add an SPD cache to speed up lookups.
When large SPDs are used, we face two problems:
- too many CPU cycles are spent on linear searches of the SPD for each packet
- too much contention on multi-socket systems, since we use a single shared lock.
Main changes:
- add the sysctl tree 'net.key.spdcache' to control the SPD cache (disabled by default)
- cache the SP indexes that are used to perform SP lookups.
- use a range of dedicated mutexes to protect the cache lines.
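
The idea behind the last two changes can be sketched as follows. This is a minimal user-space illustration, not the code from this change: every name in it (struct sp_index fields, spdcache_*, SPDCACHE_BUCKETS, the integer policy_id) is hypothetical, and POSIX mutexes stand in for the kernel mutexes. Each cache line holds one SP index and its matching policy, and has its own lock, so lookups that hash to different lines do not contend.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache size; must be a power of two for the mask below. */
#define SPDCACHE_BUCKETS 64

/* Hypothetical, simplified SP index: the fields a packet is matched on. */
struct sp_index {
	uint32_t src, dst;
	uint16_t sport, dport;
	uint8_t proto;
};

/* One cache line: a single cached lookup result with its own mutex. */
struct spdcache_entry {
	pthread_mutex_t lock;	/* protects only this cache line */
	int valid;
	struct sp_index key;
	int policy_id;		/* stand-in for a pointer to the real policy */
};

static struct spdcache_entry spdcache[SPDCACHE_BUCKETS];

/* Map an SP index to a cache line (simple illustrative hash). */
static size_t
spdcache_hash(const struct sp_index *k)
{
	uint32_t h = k->src ^ (k->dst * 2654435761u);

	h ^= ((uint32_t)k->sport << 16) | k->dport;
	h ^= k->proto;
	h ^= h >> 16;
	return h & (SPDCACHE_BUCKETS - 1);
}

/* Field-by-field comparison, so struct padding is never inspected. */
static int
sp_index_equal(const struct sp_index *a, const struct sp_index *b)
{
	return a->src == b->src && a->dst == b->dst &&
	    a->sport == b->sport && a->dport == b->dport &&
	    a->proto == b->proto;
}

/* Initialize every cache line and its mutex; call once before use. */
void
spdcache_init(void)
{
	for (size_t i = 0; i < SPDCACHE_BUCKETS; i++) {
		pthread_mutex_init(&spdcache[i].lock, NULL);
		spdcache[i].valid = 0;
	}
}

/* Return 1 and set *policy_id on a hit, 0 on a miss. */
int
spdcache_lookup(const struct sp_index *k, int *policy_id)
{
	assert(k != NULL && policy_id != NULL);
	struct spdcache_entry *e = &spdcache[spdcache_hash(k)];
	int hit = 0;

	pthread_mutex_lock(&e->lock);
	if (e->valid && sp_index_equal(&e->key, k)) {
		*policy_id = e->policy_id;
		hit = 1;
	}
	pthread_mutex_unlock(&e->lock);
	return hit;
}

/* Record the result of a full SPD search, replacing the line's old entry. */
void
spdcache_insert(const struct sp_index *k, int policy_id)
{
	struct spdcache_entry *e = &spdcache[spdcache_hash(k)];

	pthread_mutex_lock(&e->lock);
	e->key = *k;
	e->policy_id = policy_id;
	e->valid = 1;
	pthread_mutex_unlock(&e->lock);
}

/* Drop all cached results, e.g. after the SPD itself is modified. */
void
spdcache_invalidate(void)
{
	for (size_t i = 0; i < SPDCACHE_BUCKETS; i++) {
		pthread_mutex_lock(&spdcache[i].lock);
		spdcache[i].valid = 0;
		pthread_mutex_unlock(&spdcache[i].lock);
	}
}
```

In this sketch a caller would try spdcache_lookup() first, fall back to the linear SPD search on a miss, and store the result with spdcache_insert(). Flushing the whole cache whenever the SPD changes is an assumption made here to keep the example short.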