Overview
sysctl(9) is an easy way to export information to userland while maintaining isolation.
We have dozens of interfaces that expose statistics and control data by filling in and exporting structures. net.inet6.icmp6.stats and net.inet6.icmp6.nd6_prlist are good examples of such interaction.
Most of these structures do not have version information embedded, which requires us to break compatibility when changing them.
Proposal
The proposed change addresses compatibility: versioning provides a way to change structures without breaking the ABI and with minimal API changes required on the user side.
Versioning is built on two simple principles. The first is defining the version of the structure (a #define) next to the structure itself.
The second is automatic export of multiple versions of the structure under the same OID: net.inet6.icmp6.stats.0 or net.inet6.icmp6.stats.1 can be requested to fetch a particular version of the structure. Note that fetching the original net.inet6.icmp6.stats still works and provides the oldest supported version of the structure.
This is built on top of another sysctl change that extends the dynamic OID functionality.
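The naming scheme can be sketched in portable C. This is only an illustration: EX_STRING/EX_XSTRING are hypothetical stand-ins for __XSTRING(), EX_ICMP6STAT_VER is an assumed version value, and ex_versioned_oid_name() is a made-up helper, not part of the proposed change.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for __XSTRING(): expand the macro, then stringify. */
#define EX_STRING(x)	#x
#define EX_XSTRING(x)	EX_STRING(x)

/* Assumed version number; the real one is defined next to the structure. */
#define EX_ICMP6STAT_VER	1

/* Compile-time form: adjacent string literals are concatenated. */
static const char ex_versioned_name[] =
    "net.inet6.icmp6.stats." EX_XSTRING(EX_ICMP6STAT_VER);

/*
 * Run-time form: append ".<version>" to a base OID name.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
ex_versioned_oid_name(char *buf, size_t buflen, const char *base,
    unsigned int version)
{
	int n = snprintf(buf, buflen, "%s.%u", base, version);

	return ((n < 0 || (size_t)n >= buflen) ? -1 : 0);
}
```

Either string would then be passed to sysctlbyname(3) in place of the unversioned name.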
Versioned OIDs
Versioned OIDs are simply a dynamic list of supported versions under a parent node.
Special macro-based wrappers are added to generate per-node functions for range validation, OID resolution and formatting.
Pros/Cons
Pros:
- no ABI breakage
- relatively small amount of changes for conversions
Cons:
Before the change, rebuilding software after an ABI breakage mostly worked, or we got a compile-time error indicating the problem.
Now third-party software may fail at run time: it may call the unversioned sysctl, which returns the _old_ structure, while the software is compiled against the new structure.
There are two parts to the solution here:
(1) rename structure after the conversion, leaving the old structure intact
(2) rely on emitted kernel warnings to identify and fix all third-party users of the particular sysctl
Conversion example
Unbreaking the ABI of net.inet6.icmp6.stats after existing changes (r358620 and r366842):
Userland changes:
-	if (fetch_stats("net.inet6.icmp6.stats",
+	if (fetch_stats("net.inet6.icmp6.stats." __XSTRING(ICMP6STAT_VER),
Kernel changes:
static int
icmp6_copyout_ver(SYSCTL_HANDLER_ARGS, void *new_data, size_t new_datalen)
{
	size_t sz = new_datalen;

#define ICMP6STAT_VER1_DIFF sizeof(uint64_t) * (33 + 4)
	/* Reminder to change the code here when changing version */
	_Static_assert(ICMP6STAT_VER == 1, "icmp6_copyout_ver() requires fixing");

	uint32_t version = sysctl_get_oid_version(oidp, arg1, arg2, req);
	switch(version) {
	case ICMP6STAT_VER:
		break;
	case 0:
		sz -= ICMP6STAT_VER1_DIFF;
		break;
	default:
		return (ENOTSUP);
	}
	return (SYSCTL_OUT(req, new_data, sz));
}

-SYSCTL_VNET_PCPUSTAT(_net_inet6_icmp6, ICMPV6CTL_STATS, stats,
-    struct icmp6stat, icmp6stat,
+SYSCTL_VNET_PCPUSTAT_VER(_net_inet6_icmp6, ICMPV6CTL_STATS, stats,
+    struct icmp6stat, icmp6stat, 0, ICMP6STAT_VER, icmp6_copyout_ver,
 "ICMPv6 statistics (struct icmp6stat, netinet/icmp6.h)");
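The size-trimming idea in the handler above can be tried outside the kernel with a small stand-alone sketch. It assumes that a new version only appends fields to the end of the structure, so an older version is a prefix of the newer one. struct ex_stat, EX_STAT_VER and ex_copyout_ver() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define EX_STAT_VER	1	/* hypothetical current version */

/* Version 1 layout: the version 0 fields first, new fields appended. */
struct ex_stat {
	uint64_t in_pkts;	/* present since version 0 */
	uint64_t out_pkts;	/* present since version 0 */
	uint64_t in_errors;	/* added in version 1 */
};

/* Number of bytes that version 1 appended to the version 0 layout. */
#define EX_STAT_VER1_DIFF	(sizeof(uint64_t) * 1)

/*
 * Copy the prefix of *cur that the requested version knows about into dst.
 * Returns 0 and stores the copied length in *outlen, or an errno value.
 */
static int
ex_copyout_ver(uint32_t version, const struct ex_stat *cur, void *dst,
    size_t dstlen, size_t *outlen)
{
	size_t sz = sizeof(*cur);

	switch (version) {
	case EX_STAT_VER:
		break;
	case 0:
		sz -= EX_STAT_VER1_DIFF;	/* drop the appended tail */
		break;
	default:
		return (ENOTSUP);
	}
	if (dstlen < sz)
		return (ENOMEM);
	memcpy(dst, cur, sz);
	*outlen = sz;
	return (0);
}
```

A caller asking for version 0 receives only the first two counters, while a caller asking for version 1 receives the whole structure. Layout changes other than appending fields would need an explicit per-version conversion instead of a length adjustment.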
Dynamic OIDs
We currently have a limited form of dynamic OIDs, implemented by filling in the oid_handler callback in sysctl nodes.
It allows requesting arbitrary OIDs by providing the exact numeric OID path via sysctl(3). This interface is incomplete: it does not support iteration, sysctlbyname(), or fetching dynamic OID metadata.
The change allows making dynamic OIDs first-class citizens.
It is implemented by filling in a special structure, sysctl_node_handlers, with the necessary callbacks:
struct sysctl_node_handlers {
	oid_handler_next_oid_t	*nextoid;
	oid_handler_name2oid_t	*name2oid;
	oid_handler_oidfmt_t	*oidfmt;
};
A pointer to this structure is stored in the oid_arg1 field, which is empty for non-leaf OIDs.
Existing kernel functions used for resolving, iterating over, and providing info for OIDs have been augmented to call these handlers and (mostly) transparently handle dynamic OIDs.
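As a rough illustration of what the name2oid and nextoid callbacks might do for a versioned node, here is a user-space sketch over a contiguous version range. The real oid_handler_*_t signatures are not shown in this description, so struct ex_ver_range, ex_name2oid() and ex_nextoid() are hypothetical and use simplified arguments.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical description of the versions served under one parent node. */
struct ex_ver_range {
	uint32_t min_ver;
	uint32_t max_ver;
};

/* Resolve a name component such as "1" to a version inside the range. */
static int
ex_name2oid(const struct ex_ver_range *r, const char *name, uint32_t *oid)
{
	char *end;
	unsigned long v;

	/* Accept plain decimal digits only (no sign, no leading space). */
	if (name[0] < '0' || name[0] > '9')
		return (ENOENT);
	errno = 0;
	v = strtoul(name, &end, 10);
	if (errno != 0 || *end != '\0' || v < r->min_ver || v > r->max_ver)
		return (ENOENT);
	*oid = (uint32_t)v;
	return (0);
}

/* Iterate: cur < 0 asks for the first child, otherwise the one after cur. */
static int
ex_nextoid(const struct ex_ver_range *r, int64_t cur, uint32_t *next)
{
	int64_t n = (cur < 0) ? (int64_t)r->min_ver : cur + 1;

	if (n < (int64_t)r->min_ver)
		n = (int64_t)r->min_ver;
	if (n > (int64_t)r->max_ver)
		return (ENOENT);
	*next = (uint32_t)n;
	return (0);
}
```

With a range of 0..1, ex_name2oid() accepts "0" and "1" and rejects everything else, and ex_nextoid() walks 0, 1 and then reports ENOENT, which is the behaviour name lookup and iteration need from such a node.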