diff --git a/sbin/ipfw/ipfw.8 b/sbin/ipfw/ipfw.8 index e77930355094..d2c4885bc119 100644 --- a/sbin/ipfw/ipfw.8 +++ b/sbin/ipfw/ipfw.8 @@ -1,4851 +1,4854 @@ .\" .\" $FreeBSD$ .\" .Dd August 21, 2020 .Dt IPFW 8 .Os .Sh NAME .Nm ipfw .Nd User interface for firewall, traffic shaper, packet scheduler, in-kernel NAT. .Sh SYNOPSIS .Ss FIREWALL CONFIGURATION .Nm .Op Fl cq .Cm add .Ar rule .Nm .Op Fl acdefnNStT .Op Cm set Ar N .Brq Cm list | show .Op Ar rule | first-last ... .Nm .Op Fl f | q .Op Cm set Ar N .Cm flush .Nm .Op Fl q .Op Cm set Ar N .Brq Cm delete | zero | resetlog .Op Ar number ... .Pp .Nm .Cm set Oo Cm disable Ar number ... Oc Op Cm enable Ar number ... .Nm .Cm set move .Op Cm rule .Ar number Cm to Ar number .Nm .Cm set swap Ar number number .Nm .Cm set show .Ss SYSCTL SHORTCUTS .Nm .Cm enable .Brq Cm firewall | altq | one_pass | debug | verbose | dyn_keepalive .Nm .Cm disable .Brq Cm firewall | altq | one_pass | debug | verbose | dyn_keepalive .Ss LOOKUP TABLES .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm modify Ar modify-options .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm swap Ar name .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm add Ar table-key Op Ar value .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm add Op Ar table-key Ar value ... .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm atomic add Op Ar table-key Ar value ... .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm delete Op Ar table-key ... .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm lookup Ar addr .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm lock .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm unlock .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm list .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm info .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm detail .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm flush .Ss DUMMYNET CONFIGURATION (TRAFFIC SHAPER AND PACKET SCHEDULER) .Nm .Brq Cm pipe | queue | sched .Ar number .Cm config .Ar config-options .Nm .Op Fl s Op Ar field .Brq Cm pipe | queue | sched .Brq Cm delete | list | show .Op Ar number ... .Ss IN-KERNEL NAT .Nm .Op Fl q .Cm nat .Ar number .Cm config .Ar config-options .Ss STATEFUL IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION .Nm .Oo Cm set Ar N Oc Cm nat64lsn Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm nat64lsn Ar name Cm config Ar config-options .Nm .Oo Cm set Ar N Oc Cm nat64lsn .Brq Ar name | all .Brq Cm list | show .Op Cm states .Nm .Oo Cm set Ar N Oc Cm nat64lsn .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm nat64lsn Ar name Cm stats Op Cm reset .Ss STATELESS IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION .Nm .Oo Cm set Ar N Oc Cm nat64stl Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm nat64stl Ar name Cm config Ar config-options .Nm .Oo Cm set Ar N Oc Cm nat64stl .Brq Ar name | all .Brq Cm list | show .Nm .Oo Cm set Ar N Oc Cm nat64stl .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm nat64stl Ar name Cm stats Op Cm reset .Ss XLAT464 CLAT IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION .Nm .Oo Cm set Ar N Oc Cm nat64clat Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm nat64clat Ar name Cm config Ar config-options .Nm .Oo Cm set Ar N Oc Cm nat64clat .Brq Ar name | all .Brq Cm list | show .Nm .Oo Cm set Ar N Oc Cm nat64clat .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm nat64clat Ar name Cm stats Op Cm reset .Ss IPv6-to-IPv6 NETWORK PREFIX TRANSLATION .Nm .Oo Cm set Ar N Oc Cm nptv6 Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm nptv6 .Brq Ar name | all .Brq Cm list | show .Nm .Oo Cm set Ar N Oc Cm nptv6 .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm nptv6 Ar name Cm stats Op Cm reset .Ss INTERNAL DIAGNOSTICS .Nm .Cm internal iflist .Nm .Cm internal talist .Nm .Cm internal vlist .Ss LIST OF RULES AND PREPROCESSING .Nm .Op Fl cfnNqS .Oo .Fl p Ar preproc .Oo .Ar preproc-flags .Oc .Oc .Ar pathname .Sh DESCRIPTION The .Nm utility is the user interface for controlling the .Xr ipfw 4 firewall, the .Xr dummynet 4 traffic shaper/packet scheduler, and the in-kernel NAT services. .Pp A firewall configuration, or .Em ruleset , is made of a list of .Em rules numbered from 1 to 65535. Packets are passed to the firewall from a number of different places in the protocol stack (depending on the source and destination of the packet, it is possible for the firewall to be invoked multiple times on the same packet). The packet passed to the firewall is compared against each of the rules in the .Em ruleset , in rule-number order (multiple rules with the same number are permitted, in which case they are processed in order of insertion). When a match is found, the action corresponding to the matching rule is performed. .Pp Depending on the action and certain system settings, packets can be reinjected into the firewall at some rule after the matching one for further processing. .Pp A ruleset always includes a .Em default rule (numbered 65535) which cannot be modified or deleted, and matches all packets. The action associated with the .Em default rule can be either .Cm deny or .Cm allow depending on how the kernel is configured. .Pp If the ruleset includes one or more rules with the .Cm keep-state , .Cm record-state , .Cm limit or .Cm set-limit option, the firewall will have a .Em stateful behaviour, i.e., upon a match it will create .Em dynamic rules , i.e., rules that match packets with the same 5-tuple (protocol, source and destination addresses and ports) as the packet which caused their creation. Dynamic rules, which have a limited lifetime, are checked at the first occurrence of a .Cm check-state , .Cm keep-state or .Cm limit rule, and are typically used to open the firewall on-demand to legitimate traffic only. Please note, that .Cm keep-state and .Cm limit imply implicit .Cm check-state for all packets (not only these matched by the rule) but .Cm record-state and .Cm set-limit have no implicit .Cm check-state . See the .Sx STATEFUL FIREWALL and .Sx EXAMPLES Sections below for more information on the stateful behaviour of .Nm . .Pp All rules (including dynamic ones) have a few associated counters: a packet count, a byte count, a log count and a timestamp indicating the time of the last match. Counters can be displayed or reset with .Nm commands. .Pp Each rule belongs to one of 32 different .Em sets , and there are .Nm commands to atomically manipulate sets, such as enable, disable, swap sets, move all rules in a set to another one, delete all rules in a set. These can be useful to install temporary configurations, or to test them. See Section .Sx SETS OF RULES for more information on .Em sets . .Pp Rules can be added with the .Cm add command; deleted individually or in groups with the .Cm delete command, and globally (except those in set 31) with the .Cm flush command; displayed, optionally with the content of the counters, using the .Cm show and .Cm list commands. Finally, counters can be reset with the .Cm zero and .Cm resetlog commands. .Ss COMMAND OPTIONS The following general options are available when invoking .Nm : .Bl -tag -width indent .It Fl a Show counter values when listing rules. The .Cm show command implies this option. .It Fl b Only show the action and the comment, not the body of a rule. Implies .Fl c . .It Fl c When entering or showing rules, print them in compact form, i.e., omitting the "ip from any to any" string when this does not carry any additional information. .It Fl d When listing, show dynamic rules in addition to static ones. .It Fl D When listing, show only dynamic states. When deleting, delete only dynamic states. .It Fl f Run without prompting for confirmation for commands that can cause problems if misused, i.e., .Cm flush . If there is no tty associated with the process, this is implied. The .Cm delete command with this flag ignores possible errors, i.e., nonexistent rule number. And for batched commands execution continues with the next command. .It Fl i When listing a table (see the .Sx LOOKUP TABLES section below for more information on lookup tables), format values as IP addresses. By default, values are shown as integers. .It Fl n Only check syntax of the command strings, without actually passing them to the kernel. .It Fl N Try to resolve addresses and service names in output. .It Fl q Be quiet when executing the .Cm add , .Cm nat , .Cm zero , .Cm resetlog or .Cm flush commands; (implies .Fl f ) . This is useful when updating rulesets by executing multiple .Nm commands in a script (e.g., .Ql sh\ /etc/rc.firewall ) , or by processing a file with many .Nm rules across a remote login session. It also stops a table add or delete from failing if the entry already exists or is not present. .Pp The reason why this option may be important is that for some of these actions, .Nm may print a message; if the action results in blocking the traffic to the remote client, the remote login session will be closed and the rest of the ruleset will not be processed. Access to the console would then be required to recover. .It Fl S When listing rules, show the .Em set each rule belongs to. If this flag is not specified, disabled rules will not be listed. .It Fl s Op Ar field When listing pipes, sort according to one of the four counters (total or current packets or bytes). .It Fl t When listing, show last match timestamp converted with .Fn ctime . .It Fl T When listing, show last match timestamp as seconds from the epoch. This form can be more convenient for postprocessing by scripts. .El .Ss LIST OF RULES AND PREPROCESSING To ease configuration, rules can be put into a file which is processed using .Nm as shown in the last synopsis line. An absolute .Ar pathname must be used. The file will be read line by line and applied as arguments to the .Nm utility. .Pp Optionally, a preprocessor can be specified using .Fl p Ar preproc where .Ar pathname is to be piped through. Useful preprocessors include .Xr cpp 1 and .Xr m4 1 . If .Ar preproc does not start with a slash .Pq Ql / as its first character, the usual .Ev PATH name search is performed. Care should be taken with this in environments where not all file systems are mounted (yet) by the time .Nm is being run (e.g.\& when they are mounted over NFS). Once .Fl p has been specified, any additional arguments are passed on to the preprocessor for interpretation. This allows for flexible configuration files (like conditionalizing them on the local hostname) and the use of macros to centralize frequently required arguments like IP addresses. .Ss TRAFFIC SHAPER CONFIGURATION The .Nm .Cm pipe , queue and .Cm sched commands are used to configure the traffic shaper and packet scheduler. See the .Sx TRAFFIC SHAPER (DUMMYNET) CONFIGURATION Section below for details. .Pp If the world and the kernel get out of sync the .Nm ABI may break, preventing you from being able to add any rules. This can adversely affect the booting process. You can use .Nm .Cm disable .Cm firewall to temporarily disable the firewall to regain access to the network, allowing you to fix the problem. .Sh PACKET FLOW A packet is checked against the active ruleset in multiple places in the protocol stack, under control of several sysctl variables. These places and variables are shown below, and it is important to have this picture in mind in order to design a correct ruleset. .Bd -literal -offset indent ^ to upper layers V | | +----------->-----------+ ^ V [ip(6)_input] [ip(6)_output] net.inet(6).ip(6).fw.enable=1 | | ^ V [ether_demux] [ether_output_frame] net.link.ether.ipfw=1 | | +-->--[bdg_forward]-->--+ net.link.bridge.ipfw=1 ^ V | to devices | .Ed .Pp The number of times the same packet goes through the firewall can vary between 0 and 4 depending on packet source and destination, and system configuration. .Pp Note that as packets flow through the stack, headers can be stripped or added to it, and so they may or may not be available for inspection. E.g., incoming packets will include the MAC header when .Nm is invoked from .Cm ether_demux() , but the same packets will have the MAC header stripped off when .Nm is invoked from .Cm ip_input() or .Cm ip6_input() . .Pp Also note that each packet is always checked against the complete ruleset, irrespective of the place where the check occurs, or the source of the packet. If a rule contains some match patterns or actions which are not valid for the place of invocation (e.g.\& trying to match a MAC header within .Cm ip_input or .Cm ip6_input ), the match pattern will not match, but a .Cm not operator in front of such patterns .Em will cause the pattern to .Em always match on those packets. It is thus the responsibility of the programmer, if necessary, to write a suitable ruleset to differentiate among the possible places. .Cm skipto rules can be useful here, as an example: .Bd -literal -offset indent # packets from ether_demux or bdg_forward ipfw add 10 skipto 1000 all from any to any layer2 in # packets from ip_input ipfw add 10 skipto 2000 all from any to any not layer2 in # packets from ip_output ipfw add 10 skipto 3000 all from any to any not layer2 out # packets from ether_output_frame ipfw add 10 skipto 4000 all from any to any layer2 out .Ed .Pp (yes, at the moment there is no way to differentiate between ether_demux and bdg_forward). .Pp Also note that only actions .Cm allow , .Cm deny , .Cm netgraph , .Cm ngtee and related to .Cm dummynet are processed for .Cm layer2 frames and all other actions act as if they were .Cm allow for such frames. Full set of actions is supported for IP packets without .Cm layer2 headers only. For example, .Cm divert action does not divert .Cm layer2 frames. .Sh SYNTAX In general, each keyword or argument must be provided as a separate command line argument, with no leading or trailing spaces. Keywords are case-sensitive, whereas arguments may or may not be case-sensitive depending on their nature (e.g.\& uid's are, hostnames are not). .Pp Some arguments (e.g., port or address lists) are comma-separated lists of values. In this case, spaces after commas ',' are allowed to make the line more readable. You can also put the entire command (including flags) into a single argument. E.g., the following forms are equivalent: .Bd -literal -offset indent ipfw -q add deny src-ip 10.0.0.0/24,127.0.0.1/8 ipfw -q add deny src-ip 10.0.0.0/24, 127.0.0.1/8 ipfw "-q add deny src-ip 10.0.0.0/24, 127.0.0.1/8" .Ed .Sh RULE FORMAT The format of firewall rules is the following: .Bd -ragged -offset indent .Bk -words .Op Ar rule_number .Op Cm set Ar set_number .Op Cm prob Ar match_probability .Ar action .Op Cm log Op Cm logamount Ar number .Op Cm altq Ar queue .Oo .Bro Cm tag | untag .Brc Ar number .Oc .Ar body .Ek .Ed .Pp where the body of the rule specifies which information is used for filtering packets, among the following: .Pp .Bl -tag -width "Source and dest. addresses and ports" -offset XXX -compact .It Layer-2 header fields When available .It IPv4 and IPv6 Protocol SCTP, TCP, UDP, ICMP, etc. .It Source and dest. addresses and ports .It Direction See Section .Sx PACKET FLOW .It Transmit and receive interface By name or address .It Misc. IP header fields Version, type of service, datagram length, identification, fragmentation flags, Time To Live .It IP options .It IPv6 Extension headers Fragmentation, Hop-by-Hop options, Routing Headers, Source routing rthdr0, Mobile IPv6 rthdr2, IPSec options. .It IPv6 Flow-ID .It Misc. TCP header fields TCP flags (SYN, FIN, ACK, RST, etc.), sequence number, acknowledgment number, window .It TCP options .It ICMP types for ICMP packets .It ICMP6 types for ICMP6 packets .It User/group ID When the packet can be associated with a local socket. .It Divert status Whether a packet came from a divert socket (e.g., .Xr natd 8 ) . .It Fib annotation state Whether a packet has been tagged for using a specific FIB (routing table) in future forwarding decisions. .El .Pp Note that some of the above information, e.g.\& source MAC or IP addresses and TCP/UDP ports, can be easily spoofed, so filtering on those fields alone might not guarantee the desired results. .Bl -tag -width indent .It Ar rule_number Each rule is associated with a .Ar rule_number in the range 1..65535, with the latter reserved for the .Em default rule. Rules are checked sequentially by rule number. Multiple rules can have the same number, in which case they are checked (and listed) according to the order in which they have been added. If a rule is entered without specifying a number, the kernel will assign one in such a way that the rule becomes the last one before the .Em default rule. Automatic rule numbers are assigned by incrementing the last non-default rule number by the value of the sysctl variable .Ar net.inet.ip.fw.autoinc_step which defaults to 100. If this is not possible (e.g.\& because we would go beyond the maximum allowed rule number), the number of the last non-default value is used instead. .It Cm set Ar set_number Each rule is associated with a .Ar set_number in the range 0..31. Sets can be individually disabled and enabled, so this parameter is of fundamental importance for atomic ruleset manipulation. It can be also used to simplify deletion of groups of rules. If a rule is entered without specifying a set number, set 0 will be used. .br Set 31 is special in that it cannot be disabled, and rules in set 31 are not deleted by the .Nm ipfw flush command (but you can delete them with the .Nm ipfw delete set 31 command). Set 31 is also used for the .Em default rule. .It Cm prob Ar match_probability A match is only declared with the specified probability (floating point number between 0 and 1). This can be useful for a number of applications such as random packet drop or (in conjunction with .Nm dummynet ) to simulate the effect of multiple paths leading to out-of-order packet delivery. .Pp Note: this condition is checked before any other condition, including ones such as .Cm keep-state or .Cm check-state which might have side effects. .It Cm log Op Cm logamount Ar number Packets matching a rule with the .Cm log keyword will be made available for logging in two ways: if the sysctl variable .Va net.inet.ip.fw.verbose is set to 0 (default), one can use .Xr bpf 4 attached to the .Li ipfw0 pseudo interface. This pseudo interface can be created manually after a system boot by using the following command: .Bd -literal -offset indent # ifconfig ipfw0 create .Ed .Pp Or, automatically at boot time by adding the following line to the .Xr rc.conf 5 file: .Bd -literal -offset indent firewall_logif="YES" .Ed .Pp There is zero overhead when no .Xr bpf 4 is attached to the pseudo interface. .Pp If .Va net.inet.ip.fw.verbose is set to 1, packets will be logged to .Xr syslogd 8 with a .Dv LOG_SECURITY facility up to a maximum of .Cm logamount packets. If no .Cm logamount is specified, the limit is taken from the sysctl variable .Va net.inet.ip.fw.verbose_limit . In both cases, a value of 0 means unlimited logging. .Pp Once the limit is reached, logging can be re-enabled by clearing the logging counter or the packet counter for that entry, see the .Cm resetlog command. .Pp Note: logging is done after all other packet matching conditions have been successfully verified, and before performing the final action (accept, deny, etc.) on the packet. .It Cm tag Ar number When a packet matches a rule with the .Cm tag keyword, the numeric tag for the given .Ar number in the range 1..65534 will be attached to the packet. The tag acts as an internal marker (it is not sent out over the wire) that can be used to identify these packets later on. This can be used, for example, to provide trust between interfaces and to start doing policy-based filtering. A packet can have multiple tags at the same time. Tags are "sticky", meaning once a tag is applied to a packet by a matching rule it exists until explicit removal. Tags are kept with the packet everywhere within the kernel, but are lost when packet leaves the kernel, for example, on transmitting packet out to the network or sending packet to a .Xr divert 4 socket. .Pp To check for previously applied tags, use the .Cm tagged rule option. To delete previously applied tag, use the .Cm untag keyword. .Pp Note: since tags are kept with the packet everywhere in kernelspace, they can be set and unset anywhere in the kernel network subsystem (using the .Xr mbuf_tags 9 facility), not only by means of the .Xr ipfw 4 .Cm tag and .Cm untag keywords. For example, there can be a specialized .Xr netgraph 4 node doing traffic analyzing and tagging for later inspecting in firewall. .It Cm untag Ar number When a packet matches a rule with the .Cm untag keyword, the tag with the number .Ar number is searched among the tags attached to this packet and, if found, removed from it. Other tags bound to packet, if present, are left untouched. .It Cm altq Ar queue When a packet matches a rule with the .Cm altq keyword, the ALTQ identifier for the given .Ar queue (see .Xr altq 4 ) will be attached. Note that this ALTQ tag is only meaningful for packets going "out" of IPFW, and not being rejected or going to divert sockets. Note that if there is insufficient memory at the time the packet is processed, it will not be tagged, so it is wise to make your ALTQ "default" queue policy account for this. If multiple .Cm altq rules match a single packet, only the first one adds the ALTQ classification tag. In doing so, traffic may be shaped by using .Cm count Cm altq Ar queue rules for classification early in the ruleset, then later applying the filtering decision. For example, .Cm check-state and .Cm keep-state rules may come later and provide the actual filtering decisions in addition to the fallback ALTQ tag. .Pp You must run .Xr pfctl 8 to set up the queues before IPFW will be able to look them up by name, and if the ALTQ disciplines are rearranged, the rules in containing the queue identifiers in the kernel will likely have gone stale and need to be reloaded. Stale queue identifiers will probably result in misclassification. .Pp All system ALTQ processing can be turned on or off via .Nm .Cm enable Ar altq and .Nm .Cm disable Ar altq . The usage of .Va net.inet.ip.fw.one_pass is irrelevant to ALTQ traffic shaping, as the actual rule action is followed always after adding an ALTQ tag. .El .Ss RULE ACTIONS A rule can be associated with one of the following actions, which will be executed when the packet matches the body of the rule. .Bl -tag -width indent .It Cm allow | accept | pass | permit Allow packets that match rule. The search terminates. .It Cm check-state Op Ar :flowname | Cm :any Checks the packet against the dynamic ruleset. If a match is found, execute the action associated with the rule which generated this dynamic rule, otherwise move to the next rule. .br .Cm Check-state rules do not have a body. If no .Cm check-state rule is found, the dynamic ruleset is checked at the first .Cm keep-state or .Cm limit rule. The .Ar :flowname is symbolic name assigned to dynamic rule by .Cm keep-state opcode. The special flowname .Cm :any can be used to ignore states flowname when matching. The .Cm :default keyword is special name used for compatibility with old rulesets. .It Cm count Update counters for all packets that match rule. The search continues with the next rule. .It Cm deny | drop Discard packets that match this rule. The search terminates. .It Cm divert Ar port Divert packets that match this rule to the .Xr divert 4 socket bound to port .Ar port . The search terminates. .It Cm fwd | forward Ar ipaddr | tablearg Ns Op , Ns Ar port Change the next-hop on matching packets to .Ar ipaddr , which can be an IP address or a host name. The next hop can also be supplied by the last table looked up for the packet by using the .Cm tablearg keyword instead of an explicit address. The search terminates if this rule matches. .Pp If .Ar ipaddr is a local address, then matching packets will be forwarded to .Ar port (or the port number in the packet if one is not specified in the rule) on the local machine. .br If .Ar ipaddr is not a local address, then the port number (if specified) is ignored, and the packet will be forwarded to the remote address, using the route as found in the local routing table for that IP. .br A .Ar fwd rule will not match layer-2 packets (those received on ether_input, ether_output, or bridged). .br The .Cm fwd action does not change the contents of the packet at all. In particular, the destination address remains unmodified, so packets forwarded to another system will usually be rejected by that system unless there is a matching rule on that system to capture them. For packets forwarded locally, the local address of the socket will be set to the original destination address of the packet. This makes the .Xr netstat 1 entry look rather weird but is intended for use with transparent proxy servers. .It Cm nat Ar nat_nr | tablearg Pass packet to a nat instance (for network address translation, address redirect, etc.): see the .Sx NETWORK ADDRESS TRANSLATION (NAT) Section for further information. .It Cm nat64lsn Ar name Pass packet to a stateful NAT64 instance (for IPv6/IPv4 network address and protocol translation): see the .Sx IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION Section for further information. .It Cm nat64stl Ar name Pass packet to a stateless NAT64 instance (for IPv6/IPv4 network address and protocol translation): see the .Sx IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION Section for further information. .It Cm nat64clat Ar name Pass packet to a CLAT NAT64 instance (for client-side IPv6/IPv4 network address and protocol translation): see the .Sx IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION Section for further information. .It Cm nptv6 Ar name Pass packet to a NPTv6 instance (for IPv6-to-IPv6 network prefix translation): see the .Sx IPv6-to-IPv6 NETWORK PREFIX TRANSLATION (NPTv6) Section for further information. .It Cm pipe Ar pipe_nr Pass packet to a .Nm dummynet .Dq pipe (for bandwidth limitation, delay, etc.). See the .Sx TRAFFIC SHAPER (DUMMYNET) CONFIGURATION Section for further information. The search terminates; however, on exit from the pipe and if the .Xr sysctl 8 variable .Va net.inet.ip.fw.one_pass is not set, the packet is passed again to the firewall code starting from the next rule. .It Cm queue Ar queue_nr Pass packet to a .Nm dummynet .Dq queue (for bandwidth limitation using WF2Q+). .It Cm reject (Deprecated). Synonym for .Cm unreach host . .It Cm reset Discard packets that match this rule, and if the packet is a TCP packet, try to send a TCP reset (RST) notice. The search terminates. .It Cm reset6 Discard packets that match this rule, and if the packet is a TCP packet, try to send a TCP reset (RST) notice. The search terminates. .It Cm skipto Ar number | tablearg Skip all subsequent rules numbered less than .Ar number . The search continues with the first rule numbered .Ar number or higher. It is possible to use the .Cm tablearg keyword with a skipto for a .Em computed skipto. Skipto may work either in O(log(N)) or in O(1) depending on amount of memory and/or sysctl variables. See the .Sx SYSCTL VARIABLES section for more details. .It Cm call Ar number | tablearg The current rule number is saved in the internal stack and ruleset processing continues with the first rule numbered .Ar number or higher. If later a rule with the .Cm return action is encountered, the processing returns to the first rule with number of this .Cm call rule plus one or higher (the same behaviour as with packets returning from .Xr divert 4 socket after a .Cm divert action). This could be used to make somewhat like an assembly language .Dq subroutine calls to rules with common checks for different interfaces, etc. .Pp Rule with any number could be called, not just forward jumps as with .Cm skipto . So, to prevent endless loops in case of mistakes, both .Cm call and .Cm return actions don't do any jumps and simply go to the next rule if memory cannot be allocated or stack overflowed/underflowed. .Pp Internally stack for rule numbers is implemented using .Xr mbuf_tags 9 facility and currently has size of 16 entries. As mbuf tags are lost when packet leaves the kernel, .Cm divert should not be used in subroutines to avoid endless loops and other undesired effects. .It Cm return Takes rule number saved to internal stack by the last .Cm call action and returns ruleset processing to the first rule with number greater than number of corresponding .Cm call rule. See description of the .Cm call action for more details. .Pp Note that .Cm return rules usually end a .Dq subroutine and thus are unconditional, but .Nm command-line utility currently requires every action except .Cm check-state to have body. While it is sometimes useful to return only on some packets, usually you want to print just .Dq return for readability. A workaround for this is to use new syntax and .Fl c switch: .Bd -literal -offset indent # Add a rule without actual body ipfw add 2999 return via any # List rules without "from any to any" part ipfw -c list .Ed .Pp This cosmetic annoyance may be fixed in future releases. .It Cm tee Ar port Send a copy of packets matching this rule to the .Xr divert 4 socket bound to port .Ar port . The search continues with the next rule. .It Cm unreach Ar code Discard packets that match this rule, and try to send an ICMP unreachable notice with code .Ar code , where .Ar code is a number from 0 to 255, or one of these aliases: .Cm net , host , protocol , port , .Cm needfrag , srcfail , net-unknown , host-unknown , .Cm isolated , net-prohib , host-prohib , tosnet , .Cm toshost , filter-prohib , host-precedence or .Cm precedence-cutoff . The search terminates. .It Cm unreach6 Ar code Discard packets that match this rule, and try to send an ICMPv6 unreachable notice with code .Ar code , where .Ar code is a number from 0, 1, 3 or 4, or one of these aliases: .Cm no-route, admin-prohib, address or .Cm port . The search terminates. .It Cm netgraph Ar cookie Divert packet into netgraph with given .Ar cookie . The search terminates. If packet is later returned from netgraph it is either accepted or continues with the next rule, depending on .Va net.inet.ip.fw.one_pass sysctl variable. .It Cm ngtee Ar cookie A copy of packet is diverted into netgraph, original packet continues with the next rule. See .Xr ng_ipfw 4 for more information on .Cm netgraph and .Cm ngtee actions. .It Cm setfib Ar fibnum | tablearg The packet is tagged so as to use the FIB (routing table) .Ar fibnum in any subsequent forwarding decisions. In the current implementation, this is limited to the values 0 through 15, see .Xr setfib 2 . Processing continues at the next rule. It is possible to use the .Cm tablearg keyword with setfib. If the tablearg value is not within the compiled range of fibs, the packet's fib is set to 0. .It Cm setdscp Ar DSCP | number | tablearg Set specified DiffServ codepoint for an IPv4/IPv6 packet. Processing continues at the next rule. Supported values are: .Pp .Cm cs0 .Pq Dv 000000 , .Cm cs1 .Pq Dv 001000 , .Cm cs2 .Pq Dv 010000 , .Cm cs3 .Pq Dv 011000 , .Cm cs4 .Pq Dv 100000 , .Cm cs5 .Pq Dv 101000 , .Cm cs6 .Pq Dv 110000 , .Cm cs7 .Pq Dv 111000 , .Cm af11 .Pq Dv 001010 , .Cm af12 .Pq Dv 001100 , .Cm af13 .Pq Dv 001110 , .Cm af21 .Pq Dv 010010 , .Cm af22 .Pq Dv 010100 , .Cm af23 .Pq Dv 010110 , .Cm af31 .Pq Dv 011010 , .Cm af32 .Pq Dv 011100 , .Cm af33 .Pq Dv 011110 , .Cm af41 .Pq Dv 100010 , .Cm af42 .Pq Dv 100100 , .Cm af43 .Pq Dv 100110 , .Cm ef .Pq Dv 101110 , .Cm be .Pq Dv 000000 . Additionally, DSCP value can be specified by number (0..63). It is also possible to use the .Cm tablearg keyword with setdscp. If the tablearg value is not within the 0..63 range, lower 6 bits of supplied value are used. .It Cm tcp-setmss Ar mss Set the Maximum Segment Size (MSS) in the TCP segment to value .Ar mss . The kernel module .Cm ipfw_pmod should be loaded or kernel should have .Cm options IPFIREWALL_PMOD to be able use this action. This command does not change a packet if original MSS value is lower than specified value. Both TCP over IPv4 and over IPv6 are supported. Regardless of matched a packet or not by the .Cm tcp-setmss rule, the search continues with the next rule. .It Cm reass Queue and reassemble IPv4 fragments. If the packet is not fragmented, counters are updated and processing continues with the next rule. If the packet is the last logical fragment, the packet is reassembled and, if .Va net.inet.ip.fw.one_pass is set to 0, processing continues with the next rule. Otherwise, the packet is allowed to pass and the search terminates. If the packet is a fragment in the middle of a logical group of fragments, it is consumed and processing stops immediately. .Pp Fragment handling can be tuned via .Va net.inet.ip.maxfragpackets and .Va net.inet.ip.maxfragsperpacket which limit, respectively, the maximum number of processable fragments (default: 800) and the maximum number of fragments per packet (default: 16). .Pp NOTA BENE: since fragments do not contain port numbers, they should be avoided with the .Nm reass rule. Alternatively, direction-based (like .Nm in / .Nm out ) and source-based (like .Nm via ) match patterns can be used to select fragments. .Pp Usually a simple rule like: .Bd -literal -offset indent # reassemble incoming fragments ipfw add reass all from any to any in .Ed .Pp is all you need at the beginning of your ruleset. .It Cm abort Discard packets that match this rule, and if the packet is an SCTP packet, try to send an SCTP packet containing an ABORT chunk. The search terminates. .It Cm abort6 Discard packets that match this rule, and if the packet is an SCTP packet, try to send an SCTP packet containing an ABORT chunk. The search terminates. .El .Ss RULE BODY The body of a rule contains zero or more patterns (such as specific source and destination addresses or ports, protocol options, incoming or outgoing interfaces, etc.) that the packet must match in order to be recognised. In general, the patterns are connected by (implicit) .Cm and operators -- i.e., all must match in order for the rule to match. Individual patterns can be prefixed by the .Cm not operator to reverse the result of the match, as in .Pp .Dl "ipfw add 100 allow ip from not 1.2.3.4 to any" .Pp Additionally, sets of alternative match patterns .Pq Em or-blocks can be constructed by putting the patterns in lists enclosed between parentheses ( ) or braces { }, and using the .Cm or operator as follows: .Pp .Dl "ipfw add 100 allow ip from { x or not y or z } to any" .Pp Only one level of parentheses is allowed. Beware that most shells have special meanings for parentheses or braces, so it is advisable to put a backslash \\ in front of them to prevent such interpretations. .Pp The body of a rule must in general include a source and destination address specifier. The keyword .Ar any can be used in various places to specify that the content of a required field is irrelevant. .Pp The rule body has the following format: .Bd -ragged -offset indent .Op Ar proto Cm from Ar src Cm to Ar dst .Op Ar options .Ed .Pp The first part (proto from src to dst) is for backward compatibility with earlier versions of .Fx . In modern .Fx any match pattern (including MAC headers, IP protocols, addresses and ports) can be specified in the .Ar options section. .Pp Rule fields have the following meaning: .Bl -tag -width indent .It Ar proto : protocol | Cm { Ar protocol Cm or ... } .It Ar protocol : Oo Cm not Oc Ar protocol-name | protocol-number An IP protocol specified by number or name (for a complete list see .Pa /etc/protocols ) , or one of the following keywords: .Bl -tag -width indent .It Cm ip4 | ipv4 Matches IPv4 packets. .It Cm ip6 | ipv6 Matches IPv6 packets. .It Cm ip | all Matches any packet. .El .Pp The .Cm ipv6 in .Cm proto option will be treated as inner protocol. And, the .Cm ipv4 is not available in .Cm proto option. .Pp The .Cm { Ar protocol Cm or ... } format (an .Em or-block ) is provided for convenience only but its use is deprecated. .It Ar src No and Ar dst : Bro Cm addr | Cm { Ar addr Cm or ... } Brc Op Oo Cm not Oc Ar ports An address (or a list, see below) optionally followed by .Ar ports specifiers. .Pp The second format .Em ( or-block with multiple addresses) is provided for convenience only and its use is discouraged. .It Ar addr : Oo Cm not Oc Bro .Cm any | me | me6 | .Cm table Ns Pq Ar name Ns Op , Ns Ar value .Ar | addr-list | addr-set .Brc .Bl -tag -width indent .It Cm any Matches any IP address. .It Cm me Matches any IP address configured on an interface in the system. .It Cm me6 Matches any IPv6 address configured on an interface in the system. The address list is evaluated at the time the packet is analysed. .It Cm table Ns Pq Ar name Ns Op , Ns Ar value Matches any IPv4 or IPv6 address for which an entry exists in the lookup table .Ar number . If an optional 32-bit unsigned .Ar value is also specified, an entry will match only if it has this value. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .El .It Ar addr-list : ip-addr Ns Op Ns , Ns Ar addr-list .It Ar ip-addr : A host or subnet address specified in one of the following ways: .Bl -tag -width indent .It Ar numeric-ip | hostname Matches a single IPv4 address, specified as dotted-quad or a hostname. Hostnames are resolved at the time the rule is added to the firewall list. .It Ar addr Ns / Ns Ar masklen Matches all addresses with base .Ar addr (specified as an IP address, a network number, or a hostname) and mask width of .Cm masklen bits. As an example, 1.2.3.4/25 or 1.2.3.0/25 will match all IP numbers from 1.2.3.0 to 1.2.3.127 . .It Ar addr Ns : Ns Ar mask Matches all addresses with base .Ar addr (specified as an IP address, a network number, or a hostname) and the mask of .Ar mask , specified as a dotted quad. As an example, 1.2.3.4:255.0.255.0 or 1.0.3.0:255.0.255.0 will match 1.*.3.*. This form is advised only for non-contiguous masks. It is better to resort to the .Ar addr Ns / Ns Ar masklen format for contiguous masks, which is more compact and less error-prone. .El .It Ar addr-set : addr Ns Oo Ns / Ns Ar masklen Oc Ns Cm { Ns Ar list Ns Cm } .It Ar list : Bro Ar num | num-num Brc Ns Op Ns , Ns Ar list Matches all addresses with base address .Ar addr (specified as an IP address, a network number, or a hostname) and whose last byte is in the list between braces { } . Note that there must be no spaces between braces and numbers (spaces after commas are allowed). Elements of the list can be specified as single entries or ranges. The .Ar masklen field is used to limit the size of the set of addresses, and can have any value between 24 and 32. If not specified, it will be assumed as 24. .br This format is particularly useful to handle sparse address sets within a single rule. Because the matching occurs using a bitmask, it takes constant time and dramatically reduces the complexity of rulesets. .br As an example, an address specified as 1.2.3.4/24{128,35-55,89} or 1.2.3.0/24{128,35-55,89} will match the following IP addresses: .br 1.2.3.128, 1.2.3.35 to 1.2.3.55, 1.2.3.89 . .It Ar addr6-list : ip6-addr Ns Op Ns , Ns Ar addr6-list .It Ar ip6-addr : A host or subnet specified one of the following ways: .Bl -tag -width indent .It Ar numeric-ip | hostname Matches a single IPv6 address as allowed by .Xr inet_pton 3 or a hostname. Hostnames are resolved at the time the rule is added to the firewall list. .It Ar addr Ns / Ns Ar masklen Matches all IPv6 addresses with base .Ar addr (specified as allowed by .Xr inet_pton 3 or a hostname) and mask width of .Cm masklen bits. .It Ar addr Ns / Ns Ar mask Matches all IPv6 addresses with base .Ar addr (specified as allowed by .Xr inet_pton 3 or a hostname) and the mask of .Ar mask , specified as allowed by .Xr inet_pton 3 . As an example, fe::640:0:0/ffff::ffff:ffff:0:0 will match fe:*:*:*:0:640:*:*. This form is advised only for non-contiguous masks. It is better to resort to the .Ar addr Ns / Ns Ar masklen format for contiguous masks, which is more compact and less error-prone. .El .Pp No support for sets of IPv6 addresses is provided because IPv6 addresses are typically random past the initial prefix. .It Ar ports : Bro Ar port | port Ns \&- Ns Ar port Ns Brc Ns Op , Ns Ar ports For protocols which support port numbers (such as SCTP, TCP and UDP), optional .Cm ports may be specified as one or more ports or port ranges, separated by commas but no spaces, and an optional .Cm not operator. The .Ql \&- notation specifies a range of ports (including boundaries). .Pp Service names (from .Pa /etc/services ) may be used instead of numeric port values. The length of the port list is limited to 30 ports or ranges, though one can specify larger ranges by using an .Em or-block in the .Cm options section of the rule. .Pp A backslash .Pq Ql \e can be used to escape the dash .Pq Ql - character in a service name (from a shell, the backslash must be typed twice to avoid the shell itself interpreting it as an escape character). .Pp .Dl "ipfw add count tcp from any ftp\e\e-data-ftp to any" .Pp Fragmented packets which have a non-zero offset (i.e., not the first fragment) will never match a rule which has one or more port specifications. See the .Cm frag option for details on matching fragmented packets. .El .Ss RULE OPTIONS (MATCH PATTERNS) Additional match patterns can be used within rules. Zero or more of these so-called .Em options can be present in a rule, optionally prefixed by the .Cm not operand, and possibly grouped into .Em or-blocks . .Pp The following match patterns can be used (listed in alphabetical order): .Bl -tag -width indent .It Cm // this is a comment . Inserts the specified text as a comment in the rule. Everything following // is considered as a comment and stored in the rule. You can have comment-only rules, which are listed as having a .Cm count action followed by the comment. .It Cm bridged Alias for .Cm layer2 . .It Cm defer-immediate-action | defer-action A rule with this option will not perform normal action upon a match. This option is intended to be used with .Cm record-state or .Cm keep-state as the dynamic rule, created but ignored on match, will work as intended. Rules with both .Cm record-state and .Cm defer-immediate-action create a dynamic rule and continue with the next rule without actually performing the action part of this rule. When the rule is later activated via the state table, the action is performed as usual. .It Cm diverted Matches only packets generated by a divert socket. .It Cm diverted-loopback Matches only packets coming from a divert socket back into the IP stack input for delivery. .It Cm diverted-output Matches only packets going from a divert socket back outward to the IP stack output for delivery. .It Cm dst-ip Ar ip-address Matches IPv4 packets whose destination IP is one of the address(es) specified as argument. .It Bro Cm dst-ip6 | dst-ipv6 Brc Ar ip6-address Matches IPv6 packets whose destination IP is one of the address(es) specified as argument. .It Cm dst-port Ar ports Matches IP packets whose destination port is one of the port(s) specified as argument. .It Cm established Matches TCP packets that have the RST or ACK bits set. .It Cm ext6hdr Ar header Matches IPv6 packets containing the extended header given by .Ar header . Supported headers are: .Pp Fragment, .Pq Cm frag , Hop-to-hop options .Pq Cm hopopt , any type of Routing Header .Pq Cm route , Source routing Routing Header Type 0 .Pq Cm rthdr0 , Mobile IPv6 Routing Header Type 2 .Pq Cm rthdr2 , Destination options .Pq Cm dstopt , IPSec authentication headers .Pq Cm ah , and IPsec encapsulated security payload headers .Pq Cm esp . .It Cm fib Ar fibnum Matches a packet that has been tagged to use the given FIB (routing table) number. .It Cm flow Ar table Ns Pq Ar name Ns Op , Ns Ar value Search for the flow entry in lookup table .Ar name . If not found, the match fails. Otherwise, the match succeeds and .Cm tablearg is set to the value extracted from the table. .Pp This option can be useful to quickly dispatch traffic based on certain packet fields. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .It Cm flow-id Ar labels Matches IPv6 packets containing any of the flow labels given in .Ar labels . .Ar labels is a comma separated list of numeric flow labels. .It Cm frag Ar spec Matches IPv4 packets whose .Cm ip_off field contains the comma separated list of IPv4 fragmentation options specified in .Ar spec . The recognized options are: .Cm df .Pq Dv don't fragment , .Cm mf .Pq Dv more fragments , .Cm rf .Pq Dv reserved fragment bit .Cm offset .Pq Dv non-zero fragment offset . The absence of a particular options may be denoted with a .Ql \&! . .Pp Empty list of options defaults to matching on non-zero fragment offset. Such rule would match all not the first fragment datagrams, both IPv4 and IPv6. This is a backward compatibility with older rulesets. .It Cm gid Ar group Matches all TCP or UDP packets sent by or received for a .Ar group . A .Ar group may be specified by name or number. .It Cm jail Ar jail Matches all TCP or UDP packets sent by or received for the jail whose ID or name is .Ar jail . .It Cm icmptypes Ar types Matches ICMP packets whose ICMP type is in the list .Ar types . The list may be specified as any combination of individual types (numeric) separated by commas. .Em Ranges are not allowed . The supported ICMP types are: .Pp echo reply .Pq Cm 0 , destination unreachable .Pq Cm 3 , source quench .Pq Cm 4 , redirect .Pq Cm 5 , echo request .Pq Cm 8 , router advertisement .Pq Cm 9 , router solicitation .Pq Cm 10 , time-to-live exceeded .Pq Cm 11 , IP header bad .Pq Cm 12 , timestamp request .Pq Cm 13 , timestamp reply .Pq Cm 14 , information request .Pq Cm 15 , information reply .Pq Cm 16 , address mask request .Pq Cm 17 and address mask reply .Pq Cm 18 . .It Cm icmp6types Ar types Matches ICMP6 packets whose ICMP6 type is in the list of .Ar types . The list may be specified as any combination of individual types (numeric) separated by commas. .Em Ranges are not allowed . .It Cm in | out Matches incoming or outgoing packets, respectively. .Cm in and .Cm out are mutually exclusive (in fact, .Cm out is implemented as .Cm not in Ns No ). .It Cm ipid Ar id-list Matches IPv4 packets whose .Cm ip_id field has value included in .Ar id-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm iplen Ar len-list Matches IP packets whose total length, including header and data, is in the set .Ar len-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm ipoptions Ar spec Matches packets whose IPv4 header contains the comma separated list of options specified in .Ar spec . The supported IP options are: .Pp .Cm ssrr (strict source route), .Cm lsrr (loose source route), .Cm rr (record packet route) and .Cm ts (timestamp). The absence of a particular option may be denoted with a .Ql \&! . .It Cm ipprecedence Ar precedence Matches IPv4 packets whose precedence field is equal to .Ar precedence . .It Cm ipsec Matches packets that have IPSEC history associated with them (i.e., the packet comes encapsulated in IPSEC, the kernel has IPSEC support, and can correctly decapsulate it). .Pp Note that specifying .Cm ipsec is different from specifying .Cm proto Ar ipsec as the latter will only look at the specific IP protocol field, irrespective of IPSEC kernel support and the validity of the IPSEC data. .Pp Further note that this flag is silently ignored in kernels without IPSEC support. It does not affect rule processing when given and the rules are handled as if with no .Cm ipsec flag. .It Cm iptos Ar spec Matches IPv4 packets whose .Cm tos field contains the comma separated list of service types specified in .Ar spec . The supported IP types of service are: .Pp .Cm lowdelay .Pq Dv IPTOS_LOWDELAY , .Cm throughput .Pq Dv IPTOS_THROUGHPUT , .Cm reliability .Pq Dv IPTOS_RELIABILITY , .Cm mincost .Pq Dv IPTOS_MINCOST , .Cm congestion .Pq Dv IPTOS_ECN_CE . The absence of a particular type may be denoted with a .Ql \&! . .It Cm dscp spec Ns Op , Ns Ar spec Matches IPv4/IPv6 packets whose .Cm DS field value is contained in .Ar spec mask. Multiple values can be specified via the comma separated list. Value can be one of keywords used in .Cm setdscp action or exact number. .It Cm ipttl Ar ttl-list Matches IPv4 packets whose time to live is included in .Ar ttl-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm ipversion Ar ver Matches IP packets whose IP version field is .Ar ver . .It Cm keep-state Op Ar :flowname Upon a match, the firewall will create a dynamic rule, whose default behaviour is to match bidirectional traffic between source and destination IP/port using the same protocol. The rule has a limited lifetime (controlled by a set of .Xr sysctl 8 variables), and the lifetime is refreshed every time a matching packet is found. The .Ar :flowname is used to assign additional to addresses, ports and protocol parameter to dynamic rule. It can be used for more accurate matching by .Cm check-state rule. The .Cm :default keyword is special name used for compatibility with old rulesets. .It Cm layer2 Matches only layer2 packets, i.e., those passed to .Nm from .Fn ether_demux and .Fn ether_output_frame . .It Cm limit Bro Cm src-addr | src-port | dst-addr | dst-port Brc Ar N Op Ar :flowname The firewall will only allow .Ar N connections with the same set of parameters as specified in the rule. One or more of source and destination addresses and ports can be specified. .It Cm lookup Bro Cm dst-ip | dst-port | src-ip | src-port | uid | jail Brc Ar name Search an entry in lookup table .Ar name that matches the field specified as argument. If not found, the match fails. Otherwise, the match succeeds and .Cm tablearg is set to the value extracted from the table. .Pp This option can be useful to quickly dispatch traffic based on certain packet fields. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .It Cm { MAC | mac } Ar dst-mac src-mac Match packets with a given .Ar dst-mac and .Ar src-mac addresses, specified as the .Cm any keyword (matching any MAC address), or six groups of hex digits separated by colons, and optionally followed by a mask indicating the significant bits. The mask may be specified using either of the following methods: .Bl -enum -width indent .It A slash .Pq / followed by the number of significant bits. For example, an address with 33 significant bits could be specified as: .Pp .Dl "MAC 10:20:30:40:50:60/33 any" .It An ampersand .Pq & followed by a bitmask specified as six groups of hex digits separated by colons. For example, an address in which the last 16 bits are significant could be specified as: .Pp .Dl "MAC 10:20:30:40:50:60&00:00:00:00:ff:ff any" .Pp Note that the ampersand character has a special meaning in many shells and should generally be escaped. .El Note that the order of MAC addresses (destination first, source second) is the same as on the wire, but the opposite of the one used for IP addresses. .It Cm mac-type Ar mac-type Matches packets whose Ethernet Type field corresponds to one of those specified as argument. .Ar mac-type is specified in the same way as .Cm port numbers (i.e., one or more comma-separated single values or ranges). You can use symbolic names for known values such as .Em vlan , ipv4, ipv6 . Values can be entered as decimal or hexadecimal (if prefixed by 0x), and they are always printed as hexadecimal (unless the .Cm -N option is used, in which case symbolic resolution will be attempted). .It Cm proto Ar protocol Matches packets with the corresponding IP protocol. .It Cm record-state Upon a match, the firewall will create a dynamic rule as if .Cm keep-state was specified. However, this option doesn't imply an implicit .Cm check-state in contrast to .Cm keep-state . .It Cm recv | xmit | via Brq Ar ifX | Ar if Ns Cm * | Ar table Ns Po Ar name Ns Oo , Ns Ar value Oc Pc | Ar ipno | Ar any Matches packets received, transmitted or going through, respectively, the interface specified by exact name .Po Ar ifX Pc , by device name .Po Ar if* Pc , by IP address, or through some interface. Table .Ar name may be used to match interface by its kernel ifindex. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .Pp The .Cm via keyword causes the interface to always be checked. If .Cm recv or .Cm xmit is used instead of .Cm via , then only the receive or transmit interface (respectively) is checked. By specifying both, it is possible to match packets based on both receive and transmit interface, e.g.: .Pp .Dl "ipfw add deny ip from any to any out recv ed0 xmit ed1" .Pp The .Cm recv interface can be tested on either incoming or outgoing packets, while the .Cm xmit interface can only be tested on outgoing packets. So .Cm out is required (and .Cm in is invalid) whenever .Cm xmit is used. .Pp A packet might not have a receive or transmit interface: packets originating from the local host have no receive interface, while packets destined for the local host have no transmit interface. .It Cm set-limit Bro Cm src-addr | src-port | dst-addr | dst-port Brc Ar N Works like .Cm limit but does not have an implicit .Cm check-state attached to it. .It Cm setup Matches TCP packets that have the SYN bit set but no ACK bit. This is the short form of .Dq Li tcpflags\ syn,!ack . .It Cm sockarg Matches packets that are associated to a local socket and for which the SO_USER_COOKIE socket option has been set to a non-zero value. As a side effect, the value of the option is made available as .Cm tablearg value, which in turn can be used as .Cm skipto or .Cm pipe number. .It Cm src-ip Ar ip-address Matches IPv4 packets whose source IP is one of the address(es) specified as an argument. .It Cm src-ip6 Ar ip6-address Matches IPv6 packets whose source IP is one of the address(es) specified as an argument. .It Cm src-port Ar ports Matches IP packets whose source port is one of the port(s) specified as argument. .It Cm tagged Ar tag-list Matches packets whose tags are included in .Ar tag-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . Tags can be applied to the packet using .Cm tag rule action parameter (see it's description for details on tags). .It Cm tcpack Ar ack TCP packets only. Match if the TCP header acknowledgment number field is set to .Ar ack . .It Cm tcpdatalen Ar tcpdatalen-list Matches TCP packets whose length of TCP data is .Ar tcpdatalen-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm tcpflags Ar spec TCP packets only. Match if the TCP header contains the comma separated list of flags specified in .Ar spec . The supported TCP flags are: .Pp .Cm fin , .Cm syn , .Cm rst , .Cm psh , .Cm ack and .Cm urg . The absence of a particular flag may be denoted with a .Ql \&! . A rule which contains a .Cm tcpflags specification can never match a fragmented packet which has a non-zero offset. See the .Cm frag option for details on matching fragmented packets. .It Cm tcpmss Ar tcpmss-list Matches TCP packets whose MSS (maximum segment size) value is set to .Ar tcpmss-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm tcpseq Ar seq TCP packets only. Match if the TCP header sequence number field is set to .Ar seq . .It Cm tcpwin Ar tcpwin-list Matches TCP packets whose header window field is set to .Ar tcpwin-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm tcpoptions Ar spec TCP packets only. Match if the TCP header contains the comma separated list of options specified in .Ar spec . The supported TCP options are: .Pp .Cm mss (maximum segment size), .Cm window (tcp window advertisement), .Cm sack (selective ack), .Cm ts (rfc1323 timestamp) and .Cm cc (rfc1644 t/tcp connection count). The absence of a particular option may be denoted with a .Ql \&! . .It Cm uid Ar user Match all TCP or UDP packets sent by or received for a .Ar user . A .Ar user may be matched by name or identification number. .It Cm verrevpath For incoming packets, a routing table lookup is done on the packet's source address. If the interface on which the packet entered the system matches the outgoing interface for the route, the packet matches. If the interfaces do not match up, the packet does not match. All outgoing packets or packets with no incoming interface match. .Pp The name and functionality of the option is intentionally similar to the Cisco IOS command: .Pp .Dl ip verify unicast reverse-path .Pp This option can be used to make anti-spoofing rules to reject all packets with source addresses not from this interface. See also the option .Cm antispoof . .It Cm versrcreach For incoming packets, a routing table lookup is done on the packet's source address. If a route to the source address exists, but not the default route or a blackhole/reject route, the packet matches. Otherwise, the packet does not match. All outgoing packets match. .Pp The name and functionality of the option is intentionally similar to the Cisco IOS command: .Pp .Dl ip verify unicast source reachable-via any .Pp This option can be used to make anti-spoofing rules to reject all packets whose source address is unreachable. .It Cm antispoof For incoming packets, the packet's source address is checked if it belongs to a directly connected network. If the network is directly connected, then the interface the packet came on in is compared to the interface the network is connected to. When incoming interface and directly connected interface are not the same, the packet does not match. Otherwise, the packet does match. All outgoing packets match. .Pp This option can be used to make anti-spoofing rules to reject all packets that pretend to be from a directly connected network but do not come in through that interface. This option is similar to but more restricted than .Cm verrevpath because it engages only on packets with source addresses of directly connected networks instead of all source addresses. .El .Sh LOOKUP TABLES Lookup tables are useful to handle large sparse sets of addresses or other search keys (e.g., ports, jail IDs, interface names). In the rest of this section we will use the term ``key''. Table name needs to match the following spec: .Ar table-name . Tables with the same name can be created in different .Ar sets . However, rule links to the tables in .Ar set 0 by default. This behavior can be controlled by .Va net.inet.ip.fw.tables_sets variable. See the .Sx SETS OF RULES section for more information. There may be up to 65535 different lookup tables. .Pp The following table types are supported: .Bl -tag -width indent .It Ar table-type : Ar addr | iface | number | flow .It Ar table-key : Ar addr Ns Oo / Ns Ar masklen Oc | iface-name | number | flow-spec .It Ar flow-spec : Ar flow-field Ns Op , Ns Ar flow-spec .It Ar flow-field : src-ip | proto | src-port | dst-ip | dst-port .It Cm addr Matches IPv4 or IPv6 address. Each entry is represented by an .Ar addr Ns Op / Ns Ar masklen and will match all addresses with base .Ar addr (specified as an IPv4/IPv6 address, or a hostname) and mask width of .Ar masklen bits. If .Ar masklen is not specified, it defaults to 32 for IPv4 and 128 for IPv6. When looking up an IP address in a table, the most specific entry will match. .It Cm iface Matches interface names. Each entry is represented by string treated as interface name. Wildcards are not supported. .It Cm number Matches protocol ports, uids/gids or jail IDs. Each entry is represented by 32-bit unsigned integer. Ranges are not supported. .It Cm flow Matches packet fields specified by .Ar flow type suboptions with table entries. .El .Pp Tables require explicit creation via .Cm create before use. .Pp The following creation options are supported: .Bl -tag -width indent .It Ar create-options : Ar create-option | create-options .It Ar create-option : Cm type Ar table-type | Cm valtype Ar value-mask | Cm algo Ar algo-desc | .Cm limit Ar number | Cm locked | Cm missing | Cm or-flush .It Cm type Table key type. .It Cm valtype Table value mask. .It Cm algo Table algorithm to use (see below). .It Cm limit Maximum number of items that may be inserted into table. .It Cm locked Restrict any table modifications. .It Cm missing Do not fail if table already exists and has exactly same options as new one. .It Cm or-flush Flush existing table with same name instead of returning error. Implies .Cm missing so existing table must be compatible with new one. .El .Pp Some of these options may be modified later via .Cm modify keyword. The following options can be changed: .Bl -tag -width indent .It Ar modify-options : Ar modify-option | modify-options .It Ar modify-option : Cm limit Ar number .It Cm limit Alter maximum number of items that may be inserted into table. .El .Pp Additionally, table can be locked or unlocked using .Cm lock or .Cm unlock commands. .Pp Tables of the same .Ar type can be swapped with each other using .Cm swap Ar name command. Swap may fail if tables limits are set and data exchange would result in limits hit. Operation is performed atomically. .Pp One or more entries can be added to a table at once using .Cm add command. Addition of all items are performed atomically. By default, error in addition of one entry does not influence addition of other entries. However, non-zero error code is returned in that case. Special .Cm atomic keyword may be specified before .Cm add to indicate all-or-none add request. .Pp One or more entries can be removed from a table at once using .Cm delete command. By default, error in removal of one entry does not influence removing of other entries. However, non-zero error code is returned in that case. .Pp It may be possible to check what entry will be found on particular .Ar table-key using .Cm lookup .Ar table-key command. This functionality is optional and may be unsupported in some algorithms. .Pp The following operations can be performed on .Ar one or .Cm all tables: .Bl -tag -width indent .It Cm list List all entries. .It Cm flush Removes all entries. .It Cm info Shows generic table information. .It Cm detail Shows generic table information and algo-specific data. .El .Pp The following lookup algorithms are supported: .Bl -tag -width indent .It Ar algo-desc : algo-name | "algo-name algo-data" .It Ar algo-name : Ar addr: radix | addr: hash | iface: array | number: array | flow: hash .It Cm addr: radix Separate Radix trees for IPv4 and IPv6, the same way as the routing table (see .Xr route 4 ) . Default choice for .Ar addr type. .It Cm addr:hash Separate auto-growing hashes for IPv4 and IPv6. Accepts entries with the same mask length specified initially via .Cm "addr:hash masks=/v4,/v6" algorithm creation options. Assume /32 and /128 masks by default. Search removes host bits (according to mask) from supplied address and checks resulting key in appropriate hash. Mostly optimized for /64 and byte-ranged IPv6 masks. .It Cm iface:array Array storing sorted indexes for entries which are presented in the system. Optimized for very fast lookup. .It Cm number:array Array storing sorted u32 numbers. .It Cm flow:hash Auto-growing hash storing flow entries. Search calculates hash on required packet fields and searches for matching entries in selected bucket. .El .Pp The .Cm tablearg feature provides the ability to use a value, looked up in the table, as the argument for a rule action, action parameter or rule option. This can significantly reduce number of rules in some configurations. If two tables are used in a rule, the result of the second (destination) is used. .Pp Each record may hold one or more values according to .Ar value-mask . This mask is set on table creation via .Cm valtype option. The following value types are supported: .Bl -tag -width indent .It Ar value-mask : Ar value-type Ns Op , Ns Ar value-mask .It Ar value-type : Ar skipto | pipe | fib | nat | dscp | tag | divert | .Ar netgraph | limit | ipv4 .It Cm skipto rule number to jump to. .It Cm pipe Pipe number to use. .It Cm fib fib number to match/set. .It Cm nat nat number to jump to. .It Cm dscp dscp value to match/set. .It Cm tag tag number to match/set. .It Cm divert port number to divert traffic to. .It Cm netgraph hook number to move packet to. .It Cm limit maximum number of connections. .It Cm ipv4 IPv4 nexthop to fwd packets to. .It Cm ipv6 IPv6 nexthop to fwd packets to. .El .Pp The .Cm tablearg argument can be used with the following actions: .Cm nat, pipe, queue, divert, tee, netgraph, ngtee, fwd, skipto, setfib , action parameters: .Cm tag, untag , rule options: .Cm limit, tagged . .Pp When used with the .Cm skipto action, the user should be aware that the code will walk the ruleset up to a rule equal to, or past, the given number. .Pp See the .Sx EXAMPLES Section for example usage of tables and the tablearg keyword. .Sh SETS OF RULES Each rule or table belongs to one of 32 different .Em sets , numbered 0 to 31. Set 31 is reserved for the default rule. .Pp By default, rules or tables are put in set 0, unless you use the .Cm set N attribute when adding a new rule or table. Sets can be individually and atomically enabled or disabled, so this mechanism permits an easy way to store multiple configurations of the firewall and quickly (and atomically) switch between them. .Pp By default, tables from set 0 are referenced when adding rule with table opcodes regardless of rule set. This behavior can be changed by setting .Va net.inet.ip.fw.tables_sets variable to 1. Rule's set will then be used for table references. .Pp The command to enable/disable sets is .Bd -ragged -offset indent .Nm .Cm set Oo Cm disable Ar number ... Oc Op Cm enable Ar number ... .Ed .Pp where multiple .Cm enable or .Cm disable sections can be specified. Command execution is atomic on all the sets specified in the command. By default, all sets are enabled. .Pp When you disable a set, its rules behave as if they do not exist in the firewall configuration, with only one exception: .Bd -ragged -offset indent dynamic rules created from a rule before it had been disabled will still be active until they expire. In order to delete dynamic rules you have to explicitly delete the parent rule which generated them. .Ed .Pp The set number of rules can be changed with the command .Bd -ragged -offset indent .Nm .Cm set move .Brq Cm rule Ar rule-number | old-set .Cm to Ar new-set .Ed .Pp Also, you can atomically swap two rulesets with the command .Bd -ragged -offset indent .Nm .Cm set swap Ar first-set second-set .Ed .Pp See the .Sx EXAMPLES Section on some possible uses of sets of rules. .Sh STATEFUL FIREWALL Stateful operation is a way for the firewall to dynamically create rules for specific flows when packets that match a given pattern are detected. Support for stateful operation comes through the .Cm check-state , keep-state , record-state , limit and .Cm set-limit options of .Nm rules . .Pp Dynamic rules are created when a packet matches a .Cm keep-state , .Cm record-state , .Cm limit or .Cm set-limit rule, causing the creation of a .Em dynamic rule which will match all and only packets with a given .Em protocol between a .Em src-ip/src-port dst-ip/dst-port pair of addresses .Em ( src and .Em dst are used here only to denote the initial match addresses, but they are completely equivalent afterwards). Rules created by .Cm keep-state option also have a .Ar :flowname taken from it. This name is used in matching together with addresses, ports and protocol. Dynamic rules will be checked at the first .Cm check-state, keep-state or .Cm limit occurrence, and the action performed upon a match will be the same as in the parent rule. .Pp Note that no additional attributes other than protocol and IP addresses and ports and :flowname are checked on dynamic rules. .Pp The typical use of dynamic rules is to keep a closed firewall configuration, but let the first TCP SYN packet from the inside network install a dynamic rule for the flow so that packets belonging to that session will be allowed through the firewall: .Pp .Dl "ipfw add check-state :OUTBOUND" .Dl "ipfw add allow tcp from my-subnet to any setup keep-state :OUTBOUND" .Dl "ipfw add deny tcp from any to any" .Pp A similar approach can be used for UDP, where an UDP packet coming from the inside will install a dynamic rule to let the response through the firewall: .Pp .Dl "ipfw add check-state :OUTBOUND" .Dl "ipfw add allow udp from my-subnet to any keep-state :OUTBOUND" .Dl "ipfw add deny udp from any to any" .Pp Dynamic rules expire after some time, which depends on the status of the flow and the setting of some .Cm sysctl variables. See Section .Sx SYSCTL VARIABLES for more details. For TCP sessions, dynamic rules can be instructed to periodically send keepalive packets to refresh the state of the rule when it is about to expire. .Pp See Section .Sx EXAMPLES for more examples on how to use dynamic rules. .Sh TRAFFIC SHAPER (DUMMYNET) CONFIGURATION .Nm is also the user interface for the .Nm dummynet traffic shaper, packet scheduler and network emulator, a subsystem that can artificially queue, delay or drop packets emulating the behaviour of certain network links or queueing systems. .Pp .Nm dummynet operates by first using the firewall to select packets using any match pattern that can be used in .Nm rules. Matching packets are then passed to either of two different objects, which implement the traffic regulation: .Bl -hang -offset XXXX .It Em pipe A .Em pipe emulates a .Em link with given bandwidth and propagation delay, driven by a FIFO scheduler and a single queue with programmable queue size and packet loss rate. Packets are appended to the queue as they come out from .Nm ipfw , and then transferred in FIFO order to the link at the desired rate. .It Em queue A .Em queue is an abstraction used to implement packet scheduling using one of several packet scheduling algorithms. Packets sent to a .Em queue are first grouped into flows according to a mask on the 5-tuple. Flows are then passed to the scheduler associated to the .Em queue , and each flow uses scheduling parameters (weight and others) as configured in the .Em queue itself. A scheduler in turn is connected to an emulated link, and arbitrates the link's bandwidth among backlogged flows according to weights and to the features of the scheduling algorithm in use. .El .Pp In practice, .Em pipes can be used to set hard limits to the bandwidth that a flow can use, whereas .Em queues can be used to determine how different flows share the available bandwidth. .Pp A graphical representation of the binding of queues, flows, schedulers and links is below. .Bd -literal -offset indent (flow_mask|sched_mask) sched_mask +---------+ weight Wx +-------------+ | |->-[flow]-->--| |-+ -->--| QUEUE x | ... | | | | |->-[flow]-->--| SCHEDuler N | | +---------+ | | | ... | +--[LINK N]-->-- +---------+ weight Wy | | +--[LINK N]-->-- | |->-[flow]-->--| | | -->--| QUEUE y | ... | | | | |->-[flow]-->--| | | +---------+ +-------------+ | +-------------+ .Ed It is important to understand the role of the SCHED_MASK and FLOW_MASK, which are configured through the commands .Dl "ipfw sched N config mask SCHED_MASK ..." and .Dl "ipfw queue X config mask FLOW_MASK ..." . .Pp The SCHED_MASK is used to assign flows to one or more scheduler instances, one for each value of the packet's 5-tuple after applying SCHED_MASK. As an example, using ``src-ip 0xffffff00'' creates one instance for each /24 destination subnet. .Pp The FLOW_MASK, together with the SCHED_MASK, is used to split packets into flows. As an example, using ``src-ip 0x000000ff'' together with the previous SCHED_MASK makes a flow for each individual source address. In turn, flows for each /24 subnet will be sent to the same scheduler instance. .Pp The above diagram holds even for the .Em pipe case, with the only restriction that a .Em pipe only supports a SCHED_MASK, and forces the use of a FIFO scheduler (these are for backward compatibility reasons; in fact, internally, a .Nm dummynet's pipe is implemented exactly as above). .Pp There are two modes of .Nm dummynet operation: .Dq normal and .Dq fast . The .Dq normal mode tries to emulate a real link: the .Nm dummynet scheduler ensures that the packet will not leave the pipe faster than it would on the real link with a given bandwidth. The .Dq fast mode allows certain packets to bypass the .Nm dummynet scheduler (if packet flow does not exceed pipe's bandwidth). This is the reason why the .Dq fast mode requires less CPU cycles per packet (on average) and packet latency can be significantly lower in comparison to a real link with the same bandwidth. The default mode is .Dq normal . The .Dq fast mode can be enabled by setting the .Va net.inet.ip.dummynet.io_fast .Xr sysctl 8 variable to a non-zero value. .Ss PIPE, QUEUE AND SCHEDULER CONFIGURATION The .Em pipe , .Em queue and .Em scheduler configuration commands are the following: .Bd -ragged -offset indent .Cm pipe Ar number Cm config Ar pipe-configuration .Pp .Cm queue Ar number Cm config Ar queue-configuration .Pp .Cm sched Ar number Cm config Ar sched-configuration .Ed .Pp The following parameters can be configured for a pipe: .Pp .Bl -tag -width indent -compact .It Cm bw Ar bandwidth | device Bandwidth, measured in .Sm off .Op Cm K | M | G .Brq Cm bit/s | Byte/s . .Sm on .Pp A value of 0 (default) means unlimited bandwidth. The unit must immediately follow the number, as in .Pp .Dl "ipfw pipe 1 config bw 300Kbit/s" .Pp If a device name is specified instead of a numeric value, as in .Pp .Dl "ipfw pipe 1 config bw tun0" .Pp then the transmit clock is supplied by the specified device. At the moment only the .Xr tun 4 device supports this functionality, for use in conjunction with .Xr ppp 8 . .Pp .It Cm delay Ar ms-delay Propagation delay, measured in milliseconds. The value is rounded to the next multiple of the clock tick (typically 10ms, but it is a good practice to run kernels with .Dq "options HZ=1000" to reduce the granularity to 1ms or less). The default value is 0, meaning no delay. .Pp .It Cm burst Ar size If the data to be sent exceeds the pipe's bandwidth limit (and the pipe was previously idle), up to .Ar size bytes of data are allowed to bypass the .Nm dummynet scheduler, and will be sent as fast as the physical link allows. Any additional data will be transmitted at the rate specified by the .Nm pipe bandwidth. The burst size depends on how long the pipe has been idle; the effective burst size is calculated as follows: MAX( .Ar size , .Nm bw * pipe_idle_time). .Pp .It Cm profile Ar filename A file specifying the additional overhead incurred in the transmission of a packet on the link. .Pp Some link types introduce extra delays in the transmission of a packet, e.g., because of MAC level framing, contention on the use of the channel, MAC level retransmissions and so on. From our point of view, the channel is effectively unavailable for this extra time, which is constant or variable depending on the link type. Additionally, packets may be dropped after this time (e.g., on a wireless link after too many retransmissions). We can model the additional delay with an empirical curve that represents its distribution. .Bd -literal -offset indent cumulative probability 1.0 ^ | L +-- loss-level x | ****** | * | ***** | * | ** | * +-------*-------------------> delay .Ed The empirical curve may have both vertical and horizontal lines. Vertical lines represent constant delay for a range of probabilities. Horizontal lines correspond to a discontinuity in the delay distribution: the pipe will use the largest delay for a given probability. .Pp The file format is the following, with whitespace acting as a separator and '#' indicating the beginning a comment: .Bl -tag -width indent .It Cm name Ar identifier optional name (listed by "ipfw pipe show") to identify the delay distribution; .It Cm bw Ar value the bandwidth used for the pipe. If not specified here, it must be present explicitly as a configuration parameter for the pipe; .It Cm loss-level Ar L the probability above which packets are lost. (0.0 <= L <= 1.0, default 1.0 i.e., no loss); .It Cm samples Ar N the number of samples used in the internal representation of the curve (2..1024; default 100); .It Cm "delay prob" | "prob delay" One of these two lines is mandatory and defines the format of the following lines with data points. .It Ar XXX Ar YYY 2 or more lines representing points in the curve, with either delay or probability first, according to the chosen format. The unit for delay is milliseconds. Data points do not need to be sorted. Also, the number of actual lines can be different from the value of the "samples" parameter: .Nm utility will sort and interpolate the curve as needed. .El .Pp Example of a profile file: .Bd -literal -offset indent name bla_bla_bla samples 100 loss-level 0.86 prob delay 0 200 # minimum overhead is 200ms 0.5 200 0.5 300 0.8 1000 0.9 1300 1 1300 #configuration file end .Ed .El .Pp The following parameters can be configured for a queue: .Pp .Bl -tag -width indent -compact .It Cm pipe Ar pipe_nr Connects a queue to the specified pipe. Multiple queues (with the same or different weights) can be connected to the same pipe, which specifies the aggregate rate for the set of queues. .Pp .It Cm weight Ar weight Specifies the weight to be used for flows matching this queue. The weight must be in the range 1..100, and defaults to 1. .El .Pp The following case-insensitive parameters can be configured for a scheduler: .Pp .Bl -tag -width indent -compact .It Cm type Ar {fifo | wf2q+ | rr | qfq | fq_codel | fq_pie} specifies the scheduling algorithm to use. .Bl -tag -width indent -compact .It Cm fifo is just a FIFO scheduler (which means that all packets are stored in the same queue as they arrive to the scheduler). FIFO has O(1) per-packet time complexity, with very low constants (estimate 60-80ns on a 2GHz desktop machine) but gives no service guarantees. .It Cm wf2q+ implements the WF2Q+ algorithm, which is a Weighted Fair Queueing algorithm which permits flows to share bandwidth according to their weights. Note that weights are not priorities; even a flow with a minuscule weight will never starve. WF2Q+ has O(log N) per-packet processing cost, where N is the number of flows, and is the default algorithm used by previous versions dummynet's queues. .It Cm rr implements the Deficit Round Robin algorithm, which has O(1) processing costs (roughly, 100-150ns per packet) and permits bandwidth allocation according to weights, but with poor service guarantees. .It Cm qfq implements the QFQ algorithm, which is a very fast variant of WF2Q+, with similar service guarantees and O(1) processing costs (roughly, 200-250ns per packet). .It Cm fq_codel implements the FQ-CoDel (FlowQueue-CoDel) scheduler/AQM algorithm, which uses a modified Deficit Round Robin scheduler to manage two lists of sub-queues (old sub-queues and new sub-queues) for providing brief periods of priority to lightweight or short burst flows. By default, the total number of sub-queues is 1024. FQ-CoDel's internal, dynamically created sub-queues are controlled by separate instances of CoDel AQM. .It Cm fq_pie implements the FQ-PIE (FlowQueue-PIE) scheduler/AQM algorithm, which similar to .Cm fq_codel but uses per sub-queue PIE AQM instance to control the queue delay. .El .Pp .Cm fq_codel inherits AQM parameters and options from .Cm codel (see below), and .Cm fq_pie inherits AQM parameters and options from .Cm pie (see below). Additionally, both of .Cm fq_codel and .Cm fq_pie have shared scheduler parameters which are: .Bl -tag -width indent .It Cm quantum .Ar m specifies the quantum (credit) of the scheduler. .Ar m is the number of bytes a queue can serve before being moved to the tail of old queues list. The default is 1514 bytes, and the maximum acceptable value is 9000 bytes. .It Cm limit .Ar m specifies the hard size limit (in unit of packets) of all queues managed by an instance of the scheduler. The default value of .Ar m is 10240 packets, and the maximum acceptable value is 20480 packets. .It Cm flows .Ar m specifies the total number of flow queues (sub-queues) that fq_* creates and manages. By default, 1024 sub-queues are created when an instance of the fq_{codel/pie} scheduler is created. The maximum acceptable value is 65536. .El .Pp Note that any token after .Cm fq_codel or .Cm fq_pie is considered a parameter for fq_{codel/pie}. So, ensure all scheduler configuration options not related to fq_{codel/pie} are written before .Cm fq_codel/fq_pie tokens. .El .Pp In addition to the type, all parameters allowed for a pipe can also be specified for a scheduler. .Pp Finally, the following parameters can be configured for both pipes and queues: .Pp .Bl -tag -width XXXX -compact .It Cm buckets Ar hash-table-size Specifies the size of the hash table used for storing the various queues. Default value is 64 controlled by the .Xr sysctl 8 variable .Va net.inet.ip.dummynet.hash_size , allowed range is 16 to 65536. .Pp .It Cm mask Ar mask-specifier Packets sent to a given pipe or queue by an .Nm rule can be further classified into multiple flows, each of which is then sent to a different .Em dynamic pipe or queue. A flow identifier is constructed by masking the IP addresses, ports and protocol types as specified with the .Cm mask options in the configuration of the pipe or queue. For each different flow identifier, a new pipe or queue is created with the same parameters as the original object, and matching packets are sent to it. .Pp Thus, when .Em dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when .Em dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe). .br Available mask specifiers are a combination of one or more of the following: .Pp .Cm dst-ip Ar mask , .Cm dst-ip6 Ar mask , .Cm src-ip Ar mask , .Cm src-ip6 Ar mask , .Cm dst-port Ar mask , .Cm src-port Ar mask , .Cm flow-id Ar mask , .Cm proto Ar mask or .Cm all , .Pp where the latter means all bits in all fields are significant. .Pp .It Cm noerror When a packet is dropped by a .Nm dummynet queue or pipe, the error is normally reported to the caller routine in the kernel, in the same way as it happens when a device queue fills up. Setting this option reports the packet as successfully delivered, which can be needed for some experimental setups where you want to simulate loss or congestion at a remote router. .Pp .It Cm plr Ar packet-loss-rate Packet loss rate. Argument .Ar packet-loss-rate is a floating-point number between 0 and 1, with 0 meaning no loss, 1 meaning 100% loss. The loss rate is internally represented on 31 bits. .Pp .It Cm queue Brq Ar slots | size Ns Cm Kbytes Queue size, in .Ar slots or .Cm KBytes . Default value is 50 slots, which is the typical queue size for Ethernet devices. Note that for slow speed links you should keep the queue size short or your traffic might be affected by a significant queueing delay. E.g., 50 max-sized Ethernet packets (1500 bytes) mean 600Kbit or 20s of queue on a 30Kbit/s pipe. Even worse effects can result if you get packets from an interface with a much larger MTU, e.g.\& the loopback interface with its 16KB packets. The .Xr sysctl 8 variables .Em net.inet.ip.dummynet.pipe_byte_limit and .Em net.inet.ip.dummynet.pipe_slot_limit control the maximum lengths that can be specified. .Pp .It Cm red | gred Ar w_q Ns / Ns Ar min_th Ns / Ns Ar max_th Ns / Ns Ar max_p [ecn] Make use of the RED (Random Early Detection) queue management algorithm. .Ar w_q and .Ar max_p are floating point numbers between 0 and 1 (inclusive), while .Ar min_th and .Ar max_th are integer numbers specifying thresholds for queue management (thresholds are computed in bytes if the queue has been defined in bytes, in slots otherwise). The two parameters can also be of the same value if needed. The .Nm dummynet also supports the gentle RED variant (gred) and ECN (Explicit Congestion Notification) as optional. Three .Xr sysctl 8 variables can be used to control the RED behaviour: .Bl -tag -width indent .It Va net.inet.ip.dummynet.red_lookup_depth specifies the accuracy in computing the average queue when the link is idle (defaults to 256, must be greater than zero) .It Va net.inet.ip.dummynet.red_avg_pkt_size specifies the expected average packet size (defaults to 512, must be greater than zero) .It Va net.inet.ip.dummynet.red_max_pkt_size specifies the expected maximum packet size, only used when queue thresholds are in bytes (defaults to 1500, must be greater than zero). .El .Pp .It Cm codel Oo Cm target Ar time Oc Oo Cm interval Ar time Oc Oo Cm ecn | .Cm noecn Oc Make use of the CoDel (Controlled-Delay) queue management algorithm. .Ar time is interpreted as milliseconds by default but seconds (s), milliseconds (ms) or microseconds (us) can be specified instead. CoDel drops or marks (ECN) packets depending on packet sojourn time in the queue. .Cm target .Ar time (5ms by default) is the minimum acceptable persistent queue delay that CoDel allows. CoDel does not drop packets directly after packets sojourn time becomes higher than .Cm target .Ar time but waits for .Cm interval .Ar time (100ms default) before dropping. .Cm interval .Ar time should be set to maximum RTT for all expected connections. .Cm ecn enables (disabled by default) packet marking (instead of dropping) for ECN-enabled TCP flows when queue delay becomes high. .Pp Note that any token after .Cm codel is considered a parameter for CoDel. So, ensure all pipe/queue configuration options are written before .Cm codel token. .Pp The .Xr sysctl 8 variables .Va net.inet.ip.dummynet.codel.target and .Va net.inet.ip.dummynet.codel.interval can be used to set CoDel default parameters. .Pp .It Cm pie Oo Cm target Ar time Oc Oo Cm tupdate Ar time Oc Oo .Cm alpha Ar n Oc Oo Cm beta Ar n Oc Oo Cm max_burst Ar time Oc Oo .Cm max_ecnth Ar n Oc Oo Cm ecn | Cm noecn Oc Oo Cm capdrop | .Cm nocapdrop Oc Oo Cm drand | Cm nodrand Oc Oo Cm onoff .Oc Oo Cm dre | Cm ts Oc Make use of the PIE (Proportional Integral controller Enhanced) queue management algorithm. PIE drops or marks packets depending on a calculated drop probability during en-queue process, with the aim of achieving high throughput while keeping queue delay low. At regular time intervals of .Cm tupdate .Ar time (15ms by default) a background process (re)calculates the probability based on queue delay deviations from .Cm target .Ar time (15ms by default) and queue delay trends. PIE approximates current queue delay by using a departure rate estimation method, or (optionally) by using a packet timestamp method similar to CoDel. .Ar time is interpreted as milliseconds by default but seconds (s), milliseconds (ms) or microseconds (us) can be specified instead. The other PIE parameters and options are as follows: .Bl -tag -width indent .It Cm alpha Ar n .Ar n is a floating point number between 0 and 7 which specifies the weight of queue delay deviations that is used in drop probability calculation. 0.125 is the default. .It Cm beta Ar n .Ar n is a floating point number between 0 and 7 which specifies is the weight of queue delay trend that is used in drop probability calculation. 1.25 is the default. .It Cm max_burst Ar time The maximum period of time that PIE does not drop/mark packets. 150ms is the default and 10s is the maximum value. .It Cm max_ecnth Ar n Even when ECN is enabled, PIE drops packets instead of marking them when drop probability becomes higher than ECN probability threshold .Cm max_ecnth Ar n , the default is 0.1 (i.e 10%) and 1 is the maximum value. .It Cm ecn | noecn enable or disable ECN marking for ECN-enabled TCP flows. Disabled by default. .It Cm capdrop | nocapdrop enable or disable cap drop adjustment. Cap drop adjustment is enabled by default. .It Cm drand | nodrand enable or disable drop probability de-randomisation. De-randomisation eliminates the problem of dropping packets too close or too far. De-randomisation is enabled by default. .It Cm onoff enable turning PIE on and off depending on queue load. If this option is enabled, PIE turns on when over 1/3 of queue becomes full. This option is disabled by default. .It Cm dre | ts Calculate queue delay using departure rate estimation .Cm dre or timestamps .Cm ts . .Cm dre is used by default. .El .Pp Note that any token after .Cm pie is considered a parameter for PIE. So ensure all pipe/queue the configuration options are written before .Cm pie token. .Xr sysctl 8 variables can be used to control the .Cm pie default parameters. See the .Sx SYSCTL VARIABLES section for more details. .El .Pp When used with IPv6 data, .Nm dummynet currently has several limitations. Information necessary to route link-local packets to an interface is not available after processing by .Nm dummynet so those packets are dropped in the output path. Care should be taken to ensure that link-local packets are not passed to .Nm dummynet . .Sh CHECKLIST Here are some important points to consider when designing your rules: .Bl -bullet .It Remember that you filter both packets going .Cm in and .Cm out . Most connections need packets going in both directions. .It Remember to test very carefully. It is a good idea to be near the console when doing this. If you cannot be near the console, use an auto-recovery script such as the one in .Pa /usr/share/examples/ipfw/change_rules.sh . .It Do not forget the loopback interface. .El .Sh FINE POINTS .Bl -bullet .It There are circumstances where fragmented datagrams are unconditionally dropped. TCP packets are dropped if they do not contain at least 20 bytes of TCP header, UDP packets are dropped if they do not contain a full 8 byte UDP header, and ICMP packets are dropped if they do not contain 4 bytes of ICMP header, enough to specify the ICMP type, code, and checksum. These packets are simply logged as .Dq pullup failed since there may not be enough good data in the packet to produce a meaningful log entry. .It Another type of packet is unconditionally dropped, a TCP packet with a fragment offset of one. This is a valid packet, but it only has one use, to try to circumvent firewalls. When logging is enabled, these packets are reported as being dropped by rule -1. .It If you are logged in over a network, loading the .Xr kld 4 version of .Nm is probably not as straightforward as you would think. The following command line is recommended: .Bd -literal -offset indent kldload ipfw && \e ipfw add 32000 allow ip from any to any .Ed .Pp Along the same lines, doing an .Bd -literal -offset indent ipfw flush .Ed .Pp in similar surroundings is also a bad idea. .It The .Nm filter list may not be modified if the system security level is set to 3 or higher (see .Xr init 8 for information on system security levels). .El .Sh PACKET DIVERSION A .Xr divert 4 socket bound to the specified port will receive all packets diverted to that port. If no socket is bound to the destination port, or if the divert module is not loaded, or if the kernel was not compiled with divert socket support, the packets are dropped. .Sh NETWORK ADDRESS TRANSLATION (NAT) .Nm support in-kernel NAT using the kernel version of .Xr libalias 3 . The kernel module .Cm ipfw_nat should be loaded or kernel should have .Cm options IPFIREWALL_NAT to be able use NAT. .Pp The nat configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nat .Ar nat_number .Cm config .Ar nat-configuration .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm ip Ar ip_address Define an ip address to use for aliasing. .It Cm if Ar nic Use ip address of NIC for aliasing, dynamically changing it if NIC's ip address changes. .It Cm log Enable logging on this nat instance. .It Cm deny_in Deny any incoming connection from outside world. .It Cm same_ports Try to leave the alias port numbers unchanged from the actual local port numbers. .It Cm unreg_only Traffic on the local network not originating from a RFC 1918 unregistered address spaces will be ignored. .It Cm unreg_cgn Like unreg_only, but includes the RFC 6598 (Carrier Grade NAT) address range. .It Cm reset Reset table of the packet aliasing engine on address change. .It Cm reverse Reverse the way libalias handles aliasing. .It Cm proxy_only Obey transparent proxy rules only, packet aliasing is not performed. .It Cm skip_global Skip instance in case of global state lookup (see below). +.It Cm port_range Ar lower-upper +Set the aliasing ports between the ranges given. Upper port has to be greater +than lower. .El .Pp Some specials value can be supplied instead of .Va nat_number : .Bl -tag -width indent .It Cm global Looks up translation state in all configured nat instances. If an entry is found, packet is aliased according to that entry. If no entry was found in any of the instances, packet is passed unchanged, and no new entry will be created. See section .Sx MULTIPLE INSTANCES in .Xr natd 8 for more information. .It Cm tablearg Uses argument supplied in lookup table. See .Sx LOOKUP TABLES section below for more information on lookup tables. .El .Pp To let the packet continue after being (de)aliased, set the sysctl variable .Va net.inet.ip.fw.one_pass to 0. For more information about aliasing modes, refer to .Xr libalias 3 . See Section .Sx EXAMPLES for some examples about nat usage. .Ss REDIRECT AND LSNAT SUPPORT IN IPFW Redirect and LSNAT support follow closely the syntax used in .Xr natd 8 . See Section .Sx EXAMPLES for some examples on how to do redirect and lsnat. .Ss SCTP NAT SUPPORT SCTP nat can be configured in a similar manner to TCP through the .Nm command line tool. The main difference is that .Nm sctp nat does not do port translation. Since the local and global side ports will be the same, there is no need to specify both. Ports are redirected as follows: .Bd -ragged -offset indent .Bk -words .Cm nat .Ar nat_number .Cm config if .Ar nic .Cm redirect_port sctp .Ar ip_address [,addr_list] {[port | port-port] [,ports]} .Ek .Ed .Pp Most .Nm sctp nat configuration can be done in real-time through the .Xr sysctl 8 interface. All may be changed dynamically, though the hash_table size will only change for new .Nm nat instances. See .Sx SYSCTL VARIABLES for more info. .Sh IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION .Ss Stateful translation .Nm supports in-kernel IPv6/IPv4 network address and protocol translation. Stateful NAT64 translation allows IPv6-only clients to contact IPv4 servers using unicast TCP, UDP or ICMP protocols. One or more IPv4 addresses assigned to a stateful NAT64 translator are shared among several IPv6-only clients. When stateful NAT64 is used in conjunction with DNS64, no changes are usually required in the IPv6 client or the IPv4 server. The kernel module .Cm ipfw_nat64 should be loaded or kernel should have .Cm options IPFIREWALL_NAT64 to be able use stateful NAT64 translator. .Pp Stateful NAT64 uses a bunch of memory for several types of objects. When IPv6 client initiates connection, NAT64 translator creates a host entry in the states table. Each host entry uses preallocated IPv4 alias entry. Each alias entry has a number of ports group entries allocated on demand. Ports group entries contains connection state entries. There are several options to control limits and lifetime for these objects. .Pp NAT64 translator follows RFC7915 when does ICMPv6/ICMP translation, unsupported message types will be silently dropped. IPv6 needs several ICMPv6 message types to be explicitly allowed for correct operation. Make sure that ND6 neighbor solicitation (ICMPv6 type 135) and neighbor advertisement (ICMPv6 type 136) messages will not be handled by translation rules. .Pp After translation NAT64 translator by default sends packets through corresponding netisr queue. Thus translator host should be configured as IPv4 and IPv6 router. Also this means, that a packet is handled by firewall twice. First time an original packet is handled and consumed by translator, and then it is handled again as translated packet. This behavior can be changed by sysctl variable .Va net.inet.ip.fw.nat64_direct_output . Also translated packet can be tagged using .Cm tag rule action, and then matched by .Cm tagged opcode to avoid loops and extra overhead. .Pp The stateful NAT64 configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nat64lsn .Ar name .Cm create .Ar create-options .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm prefix4 Ar ipv4_prefix/plen The IPv4 prefix with mask defines the pool of IPv4 addresses used as source address after translation. Stateful NAT64 module translates IPv6 source address of client to one IPv4 address from this pool. Note that incoming IPv4 packets that don't have corresponding state entry in the states table will be dropped by translator. Make sure that translation rules handle packets, destined to configured prefix. .It Cm prefix6 Ar ipv6_prefix/length The IPv6 prefix defines IPv4-embedded IPv6 addresses used by translator to represent IPv4 addresses. This IPv6 prefix should be configured in DNS64. The translator implementation follows RFC6052, that restricts the length of prefixes to one of following: 32, 40, 48, 56, 64, or 96. The Well-Known IPv6 Prefix 64:ff9b:: must be 96 bits long. The special .Ar ::/length prefix can be used to handle several IPv6 prefixes with one NAT64 instance. The NAT64 instance will determine a destination IPv4 address from prefix .Ar length . .It Cm states_chunks Ar number The number of states chunks in single ports group. Each ports group by default can keep 64 state entries in single chunk. The above value affects the maximum number of states that can be associated with single IPv4 alias address and port. The value must be power of 2, and up to 128. .It Cm host_del_age Ar seconds The number of seconds until the host entry for a IPv6 client will be deleted and all its resources will be released due to inactivity. Default value is .Ar 3600 . .It Cm pg_del_age Ar seconds The number of seconds until a ports group with unused state entries will be released. Default value is .Ar 900 . .It Cm tcp_syn_age Ar seconds The number of seconds while a state entry for TCP connection with only SYN sent will be kept. If TCP connection establishing will not be finished, state entry will be deleted. Default value is .Ar 10 . .It Cm tcp_est_age Ar seconds The number of seconds while a state entry for established TCP connection will be kept. Default value is .Ar 7200 . .It Cm tcp_close_age Ar seconds The number of seconds while a state entry for closed TCP connection will be kept. Keeping state entries for closed connections is needed, because IPv4 servers typically keep closed connections in a TIME_WAIT state for a several minutes. Since translator's IPv4 addresses are shared among all IPv6 clients, new connections from the same addresses and ports may be rejected by server, because these connections are still in a TIME_WAIT state. Keeping them in translator's state table protects from such rejects. Default value is .Ar 180 . .It Cm udp_age Ar seconds The number of seconds while translator keeps state entry in a waiting for reply to the sent UDP datagram. Default value is .Ar 120 . .It Cm icmp_age Ar seconds The number of seconds while translator keeps state entry in a waiting for reply to the sent ICMP message. Default value is .Ar 60 . .It Cm log Turn on logging of all handled packets via BPF through .Ar ipfwlog0 interface. .Ar ipfwlog0 is a pseudo interface and can be created after a boot manually with .Cm ifconfig command. Note that it has different purpose than .Ar ipfw0 interface. Translators sends to BPF an additional information with each packet. With .Cm tcpdump you are able to see each handled packet before and after translation. .It Cm -log Turn off logging of all handled packets via BPF. .It Cm allow_private Turn on processing private IPv4 addresses. By default IPv6 packets with destinations mapped to private address ranges defined by RFC1918 are not processed. .It Cm -allow_private Turn off private address handling in .Nm nat64 instance. .El .Pp To inspect a states table of stateful NAT64 the following command can be used: .Bd -ragged -offset indent .Bk -words .Cm nat64lsn .Ar name .Cm show Cm states .Ek .Ed .Pp Stateless NAT64 translator doesn't use a states table for translation and converts IPv4 addresses to IPv6 and vice versa solely based on the mappings taken from configured lookup tables. Since a states table doesn't used by stateless translator, it can be configured to pass IPv4 clients to IPv6-only servers. .Pp The stateless NAT64 configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nat64stl .Ar name .Cm create .Ar create-options .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm prefix6 Ar ipv6_prefix/length The IPv6 prefix defines IPv4-embedded IPv6 addresses used by translator to represent IPv4 addresses. This IPv6 prefix should be configured in DNS64. .It Cm table4 Ar table46 The lookup table .Ar table46 contains mapping how IPv4 addresses should be translated to IPv6 addresses. .It Cm table6 Ar table64 The lookup table .Ar table64 contains mapping how IPv6 addresses should be translated to IPv4 addresses. .It Cm log Turn on logging of all handled packets via BPF through .Ar ipfwlog0 interface. .It Cm -log Turn off logging of all handled packets via BPF. .It Cm allow_private Turn on processing private IPv4 addresses. By default IPv6 packets with destinations mapped to private address ranges defined by RFC1918 are not processed. .It Cm -allow_private Turn off private address handling in .Nm nat64 instance. .El .Pp Note that the behavior of stateless translator with respect to not matched packets differs from stateful translator. If corresponding addresses was not found in the lookup tables, the packet will not be dropped and the search continues. .Ss XLAT464 CLAT translation XLAT464 CLAT NAT64 translator implements client-side stateless translation as defined in RFC6877 and is very similar to statless NAT64 translator explained above. Instead of lookup tables it uses one-to-one mapping between IPv4 and IPv6 addresses using configured prefixes. This mode can be used as a replacement of DNS64 service for applications that are not using it (e.g. VoIP) allowing them to access IPv4-only Internet over IPv6-only networks with help of remote NAT64 translator. .Pp The CLAT NAT64 configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nat64clat .Ar name .Cm create .Ar create-options .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm clat_prefix Ar ipv6_prefix/length The IPv6 prefix defines IPv4-embedded IPv6 addresses used by translator to represent source IPv4 addresses. .It Cm plat_prefix Ar ipv6_prefix/length The IPv6 prefix defines IPv4-embedded IPv6 addresses used by translator to represent destination IPv4 addresses. This IPv6 prefix should be configured on a remote NAT64 translator. .It Cm log Turn on logging of all handled packets via BPF through .Ar ipfwlog0 interface. .It Cm -log Turn off logging of all handled packets via BPF. .It Cm allow_private Turn on processing private IPv4 addresses. By default .Nm nat64clat instance will not process IPv4 packets with destination address from private ranges as defined in RFC1918. .It Cm -allow_private Turn off private address handling in .Nm nat64clat instance. .El .Pp Note that the behavior of CLAT translator with respect to not matched packets differs from stateful translator. If corresponding addresses were not matched against prefixes configured, the packet will not be dropped and the search continues. .Sh IPv6-to-IPv6 NETWORK PREFIX TRANSLATION (NPTv6) .Nm supports in-kernel IPv6-to-IPv6 network prefix translation as described in RFC6296. The kernel module .Cm ipfw_nptv6 should be loaded or kernel should has .Cm options IPFIREWALL_NPTV6 to be able use NPTv6 translator. .Pp The NPTv6 configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nptv6 .Ar name .Cm create .Ar create-options .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm int_prefix Ar ipv6_prefix IPv6 prefix used in internal network. NPTv6 module translates source address when it matches this prefix. .It Cm ext_prefix Ar ipv6_prefix IPv6 prefix used in external network. NPTv6 module translates destination address when it matches this prefix. .It Cm ext_if Ar nic The NPTv6 module will use first global IPv6 address from interface .Ar nic as external prefix. It can be useful when IPv6 prefix of external network is dynamically obtained. .Cm ext_prefix and .Cm ext_if options are mutually exclusive. .It Cm prefixlen Ar length The length of specified IPv6 prefixes. It must be in range from 8 to 64. .El .Pp Note that the prefix translation rules are silently ignored when IPv6 packet forwarding is disabled. To enable the packet forwarding, set the sysctl variable .Va net.inet6.ip6.forwarding to 1. .Pp To let the packet continue after being translated, set the sysctl variable .Va net.inet.ip.fw.one_pass to 0. .Sh LOADER TUNABLES Tunables can be set in .Xr loader 8 prompt, .Xr loader.conf 5 or .Xr kenv 1 before ipfw module gets loaded. .Bl -tag -width indent .It Va net.inet.ip.fw.default_to_accept : No 0 Defines ipfw last rule behavior. This value overrides .Cd "options IPFW_DEFAULT_TO_(ACCEPT|DENY)" from kernel configuration file. .It Va net.inet.ip.fw.tables_max : No 128 Defines number of tables available in ipfw. Number cannot exceed 65534. .El .Sh SYSCTL VARIABLES A set of .Xr sysctl 8 variables controls the behaviour of the firewall and associated modules .Pq Nm dummynet , bridge , sctp nat . These are shown below together with their default value (but always check with the .Xr sysctl 8 command what value is actually in use) and meaning: .Bl -tag -width indent .It Va net.inet.ip.alias.sctp.accept_global_ootb_addip : No 0 Defines how the .Nm nat responds to receipt of global OOTB ASCONF-AddIP: .Bl -tag -width indent .It Cm 0 No response (unless a partially matching association exists - ports and vtags match but global address does not) .It Cm 1 .Nm nat will accept and process all OOTB global AddIP messages. .El .Pp Option 1 should never be selected as this forms a security risk. An attacker can establish multiple fake associations by sending AddIP messages. .It Va net.inet.ip.alias.sctp.chunk_proc_limit : No 5 Defines the maximum number of chunks in an SCTP packet that will be parsed for a packet that matches an existing association. This value is enforced to be greater or equal than .Cm net.inet.ip.alias.sctp.initialising_chunk_proc_limit . A high value is a DoS risk yet setting too low a value may result in important control chunks in the packet not being located and parsed. .It Va net.inet.ip.alias.sctp.error_on_ootb : No 1 Defines when the .Nm nat responds to any Out-of-the-Blue (OOTB) packets with ErrorM packets. An OOTB packet is a packet that arrives with no existing association registered in the .Nm nat and is not an INIT or ASCONF-AddIP packet: .Bl -tag -width indent .It Cm 0 ErrorM is never sent in response to OOTB packets. .It Cm 1 ErrorM is only sent to OOTB packets received on the local side. .It Cm 2 ErrorM is sent to the local side and on the global side ONLY if there is a partial match (ports and vtags match but the source global IP does not). This value is only useful if the .Nm nat is tracking global IP addresses. .It Cm 3 ErrorM is sent in response to all OOTB packets on both the local and global side (DoS risk). .El .Pp At the moment the default is 0, since the ErrorM packet is not yet supported by most SCTP stacks. When it is supported, and if not tracking global addresses, we recommend setting this value to 1 to allow multi-homed local hosts to function with the .Nm nat . To track global addresses, we recommend setting this value to 2 to allow global hosts to be informed when they need to (re)send an ASCONF-AddIP. Value 3 should never be chosen (except for debugging) as the .Nm nat will respond to all OOTB global packets (a DoS risk). .It Va net.inet.ip.alias.sctp.hashtable_size : No 2003 Size of hash tables used for .Nm nat lookups (100 < prime_number > 1000001). This value sets the .Nm hash table size for any future created .Nm nat instance and therefore must be set prior to creating a .Nm nat instance. The table sizes may be changed to suit specific needs. If there will be few concurrent associations, and memory is scarce, you may make these smaller. If there will be many thousands (or millions) of concurrent associations, you should make these larger. A prime number is best for the table size. The sysctl update function will adjust your input value to the next highest prime number. .It Va net.inet.ip.alias.sctp.holddown_time : No 0 Hold association in table for this many seconds after receiving a SHUTDOWN-COMPLETE. This allows endpoints to correct shutdown gracefully if a shutdown_complete is lost and retransmissions are required. .It Va net.inet.ip.alias.sctp.init_timer : No 15 Timeout value while waiting for (INIT-ACK|AddIP-ACK). This value cannot be 0. .It Va net.inet.ip.alias.sctp.initialising_chunk_proc_limit : No 2 Defines the maximum number of chunks in an SCTP packet that will be parsed when no existing association exists that matches that packet. Ideally this packet will only be an INIT or ASCONF-AddIP packet. A higher value may become a DoS risk as malformed packets can consume processing resources. .It Va net.inet.ip.alias.sctp.param_proc_limit : No 25 Defines the maximum number of parameters within a chunk that will be parsed in a packet. As for other similar sysctl variables, larger values pose a DoS risk. .It Va net.inet.ip.alias.sctp.log_level : No 0 Level of detail in the system log messages (0 \- minimal, 1 \- event, 2 \- info, 3 \- detail, 4 \- debug, 5 \- max debug). May be a good option in high loss environments. .It Va net.inet.ip.alias.sctp.shutdown_time : No 15 Timeout value while waiting for SHUTDOWN-COMPLETE. This value cannot be 0. .It Va net.inet.ip.alias.sctp.track_global_addresses : No 0 Enables/disables global IP address tracking within the .Nm nat and places an upper limit on the number of addresses tracked for each association: .Bl -tag -width indent .It Cm 0 Global tracking is disabled .It Cm >1 Enables tracking, the maximum number of addresses tracked for each association is limited to this value .El .Pp This variable is fully dynamic, the new value will be adopted for all newly arriving associations, existing associations are treated as they were previously. Global tracking will decrease the number of collisions within the .Nm nat at a cost of increased processing load, memory usage, complexity, and possible .Nm nat state problems in complex networks with multiple .Nm nats . We recommend not tracking global IP addresses, this will still result in a fully functional .Nm nat . .It Va net.inet.ip.alias.sctp.up_timer : No 300 Timeout value to keep an association up with no traffic. This value cannot be 0. .It Va net.inet.ip.dummynet.codel.interval : No 100000 Default .Cm codel AQM interval in microseconds. The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.codel.target : No 5000 Default .Cm codel AQM target delay time in microseconds (the minimum acceptable persistent queue delay). The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.expire : No 1 Lazily delete dynamic pipes/queue once they have no pending traffic. You can disable this by setting the variable to 0, in which case the pipes/queues will only be deleted when the threshold is reached. .It Va net.inet.ip.dummynet.fqcodel.flows : No 1024 Defines the default total number of flow queues (sub-queues) that .Cm fq_codel creates and manages. The value must be in the range 1..65536. .It Va net.inet.ip.dummynet.fqcodel.interval : No 100000 Default .Cm fq_codel scheduler/AQM interval in microseconds. The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.fqcodel.limit : No 10240 The default hard size limit (in unit of packet) of all queues managed by an instance of the .Cm fq_codel scheduler. The value must be in the range 1..20480. .It Va net.inet.ip.dummynet.fqcodel.quantum : No 1514 The default quantum (credit) of the .Cm fq_codel in unit of byte. The value must be in the range 1..9000. .It Va net.inet.ip.dummynet.fqcodel.target : No 5000 Default .Cm fq_codel scheduler/AQM target delay time in microseconds (the minimum acceptable persistent queue delay). The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.fqpie.alpha : No 125 The default .Ar alpha parameter (scaled by 1000) for .Cm fq_pie scheduler/AQM. The value must be in the range 1..7000. .It Va net.inet.ip.dummynet.fqpie.beta : No 1250 The default .Ar beta parameter (scaled by 1000) for .Cm fq_pie scheduler/AQM. The value must be in the range 1..7000. .It Va net.inet.ip.dummynet.fqpie.flows : No 1024 Defines the default total number of flow queues (sub-queues) that .Cm fq_pie creates and manages. The value must be in the range 1..65536. .It Va net.inet.ip.dummynet.fqpie.limit : No 10240 The default hard size limit (in unit of packet) of all queues managed by an instance of the .Cm fq_pie scheduler. The value must be in the range 1..20480. .It Va net.inet.ip.dummynet.fqpie.max_burst : No 150000 The default maximum period of microseconds that .Cm fq_pie scheduler/AQM does not drop/mark packets. The value must be in the range 1..10000000. .It Va net.inet.ip.dummynet.fqpie.max_ecnth : No 99 The default maximum ECN probability threshold (scaled by 1000) for .Cm fq_pie scheduler/AQM. The value must be in the range 1..7000. .It Va net.inet.ip.dummynet.fqpie.quantum : No 1514 The default quantum (credit) of the .Cm fq_pie in unit of byte. The value must be in the range 1..9000. .It Va net.inet.ip.dummynet.fqpie.target : No 15000 The default .Cm target delay of the .Cm fq_pie in unit of microsecond. The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.fqpie.tupdate : No 15000 The default .Cm tupdate of the .Cm fq_pie in unit of microsecond. The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.hash_size : No 64 Default size of the hash table used for dynamic pipes/queues. This value is used when no .Cm buckets option is specified when configuring a pipe/queue. .It Va net.inet.ip.dummynet.io_fast : No 0 If set to a non-zero value, the .Dq fast mode of .Nm dummynet operation (see above) is enabled. .It Va net.inet.ip.dummynet.io_pkt Number of packets passed to .Nm dummynet . .It Va net.inet.ip.dummynet.io_pkt_drop Number of packets dropped by .Nm dummynet . .It Va net.inet.ip.dummynet.io_pkt_fast Number of packets bypassed by the .Nm dummynet scheduler. .It Va net.inet.ip.dummynet.max_chain_len : No 16 Target value for the maximum number of pipes/queues in a hash bucket. The product .Cm max_chain_len*hash_size is used to determine the threshold over which empty pipes/queues will be expired even when .Cm net.inet.ip.dummynet.expire=0 . .It Va net.inet.ip.dummynet.red_lookup_depth : No 256 .It Va net.inet.ip.dummynet.red_avg_pkt_size : No 512 .It Va net.inet.ip.dummynet.red_max_pkt_size : No 1500 Parameters used in the computations of the drop probability for the RED algorithm. .It Va net.inet.ip.dummynet.pie.alpha : No 125 The default .Ar alpha parameter (scaled by 1000) for .Cm pie AQM. The value must be in the range 1..7000. .It Va net.inet.ip.dummynet.pie.beta : No 1250 The default .Ar beta parameter (scaled by 1000) for .Cm pie AQM. The value must be in the range 1..7000. .It Va net.inet.ip.dummynet.pie.max_burst : No 150000 The default maximum period of microseconds that .Cm pie AQM does not drop/mark packets. The value must be in the range 1..10000000. .It Va net.inet.ip.dummynet.pie.max_ecnth : No 99 The default maximum ECN probability threshold (scaled by 1000) for .Cm pie AQM. The value must be in the range 1..7000. .It Va net.inet.ip.dummynet.pie.target : No 15000 The default .Cm target delay of .Cm pie AQM in unit of microsecond. The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.pie.tupdate : No 15000 The default .Cm tupdate of .Cm pie AQM in unit of microsecond. The value must be in the range 1..5000000. .It Va net.inet.ip.dummynet.pipe_byte_limit : No 1048576 .It Va net.inet.ip.dummynet.pipe_slot_limit : No 100 The maximum queue size that can be specified in bytes or packets. These limits prevent accidental exhaustion of resources such as mbufs. If you raise these limits, you should make sure the system is configured so that sufficient resources are available. .It Va net.inet.ip.fw.autoinc_step : No 100 Delta between rule numbers when auto-generating them. The value must be in the range 1..1000. .It Va net.inet.ip.fw.curr_dyn_buckets : Va net.inet.ip.fw.dyn_buckets The current number of buckets in the hash table for dynamic rules (readonly). .It Va net.inet.ip.fw.debug : No 1 Controls debugging messages produced by .Nm . .It Va net.inet.ip.fw.default_rule : No 65535 The default rule number (read-only). By the design of .Nm , the default rule is the last one, so its number can also serve as the highest number allowed for a rule. .It Va net.inet.ip.fw.dyn_buckets : No 256 The number of buckets in the hash table for dynamic rules. Must be a power of 2, up to 65536. It only takes effect when all dynamic rules have expired, so you are advised to use a .Cm flush command to make sure that the hash table is resized. .It Va net.inet.ip.fw.dyn_count : No 3 Current number of dynamic rules (read-only). .It Va net.inet.ip.fw.dyn_keepalive : No 1 Enables generation of keepalive packets for .Cm keep-state rules on TCP sessions. A keepalive is generated to both sides of the connection every 5 seconds for the last 20 seconds of the lifetime of the rule. .It Va net.inet.ip.fw.dyn_max : No 8192 Maximum number of dynamic rules. When you hit this limit, no more dynamic rules can be installed until old ones expire. .It Va net.inet.ip.fw.dyn_ack_lifetime : No 300 .It Va net.inet.ip.fw.dyn_syn_lifetime : No 20 .It Va net.inet.ip.fw.dyn_fin_lifetime : No 1 .It Va net.inet.ip.fw.dyn_rst_lifetime : No 1 .It Va net.inet.ip.fw.dyn_udp_lifetime : No 5 .It Va net.inet.ip.fw.dyn_short_lifetime : No 30 These variables control the lifetime, in seconds, of dynamic rules. Upon the initial SYN exchange the lifetime is kept short, then increased after both SYN have been seen, then decreased again during the final FIN exchange or when a RST is received. Both .Em dyn_fin_lifetime and .Em dyn_rst_lifetime must be strictly lower than 5 seconds, the period of repetition of keepalives. The firewall enforces that. .It Va net.inet.ip.fw.dyn_keep_states : No 0 Keep dynamic states on rule/set deletion. States are relinked to default rule (65535). This can be handly for ruleset reload. Turned off by default. .It Va net.inet.ip.fw.enable : No 1 Enables the firewall. Setting this variable to 0 lets you run your machine without firewall even if compiled in. .It Va net.inet6.ip6.fw.enable : No 1 provides the same functionality as above for the IPv6 case. .It Va net.inet.ip.fw.one_pass : No 1 When set, the packet exiting from the .Nm dummynet pipe or from .Xr ng_ipfw 4 node is not passed though the firewall again. Otherwise, after an action, the packet is reinjected into the firewall at the next rule. .It Va net.inet.ip.fw.tables_max : No 128 Maximum number of tables. .It Va net.inet.ip.fw.verbose : No 1 Enables verbose messages. .It Va net.inet.ip.fw.verbose_limit : No 0 Limits the number of messages produced by a verbose firewall. .It Va net.inet6.ip6.fw.deny_unknown_exthdrs : No 1 If enabled packets with unknown IPv6 Extension Headers will be denied. .It Va net.link.ether.ipfw : No 0 Controls whether layer-2 packets are passed to .Nm . Default is no. .It Va net.link.bridge.ipfw : No 0 Controls whether bridged packets are passed to .Nm . Default is no. .It Va net.inet.ip.fw.nat64_debug : No 0 Controls debugging messages produced by .Nm ipfw_nat64 module. .It Va net.inet.ip.fw.nat64_direct_output : No 0 Controls the output method used by .Nm ipfw_nat64 module: .Bl -tag -width indent .It Cm 0 A packet is handled by .Nm ipfw twice. First time an original packet is handled by .Nm ipfw and consumed by .Nm ipfw_nat64 translator. Then translated packet is queued via netisr to input processing again. .It Cm 1 A packet is handled by .Nm ipfw only once, and after translation it will be pushed directly to outgoing interface. .El .El .Sh INTERNAL DIAGNOSTICS There are some commands that may be useful to understand current state of certain subsystems inside kernel module. These commands provide debugging output which may change without notice. .Pp Currently the following commands are available as .Cm internal sub-options: .Bl -tag -width indent .It Cm iflist Lists all interface which are currently tracked by .Nm with their in-kernel status. .It Cm talist List all table lookup algorithms currently available. .El .Sh EXAMPLES There are far too many possible uses of .Nm so this Section will only give a small set of examples. .Ss BASIC PACKET FILTERING This command adds an entry which denies all tcp packets from .Em cracker.evil.org to the telnet port of .Em wolf.tambov.su from being forwarded by the host: .Pp .Dl "ipfw add deny tcp from cracker.evil.org to wolf.tambov.su telnet" .Pp This one disallows any connection from the entire cracker's network to my host: .Pp .Dl "ipfw add deny ip from 123.45.67.0/24 to my.host.org" .Pp A first and efficient way to limit access (not using dynamic rules) is the use of the following rules: .Pp .Dl "ipfw add allow tcp from any to any established" .Dl "ipfw add allow tcp from net1 portlist1 to net2 portlist2 setup" .Dl "ipfw add allow tcp from net3 portlist3 to net3 portlist3 setup" .Dl "..." .Dl "ipfw add deny tcp from any to any" .Pp The first rule will be a quick match for normal TCP packets, but it will not match the initial SYN packet, which will be matched by the .Cm setup rules only for selected source/destination pairs. All other SYN packets will be rejected by the final .Cm deny rule. .Pp If you administer one or more subnets, you can take advantage of the address sets and or-blocks and write extremely compact rulesets which selectively enable services to blocks of clients, as below: .Pp .Dl "goodguys=\*q{ 10.1.2.0/24{20,35,66,18} or 10.2.3.0/28{6,3,11} }\*q" .Dl "badguys=\*q10.1.2.0/24{8,38,60}\*q" .Dl "" .Dl "ipfw add allow ip from ${goodguys} to any" .Dl "ipfw add deny ip from ${badguys} to any" .Dl "... normal policies ..." .Pp The .Cm verrevpath option could be used to do automated anti-spoofing by adding the following to the top of a ruleset: .Pp .Dl "ipfw add deny ip from any to any not verrevpath in" .Pp This rule drops all incoming packets that appear to be coming to the system on the wrong interface. For example, a packet with a source address belonging to a host on a protected internal network would be dropped if it tried to enter the system from an external interface. .Pp The .Cm antispoof option could be used to do similar but more restricted anti-spoofing by adding the following to the top of a ruleset: .Pp .Dl "ipfw add deny ip from any to any not antispoof in" .Pp This rule drops all incoming packets that appear to be coming from another directly connected system but on the wrong interface. For example, a packet with a source address of .Li 192.168.0.0/24 , configured on .Li fxp0 , but coming in on .Li fxp1 would be dropped. .Pp The .Cm setdscp option could be used to (re)mark user traffic, by adding the following to the appropriate place in ruleset: .Pp .Dl "ipfw add setdscp be ip from any to any dscp af11,af21" .Ss SELECTIVE MIRRORING If your network has network traffic analyzer connected to your host directly via dedicated interface or remotely via RSPAN vlan, you can selectively mirror some Ethernet layer2 frames to the analyzer. .Pp First, make sure your firewall is already configured and runs. Then, enable layer2 processing if not already enabled: .Pp .Dl "sysctl net.link.ether.ipfw=1" .Pp Next, load needed additional kernel modules: .Pp .Dl "kldload ng_ether ng_ipfw" .Pp Optionally, make system load these modules automatically at startup: .Pp .Dl sysrc kld_list+="ng_ether ng_ipfw" .Pp Next, configure .Xr ng_ipfw 4 kernel module to transmit mirrored copies of layer2 frames out via vlan900 interface: .Pp .Dl "ngctl connect ipfw: vlan900: 1 lower" .Pp Think of "1" here as of "mirroring instance index" and vlan900 is its destination. You can have arbitrary number of instances. Refer to .Xr ng_ipfw 4 for details. .Pp At last, actually start mirroring of selected frames using "instance 1". For frames incoming from em0 interface: .Pp .Dl "ipfw add ngtee 1 ip from any to 192.168.0.1 layer2 in recv em0" .Pp For frames outgoing to em0 interface: .Pp .Dl "ipfw add ngtee 1 ip from any to 192.168.0.1 layer2 out xmit em0" .Pp For both incoming and outgoing frames while flowing through em0: .Pp .Dl "ipfw add ngtee 1 ip from any to 192.168.0.1 layer2 via em0" .Pp Make sure you do not perform mirroring for already duplicated frames or kernel may hang as there is no safety net. .Ss DYNAMIC RULES In order to protect a site from flood attacks involving fake TCP packets, it is safer to use dynamic rules: .Pp .Dl "ipfw add check-state" .Dl "ipfw add deny tcp from any to any established" .Dl "ipfw add allow tcp from my-net to any setup keep-state" .Pp This will let the firewall install dynamic rules only for those connection which start with a regular SYN packet coming from the inside of our network. Dynamic rules are checked when encountering the first occurrence of a .Cm check-state , .Cm keep-state or .Cm limit rule. A .Cm check-state rule should usually be placed near the beginning of the ruleset to minimize the amount of work scanning the ruleset. Your mileage may vary. .Pp For more complex scenarios with dynamic rules .Cm record-state and .Cm defer-action can be used to precisely control creation and checking of dynamic rules. Example of usage of these options are provided in .Sx NETWORK ADDRESS TRANSLATION (NAT) Section. .Pp To limit the number of connections a user can open you can use the following type of rules: .Pp .Dl "ipfw add allow tcp from my-net/24 to any setup limit src-addr 10" .Dl "ipfw add allow tcp from any to me setup limit src-addr 4" .Pp The former (assuming it runs on a gateway) will allow each host on a /24 network to open at most 10 TCP connections. The latter can be placed on a server to make sure that a single client does not use more than 4 simultaneous connections. .Pp .Em BEWARE : stateful rules can be subject to denial-of-service attacks by a SYN-flood which opens a huge number of dynamic rules. The effects of such attacks can be partially limited by acting on a set of .Xr sysctl 8 variables which control the operation of the firewall. .Pp Here is a good usage of the .Cm list command to see accounting records and timestamp information: .Pp .Dl ipfw -at list .Pp or in short form without timestamps: .Pp .Dl ipfw -a list .Pp which is equivalent to: .Pp .Dl ipfw show .Pp Next rule diverts all incoming packets from 192.168.2.0/24 to divert port 5000: .Pp .Dl ipfw divert 5000 ip from 192.168.2.0/24 to any in .Ss TRAFFIC SHAPING The following rules show some of the applications of .Nm and .Nm dummynet for simulations and the like. .Pp This rule drops random incoming packets with a probability of 5%: .Pp .Dl "ipfw add prob 0.05 deny ip from any to any in" .Pp A similar effect can be achieved making use of .Nm dummynet pipes: .Pp .Dl "ipfw add pipe 10 ip from any to any" .Dl "ipfw pipe 10 config plr 0.05" .Pp We can use pipes to artificially limit bandwidth, e.g.\& on a machine acting as a router, if we want to limit traffic from local clients on 192.168.2.0/24 we do: .Pp .Dl "ipfw add pipe 1 ip from 192.168.2.0/24 to any out" .Dl "ipfw pipe 1 config bw 300Kbit/s queue 50KBytes" .Pp note that we use the .Cm out modifier so that the rule is not used twice. Remember in fact that .Nm rules are checked both on incoming and outgoing packets. .Pp Should we want to simulate a bidirectional link with bandwidth limitations, the correct way is the following: .Pp .Dl "ipfw add pipe 1 ip from any to any out" .Dl "ipfw add pipe 2 ip from any to any in" .Dl "ipfw pipe 1 config bw 64Kbit/s queue 10Kbytes" .Dl "ipfw pipe 2 config bw 64Kbit/s queue 10Kbytes" .Pp The above can be very useful, e.g.\& if you want to see how your fancy Web page will look for a residential user who is connected only through a slow link. You should not use only one pipe for both directions, unless you want to simulate a half-duplex medium (e.g.\& AppleTalk, Ethernet, IRDA). It is not necessary that both pipes have the same configuration, so we can also simulate asymmetric links. .Pp Should we want to verify network performance with the RED queue management algorithm: .Pp .Dl "ipfw add pipe 1 ip from any to any" .Dl "ipfw pipe 1 config bw 500Kbit/s queue 100 red 0.002/30/80/0.1" .Pp Another typical application of the traffic shaper is to introduce some delay in the communication. This can significantly affect applications which do a lot of Remote Procedure Calls, and where the round-trip-time of the connection often becomes a limiting factor much more than bandwidth: .Pp .Dl "ipfw add pipe 1 ip from any to any out" .Dl "ipfw add pipe 2 ip from any to any in" .Dl "ipfw pipe 1 config delay 250ms bw 1Mbit/s" .Dl "ipfw pipe 2 config delay 250ms bw 1Mbit/s" .Pp Per-flow queueing can be useful for a variety of purposes. A very simple one is counting traffic: .Pp .Dl "ipfw add pipe 1 tcp from any to any" .Dl "ipfw add pipe 1 udp from any to any" .Dl "ipfw add pipe 1 ip from any to any" .Dl "ipfw pipe 1 config mask all" .Pp The above set of rules will create queues (and collect statistics) for all traffic. Because the pipes have no limitations, the only effect is collecting statistics. Note that we need 3 rules, not just the last one, because when .Nm tries to match IP packets it will not consider ports, so we would not see connections on separate ports as different ones. .Pp A more sophisticated example is limiting the outbound traffic on a net with per-host limits, rather than per-network limits: .Pp .Dl "ipfw add pipe 1 ip from 192.168.2.0/24 to any out" .Dl "ipfw add pipe 2 ip from any to 192.168.2.0/24 in" .Dl "ipfw pipe 1 config mask src-ip 0x000000ff bw 200Kbit/s queue 20Kbytes" .Dl "ipfw pipe 2 config mask dst-ip 0x000000ff bw 200Kbit/s queue 20Kbytes" .Ss LOOKUP TABLES In the following example, we need to create several traffic bandwidth classes and we need different hosts/networks to fall into different classes. We create one pipe for each class and configure them accordingly. Then we create a single table and fill it with IP subnets and addresses. For each subnet/host we set the argument equal to the number of the pipe that it should use. Then we classify traffic using a single rule: .Pp .Dl "ipfw pipe 1 config bw 1000Kbyte/s" .Dl "ipfw pipe 4 config bw 4000Kbyte/s" .Dl "..." .Dl "ipfw table T1 create type addr" .Dl "ipfw table T1 add 192.168.2.0/24 1" .Dl "ipfw table T1 add 192.168.0.0/27 4" .Dl "ipfw table T1 add 192.168.0.2 1" .Dl "..." .Dl "ipfw add pipe tablearg ip from 'table(T1)' to any" .Pp Using the .Cm fwd action, the table entries may include hostnames and IP addresses. .Pp .Dl "ipfw table T2 create type addr ftype ip" .Dl "ipfw table T2 add 192.168.2.0/24 10.23.2.1" .Dl "ipfw table T21 add 192.168.0.0/27 router1.dmz" .Dl "..." .Dl "ipfw add 100 fwd tablearg ip from any to table(1)" .Pp In the following example per-interface firewall is created: .Pp .Dl "ipfw table IN create type iface valtype skipto,fib" .Dl "ipfw table IN add vlan20 12000,12" .Dl "ipfw table IN add vlan30 13000,13" .Dl "ipfw table OUT create type iface valtype skipto" .Dl "ipfw table OUT add vlan20 22000" .Dl "ipfw table OUT add vlan30 23000" .Dl ".." .Dl "ipfw add 100 setfib tablearg ip from any to any recv 'table(IN)' in" .Dl "ipfw add 200 skipto tablearg ip from any to any recv 'table(IN)' in" .Dl "ipfw add 300 skipto tablearg ip from any to any xmit 'table(OUT)' out" .Pp The following example illustrate usage of flow tables: .Pp .Dl "ipfw table fl create type flow:src-ip,proto,dst-ip,dst-port" .Dl "ipfw table fl add 2a02:6b8:77::88,tcp,2a02:6b8:77::99,80 11" .Dl "ipfw table fl add 10.0.0.1,udp,10.0.0.2,53 12" .Dl ".." .Dl "ipfw add 100 allow ip from any to any flow 'table(fl,11)' recv ix0" .Ss SETS OF RULES To add a set of rules atomically, e.g.\& set 18: .Pp .Dl "ipfw set disable 18" .Dl "ipfw add NN set 18 ... # repeat as needed" .Dl "ipfw set enable 18" .Pp To delete a set of rules atomically the command is simply: .Pp .Dl "ipfw delete set 18" .Pp To test a ruleset and disable it and regain control if something goes wrong: .Pp .Dl "ipfw set disable 18" .Dl "ipfw add NN set 18 ... # repeat as needed" .Dl "ipfw set enable 18; echo done; sleep 30 && ipfw set disable 18" .Pp Here if everything goes well, you press control-C before the "sleep" terminates, and your ruleset will be left active. Otherwise, e.g.\& if you cannot access your box, the ruleset will be disabled after the sleep terminates thus restoring the previous situation. .Pp To show rules of the specific set: .Pp .Dl "ipfw set 18 show" .Pp To show rules of the disabled set: .Pp .Dl "ipfw -S set 18 show" .Pp To clear a specific rule counters of the specific set: .Pp .Dl "ipfw set 18 zero NN" .Pp To delete a specific rule of the specific set: .Pp .Dl "ipfw set 18 delete NN" .Ss NAT, REDIRECT AND LSNAT First redirect all the traffic to nat instance 123: .Pp .Dl "ipfw add nat 123 all from any to any" .Pp Then to configure nat instance 123 to alias all the outgoing traffic with ip 192.168.0.123, blocking all incoming connections, trying to keep same ports on both sides, clearing aliasing table on address change and keeping a log of traffic/link statistics: .Pp .Dl "ipfw nat 123 config ip 192.168.0.123 log deny_in reset same_ports" .Pp Or to change address of instance 123, aliasing table will be cleared (see reset option): .Pp .Dl "ipfw nat 123 config ip 10.0.0.1" .Pp To see configuration of nat instance 123: .Pp .Dl "ipfw nat 123 show config" .Pp To show logs of all the instances in range 111-999: .Pp .Dl "ipfw nat 111-999 show" .Pp To see configurations of all instances: .Pp .Dl "ipfw nat show config" .Pp Or a redirect rule with mixed modes could looks like: .Bd -literal -offset 2n ipfw nat 123 config redirect_addr 10.0.0.1 10.0.0.66 redirect_port tcp 192.168.0.1:80 500 redirect_proto udp 192.168.1.43 192.168.1.1 redirect_addr 192.168.0.10,192.168.0.11 10.0.0.100 # LSNAT redirect_port tcp 192.168.0.1:80,192.168.0.10:22 500 # LSNAT .Ed .Pp or it could be split in: .Bd -literal -offset 2n ipfw nat 1 config redirect_addr 10.0.0.1 10.0.0.66 ipfw nat 2 config redirect_port tcp 192.168.0.1:80 500 ipfw nat 3 config redirect_proto udp 192.168.1.43 192.168.1.1 ipfw nat 4 config redirect_addr 192.168.0.10,192.168.0.11,192.168.0.12 10.0.0.100 ipfw nat 5 config redirect_port tcp 192.168.0.1:80,192.168.0.10:22,192.168.0.20:25 500 .Ed .Pp Sometimes you may want to mix NAT and dynamic rules. It could be achieved with .Cm record-state and .Cm defer-action options. Problem is, you need to create dynamic rule before NAT and check it after NAT actions (or vice versa) to have consistent addresses and ports. Rule with .Cm keep-state option will trigger activation of existing dynamic state, and action of such rule will be performed as soon as rule is matched. In case of NAT and .Cm allow rule packet need to be passed to NAT, not allowed as soon is possible. .Pp There is example of set of rules to achieve this. Bear in mind that this is example only and it is not very useful by itself. .Pp On way out, after all checks place this rules: .Pp .Dl "ipfw add allow record-state skip-action" .Dl "ipfw add nat 1" .Pp And on way in there should be something like this: .Pp .Dl "ipfw add nat 1" .Dl "ipfw add check-state" .Pp Please note, that first rule on way out doesn't allow packet and doesn't execute existing dynamic rules. All it does, create new dynamic rule with .Cm allow action, if it is not created yet. Later, this dynamic rule is used on way in by .Cm check-state rule. .Ss CONFIGURING CODEL, PIE, FQ-CODEL and FQ-PIE AQM .Cm codel and .Cm pie AQM can be configured for .Nm dummynet .Cm pipe or .Cm queue . .Pp To configure a .Cm pipe with .Cm codel AQM using default configuration for traffic from 192.168.0.0/24 and 1Mbits/s rate limit, we do: .Pp .Dl "ipfw pipe 1 config bw 1mbits/s codel" .Dl "ipfw add 100 pipe 1 ip from 192.168.0.0/24 to any" .Pp To configure a .Cm queue with .Cm codel AQM using different configurations parameters for traffic from 192.168.0.0/24 and 1Mbits/s rate limit, we do: .Pp .Dl "ipfw pipe 1 config bw 1mbits/s" .Dl "ipfw queue 1 config pipe 1 codel target 8ms interval 160ms ecn" .Dl "ipfw add 100 queue 1 ip from 192.168.0.0/24 to any" .Pp To configure a .Cm pipe with .Cm pie AQM using default configuration for traffic from 192.168.0.0/24 and 1Mbits/s rate limit, we do: .Pp .Dl "ipfw pipe 1 config bw 1mbits/s pie" .Dl "ipfw add 100 pipe 1 ip from 192.168.0.0/24 to any" .Pp To configure a .Cm queue with .Cm pie AQM using different configuration parameters for traffic from 192.168.0.0/24 and 1Mbits/s rate limit, we do: .Pp .Dl "ipfw pipe 1 config bw 1mbits/s" .Dl "ipfw queue 1 config pipe 1 pie target 20ms tupdate 30ms ecn" .Dl "ipfw add 100 queue 1 ip from 192.168.0.0/24 to any" .Pp .Cm fq_codel and .Cm fq_pie AQM can be configured for .Nm dummynet schedulers. .Pp To configure .Cm fq_codel scheduler using different configurations parameters for traffic from 192.168.0.0/24 and 1Mbits/s rate limit, we do: .Pp .Dl "ipfw pipe 1 config bw 1mbits/s" .Dl "ipfw sched 1 config pipe 1 type fq_codel" .Dl "ipfw queue 1 config sched 1" .Dl "ipfw add 100 queue 1 ip from 192.168.0.0/24 to any" .Pp To change .Cm fq_codel default configuration for a .Cm sched such as disable ECN and change the .Ar target to 10ms, we do: .Pp .Dl "ipfw sched 1 config pipe 1 type fq_codel target 10ms noecn" .Pp Similar to .Cm fq_codel , to configure .Cm fq_pie scheduler using different configurations parameters for traffic from 192.168.0.0/24 and 1Mbits/s rate limit, we do: .Pp .Dl "ipfw pipe 1 config bw 1mbits/s" .Dl "ipfw sched 1 config pipe 1 type fq_pie" .Dl "ipfw queue 1 config sched 1" .Dl "ipfw add 100 queue 1 ip from 192.168.0.0/24 to any" .Pp The configurations of .Cm fq_pie .Cm sched can be changed in a similar way as for .Cm fq_codel .Sh SEE ALSO .Xr cpp 1 , .Xr m4 1 , .Xr altq 4 , .Xr divert 4 , .Xr dummynet 4 , .Xr if_bridge 4 , .Xr ip 4 , .Xr ipfirewall 4 , .Xr ng_ether 4 , .Xr ng_ipfw 4 , .Xr protocols 5 , .Xr services 5 , .Xr init 8 , .Xr kldload 8 , .Xr reboot 8 , .Xr sysctl 8 , .Xr syslogd 8 , .Xr sysrc 8 .Sh HISTORY The .Nm utility first appeared in .Fx 2.0 . .Nm dummynet was introduced in .Fx 2.2.8 . Stateful extensions were introduced in .Fx 4.0 . .Nm ipfw2 was introduced in Summer 2002. .Sh AUTHORS .An Ugen J. S. Antsilevich , .An Poul-Henning Kamp , .An Alex Nash , .An Archie Cobbs , .An Luigi Rizzo , .An Rasool Al-Saadi . .Pp .An -nosplit API based upon code written by .An Daniel Boulet for BSDI. .Pp Dummynet has been introduced by Luigi Rizzo in 1997-1998. .Pp Some early work (1999-2000) on the .Nm dummynet traffic shaper supported by Akamba Corp. .Pp The ipfw core (ipfw2) has been completely redesigned and reimplemented by Luigi Rizzo in summer 2002. Further actions and options have been added by various developers over the years. .Pp .An -nosplit In-kernel NAT support written by .An Paolo Pisati Aq Mt piso@FreeBSD.org as part of a Summer of Code 2005 project. .Pp SCTP .Nm nat support has been developed by .An The Centre for Advanced Internet Architectures (CAIA) Aq http://www.caia.swin.edu.au . The primary developers and maintainers are David Hayes and Jason But. For further information visit: .Aq http://www.caia.swin.edu.au/urp/SONATA .Pp Delay profiles have been developed by Alessandro Cerri and Luigi Rizzo, supported by the European Commission within Projects Onelab and Onelab2. .Pp CoDel, PIE, FQ-CoDel and FQ-PIE AQM for Dummynet have been implemented by .An The Centre for Advanced Internet Architectures (CAIA) in 2016, supported by The Comcast Innovation Fund. The primary developer is Rasool Al-Saadi. .Sh BUGS The syntax has grown over the years and sometimes it might be confusing. Unfortunately, backward compatibility prevents cleaning up mistakes made in the definition of the syntax. .Pp .Em !!! WARNING !!! .Pp Misconfiguring the firewall can put your computer in an unusable state, possibly shutting down network services and requiring console access to regain control of it. .Pp Incoming packet fragments diverted by .Cm divert are reassembled before delivery to the socket. The action used on those packet is the one from the rule which matches the first fragment of the packet. .Pp Packets diverted to userland, and then reinserted by a userland process may lose various packet attributes. The packet source interface name will be preserved if it is shorter than 8 bytes and the userland process saves and reuses the sockaddr_in (as does .Xr natd 8 ) ; otherwise, it may be lost. If a packet is reinserted in this manner, later rules may be incorrectly applied, making the order of .Cm divert rules in the rule sequence very important. .Pp Dummynet drops all packets with IPv6 link-local addresses. .Pp Rules using .Cm uid or .Cm gid may not behave as expected. In particular, incoming SYN packets may have no uid or gid associated with them since they do not yet belong to a TCP connection, and the uid/gid associated with a packet may not be as expected if the associated process calls .Xr setuid 2 or similar system calls. .Pp Rule syntax is subject to the command line environment and some patterns may need to be escaped with the backslash character or quoted appropriately. .Pp Due to the architecture of .Xr libalias 3 , ipfw nat is not compatible with the TCP segmentation offloading (TSO). Thus, to reliably nat your network traffic, please disable TSO on your NICs using .Xr ifconfig 8 . .Pp ICMP error messages are not implicitly matched by dynamic rules for the respective conversations. To avoid failures of network error detection and path MTU discovery, ICMP error messages may need to be allowed explicitly through static rules. .Pp Rules using .Cm call and .Cm return actions may lead to confusing behaviour if ruleset has mistakes, and/or interaction with other subsystems (netgraph, dummynet, etc.) is used. One possible case for this is packet leaving .Nm in subroutine on the input pass, while later on output encountering unpaired .Cm return first. As the call stack is kept intact after input pass, packet will suddenly return to the rule number used on input pass, not on output one. Order of processing should be checked carefully to avoid such mistakes. diff --git a/sbin/ipfw/ipfw2.h b/sbin/ipfw/ipfw2.h index 764e5176e8ef..42e4f4777792 100644 --- a/sbin/ipfw/ipfw2.h +++ b/sbin/ipfw/ipfw2.h @@ -1,453 +1,454 @@ /*- * Copyright (c) 2002-2003 Luigi Rizzo * Copyright (c) 1996 Alex Nash, Paul Traina, Poul-Henning Kamp * Copyright (c) 1994 Ugen J.S.Antsilevich * * Idea and grammar partially left from: * Copyright (c) 1993 Daniel Boulet * * Redistribution and use in source forms, with and without modification, * are permitted provided that this entire comment appears intact. * * Redistribution in binary form may occur without any restrictions. * Obviously, it would be nice if you gave credit where credit is due * but requiring it would be too onerous. * * This software is provided ``AS IS'' without any warranties of any kind. * * NEW command line interface for IP firewall facility * * $FreeBSD$ */ /* * Options that can be set on the command line. * When reading commands from a file, a subset of the options can also * be applied globally by specifying them before the file name. * After that, each line can contain its own option that changes * the global value. * XXX The context is not restored after each line. */ struct cmdline_opts { /* boolean options: */ int do_value_as_ip; /* show table value as IP */ int do_resolv; /* try to resolve all ip to names */ int do_time; /* Show time stamps */ int do_quiet; /* Be quiet in add and flush */ int do_pipe; /* this cmd refers to a pipe/queue/sched */ int do_nat; /* this cmd refers to a nat config */ int do_compact; /* show rules in compact mode */ int do_force; /* do not ask for confirmation */ int show_sets; /* display the set each rule belongs to */ int test_only; /* only check syntax */ int comment_only; /* only print action and comment */ int verbose; /* be verbose on some commands */ /* The options below can have multiple values. */ int do_dynamic; /* 1 - display dynamic rules */ /* 2 - display/delete only dynamic rules */ int do_sort; /* field to sort results (0 = no) */ /* valid fields are 1 and above */ uint32_t use_set; /* work with specified set number */ /* 0 means all sets, otherwise apply to set use_set - 1 */ }; enum { TIMESTAMP_NONE = 0, TIMESTAMP_STRING, TIMESTAMP_NUMERIC, }; extern struct cmdline_opts g_co; /* * _s_x is a structure that stores a string <-> token pairs, used in * various places in the parser. Entries are stored in arrays, * with an entry with s=NULL as terminator. * The search routines are match_token() and match_value(). * Often, an element with x=0 contains an error string. * */ struct _s_x { char const *s; int x; }; extern struct _s_x f_ipdscp[]; enum tokens { TOK_NULL=0, TOK_OR, TOK_NOT, TOK_STARTBRACE, TOK_ENDBRACE, TOK_ABORT6, TOK_ABORT, TOK_ACCEPT, TOK_COUNT, TOK_EACTION, TOK_PIPE, TOK_LINK, TOK_QUEUE, TOK_FLOWSET, TOK_SCHED, TOK_DIVERT, TOK_TEE, TOK_NETGRAPH, TOK_NGTEE, TOK_FORWARD, TOK_SKIPTO, TOK_DENY, TOK_REJECT, TOK_RESET, TOK_UNREACH, TOK_CHECKSTATE, TOK_NAT, TOK_REASS, TOK_CALL, TOK_RETURN, TOK_ALTQ, TOK_LOG, TOK_TAG, TOK_UNTAG, TOK_TAGGED, TOK_UID, TOK_GID, TOK_JAIL, TOK_IN, TOK_LIMIT, TOK_SETLIMIT, TOK_KEEPSTATE, TOK_RECORDSTATE, TOK_LAYER2, TOK_OUT, TOK_DIVERTED, TOK_DIVERTEDLOOPBACK, TOK_DIVERTEDOUTPUT, TOK_XMIT, TOK_RECV, TOK_VIA, TOK_FRAG, TOK_IPOPTS, TOK_IPLEN, TOK_IPID, TOK_IPPRECEDENCE, TOK_DSCP, TOK_IPTOS, TOK_IPTTL, TOK_IPVER, TOK_ESTAB, TOK_SETUP, TOK_TCPDATALEN, TOK_TCPFLAGS, TOK_TCPOPTS, TOK_TCPSEQ, TOK_TCPACK, TOK_TCPMSS, TOK_TCPWIN, TOK_ICMPTYPES, TOK_MAC, TOK_MACTYPE, TOK_VERREVPATH, TOK_VERSRCREACH, TOK_ANTISPOOF, TOK_IPSEC, TOK_COMMENT, TOK_PLR, TOK_NOERROR, TOK_BUCKETS, TOK_DSTIP, TOK_SRCIP, TOK_DSTPORT, TOK_SRCPORT, TOK_ALL, TOK_MASK, TOK_FLOW_MASK, TOK_SCHED_MASK, TOK_BW, TOK_DELAY, TOK_PROFILE, TOK_BURST, TOK_RED, TOK_GRED, TOK_ECN, TOK_DROPTAIL, TOK_PROTO, #ifdef NEW_AQM /* AQM tokens*/ TOK_NO_ECN, TOK_CODEL, TOK_FQ_CODEL, TOK_TARGET, TOK_INTERVAL, TOK_FLOWS, TOK_QUANTUM, TOK_PIE, TOK_FQ_PIE, TOK_TUPDATE, TOK_MAX_BURST, TOK_MAX_ECNTH, TOK_ALPHA, TOK_BETA, TOK_CAPDROP, TOK_NO_CAPDROP, TOK_ONOFF, TOK_DRE, TOK_TS, TOK_DERAND, TOK_NO_DERAND, #endif /* dummynet tokens */ TOK_WEIGHT, TOK_LMAX, TOK_PRI, TOK_TYPE, TOK_SLOTSIZE, TOK_IP, TOK_IF, TOK_ALOG, TOK_DENY_INC, TOK_SAME_PORTS, TOK_UNREG_ONLY, TOK_UNREG_CGN, TOK_SKIP_GLOBAL, TOK_RESET_ADDR, TOK_ALIAS_REV, TOK_PROXY_ONLY, TOK_REDIR_ADDR, TOK_REDIR_PORT, TOK_REDIR_PROTO, TOK_IPV6, TOK_FLOWID, TOK_ICMP6TYPES, TOK_EXT6HDR, TOK_DSTIP6, TOK_SRCIP6, TOK_IPV4, TOK_UNREACH6, TOK_RESET6, TOK_FIB, TOK_SETFIB, TOK_LOOKUP, TOK_SOCKARG, TOK_SETDSCP, TOK_FLOW, TOK_IFLIST, /* Table tokens */ TOK_CREATE, TOK_DESTROY, TOK_LIST, TOK_INFO, TOK_DETAIL, TOK_MODIFY, TOK_FLUSH, TOK_SWAP, TOK_ADD, TOK_DEL, TOK_VALTYPE, TOK_ALGO, TOK_TALIST, TOK_ATOMIC, TOK_LOCK, TOK_UNLOCK, TOK_VLIST, TOK_OLIST, TOK_MISSING, TOK_ORFLUSH, /* NAT64 tokens */ TOK_NAT64STL, TOK_NAT64LSN, TOK_STATS, TOK_STATES, TOK_CONFIG, TOK_TABLE4, TOK_TABLE6, TOK_PREFIX4, TOK_PREFIX6, TOK_AGG_LEN, TOK_AGG_COUNT, TOK_MAX_PORTS, TOK_STATES_CHUNKS, TOK_JMAXLEN, TOK_PORT_RANGE, + TOK_PORT_ALIAS, TOK_HOST_DEL_AGE, TOK_PG_DEL_AGE, TOK_TCP_SYN_AGE, TOK_TCP_CLOSE_AGE, TOK_TCP_EST_AGE, TOK_UDP_AGE, TOK_ICMP_AGE, TOK_LOGOFF, TOK_PRIVATE, TOK_PRIVATEOFF, /* NAT64 CLAT tokens */ TOK_NAT64CLAT, TOK_PLAT_PREFIX, TOK_CLAT_PREFIX, /* NPTv6 tokens */ TOK_NPTV6, TOK_INTPREFIX, TOK_EXTPREFIX, TOK_PREFIXLEN, TOK_EXTIF, TOK_TCPSETMSS, TOK_SKIPACTION, }; /* * the following macro returns an error message if we run out of * arguments. */ #define NEED(_p, msg) {if (!_p) errx(EX_USAGE, msg);} #define NEED1(msg) {if (!(*av)) errx(EX_USAGE, msg);} struct buf_pr { char *buf; /* allocated buffer */ char *ptr; /* current pointer */ size_t size; /* total buffer size */ size_t avail; /* available storage */ size_t needed; /* length needed */ }; int pr_u64(struct buf_pr *bp, void *pd, int width); int bp_alloc(struct buf_pr *b, size_t size); void bp_free(struct buf_pr *b); int bprintf(struct buf_pr *b, const char *format, ...); /* memory allocation support */ void *safe_calloc(size_t number, size_t size); void *safe_realloc(void *ptr, size_t size); /* string comparison functions used for historical compatibility */ int _substrcmp(const char *str1, const char* str2); int _substrcmp2(const char *str1, const char* str2, const char* str3); int stringnum_cmp(const char *a, const char *b); /* utility functions */ int match_token(struct _s_x *table, const char *string); int match_token_relaxed(struct _s_x *table, const char *string); int get_token(struct _s_x *table, const char *string, const char *errbase); char const *match_value(struct _s_x *p, int value); size_t concat_tokens(char *buf, size_t bufsize, struct _s_x *table, const char *delimiter); int fill_flags(struct _s_x *flags, char *p, char **e, uint32_t *set, uint32_t *clear); void print_flags_buffer(char *buf, size_t sz, struct _s_x *list, uint32_t set); struct _ip_fw3_opheader; int do_cmd(int optname, void *optval, uintptr_t optlen); int do_set3(int optname, struct _ip_fw3_opheader *op3, size_t optlen); int do_get3(int optname, struct _ip_fw3_opheader *op3, size_t *optlen); struct in6_addr; void n2mask(struct in6_addr *mask, int n); int contigmask(const uint8_t *p, int len); /* * Forward declarations to avoid include way too many headers. * C does not allow duplicated typedefs, so we use the base struct * that the typedef points to. * Should the typedefs use a different type, the compiler will * still detect the change when compiling the body of the * functions involved, so we do not lose error checking. */ struct _ipfw_insn; struct _ipfw_insn_altq; struct _ipfw_insn_u32; struct _ipfw_insn_ip6; struct _ipfw_insn_icmp6; /* * The reserved set numer. This is a constant in ip_fw.h * but we store it in a variable so other files do not depend * in that header just for one constant. */ extern int resvd_set_number; /* first-level command handlers */ void ipfw_add(char *av[]); void ipfw_show_nat(int ac, char **av); int ipfw_delete_nat(int i); void ipfw_config_pipe(int ac, char **av); void ipfw_config_nat(int ac, char **av); void ipfw_sets_handler(char *av[]); void ipfw_table_handler(int ac, char *av[]); void ipfw_sysctl_handler(char *av[], int which); void ipfw_delete(char *av[]); void ipfw_flush(int force); void ipfw_zero(int ac, char *av[], int optname); void ipfw_list(int ac, char *av[], int show_counters); void ipfw_internal_handler(int ac, char *av[]); void ipfw_nat64clat_handler(int ac, char *av[]); void ipfw_nat64lsn_handler(int ac, char *av[]); void ipfw_nat64stl_handler(int ac, char *av[]); void ipfw_nptv6_handler(int ac, char *av[]); int ipfw_check_object_name(const char *name); int ipfw_check_nat64prefix(const struct in6_addr *prefix, int length); #ifdef PF /* altq.c */ void altq_set_enabled(int enabled); u_int32_t altq_name_to_qid(const char *name); void print_altq_cmd(struct buf_pr *bp, const struct _ipfw_insn_altq *altqptr); #else #define NO_ALTQ #endif /* dummynet.c */ void dummynet_list(int ac, char *av[], int show_counters); void dummynet_flush(void); int ipfw_delete_pipe(int pipe_or_queue, int n); /* ipv6.c */ void print_unreach6_code(struct buf_pr *bp, uint16_t code); void print_ip6(struct buf_pr *bp, const struct _ipfw_insn_ip6 *cmd); void print_flow6id(struct buf_pr *bp, const struct _ipfw_insn_u32 *cmd); void print_icmp6types(struct buf_pr *bp, const struct _ipfw_insn_u32 *cmd); void print_ext6hdr(struct buf_pr *bp, const struct _ipfw_insn *cmd); struct tidx; struct _ipfw_insn *add_srcip6(struct _ipfw_insn *cmd, char *av, int cblen, struct tidx *tstate); struct _ipfw_insn *add_dstip6(struct _ipfw_insn *cmd, char *av, int cblen, struct tidx *tstate); void fill_flow6(struct _ipfw_insn_u32 *cmd, char *av, int cblen); void fill_unreach6_code(u_short *codep, char *str); void fill_icmp6types(struct _ipfw_insn_icmp6 *cmd, char *av, int cblen); int fill_ext6hdr(struct _ipfw_insn *cmd, char *av); /* ipfw2.c */ void bp_flush(struct buf_pr *b); void fill_table(struct _ipfw_insn *cmd, char *av, uint8_t opcode, struct tidx *tstate); /* tables.c */ struct _ipfw_obj_ctlv; struct _ipfw_obj_ntlv; int table_check_name(const char *tablename); void ipfw_list_ta(int ac, char *av[]); void ipfw_list_values(int ac, char *av[]); void table_fill_ntlv(struct _ipfw_obj_ntlv *ntlv, const char *name, uint8_t set, uint16_t uidx); diff --git a/sbin/ipfw/main.c b/sbin/ipfw/main.c index dc55d0bfa416..89f21062811c 100644 --- a/sbin/ipfw/main.c +++ b/sbin/ipfw/main.c @@ -1,640 +1,641 @@ /*- * Copyright (c) 2002-2003,2010 Luigi Rizzo * Copyright (c) 1996 Alex Nash, Paul Traina, Poul-Henning Kamp * Copyright (c) 1994 Ugen J.S.Antsilevich * * Idea and grammar partially left from: * Copyright (c) 1993 Daniel Boulet * * Redistribution and use in source forms, with and without modification, * are permitted provided that this entire comment appears intact. * * Redistribution in binary form may occur without any restrictions. * Obviously, it would be nice if you gave credit where credit is due * but requiring it would be too onerous. * * This software is provided ``AS IS'' without any warranties of any kind. * * Command line interface for IP firewall facility * * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include #include #include "ipfw2.h" static void help(void) { fprintf(stderr, "ipfw syntax summary (but please do read the ipfw(8) manpage):\n\n" "\tipfw [-abcdefhnNqStTv] \n\n" "where is one of the following:\n\n" "add [num] [set N] [prob x] RULE-BODY\n" "{pipe|queue} N config PIPE-BODY\n" "[pipe|queue] {zero|delete|show} [N{,N}]\n" "nat N config {ip IPADDR|if IFNAME|log|deny_in|same_ports|unreg_only|unreg_cgn|\n" " reset|reverse|proxy_only|redirect_addr linkspec|\n" -" redirect_port linkspec|redirect_proto linkspec}\n" +" redirect_port linkspec|redirect_proto linkspec|\n" +" port_range lower-upper}\n" "set [disable N... enable N...] | move [rule] X to Y | swap X Y | show\n" "set N {show|list|zero|resetlog|delete} [N{,N}] | flush\n" "table N {add ip[/bits] [value] | delete ip[/bits] | flush | list}\n" "table all {flush | list}\n" "\n" "RULE-BODY: check-state [PARAMS] | ACTION [PARAMS] ADDR [OPTION_LIST]\n" "ACTION: check-state | allow | count | deny | unreach{,6} CODE |\n" " skipto N | {divert|tee} PORT | forward ADDR |\n" " pipe N | queue N | nat N | setfib FIB | reass\n" "PARAMS: [log [logamount LOGLIMIT]] [altq QUEUE_NAME]\n" "ADDR: [ MAC dst src ether_type ] \n" " [ ip from IPADDR [ PORT ] to IPADDR [ PORTLIST ] ]\n" " [ ipv6|ip6 from IP6ADDR [ PORT ] to IP6ADDR [ PORTLIST ] ]\n" "IPADDR: [not] { any | me | ip/bits{x,y,z} | table(t[,v]) | IPLIST }\n" "IP6ADDR: [not] { any | me | me6 | ip6/bits | IP6LIST }\n" "IP6LIST: { ip6 | ip6/bits }[,IP6LIST]\n" "IPLIST: { ip | ip/bits | ip:mask }[,IPLIST]\n" "OPTION_LIST: OPTION [OPTION_LIST]\n" "OPTION: bridged | diverted | diverted-loopback | diverted-output |\n" " {dst-ip|src-ip} IPADDR | {dst-ip6|src-ip6|dst-ipv6|src-ipv6} IP6ADDR |\n" " {dst-port|src-port} LIST |\n" " estab | frag | {gid|uid} N | icmptypes LIST | in | out | ipid LIST |\n" " iplen LIST | ipoptions SPEC | ipprecedence | ipsec | iptos SPEC |\n" " ipttl LIST | ipversion VER | keep-state | layer2 | limit ... |\n" " icmp6types LIST | ext6hdr LIST | flow-id N[,N] | fib FIB |\n" " mac ... | mac-type LIST | proto LIST | {recv|xmit|via} {IF|IPADDR} |\n" " setup | {tcpack|tcpseq|tcpwin} NN | tcpflags SPEC | tcpoptions SPEC |\n" " tcpdatalen LIST | verrevpath | versrcreach | antispoof\n" ); exit(0); } /* * Called with the arguments, including program name because getopt * wants it to be present. * Returns 0 if successful, 1 if empty command, errx() in case of errors. * First thing we do is process parameters creating an argv[] array * which includes the program name and a NULL entry at the end. * If we are called with a single string, we split it on whitespace. * Also, arguments with a trailing ',' are joined to the next one. * The pointers (av[]) and data are in a single chunk of memory. * av[0] points to the original program name, all other entries * point into the allocated chunk. */ static int ipfw_main(int oldac, char **oldav) { int ch, ac; const char *errstr; char **av, **save_av; int do_acct = 0; /* Show packet/byte count */ int try_next = 0; /* set if pipe cmd not found */ int av_size; /* compute the av size */ char *av_p; /* used to build the av list */ #define WHITESP " \t\f\v\n\r" if (oldac < 2) return 1; /* need at least one argument */ if (oldac == 2) { /* * If we are called with one argument, try to split it into * words for subsequent parsing. Spaces after a ',' are * removed by copying the string in-place. */ char *arg = oldav[1]; /* The string is the first arg. */ int l = strlen(arg); int copy = 0; /* 1 if we need to copy, 0 otherwise */ int i, j; for (i = j = 0; i < l; i++) { if (arg[i] == '#') /* comment marker */ break; if (copy) { arg[j++] = arg[i]; copy = !strchr("," WHITESP, arg[i]); } else { copy = !strchr(WHITESP, arg[i]); if (copy) arg[j++] = arg[i]; } } if (!copy && j > 0) /* last char was a 'blank', remove it */ j--; l = j; /* the new argument length */ arg[j++] = '\0'; if (l == 0) /* empty string! */ return 1; /* * First, count number of arguments. Because of the previous * processing, this is just the number of blanks plus 1. */ for (i = 0, ac = 1; i < l; i++) if (strchr(WHITESP, arg[i]) != NULL) ac++; /* * Allocate the argument list structure as a single block * of memory, containing pointers and the argument * strings. We include one entry for the program name * because getopt expects it, and a NULL at the end * to simplify further parsing. */ ac++; /* add 1 for the program name */ av_size = (ac+1) * sizeof(char *) + l + 1; av = safe_calloc(av_size, 1); /* * Init the argument pointer to the end of the array * and copy arguments from arg[] to av[]. For each one, * j is the initial character, i is the one past the end. */ av_p = (char *)&av[ac+1]; for (ac = 1, i = j = 0; i < l; i++) { if (strchr(WHITESP, arg[i]) != NULL || i == l-1) { if (i == l-1) i++; bcopy(arg+j, av_p, i-j); av[ac] = av_p; av_p += i-j; /* the length of the string */ *av_p++ = '\0'; ac++; j = i + 1; } } } else { /* * If an argument ends with ',' join with the next one. */ int first, i, l=0; /* * Allocate the argument list structure as a single block * of memory, containing both pointers and the argument * strings. We include some space for the program name * because getopt expects it. * We add an extra pointer to the end of the array, * to make simpler further parsing. */ for (i=0; i= 2 && !strcmp(av[1], "sysctl")) { char *s; int i; if (ac != 3) { printf( "sysctl emulation usage:\n" " ipfw sysctl name[=value]\n" " ipfw sysctl -a\n"); return 0; } s = strchr(av[2], '='); if (s == NULL) { s = !strcmp(av[2], "-a") ? NULL : av[2]; sysctlbyname(s, NULL, NULL, NULL, 0); } else { /* ipfw sysctl x.y.z=value */ /* assume an INT value, will extend later */ if (s[1] == '\0') { printf("ipfw sysctl: missing value\n\n"); return 0; } *s = '\0'; i = strtol(s+1, NULL, 0); sysctlbyname(av[2], NULL, NULL, &i, sizeof(int)); } return 0; } #endif /* Save arguments for final freeing of memory. */ save_av = av; optind = optreset = 1; /* restart getopt() */ while ((ch = getopt(ac, av, "abcdDefhinNp:qs:STtv")) != -1) switch (ch) { case 'a': do_acct = 1; break; case 'b': g_co.comment_only = 1; g_co.do_compact = 1; break; case 'c': g_co.do_compact = 1; break; case 'd': g_co.do_dynamic = 1; break; case 'D': g_co.do_dynamic = 2; break; case 'e': /* nop for compatibility */ break; case 'f': g_co.do_force = 1; break; case 'h': /* help */ free(save_av); help(); break; /* NOTREACHED */ case 'i': g_co.do_value_as_ip = 1; break; case 'n': g_co.test_only = 1; break; case 'N': g_co.do_resolv = 1; break; case 'p': errx(EX_USAGE, "An absolute pathname must be used " "with -p option."); /* NOTREACHED */ case 'q': g_co.do_quiet = 1; break; case 's': /* sort */ g_co.do_sort = atoi(optarg); break; case 'S': g_co.show_sets = 1; break; case 't': g_co.do_time = TIMESTAMP_STRING; break; case 'T': g_co.do_time = TIMESTAMP_NUMERIC; break; case 'v': /* verbose */ g_co.verbose = 1; break; default: free(save_av); return 1; } ac -= optind; av += optind; NEED1("bad arguments, for usage summary ``ipfw''"); /* * An undocumented behaviour of ipfw1 was to allow rule numbers first, * e.g. "100 add allow ..." instead of "add 100 allow ...". * In case, swap first and second argument to get the normal form. */ if (ac > 1 && isdigit(*av[0])) { char *p = av[0]; av[0] = av[1]; av[1] = p; } /* * Optional: pipe, queue or nat. */ g_co.do_nat = 0; g_co.do_pipe = 0; g_co.use_set = 0; if (!strncmp(*av, "nat", strlen(*av))) g_co.do_nat = 1; else if (!strncmp(*av, "pipe", strlen(*av))) g_co.do_pipe = 1; else if (_substrcmp(*av, "queue") == 0) g_co.do_pipe = 2; else if (_substrcmp(*av, "flowset") == 0) g_co.do_pipe = 2; else if (_substrcmp(*av, "sched") == 0) g_co.do_pipe = 3; else if (!strncmp(*av, "set", strlen(*av))) { if (ac > 1 && isdigit(av[1][0])) { g_co.use_set = strtonum(av[1], 0, resvd_set_number, &errstr); if (errstr) errx(EX_DATAERR, "invalid set number %s\n", av[1]); ac -= 2; av += 2; g_co.use_set++; } } if (g_co.do_pipe || g_co.do_nat) { ac--; av++; } NEED1("missing command"); /* * For pipes, queues and nats we normally say 'nat|pipe NN config' * but the code is easier to parse as 'nat|pipe config NN' * so we swap the two arguments. */ if ((g_co.do_pipe || g_co.do_nat) && ac > 1 && isdigit(*av[0])) { char *p = av[0]; av[0] = av[1]; av[1] = p; } if (g_co.use_set == 0) { if (_substrcmp(*av, "add") == 0) ipfw_add(av); else if (g_co.do_nat && _substrcmp(*av, "show") == 0) ipfw_show_nat(ac, av); else if (g_co.do_pipe && _substrcmp(*av, "config") == 0) ipfw_config_pipe(ac, av); else if (g_co.do_nat && _substrcmp(*av, "config") == 0) ipfw_config_nat(ac, av); else if (_substrcmp(*av, "set") == 0) ipfw_sets_handler(av); else if (_substrcmp(*av, "table") == 0) ipfw_table_handler(ac, av); else if (_substrcmp(*av, "enable") == 0) ipfw_sysctl_handler(av, 1); else if (_substrcmp(*av, "disable") == 0) ipfw_sysctl_handler(av, 0); else try_next = 1; } if (g_co.use_set || try_next) { if (_substrcmp(*av, "delete") == 0) ipfw_delete(av); else if (!strncmp(*av, "nat64clat", strlen(*av))) ipfw_nat64clat_handler(ac, av); else if (!strncmp(*av, "nat64stl", strlen(*av))) ipfw_nat64stl_handler(ac, av); else if (!strncmp(*av, "nat64lsn", strlen(*av))) ipfw_nat64lsn_handler(ac, av); else if (!strncmp(*av, "nptv6", strlen(*av))) ipfw_nptv6_handler(ac, av); else if (_substrcmp(*av, "flush") == 0) ipfw_flush(g_co.do_force); else if (_substrcmp(*av, "zero") == 0) ipfw_zero(ac, av, 0 /* IP_FW_ZERO */); else if (_substrcmp(*av, "resetlog") == 0) ipfw_zero(ac, av, 1 /* IP_FW_RESETLOG */); else if (_substrcmp(*av, "print") == 0 || _substrcmp(*av, "list") == 0) ipfw_list(ac, av, do_acct); else if (_substrcmp(*av, "show") == 0) ipfw_list(ac, av, 1 /* show counters */); else if (_substrcmp(*av, "table") == 0) ipfw_table_handler(ac, av); else if (_substrcmp(*av, "internal") == 0) ipfw_internal_handler(ac, av); else errx(EX_USAGE, "bad command `%s'", *av); } /* Free memory allocated in the argument parsing. */ free(save_av); return 0; } static void ipfw_readfile(int ac, char *av[]) { #define MAX_ARGS 32 char buf[4096]; char *progname = av[0]; /* original program name */ const char *cmd = NULL; /* preprocessor name, if any */ const char *filename = av[ac-1]; /* file to read */ int c, lineno=0; FILE *f = NULL; pid_t preproc = 0; while ((c = getopt(ac, av, "cfNnp:qS")) != -1) { switch(c) { case 'c': g_co.do_compact = 1; break; case 'f': g_co.do_force = 1; break; case 'N': g_co.do_resolv = 1; break; case 'n': g_co.test_only = 1; break; case 'p': /* * ipfw -p cmd [args] filename * * We are done with getopt(). All arguments * except the filename go to the preprocessor, * so we need to do the following: * - check that a filename is actually present; * - advance av by optind-1 to skip arguments * already processed; * - decrease ac by optind, to remove the args * already processed and the final filename; * - set the last entry in av[] to NULL so * popen() can detect the end of the array; * - set optind=ac to let getopt() terminate. */ if (optind == ac) errx(EX_USAGE, "no filename argument"); cmd = optarg; av[ac-1] = NULL; av += optind - 1; ac -= optind; optind = ac; break; case 'q': g_co.do_quiet = 1; break; case 'S': g_co.show_sets = 1; break; default: errx(EX_USAGE, "bad arguments, for usage" " summary ``ipfw''"); } } if (cmd == NULL && ac != optind + 1) errx(EX_USAGE, "extraneous filename arguments %s", av[ac-1]); if ((f = fopen(filename, "r")) == NULL) err(EX_UNAVAILABLE, "fopen: %s", filename); if (cmd != NULL) { /* pipe through preprocessor */ int pipedes[2]; if (pipe(pipedes) == -1) err(EX_OSERR, "cannot create pipe"); preproc = fork(); if (preproc == -1) err(EX_OSERR, "cannot fork"); if (preproc == 0) { /* * Child, will run the preprocessor with the * file on stdin and the pipe on stdout. */ if (dup2(fileno(f), 0) == -1 || dup2(pipedes[1], 1) == -1) err(EX_OSERR, "dup2()"); fclose(f); close(pipedes[1]); close(pipedes[0]); execvp(cmd, av); err(EX_OSERR, "execvp(%s) failed", cmd); } else { /* parent, will reopen f as the pipe */ fclose(f); close(pipedes[1]); if ((f = fdopen(pipedes[0], "r")) == NULL) { int savederrno = errno; (void)kill(preproc, SIGTERM); errno = savederrno; err(EX_OSERR, "fdopen()"); } } } while (fgets(buf, sizeof(buf), f)) { /* read commands */ char linename[20]; char *args[2]; lineno++; snprintf(linename, sizeof(linename), "Line %d", lineno); setprogname(linename); /* XXX */ args[0] = progname; args[1] = buf; ipfw_main(2, args); } fclose(f); if (cmd != NULL) { int status; if (waitpid(preproc, &status, 0) == -1) errx(EX_OSERR, "waitpid()"); if (WIFEXITED(status) && WEXITSTATUS(status) != EX_OK) errx(EX_UNAVAILABLE, "preprocessor exited with status %d", WEXITSTATUS(status)); else if (WIFSIGNALED(status)) errx(EX_UNAVAILABLE, "preprocessor exited with signal %d", WTERMSIG(status)); } } int main(int ac, char *av[]) { #if defined(_WIN32) && defined(TCC) { WSADATA wsaData; int ret=0; unsigned short wVersionRequested = MAKEWORD(2, 2); ret = WSAStartup(wVersionRequested, &wsaData); if (ret != 0) { /* Tell the user that we could not find a usable */ /* Winsock DLL. */ printf("WSAStartup failed with error: %d\n", ret); return 1; } } #endif /* * If the last argument is an absolute pathname, interpret it * as a file to be preprocessed. */ if (ac > 1 && av[ac - 1][0] == '/') { if (access(av[ac - 1], R_OK) == 0) ipfw_readfile(ac, av); else err(EX_USAGE, "pathname: %s", av[ac - 1]); } else { if (ipfw_main(ac, av)) { errx(EX_USAGE, "usage: ipfw [options]\n" "do \"ipfw -h\" or \"man ipfw\" for details"); } } return EX_OK; } diff --git a/sbin/ipfw/nat.c b/sbin/ipfw/nat.c index bbf5be666ea0..1076e1038b8a 100644 --- a/sbin/ipfw/nat.c +++ b/sbin/ipfw/nat.c @@ -1,1150 +1,1191 @@ /*- * Copyright (c) 2002-2003 Luigi Rizzo * Copyright (c) 1996 Alex Nash, Paul Traina, Poul-Henning Kamp * Copyright (c) 1994 Ugen J.S.Antsilevich * * Idea and grammar partially left from: * Copyright (c) 1993 Daniel Boulet * * Redistribution and use in source forms, with and without modification, * are permitted provided that this entire comment appears intact. * * Redistribution in binary form may occur without any restrictions. * Obviously, it would be nice if you gave credit where credit is due * but requiring it would be too onerous. * * This software is provided ``AS IS'' without any warranties of any kind. * * NEW command line interface for IP firewall facility * * $FreeBSD$ * * In-kernel nat support */ #include #include #include #include "ipfw2.h" #include #include #include #include #include #include #include #include #include #include #include /* def. of struct route */ #include #include #include #include typedef int (nat_cb_t)(struct nat44_cfg_nat *cfg, void *arg); static void nat_show_cfg(struct nat44_cfg_nat *n, void *arg); static void nat_show_log(struct nat44_cfg_nat *n, void *arg); static int nat_show_data(struct nat44_cfg_nat *cfg, void *arg); static int natname_cmp(const void *a, const void *b); static int nat_foreach(nat_cb_t *f, void *arg, int sort); static int nat_get_cmd(char *name, uint16_t cmd, ipfw_obj_header **ooh); static struct _s_x nat_params[] = { { "ip", TOK_IP }, { "if", TOK_IF }, { "log", TOK_ALOG }, { "deny_in", TOK_DENY_INC }, { "same_ports", TOK_SAME_PORTS }, { "unreg_only", TOK_UNREG_ONLY }, { "unreg_cgn", TOK_UNREG_CGN }, { "skip_global", TOK_SKIP_GLOBAL }, { "reset", TOK_RESET_ADDR }, { "reverse", TOK_ALIAS_REV }, { "proxy_only", TOK_PROXY_ONLY }, + { "port_range", TOK_PORT_ALIAS }, { "redirect_addr", TOK_REDIR_ADDR }, { "redirect_port", TOK_REDIR_PORT }, { "redirect_proto", TOK_REDIR_PROTO }, { NULL, 0 } /* terminator */ }; /* * Search for interface with name "ifn", and fill n accordingly: * * n->ip ip address of interface "ifn" * n->if_name copy of interface name "ifn" */ static void set_addr_dynamic(const char *ifn, struct nat44_cfg_nat *n) { size_t needed; int mib[6]; char *buf, *lim, *next; struct if_msghdr *ifm; struct ifa_msghdr *ifam; struct sockaddr_dl *sdl; struct sockaddr_in *sin; int ifIndex; mib[0] = CTL_NET; mib[1] = PF_ROUTE; mib[2] = 0; mib[3] = AF_INET; mib[4] = NET_RT_IFLIST; mib[5] = 0; /* * Get interface data. */ if (sysctl(mib, 6, NULL, &needed, NULL, 0) == -1) err(1, "iflist-sysctl-estimate"); buf = safe_calloc(1, needed); if (sysctl(mib, 6, buf, &needed, NULL, 0) == -1) err(1, "iflist-sysctl-get"); lim = buf + needed; /* * Loop through interfaces until one with * given name is found. This is done to * find correct interface index for routing * message processing. */ ifIndex = 0; next = buf; while (next < lim) { ifm = (struct if_msghdr *)next; next += ifm->ifm_msglen; if (ifm->ifm_version != RTM_VERSION) { if (g_co.verbose) warnx("routing message version %d " "not understood", ifm->ifm_version); continue; } if (ifm->ifm_type == RTM_IFINFO) { sdl = (struct sockaddr_dl *)(ifm + 1); if (strlen(ifn) == sdl->sdl_nlen && strncmp(ifn, sdl->sdl_data, sdl->sdl_nlen) == 0) { ifIndex = ifm->ifm_index; break; } } } if (!ifIndex) errx(1, "unknown interface name %s", ifn); /* * Get interface address. */ sin = NULL; while (next < lim) { ifam = (struct ifa_msghdr *)next; next += ifam->ifam_msglen; if (ifam->ifam_version != RTM_VERSION) { if (g_co.verbose) warnx("routing message version %d " "not understood", ifam->ifam_version); continue; } if (ifam->ifam_type != RTM_NEWADDR) break; if (ifam->ifam_addrs & RTA_IFA) { int i; char *cp = (char *)(ifam + 1); for (i = 1; i < RTA_IFA; i <<= 1) { if (ifam->ifam_addrs & i) cp += SA_SIZE((struct sockaddr *)cp); } if (((struct sockaddr *)cp)->sa_family == AF_INET) { sin = (struct sockaddr_in *)cp; break; } } } if (sin == NULL) n->ip.s_addr = htonl(INADDR_ANY); else n->ip = sin->sin_addr; strncpy(n->if_name, ifn, IF_NAMESIZE); free(buf); } /* * XXX - The following functions, macros and definitions come from natd.c: * it would be better to move them outside natd.c, in a file * (redirect_support.[ch]?) shared by ipfw and natd, but for now i can live * with it. */ /* * Definition of a port range, and macros to deal with values. * FORMAT: HI 16-bits == first port in range, 0 == all ports. * LO 16-bits == number of ports in range * NOTES: - Port values are not stored in network byte order. */ #define port_range u_long #define GETLOPORT(x) ((x) >> 0x10) #define GETNUMPORTS(x) ((x) & 0x0000ffff) #define GETHIPORT(x) (GETLOPORT((x)) + GETNUMPORTS((x))) /* Set y to be the low-port value in port_range variable x. */ #define SETLOPORT(x,y) ((x) = ((x) & 0x0000ffff) | ((y) << 0x10)) /* Set y to be the number of ports in port_range variable x. */ #define SETNUMPORTS(x,y) ((x) = ((x) & 0xffff0000) | (y)) static void StrToAddr (const char* str, struct in_addr* addr) { struct hostent* hp; if (inet_aton (str, addr)) return; hp = gethostbyname (str); if (!hp) errx (1, "unknown host %s", str); memcpy (addr, hp->h_addr, sizeof (struct in_addr)); } static int StrToPortRange (const char* str, const char* proto, port_range *portRange) { char* sep; struct servent* sp; char* end; u_short loPort; u_short hiPort; /* First see if this is a service, return corresponding port if so. */ sp = getservbyname (str,proto); if (sp) { SETLOPORT(*portRange, ntohs(sp->s_port)); SETNUMPORTS(*portRange, 1); return 0; } /* Not a service, see if it's a single port or port range. */ sep = strchr (str, '-'); if (sep == NULL) { SETLOPORT(*portRange, strtol(str, &end, 10)); if (end != str) { /* Single port. */ SETNUMPORTS(*portRange, 1); return 0; } /* Error in port range field. */ errx (EX_DATAERR, "%s/%s: unknown service", str, proto); } /* Port range, get the values and sanity check. */ sscanf (str, "%hu-%hu", &loPort, &hiPort); SETLOPORT(*portRange, loPort); SETNUMPORTS(*portRange, 0); /* Error by default */ if (loPort <= hiPort) SETNUMPORTS(*portRange, hiPort - loPort + 1); if (GETNUMPORTS(*portRange) == 0) errx (EX_DATAERR, "invalid port range %s", str); return 0; } static int StrToProto (const char* str) { if (!strcmp (str, "tcp")) return IPPROTO_TCP; if (!strcmp (str, "udp")) return IPPROTO_UDP; if (!strcmp (str, "sctp")) return IPPROTO_SCTP; errx (EX_DATAERR, "unknown protocol %s. Expected sctp, tcp or udp", str); } static int StrToAddrAndPortRange (const char* str, struct in_addr* addr, char* proto, port_range *portRange) { char* ptr; ptr = strchr (str, ':'); if (!ptr) errx (EX_DATAERR, "%s is missing port number", str); *ptr = '\0'; ++ptr; StrToAddr (str, addr); return StrToPortRange (ptr, proto, portRange); } /* End of stuff taken from natd.c. */ /* * The next 3 functions add support for the addr, port and proto redirect and * their logic is loosely based on SetupAddressRedirect(), SetupPortRedirect() * and SetupProtoRedirect() from natd.c. * * Every setup_* function fills at least one redirect entry * (struct nat44_cfg_redir) and zero or more server pool entry * (struct nat44_cfg_spool) in buf. * * The format of data in buf is: * * nat44_cfg_nat nat44_cfg_redir nat44_cfg_spool ...... nat44_cfg_spool * * ------------------------------------- ------------ * | | .....X ..... | | | | ..... * ------------------------------------- ...... ------------ * ^ * spool_cnt n=0 ...... n=(X-1) * * len points to the amount of available space in buf * space counts the memory consumed by every function * * XXX - Every function get all the argv params so it * has to check, in optional parameters, that the next * args is a valid option for the redir entry and not * another token. Only redir_port and redir_proto are * affected by this. */ static int estimate_redir_addr(int *ac, char ***av) { size_t space = sizeof(struct nat44_cfg_redir); char *sep = **av; u_int c = 0; (void)ac; /* UNUSED */ while ((sep = strchr(sep, ',')) != NULL) { c++; sep++; } if (c > 0) c++; space += c * sizeof(struct nat44_cfg_spool); return (space); } static int setup_redir_addr(char *buf, int *ac, char ***av) { struct nat44_cfg_redir *r; char *sep; size_t space; r = (struct nat44_cfg_redir *)buf; r->mode = REDIR_ADDR; /* Skip nat44_cfg_redir at beginning of buf. */ buf = &buf[sizeof(struct nat44_cfg_redir)]; space = sizeof(struct nat44_cfg_redir); /* Extract local address. */ if (strchr(**av, ',') != NULL) { struct nat44_cfg_spool *spool; /* Setup LSNAT server pool. */ r->laddr.s_addr = INADDR_NONE; sep = strtok(**av, ","); while (sep != NULL) { spool = (struct nat44_cfg_spool *)buf; space += sizeof(struct nat44_cfg_spool); StrToAddr(sep, &spool->addr); spool->port = ~0; r->spool_cnt++; /* Point to the next possible nat44_cfg_spool. */ buf = &buf[sizeof(struct nat44_cfg_spool)]; sep = strtok(NULL, ","); } } else StrToAddr(**av, &r->laddr); (*av)++; (*ac)--; /* Extract public address. */ StrToAddr(**av, &r->paddr); (*av)++; (*ac)--; return (space); } static int estimate_redir_port(int *ac, char ***av) { size_t space = sizeof(struct nat44_cfg_redir); char *sep = **av; u_int c = 0; (void)ac; /* UNUSED */ while ((sep = strchr(sep, ',')) != NULL) { c++; sep++; } if (c > 0) c++; space += c * sizeof(struct nat44_cfg_spool); return (space); } static int setup_redir_port(char *buf, int *ac, char ***av) { struct nat44_cfg_redir *r; char *sep, *protoName, *lsnat = NULL; size_t space; u_short numLocalPorts; port_range portRange; numLocalPorts = 0; r = (struct nat44_cfg_redir *)buf; r->mode = REDIR_PORT; /* Skip nat44_cfg_redir at beginning of buf. */ buf = &buf[sizeof(struct nat44_cfg_redir)]; space = sizeof(struct nat44_cfg_redir); /* * Extract protocol. */ r->proto = StrToProto(**av); protoName = **av; (*av)++; (*ac)--; /* * Extract local address. */ if (strchr(**av, ',') != NULL) { r->laddr.s_addr = INADDR_NONE; r->lport = ~0; numLocalPorts = 1; lsnat = **av; } else { /* * The sctp nat does not allow the port numbers to be mapped to * new port numbers. Therefore, no ports are to be specified * in the target port field. */ if (r->proto == IPPROTO_SCTP) { if (strchr(**av, ':')) errx(EX_DATAERR, "redirect_port:" "port numbers do not change in sctp, so do " "not specify them as part of the target"); else StrToAddr(**av, &r->laddr); } else { if (StrToAddrAndPortRange(**av, &r->laddr, protoName, &portRange) != 0) errx(EX_DATAERR, "redirect_port: " "invalid local port range"); r->lport = GETLOPORT(portRange); numLocalPorts = GETNUMPORTS(portRange); } } (*av)++; (*ac)--; /* * Extract public port and optionally address. */ if (strchr(**av, ':') != NULL) { if (StrToAddrAndPortRange(**av, &r->paddr, protoName, &portRange) != 0) errx(EX_DATAERR, "redirect_port: " "invalid public port range"); } else { r->paddr.s_addr = INADDR_ANY; if (StrToPortRange(**av, protoName, &portRange) != 0) errx(EX_DATAERR, "redirect_port: " "invalid public port range"); } r->pport = GETLOPORT(portRange); if (r->proto == IPPROTO_SCTP) { /* so the logic below still works */ numLocalPorts = GETNUMPORTS(portRange); r->lport = r->pport; } r->pport_cnt = GETNUMPORTS(portRange); (*av)++; (*ac)--; /* * Extract remote address and optionally port. */ /* * NB: isdigit(**av) => we've to check that next parameter is really an * option for this redirect entry, else stop here processing arg[cv]. */ if (*ac != 0 && isdigit(***av)) { if (strchr(**av, ':') != NULL) { if (StrToAddrAndPortRange(**av, &r->raddr, protoName, &portRange) != 0) errx(EX_DATAERR, "redirect_port: " "invalid remote port range"); } else { SETLOPORT(portRange, 0); SETNUMPORTS(portRange, 1); StrToAddr(**av, &r->raddr); } (*av)++; (*ac)--; } else { SETLOPORT(portRange, 0); SETNUMPORTS(portRange, 1); r->raddr.s_addr = INADDR_ANY; } r->rport = GETLOPORT(portRange); r->rport_cnt = GETNUMPORTS(portRange); /* * Make sure port ranges match up, then add the redirect ports. */ if (numLocalPorts != r->pport_cnt) errx(EX_DATAERR, "redirect_port: " "port ranges must be equal in size"); /* Remote port range is allowed to be '0' which means all ports. */ if (r->rport_cnt != numLocalPorts && (r->rport_cnt != 1 || r->rport != 0)) errx(EX_DATAERR, "redirect_port: remote port must" "be 0 or equal to local port range in size"); /* Setup LSNAT server pool. */ if (lsnat != NULL) { struct nat44_cfg_spool *spool; sep = strtok(lsnat, ","); while (sep != NULL) { spool = (struct nat44_cfg_spool *)buf; space += sizeof(struct nat44_cfg_spool); /* * The sctp nat does not allow the port numbers to * be mapped to new port numbers. Therefore, no ports * are to be specified in the target port field. */ if (r->proto == IPPROTO_SCTP) { if (strchr (sep, ':')) { errx(EX_DATAERR, "redirect_port:" "port numbers do not change in " "sctp, so do not specify them as " "part of the target"); } else { StrToAddr(sep, &spool->addr); spool->port = r->pport; } } else { if (StrToAddrAndPortRange(sep, &spool->addr, protoName, &portRange) != 0) errx(EX_DATAERR, "redirect_port:" "invalid local port range"); if (GETNUMPORTS(portRange) != 1) errx(EX_DATAERR, "redirect_port: " "local port must be single in " "this context"); spool->port = GETLOPORT(portRange); } r->spool_cnt++; /* Point to the next possible nat44_cfg_spool. */ buf = &buf[sizeof(struct nat44_cfg_spool)]; sep = strtok(NULL, ","); } } return (space); } static int setup_redir_proto(char *buf, int *ac, char ***av) { struct nat44_cfg_redir *r; struct protoent *protoent; size_t space; r = (struct nat44_cfg_redir *)buf; r->mode = REDIR_PROTO; /* Skip nat44_cfg_redir at beginning of buf. */ buf = &buf[sizeof(struct nat44_cfg_redir)]; space = sizeof(struct nat44_cfg_redir); /* * Extract protocol. */ protoent = getprotobyname(**av); if (protoent == NULL) errx(EX_DATAERR, "redirect_proto: unknown protocol %s", **av); else r->proto = protoent->p_proto; (*av)++; (*ac)--; /* * Extract local address. */ StrToAddr(**av, &r->laddr); (*av)++; (*ac)--; /* * Extract optional public address. */ if (*ac == 0) { r->paddr.s_addr = INADDR_ANY; r->raddr.s_addr = INADDR_ANY; } else { /* see above in setup_redir_port() */ if (isdigit(***av)) { StrToAddr(**av, &r->paddr); (*av)++; (*ac)--; /* * Extract optional remote address. */ /* see above in setup_redir_port() */ if (*ac != 0 && isdigit(***av)) { StrToAddr(**av, &r->raddr); (*av)++; (*ac)--; } } } return (space); } static void nat_show_log(struct nat44_cfg_nat *n, void *arg __unused) { char *buf; buf = (char *)(n + 1); if (buf[0] != '\0') printf("nat %s: %s\n", n->name, buf); } static void nat_show_cfg(struct nat44_cfg_nat *n, void *arg __unused) { struct nat44_cfg_redir *t; struct nat44_cfg_spool *s; caddr_t buf; struct protoent *p; uint32_t cnt; int i, off; buf = (caddr_t)n; off = sizeof(*n); printf("ipfw nat %s config", n->name); if (strlen(n->if_name) != 0) printf(" if %s", n->if_name); else if (n->ip.s_addr != 0) printf(" ip %s", inet_ntoa(n->ip)); while (n->mode != 0) { if (n->mode & PKT_ALIAS_LOG) { printf(" log"); n->mode &= ~PKT_ALIAS_LOG; } else if (n->mode & PKT_ALIAS_DENY_INCOMING) { printf(" deny_in"); n->mode &= ~PKT_ALIAS_DENY_INCOMING; } else if (n->mode & PKT_ALIAS_SAME_PORTS) { printf(" same_ports"); n->mode &= ~PKT_ALIAS_SAME_PORTS; } else if (n->mode & PKT_ALIAS_SKIP_GLOBAL) { printf(" skip_global"); n->mode &= ~PKT_ALIAS_SKIP_GLOBAL; } else if (n->mode & PKT_ALIAS_UNREGISTERED_ONLY) { printf(" unreg_only"); n->mode &= ~PKT_ALIAS_UNREGISTERED_ONLY; } else if (n->mode & PKT_ALIAS_UNREGISTERED_CGN) { printf(" unreg_cgn"); n->mode &= ~PKT_ALIAS_UNREGISTERED_CGN; } else if (n->mode & PKT_ALIAS_RESET_ON_ADDR_CHANGE) { printf(" reset"); n->mode &= ~PKT_ALIAS_RESET_ON_ADDR_CHANGE; } else if (n->mode & PKT_ALIAS_REVERSE) { printf(" reverse"); n->mode &= ~PKT_ALIAS_REVERSE; } else if (n->mode & PKT_ALIAS_PROXY_ONLY) { printf(" proxy_only"); n->mode &= ~PKT_ALIAS_PROXY_ONLY; } } /* Print all the redirect's data configuration. */ for (cnt = 0; cnt < n->redir_cnt; cnt++) { t = (struct nat44_cfg_redir *)&buf[off]; off += sizeof(struct nat44_cfg_redir); switch (t->mode) { case REDIR_ADDR: printf(" redirect_addr"); if (t->spool_cnt == 0) printf(" %s", inet_ntoa(t->laddr)); else for (i = 0; i < t->spool_cnt; i++) { s = (struct nat44_cfg_spool *)&buf[off]; if (i) printf(","); else printf(" "); printf("%s", inet_ntoa(s->addr)); off += sizeof(struct nat44_cfg_spool); } printf(" %s", inet_ntoa(t->paddr)); break; case REDIR_PORT: p = getprotobynumber(t->proto); printf(" redirect_port %s ", p->p_name); if (!t->spool_cnt) { printf("%s:%u", inet_ntoa(t->laddr), t->lport); if (t->pport_cnt > 1) printf("-%u", t->lport + t->pport_cnt - 1); } else for (i=0; i < t->spool_cnt; i++) { s = (struct nat44_cfg_spool *)&buf[off]; if (i) printf(","); printf("%s:%u", inet_ntoa(s->addr), s->port); off += sizeof(struct nat44_cfg_spool); } printf(" "); if (t->paddr.s_addr) printf("%s:", inet_ntoa(t->paddr)); printf("%u", t->pport); if (!t->spool_cnt && t->pport_cnt > 1) printf("-%u", t->pport + t->pport_cnt - 1); if (t->raddr.s_addr) { printf(" %s", inet_ntoa(t->raddr)); if (t->rport) { printf(":%u", t->rport); if (!t->spool_cnt && t->rport_cnt > 1) printf("-%u", t->rport + t->rport_cnt - 1); } } break; case REDIR_PROTO: p = getprotobynumber(t->proto); printf(" redirect_proto %s %s", p->p_name, inet_ntoa(t->laddr)); if (t->paddr.s_addr != 0) { printf(" %s", inet_ntoa(t->paddr)); if (t->raddr.s_addr) printf(" %s", inet_ntoa(t->raddr)); } break; default: errx(EX_DATAERR, "unknown redir mode"); break; } } printf("\n"); } +static int +nat_port_alias_parse(char *str, u_short *lpout, u_short *hpout) { + long lp, hp; + char *ptr; + /* Lower port parsing */ + lp = (long) strtol(str, &ptr, 10); + if (lp < 1024 || lp > 65535) + return 0; + if (!ptr || *ptr != '-') + return 0; + /* Upper port parsing */ + hp = (long) strtol(ptr, &ptr, 10); + if (hp < 1024 || hp > 65535) + return 0; + if (ptr) + return 0; + + *lpout = (u_short) lp; + *hpout = (u_short) hp; + return 1; +} + void ipfw_config_nat(int ac, char **av) { ipfw_obj_header *oh; struct nat44_cfg_nat *n; /* Nat instance configuration. */ int i, off, tok, ac1; + u_short lp, hp; char *id, *buf, **av1, *end; size_t len; av++; ac--; /* Nat id. */ if (ac == 0) errx(EX_DATAERR, "missing nat id"); id = *av; i = (int)strtol(id, &end, 0); if (i <= 0 || *end != '\0') errx(EX_DATAERR, "illegal nat id: %s", id); av++; ac--; if (ac == 0) errx(EX_DATAERR, "missing option"); len = sizeof(*oh) + sizeof(*n); ac1 = ac; av1 = av; while (ac1 > 0) { tok = match_token(nat_params, *av1); ac1--; av1++; switch (tok) { case TOK_IP: case TOK_IF: + case TOK_PORT_ALIAS: ac1--; av1++; break; case TOK_ALOG: case TOK_DENY_INC: case TOK_SAME_PORTS: case TOK_SKIP_GLOBAL: case TOK_UNREG_ONLY: case TOK_UNREG_CGN: case TOK_RESET_ADDR: case TOK_ALIAS_REV: case TOK_PROXY_ONLY: break; case TOK_REDIR_ADDR: if (ac1 < 2) errx(EX_DATAERR, "redirect_addr: " "not enough arguments"); len += estimate_redir_addr(&ac1, &av1); av1 += 2; ac1 -= 2; break; case TOK_REDIR_PORT: if (ac1 < 3) errx(EX_DATAERR, "redirect_port: " "not enough arguments"); av1++; ac1--; len += estimate_redir_port(&ac1, &av1); av1 += 2; ac1 -= 2; /* Skip optional remoteIP/port */ if (ac1 != 0 && isdigit(**av1)) { av1++; ac1--; } break; case TOK_REDIR_PROTO: if (ac1 < 2) errx(EX_DATAERR, "redirect_proto: " "not enough arguments"); len += sizeof(struct nat44_cfg_redir); av1 += 2; ac1 -= 2; /* Skip optional remoteIP/port */ if (ac1 != 0 && isdigit(**av1)) { av1++; ac1--; } if (ac1 != 0 && isdigit(**av1)) { av1++; ac1--; } break; default: errx(EX_DATAERR, "unrecognised option ``%s''", av1[-1]); } } if ((buf = malloc(len)) == NULL) errx(EX_OSERR, "malloc failed"); /* Offset in buf: save space for header at the beginning. */ off = sizeof(*oh) + sizeof(*n); memset(buf, 0, len); oh = (ipfw_obj_header *)buf; n = (struct nat44_cfg_nat *)(oh + 1); oh->ntlv.head.length = sizeof(oh->ntlv); snprintf(oh->ntlv.name, sizeof(oh->ntlv.name), "%d", i); snprintf(n->name, sizeof(n->name), "%d", i); while (ac > 0) { tok = match_token(nat_params, *av); ac--; av++; switch (tok) { case TOK_IP: if (ac == 0) errx(EX_DATAERR, "missing option"); if (!inet_aton(av[0], &(n->ip))) errx(EX_DATAERR, "bad ip address ``%s''", av[0]); ac--; av++; break; case TOK_IF: if (ac == 0) errx(EX_DATAERR, "missing option"); set_addr_dynamic(av[0], n); ac--; av++; break; case TOK_ALOG: n->mode |= PKT_ALIAS_LOG; break; case TOK_DENY_INC: n->mode |= PKT_ALIAS_DENY_INCOMING; break; case TOK_SAME_PORTS: n->mode |= PKT_ALIAS_SAME_PORTS; break; case TOK_UNREG_ONLY: n->mode |= PKT_ALIAS_UNREGISTERED_ONLY; break; case TOK_UNREG_CGN: n->mode |= PKT_ALIAS_UNREGISTERED_CGN; break; case TOK_SKIP_GLOBAL: n->mode |= PKT_ALIAS_SKIP_GLOBAL; break; case TOK_RESET_ADDR: n->mode |= PKT_ALIAS_RESET_ON_ADDR_CHANGE; break; case TOK_ALIAS_REV: n->mode |= PKT_ALIAS_REVERSE; break; case TOK_PROXY_ONLY: n->mode |= PKT_ALIAS_PROXY_ONLY; break; /* * All the setup_redir_* functions work directly in * the final buffer, see above for details. */ case TOK_REDIR_ADDR: case TOK_REDIR_PORT: case TOK_REDIR_PROTO: switch (tok) { case TOK_REDIR_ADDR: i = setup_redir_addr(&buf[off], &ac, &av); break; case TOK_REDIR_PORT: i = setup_redir_port(&buf[off], &ac, &av); break; case TOK_REDIR_PROTO: i = setup_redir_proto(&buf[off], &ac, &av); break; } n->redir_cnt++; off += i; break; + case TOK_PORT_ALIAS: + if (ac == 0) + errx(EX_DATAERR, "missing option"); + if (!nat_port_alias_parse(av[0], &lp, &hp)) + errx(EX_DATAERR, + "You need a range of port(s) from 1024 <= x < 65536"); + if (lp >= hp) + errx(EX_DATAERR, + "Upper port has to be greater than lower port"); + n->alias_port_lo = lp; + n->alias_port_hi = hp; + ac--; + av++; + break; } } + if (n->mode & PKT_ALIAS_SAME_PORTS && n->alias_port_lo) + errx(EX_DATAERR, "same_ports and port_range cannot both be selected"); i = do_set3(IP_FW_NAT44_XCONFIG, &oh->opheader, len); if (i != 0) err(1, "setsockopt(%s)", "IP_FW_NAT44_XCONFIG"); if (!g_co.do_quiet) { /* After every modification, we show the resultant rule. */ int _ac = 3; const char *_av[] = {"show", "config", id}; ipfw_show_nat(_ac, (char **)(void *)_av); } } static void nat_fill_ntlv(ipfw_obj_ntlv *ntlv, int i) { ntlv->head.type = IPFW_TLV_EACTION_NAME(1); /* it doesn't matter */ ntlv->head.length = sizeof(ipfw_obj_ntlv); ntlv->idx = 1; ntlv->set = 0; /* not yet */ snprintf(ntlv->name, sizeof(ntlv->name), "%d", i); } int ipfw_delete_nat(int i) { ipfw_obj_header oh; int ret; memset(&oh, 0, sizeof(oh)); nat_fill_ntlv(&oh.ntlv, i); ret = do_set3(IP_FW_NAT44_DESTROY, &oh.opheader, sizeof(oh)); if (ret == -1) { if (!g_co.do_quiet) warn("nat %u not available", i); return (EX_UNAVAILABLE); } return (EX_OK); } struct nat_list_arg { uint16_t cmd; int is_all; }; static int nat_show_data(struct nat44_cfg_nat *cfg, void *arg) { struct nat_list_arg *nla; ipfw_obj_header *oh; nla = (struct nat_list_arg *)arg; switch (nla->cmd) { case IP_FW_NAT44_XGETCONFIG: if (nat_get_cmd(cfg->name, nla->cmd, &oh) != 0) { warnx("Error getting nat instance %s info", cfg->name); break; } nat_show_cfg((struct nat44_cfg_nat *)(oh + 1), NULL); free(oh); break; case IP_FW_NAT44_XGETLOG: if (nat_get_cmd(cfg->name, nla->cmd, &oh) == 0) { nat_show_log((struct nat44_cfg_nat *)(oh + 1), NULL); free(oh); break; } /* Handle error */ if (nla->is_all != 0 && errno == ENOENT) break; warn("Error getting nat instance %s info", cfg->name); break; } return (0); } /* * Compare nat names. * Honor number comparison. */ static int natname_cmp(const void *a, const void *b) { const struct nat44_cfg_nat *ia, *ib; ia = (const struct nat44_cfg_nat *)a; ib = (const struct nat44_cfg_nat *)b; return (stringnum_cmp(ia->name, ib->name)); } /* * Retrieves nat list from kernel, * optionally sorts it and calls requested function for each table. * Returns 0 on success. */ static int nat_foreach(nat_cb_t *f, void *arg, int sort) { ipfw_obj_lheader *olh; struct nat44_cfg_nat *cfg; size_t sz; uint32_t i; int error; /* Start with reasonable default */ sz = sizeof(*olh) + 16 * sizeof(struct nat44_cfg_nat); for (;;) { if ((olh = calloc(1, sz)) == NULL) return (ENOMEM); olh->size = sz; if (do_get3(IP_FW_NAT44_LIST_NAT, &olh->opheader, &sz) != 0) { sz = olh->size; free(olh); if (errno == ENOMEM) continue; return (errno); } if (sort != 0) qsort(olh + 1, olh->count, olh->objsize, natname_cmp); cfg = (struct nat44_cfg_nat*)(olh + 1); for (i = 0; i < olh->count; i++) { error = f(cfg, arg); /* Ignore errors for now */ cfg = (struct nat44_cfg_nat *)((caddr_t)cfg + olh->objsize); } free(olh); break; } return (0); } static int nat_get_cmd(char *name, uint16_t cmd, ipfw_obj_header **ooh) { ipfw_obj_header *oh; struct nat44_cfg_nat *cfg; size_t sz; /* Start with reasonable default */ sz = sizeof(*oh) + sizeof(*cfg) + 128; for (;;) { if ((oh = calloc(1, sz)) == NULL) return (ENOMEM); cfg = (struct nat44_cfg_nat *)(oh + 1); oh->ntlv.head.length = sizeof(oh->ntlv); strlcpy(oh->ntlv.name, name, sizeof(oh->ntlv.name)); strlcpy(cfg->name, name, sizeof(cfg->name)); if (do_get3(cmd, &oh->opheader, &sz) != 0) { sz = cfg->size; free(oh); if (errno == ENOMEM) continue; return (errno); } *ooh = oh; break; } return (0); } void ipfw_show_nat(int ac, char **av) { ipfw_obj_header *oh; char *name; int cmd; struct nat_list_arg nla; ac--; av++; if (g_co.test_only) return; /* Parse parameters. */ cmd = 0; /* XXX: Change to IP_FW_NAT44_XGETLOG @ MFC */ name = NULL; for ( ; ac != 0; ac--, av++) { if (!strncmp(av[0], "config", strlen(av[0]))) { cmd = IP_FW_NAT44_XGETCONFIG; continue; } if (strcmp(av[0], "log") == 0) { cmd = IP_FW_NAT44_XGETLOG; continue; } if (name != NULL) err(EX_USAGE,"only one instance name may be specified"); name = av[0]; } if (cmd == 0) errx(EX_USAGE, "Please specify action. Available: config,log"); if (name == NULL) { memset(&nla, 0, sizeof(nla)); nla.cmd = cmd; nla.is_all = 1; nat_foreach(nat_show_data, &nla, 1); } else { if (nat_get_cmd(name, cmd, &oh) != 0) err(EX_OSERR, "Error getting nat %s instance info", name); nat_show_cfg((struct nat44_cfg_nat *)(oh + 1), NULL); free(oh); } } diff --git a/sys/netinet/ip_fw.h b/sys/netinet/ip_fw.h index f0531a67132d..4d3099a781a0 100644 --- a/sys/netinet/ip_fw.h +++ b/sys/netinet/ip_fw.h @@ -1,1067 +1,1069 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2002-2009 Luigi Rizzo, Universita` di Pisa * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _IPFW2_H #define _IPFW2_H /* * The default rule number. By the design of ip_fw, the default rule * is the last one, so its number can also serve as the highest number * allowed for a rule. The ip_fw code relies on both meanings of this * constant. */ #define IPFW_DEFAULT_RULE 65535 #define RESVD_SET 31 /*set for default and persistent rules*/ #define IPFW_MAX_SETS 32 /* Number of sets supported by ipfw*/ /* * Compat values for old clients */ #ifndef _KERNEL #define IPFW_TABLES_MAX 65535 #define IPFW_TABLES_DEFAULT 128 #endif /* * Most commands (queue, pipe, tag, untag, limit...) can have a 16-bit * argument between 1 and 65534. The value 0 (IP_FW_TARG) is used * to represent 'tablearg' value, e.g. indicate the use of a 'tablearg' * result of the most recent table() lookup. * Note that 16bit is only a historical limit, resulting from * the use of a 16-bit fields for that value. In reality, we can have * 2^32 pipes, queues, tag values and so on. */ #define IPFW_ARG_MIN 1 #define IPFW_ARG_MAX 65534 #define IP_FW_TABLEARG 65535 /* Compat value for old clients */ #define IP_FW_TARG 0 /* Current tablearg value */ #define IP_FW_NAT44_GLOBAL 65535 /* arg1 value for "nat global" */ /* * Number of entries in the call stack of the call/return commands. * Call stack currently is an uint16_t array with rule numbers. */ #define IPFW_CALLSTACK_SIZE 16 /* IP_FW3 header/opcodes */ typedef struct _ip_fw3_opheader { uint16_t opcode; /* Operation opcode */ uint16_t version; /* Opcode version */ uint16_t reserved[2]; /* Align to 64-bit boundary */ } ip_fw3_opheader; /* IP_FW3 opcodes */ #define IP_FW_TABLE_XADD 86 /* add entry */ #define IP_FW_TABLE_XDEL 87 /* delete entry */ #define IP_FW_TABLE_XGETSIZE 88 /* get table size (deprecated) */ #define IP_FW_TABLE_XLIST 89 /* list table contents */ #define IP_FW_TABLE_XDESTROY 90 /* destroy table */ #define IP_FW_TABLES_XLIST 92 /* list all tables */ #define IP_FW_TABLE_XINFO 93 /* request info for one table */ #define IP_FW_TABLE_XFLUSH 94 /* flush table data */ #define IP_FW_TABLE_XCREATE 95 /* create new table */ #define IP_FW_TABLE_XMODIFY 96 /* modify existing table */ #define IP_FW_XGET 97 /* Retrieve configuration */ #define IP_FW_XADD 98 /* add rule */ #define IP_FW_XDEL 99 /* del rule */ #define IP_FW_XMOVE 100 /* move rules to different set */ #define IP_FW_XZERO 101 /* clear accounting */ #define IP_FW_XRESETLOG 102 /* zero rules logs */ #define IP_FW_SET_SWAP 103 /* Swap between 2 sets */ #define IP_FW_SET_MOVE 104 /* Move one set to another one */ #define IP_FW_SET_ENABLE 105 /* Enable/disable sets */ #define IP_FW_TABLE_XFIND 106 /* finds an entry */ #define IP_FW_XIFLIST 107 /* list tracked interfaces */ #define IP_FW_TABLES_ALIST 108 /* list table algorithms */ #define IP_FW_TABLE_XSWAP 109 /* swap two tables */ #define IP_FW_TABLE_VLIST 110 /* dump table value hash */ #define IP_FW_NAT44_XCONFIG 111 /* Create/modify NAT44 instance */ #define IP_FW_NAT44_DESTROY 112 /* Destroys NAT44 instance */ #define IP_FW_NAT44_XGETCONFIG 113 /* Get NAT44 instance config */ #define IP_FW_NAT44_LIST_NAT 114 /* List all NAT44 instances */ #define IP_FW_NAT44_XGETLOG 115 /* Get log from NAT44 instance */ #define IP_FW_DUMP_SOPTCODES 116 /* Dump available sopts/versions */ #define IP_FW_DUMP_SRVOBJECTS 117 /* Dump existing named objects */ #define IP_FW_NAT64STL_CREATE 130 /* Create stateless NAT64 instance */ #define IP_FW_NAT64STL_DESTROY 131 /* Destroy stateless NAT64 instance */ #define IP_FW_NAT64STL_CONFIG 132 /* Modify stateless NAT64 instance */ #define IP_FW_NAT64STL_LIST 133 /* List stateless NAT64 instances */ #define IP_FW_NAT64STL_STATS 134 /* Get NAT64STL instance statistics */ #define IP_FW_NAT64STL_RESET_STATS 135 /* Reset NAT64STL instance statistics */ #define IP_FW_NAT64LSN_CREATE 140 /* Create stateful NAT64 instance */ #define IP_FW_NAT64LSN_DESTROY 141 /* Destroy stateful NAT64 instance */ #define IP_FW_NAT64LSN_CONFIG 142 /* Modify stateful NAT64 instance */ #define IP_FW_NAT64LSN_LIST 143 /* List stateful NAT64 instances */ #define IP_FW_NAT64LSN_STATS 144 /* Get NAT64LSN instance statistics */ #define IP_FW_NAT64LSN_LIST_STATES 145 /* Get stateful NAT64 states */ #define IP_FW_NAT64LSN_RESET_STATS 146 /* Reset NAT64LSN instance statistics */ #define IP_FW_NPTV6_CREATE 150 /* Create NPTv6 instance */ #define IP_FW_NPTV6_DESTROY 151 /* Destroy NPTv6 instance */ #define IP_FW_NPTV6_CONFIG 152 /* Modify NPTv6 instance */ #define IP_FW_NPTV6_LIST 153 /* List NPTv6 instances */ #define IP_FW_NPTV6_STATS 154 /* Get NPTv6 instance statistics */ #define IP_FW_NPTV6_RESET_STATS 155 /* Reset NPTv6 instance statistics */ #define IP_FW_NAT64CLAT_CREATE 160 /* Create clat NAT64 instance */ #define IP_FW_NAT64CLAT_DESTROY 161 /* Destroy clat NAT64 instance */ #define IP_FW_NAT64CLAT_CONFIG 162 /* Modify clat NAT64 instance */ #define IP_FW_NAT64CLAT_LIST 163 /* List clat NAT64 instances */ #define IP_FW_NAT64CLAT_STATS 164 /* Get NAT64CLAT instance statistics */ #define IP_FW_NAT64CLAT_RESET_STATS 165 /* Reset NAT64CLAT instance statistics */ /* * The kernel representation of ipfw rules is made of a list of * 'instructions' (for all practical purposes equivalent to BPF * instructions), which specify which fields of the packet * (or its metadata) should be analysed. * * Each instruction is stored in a structure which begins with * "ipfw_insn", and can contain extra fields depending on the * instruction type (listed below). * Note that the code is written so that individual instructions * have a size which is a multiple of 32 bits. This means that, if * such structures contain pointers or other 64-bit entities, * (there is just one instance now) they may end up unaligned on * 64-bit architectures, so the must be handled with care. * * "enum ipfw_opcodes" are the opcodes supported. We can have up * to 256 different opcodes. When adding new opcodes, they should * be appended to the end of the opcode list before O_LAST_OPCODE, * this will prevent the ABI from being broken, otherwise users * will have to recompile ipfw(8) when they update the kernel. */ enum ipfw_opcodes { /* arguments (4 byte each) */ O_NOP, O_IP_SRC, /* u32 = IP */ O_IP_SRC_MASK, /* ip = IP/mask */ O_IP_SRC_ME, /* none */ O_IP_SRC_SET, /* u32=base, arg1=len, bitmap */ O_IP_DST, /* u32 = IP */ O_IP_DST_MASK, /* ip = IP/mask */ O_IP_DST_ME, /* none */ O_IP_DST_SET, /* u32=base, arg1=len, bitmap */ O_IP_SRCPORT, /* (n)port list:mask 4 byte ea */ O_IP_DSTPORT, /* (n)port list:mask 4 byte ea */ O_PROTO, /* arg1=protocol */ O_MACADDR2, /* 2 mac addr:mask */ O_MAC_TYPE, /* same as srcport */ O_LAYER2, /* none */ O_IN, /* none */ O_FRAG, /* none */ O_RECV, /* none */ O_XMIT, /* none */ O_VIA, /* none */ O_IPOPT, /* arg1 = 2*u8 bitmap */ O_IPLEN, /* arg1 = len */ O_IPID, /* arg1 = id */ O_IPTOS, /* arg1 = id */ O_IPPRECEDENCE, /* arg1 = precedence << 5 */ O_IPTTL, /* arg1 = TTL */ O_IPVER, /* arg1 = version */ O_UID, /* u32 = id */ O_GID, /* u32 = id */ O_ESTAB, /* none (tcp established) */ O_TCPFLAGS, /* arg1 = 2*u8 bitmap */ O_TCPWIN, /* arg1 = desired win */ O_TCPSEQ, /* u32 = desired seq. */ O_TCPACK, /* u32 = desired seq. */ O_ICMPTYPE, /* u32 = icmp bitmap */ O_TCPOPTS, /* arg1 = 2*u8 bitmap */ O_VERREVPATH, /* none */ O_VERSRCREACH, /* none */ O_PROBE_STATE, /* none */ O_KEEP_STATE, /* none */ O_LIMIT, /* ipfw_insn_limit */ O_LIMIT_PARENT, /* dyn_type, not an opcode. */ /* * These are really 'actions'. */ O_LOG, /* ipfw_insn_log */ O_PROB, /* u32 = match probability */ O_CHECK_STATE, /* none */ O_ACCEPT, /* none */ O_DENY, /* none */ O_REJECT, /* arg1=icmp arg (same as deny) */ O_COUNT, /* none */ O_SKIPTO, /* arg1=next rule number */ O_PIPE, /* arg1=pipe number */ O_QUEUE, /* arg1=queue number */ O_DIVERT, /* arg1=port number */ O_TEE, /* arg1=port number */ O_FORWARD_IP, /* fwd sockaddr */ O_FORWARD_MAC, /* fwd mac */ O_NAT, /* nope */ O_REASS, /* none */ /* * More opcodes. */ O_IPSEC, /* has ipsec history */ O_IP_SRC_LOOKUP, /* arg1=table number, u32=value */ O_IP_DST_LOOKUP, /* arg1=table number, u32=value */ O_ANTISPOOF, /* none */ O_JAIL, /* u32 = id */ O_ALTQ, /* u32 = altq classif. qid */ O_DIVERTED, /* arg1=bitmap (1:loop, 2:out) */ O_TCPDATALEN, /* arg1 = tcp data len */ O_IP6_SRC, /* address without mask */ O_IP6_SRC_ME, /* my addresses */ O_IP6_SRC_MASK, /* address with the mask */ O_IP6_DST, O_IP6_DST_ME, O_IP6_DST_MASK, O_FLOW6ID, /* for flow id tag in the ipv6 pkt */ O_ICMP6TYPE, /* icmp6 packet type filtering */ O_EXT_HDR, /* filtering for ipv6 extension header */ O_IP6, /* * actions for ng_ipfw */ O_NETGRAPH, /* send to ng_ipfw */ O_NGTEE, /* copy to ng_ipfw */ O_IP4, O_UNREACH6, /* arg1=icmpv6 code arg (deny) */ O_TAG, /* arg1=tag number */ O_TAGGED, /* arg1=tag number */ O_SETFIB, /* arg1=FIB number */ O_FIB, /* arg1=FIB desired fib number */ O_SOCKARG, /* socket argument */ O_CALLRETURN, /* arg1=called rule number */ O_FORWARD_IP6, /* fwd sockaddr_in6 */ O_DSCP, /* 2 u32 = DSCP mask */ O_SETDSCP, /* arg1=DSCP value */ O_IP_FLOW_LOOKUP, /* arg1=table number, u32=value */ O_EXTERNAL_ACTION, /* arg1=id of external action handler */ O_EXTERNAL_INSTANCE, /* arg1=id of eaction handler instance */ O_EXTERNAL_DATA, /* variable length data */ O_SKIP_ACTION, /* none */ O_TCPMSS, /* arg1=MSS value */ O_LAST_OPCODE /* not an opcode! */ }; /* * The extension header are filtered only for presence using a bit * vector with a flag for each header. */ #define EXT_FRAGMENT 0x1 #define EXT_HOPOPTS 0x2 #define EXT_ROUTING 0x4 #define EXT_AH 0x8 #define EXT_ESP 0x10 #define EXT_DSTOPTS 0x20 #define EXT_RTHDR0 0x40 #define EXT_RTHDR2 0x80 /* * Template for instructions. * * ipfw_insn is used for all instructions which require no operands, * a single 16-bit value (arg1), or a couple of 8-bit values. * * For other instructions which require different/larger arguments * we have derived structures, ipfw_insn_*. * * The size of the instruction (in 32-bit words) is in the low * 6 bits of "len". The 2 remaining bits are used to implement * NOT and OR on individual instructions. Given a type, you can * compute the length to be put in "len" using F_INSN_SIZE(t) * * F_NOT negates the match result of the instruction. * * F_OR is used to build or blocks. By default, instructions * are evaluated as part of a logical AND. An "or" block * { X or Y or Z } contains F_OR set in all but the last * instruction of the block. A match will cause the code * to skip past the last instruction of the block. * * NOTA BENE: in a couple of places we assume that * sizeof(ipfw_insn) == sizeof(u_int32_t) * this needs to be fixed. * */ typedef struct _ipfw_insn { /* template for instructions */ _Alignas(_Alignof(u_int32_t)) u_int8_t opcode; u_int8_t len; /* number of 32-bit words */ #define F_NOT 0x80 #define F_OR 0x40 #define F_LEN_MASK 0x3f #define F_LEN(cmd) ((cmd)->len & F_LEN_MASK) u_int16_t arg1; } ipfw_insn; /* * The F_INSN_SIZE(type) computes the size, in 4-byte words, of * a given type. */ #define F_INSN_SIZE(t) ((sizeof (t))/sizeof(u_int32_t)) /* * This is used to store an array of 16-bit entries (ports etc.) */ typedef struct _ipfw_insn_u16 { ipfw_insn o; u_int16_t ports[2]; /* there may be more */ } ipfw_insn_u16; /* * This is used to store an array of 32-bit entries * (uid, single IPv4 addresses etc.) */ typedef struct _ipfw_insn_u32 { ipfw_insn o; u_int32_t d[1]; /* one or more */ } ipfw_insn_u32; /* * This is used to store IP addr-mask pairs. */ typedef struct _ipfw_insn_ip { ipfw_insn o; struct in_addr addr; struct in_addr mask; } ipfw_insn_ip; /* * This is used to forward to a given address (ip). */ typedef struct _ipfw_insn_sa { ipfw_insn o; struct sockaddr_in sa; } ipfw_insn_sa; /* * This is used to forward to a given address (ipv6). */ typedef struct _ipfw_insn_sa6 { ipfw_insn o; struct sockaddr_in6 sa; } ipfw_insn_sa6; /* * This is used for MAC addr-mask pairs. */ typedef struct _ipfw_insn_mac { ipfw_insn o; u_char addr[12]; /* dst[6] + src[6] */ u_char mask[12]; /* dst[6] + src[6] */ } ipfw_insn_mac; /* * This is used for interface match rules (recv xx, xmit xx). */ typedef struct _ipfw_insn_if { ipfw_insn o; union { struct in_addr ip; int glob; uint16_t kidx; } p; char name[IFNAMSIZ]; } ipfw_insn_if; /* * This is used for storing an altq queue id number. */ typedef struct _ipfw_insn_altq { ipfw_insn o; u_int32_t qid; } ipfw_insn_altq; /* * This is used for limit rules. */ typedef struct _ipfw_insn_limit { ipfw_insn o; u_int8_t _pad; u_int8_t limit_mask; /* combination of DYN_* below */ #define DYN_SRC_ADDR 0x1 #define DYN_SRC_PORT 0x2 #define DYN_DST_ADDR 0x4 #define DYN_DST_PORT 0x8 u_int16_t conn_limit; } ipfw_insn_limit; /* * This is used for log instructions. */ typedef struct _ipfw_insn_log { ipfw_insn o; u_int32_t max_log; /* how many do we log -- 0 = all */ u_int32_t log_left; /* how many left to log */ } ipfw_insn_log; /* Legacy NAT structures, compat only */ #ifndef _KERNEL /* * Data structures required by both ipfw(8) and ipfw(4) but not part of the * management API are protected by IPFW_INTERNAL. */ #ifdef IPFW_INTERNAL /* Server pool support (LSNAT). */ struct cfg_spool { LIST_ENTRY(cfg_spool) _next; /* chain of spool instances */ struct in_addr addr; u_short port; }; #endif /* Redirect modes id. */ #define REDIR_ADDR 0x01 #define REDIR_PORT 0x02 #define REDIR_PROTO 0x04 #ifdef IPFW_INTERNAL /* Nat redirect configuration. */ struct cfg_redir { LIST_ENTRY(cfg_redir) _next; /* chain of redir instances */ u_int16_t mode; /* type of redirect mode */ struct in_addr laddr; /* local ip address */ struct in_addr paddr; /* public ip address */ struct in_addr raddr; /* remote ip address */ u_short lport; /* local port */ u_short pport; /* public port */ u_short rport; /* remote port */ u_short pport_cnt; /* number of public ports */ u_short rport_cnt; /* number of remote ports */ int proto; /* protocol: tcp/udp */ struct alias_link **alink; /* num of entry in spool chain */ u_int16_t spool_cnt; /* chain of spool instances */ LIST_HEAD(spool_chain, cfg_spool) spool_chain; }; #endif #ifdef IPFW_INTERNAL /* Nat configuration data struct. */ struct cfg_nat { /* chain of nat instances */ LIST_ENTRY(cfg_nat) _next; int id; /* nat id */ struct in_addr ip; /* nat ip address */ char if_name[IF_NAMESIZE]; /* interface name */ int mode; /* aliasing mode */ struct libalias *lib; /* libalias instance */ /* number of entry in spool chain */ int redir_cnt; /* chain of redir instances */ LIST_HEAD(redir_chain, cfg_redir) redir_chain; }; #endif #define SOF_NAT sizeof(struct cfg_nat) #define SOF_REDIR sizeof(struct cfg_redir) #define SOF_SPOOL sizeof(struct cfg_spool) #endif /* ifndef _KERNEL */ struct nat44_cfg_spool { struct in_addr addr; uint16_t port; uint16_t spare; }; #define NAT44_REDIR_ADDR 0x01 #define NAT44_REDIR_PORT 0x02 #define NAT44_REDIR_PROTO 0x04 /* Nat redirect configuration. */ struct nat44_cfg_redir { struct in_addr laddr; /* local ip address */ struct in_addr paddr; /* public ip address */ struct in_addr raddr; /* remote ip address */ uint16_t lport; /* local port */ uint16_t pport; /* public port */ uint16_t rport; /* remote port */ uint16_t pport_cnt; /* number of public ports */ uint16_t rport_cnt; /* number of remote ports */ uint16_t mode; /* type of redirect mode */ uint16_t spool_cnt; /* num of entry in spool chain */ uint16_t spare; uint32_t proto; /* protocol: tcp/udp */ }; /* Nat configuration data struct. */ struct nat44_cfg_nat { char name[64]; /* nat name */ char if_name[64]; /* interface name */ uint32_t size; /* structure size incl. redirs */ struct in_addr ip; /* nat IPv4 address */ uint32_t mode; /* aliasing mode */ uint32_t redir_cnt; /* number of entry in spool chain */ + u_short alias_port_lo; /* low range for port aliasing */ + u_short alias_port_hi; /* high range for port aliasing */ }; /* Nat command. */ typedef struct _ipfw_insn_nat { ipfw_insn o; struct cfg_nat *nat; } ipfw_insn_nat; /* Apply ipv6 mask on ipv6 addr */ #define APPLY_MASK(addr,mask) do { \ (addr)->__u6_addr.__u6_addr32[0] &= (mask)->__u6_addr.__u6_addr32[0]; \ (addr)->__u6_addr.__u6_addr32[1] &= (mask)->__u6_addr.__u6_addr32[1]; \ (addr)->__u6_addr.__u6_addr32[2] &= (mask)->__u6_addr.__u6_addr32[2]; \ (addr)->__u6_addr.__u6_addr32[3] &= (mask)->__u6_addr.__u6_addr32[3]; \ } while (0) /* Structure for ipv6 */ typedef struct _ipfw_insn_ip6 { ipfw_insn o; struct in6_addr addr6; struct in6_addr mask6; } ipfw_insn_ip6; /* Used to support icmp6 types */ typedef struct _ipfw_insn_icmp6 { ipfw_insn o; uint32_t d[7]; /* XXX This number si related to the netinet/icmp6.h * define ICMP6_MAXTYPE * as follows: n = ICMP6_MAXTYPE/32 + 1 * Actually is 203 */ } ipfw_insn_icmp6; /* * Here we have the structure representing an ipfw rule. * * Layout: * struct ip_fw_rule * [ counter block, size = rule->cntr_len ] * [ one or more instructions, size = rule->cmd_len * 4 ] * * It starts with a general area (with link fields). * Counter block may be next (if rule->cntr_len > 0), * followed by an array of one or more instructions, which the code * accesses as an array of 32-bit values. rule->cmd_len represents * the total instructions legth in u32 worrd, while act_ofs represents * rule action offset in u32 words. * * When assembling instruction, remember the following: * * + if a rule has a "keep-state" (or "limit") option, then the * first instruction (at r->cmd) MUST BE an O_PROBE_STATE * + if a rule has a "log" option, then the first action * (at ACTION_PTR(r)) MUST be O_LOG * + if a rule has an "altq" option, it comes after "log" * + if a rule has an O_TAG option, it comes after "log" and "altq" * * * All structures (excluding instructions) are u64-aligned. * Please keep this. */ struct ip_fw_rule { uint16_t act_ofs; /* offset of action in 32-bit units */ uint16_t cmd_len; /* # of 32-bit words in cmd */ uint16_t spare; uint8_t set; /* rule set (0..31) */ uint8_t flags; /* rule flags */ uint32_t rulenum; /* rule number */ uint32_t id; /* rule id */ ipfw_insn cmd[1]; /* storage for commands */ }; #define IPFW_RULE_NOOPT 0x01 /* Has no options in body */ #define IPFW_RULE_JUSTOPTS 0x02 /* new format of rule body */ /* Unaligned version */ /* Base ipfw rule counter block. */ struct ip_fw_bcounter { uint16_t size; /* Size of counter block, bytes */ uint8_t flags; /* flags for given block */ uint8_t spare; uint32_t timestamp; /* tv_sec of last match */ uint64_t pcnt; /* Packet counter */ uint64_t bcnt; /* Byte counter */ }; #ifndef _KERNEL /* * Legacy rule format */ struct ip_fw { struct ip_fw *x_next; /* linked list of rules */ struct ip_fw *next_rule; /* ptr to next [skipto] rule */ /* 'next_rule' is used to pass up 'set_disable' status */ uint16_t act_ofs; /* offset of action in 32-bit units */ uint16_t cmd_len; /* # of 32-bit words in cmd */ uint16_t rulenum; /* rule number */ uint8_t set; /* rule set (0..31) */ uint8_t _pad; /* padding */ uint32_t id; /* rule id */ /* These fields are present in all rules. */ uint64_t pcnt; /* Packet counter */ uint64_t bcnt; /* Byte counter */ uint32_t timestamp; /* tv_sec of last match */ ipfw_insn cmd[1]; /* storage for commands */ }; #endif #define ACTION_PTR(rule) \ (ipfw_insn *)( (u_int32_t *)((rule)->cmd) + ((rule)->act_ofs) ) #define RULESIZE(rule) (sizeof(*(rule)) + (rule)->cmd_len * 4 - 4) #if 1 // should be moved to in.h /* * This structure is used as a flow mask and a flow id for various * parts of the code. * addr_type is used in userland and kernel to mark the address type. * fib is used in the kernel to record the fib in use. * _flags is used in the kernel to store tcp flags for dynamic rules. */ struct ipfw_flow_id { uint32_t dst_ip; uint32_t src_ip; uint16_t dst_port; uint16_t src_port; uint8_t fib; /* XXX: must be uint16_t */ uint8_t proto; uint8_t _flags; /* protocol-specific flags */ uint8_t addr_type; /* 4=ip4, 6=ip6, 1=ether ? */ struct in6_addr dst_ip6; struct in6_addr src_ip6; uint32_t flow_id6; uint32_t extra; /* queue/pipe or frag_id */ }; #endif #define IS_IP4_FLOW_ID(id) ((id)->addr_type == 4) #define IS_IP6_FLOW_ID(id) ((id)->addr_type == 6) /* * Dynamic ipfw rule. */ typedef struct _ipfw_dyn_rule ipfw_dyn_rule; struct _ipfw_dyn_rule { ipfw_dyn_rule *next; /* linked list of rules. */ struct ip_fw *rule; /* pointer to rule */ /* 'rule' is used to pass up the rule number (from the parent) */ ipfw_dyn_rule *parent; /* pointer to parent rule */ u_int64_t pcnt; /* packet match counter */ u_int64_t bcnt; /* byte match counter */ struct ipfw_flow_id id; /* (masked) flow id */ u_int32_t expire; /* expire time */ u_int32_t bucket; /* which bucket in hash table */ u_int32_t state; /* state of this rule (typically a * combination of TCP flags) */ #define IPFW_DYN_ORPHANED 0x40000 /* state's parent rule was deleted */ u_int32_t ack_fwd; /* most recent ACKs in forward */ u_int32_t ack_rev; /* and reverse directions (used */ /* to generate keepalives) */ u_int16_t dyn_type; /* rule type */ u_int16_t count; /* refcount */ u_int16_t kidx; /* index of named object */ } __packed __aligned(8); /* * Definitions for IP option names. */ #define IP_FW_IPOPT_LSRR 0x01 #define IP_FW_IPOPT_SSRR 0x02 #define IP_FW_IPOPT_RR 0x04 #define IP_FW_IPOPT_TS 0x08 /* * Definitions for TCP option names. */ #define IP_FW_TCPOPT_MSS 0x01 #define IP_FW_TCPOPT_WINDOW 0x02 #define IP_FW_TCPOPT_SACK 0x04 #define IP_FW_TCPOPT_TS 0x08 #define IP_FW_TCPOPT_CC 0x10 #define ICMP_REJECT_RST 0x100 /* fake ICMP code (send a TCP RST) */ #define ICMP6_UNREACH_RST 0x100 /* fake ICMPv6 code (send a TCP RST) */ #define ICMP_REJECT_ABORT 0x101 /* fake ICMP code (send an SCTP ABORT) */ #define ICMP6_UNREACH_ABORT 0x101 /* fake ICMPv6 code (send an SCTP ABORT) */ /* * These are used for lookup tables. */ #define IPFW_TABLE_ADDR 1 /* Table for holding IPv4/IPv6 prefixes */ #define IPFW_TABLE_INTERFACE 2 /* Table for holding interface names */ #define IPFW_TABLE_NUMBER 3 /* Table for holding ports/uid/gid/etc */ #define IPFW_TABLE_FLOW 4 /* Table for holding flow data */ #define IPFW_TABLE_MAXTYPE 4 /* Maximum valid number */ #define IPFW_TABLE_CIDR IPFW_TABLE_ADDR /* compat */ /* Value types */ #define IPFW_VTYPE_LEGACY 0xFFFFFFFF /* All data is filled in */ #define IPFW_VTYPE_SKIPTO 0x00000001 /* skipto/call/callreturn */ #define IPFW_VTYPE_PIPE 0x00000002 /* pipe/queue */ #define IPFW_VTYPE_FIB 0x00000004 /* setfib */ #define IPFW_VTYPE_NAT 0x00000008 /* nat */ #define IPFW_VTYPE_DSCP 0x00000010 /* dscp */ #define IPFW_VTYPE_TAG 0x00000020 /* tag/untag */ #define IPFW_VTYPE_DIVERT 0x00000040 /* divert/tee */ #define IPFW_VTYPE_NETGRAPH 0x00000080 /* netgraph/ngtee */ #define IPFW_VTYPE_LIMIT 0x00000100 /* limit */ #define IPFW_VTYPE_NH4 0x00000200 /* IPv4 nexthop */ #define IPFW_VTYPE_NH6 0x00000400 /* IPv6 nexthop */ typedef struct _ipfw_table_entry { in_addr_t addr; /* network address */ u_int32_t value; /* value */ u_int16_t tbl; /* table number */ u_int8_t masklen; /* mask length */ } ipfw_table_entry; typedef struct _ipfw_table_xentry { uint16_t len; /* Total entry length */ uint8_t type; /* entry type */ uint8_t masklen; /* mask length */ uint16_t tbl; /* table number */ uint16_t flags; /* record flags */ uint32_t value; /* value */ union { /* Longest field needs to be aligned by 4-byte boundary */ struct in6_addr addr6; /* IPv6 address */ char iface[IF_NAMESIZE]; /* interface name */ } k; } ipfw_table_xentry; #define IPFW_TCF_INET 0x01 /* CIDR flags: IPv4 record */ typedef struct _ipfw_table { u_int32_t size; /* size of entries in bytes */ u_int32_t cnt; /* # of entries */ u_int16_t tbl; /* table number */ ipfw_table_entry ent[0]; /* entries */ } ipfw_table; typedef struct _ipfw_xtable { ip_fw3_opheader opheader; /* IP_FW3 opcode */ uint32_t size; /* size of entries in bytes */ uint32_t cnt; /* # of entries */ uint16_t tbl; /* table number */ uint8_t type; /* table type */ ipfw_table_xentry xent[0]; /* entries */ } ipfw_xtable; typedef struct _ipfw_obj_tlv { uint16_t type; /* TLV type */ uint16_t flags; /* TLV-specific flags */ uint32_t length; /* Total length, aligned to u64 */ } ipfw_obj_tlv; #define IPFW_TLV_TBL_NAME 1 #define IPFW_TLV_TBLNAME_LIST 2 #define IPFW_TLV_RULE_LIST 3 #define IPFW_TLV_DYNSTATE_LIST 4 #define IPFW_TLV_TBL_ENT 5 #define IPFW_TLV_DYN_ENT 6 #define IPFW_TLV_RULE_ENT 7 #define IPFW_TLV_TBLENT_LIST 8 #define IPFW_TLV_RANGE 9 #define IPFW_TLV_EACTION 10 #define IPFW_TLV_COUNTERS 11 #define IPFW_TLV_OBJDATA 12 #define IPFW_TLV_STATE_NAME 14 #define IPFW_TLV_EACTION_BASE 1000 #define IPFW_TLV_EACTION_NAME(arg) (IPFW_TLV_EACTION_BASE + (arg)) typedef struct _ipfw_obj_data { ipfw_obj_tlv head; void *data[0]; } ipfw_obj_data; /* Object name TLV */ typedef struct _ipfw_obj_ntlv { ipfw_obj_tlv head; /* TLV header */ uint16_t idx; /* Name index */ uint8_t set; /* set, if applicable */ uint8_t type; /* object type, if applicable */ uint32_t spare; /* unused */ char name[64]; /* Null-terminated name */ } ipfw_obj_ntlv; /* IPv4/IPv6 L4 flow description */ struct tflow_entry { uint8_t af; uint8_t proto; uint16_t spare; uint16_t sport; uint16_t dport; union { struct { struct in_addr sip; struct in_addr dip; } a4; struct { struct in6_addr sip6; struct in6_addr dip6; } a6; } a; }; typedef struct _ipfw_table_value { uint32_t tag; /* O_TAG/O_TAGGED */ uint32_t pipe; /* O_PIPE/O_QUEUE */ uint16_t divert; /* O_DIVERT/O_TEE */ uint16_t skipto; /* skipto, CALLRET */ uint32_t netgraph; /* O_NETGRAPH/O_NGTEE */ uint32_t fib; /* O_SETFIB */ uint32_t nat; /* O_NAT */ uint32_t nh4; uint8_t dscp; uint8_t spare0; uint16_t spare1; struct in6_addr nh6; uint32_t limit; /* O_LIMIT */ uint32_t zoneid; /* scope zone id for nh6 */ uint64_t reserved; } ipfw_table_value; /* Table entry TLV */ typedef struct _ipfw_obj_tentry { ipfw_obj_tlv head; /* TLV header */ uint8_t subtype; /* subtype (IPv4,IPv6) */ uint8_t masklen; /* mask length */ uint8_t result; /* request result */ uint8_t spare0; uint16_t idx; /* Table name index */ uint16_t spare1; union { /* Longest field needs to be aligned by 8-byte boundary */ struct in_addr addr; /* IPv4 address */ uint32_t key; /* uid/gid/port */ struct in6_addr addr6; /* IPv6 address */ char iface[IF_NAMESIZE]; /* interface name */ struct tflow_entry flow; } k; union { ipfw_table_value value; /* value data */ uint32_t kidx; /* value kernel index */ } v; } ipfw_obj_tentry; #define IPFW_TF_UPDATE 0x01 /* Update record if exists */ /* Container TLV */ #define IPFW_CTF_ATOMIC 0x01 /* Perform atomic operation */ /* Operation results */ #define IPFW_TR_IGNORED 0 /* Entry was ignored (rollback) */ #define IPFW_TR_ADDED 1 /* Entry was successfully added */ #define IPFW_TR_UPDATED 2 /* Entry was successfully updated*/ #define IPFW_TR_DELETED 3 /* Entry was successfully deleted*/ #define IPFW_TR_LIMIT 4 /* Entry was ignored (limit) */ #define IPFW_TR_NOTFOUND 5 /* Entry was not found */ #define IPFW_TR_EXISTS 6 /* Entry already exists */ #define IPFW_TR_ERROR 7 /* Request has failed (unknown) */ typedef struct _ipfw_obj_dyntlv { ipfw_obj_tlv head; ipfw_dyn_rule state; } ipfw_obj_dyntlv; #define IPFW_DF_LAST 0x01 /* Last state in chain */ /* Containter TLVs */ typedef struct _ipfw_obj_ctlv { ipfw_obj_tlv head; /* TLV header */ uint32_t count; /* Number of sub-TLVs */ uint16_t objsize; /* Single object size */ uint8_t version; /* TLV version */ uint8_t flags; /* TLV-specific flags */ } ipfw_obj_ctlv; /* Range TLV */ typedef struct _ipfw_range_tlv { ipfw_obj_tlv head; /* TLV header */ uint32_t flags; /* Range flags */ uint16_t start_rule; /* Range start */ uint16_t end_rule; /* Range end */ uint32_t set; /* Range set to match */ uint32_t new_set; /* New set to move/swap to */ } ipfw_range_tlv; #define IPFW_RCFLAG_RANGE 0x01 /* rule range is set */ #define IPFW_RCFLAG_ALL 0x02 /* match ALL rules */ #define IPFW_RCFLAG_SET 0x04 /* match rules in given set */ #define IPFW_RCFLAG_DYNAMIC 0x08 /* match only dynamic states */ /* User-settable flags */ #define IPFW_RCFLAG_USER (IPFW_RCFLAG_RANGE | IPFW_RCFLAG_ALL | \ IPFW_RCFLAG_SET | IPFW_RCFLAG_DYNAMIC) /* Internally used flags */ #define IPFW_RCFLAG_DEFAULT 0x0100 /* Do not skip defaul rule */ typedef struct _ipfw_ta_tinfo { uint32_t flags; /* Format flags */ uint32_t spare; uint8_t taclass4; /* algorithm class */ uint8_t spare4; uint16_t itemsize4; /* item size in runtime */ uint32_t size4; /* runtime structure size */ uint32_t count4; /* number of items in runtime */ uint8_t taclass6; /* algorithm class */ uint8_t spare6; uint16_t itemsize6; /* item size in runtime */ uint32_t size6; /* runtime structure size */ uint32_t count6; /* number of items in runtime */ } ipfw_ta_tinfo; #define IPFW_TACLASS_HASH 1 /* algo is based on hash */ #define IPFW_TACLASS_ARRAY 2 /* algo is based on array */ #define IPFW_TACLASS_RADIX 3 /* algo is based on radix tree */ #define IPFW_TATFLAGS_DATA 0x0001 /* Has data filled in */ #define IPFW_TATFLAGS_AFDATA 0x0002 /* Separate data per AF */ #define IPFW_TATFLAGS_AFITEM 0x0004 /* diff. items per AF */ typedef struct _ipfw_xtable_info { uint8_t type; /* table type (addr,iface,..) */ uint8_t tflags; /* type flags */ uint16_t mflags; /* modification flags */ uint16_t flags; /* generic table flags */ uint16_t spare[3]; uint32_t vmask; /* bitmask with value types */ uint32_t set; /* set table is in */ uint32_t kidx; /* kernel index */ uint32_t refcnt; /* number of references */ uint32_t count; /* Number of records */ uint32_t size; /* Total size of records(export)*/ uint32_t limit; /* Max number of records */ char tablename[64]; /* table name */ char algoname[64]; /* algorithm name */ ipfw_ta_tinfo ta_info; /* additional algo stats */ } ipfw_xtable_info; /* Generic table flags */ #define IPFW_TGFLAGS_LOCKED 0x01 /* Tables is locked from changes*/ /* Table type-specific flags */ #define IPFW_TFFLAG_SRCIP 0x01 #define IPFW_TFFLAG_DSTIP 0x02 #define IPFW_TFFLAG_SRCPORT 0x04 #define IPFW_TFFLAG_DSTPORT 0x08 #define IPFW_TFFLAG_PROTO 0x10 /* Table modification flags */ #define IPFW_TMFLAGS_LIMIT 0x0002 /* Change limit value */ #define IPFW_TMFLAGS_LOCK 0x0004 /* Change table lock state */ typedef struct _ipfw_iface_info { char ifname[64]; /* interface name */ uint32_t ifindex; /* interface index */ uint32_t flags; /* flags */ uint32_t refcnt; /* number of references */ uint32_t gencnt; /* number of changes */ uint64_t spare; } ipfw_iface_info; #define IPFW_IFFLAG_RESOLVED 0x01 /* Interface exists */ typedef struct _ipfw_ta_info { char algoname[64]; /* algorithm name */ uint32_t type; /* lookup type */ uint32_t flags; uint32_t refcnt; uint32_t spare0; uint64_t spare1; } ipfw_ta_info; typedef struct _ipfw_obj_header { ip_fw3_opheader opheader; /* IP_FW3 opcode */ uint32_t spare; uint16_t idx; /* object name index */ uint8_t objtype; /* object type */ uint8_t objsubtype; /* object subtype */ ipfw_obj_ntlv ntlv; /* object name tlv */ } ipfw_obj_header; typedef struct _ipfw_obj_lheader { ip_fw3_opheader opheader; /* IP_FW3 opcode */ uint32_t set_mask; /* disabled set mask */ uint32_t count; /* Total objects count */ uint32_t size; /* Total size (incl. header) */ uint32_t objsize; /* Size of one object */ } ipfw_obj_lheader; #define IPFW_CFG_GET_STATIC 0x01 #define IPFW_CFG_GET_STATES 0x02 #define IPFW_CFG_GET_COUNTERS 0x04 typedef struct _ipfw_cfg_lheader { ip_fw3_opheader opheader; /* IP_FW3 opcode */ uint32_t set_mask; /* enabled set mask */ uint32_t spare; uint32_t flags; /* Request flags */ uint32_t size; /* neded buffer size */ uint32_t start_rule; uint32_t end_rule; } ipfw_cfg_lheader; typedef struct _ipfw_range_header { ip_fw3_opheader opheader; /* IP_FW3 opcode */ ipfw_range_tlv range; } ipfw_range_header; typedef struct _ipfw_sopt_info { uint16_t opcode; uint8_t version; uint8_t dir; uint8_t spare; uint64_t refcnt; } ipfw_sopt_info; #endif /* _IPFW2_H */ diff --git a/sys/netinet/libalias/alias.h b/sys/netinet/libalias/alias.h index 671241212799..91351a9eb8b9 100644 --- a/sys/netinet/libalias/alias.h +++ b/sys/netinet/libalias/alias.h @@ -1,247 +1,248 @@ /* lint -save -library Flexelint comment for external headers */ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2001 Charles Mott * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * Alias.h defines the outside world interfaces for the packet aliasing * software. * * This software is placed into the public domain with no restrictions on its * distribution. */ #ifndef _ALIAS_H_ #define _ALIAS_H_ #include #include #include #define LIBALIAS_BUF_SIZE 128 #ifdef _KERNEL /* * The kernel version of libalias does not support these features. */ #define NO_FW_PUNCH #define NO_USE_SOCKETS #endif /* * The external interface to libalias, the packet aliasing engine. * * There are two sets of functions: * * PacketAlias*() the old API which doesn't take an instance pointer * and therefore can only have one packet engine at a time. * * LibAlias*() the new API which takes as first argument a pointer to * the instance of the packet aliasing engine. * * The functions otherwise correspond to each other one for one, except * for the LibAliasUnaliasOut()/PacketUnaliasOut() function which were * were misnamed in the old API. */ /* * The instance structure */ struct libalias; /* * An anonymous structure, a pointer to which is returned from * PacketAliasRedirectAddr(), PacketAliasRedirectPort() or * PacketAliasRedirectProto(), passed to PacketAliasAddServer(), * and freed by PacketAliasRedirectDelete(). */ struct alias_link; /* Initialization and control functions. */ struct libalias *LibAliasInit(struct libalias *); void LibAliasSetAddress(struct libalias *, struct in_addr _addr); +void LibAliasSetAliasPortRange(struct libalias *la, u_short port_low, u_short port_hi); void LibAliasSetFWBase(struct libalias *, unsigned int _base, unsigned int _num); void LibAliasSetSkinnyPort(struct libalias *, unsigned int _port); unsigned int LibAliasSetMode(struct libalias *, unsigned int _flags, unsigned int _mask); void LibAliasUninit(struct libalias *); /* Packet Handling functions. */ int LibAliasIn (struct libalias *, void *_ptr, int _maxpacketsize); int LibAliasOut(struct libalias *, void *_ptr, int _maxpacketsize); int LibAliasOutTry(struct libalias *, void *_ptr, int _maxpacketsize, int _create); int LibAliasUnaliasOut(struct libalias *, void *_ptr, int _maxpacketsize); /* Port and address redirection functions. */ int LibAliasAddServer(struct libalias *, struct alias_link *_lnk, struct in_addr _addr, unsigned short _port); struct alias_link * LibAliasRedirectAddr(struct libalias *, struct in_addr _src_addr, struct in_addr _alias_addr); int LibAliasRedirectDynamic(struct libalias *, struct alias_link *_lnk); void LibAliasRedirectDelete(struct libalias *, struct alias_link *_lnk); struct alias_link * LibAliasRedirectPort(struct libalias *, struct in_addr _src_addr, unsigned short _src_port, struct in_addr _dst_addr, unsigned short _dst_port, struct in_addr _alias_addr, unsigned short _alias_port, unsigned char _proto); struct alias_link * LibAliasRedirectProto(struct libalias *, struct in_addr _src_addr, struct in_addr _dst_addr, struct in_addr _alias_addr, unsigned char _proto); /* Fragment Handling functions. */ void LibAliasFragmentIn(struct libalias *, void *_ptr, void *_ptr_fragment); void *LibAliasGetFragment(struct libalias *, void *_ptr); int LibAliasSaveFragment(struct libalias *, void *_ptr); /* Miscellaneous functions. */ int LibAliasCheckNewLink(struct libalias *); unsigned short LibAliasInternetChecksum(struct libalias *, unsigned short *_ptr, int _nbytes); void LibAliasSetTarget(struct libalias *, struct in_addr _target_addr); /* Transparent proxying routines. */ int LibAliasProxyRule(struct libalias *, const char *_cmd); /* Module handling API */ int LibAliasLoadModule(char *); int LibAliasUnLoadAllModule(void); int LibAliasRefreshModules(void); /* Mbuf helper function. */ struct mbuf *m_megapullup(struct mbuf *, int); /* * Mode flags and other constants. */ /* Mode flags, set using PacketAliasSetMode() */ /* * If PKT_ALIAS_LOG is set, a message will be printed to /var/log/alias.log * every time a link is created or deleted. This is useful for debugging. */ #define PKT_ALIAS_LOG 0x01 /* * If PKT_ALIAS_DENY_INCOMING is set, then incoming connections (e.g. to ftp, * telnet or web servers will be prevented by the aliasing mechanism. */ #define PKT_ALIAS_DENY_INCOMING 0x02 /* * If PKT_ALIAS_SAME_PORTS is set, packets will be attempted sent from the * same port as they originated on. This allows e.g. rsh to work *99% of the * time*, but _not_ 100% (it will be slightly flakey instead of not working * at all). This mode bit is set by PacketAliasInit(), so it is a default * mode of operation. */ #define PKT_ALIAS_SAME_PORTS 0x04 /* * If PKT_ALIAS_USE_SOCKETS is set, then when partially specified links (e.g. * destination port and/or address is zero), the packet aliasing engine will * attempt to allocate a socket for the aliasing port it chooses. This will * avoid interference with the host machine. Fully specified links do not * require this. This bit is set after a call to PacketAliasInit(), so it is * a default mode of operation. */ #ifndef NO_USE_SOCKETS #define PKT_ALIAS_USE_SOCKETS 0x08 #endif /*- * If PKT_ALIAS_UNREGISTERED_ONLY is set, then only packets with * unregistered source addresses will be aliased. Private * addresses are those in the following ranges: * * 10.0.0.0 -> 10.255.255.255 * 172.16.0.0 -> 172.31.255.255 * 192.168.0.0 -> 192.168.255.255 */ #define PKT_ALIAS_UNREGISTERED_ONLY 0x10 /* * If PKT_ALIAS_RESET_ON_ADDR_CHANGE is set, then the table of dynamic * aliasing links will be reset whenever PacketAliasSetAddress() changes the * default aliasing address. If the default aliasing address is left * unchanged by this function call, then the table of dynamic aliasing links * will be left intact. This bit is set after a call to PacketAliasInit(). */ #define PKT_ALIAS_RESET_ON_ADDR_CHANGE 0x20 /* * If PKT_ALIAS_PROXY_ONLY is set, then NAT will be disabled and only * transparent proxying is performed. */ #define PKT_ALIAS_PROXY_ONLY 0x40 /* * If PKT_ALIAS_REVERSE is set, the actions of PacketAliasIn() and * PacketAliasOut() are reversed. */ #define PKT_ALIAS_REVERSE 0x80 #ifndef NO_FW_PUNCH /* * If PKT_ALIAS_PUNCH_FW is set, active FTP and IRC DCC connections will * create a 'hole' in the firewall to allow the transfers to work. The * ipfw rule number that the hole is created with is controlled by * PacketAliasSetFWBase(). The hole will be attached to that * particular alias_link, so when the link goes away the hole is deleted. */ #define PKT_ALIAS_PUNCH_FW 0x100 #endif /* * If PKT_ALIAS_SKIP_GLOBAL is set, nat instance is not checked for matching * states in 'ipfw nat global' rule. */ #define PKT_ALIAS_SKIP_GLOBAL 0x200 /* * Like PKT_ALIAS_UNREGISTERED_ONLY, but includes the RFC 6598 * (Carrier Grade NAT) address range as follows: * * 100.64.0.0 -> 100.127.255.255 */ #define PKT_ALIAS_UNREGISTERED_CGN 0x400 /* Function return codes. */ #define PKT_ALIAS_ERROR -1 #define PKT_ALIAS_OK 1 #define PKT_ALIAS_IGNORED 2 #define PKT_ALIAS_UNRESOLVED_FRAGMENT 3 #define PKT_ALIAS_FOUND_HEADER_FRAGMENT 4 #endif /* !_ALIAS_H_ */ /* lint -restore */ diff --git a/sys/netinet/libalias/alias_db.c b/sys/netinet/libalias/alias_db.c index 8da9a7fe683a..1f85a606b2d5 100644 --- a/sys/netinet/libalias/alias_db.c +++ b/sys/netinet/libalias/alias_db.c @@ -1,2850 +1,2874 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2001 Charles Mott * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* Alias_db.c encapsulates all data structures used for storing packet aliasing data. Other parts of the aliasing software access data through functions provided in this file. Data storage is based on the notion of a "link", which is established for ICMP echo/reply packets, UDP datagrams and TCP stream connections. A link stores the original source and destination addresses. For UDP and TCP, it also stores source and destination port numbers, as well as an alias port number. Links are also used to store information about fragments. There is a facility for sweeping through and deleting old links as new packets are sent through. A simple timeout is used for ICMP and UDP links. TCP links are left alone unless there is an incomplete connection, in which case the link can be deleted after a certain amount of time. Initial version: August, 1996 (cjm) Version 1.4: September 16, 1996 (cjm) Facility for handling incoming links added. Version 1.6: September 18, 1996 (cjm) ICMP data handling simplified. Version 1.7: January 9, 1997 (cjm) Fragment handling simplified. Saves pointers for unresolved fragments. Permits links for unspecified remote ports or unspecified remote addresses. Fixed bug which did not properly zero port table entries after a link was deleted. Cleaned up some obsolete comments. Version 1.8: January 14, 1997 (cjm) Fixed data type error in StartPoint(). (This error did not exist prior to v1.7 and was discovered and fixed by Ari Suutari) Version 1.9: February 1, 1997 Optionally, connections initiated from packet aliasing host machine will will not have their port number aliased unless it conflicts with an aliasing port already being used. (cjm) All options earlier being #ifdef'ed are now available through a new interface, SetPacketAliasMode(). This allows run time control (which is now available in PPP+pktAlias through the 'alias' keyword). (ee) Added ability to create an alias port without either destination address or port specified. port type = ALIAS_PORT_UNKNOWN_DEST_ALL (ee) Removed K&R style function headers and general cleanup. (ee) Added packetAliasMode to replace compiler #defines's (ee) Allocates sockets for partially specified ports if ALIAS_USE_SOCKETS defined. (cjm) Version 2.0: March, 1997 SetAliasAddress() will now clean up alias links if the aliasing address is changed. (cjm) PacketAliasPermanentLink() function added to support permanent links. (J. Fortes suggested the need for this.) Examples: (192.168.0.1, port 23) <-> alias port 6002, unknown dest addr/port (192.168.0.2, port 21) <-> alias port 3604, known dest addr unknown dest port These permanent links allow for incoming connections to machines on the local network. They can be given with a user-chosen amount of specificity, with increasing specificity meaning more security. (cjm) Quite a bit of rework to the basic engine. The portTable[] array, which kept track of which ports were in use was replaced by a table/linked list structure. (cjm) SetExpire() function added. (cjm) DeleteLink() no longer frees memory association with a pointer to a fragment (this bug was first recognized by E. Eklund in v1.9). Version 2.1: May, 1997 (cjm) Packet aliasing engine reworked so that it can handle multiple external addresses rather than just a single host address. PacketAliasRedirectPort() and PacketAliasRedirectAddr() added to the API. The first function is a more generalized version of PacketAliasPermanentLink(). The second function implements static network address translation. Version 3.2: July, 2000 (salander and satoh) Added FindNewPortGroup to get contiguous range of port values. Added QueryUdpTcpIn and QueryUdpTcpOut to look for an aliasing link but not actually add one. Added FindRtspOut, which is closely derived from FindUdpTcpOut, except that the alias port (from FindNewPortGroup) is provided as input. See HISTORY file for additional revisions. */ #ifdef _KERNEL #include #include #include #include #include #include #include #include #else #include #include #include #include #include #include #endif #include #include #ifdef _KERNEL #include #include #include #include #else #include "alias.h" #include "alias_local.h" #include "alias_mod.h" #endif static LIST_HEAD(, libalias) instancehead = LIST_HEAD_INITIALIZER(instancehead); /* Constants (note: constants are also defined near relevant functions or structs) */ /* Parameters used for cleanup of expired links */ /* NOTE: ALIAS_CLEANUP_INTERVAL_SECS must be less then LINK_TABLE_OUT_SIZE */ #define ALIAS_CLEANUP_INTERVAL_SECS 64 #define ALIAS_CLEANUP_MAX_SPOKES (LINK_TABLE_OUT_SIZE/5) /* Timeouts (in seconds) for different link types */ #define ICMP_EXPIRE_TIME 60 #define UDP_EXPIRE_TIME 60 #define PROTO_EXPIRE_TIME 60 #define FRAGMENT_ID_EXPIRE_TIME 10 #define FRAGMENT_PTR_EXPIRE_TIME 30 /* TCP link expire time for different cases */ /* When the link has been used and closed - minimal grace time to allow ACKs and potential re-connect in FTP (XXX - is this allowed?) */ #ifndef TCP_EXPIRE_DEAD #define TCP_EXPIRE_DEAD 10 #endif /* When the link has been used and closed on one side - the other side is allowed to still send data */ #ifndef TCP_EXPIRE_SINGLEDEAD #define TCP_EXPIRE_SINGLEDEAD 90 #endif /* When the link isn't yet up */ #ifndef TCP_EXPIRE_INITIAL #define TCP_EXPIRE_INITIAL 300 #endif /* When the link is up */ #ifndef TCP_EXPIRE_CONNECTED #define TCP_EXPIRE_CONNECTED 86400 #endif /* Dummy port number codes used for FindLinkIn/Out() and AddLink(). These constants can be anything except zero, which indicates an unknown port number. */ #define NO_DEST_PORT 1 #define NO_SRC_PORT 1 /* Data Structures The fundamental data structure used in this program is "struct alias_link". Whenever a TCP connection is made, a UDP datagram is sent out, or an ICMP echo request is made, a link record is made (if it has not already been created). The link record is identified by the source address/port and the destination address/port. In the case of an ICMP echo request, the source port is treated as being equivalent with the 16-bit ID number of the ICMP packet. The link record also can store some auxiliary data. For TCP connections that have had sequence and acknowledgment modifications, data space is available to track these changes. A state field is used to keep track in changes to the TCP connection state. ID numbers of fragments can also be stored in the auxiliary space. Pointers to unresolved fragments can also be stored. The link records support two independent chainings. Lookup tables for input and out tables hold the initial pointers the link chains. On input, the lookup table indexes on alias port and link type. On output, the lookup table indexes on source address, destination address, source port, destination port and link type. */ struct ack_data_record { /* used to save changes to ACK/sequence * numbers */ u_long ack_old; u_long ack_new; int delta; int active; }; struct tcp_state { /* Information about TCP connection */ int in; /* State for outside -> inside */ int out; /* State for inside -> outside */ int index; /* Index to ACK data array */ int ack_modified; /* Indicates whether ACK and * sequence numbers */ /* been modified */ }; #define N_LINK_TCP_DATA 3 /* Number of distinct ACK number changes * saved for a modified TCP stream */ struct tcp_dat { struct tcp_state state; struct ack_data_record ack[N_LINK_TCP_DATA]; int fwhole; /* Which firewall record is used for this * hole? */ }; struct server { /* LSNAT server pool (circular list) */ struct in_addr addr; u_short port; struct server *next; }; struct alias_link { /* Main data structure */ struct libalias *la; struct in_addr src_addr; /* Address and port information */ struct in_addr dst_addr; struct in_addr alias_addr; struct in_addr proxy_addr; u_short src_port; u_short dst_port; u_short alias_port; u_short proxy_port; struct server *server; int link_type; /* Type of link: TCP, UDP, ICMP, * proto, frag */ /* values for link_type */ #define LINK_ICMP IPPROTO_ICMP #define LINK_UDP IPPROTO_UDP #define LINK_TCP IPPROTO_TCP #define LINK_FRAGMENT_ID (IPPROTO_MAX + 1) #define LINK_FRAGMENT_PTR (IPPROTO_MAX + 2) #define LINK_ADDR (IPPROTO_MAX + 3) #define LINK_PPTP (IPPROTO_MAX + 4) int flags; /* indicates special characteristics */ int pflags; /* protocol-specific flags */ /* flag bits */ #define LINK_UNKNOWN_DEST_PORT 0x01 #define LINK_UNKNOWN_DEST_ADDR 0x02 #define LINK_PERMANENT 0x04 #define LINK_PARTIALLY_SPECIFIED 0x03 /* logical-or of first two bits */ #define LINK_UNFIREWALLED 0x08 int timestamp; /* Time link was last accessed */ int expire_time; /* Expire time for link */ #ifndef NO_USE_SOCKETS int sockfd; /* socket descriptor */ #endif LIST_ENTRY (alias_link) list_out; /* Linked list of * pointers for */ LIST_ENTRY (alias_link) list_in; /* input and output * lookup tables */ union { /* Auxiliary data */ char *frag_ptr; struct in_addr frag_addr; struct tcp_dat *tcp; } data; }; /* Clean up procedure. */ static void finishoff(void); /* Kernel module definition. */ #ifdef _KERNEL MALLOC_DEFINE(M_ALIAS, "libalias", "packet aliasing"); MODULE_VERSION(libalias, 1); static int alias_mod_handler(module_t mod, int type, void *data) { switch (type) { case MOD_QUIESCE: case MOD_UNLOAD: finishoff(); case MOD_LOAD: return (0); default: return (EINVAL); } } static moduledata_t alias_mod = { "alias", alias_mod_handler, NULL }; DECLARE_MODULE(alias, alias_mod, SI_SUB_DRIVERS, SI_ORDER_SECOND); #endif /* Internal utility routines (used only in alias_db.c) Lookup table starting points: StartPointIn() -- link table initial search point for incoming packets StartPointOut() -- link table initial search point for outgoing packets Miscellaneous: SeqDiff() -- difference between two TCP sequences ShowAliasStats() -- send alias statistics to a monitor file */ /* Local prototypes */ static u_int StartPointIn(struct in_addr, u_short, int); static u_int StartPointOut(struct in_addr, struct in_addr, u_short, u_short, int); static int SeqDiff(u_long, u_long); #ifndef NO_FW_PUNCH /* Firewall control */ static void InitPunchFW(struct libalias *); static void UninitPunchFW(struct libalias *); static void ClearFWHole(struct alias_link *); #endif /* Log file control */ static void ShowAliasStats(struct libalias *); static int InitPacketAliasLog(struct libalias *); static void UninitPacketAliasLog(struct libalias *); void SctpShowAliasStats(struct libalias *la); static u_int StartPointIn(struct in_addr alias_addr, u_short alias_port, int link_type) { u_int n; n = alias_addr.s_addr; if (link_type != LINK_PPTP) n += alias_port; n += link_type; return (n % LINK_TABLE_IN_SIZE); } static u_int StartPointOut(struct in_addr src_addr, struct in_addr dst_addr, u_short src_port, u_short dst_port, int link_type) { u_int n; n = src_addr.s_addr; n += dst_addr.s_addr; if (link_type != LINK_PPTP) { n += src_port; n += dst_port; } n += link_type; return (n % LINK_TABLE_OUT_SIZE); } static int SeqDiff(u_long x, u_long y) { /* Return the difference between two TCP sequence numbers */ /* This function is encapsulated in case there are any unusual arithmetic conditions that need to be considered. */ return (ntohl(y) - ntohl(x)); } #ifdef _KERNEL static void AliasLog(char *str, const char *format, ...) { va_list ap; va_start(ap, format); vsnprintf(str, LIBALIAS_BUF_SIZE, format, ap); va_end(ap); } #else static void AliasLog(FILE *stream, const char *format, ...) { va_list ap; va_start(ap, format); vfprintf(stream, format, ap); va_end(ap); fflush(stream); } #endif static void ShowAliasStats(struct libalias *la) { LIBALIAS_LOCK_ASSERT(la); /* Used for debugging */ if (la->logDesc) { int tot = la->icmpLinkCount + la->udpLinkCount + (la->sctpLinkCount>>1) + /* sctp counts half associations */ la->tcpLinkCount + la->pptpLinkCount + la->protoLinkCount + la->fragmentIdLinkCount + la->fragmentPtrLinkCount; AliasLog(la->logDesc, "icmp=%u, udp=%u, tcp=%u, sctp=%u, pptp=%u, proto=%u, frag_id=%u frag_ptr=%u / tot=%u", la->icmpLinkCount, la->udpLinkCount, la->tcpLinkCount, la->sctpLinkCount>>1, /* sctp counts half associations */ la->pptpLinkCount, la->protoLinkCount, la->fragmentIdLinkCount, la->fragmentPtrLinkCount, tot); #ifndef _KERNEL AliasLog(la->logDesc, " (sock=%u)\n", la->sockCount); #endif } } void SctpShowAliasStats(struct libalias *la) { ShowAliasStats(la); } /* Internal routines for finding, deleting and adding links Port Allocation: GetNewPort() -- find and reserve new alias port number GetSocket() -- try to allocate a socket for a given port Link creation and deletion: CleanupAliasData() - remove all link chains from lookup table IncrementalCleanup() - look for stale links in a single chain DeleteLink() - remove link AddLink() - add link ReLink() - change link Link search: FindLinkOut() - find link for outgoing packets FindLinkIn() - find link for incoming packets Port search: FindNewPortGroup() - find an available group of ports */ /* Local prototypes */ static int GetNewPort(struct libalias *, struct alias_link *, int); #ifndef NO_USE_SOCKETS static u_short GetSocket(struct libalias *, u_short, int *, int); #endif static void CleanupAliasData(struct libalias *); static void IncrementalCleanup(struct libalias *); static void DeleteLink(struct alias_link *); static struct alias_link * ReLink(struct alias_link *, struct in_addr, struct in_addr, struct in_addr, u_short, u_short, int, int); static struct alias_link * FindLinkOut (struct libalias *, struct in_addr, struct in_addr, u_short, u_short, int, int); static struct alias_link * FindLinkIn (struct libalias *, struct in_addr, struct in_addr, u_short, u_short, int, int); #define ALIAS_PORT_BASE 0x08000 #define ALIAS_PORT_MASK 0x07fff #define ALIAS_PORT_MASK_EVEN 0x07ffe #define GET_NEW_PORT_MAX_ATTEMPTS 20 #define FIND_EVEN_ALIAS_BASE 1 /* GetNewPort() allocates port numbers. Note that if a port number is already in use, that does not mean that it cannot be used by another link concurrently. This is because GetNewPort() looks for unused triplets: (dest addr, dest port, alias port). */ static int GetNewPort(struct libalias *la, struct alias_link *lnk, int alias_port_param) { int i; int max_trials; u_short port_sys; u_short port_net; LIBALIAS_LOCK_ASSERT(la); /* Description of alias_port_param for GetNewPort(). When this parameter is zero or positive, it precisely specifies the port number. GetNewPort() will return this number without check that it is in use. When this parameter is GET_ALIAS_PORT, it indicates to get a randomly selected port number. */ if (alias_port_param == GET_ALIAS_PORT) { /* * The aliasing port is automatically selected by one of * two methods below: */ max_trials = GET_NEW_PORT_MAX_ATTEMPTS; if (la->packetAliasMode & PKT_ALIAS_SAME_PORTS) { /* * When the PKT_ALIAS_SAME_PORTS option is chosen, * the first try will be the actual source port. If * this is already in use, the remainder of the * trials will be random. */ port_net = lnk->src_port; port_sys = ntohs(port_net); + } else if (la->aliasPortLower) { + /* First trial is a random port in the aliasing range. */ + port_sys = la->aliasPortLower + + (arc4random() % la->aliasPortLength); + port_net = htons(port_sys); } else { /* First trial and all subsequent are random. */ port_sys = arc4random() & ALIAS_PORT_MASK; port_sys += ALIAS_PORT_BASE; port_net = htons(port_sys); } } else if (alias_port_param >= 0 && alias_port_param < 0x10000) { lnk->alias_port = (u_short) alias_port_param; return (0); } else { #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/GetNewPort(): "); fprintf(stderr, "input parameter error\n"); #endif return (-1); } /* Port number search */ for (i = 0; i < max_trials; i++) { int go_ahead; struct alias_link *search_result; search_result = FindLinkIn(la, lnk->dst_addr, lnk->alias_addr, lnk->dst_port, port_net, lnk->link_type, 0); if (search_result == NULL) go_ahead = 1; else if (!(lnk->flags & LINK_PARTIALLY_SPECIFIED) && (search_result->flags & LINK_PARTIALLY_SPECIFIED)) go_ahead = 1; else go_ahead = 0; if (go_ahead) { #ifndef NO_USE_SOCKETS if ((la->packetAliasMode & PKT_ALIAS_USE_SOCKETS) && (lnk->flags & LINK_PARTIALLY_SPECIFIED) && ((lnk->link_type == LINK_TCP) || (lnk->link_type == LINK_UDP))) { if (GetSocket(la, port_net, &lnk->sockfd, lnk->link_type)) { lnk->alias_port = port_net; return (0); } } else { #endif lnk->alias_port = port_net; return (0); #ifndef NO_USE_SOCKETS } #endif } - port_sys = arc4random() & ALIAS_PORT_MASK; - port_sys += ALIAS_PORT_BASE; - port_net = htons(port_sys); + if (la->aliasPortLower) { + port_sys = la->aliasPortLower + + (arc4random() % la->aliasPortLength); + port_net = htons(port_sys); + } else { + port_sys = arc4random() & ALIAS_PORT_MASK; + port_sys += ALIAS_PORT_BASE; + port_net = htons(port_sys); + } } #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/GetNewPort(): "); fprintf(stderr, "could not find free port\n"); #endif return (-1); } #ifndef NO_USE_SOCKETS static u_short GetSocket(struct libalias *la, u_short port_net, int *sockfd, int link_type) { int err; int sock; struct sockaddr_in sock_addr; LIBALIAS_LOCK_ASSERT(la); if (link_type == LINK_TCP) sock = socket(AF_INET, SOCK_STREAM, 0); else if (link_type == LINK_UDP) sock = socket(AF_INET, SOCK_DGRAM, 0); else { #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/GetSocket(): "); fprintf(stderr, "incorrect link type\n"); #endif return (0); } if (sock < 0) { #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/GetSocket(): "); fprintf(stderr, "socket() error %d\n", *sockfd); #endif return (0); } sock_addr.sin_family = AF_INET; sock_addr.sin_addr.s_addr = htonl(INADDR_ANY); sock_addr.sin_port = port_net; err = bind(sock, (struct sockaddr *)&sock_addr, sizeof(sock_addr)); if (err == 0) { la->sockCount++; *sockfd = sock; return (1); } else { close(sock); return (0); } } #endif /* FindNewPortGroup() returns a base port number for an available range of contiguous port numbers. Note that if a port number is already in use, that does not mean that it cannot be used by another link concurrently. This is because FindNewPortGroup() looks for unused triplets: (dest addr, dest port, alias port). */ int FindNewPortGroup(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_short src_port, u_short dst_port, u_short port_count, u_char proto, u_char align) { int i, j; int max_trials; u_short port_sys; int link_type; LIBALIAS_LOCK_ASSERT(la); /* * Get link_type from protocol */ switch (proto) { case IPPROTO_UDP: link_type = LINK_UDP; break; case IPPROTO_TCP: link_type = LINK_TCP; break; default: return (0); break; } /* * The aliasing port is automatically selected by one of two * methods below: */ max_trials = GET_NEW_PORT_MAX_ATTEMPTS; if (la->packetAliasMode & PKT_ALIAS_SAME_PORTS) { /* * When the ALIAS_SAME_PORTS option is chosen, the first * try will be the actual source port. If this is already * in use, the remainder of the trials will be random. */ port_sys = ntohs(src_port); } else { /* First trial and all subsequent are random. */ if (align == FIND_EVEN_ALIAS_BASE) port_sys = arc4random() & ALIAS_PORT_MASK_EVEN; else port_sys = arc4random() & ALIAS_PORT_MASK; port_sys += ALIAS_PORT_BASE; } /* Port number search */ for (i = 0; i < max_trials; i++) { struct alias_link *search_result; for (j = 0; j < port_count; j++) if ((search_result = FindLinkIn(la, dst_addr, alias_addr, dst_port, htons(port_sys + j), link_type, 0)) != NULL) break; /* Found a good range, return base */ if (j == port_count) return (htons(port_sys)); /* Find a new base to try */ if (align == FIND_EVEN_ALIAS_BASE) port_sys = arc4random() & ALIAS_PORT_MASK_EVEN; else port_sys = arc4random() & ALIAS_PORT_MASK; port_sys += ALIAS_PORT_BASE; } #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/FindNewPortGroup(): "); fprintf(stderr, "could not find free port(s)\n"); #endif return (0); } static void CleanupAliasData(struct libalias *la) { struct alias_link *lnk; int i; LIBALIAS_LOCK_ASSERT(la); for (i = 0; i < LINK_TABLE_OUT_SIZE; i++) { lnk = LIST_FIRST(&la->linkTableOut[i]); while (lnk != NULL) { struct alias_link *link_next = LIST_NEXT(lnk, list_out); DeleteLink(lnk); lnk = link_next; } } la->cleanupIndex = 0; } static void IncrementalCleanup(struct libalias *la) { struct alias_link *lnk, *lnk_tmp; LIBALIAS_LOCK_ASSERT(la); LIST_FOREACH_SAFE(lnk, &la->linkTableOut[la->cleanupIndex++], list_out, lnk_tmp) { if (la->timeStamp - lnk->timestamp > lnk->expire_time) DeleteLink(lnk); } if (la->cleanupIndex == LINK_TABLE_OUT_SIZE) la->cleanupIndex = 0; } static void DeleteLink(struct alias_link *lnk) { struct libalias *la = lnk->la; LIBALIAS_LOCK_ASSERT(la); /* Don't do anything if the link is marked permanent */ if (la->deleteAllLinks == 0 && lnk->flags & LINK_PERMANENT) return; #ifndef NO_FW_PUNCH /* Delete associated firewall hole, if any */ ClearFWHole(lnk); #endif /* Free memory allocated for LSNAT server pool */ if (lnk->server != NULL) { struct server *head, *curr, *next; head = curr = lnk->server; do { next = curr->next; free(curr); } while ((curr = next) != head); } /* Adjust output table pointers */ LIST_REMOVE(lnk, list_out); /* Adjust input table pointers */ LIST_REMOVE(lnk, list_in); #ifndef NO_USE_SOCKETS /* Close socket, if one has been allocated */ if (lnk->sockfd != -1) { la->sockCount--; close(lnk->sockfd); } #endif /* Link-type dependent cleanup */ switch (lnk->link_type) { case LINK_ICMP: la->icmpLinkCount--; break; case LINK_UDP: la->udpLinkCount--; break; case LINK_TCP: la->tcpLinkCount--; free(lnk->data.tcp); break; case LINK_PPTP: la->pptpLinkCount--; break; case LINK_FRAGMENT_ID: la->fragmentIdLinkCount--; break; case LINK_FRAGMENT_PTR: la->fragmentPtrLinkCount--; if (lnk->data.frag_ptr != NULL) free(lnk->data.frag_ptr); break; case LINK_ADDR: break; default: la->protoLinkCount--; break; } /* Free memory */ free(lnk); /* Write statistics, if logging enabled */ if (la->packetAliasMode & PKT_ALIAS_LOG) { ShowAliasStats(la); } } struct alias_link * AddLink(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, struct in_addr alias_addr, u_short src_port, u_short dst_port, int alias_port_param, int link_type) { u_int start_point; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = malloc(sizeof(struct alias_link)); if (lnk != NULL) { /* Basic initialization */ lnk->la = la; lnk->src_addr = src_addr; lnk->dst_addr = dst_addr; lnk->alias_addr = alias_addr; lnk->proxy_addr.s_addr = INADDR_ANY; lnk->src_port = src_port; lnk->dst_port = dst_port; lnk->proxy_port = 0; lnk->server = NULL; lnk->link_type = link_type; #ifndef NO_USE_SOCKETS lnk->sockfd = -1; #endif lnk->flags = 0; lnk->pflags = 0; lnk->timestamp = la->timeStamp; /* Expiration time */ switch (link_type) { case LINK_ICMP: lnk->expire_time = ICMP_EXPIRE_TIME; break; case LINK_UDP: lnk->expire_time = UDP_EXPIRE_TIME; break; case LINK_TCP: lnk->expire_time = TCP_EXPIRE_INITIAL; break; case LINK_PPTP: lnk->flags |= LINK_PERMANENT; /* no timeout. */ break; case LINK_FRAGMENT_ID: lnk->expire_time = FRAGMENT_ID_EXPIRE_TIME; break; case LINK_FRAGMENT_PTR: lnk->expire_time = FRAGMENT_PTR_EXPIRE_TIME; break; case LINK_ADDR: break; default: lnk->expire_time = PROTO_EXPIRE_TIME; break; } /* Determine alias flags */ if (dst_addr.s_addr == INADDR_ANY) lnk->flags |= LINK_UNKNOWN_DEST_ADDR; if (dst_port == 0) lnk->flags |= LINK_UNKNOWN_DEST_PORT; /* Determine alias port */ if (GetNewPort(la, lnk, alias_port_param) != 0) { free(lnk); return (NULL); } /* Link-type dependent initialization */ switch (link_type) { struct tcp_dat *aux_tcp; case LINK_ICMP: la->icmpLinkCount++; break; case LINK_UDP: la->udpLinkCount++; break; case LINK_TCP: aux_tcp = malloc(sizeof(struct tcp_dat)); if (aux_tcp != NULL) { int i; la->tcpLinkCount++; aux_tcp->state.in = ALIAS_TCP_STATE_NOT_CONNECTED; aux_tcp->state.out = ALIAS_TCP_STATE_NOT_CONNECTED; aux_tcp->state.index = 0; aux_tcp->state.ack_modified = 0; for (i = 0; i < N_LINK_TCP_DATA; i++) aux_tcp->ack[i].active = 0; aux_tcp->fwhole = -1; lnk->data.tcp = aux_tcp; } else { #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/AddLink: "); fprintf(stderr, " cannot allocate auxiliary TCP data\n"); #endif free(lnk); return (NULL); } break; case LINK_PPTP: la->pptpLinkCount++; break; case LINK_FRAGMENT_ID: la->fragmentIdLinkCount++; break; case LINK_FRAGMENT_PTR: la->fragmentPtrLinkCount++; break; case LINK_ADDR: break; default: la->protoLinkCount++; break; } /* Set up pointers for output lookup table */ start_point = StartPointOut(src_addr, dst_addr, src_port, dst_port, link_type); LIST_INSERT_HEAD(&la->linkTableOut[start_point], lnk, list_out); /* Set up pointers for input lookup table */ start_point = StartPointIn(alias_addr, lnk->alias_port, link_type); LIST_INSERT_HEAD(&la->linkTableIn[start_point], lnk, list_in); } else { #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/AddLink(): "); fprintf(stderr, "malloc() call failed.\n"); #endif } if (la->packetAliasMode & PKT_ALIAS_LOG) { ShowAliasStats(la); } return (lnk); } static struct alias_link * ReLink(struct alias_link *old_lnk, struct in_addr src_addr, struct in_addr dst_addr, struct in_addr alias_addr, u_short src_port, u_short dst_port, int alias_port_param, /* if less than zero, alias */ int link_type) { /* port will be automatically *//* chosen. * If greater than */ struct alias_link *new_lnk; /* zero, equal to alias port */ struct libalias *la = old_lnk->la; LIBALIAS_LOCK_ASSERT(la); new_lnk = AddLink(la, src_addr, dst_addr, alias_addr, src_port, dst_port, alias_port_param, link_type); #ifndef NO_FW_PUNCH if (new_lnk != NULL && old_lnk->link_type == LINK_TCP && old_lnk->data.tcp->fwhole > 0) { PunchFWHole(new_lnk); } #endif DeleteLink(old_lnk); return (new_lnk); } static struct alias_link * _FindLinkOut(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_short src_port, u_short dst_port, int link_type, int replace_partial_links) { u_int i; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); i = StartPointOut(src_addr, dst_addr, src_port, dst_port, link_type); LIST_FOREACH(lnk, &la->linkTableOut[i], list_out) { if (lnk->dst_addr.s_addr == dst_addr.s_addr && lnk->src_addr.s_addr == src_addr.s_addr && lnk->src_port == src_port && lnk->dst_port == dst_port && lnk->link_type == link_type && lnk->server == NULL) { lnk->timestamp = la->timeStamp; break; } } /* Search for partially specified links. */ if (lnk == NULL && replace_partial_links) { if (dst_port != 0 && dst_addr.s_addr != INADDR_ANY) { lnk = _FindLinkOut(la, src_addr, dst_addr, src_port, 0, link_type, 0); if (lnk == NULL) lnk = _FindLinkOut(la, src_addr, la->nullAddress, src_port, dst_port, link_type, 0); } if (lnk == NULL && (dst_port != 0 || dst_addr.s_addr != INADDR_ANY)) { lnk = _FindLinkOut(la, src_addr, la->nullAddress, src_port, 0, link_type, 0); } if (lnk != NULL) { lnk = ReLink(lnk, src_addr, dst_addr, lnk->alias_addr, src_port, dst_port, lnk->alias_port, link_type); } } return (lnk); } static struct alias_link * FindLinkOut(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_short src_port, u_short dst_port, int link_type, int replace_partial_links) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = _FindLinkOut(la, src_addr, dst_addr, src_port, dst_port, link_type, replace_partial_links); if (lnk == NULL) { /* * The following allows permanent links to be specified as * using the default source address (i.e. device interface * address) without knowing in advance what that address * is. */ if (la->aliasAddress.s_addr != INADDR_ANY && src_addr.s_addr == la->aliasAddress.s_addr) { lnk = _FindLinkOut(la, la->nullAddress, dst_addr, src_port, dst_port, link_type, replace_partial_links); } } return (lnk); } static struct alias_link * _FindLinkIn(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_short dst_port, u_short alias_port, int link_type, int replace_partial_links) { int flags_in; u_int start_point; struct alias_link *lnk; struct alias_link *lnk_fully_specified; struct alias_link *lnk_unknown_all; struct alias_link *lnk_unknown_dst_addr; struct alias_link *lnk_unknown_dst_port; LIBALIAS_LOCK_ASSERT(la); /* Initialize pointers */ lnk_fully_specified = NULL; lnk_unknown_all = NULL; lnk_unknown_dst_addr = NULL; lnk_unknown_dst_port = NULL; /* If either the dest addr or port is unknown, the search loop will have to know about this. */ flags_in = 0; if (dst_addr.s_addr == INADDR_ANY) flags_in |= LINK_UNKNOWN_DEST_ADDR; if (dst_port == 0) flags_in |= LINK_UNKNOWN_DEST_PORT; /* Search loop */ start_point = StartPointIn(alias_addr, alias_port, link_type); LIST_FOREACH(lnk, &la->linkTableIn[start_point], list_in) { int flags; flags = flags_in | lnk->flags; if (!(flags & LINK_PARTIALLY_SPECIFIED)) { if (lnk->alias_addr.s_addr == alias_addr.s_addr && lnk->alias_port == alias_port && lnk->dst_addr.s_addr == dst_addr.s_addr && lnk->dst_port == dst_port && lnk->link_type == link_type) { lnk_fully_specified = lnk; break; } } else if ((flags & LINK_UNKNOWN_DEST_ADDR) && (flags & LINK_UNKNOWN_DEST_PORT)) { if (lnk->alias_addr.s_addr == alias_addr.s_addr && lnk->alias_port == alias_port && lnk->link_type == link_type) { if (lnk_unknown_all == NULL) lnk_unknown_all = lnk; } } else if (flags & LINK_UNKNOWN_DEST_ADDR) { if (lnk->alias_addr.s_addr == alias_addr.s_addr && lnk->alias_port == alias_port && lnk->link_type == link_type && lnk->dst_port == dst_port) { if (lnk_unknown_dst_addr == NULL) lnk_unknown_dst_addr = lnk; } } else if (flags & LINK_UNKNOWN_DEST_PORT) { if (lnk->alias_addr.s_addr == alias_addr.s_addr && lnk->alias_port == alias_port && lnk->link_type == link_type && lnk->dst_addr.s_addr == dst_addr.s_addr) { if (lnk_unknown_dst_port == NULL) lnk_unknown_dst_port = lnk; } } } if (lnk_fully_specified != NULL) { lnk_fully_specified->timestamp = la->timeStamp; lnk = lnk_fully_specified; } else if (lnk_unknown_dst_port != NULL) lnk = lnk_unknown_dst_port; else if (lnk_unknown_dst_addr != NULL) lnk = lnk_unknown_dst_addr; else if (lnk_unknown_all != NULL) lnk = lnk_unknown_all; else return (NULL); if (replace_partial_links && (lnk->flags & LINK_PARTIALLY_SPECIFIED || lnk->server != NULL)) { struct in_addr src_addr; u_short src_port; if (lnk->server != NULL) { /* LSNAT link */ src_addr = lnk->server->addr; src_port = lnk->server->port; lnk->server = lnk->server->next; } else { src_addr = lnk->src_addr; src_port = lnk->src_port; } if (link_type == LINK_SCTP) { lnk->src_addr = src_addr; lnk->src_port = src_port; return(lnk); } lnk = ReLink(lnk, src_addr, dst_addr, alias_addr, src_port, dst_port, alias_port, link_type); } return (lnk); } static struct alias_link * FindLinkIn(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_short dst_port, u_short alias_port, int link_type, int replace_partial_links) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = _FindLinkIn(la, dst_addr, alias_addr, dst_port, alias_port, link_type, replace_partial_links); if (lnk == NULL) { /* * The following allows permanent links to be specified as * using the default aliasing address (i.e. device * interface address) without knowing in advance what that * address is. */ if (la->aliasAddress.s_addr != INADDR_ANY && alias_addr.s_addr == la->aliasAddress.s_addr) { lnk = _FindLinkIn(la, dst_addr, la->nullAddress, dst_port, alias_port, link_type, replace_partial_links); } } return (lnk); } /* External routines for finding/adding links -- "external" means outside alias_db.c, but within alias*.c -- FindIcmpIn(), FindIcmpOut() FindFragmentIn1(), FindFragmentIn2() AddFragmentPtrLink(), FindFragmentPtr() FindProtoIn(), FindProtoOut() FindUdpTcpIn(), FindUdpTcpOut() AddPptp(), FindPptpOutByCallId(), FindPptpInByCallId(), FindPptpOutByPeerCallId(), FindPptpInByPeerCallId() FindOriginalAddress(), FindAliasAddress() (prototypes in alias_local.h) */ struct alias_link * FindIcmpIn(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_short id_alias, int create) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkIn(la, dst_addr, alias_addr, NO_DEST_PORT, id_alias, LINK_ICMP, 0); if (lnk == NULL && create && !(la->packetAliasMode & PKT_ALIAS_DENY_INCOMING)) { struct in_addr target_addr; target_addr = FindOriginalAddress(la, alias_addr); lnk = AddLink(la, target_addr, dst_addr, alias_addr, id_alias, NO_DEST_PORT, id_alias, LINK_ICMP); } return (lnk); } struct alias_link * FindIcmpOut(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_short id, int create) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkOut(la, src_addr, dst_addr, id, NO_DEST_PORT, LINK_ICMP, 0); if (lnk == NULL && create) { struct in_addr alias_addr; alias_addr = FindAliasAddress(la, src_addr); lnk = AddLink(la, src_addr, dst_addr, alias_addr, id, NO_DEST_PORT, GET_ALIAS_ID, LINK_ICMP); } return (lnk); } struct alias_link * FindFragmentIn1(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_short ip_id) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkIn(la, dst_addr, alias_addr, NO_DEST_PORT, ip_id, LINK_FRAGMENT_ID, 0); if (lnk == NULL) { lnk = AddLink(la, la->nullAddress, dst_addr, alias_addr, NO_SRC_PORT, NO_DEST_PORT, ip_id, LINK_FRAGMENT_ID); } return (lnk); } struct alias_link * FindFragmentIn2(struct libalias *la, struct in_addr dst_addr, /* Doesn't add a link if * one */ struct in_addr alias_addr, /* is not found. */ u_short ip_id) { LIBALIAS_LOCK_ASSERT(la); return FindLinkIn(la, dst_addr, alias_addr, NO_DEST_PORT, ip_id, LINK_FRAGMENT_ID, 0); } struct alias_link * AddFragmentPtrLink(struct libalias *la, struct in_addr dst_addr, u_short ip_id) { LIBALIAS_LOCK_ASSERT(la); return AddLink(la, la->nullAddress, dst_addr, la->nullAddress, NO_SRC_PORT, NO_DEST_PORT, ip_id, LINK_FRAGMENT_PTR); } struct alias_link * FindFragmentPtr(struct libalias *la, struct in_addr dst_addr, u_short ip_id) { LIBALIAS_LOCK_ASSERT(la); return FindLinkIn(la, dst_addr, la->nullAddress, NO_DEST_PORT, ip_id, LINK_FRAGMENT_PTR, 0); } struct alias_link * FindProtoIn(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_char proto) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkIn(la, dst_addr, alias_addr, NO_DEST_PORT, 0, proto, 1); if (lnk == NULL && !(la->packetAliasMode & PKT_ALIAS_DENY_INCOMING)) { struct in_addr target_addr; target_addr = FindOriginalAddress(la, alias_addr); lnk = AddLink(la, target_addr, dst_addr, alias_addr, NO_SRC_PORT, NO_DEST_PORT, 0, proto); } return (lnk); } struct alias_link * FindProtoOut(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_char proto) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkOut(la, src_addr, dst_addr, NO_SRC_PORT, NO_DEST_PORT, proto, 1); if (lnk == NULL) { struct in_addr alias_addr; alias_addr = FindAliasAddress(la, src_addr); lnk = AddLink(la, src_addr, dst_addr, alias_addr, NO_SRC_PORT, NO_DEST_PORT, 0, proto); } return (lnk); } struct alias_link * FindUdpTcpIn(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_short dst_port, u_short alias_port, u_char proto, int create) { int link_type; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); switch (proto) { case IPPROTO_UDP: link_type = LINK_UDP; break; case IPPROTO_TCP: link_type = LINK_TCP; break; default: return (NULL); break; } lnk = FindLinkIn(la, dst_addr, alias_addr, dst_port, alias_port, link_type, create); if (lnk == NULL && create && !(la->packetAliasMode & PKT_ALIAS_DENY_INCOMING)) { struct in_addr target_addr; target_addr = FindOriginalAddress(la, alias_addr); lnk = AddLink(la, target_addr, dst_addr, alias_addr, alias_port, dst_port, alias_port, link_type); } return (lnk); } struct alias_link * FindUdpTcpOut(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_short src_port, u_short dst_port, u_char proto, int create) { int link_type; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); switch (proto) { case IPPROTO_UDP: link_type = LINK_UDP; break; case IPPROTO_TCP: link_type = LINK_TCP; break; default: return (NULL); break; } lnk = FindLinkOut(la, src_addr, dst_addr, src_port, dst_port, link_type, create); if (lnk == NULL && create) { struct in_addr alias_addr; alias_addr = FindAliasAddress(la, src_addr); lnk = AddLink(la, src_addr, dst_addr, alias_addr, src_port, dst_port, GET_ALIAS_PORT, link_type); } return (lnk); } struct alias_link * AddPptp(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, struct in_addr alias_addr, u_int16_t src_call_id) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = AddLink(la, src_addr, dst_addr, alias_addr, src_call_id, 0, GET_ALIAS_PORT, LINK_PPTP); return (lnk); } struct alias_link * FindPptpOutByCallId(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_int16_t src_call_id) { u_int i; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); i = StartPointOut(src_addr, dst_addr, 0, 0, LINK_PPTP); LIST_FOREACH(lnk, &la->linkTableOut[i], list_out) if (lnk->link_type == LINK_PPTP && lnk->src_addr.s_addr == src_addr.s_addr && lnk->dst_addr.s_addr == dst_addr.s_addr && lnk->src_port == src_call_id) break; return (lnk); } struct alias_link * FindPptpOutByPeerCallId(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_int16_t dst_call_id) { u_int i; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); i = StartPointOut(src_addr, dst_addr, 0, 0, LINK_PPTP); LIST_FOREACH(lnk, &la->linkTableOut[i], list_out) if (lnk->link_type == LINK_PPTP && lnk->src_addr.s_addr == src_addr.s_addr && lnk->dst_addr.s_addr == dst_addr.s_addr && lnk->dst_port == dst_call_id) break; return (lnk); } struct alias_link * FindPptpInByCallId(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_int16_t dst_call_id) { u_int i; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); i = StartPointIn(alias_addr, 0, LINK_PPTP); LIST_FOREACH(lnk, &la->linkTableIn[i], list_in) if (lnk->link_type == LINK_PPTP && lnk->dst_addr.s_addr == dst_addr.s_addr && lnk->alias_addr.s_addr == alias_addr.s_addr && lnk->dst_port == dst_call_id) break; return (lnk); } struct alias_link * FindPptpInByPeerCallId(struct libalias *la, struct in_addr dst_addr, struct in_addr alias_addr, u_int16_t alias_call_id) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkIn(la, dst_addr, alias_addr, 0 /* any */ , alias_call_id, LINK_PPTP, 0); return (lnk); } struct alias_link * FindRtspOut(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, u_short src_port, u_short alias_port, u_char proto) { int link_type; struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); switch (proto) { case IPPROTO_UDP: link_type = LINK_UDP; break; case IPPROTO_TCP: link_type = LINK_TCP; break; default: return (NULL); break; } lnk = FindLinkOut(la, src_addr, dst_addr, src_port, 0, link_type, 1); if (lnk == NULL) { struct in_addr alias_addr; alias_addr = FindAliasAddress(la, src_addr); lnk = AddLink(la, src_addr, dst_addr, alias_addr, src_port, 0, alias_port, link_type); } return (lnk); } struct in_addr FindOriginalAddress(struct libalias *la, struct in_addr alias_addr) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkIn(la, la->nullAddress, alias_addr, 0, 0, LINK_ADDR, 0); if (lnk == NULL) { la->newDefaultLink = 1; if (la->targetAddress.s_addr == INADDR_ANY) return (alias_addr); else if (la->targetAddress.s_addr == INADDR_NONE) return (la->aliasAddress.s_addr != INADDR_ANY) ? la->aliasAddress : alias_addr; else return (la->targetAddress); } else { if (lnk->server != NULL) { /* LSNAT link */ struct in_addr src_addr; src_addr = lnk->server->addr; lnk->server = lnk->server->next; return (src_addr); } else if (lnk->src_addr.s_addr == INADDR_ANY) return (la->aliasAddress.s_addr != INADDR_ANY) ? la->aliasAddress : alias_addr; else return (lnk->src_addr); } } struct in_addr FindAliasAddress(struct libalias *la, struct in_addr original_addr) { struct alias_link *lnk; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkOut(la, original_addr, la->nullAddress, 0, 0, LINK_ADDR, 0); if (lnk == NULL) { return (la->aliasAddress.s_addr != INADDR_ANY) ? la->aliasAddress : original_addr; } else { if (lnk->alias_addr.s_addr == INADDR_ANY) return (la->aliasAddress.s_addr != INADDR_ANY) ? la->aliasAddress : original_addr; else return (lnk->alias_addr); } } /* External routines for getting or changing link data (external to alias_db.c, but internal to alias*.c) SetFragmentData(), GetFragmentData() SetFragmentPtr(), GetFragmentPtr() SetStateIn(), SetStateOut(), GetStateIn(), GetStateOut() GetOriginalAddress(), GetDestAddress(), GetAliasAddress() GetOriginalPort(), GetAliasPort() SetAckModified(), GetAckModified() GetDeltaAckIn(), GetDeltaSeqOut(), AddSeq() SetProtocolFlags(), GetProtocolFlags() SetDestCallId() */ void SetFragmentAddr(struct alias_link *lnk, struct in_addr src_addr) { lnk->data.frag_addr = src_addr; } void GetFragmentAddr(struct alias_link *lnk, struct in_addr *src_addr) { *src_addr = lnk->data.frag_addr; } void SetFragmentPtr(struct alias_link *lnk, void *fptr) { lnk->data.frag_ptr = fptr; } void GetFragmentPtr(struct alias_link *lnk, void **fptr) { *fptr = lnk->data.frag_ptr; } void SetStateIn(struct alias_link *lnk, int state) { /* TCP input state */ switch (state) { case ALIAS_TCP_STATE_DISCONNECTED: if (lnk->data.tcp->state.out != ALIAS_TCP_STATE_CONNECTED) lnk->expire_time = TCP_EXPIRE_DEAD; else lnk->expire_time = TCP_EXPIRE_SINGLEDEAD; break; case ALIAS_TCP_STATE_CONNECTED: if (lnk->data.tcp->state.out == ALIAS_TCP_STATE_CONNECTED) lnk->expire_time = TCP_EXPIRE_CONNECTED; break; default: #ifdef _KERNEL panic("libalias:SetStateIn() unknown state"); #else abort(); #endif } lnk->data.tcp->state.in = state; } void SetStateOut(struct alias_link *lnk, int state) { /* TCP output state */ switch (state) { case ALIAS_TCP_STATE_DISCONNECTED: if (lnk->data.tcp->state.in != ALIAS_TCP_STATE_CONNECTED) lnk->expire_time = TCP_EXPIRE_DEAD; else lnk->expire_time = TCP_EXPIRE_SINGLEDEAD; break; case ALIAS_TCP_STATE_CONNECTED: if (lnk->data.tcp->state.in == ALIAS_TCP_STATE_CONNECTED) lnk->expire_time = TCP_EXPIRE_CONNECTED; break; default: #ifdef _KERNEL panic("libalias:SetStateOut() unknown state"); #else abort(); #endif } lnk->data.tcp->state.out = state; } int GetStateIn(struct alias_link *lnk) { /* TCP input state */ return (lnk->data.tcp->state.in); } int GetStateOut(struct alias_link *lnk) { /* TCP output state */ return (lnk->data.tcp->state.out); } struct in_addr GetOriginalAddress(struct alias_link *lnk) { if (lnk->src_addr.s_addr == INADDR_ANY) return (lnk->la->aliasAddress); else return (lnk->src_addr); } struct in_addr GetDestAddress(struct alias_link *lnk) { return (lnk->dst_addr); } struct in_addr GetAliasAddress(struct alias_link *lnk) { if (lnk->alias_addr.s_addr == INADDR_ANY) return (lnk->la->aliasAddress); else return (lnk->alias_addr); } struct in_addr GetDefaultAliasAddress(struct libalias *la) { LIBALIAS_LOCK_ASSERT(la); return (la->aliasAddress); } void SetDefaultAliasAddress(struct libalias *la, struct in_addr alias_addr) { LIBALIAS_LOCK_ASSERT(la); la->aliasAddress = alias_addr; } u_short GetOriginalPort(struct alias_link *lnk) { return (lnk->src_port); } u_short GetAliasPort(struct alias_link *lnk) { return (lnk->alias_port); } #ifndef NO_FW_PUNCH static u_short GetDestPort(struct alias_link *lnk) { return (lnk->dst_port); } #endif void SetAckModified(struct alias_link *lnk) { /* Indicate that ACK numbers have been modified in a TCP connection */ lnk->data.tcp->state.ack_modified = 1; } struct in_addr GetProxyAddress(struct alias_link *lnk) { return (lnk->proxy_addr); } void SetProxyAddress(struct alias_link *lnk, struct in_addr addr) { lnk->proxy_addr = addr; } u_short GetProxyPort(struct alias_link *lnk) { return (lnk->proxy_port); } void SetProxyPort(struct alias_link *lnk, u_short port) { lnk->proxy_port = port; } int GetAckModified(struct alias_link *lnk) { /* See if ACK numbers have been modified */ return (lnk->data.tcp->state.ack_modified); } // XXX ip free int GetDeltaAckIn(u_long ack, struct alias_link *lnk) { /* Find out how much the ACK number has been altered for an incoming TCP packet. To do this, a circular list of ACK numbers where the TCP packet size was altered is searched. */ int i; int delta, ack_diff_min; delta = 0; ack_diff_min = -1; for (i = 0; i < N_LINK_TCP_DATA; i++) { struct ack_data_record x; x = lnk->data.tcp->ack[i]; if (x.active == 1) { int ack_diff; ack_diff = SeqDiff(x.ack_new, ack); if (ack_diff >= 0) { if (ack_diff_min >= 0) { if (ack_diff < ack_diff_min) { delta = x.delta; ack_diff_min = ack_diff; } } else { delta = x.delta; ack_diff_min = ack_diff; } } } } return (delta); } // XXX ip free int GetDeltaSeqOut(u_long seq, struct alias_link *lnk) { /* Find out how much the sequence number has been altered for an outgoing TCP packet. To do this, a circular list of ACK numbers where the TCP packet size was altered is searched. */ int i; int delta, seq_diff_min; delta = 0; seq_diff_min = -1; for (i = 0; i < N_LINK_TCP_DATA; i++) { struct ack_data_record x; x = lnk->data.tcp->ack[i]; if (x.active == 1) { int seq_diff; seq_diff = SeqDiff(x.ack_old, seq); if (seq_diff >= 0) { if (seq_diff_min >= 0) { if (seq_diff < seq_diff_min) { delta = x.delta; seq_diff_min = seq_diff; } } else { delta = x.delta; seq_diff_min = seq_diff; } } } } return (delta); } // XXX ip free void AddSeq(struct alias_link *lnk, int delta, u_int ip_hl, u_short ip_len, u_long th_seq, u_int th_off) { /* When a TCP packet has been altered in length, save this information in a circular list. If enough packets have been altered, then this list will begin to overwrite itself. */ struct ack_data_record x; int hlen, tlen, dlen; int i; hlen = (ip_hl + th_off) << 2; tlen = ntohs(ip_len); dlen = tlen - hlen; x.ack_old = htonl(ntohl(th_seq) + dlen); x.ack_new = htonl(ntohl(th_seq) + dlen + delta); x.delta = delta; x.active = 1; i = lnk->data.tcp->state.index; lnk->data.tcp->ack[i] = x; i++; if (i == N_LINK_TCP_DATA) lnk->data.tcp->state.index = 0; else lnk->data.tcp->state.index = i; } void SetExpire(struct alias_link *lnk, int expire) { if (expire == 0) { lnk->flags &= ~LINK_PERMANENT; DeleteLink(lnk); } else if (expire == -1) { lnk->flags |= LINK_PERMANENT; } else if (expire > 0) { lnk->expire_time = expire; } else { #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/SetExpire(): "); fprintf(stderr, "error in expire parameter\n"); #endif } } void ClearCheckNewLink(struct libalias *la) { LIBALIAS_LOCK_ASSERT(la); la->newDefaultLink = 0; } void SetProtocolFlags(struct alias_link *lnk, int pflags) { lnk->pflags = pflags; } int GetProtocolFlags(struct alias_link *lnk) { return (lnk->pflags); } void SetDestCallId(struct alias_link *lnk, u_int16_t cid) { struct libalias *la = lnk->la; LIBALIAS_LOCK_ASSERT(la); la->deleteAllLinks = 1; ReLink(lnk, lnk->src_addr, lnk->dst_addr, lnk->alias_addr, lnk->src_port, cid, lnk->alias_port, lnk->link_type); la->deleteAllLinks = 0; } /* Miscellaneous Functions HouseKeeping() InitPacketAliasLog() UninitPacketAliasLog() */ /* Whenever an outgoing or incoming packet is handled, HouseKeeping() is called to find and remove timed-out aliasing links. Logic exists to sweep through the entire table and linked list structure every 60 seconds. (prototype in alias_local.h) */ void HouseKeeping(struct libalias *la) { int i, n; #ifndef _KERNEL struct timeval tv; #endif LIBALIAS_LOCK_ASSERT(la); /* * Save system time (seconds) in global variable timeStamp for use * by other functions. This is done so as not to unnecessarily * waste timeline by making system calls. */ #ifdef _KERNEL la->timeStamp = time_uptime; #else gettimeofday(&tv, NULL); la->timeStamp = tv.tv_sec; #endif /* Compute number of spokes (output table link chains) to cover */ n = LINK_TABLE_OUT_SIZE * (la->timeStamp - la->lastCleanupTime); n /= ALIAS_CLEANUP_INTERVAL_SECS; /* Handle different cases */ if (n > 0) { if (n > ALIAS_CLEANUP_MAX_SPOKES) n = ALIAS_CLEANUP_MAX_SPOKES; la->lastCleanupTime = la->timeStamp; for (i = 0; i < n; i++) IncrementalCleanup(la); } else if (n < 0) { #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAlias/HouseKeeping(): "); fprintf(stderr, "something unexpected in time values\n"); #endif la->lastCleanupTime = la->timeStamp; } } /* Init the log file and enable logging */ static int InitPacketAliasLog(struct libalias *la) { LIBALIAS_LOCK_ASSERT(la); if (~la->packetAliasMode & PKT_ALIAS_LOG) { #ifdef _KERNEL if ((la->logDesc = malloc(LIBALIAS_BUF_SIZE))) ; #else if ((la->logDesc = fopen("/var/log/alias.log", "w"))) fprintf(la->logDesc, "PacketAlias/InitPacketAliasLog: Packet alias logging enabled.\n"); #endif else return (ENOMEM); /* log initialization failed */ la->packetAliasMode |= PKT_ALIAS_LOG; } return (1); } /* Close the log-file and disable logging. */ static void UninitPacketAliasLog(struct libalias *la) { LIBALIAS_LOCK_ASSERT(la); if (la->logDesc) { #ifdef _KERNEL free(la->logDesc); #else fclose(la->logDesc); #endif la->logDesc = NULL; } la->packetAliasMode &= ~PKT_ALIAS_LOG; } /* Outside world interfaces -- "outside world" means other than alias*.c routines -- PacketAliasRedirectPort() PacketAliasAddServer() PacketAliasRedirectProto() PacketAliasRedirectAddr() PacketAliasRedirectDynamic() PacketAliasRedirectDelete() PacketAliasSetAddress() PacketAliasInit() PacketAliasUninit() PacketAliasSetMode() (prototypes in alias.h) */ /* Redirection from a specific public addr:port to a private addr:port */ struct alias_link * LibAliasRedirectPort(struct libalias *la, struct in_addr src_addr, u_short src_port, struct in_addr dst_addr, u_short dst_port, struct in_addr alias_addr, u_short alias_port, u_char proto) { int link_type; struct alias_link *lnk; LIBALIAS_LOCK(la); switch (proto) { case IPPROTO_UDP: link_type = LINK_UDP; break; case IPPROTO_TCP: link_type = LINK_TCP; break; case IPPROTO_SCTP: link_type = LINK_SCTP; break; default: #ifdef LIBALIAS_DEBUG fprintf(stderr, "PacketAliasRedirectPort(): "); fprintf(stderr, "only SCTP, TCP and UDP protocols allowed\n"); #endif lnk = NULL; goto getout; } lnk = AddLink(la, src_addr, dst_addr, alias_addr, src_port, dst_port, alias_port, link_type); if (lnk != NULL) { lnk->flags |= LINK_PERMANENT; } #ifdef LIBALIAS_DEBUG else { fprintf(stderr, "PacketAliasRedirectPort(): " "call to AddLink() failed\n"); } #endif getout: LIBALIAS_UNLOCK(la); return (lnk); } /* Add server to the pool of servers */ int LibAliasAddServer(struct libalias *la, struct alias_link *lnk, struct in_addr addr, u_short port) { struct server *server; int res; LIBALIAS_LOCK(la); (void)la; server = malloc(sizeof(struct server)); if (server != NULL) { struct server *head; server->addr = addr; server->port = port; head = lnk->server; if (head == NULL) server->next = server; else { struct server *s; for (s = head; s->next != head; s = s->next); s->next = server; server->next = head; } lnk->server = server; res = 0; } else res = -1; LIBALIAS_UNLOCK(la); return (res); } /* Redirect packets of a given IP protocol from a specific public address to a private address */ struct alias_link * LibAliasRedirectProto(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, struct in_addr alias_addr, u_char proto) { struct alias_link *lnk; LIBALIAS_LOCK(la); lnk = AddLink(la, src_addr, dst_addr, alias_addr, NO_SRC_PORT, NO_DEST_PORT, 0, proto); if (lnk != NULL) { lnk->flags |= LINK_PERMANENT; } #ifdef LIBALIAS_DEBUG else { fprintf(stderr, "PacketAliasRedirectProto(): " "call to AddLink() failed\n"); } #endif LIBALIAS_UNLOCK(la); return (lnk); } /* Static address translation */ struct alias_link * LibAliasRedirectAddr(struct libalias *la, struct in_addr src_addr, struct in_addr alias_addr) { struct alias_link *lnk; LIBALIAS_LOCK(la); lnk = AddLink(la, src_addr, la->nullAddress, alias_addr, 0, 0, 0, LINK_ADDR); if (lnk != NULL) { lnk->flags |= LINK_PERMANENT; } #ifdef LIBALIAS_DEBUG else { fprintf(stderr, "PacketAliasRedirectAddr(): " "call to AddLink() failed\n"); } #endif LIBALIAS_UNLOCK(la); return (lnk); } /* Mark the aliasing link dynamic */ int LibAliasRedirectDynamic(struct libalias *la, struct alias_link *lnk) { int res; LIBALIAS_LOCK(la); (void)la; if (lnk->flags & LINK_PARTIALLY_SPECIFIED) res = -1; else { lnk->flags &= ~LINK_PERMANENT; res = 0; } LIBALIAS_UNLOCK(la); return (res); } void LibAliasRedirectDelete(struct libalias *la, struct alias_link *lnk) { /* This is a dangerous function to put in the API, because an invalid pointer can crash the program. */ LIBALIAS_LOCK(la); la->deleteAllLinks = 1; DeleteLink(lnk); la->deleteAllLinks = 0; LIBALIAS_UNLOCK(la); } void LibAliasSetAddress(struct libalias *la, struct in_addr addr) { LIBALIAS_LOCK(la); if (la->packetAliasMode & PKT_ALIAS_RESET_ON_ADDR_CHANGE && la->aliasAddress.s_addr != addr.s_addr) CleanupAliasData(la); la->aliasAddress = addr; LIBALIAS_UNLOCK(la); } + +void +LibAliasSetAliasPortRange(struct libalias *la, u_short port_low, + u_short port_high) +{ + + LIBALIAS_LOCK(la); + la->aliasPortLower = port_low; + /* Add 1 to the aliasPortLength as modulo has range of 1 to n-1 */ + la->aliasPortLength = port_high - port_low + 1; + LIBALIAS_UNLOCK(la); +} + void LibAliasSetTarget(struct libalias *la, struct in_addr target_addr) { LIBALIAS_LOCK(la); la->targetAddress = target_addr; LIBALIAS_UNLOCK(la); } static void finishoff(void) { while (!LIST_EMPTY(&instancehead)) LibAliasUninit(LIST_FIRST(&instancehead)); } struct libalias * LibAliasInit(struct libalias *la) { int i; #ifndef _KERNEL struct timeval tv; #endif if (la == NULL) { #ifdef _KERNEL #undef malloc /* XXX: ugly */ la = malloc(sizeof *la, M_ALIAS, M_WAITOK | M_ZERO); #else la = calloc(sizeof *la, 1); if (la == NULL) return (la); #endif #ifndef _KERNEL /* kernel cleans up on module unload */ if (LIST_EMPTY(&instancehead)) atexit(finishoff); #endif LIST_INSERT_HEAD(&instancehead, la, instancelist); #ifdef _KERNEL la->timeStamp = time_uptime; la->lastCleanupTime = time_uptime; #else gettimeofday(&tv, NULL); la->timeStamp = tv.tv_sec; la->lastCleanupTime = tv.tv_sec; #endif for (i = 0; i < LINK_TABLE_OUT_SIZE; i++) LIST_INIT(&la->linkTableOut[i]); for (i = 0; i < LINK_TABLE_IN_SIZE; i++) LIST_INIT(&la->linkTableIn[i]); #ifdef _KERNEL AliasSctpInit(la); #endif LIBALIAS_LOCK_INIT(la); LIBALIAS_LOCK(la); } else { LIBALIAS_LOCK(la); la->deleteAllLinks = 1; CleanupAliasData(la); la->deleteAllLinks = 0; #ifdef _KERNEL AliasSctpTerm(la); AliasSctpInit(la); #endif } la->aliasAddress.s_addr = INADDR_ANY; la->targetAddress.s_addr = INADDR_ANY; la->icmpLinkCount = 0; la->udpLinkCount = 0; la->tcpLinkCount = 0; la->sctpLinkCount = 0; la->pptpLinkCount = 0; la->protoLinkCount = 0; la->fragmentIdLinkCount = 0; la->fragmentPtrLinkCount = 0; la->sockCount = 0; la->cleanupIndex = 0; la->packetAliasMode = PKT_ALIAS_SAME_PORTS #ifndef NO_USE_SOCKETS | PKT_ALIAS_USE_SOCKETS #endif | PKT_ALIAS_RESET_ON_ADDR_CHANGE; #ifndef NO_FW_PUNCH la->fireWallFD = -1; #endif #ifndef _KERNEL LibAliasRefreshModules(); #endif LIBALIAS_UNLOCK(la); return (la); } void LibAliasUninit(struct libalias *la) { LIBALIAS_LOCK(la); #ifdef _KERNEL AliasSctpTerm(la); #endif la->deleteAllLinks = 1; CleanupAliasData(la); la->deleteAllLinks = 0; UninitPacketAliasLog(la); #ifndef NO_FW_PUNCH UninitPunchFW(la); #endif LIST_REMOVE(la, instancelist); LIBALIAS_UNLOCK(la); LIBALIAS_LOCK_DESTROY(la); free(la); } /* Change mode for some operations */ unsigned int LibAliasSetMode( struct libalias *la, unsigned int flags, /* Which state to bring flags to */ unsigned int mask /* Mask of which flags to affect (use 0 to * do a probe for flag values) */ ) { int res = -1; LIBALIAS_LOCK(la); /* Enable logging? */ if (flags & mask & PKT_ALIAS_LOG) { /* Do the enable */ if (InitPacketAliasLog(la) == ENOMEM) goto getout; } else /* _Disable_ logging? */ if (~flags & mask & PKT_ALIAS_LOG) { UninitPacketAliasLog(la); } #ifndef NO_FW_PUNCH /* Start punching holes in the firewall? */ if (flags & mask & PKT_ALIAS_PUNCH_FW) { InitPunchFW(la); } else /* Stop punching holes in the firewall? */ if (~flags & mask & PKT_ALIAS_PUNCH_FW) { UninitPunchFW(la); } #endif /* Other flags can be set/cleared without special action */ la->packetAliasMode = (flags & mask) | (la->packetAliasMode & ~mask); res = la->packetAliasMode; getout: LIBALIAS_UNLOCK(la); return (res); } int LibAliasCheckNewLink(struct libalias *la) { int res; LIBALIAS_LOCK(la); res = la->newDefaultLink; LIBALIAS_UNLOCK(la); return (res); } #ifndef NO_FW_PUNCH /***************** Code to support firewall punching. This shouldn't really be in this file, but making variables global is evil too. ****************/ /* Firewall include files */ #include #include #include #include /* * helper function, updates the pointer to cmd with the length * of the current command, and also cleans up the first word of * the new command in case it has been clobbered before. */ static ipfw_insn * next_cmd(ipfw_insn * cmd) { cmd += F_LEN(cmd); bzero(cmd, sizeof(*cmd)); return (cmd); } /* * A function to fill simple commands of size 1. * Existing flags are preserved. */ static ipfw_insn * fill_cmd(ipfw_insn * cmd, enum ipfw_opcodes opcode, int size, int flags, u_int16_t arg) { cmd->opcode = opcode; cmd->len = ((cmd->len | flags) & (F_NOT | F_OR)) | (size & F_LEN_MASK); cmd->arg1 = arg; return next_cmd(cmd); } static ipfw_insn * fill_ip(ipfw_insn * cmd1, enum ipfw_opcodes opcode, u_int32_t addr) { ipfw_insn_ip *cmd = (ipfw_insn_ip *) cmd1; cmd->addr.s_addr = addr; return fill_cmd(cmd1, opcode, F_INSN_SIZE(ipfw_insn_u32), 0, 0); } static ipfw_insn * fill_one_port(ipfw_insn * cmd1, enum ipfw_opcodes opcode, u_int16_t port) { ipfw_insn_u16 *cmd = (ipfw_insn_u16 *) cmd1; cmd->ports[0] = cmd->ports[1] = port; return fill_cmd(cmd1, opcode, F_INSN_SIZE(ipfw_insn_u16), 0, 0); } static int fill_rule(void *buf, int bufsize, int rulenum, enum ipfw_opcodes action, int proto, struct in_addr sa, u_int16_t sp, struct in_addr da, u_int16_t dp) { struct ip_fw *rule = (struct ip_fw *)buf; ipfw_insn *cmd = (ipfw_insn *) rule->cmd; bzero(buf, bufsize); rule->rulenum = rulenum; cmd = fill_cmd(cmd, O_PROTO, F_INSN_SIZE(ipfw_insn), 0, proto); cmd = fill_ip(cmd, O_IP_SRC, sa.s_addr); cmd = fill_one_port(cmd, O_IP_SRCPORT, sp); cmd = fill_ip(cmd, O_IP_DST, da.s_addr); cmd = fill_one_port(cmd, O_IP_DSTPORT, dp); rule->act_ofs = (u_int32_t *) cmd - (u_int32_t *) rule->cmd; cmd = fill_cmd(cmd, action, F_INSN_SIZE(ipfw_insn), 0, 0); rule->cmd_len = (u_int32_t *) cmd - (u_int32_t *) rule->cmd; return ((char *)cmd - (char *)buf); } static void ClearAllFWHoles(struct libalias *la); #define fw_setfield(la, field, num) \ do { \ (field)[(num) - la->fireWallBaseNum] = 1; \ } /*lint -save -e717 */ while(0)/* lint -restore */ #define fw_clrfield(la, field, num) \ do { \ (field)[(num) - la->fireWallBaseNum] = 0; \ } /*lint -save -e717 */ while(0)/* lint -restore */ #define fw_tstfield(la, field, num) ((field)[(num) - la->fireWallBaseNum]) static void InitPunchFW(struct libalias *la) { la->fireWallField = malloc(la->fireWallNumNums); if (la->fireWallField) { memset(la->fireWallField, 0, la->fireWallNumNums); if (la->fireWallFD < 0) { la->fireWallFD = socket(AF_INET, SOCK_RAW, IPPROTO_RAW); } ClearAllFWHoles(la); la->fireWallActiveNum = la->fireWallBaseNum; } } static void UninitPunchFW(struct libalias *la) { ClearAllFWHoles(la); if (la->fireWallFD >= 0) close(la->fireWallFD); la->fireWallFD = -1; if (la->fireWallField) free(la->fireWallField); la->fireWallField = NULL; la->packetAliasMode &= ~PKT_ALIAS_PUNCH_FW; } /* Make a certain link go through the firewall */ void PunchFWHole(struct alias_link *lnk) { struct libalias *la; int r; /* Result code */ struct ip_fw rule; /* On-the-fly built rule */ int fwhole; /* Where to punch hole */ la = lnk->la; /* Don't do anything unless we are asked to */ if (!(la->packetAliasMode & PKT_ALIAS_PUNCH_FW) || la->fireWallFD < 0 || lnk->link_type != LINK_TCP) return; memset(&rule, 0, sizeof rule); /** Build rule **/ /* Find empty slot */ for (fwhole = la->fireWallActiveNum; fwhole < la->fireWallBaseNum + la->fireWallNumNums && fw_tstfield(la, la->fireWallField, fwhole); fwhole++); if (fwhole == la->fireWallBaseNum + la->fireWallNumNums) { for (fwhole = la->fireWallBaseNum; fwhole < la->fireWallActiveNum && fw_tstfield(la, la->fireWallField, fwhole); fwhole++); if (fwhole == la->fireWallActiveNum) { /* No rule point empty - we can't punch more holes. */ la->fireWallActiveNum = la->fireWallBaseNum; #ifdef LIBALIAS_DEBUG fprintf(stderr, "libalias: Unable to create firewall hole!\n"); #endif return; } } /* Start next search at next position */ la->fireWallActiveNum = fwhole + 1; /* * generate two rules of the form * * add fwhole accept tcp from OAddr OPort to DAddr DPort add fwhole * accept tcp from DAddr DPort to OAddr OPort */ if (GetOriginalPort(lnk) != 0 && GetDestPort(lnk) != 0) { u_int32_t rulebuf[255]; int i; i = fill_rule(rulebuf, sizeof(rulebuf), fwhole, O_ACCEPT, IPPROTO_TCP, GetOriginalAddress(lnk), ntohs(GetOriginalPort(lnk)), GetDestAddress(lnk), ntohs(GetDestPort(lnk))); r = setsockopt(la->fireWallFD, IPPROTO_IP, IP_FW_ADD, rulebuf, i); if (r) err(1, "alias punch inbound(1) setsockopt(IP_FW_ADD)"); i = fill_rule(rulebuf, sizeof(rulebuf), fwhole, O_ACCEPT, IPPROTO_TCP, GetDestAddress(lnk), ntohs(GetDestPort(lnk)), GetOriginalAddress(lnk), ntohs(GetOriginalPort(lnk))); r = setsockopt(la->fireWallFD, IPPROTO_IP, IP_FW_ADD, rulebuf, i); if (r) err(1, "alias punch inbound(2) setsockopt(IP_FW_ADD)"); } /* Indicate hole applied */ lnk->data.tcp->fwhole = fwhole; fw_setfield(la, la->fireWallField, fwhole); } /* Remove a hole in a firewall associated with a particular alias lnk. Calling this too often is harmless. */ static void ClearFWHole(struct alias_link *lnk) { struct libalias *la; la = lnk->la; if (lnk->link_type == LINK_TCP) { int fwhole = lnk->data.tcp->fwhole; /* Where is the firewall * hole? */ struct ip_fw rule; if (fwhole < 0) return; memset(&rule, 0, sizeof rule); /* useless for ipfw2 */ while (!setsockopt(la->fireWallFD, IPPROTO_IP, IP_FW_DEL, &fwhole, sizeof fwhole)); fw_clrfield(la, la->fireWallField, fwhole); lnk->data.tcp->fwhole = -1; } } /* Clear out the entire range dedicated to firewall holes. */ static void ClearAllFWHoles(struct libalias *la) { struct ip_fw rule; /* On-the-fly built rule */ int i; if (la->fireWallFD < 0) return; memset(&rule, 0, sizeof rule); for (i = la->fireWallBaseNum; i < la->fireWallBaseNum + la->fireWallNumNums; i++) { int r = i; while (!setsockopt(la->fireWallFD, IPPROTO_IP, IP_FW_DEL, &r, sizeof r)); } /* XXX: third arg correct here ? /phk */ memset(la->fireWallField, 0, la->fireWallNumNums); } #endif /* !NO_FW_PUNCH */ void LibAliasSetFWBase(struct libalias *la, unsigned int base, unsigned int num) { LIBALIAS_LOCK(la); #ifndef NO_FW_PUNCH la->fireWallBaseNum = base; la->fireWallNumNums = num; #endif LIBALIAS_UNLOCK(la); } void LibAliasSetSkinnyPort(struct libalias *la, unsigned int port) { LIBALIAS_LOCK(la); la->skinnyPort = port; LIBALIAS_UNLOCK(la); } /* * Find the address to redirect incoming packets */ struct in_addr FindSctpRedirectAddress(struct libalias *la, struct sctp_nat_msg *sm) { struct alias_link *lnk; struct in_addr redir; LIBALIAS_LOCK_ASSERT(la); lnk = FindLinkIn(la, sm->ip_hdr->ip_src, sm->ip_hdr->ip_dst, sm->sctp_hdr->dest_port,sm->sctp_hdr->dest_port, LINK_SCTP, 1); if (lnk != NULL) { return(lnk->src_addr); /* port redirect */ } else { redir = FindOriginalAddress(la,sm->ip_hdr->ip_dst); if (redir.s_addr == la->aliasAddress.s_addr || redir.s_addr == la->targetAddress.s_addr) { /* No address found */ lnk = FindLinkIn(la, sm->ip_hdr->ip_src, sm->ip_hdr->ip_dst, NO_DEST_PORT, 0, LINK_SCTP, 1); if (lnk != NULL) return(lnk->src_addr); /* redirect proto */ } return(redir); /* address redirect */ } } diff --git a/sys/netinet/libalias/alias_local.h b/sys/netinet/libalias/alias_local.h index 5919851a4019..ba128638c1fe 100644 --- a/sys/netinet/libalias/alias_local.h +++ b/sys/netinet/libalias/alias_local.h @@ -1,411 +1,415 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2001 Charles Mott * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * Alias_local.h contains the function prototypes for alias.c, * alias_db.c, alias_util.c and alias_ftp.c, alias_irc.c (as well * as any future add-ons). It also includes macros, globals and * struct definitions shared by more than one alias*.c file. * * This include file is intended to be used only within the aliasing * software. Outside world interfaces are defined in alias.h * * This software is placed into the public domain with no restrictions * on its distribution. * * Initial version: August, 1996 (cjm) * * */ #ifndef _ALIAS_LOCAL_H_ #define _ALIAS_LOCAL_H_ #include #include #ifdef _KERNEL #include #include #include #include /* XXX: LibAliasSetTarget() uses this constant. */ #define INADDR_NONE 0xffffffff #include #else #include "alias_sctp.h" #endif /* Sizes of input and output link tables */ #define LINK_TABLE_OUT_SIZE 4001 #define LINK_TABLE_IN_SIZE 4001 #define GET_ALIAS_PORT -1 #define GET_ALIAS_ID GET_ALIAS_PORT #ifdef _KERNEL #define INET_NTOA_BUF(buf) (buf) #else #define INET_NTOA_BUF(buf) (buf), sizeof(buf) #endif struct proxy_entry; struct libalias { LIST_ENTRY(libalias) instancelist; int packetAliasMode; /* Mode flags */ /* - documented in alias.h */ struct in_addr aliasAddress; /* Address written onto source */ /* field of IP packet. */ struct in_addr targetAddress; /* IP address incoming packets */ /* are sent to if no aliasing */ /* link already exists */ struct in_addr nullAddress; /* Used as a dummy parameter for */ /* some function calls */ LIST_HEAD (, alias_link) linkTableOut[LINK_TABLE_OUT_SIZE]; /* Lookup table of pointers to */ /* chains of link records. Each */ LIST_HEAD (, alias_link) linkTableIn[LINK_TABLE_IN_SIZE]; /* link record is doubly indexed */ /* into input and output lookup */ /* tables. */ /* Link statistics */ int icmpLinkCount; int udpLinkCount; int tcpLinkCount; int pptpLinkCount; int protoLinkCount; int fragmentIdLinkCount; int fragmentPtrLinkCount; int sockCount; int cleanupIndex; /* Index to chain of link table */ /* being inspected for old links */ int timeStamp; /* System time in seconds for */ /* current packet */ int lastCleanupTime; /* Last time * IncrementalCleanup() */ /* was called */ int deleteAllLinks; /* If equal to zero, DeleteLink() */ /* will not remove permanent links */ /* log descriptor */ #ifdef _KERNEL char *logDesc; #else FILE *logDesc; #endif /* statistics monitoring */ int newDefaultLink; /* Indicates if a new aliasing */ /* link has been created after a */ /* call to PacketAliasIn/Out(). */ #ifndef NO_FW_PUNCH int fireWallFD; /* File descriptor to be able to */ /* control firewall. Opened by */ /* PacketAliasSetMode on first */ /* setting the PKT_ALIAS_PUNCH_FW */ /* flag. */ int fireWallBaseNum; /* The first firewall entry * free for our use */ int fireWallNumNums; /* How many entries can we * use? */ int fireWallActiveNum; /* Which entry did we last * use? */ char *fireWallField; /* bool array for entries */ #endif unsigned int skinnyPort; /* TCP port used by the Skinny */ /* protocol. */ struct proxy_entry *proxyList; struct in_addr true_addr; /* in network byte order. */ u_short true_port; /* in host byte order. */ + /* Port ranges for aliasing. */ + u_short aliasPortLower; + u_short aliasPortLength; + /* * sctp code support */ /* counts associations that have progressed to UP and not yet removed */ int sctpLinkCount; #ifdef _KERNEL /* timing queue for keeping track of association timeouts */ struct sctp_nat_timer sctpNatTimer; /* size of hash table used in this instance */ u_int sctpNatTableSize; /* * local look up table sorted by l_vtag/l_port */ LIST_HEAD(sctpNatTableL, sctp_nat_assoc) *sctpTableLocal; /* * global look up table sorted by g_vtag/g_port */ LIST_HEAD(sctpNatTableG, sctp_nat_assoc) *sctpTableGlobal; /* * avoid races in libalias: every public function has to use it. */ struct mtx mutex; #endif }; /* Macros */ #ifdef _KERNEL #define LIBALIAS_LOCK_INIT(l) \ mtx_init(&l->mutex, "per-instance libalias mutex", NULL, MTX_DEF) #define LIBALIAS_LOCK_ASSERT(l) mtx_assert(&l->mutex, MA_OWNED) #define LIBALIAS_LOCK(l) mtx_lock(&l->mutex) #define LIBALIAS_UNLOCK(l) mtx_unlock(&l->mutex) #define LIBALIAS_LOCK_DESTROY(l) mtx_destroy(&l->mutex) #else #define LIBALIAS_LOCK_INIT(l) #define LIBALIAS_LOCK_ASSERT(l) #define LIBALIAS_LOCK(l) #define LIBALIAS_UNLOCK(l) #define LIBALIAS_LOCK_DESTROY(l) #endif /* * The following macro is used to update an * internet checksum. "delta" is a 32-bit * accumulation of all the changes to the * checksum (adding in new 16-bit words and * subtracting out old words), and "cksum" * is the checksum value to be updated. */ #define ADJUST_CHECKSUM(acc, cksum) \ do { \ acc += cksum; \ if (acc < 0) { \ acc = -acc; \ acc = (acc >> 16) + (acc & 0xffff); \ acc += acc >> 16; \ cksum = (u_short) ~acc; \ } else { \ acc = (acc >> 16) + (acc & 0xffff); \ acc += acc >> 16; \ cksum = (u_short) acc; \ } \ } while (0) /* Prototypes */ /* * SctpFunction prototypes * */ void AliasSctpInit(struct libalias *la); void AliasSctpTerm(struct libalias *la); int SctpAlias(struct libalias *la, struct ip *ip, int direction); /* * We do not calculate TCP checksums when libalias is a kernel * module, since it has no idea about checksum offloading. * If TCP data has changed, then we just set checksum to zero, * and caller must recalculate it himself. * In case if libalias will edit UDP data, the same approach * should be used. */ #ifndef _KERNEL u_short IpChecksum(struct ip *_pip); u_short TcpChecksum(struct ip *_pip); #endif void DifferentialChecksum(u_short * _cksum, void * _new, void * _old, int _n); /* Internal data access */ struct alias_link * AddLink(struct libalias *la, struct in_addr src_addr, struct in_addr dst_addr, struct in_addr alias_addr, u_short src_port, u_short dst_port, int alias_param, int link_type); struct alias_link * FindIcmpIn(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_short _id_alias, int _create); struct alias_link * FindIcmpOut(struct libalias *la, struct in_addr _src_addr, struct in_addr _dst_addr, u_short _id, int _create); struct alias_link * FindFragmentIn1(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_short _ip_id); struct alias_link * FindFragmentIn2(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_short _ip_id); struct alias_link * AddFragmentPtrLink(struct libalias *la, struct in_addr _dst_addr, u_short _ip_id); struct alias_link * FindFragmentPtr(struct libalias *la, struct in_addr _dst_addr, u_short _ip_id); struct alias_link * FindProtoIn(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_char _proto); struct alias_link * FindProtoOut(struct libalias *la, struct in_addr _src_addr, struct in_addr _dst_addr, u_char _proto); struct alias_link * FindUdpTcpIn(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_short _dst_port, u_short _alias_port, u_char _proto, int _create); struct alias_link * FindUdpTcpOut(struct libalias *la, struct in_addr _src_addr, struct in_addr _dst_addr, u_short _src_port, u_short _dst_port, u_char _proto, int _create); struct alias_link * AddPptp(struct libalias *la, struct in_addr _src_addr, struct in_addr _dst_addr, struct in_addr _alias_addr, u_int16_t _src_call_id); struct alias_link * FindPptpOutByCallId(struct libalias *la, struct in_addr _src_addr, struct in_addr _dst_addr, u_int16_t _src_call_id); struct alias_link * FindPptpInByCallId(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_int16_t _dst_call_id); struct alias_link * FindPptpOutByPeerCallId(struct libalias *la, struct in_addr _src_addr, struct in_addr _dst_addr, u_int16_t _dst_call_id); struct alias_link * FindPptpInByPeerCallId(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_int16_t _alias_call_id); struct alias_link * FindRtspOut(struct libalias *la, struct in_addr _src_addr, struct in_addr _dst_addr, u_short _src_port, u_short _alias_port, u_char _proto); struct in_addr FindOriginalAddress(struct libalias *la, struct in_addr _alias_addr); struct in_addr FindAliasAddress(struct libalias *la, struct in_addr _original_addr); struct in_addr FindSctpRedirectAddress(struct libalias *la, struct sctp_nat_msg *sm); /* External data access/modification */ int FindNewPortGroup(struct libalias *la, struct in_addr _dst_addr, struct in_addr _alias_addr, u_short _src_port, u_short _dst_port, u_short _port_count, u_char _proto, u_char _align); void GetFragmentAddr(struct alias_link *_lnk, struct in_addr *_src_addr); void SetFragmentAddr(struct alias_link *_lnk, struct in_addr _src_addr); void GetFragmentPtr(struct alias_link *_lnk, void **_fptr); void SetFragmentPtr(struct alias_link *_lnk, void *fptr); void SetStateIn(struct alias_link *_lnk, int _state); void SetStateOut(struct alias_link *_lnk, int _state); int GetStateIn (struct alias_link *_lnk); int GetStateOut(struct alias_link *_lnk); struct in_addr GetOriginalAddress(struct alias_link *_lnk); struct in_addr GetDestAddress(struct alias_link *_lnk); struct in_addr GetAliasAddress(struct alias_link *_lnk); struct in_addr GetDefaultAliasAddress(struct libalias *la); void SetDefaultAliasAddress(struct libalias *la, struct in_addr _alias_addr); u_short GetOriginalPort(struct alias_link *_lnk); u_short GetAliasPort(struct alias_link *_lnk); struct in_addr GetProxyAddress(struct alias_link *_lnk); void SetProxyAddress(struct alias_link *_lnk, struct in_addr _addr); u_short GetProxyPort(struct alias_link *_lnk); void SetProxyPort(struct alias_link *_lnk, u_short _port); void SetAckModified(struct alias_link *_lnk); int GetAckModified(struct alias_link *_lnk); int GetDeltaAckIn(u_long, struct alias_link *_lnk); int GetDeltaSeqOut(u_long, struct alias_link *lnk); void AddSeq(struct alias_link *lnk, int delta, u_int ip_hl, u_short ip_len, u_long th_seq, u_int th_off); void SetExpire (struct alias_link *_lnk, int _expire); void ClearCheckNewLink(struct libalias *la); void SetProtocolFlags(struct alias_link *_lnk, int _pflags); int GetProtocolFlags(struct alias_link *_lnk); void SetDestCallId(struct alias_link *_lnk, u_int16_t _cid); #ifndef NO_FW_PUNCH void PunchFWHole(struct alias_link *_lnk); #endif /* Housekeeping function */ void HouseKeeping(struct libalias *); /* Tcp specific routines */ /* lint -save -library Suppress flexelint warnings */ /* Transparent proxy routines */ int ProxyCheck(struct libalias *la, struct in_addr *proxy_server_addr, u_short * proxy_server_port, struct in_addr src_addr, struct in_addr dst_addr, u_short dst_port, u_char ip_p); void ProxyModify(struct libalias *la, struct alias_link *_lnk, struct ip *_pip, int _maxpacketsize, int _proxy_type); enum alias_tcp_state { ALIAS_TCP_STATE_NOT_CONNECTED, ALIAS_TCP_STATE_CONNECTED, ALIAS_TCP_STATE_DISCONNECTED }; #if defined(_NETINET_IP_H_) static __inline void * ip_next(struct ip *iphdr) { char *p = (char *)iphdr; return (&p[iphdr->ip_hl * 4]); } #endif #if defined(_NETINET_TCP_H_) static __inline void * tcp_next(struct tcphdr *tcphdr) { char *p = (char *)tcphdr; return (&p[tcphdr->th_off * 4]); } #endif #if defined(_NETINET_UDP_H_) static __inline void * udp_next(struct udphdr *udphdr) { return ((void *)(udphdr + 1)); } #endif #endif /* !_ALIAS_LOCAL_H_ */ diff --git a/sys/netpfil/ipfw/ip_fw_nat.c b/sys/netpfil/ipfw/ip_fw_nat.c index 81229b2707e9..bcda3cff011c 100644 --- a/sys/netpfil/ipfw/ip_fw_nat.c +++ b/sys/netpfil/ipfw/ip_fw_nat.c @@ -1,1240 +1,1247 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2008 Paolo Pisati * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* XXX for in_cksum */ struct cfg_spool { LIST_ENTRY(cfg_spool) _next; /* chain of spool instances */ struct in_addr addr; uint16_t port; }; /* Nat redirect configuration. */ struct cfg_redir { LIST_ENTRY(cfg_redir) _next; /* chain of redir instances */ uint16_t mode; /* type of redirect mode */ uint16_t proto; /* protocol: tcp/udp */ struct in_addr laddr; /* local ip address */ struct in_addr paddr; /* public ip address */ struct in_addr raddr; /* remote ip address */ uint16_t lport; /* local port */ uint16_t pport; /* public port */ uint16_t rport; /* remote port */ uint16_t pport_cnt; /* number of public ports */ uint16_t rport_cnt; /* number of remote ports */ struct alias_link **alink; u_int16_t spool_cnt; /* num of entry in spool chain */ /* chain of spool instances */ LIST_HEAD(spool_chain, cfg_spool) spool_chain; }; /* Nat configuration data struct. */ struct cfg_nat { /* chain of nat instances */ LIST_ENTRY(cfg_nat) _next; int id; /* nat id */ struct in_addr ip; /* nat ip address */ struct libalias *lib; /* libalias instance */ int mode; /* aliasing mode */ int redir_cnt; /* number of entry in spool chain */ /* chain of redir instances */ LIST_HEAD(redir_chain, cfg_redir) redir_chain; char if_name[IF_NAMESIZE]; /* interface name */ + u_short alias_port_lo; /* low range for port aliasing */ + u_short alias_port_hi; /* high range for port aliasing */ }; static eventhandler_tag ifaddr_event_tag; static void ifaddr_change(void *arg __unused, struct ifnet *ifp) { struct cfg_nat *ptr; struct ifaddr *ifa; struct ip_fw_chain *chain; KASSERT(curvnet == ifp->if_vnet, ("curvnet(%p) differs from iface vnet(%p)", curvnet, ifp->if_vnet)); if (V_ipfw_vnet_ready == 0 || V_ipfw_nat_ready == 0) return; chain = &V_layer3_chain; IPFW_UH_WLOCK(chain); /* Check every nat entry... */ LIST_FOREACH(ptr, &chain->nat, _next) { struct epoch_tracker et; /* ...using nic 'ifp->if_xname' as dynamic alias address. */ if (strncmp(ptr->if_name, ifp->if_xname, IF_NAMESIZE) != 0) continue; NET_EPOCH_ENTER(et); CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr == NULL) continue; if (ifa->ifa_addr->sa_family != AF_INET) continue; IPFW_WLOCK(chain); ptr->ip = ((struct sockaddr_in *) (ifa->ifa_addr))->sin_addr; LibAliasSetAddress(ptr->lib, ptr->ip); IPFW_WUNLOCK(chain); } NET_EPOCH_EXIT(et); } IPFW_UH_WUNLOCK(chain); } /* * delete the pointers for nat entry ix, or all of them if ix < 0 */ static void flush_nat_ptrs(struct ip_fw_chain *chain, const int ix) { ipfw_insn_nat *cmd; int i; IPFW_WLOCK_ASSERT(chain); for (i = 0; i < chain->n_rules; i++) { cmd = (ipfw_insn_nat *)ipfw_get_action(chain->map[i]); if (cmd->o.opcode == O_NAT && cmd->nat != NULL && (ix < 0 || cmd->nat->id == ix)) cmd->nat = NULL; } } static void del_redir_spool_cfg(struct cfg_nat *n, struct redir_chain *head) { struct cfg_redir *r, *tmp_r; struct cfg_spool *s, *tmp_s; int i, num; LIST_FOREACH_SAFE(r, head, _next, tmp_r) { num = 1; /* Number of alias_link to delete. */ switch (r->mode) { case NAT44_REDIR_PORT: num = r->pport_cnt; /* FALLTHROUGH */ case NAT44_REDIR_ADDR: case NAT44_REDIR_PROTO: /* Delete all libalias redirect entry. */ for (i = 0; i < num; i++) LibAliasRedirectDelete(n->lib, r->alink[i]); /* Del spool cfg if any. */ LIST_FOREACH_SAFE(s, &r->spool_chain, _next, tmp_s) { LIST_REMOVE(s, _next); free(s, M_IPFW); } free(r->alink, M_IPFW); LIST_REMOVE(r, _next); free(r, M_IPFW); break; default: printf("unknown redirect mode: %u\n", r->mode); /* XXX - panic?!?!? */ break; } } } static int add_redir_spool_cfg(char *buf, struct cfg_nat *ptr) { struct cfg_redir *r; struct cfg_spool *s; struct nat44_cfg_redir *ser_r; struct nat44_cfg_spool *ser_s; int cnt, off, i; for (cnt = 0, off = 0; cnt < ptr->redir_cnt; cnt++) { ser_r = (struct nat44_cfg_redir *)&buf[off]; r = malloc(sizeof(*r), M_IPFW, M_WAITOK | M_ZERO); r->mode = ser_r->mode; r->laddr = ser_r->laddr; r->paddr = ser_r->paddr; r->raddr = ser_r->raddr; r->lport = ser_r->lport; r->pport = ser_r->pport; r->rport = ser_r->rport; r->pport_cnt = ser_r->pport_cnt; r->rport_cnt = ser_r->rport_cnt; r->proto = ser_r->proto; r->spool_cnt = ser_r->spool_cnt; //memcpy(r, ser_r, SOF_REDIR); LIST_INIT(&r->spool_chain); off += sizeof(struct nat44_cfg_redir); r->alink = malloc(sizeof(struct alias_link *) * r->pport_cnt, M_IPFW, M_WAITOK | M_ZERO); switch (r->mode) { case NAT44_REDIR_ADDR: r->alink[0] = LibAliasRedirectAddr(ptr->lib, r->laddr, r->paddr); break; case NAT44_REDIR_PORT: for (i = 0 ; i < r->pport_cnt; i++) { /* If remotePort is all ports, set it to 0. */ u_short remotePortCopy = r->rport + i; if (r->rport_cnt == 1 && r->rport == 0) remotePortCopy = 0; r->alink[i] = LibAliasRedirectPort(ptr->lib, r->laddr, htons(r->lport + i), r->raddr, htons(remotePortCopy), r->paddr, htons(r->pport + i), r->proto); if (r->alink[i] == NULL) { r->alink[0] = NULL; break; } } break; case NAT44_REDIR_PROTO: r->alink[0] = LibAliasRedirectProto(ptr->lib ,r->laddr, r->raddr, r->paddr, r->proto); break; default: printf("unknown redirect mode: %u\n", r->mode); break; } if (r->alink[0] == NULL) { printf("LibAliasRedirect* returned NULL\n"); free(r->alink, M_IPFW); free(r, M_IPFW); return (EINVAL); } /* LSNAT handling. */ for (i = 0; i < r->spool_cnt; i++) { ser_s = (struct nat44_cfg_spool *)&buf[off]; s = malloc(sizeof(*s), M_IPFW, M_WAITOK | M_ZERO); s->addr = ser_s->addr; s->port = ser_s->port; LibAliasAddServer(ptr->lib, r->alink[0], s->addr, htons(s->port)); off += sizeof(struct nat44_cfg_spool); /* Hook spool entry. */ LIST_INSERT_HEAD(&r->spool_chain, s, _next); } /* And finally hook this redir entry. */ LIST_INSERT_HEAD(&ptr->redir_chain, r, _next); } return (0); } static void free_nat_instance(struct cfg_nat *ptr) { del_redir_spool_cfg(ptr, &ptr->redir_chain); LibAliasUninit(ptr->lib); free(ptr, M_IPFW); } /* * ipfw_nat - perform mbuf header translation. * * Note V_layer3_chain has to be locked while calling ipfw_nat() in * 'global' operation mode (t == NULL). * */ static int ipfw_nat(struct ip_fw_args *args, struct cfg_nat *t, struct mbuf *m) { struct mbuf *mcl; struct ip *ip; /* XXX - libalias duct tape */ int ldt, retval, found; struct ip_fw_chain *chain; char *c; ldt = 0; retval = 0; mcl = m_megapullup(m, m->m_pkthdr.len); if (mcl == NULL) { args->m = NULL; return (IP_FW_DENY); } ip = mtod(mcl, struct ip *); /* * XXX - Libalias checksum offload 'duct tape': * * locally generated packets have only pseudo-header checksum * calculated and libalias will break it[1], so mark them for * later fix. Moreover there are cases when libalias modifies * tcp packet data[2], mark them for later fix too. * * [1] libalias was never meant to run in kernel, so it does * not have any knowledge about checksum offloading, and * expects a packet with a full internet checksum. * Unfortunately, packets generated locally will have just the * pseudo header calculated, and when libalias tries to adjust * the checksum it will actually compute a wrong value. * * [2] when libalias modifies tcp's data content, full TCP * checksum has to be recomputed: the problem is that * libalias does not have any idea about checksum offloading. * To work around this, we do not do checksumming in LibAlias, * but only mark the packets in th_x2 field. If we receive a * marked packet, we calculate correct checksum for it * aware of offloading. Why such a terrible hack instead of * recalculating checksum for each packet? * Because the previous checksum was not checked! * Recalculating checksums for EVERY packet will hide ALL * transmission errors. Yes, marked packets still suffer from * this problem. But, sigh, natd(8) has this problem, too. * * TODO: -make libalias mbuf aware (so * it can handle delayed checksum and tso) */ if (mcl->m_pkthdr.rcvif == NULL && mcl->m_pkthdr.csum_flags & CSUM_DELAY_DATA) ldt = 1; c = mtod(mcl, char *); /* Check if this is 'global' instance */ if (t == NULL) { if (args->flags & IPFW_ARGS_IN) { /* Wrong direction, skip processing */ args->m = mcl; return (IP_FW_NAT); } found = 0; chain = &V_layer3_chain; IPFW_RLOCK_ASSERT(chain); /* Check every nat entry... */ LIST_FOREACH(t, &chain->nat, _next) { if ((t->mode & PKT_ALIAS_SKIP_GLOBAL) != 0) continue; retval = LibAliasOutTry(t->lib, c, mcl->m_len + M_TRAILINGSPACE(mcl), 0); if (retval == PKT_ALIAS_OK) { /* Nat instance recognises state */ found = 1; break; } } if (found != 1) { /* No instance found, return ignore */ args->m = mcl; return (IP_FW_NAT); } } else { if (args->flags & IPFW_ARGS_IN) retval = LibAliasIn(t->lib, c, mcl->m_len + M_TRAILINGSPACE(mcl)); else retval = LibAliasOut(t->lib, c, mcl->m_len + M_TRAILINGSPACE(mcl)); } /* * We drop packet when: * 1. libalias returns PKT_ALIAS_ERROR; * 2. For incoming packets: * a) for unresolved fragments; * b) libalias returns PKT_ALIAS_IGNORED and * PKT_ALIAS_DENY_INCOMING flag is set. */ if (retval == PKT_ALIAS_ERROR || ((args->flags & IPFW_ARGS_IN) && (retval == PKT_ALIAS_UNRESOLVED_FRAGMENT || (retval == PKT_ALIAS_IGNORED && (t->mode & PKT_ALIAS_DENY_INCOMING) != 0)))) { /* XXX - should i add some logging? */ m_free(mcl); args->m = NULL; return (IP_FW_DENY); } if (retval == PKT_ALIAS_RESPOND) mcl->m_flags |= M_SKIP_FIREWALL; mcl->m_pkthdr.len = mcl->m_len = ntohs(ip->ip_len); /* * XXX - libalias checksum offload * 'duct tape' (see above) */ if ((ip->ip_off & htons(IP_OFFMASK)) == 0 && ip->ip_p == IPPROTO_TCP) { struct tcphdr *th; th = (struct tcphdr *)(ip + 1); if (th->th_x2) ldt = 1; } if (ldt) { struct tcphdr *th; struct udphdr *uh; uint16_t ip_len, cksum; ip_len = ntohs(ip->ip_len); cksum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htons(ip->ip_p + ip_len - (ip->ip_hl << 2))); switch (ip->ip_p) { case IPPROTO_TCP: th = (struct tcphdr *)(ip + 1); /* * Maybe it was set in * libalias... */ th->th_x2 = 0; th->th_sum = cksum; mcl->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum); break; case IPPROTO_UDP: uh = (struct udphdr *)(ip + 1); uh->uh_sum = cksum; mcl->m_pkthdr.csum_data = offsetof(struct udphdr, uh_sum); break; } /* No hw checksum offloading: do it ourselves */ if ((mcl->m_pkthdr.csum_flags & CSUM_DELAY_DATA) == 0) { in_delayed_cksum(mcl); mcl->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; } } args->m = mcl; return (IP_FW_NAT); } static struct cfg_nat * lookup_nat(struct nat_list *l, int nat_id) { struct cfg_nat *res; LIST_FOREACH(res, l, _next) { if (res->id == nat_id) break; } return res; } static struct cfg_nat * lookup_nat_name(struct nat_list *l, char *name) { struct cfg_nat *res; int id; char *errptr; id = strtol(name, &errptr, 10); if (id == 0 || *errptr != '\0') return (NULL); LIST_FOREACH(res, l, _next) { if (res->id == id) break; } return (res); } /* IP_FW3 configuration routines */ static void nat44_config(struct ip_fw_chain *chain, struct nat44_cfg_nat *ucfg) { struct cfg_nat *ptr, *tcfg; int gencnt; /* * Find/create nat rule. */ IPFW_UH_WLOCK(chain); gencnt = chain->gencnt; ptr = lookup_nat_name(&chain->nat, ucfg->name); if (ptr == NULL) { IPFW_UH_WUNLOCK(chain); /* New rule: allocate and init new instance. */ ptr = malloc(sizeof(struct cfg_nat), M_IPFW, M_WAITOK | M_ZERO); ptr->lib = LibAliasInit(NULL); LIST_INIT(&ptr->redir_chain); } else { /* Entry already present: temporarily unhook it. */ IPFW_WLOCK(chain); LIST_REMOVE(ptr, _next); flush_nat_ptrs(chain, ptr->id); IPFW_WUNLOCK(chain); IPFW_UH_WUNLOCK(chain); } /* * Basic nat (re)configuration. */ ptr->id = strtol(ucfg->name, NULL, 10); /* * XXX - what if this rule doesn't nat any ip and just * redirect? * do we set aliasaddress to 0.0.0.0? */ ptr->ip = ucfg->ip; ptr->redir_cnt = ucfg->redir_cnt; ptr->mode = ucfg->mode; + ptr->alias_port_lo = ucfg->alias_port_lo; + ptr->alias_port_hi = ucfg->alias_port_hi; strlcpy(ptr->if_name, ucfg->if_name, sizeof(ptr->if_name)); LibAliasSetMode(ptr->lib, ptr->mode, ~0); LibAliasSetAddress(ptr->lib, ptr->ip); + LibAliasSetAliasPortRange(ptr->lib, ptr->alias_port_lo, ptr->alias_port_hi); /* * Redir and LSNAT configuration. */ /* Delete old cfgs. */ del_redir_spool_cfg(ptr, &ptr->redir_chain); /* Add new entries. */ add_redir_spool_cfg((char *)(ucfg + 1), ptr); IPFW_UH_WLOCK(chain); /* Extra check to avoid race with another ipfw_nat_cfg() */ tcfg = NULL; if (gencnt != chain->gencnt) tcfg = lookup_nat_name(&chain->nat, ucfg->name); IPFW_WLOCK(chain); if (tcfg != NULL) LIST_REMOVE(tcfg, _next); LIST_INSERT_HEAD(&chain->nat, ptr, _next); IPFW_WUNLOCK(chain); chain->gencnt++; IPFW_UH_WUNLOCK(chain); if (tcfg != NULL) free_nat_instance(ptr); } /* * Creates/configure nat44 instance * Data layout (v0)(current): * Request: [ ipfw_obj_header nat44_cfg_nat .. ] * * Returns 0 on success */ static int nat44_cfg(struct ip_fw_chain *chain, ip_fw3_opheader *op3, struct sockopt_data *sd) { ipfw_obj_header *oh; struct nat44_cfg_nat *ucfg; int id; size_t read; char *errptr; /* Check minimum header size */ if (sd->valsize < (sizeof(*oh) + sizeof(*ucfg))) return (EINVAL); oh = (ipfw_obj_header *)sd->kbuf; /* Basic length checks for TLVs */ if (oh->ntlv.head.length != sizeof(oh->ntlv)) return (EINVAL); ucfg = (struct nat44_cfg_nat *)(oh + 1); /* Check if name is properly terminated and looks like number */ if (strnlen(ucfg->name, sizeof(ucfg->name)) == sizeof(ucfg->name)) return (EINVAL); id = strtol(ucfg->name, &errptr, 10); if (id == 0 || *errptr != '\0') return (EINVAL); read = sizeof(*oh) + sizeof(*ucfg); /* Check number of redirs */ if (sd->valsize < read + ucfg->redir_cnt*sizeof(struct nat44_cfg_redir)) return (EINVAL); nat44_config(chain, ucfg); return (0); } /* * Destroys given nat instances. * Data layout (v0)(current): * Request: [ ipfw_obj_header ] * * Returns 0 on success */ static int nat44_destroy(struct ip_fw_chain *chain, ip_fw3_opheader *op3, struct sockopt_data *sd) { ipfw_obj_header *oh; struct cfg_nat *ptr; ipfw_obj_ntlv *ntlv; /* Check minimum header size */ if (sd->valsize < sizeof(*oh)) return (EINVAL); oh = (ipfw_obj_header *)sd->kbuf; /* Basic length checks for TLVs */ if (oh->ntlv.head.length != sizeof(oh->ntlv)) return (EINVAL); ntlv = &oh->ntlv; /* Check if name is properly terminated */ if (strnlen(ntlv->name, sizeof(ntlv->name)) == sizeof(ntlv->name)) return (EINVAL); IPFW_UH_WLOCK(chain); ptr = lookup_nat_name(&chain->nat, ntlv->name); if (ptr == NULL) { IPFW_UH_WUNLOCK(chain); return (ESRCH); } IPFW_WLOCK(chain); LIST_REMOVE(ptr, _next); flush_nat_ptrs(chain, ptr->id); IPFW_WUNLOCK(chain); IPFW_UH_WUNLOCK(chain); free_nat_instance(ptr); return (0); } static void export_nat_cfg(struct cfg_nat *ptr, struct nat44_cfg_nat *ucfg) { snprintf(ucfg->name, sizeof(ucfg->name), "%d", ptr->id); ucfg->ip = ptr->ip; ucfg->redir_cnt = ptr->redir_cnt; ucfg->mode = ptr->mode; + ucfg->alias_port_lo = ptr->alias_port_lo; + ucfg->alias_port_hi = ptr->alias_port_hi; strlcpy(ucfg->if_name, ptr->if_name, sizeof(ucfg->if_name)); } /* * Gets config for given nat instance * Data layout (v0)(current): * Request: [ ipfw_obj_header nat44_cfg_nat .. ] * * Returns 0 on success */ static int nat44_get_cfg(struct ip_fw_chain *chain, ip_fw3_opheader *op3, struct sockopt_data *sd) { ipfw_obj_header *oh; struct nat44_cfg_nat *ucfg; struct cfg_nat *ptr; struct cfg_redir *r; struct cfg_spool *s; struct nat44_cfg_redir *ser_r; struct nat44_cfg_spool *ser_s; size_t sz; sz = sizeof(*oh) + sizeof(*ucfg); /* Check minimum header size */ if (sd->valsize < sz) return (EINVAL); oh = (struct _ipfw_obj_header *)ipfw_get_sopt_header(sd, sz); /* Basic length checks for TLVs */ if (oh->ntlv.head.length != sizeof(oh->ntlv)) return (EINVAL); ucfg = (struct nat44_cfg_nat *)(oh + 1); /* Check if name is properly terminated */ if (strnlen(ucfg->name, sizeof(ucfg->name)) == sizeof(ucfg->name)) return (EINVAL); IPFW_UH_RLOCK(chain); ptr = lookup_nat_name(&chain->nat, ucfg->name); if (ptr == NULL) { IPFW_UH_RUNLOCK(chain); return (ESRCH); } export_nat_cfg(ptr, ucfg); /* Estimate memory amount */ sz = sizeof(ipfw_obj_header) + sizeof(struct nat44_cfg_nat); LIST_FOREACH(r, &ptr->redir_chain, _next) { sz += sizeof(struct nat44_cfg_redir); LIST_FOREACH(s, &r->spool_chain, _next) sz += sizeof(struct nat44_cfg_spool); } ucfg->size = sz; if (sd->valsize < sz) { /* * Submitted buffer size is not enough. * WE've already filled in @ucfg structure with * relevant info including size, so we * can return. Buffer will be flushed automatically. */ IPFW_UH_RUNLOCK(chain); return (ENOMEM); } /* Size OK, let's copy data */ LIST_FOREACH(r, &ptr->redir_chain, _next) { ser_r = (struct nat44_cfg_redir *)ipfw_get_sopt_space(sd, sizeof(*ser_r)); ser_r->mode = r->mode; ser_r->laddr = r->laddr; ser_r->paddr = r->paddr; ser_r->raddr = r->raddr; ser_r->lport = r->lport; ser_r->pport = r->pport; ser_r->rport = r->rport; ser_r->pport_cnt = r->pport_cnt; ser_r->rport_cnt = r->rport_cnt; ser_r->proto = r->proto; ser_r->spool_cnt = r->spool_cnt; LIST_FOREACH(s, &r->spool_chain, _next) { ser_s = (struct nat44_cfg_spool *)ipfw_get_sopt_space( sd, sizeof(*ser_s)); ser_s->addr = s->addr; ser_s->port = s->port; } } IPFW_UH_RUNLOCK(chain); return (0); } /* * Lists all nat44 instances currently available in kernel. * Data layout (v0)(current): * Request: [ ipfw_obj_lheader ] * Reply: [ ipfw_obj_lheader nat44_cfg_nat x N ] * * Returns 0 on success */ static int nat44_list_nat(struct ip_fw_chain *chain, ip_fw3_opheader *op3, struct sockopt_data *sd) { ipfw_obj_lheader *olh; struct nat44_cfg_nat *ucfg; struct cfg_nat *ptr; int nat_count; /* Check minimum header size */ if (sd->valsize < sizeof(ipfw_obj_lheader)) return (EINVAL); olh = (ipfw_obj_lheader *)ipfw_get_sopt_header(sd, sizeof(*olh)); IPFW_UH_RLOCK(chain); nat_count = 0; LIST_FOREACH(ptr, &chain->nat, _next) nat_count++; olh->count = nat_count; olh->objsize = sizeof(struct nat44_cfg_nat); olh->size = sizeof(*olh) + olh->count * olh->objsize; if (sd->valsize < olh->size) { IPFW_UH_RUNLOCK(chain); return (ENOMEM); } LIST_FOREACH(ptr, &chain->nat, _next) { ucfg = (struct nat44_cfg_nat *)ipfw_get_sopt_space(sd, sizeof(*ucfg)); export_nat_cfg(ptr, ucfg); } IPFW_UH_RUNLOCK(chain); return (0); } /* * Gets log for given nat instance * Data layout (v0)(current): * Request: [ ipfw_obj_header nat44_cfg_nat ] * Reply: [ ipfw_obj_header nat44_cfg_nat LOGBUFFER ] * * Returns 0 on success */ static int nat44_get_log(struct ip_fw_chain *chain, ip_fw3_opheader *op3, struct sockopt_data *sd) { ipfw_obj_header *oh; struct nat44_cfg_nat *ucfg; struct cfg_nat *ptr; void *pbuf; size_t sz; sz = sizeof(*oh) + sizeof(*ucfg); /* Check minimum header size */ if (sd->valsize < sz) return (EINVAL); oh = (struct _ipfw_obj_header *)ipfw_get_sopt_header(sd, sz); /* Basic length checks for TLVs */ if (oh->ntlv.head.length != sizeof(oh->ntlv)) return (EINVAL); ucfg = (struct nat44_cfg_nat *)(oh + 1); /* Check if name is properly terminated */ if (strnlen(ucfg->name, sizeof(ucfg->name)) == sizeof(ucfg->name)) return (EINVAL); IPFW_UH_RLOCK(chain); ptr = lookup_nat_name(&chain->nat, ucfg->name); if (ptr == NULL) { IPFW_UH_RUNLOCK(chain); return (ESRCH); } if (ptr->lib->logDesc == NULL) { IPFW_UH_RUNLOCK(chain); return (ENOENT); } export_nat_cfg(ptr, ucfg); /* Estimate memory amount */ ucfg->size = sizeof(struct nat44_cfg_nat) + LIBALIAS_BUF_SIZE; if (sd->valsize < sz + sizeof(*oh)) { /* * Submitted buffer size is not enough. * WE've already filled in @ucfg structure with * relevant info including size, so we * can return. Buffer will be flushed automatically. */ IPFW_UH_RUNLOCK(chain); return (ENOMEM); } pbuf = (void *)ipfw_get_sopt_space(sd, LIBALIAS_BUF_SIZE); memcpy(pbuf, ptr->lib->logDesc, LIBALIAS_BUF_SIZE); IPFW_UH_RUNLOCK(chain); return (0); } static struct ipfw_sopt_handler scodes[] = { { IP_FW_NAT44_XCONFIG, 0, HDIR_SET, nat44_cfg }, { IP_FW_NAT44_DESTROY, 0, HDIR_SET, nat44_destroy }, { IP_FW_NAT44_XGETCONFIG, 0, HDIR_GET, nat44_get_cfg }, { IP_FW_NAT44_LIST_NAT, 0, HDIR_GET, nat44_list_nat }, { IP_FW_NAT44_XGETLOG, 0, HDIR_GET, nat44_get_log }, }; /* * Legacy configuration routines */ struct cfg_spool_legacy { LIST_ENTRY(cfg_spool_legacy) _next; struct in_addr addr; u_short port; }; struct cfg_redir_legacy { LIST_ENTRY(cfg_redir) _next; u_int16_t mode; struct in_addr laddr; struct in_addr paddr; struct in_addr raddr; u_short lport; u_short pport; u_short rport; u_short pport_cnt; u_short rport_cnt; int proto; struct alias_link **alink; u_int16_t spool_cnt; LIST_HEAD(, cfg_spool_legacy) spool_chain; }; struct cfg_nat_legacy { LIST_ENTRY(cfg_nat_legacy) _next; int id; struct in_addr ip; char if_name[IF_NAMESIZE]; int mode; struct libalias *lib; int redir_cnt; LIST_HEAD(, cfg_redir_legacy) redir_chain; }; static int ipfw_nat_cfg(struct sockopt *sopt) { struct cfg_nat_legacy *cfg; struct nat44_cfg_nat *ucfg; struct cfg_redir_legacy *rdir; struct nat44_cfg_redir *urdir; char *buf; size_t len, len2; int error, i; len = sopt->sopt_valsize; len2 = len + 128; /* * Allocate 2x buffer to store converted structures. * new redir_cfg has shrunk, so we're sure that * new buffer size is enough. */ buf = malloc(roundup2(len, 8) + len2, M_TEMP, M_WAITOK | M_ZERO); error = sooptcopyin(sopt, buf, len, sizeof(struct cfg_nat_legacy)); if (error != 0) goto out; cfg = (struct cfg_nat_legacy *)buf; if (cfg->id < 0) { error = EINVAL; goto out; } ucfg = (struct nat44_cfg_nat *)&buf[roundup2(len, 8)]; snprintf(ucfg->name, sizeof(ucfg->name), "%d", cfg->id); strlcpy(ucfg->if_name, cfg->if_name, sizeof(ucfg->if_name)); ucfg->ip = cfg->ip; ucfg->mode = cfg->mode; ucfg->redir_cnt = cfg->redir_cnt; if (len < sizeof(*cfg) + cfg->redir_cnt * sizeof(*rdir)) { error = EINVAL; goto out; } urdir = (struct nat44_cfg_redir *)(ucfg + 1); rdir = (struct cfg_redir_legacy *)(cfg + 1); for (i = 0; i < cfg->redir_cnt; i++) { urdir->mode = rdir->mode; urdir->laddr = rdir->laddr; urdir->paddr = rdir->paddr; urdir->raddr = rdir->raddr; urdir->lport = rdir->lport; urdir->pport = rdir->pport; urdir->rport = rdir->rport; urdir->pport_cnt = rdir->pport_cnt; urdir->rport_cnt = rdir->rport_cnt; urdir->proto = rdir->proto; urdir->spool_cnt = rdir->spool_cnt; urdir++; rdir++; } nat44_config(&V_layer3_chain, ucfg); out: free(buf, M_TEMP); return (error); } static int ipfw_nat_del(struct sockopt *sopt) { struct cfg_nat *ptr; struct ip_fw_chain *chain = &V_layer3_chain; int i; sooptcopyin(sopt, &i, sizeof i, sizeof i); /* XXX validate i */ IPFW_UH_WLOCK(chain); ptr = lookup_nat(&chain->nat, i); if (ptr == NULL) { IPFW_UH_WUNLOCK(chain); return (EINVAL); } IPFW_WLOCK(chain); LIST_REMOVE(ptr, _next); flush_nat_ptrs(chain, i); IPFW_WUNLOCK(chain); IPFW_UH_WUNLOCK(chain); free_nat_instance(ptr); return (0); } static int ipfw_nat_get_cfg(struct sockopt *sopt) { struct ip_fw_chain *chain = &V_layer3_chain; struct cfg_nat *n; struct cfg_nat_legacy *ucfg; struct cfg_redir *r; struct cfg_spool *s; struct cfg_redir_legacy *ser_r; struct cfg_spool_legacy *ser_s; char *data; int gencnt, nat_cnt, len, error; nat_cnt = 0; len = sizeof(nat_cnt); IPFW_UH_RLOCK(chain); retry: gencnt = chain->gencnt; /* Estimate memory amount */ LIST_FOREACH(n, &chain->nat, _next) { nat_cnt++; len += sizeof(struct cfg_nat_legacy); LIST_FOREACH(r, &n->redir_chain, _next) { len += sizeof(struct cfg_redir_legacy); LIST_FOREACH(s, &r->spool_chain, _next) len += sizeof(struct cfg_spool_legacy); } } IPFW_UH_RUNLOCK(chain); data = malloc(len, M_TEMP, M_WAITOK | M_ZERO); bcopy(&nat_cnt, data, sizeof(nat_cnt)); nat_cnt = 0; len = sizeof(nat_cnt); IPFW_UH_RLOCK(chain); if (gencnt != chain->gencnt) { free(data, M_TEMP); goto retry; } /* Serialize all the data. */ LIST_FOREACH(n, &chain->nat, _next) { ucfg = (struct cfg_nat_legacy *)&data[len]; ucfg->id = n->id; ucfg->ip = n->ip; ucfg->redir_cnt = n->redir_cnt; ucfg->mode = n->mode; strlcpy(ucfg->if_name, n->if_name, sizeof(ucfg->if_name)); len += sizeof(struct cfg_nat_legacy); LIST_FOREACH(r, &n->redir_chain, _next) { ser_r = (struct cfg_redir_legacy *)&data[len]; ser_r->mode = r->mode; ser_r->laddr = r->laddr; ser_r->paddr = r->paddr; ser_r->raddr = r->raddr; ser_r->lport = r->lport; ser_r->pport = r->pport; ser_r->rport = r->rport; ser_r->pport_cnt = r->pport_cnt; ser_r->rport_cnt = r->rport_cnt; ser_r->proto = r->proto; ser_r->spool_cnt = r->spool_cnt; len += sizeof(struct cfg_redir_legacy); LIST_FOREACH(s, &r->spool_chain, _next) { ser_s = (struct cfg_spool_legacy *)&data[len]; ser_s->addr = s->addr; ser_s->port = s->port; len += sizeof(struct cfg_spool_legacy); } } } IPFW_UH_RUNLOCK(chain); error = sooptcopyout(sopt, data, len); free(data, M_TEMP); return (error); } static int ipfw_nat_get_log(struct sockopt *sopt) { uint8_t *data; struct cfg_nat *ptr; int i, size; struct ip_fw_chain *chain; IPFW_RLOCK_TRACKER; chain = &V_layer3_chain; IPFW_RLOCK(chain); /* one pass to count, one to copy the data */ i = 0; LIST_FOREACH(ptr, &chain->nat, _next) { if (ptr->lib->logDesc == NULL) continue; i++; } size = i * (LIBALIAS_BUF_SIZE + sizeof(int)); data = malloc(size, M_IPFW, M_NOWAIT | M_ZERO); if (data == NULL) { IPFW_RUNLOCK(chain); return (ENOSPC); } i = 0; LIST_FOREACH(ptr, &chain->nat, _next) { if (ptr->lib->logDesc == NULL) continue; bcopy(&ptr->id, &data[i], sizeof(int)); i += sizeof(int); bcopy(ptr->lib->logDesc, &data[i], LIBALIAS_BUF_SIZE); i += LIBALIAS_BUF_SIZE; } IPFW_RUNLOCK(chain); sooptcopyout(sopt, data, size); free(data, M_IPFW); return(0); } static int vnet_ipfw_nat_init(const void *arg __unused) { V_ipfw_nat_ready = 1; return (0); } static int vnet_ipfw_nat_uninit(const void *arg __unused) { struct cfg_nat *ptr, *ptr_temp; struct ip_fw_chain *chain; chain = &V_layer3_chain; IPFW_WLOCK(chain); V_ipfw_nat_ready = 0; LIST_FOREACH_SAFE(ptr, &chain->nat, _next, ptr_temp) { LIST_REMOVE(ptr, _next); free_nat_instance(ptr); } flush_nat_ptrs(chain, -1 /* flush all */); IPFW_WUNLOCK(chain); return (0); } static void ipfw_nat_init(void) { /* init ipfw hooks */ ipfw_nat_ptr = ipfw_nat; lookup_nat_ptr = lookup_nat; ipfw_nat_cfg_ptr = ipfw_nat_cfg; ipfw_nat_del_ptr = ipfw_nat_del; ipfw_nat_get_cfg_ptr = ipfw_nat_get_cfg; ipfw_nat_get_log_ptr = ipfw_nat_get_log; IPFW_ADD_SOPT_HANDLER(1, scodes); ifaddr_event_tag = EVENTHANDLER_REGISTER(ifaddr_event, ifaddr_change, NULL, EVENTHANDLER_PRI_ANY); } static void ipfw_nat_destroy(void) { EVENTHANDLER_DEREGISTER(ifaddr_event, ifaddr_event_tag); /* deregister ipfw_nat */ IPFW_DEL_SOPT_HANDLER(1, scodes); ipfw_nat_ptr = NULL; lookup_nat_ptr = NULL; ipfw_nat_cfg_ptr = NULL; ipfw_nat_del_ptr = NULL; ipfw_nat_get_cfg_ptr = NULL; ipfw_nat_get_log_ptr = NULL; } static int ipfw_nat_modevent(module_t mod, int type, void *unused) { int err = 0; switch (type) { case MOD_LOAD: break; case MOD_UNLOAD: break; default: return EOPNOTSUPP; break; } return err; } static moduledata_t ipfw_nat_mod = { "ipfw_nat", ipfw_nat_modevent, 0 }; /* Define startup order. */ #define IPFW_NAT_SI_SUB_FIREWALL SI_SUB_PROTO_FIREWALL #define IPFW_NAT_MODEVENT_ORDER (SI_ORDER_ANY - 128) /* after ipfw */ #define IPFW_NAT_MODULE_ORDER (IPFW_NAT_MODEVENT_ORDER + 1) #define IPFW_NAT_VNET_ORDER (IPFW_NAT_MODEVENT_ORDER + 2) DECLARE_MODULE(ipfw_nat, ipfw_nat_mod, IPFW_NAT_SI_SUB_FIREWALL, SI_ORDER_ANY); MODULE_DEPEND(ipfw_nat, libalias, 1, 1, 1); MODULE_DEPEND(ipfw_nat, ipfw, 3, 3, 3); MODULE_VERSION(ipfw_nat, 1); SYSINIT(ipfw_nat_init, IPFW_NAT_SI_SUB_FIREWALL, IPFW_NAT_MODULE_ORDER, ipfw_nat_init, NULL); VNET_SYSINIT(vnet_ipfw_nat_init, IPFW_NAT_SI_SUB_FIREWALL, IPFW_NAT_VNET_ORDER, vnet_ipfw_nat_init, NULL); SYSUNINIT(ipfw_nat_destroy, IPFW_NAT_SI_SUB_FIREWALL, IPFW_NAT_MODULE_ORDER, ipfw_nat_destroy, NULL); VNET_SYSUNINIT(vnet_ipfw_nat_uninit, IPFW_NAT_SI_SUB_FIREWALL, IPFW_NAT_VNET_ORDER, vnet_ipfw_nat_uninit, NULL); /* end of file */ diff --git a/tests/sys/netpfil/common/nat.sh b/tests/sys/netpfil/common/nat.sh index f74467dce062..9aa06de51337 100644 --- a/tests/sys/netpfil/common/nat.sh +++ b/tests/sys/netpfil/common/nat.sh @@ -1,156 +1,253 @@ #- # SPDX-License-Identifier: BSD-2-Clause-FreeBSD # # Copyright (c) 2019 Ahsan Barkati # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # $FreeBSD$ # . $(atf_get_srcdir)/utils.subr . $(atf_get_srcdir)/runner.subr basic_head() { atf_set descr 'Basic IPv4 NAT test' atf_set require.user root } basic_body() { firewall=$1 firewall_init $firewall nat_init $firewall epair_host_nat=$(vnet_mkepair) epair_client1_nat=$(vnet_mkepair) epair_client2_nat=$(vnet_mkepair) vnet_mkjail nat ${epair_host_nat}b ${epair_client1_nat}a ${epair_client2_nat}a vnet_mkjail client1 ${epair_client1_nat}b vnet_mkjail client2 ${epair_client2_nat}b ifconfig ${epair_host_nat}a 198.51.100.2/24 up jexec nat ifconfig ${epair_host_nat}b 198.51.100.1/24 up jexec nat ifconfig ${epair_client1_nat}a 192.0.2.1/24 up jexec client1 ifconfig ${epair_client1_nat}b 192.0.2.2/24 up jexec nat ifconfig ${epair_client2_nat}a 192.0.3.1/24 up jexec client2 ifconfig ${epair_client2_nat}b 192.0.3.2/24 up jexec nat sysctl net.inet.ip.forwarding=1 jexec client1 route add -net 198.51.100.0/24 192.0.2.1 jexec client2 route add -net 198.51.100.0/24 192.0.3.1 # ping fails without NAT configuration atf_check -s exit:2 -o ignore jexec client1 ping -t 1 -c 1 198.51.100.2 atf_check -s exit:2 -o ignore jexec client2 ping -t 1 -c 1 198.51.100.2 firewall_config nat ${firewall} \ "pf" \ "nat pass on ${epair_host_nat}b inet from any to any -> (${epair_host_nat}b)" \ "ipfw" \ "ipfw -q nat 123 config if ${epair_host_nat}b" \ "ipfw -q add 1000 nat 123 all from any to any" \ "ipfnat" \ "map ${epair_host_nat}b 192.0.3.0/24 -> 0/32" \ "map ${epair_host_nat}b 192.0.2.0/24 -> 0/32" \ # ping is successful now atf_check -s exit:0 -o ignore jexec client1 ping -t 1 -c 1 198.51.100.2 atf_check -s exit:0 -o ignore jexec client2 ping -t 1 -c 1 198.51.100.2 } basic_cleanup() { firewall=$1 firewall_cleanup $firewall } userspace_nat_head() { atf_set descr 'Nat test for ipfw using userspace natd' atf_set require.user root } userspace_nat_body() { firewall=$1 firewall_init $firewall if ! kldstat -q -m ipdivert; then atf_skip "This test requires ipdivert module loaded" fi epair_host_nat=$(vnet_mkepair) epair_client1_nat=$(vnet_mkepair) epair_client2_nat=$(vnet_mkepair) vnet_mkjail nat ${epair_host_nat}b ${epair_client1_nat}a ${epair_client2_nat}a vnet_mkjail client1 ${epair_client1_nat}b vnet_mkjail client2 ${epair_client2_nat}b ifconfig ${epair_host_nat}a 198.51.100.2/24 up jexec nat ifconfig ${epair_host_nat}b 198.51.100.1/24 up jexec nat ifconfig ${epair_client1_nat}a 192.0.2.1/24 up jexec client1 ifconfig ${epair_client1_nat}b 192.0.2.2/24 up jexec nat ifconfig ${epair_client2_nat}a 192.0.3.1/24 up jexec client2 ifconfig ${epair_client2_nat}b 192.0.3.2/24 up jexec nat sysctl net.inet.ip.forwarding=1 jexec client1 route add -net 198.51.100.0/24 192.0.2.1 jexec client2 route add -net 198.51.100.0/24 192.0.3.1 # Test the userspace NAT of ipfw # ping fails without NAT configuration atf_check -s exit:2 -o ignore jexec client1 ping -t 1 -c 1 198.51.100.2 atf_check -s exit:2 -o ignore jexec client2 ping -t 1 -c 1 198.51.100.2 firewall_config nat ${firewall} \ "ipfw" \ "natd -interface ${epair_host_nat}b" \ "ipfw -q add divert natd all from any to any via ${epair_host_nat}b" \ # ping is successful now atf_check -s exit:0 -o ignore jexec client1 ping -t 1 -c 1 198.51.100.2 atf_check -s exit:0 -o ignore jexec client2 ping -t 1 -c 1 198.51.100.2 } userspace_nat_cleanup() { firewall=$1 firewall_cleanup $firewall } +common_cgn() { + firewall=$1 + portalias=$2 + firewall_init $firewall + nat_init $firewall + + epair_host_nat=$(vnet_mkepair) + epair_client1_nat=$(vnet_mkepair) + epair_client2_nat=$(vnet_mkepair) + + vnet_mkjail nat ${epair_host_nat}b ${epair_client1_nat}a ${epair_client2_nat}a + vnet_mkjail client1 ${epair_client1_nat}b + vnet_mkjail client2 ${epair_client2_nat}b + + ifconfig ${epair_host_nat}a 198.51.100.2/24 up + jexec nat ifconfig ${epair_host_nat}b 198.51.100.1/24 up + + jexec nat ifconfig ${epair_client1_nat}a 100.64.0.1/24 up + jexec client1 ifconfig ${epair_client1_nat}b 100.64.0.2/24 up + + jexec nat ifconfig ${epair_client2_nat}a 100.64.1.1/24 up + jexec client2 ifconfig ${epair_client2_nat}b 100.64.1.2/24 up + + jexec nat sysctl net.inet.ip.forwarding=1 + + jexec client1 route add -net 198.51.100.0/24 100.64.0.1 + jexec client2 route add -net 198.51.100.0/24 100.64.1.1 + + # ping fails without NAT configuration + atf_check -s exit:2 -o ignore jexec client1 ping -t 1 -c 1 198.51.100.2 + atf_check -s exit:2 -o ignore jexec client2 ping -t 1 -c 1 198.51.100.2 + + if [[ $portalias ]]; then + firewall_config nat $firewall \ + "ipfw" \ + "ipfw -q nat 123 config if ${epair_host_nat}b unreg_cgn port_alias 2000-2999" \ + "ipfw -q nat 456 config if ${epair_host_nat}b unreg_cgn port_alias 3000-3999" \ + "ipfw -q add 1000 nat 123 all from any to 198.51.100.2 2000-2999 in via ${epair_host_nat}b" \ + "ipfw -q add 2000 nat 456 all from any to 198.51.100.2 3000-3999 in via ${epair_host_nat}b" \ + "ipfw -q add 3000 nat 123 all from 100.64.0.2 to any out via ${epair_host_nat}b" \ + "ipfw -q add 4000 nat 456 all from 100.64.1.2 to any out via ${epair_host_nat}b" + else + firewall_config nat $firewall \ + "ipfw" \ + "ipfw -q nat 123 config if ${epair_host_nat}b unreg_cgn" \ + "ipfw -q add 1000 nat 123 all from any to any" + fi + + # ping is successful now + atf_check -s exit:0 -o ignore jexec client1 ping -t 1 -c 1 198.51.100.2 + atf_check -s exit:0 -o ignore jexec client2 ping -t 1 -c 1 198.51.100.2 + + # if portalias, test a tcp server/client with nc + if [[ $portalias ]]; then + for inst in 1 2; do + daemon nc -p 198.51.100.2 7 + atf_check -s exit:0 -o ignore jexec client$inst sh -c "echo | nc -N 198.51.100.2 7" + done + fi +} + +cgn_head() +{ + atf_set descr 'IPv4 CGN (RFC 6598) test' + atf_set require.user root +} + +cgn_body() +{ + common_cgn $1 false +} + +cgn_cleanup() +{ + firewall_cleanup ipfw +} + +portalias_head() +{ + atf_set descr 'IPv4 CGN (RFC 6598) port aliasing test' + atf_set require.user root +} + +portalias_body() +{ + common_cgn $1 true +} + +portalias_cleanup() +{ + firewall_cleanup ipfw +} + setup_tests \ basic \ pf \ ipfw \ ipfnat \ userspace_nat \ - ipfw \ No newline at end of file + ipfw \ + cgn \ + ipfw \ + portalias \ + ipfw