Index: stable/4/sbin/ipfw/ipfw.8 =================================================================== --- stable/4/sbin/ipfw/ipfw.8 (revision 130570) +++ stable/4/sbin/ipfw/ipfw.8 (revision 130571) @@ -1,2206 +1,2265 @@ .\" .\" $FreeBSD$ .\" -.Dd August 13, 2002 +.Dd June 9, 2004 .Dt IPFW 8 .Os .Sh NAME .Nm ipfw .Nd IP firewall and traffic shaper control program .Sh SYNOPSIS .Nm .Op Fl cq .Cm add .Ar rule .Nm .Op Fl acdefnNStT .Brq Cm list | show .Op Ar rule | first-last ... .Nm .Op Fl f | q .Cm flush .Nm .Op Fl q .Brq Cm delete | zero | resetlog .Op Cm set .Op Ar number ... .Nm .Cm enable .Brq Cm firewall | one_pass | debug | verbose | dyn_keepalive .Nm .Cm disable .Brq Cm firewall | one_pass | debug | verbose | dyn_keepalive .Pp .Nm .Cm set Oo Cm disable Ar number ... Oc Op Cm enable Ar number ... .Nm .Cm set move .Op Cm rule .Ar number Cm to Ar number .Nm .Cm set swap Ar number number .Nm .Cm set show .Pp .Nm +.Cm table Ar number Cm add Ar addr Ns Oo / Ns Ar masklen Oc Op Ar value +.Nm +.Cm table Ar number Cm delete Ar addr Ns Op / Ns Ar masklen +.Nm +.Cm table Ar number Cm flush +.Nm +.Cm table Ar number Cm list +.Pp +.Nm .Brq Cm pipe | queue .Ar number .Cm config .Ar config-options .Nm .Op Fl s Op Ar field .Brq Cm pipe | queue .Brq Cm delete | list | show .Op Ar number ... .Pp .Nm .Op Fl cfnNqS .Oo .Fl p Ar preproc .Oo .Ar preproc-flags .Oc .Oc .Ar pathname .Sh DESCRIPTION The .Nm utility is the user interface for controlling the .Xr ipfw 4 firewall and the .Xr dummynet 4 traffic shaper in .Fx . .Pp .Bd -ragged -offset XXXX .Em NOTE: this manual page documents the newer version of .Nm introduced in .Fx CURRENT in July 2002, also known as .Nm ipfw2 . .Nm ipfw2 is a superset of the old firewall, .Nm ipfw1 . The differences between the two are listed in Section .Sx IPFW2 ENHANCEMENTS , which you are encouraged to read to revise older rulesets and possibly write them more efficiently. 
See Section .Sx USING IPFW2 IN FreeBSD-STABLE for instructions on how to run .Nm ipfw2 on .Fx STABLE. .Ed .Pp An .Nm configuration, or .Em ruleset , is made of a list of .Em rules numbered from 1 to 65535. Packets are passed to .Nm from a number of different places in the protocol stack (depending on the source and destination of the packet, it is possible that .Nm is invoked multiple times on the same packet). The packet passed to the firewall is compared against each of the rules in the firewall .Em ruleset . When a match is found, the action corresponding to the matching rule is performed. .Pp Depending on the action and certain system settings, packets can be reinjected into the firewall at some rule after the matching one for further processing. .Pp An .Nm ruleset always includes a .Em default rule (numbered 65535) which cannot be modified or deleted, and matches all packets. The action associated with the .Em default rule can be either .Cm deny or .Cm allow depending on how the kernel is configured. .Pp If the ruleset includes one or more rules with the .Cm keep-state or .Cm limit option, then .Nm assumes a .Em stateful behaviour, i.e. upon a match it will create dynamic rules matching the exact parameters (addresses and ports) of the matching packet. .Pp These dynamic rules, which have a limited lifetime, are checked at the first occurrence of a .Cm check-state , .Cm keep-state or .Cm limit rule, and are typically used to open the firewall on-demand to legitimate traffic only. See the .Sx STATEFUL FIREWALL and .Sx EXAMPLES Sections below for more information on the stateful behaviour of .Nm . .Pp All rules (including dynamic ones) have a few associated counters: a packet count, a byte count, a log count and a timestamp indicating the time of the last match. Counters can be displayed or reset with .Nm commands. 
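The stateful behaviour and the per-rule counters described above can be sketched with a handful of commands. This is a minimal illustration, not a recommended ruleset: the rule numbers and the 192.168.1.0/24 network are placeholders, and the `ipfw_cmd` helper and `DRY_RUN` variable are names invented for this sketch so that the script only prints the commands unless told otherwise.

```shell
#!/bin/sh
# Sketch only: rule numbers and 192.168.1.0/24 are illustrative placeholders.
# ipfw_cmd and DRY_RUN are helper names invented for this sketch.
ipfw_cmd() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "ipfw $*"      # dry run: only print the command
    else
        ipfw "$@"           # real run: requires root on a host running ipfw
    fi
}

stateful_demo() {
    # Dynamic rules are checked here, before the static rules below.
    ipfw_cmd add 100 check-state
    # A matching TCP connection setup creates a dynamic rule for the flow.
    ipfw_cmd add 200 allow tcp from 192.168.1.0/24 to any setup keep-state
    # All other TCP traffic is refused.
    ipfw_cmd add 300 deny tcp from any to any
    # List static and dynamic rules together with their counters.
    ipfw_cmd -d show
    # Reset the counters of rule 200 only.
    ipfw_cmd zero 200
}

stateful_demo
```

Run as is, the script prints the five commands; setting `DRY_RUN=0` would hand them to the real utility, which should only be done from the console given the lock-out risk of a firewall change over a remote session.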
.Pp Rules can be added with the .Cm add command; deleted individually or in groups with the .Cm delete command, and globally (except those in set 31) with the .Cm flush command; displayed, optionally with the content of the counters, using the .Cm show and .Cm list commands. Finally, counters can be reset with the .Cm zero and .Cm resetlog commands. .Pp Also, each rule belongs to one of 32 different .Em sets , and there are .Nm commands to atomically manipulate sets, such as enable, disable, swap sets, move all rules in a set to another one, delete all rules in a set. These can be useful to install temporary configurations, or to test them. See Section .Sx SETS OF RULES for more information on .Em sets . .Pp The following options are available: .Bl -tag -width indent .It Fl a While listing, show counter values. The .Cm show command just implies this option. .It Fl c When entering or showing rules, print them in compact form, i.e. without the optional "ip from any to any" string when this does not carry any additional information. .It Fl d While listing, show dynamic rules in addition to static ones. .It Fl e While listing, if the .Fl d option was specified, also show expired dynamic rules. .It Fl f Don't ask for confirmation for commands that can cause problems if misused, .No i.e. Cm flush . If there is no tty associated with the process, this is implied. .It Fl n Only check syntax of the command strings, without actually passing them to the kernel. .It Fl N Try to resolve addresses and service names in output. .It Fl q While .Cm add Ns ing , .Cm zero Ns ing , .Cm resetlog Ns ging or .Cm flush Ns ing , be quiet about actions (implies .Fl f ) . This is useful for adjusting rules by executing multiple .Nm commands in a script (e.g., .Ql sh\ /etc/rc.firewall ) , or by processing a file of many .Nm rules across a remote login session. If a .Cm flush is performed in normal (verbose) mode (with the default kernel configuration), it prints a message. 
Because all rules are flushed, the message might not be delivered to the login session, causing the remote login session to be closed and the remainder of the ruleset to not be processed. Access to the console would then be required to recover. .It Fl S While listing rules, show the .Em set each rule belongs to. If this flag is not specified, disabled rules will not be listed. .It Fl s Op Ar field While listing pipes, sort according to one of the four counters (total or current packets or bytes). .It Fl t While listing, show last match timestamp (converted with ctime()). .It Fl T While listing, show last match timestamp (as seconds from the epoch). This form can be more convenient for postprocessing by scripts. .El .Pp To ease configuration, rules can be put into a file which is processed using .Nm as shown in the last synopsis line. An absolute .Ar pathname must be used. The file will be read line by line and applied as arguments to the .Nm utility. .Pp Optionally, a preprocessor can be specified using .Fl p Ar preproc where .Ar pathname is to be piped through. Useful preprocessors include .Xr cpp 1 and .Xr m4 1 . If .Ar preproc doesn't start with a slash .Pq Ql / as its first character, the usual .Ev PATH name search is performed. Care should be taken with this in environments where not all file systems are mounted (yet) by the time .Nm is being run (e.g. when they are mounted over NFS). Once .Fl p has been specified, any additional arguments are passed on to the preprocessor for interpretation. This allows for flexible configuration files (like conditionalizing them on the local hostname) and the use of macros to centralize frequently required arguments like IP addresses. .Pp The .Nm .Cm pipe and .Cm queue commands are used to configure the traffic shaper, as shown in the .Sx TRAFFIC SHAPER (DUMMYNET) CONFIGURATION Section below. .Pp If the world and the kernel get out of sync the .Nm ABI may break, preventing you from being able to add any rules.
This can adversely affect the booting process. You can use .Nm .Cm disable .Cm firewall to temporarily disable the firewall to regain access to the network, allowing you to fix the problem. .Sh PACKET FLOW A packet is checked against the active ruleset in multiple places in the protocol stack, under control of several sysctl variables. These places and variables are shown below, and it is important to have this picture in mind in order to design a correct ruleset.
.Bd -literal -offset indent
      ^    to upper layers    V
      |                       |
      +----------->-----------+
      ^                       V
 [ip_input]              [ip_output]       net.inet.ip.fw.enable=1
      |                       |
      ^                       V
[ether_demux]        [ether_output_frame]  net.link.ether.ipfw=1
      |                       |
      +-->--[bdg_forward]-->--+            net.link.ether.bridge_ipfw=1
      ^                       V
      |      to devices       |
.Ed
.Pp As can be noted from the above picture, the number of times the same packet goes through the firewall can vary between 0 and 4 depending on packet source and destination, and system configuration. .Pp Note that as packets flow through the stack, headers can be stripped from or added to them, and so they may or may not be available for inspection. E.g., incoming packets will include the MAC header when .Nm is invoked from .Cm ether_demux() , but the same packets will have the MAC header stripped off when .Nm is invoked from .Cm ip_input() . .Pp Also note that each packet is always checked against the complete ruleset, irrespective of the place where the check occurs, or the source of the packet. If a rule contains some match patterns or actions which are not valid for the place of invocation (e.g. trying to match a MAC header within .Cm ip_input() ), the match pattern will not match, but a .Cm not operator in front of such patterns .Em will cause the pattern to .Em always match on those packets. It is thus the responsibility of the programmer, if necessary, to write a suitable ruleset to differentiate among the possible places.
.Cm skipto rules can be useful here, as an example:
.Bd -literal -offset indent
# packets from ether_demux or bdg_forward
ipfw add 10 skipto 1000 all from any to any layer2 in
# packets from ip_input
ipfw add 10 skipto 2000 all from any to any not layer2 in
# packets from ip_output
ipfw add 10 skipto 3000 all from any to any not layer2 out
# packets from ether_output_frame
ipfw add 10 skipto 4000 all from any to any layer2 out
.Ed
.Pp (yes, at the moment there is no way to differentiate between ether_demux and bdg_forward). .Sh SYNTAX In general, each keyword or argument must be provided as a separate command line argument, with no leading or trailing spaces. Keywords are case-sensitive, whereas arguments may or may not be case-sensitive depending on their nature (e.g. uid's are, hostnames are not). .Pp In .Nm ipfw2 you can introduce spaces after commas ',' to make the line more readable. You can also put the entire command (including flags) into a single argument. E.g. the following forms are equivalent:
.Bd -literal -offset indent
ipfw -q add deny src-ip 10.0.0.0/24,127.0.0.1/8
ipfw -q add deny src-ip 10.0.0.0/24, 127.0.0.1/8
ipfw "-q add deny src-ip 10.0.0.0/24, 127.0.0.1/8"
.Ed
.Sh RULE FORMAT The format of .Nm rules is the following: .Bd -ragged -offset indent .Op Ar rule_number .Op Cm set Ar set_number .Op Cm prob Ar match_probability .br .Ar " " action .Op Cm log Op Cm logamount Ar number .Ar body .Ed .Pp where the body of the rule specifies which information is used for filtering packets, among the following: .Pp .Bl -tag -width "Source and dest. addresses and ports" -offset XXX -compact .It Layer-2 header fields When available .It IPv4 Protocol TCP, UDP, ICMP, etc. .It Source and dest. addresses and ports .It Direction See Section .Sx PACKET FLOW .It Transmit and receive interface By name or address .It Misc. IP header fields Version, type of service, datagram length, identification, fragment flag (non-zero IP offset), Time To Live .It IP options .It Misc.
TCP header fields TCP flags (SYN, FIN, ACK, RST, etc.), sequence number, acknowledgment number, window .It TCP options .It ICMP types for ICMP packets .It User/group ID When the packet can be associated with a local socket. .El .Pp Note that some of the above information, e.g. source MAC or IP addresses and TCP/UDP ports, could easily be spoofed, so filtering on those fields alone might not guarantee the desired results. .Bl -tag -width indent .It Ar rule_number Each rule is associated with a .Ar rule_number in the range 1..65535, with the latter reserved for the .Em default rule. Rules are checked sequentially by rule number. Multiple rules can have the same number, in which case they are checked (and listed) according to the order in which they have been added. If a rule is entered without specifying a number, the kernel will assign one in such a way that the rule becomes the last one before the .Em default rule. Automatic rule numbers are assigned by incrementing the last non-default rule number by the value of the sysctl variable .Ar net.inet.ip.fw.autoinc_step which defaults to 100. If this is not possible (e.g. because we would go beyond the maximum allowed rule number), the number of the last non-default value is used instead. .It Cm set Ar set_number Each rule is associated with a .Ar set_number in the range 0..31. Sets can be individually disabled and enabled, so this parameter is of fundamental importance for atomic ruleset manipulation. It can be also used to simplify deletion of groups of rules. If a rule is entered without specifying a set number, set 0 will be used. .br Set 31 is special in that it cannot be disabled, and rules in set 31 are not deleted by the .Nm ipfw flush command (but you can delete them with the .Nm ipfw delete set 31 command). Set 31 is also used for the .Em default rule. .It Cm prob Ar match_probability A match is only declared with the specified probability (floating point number between 0 and 1). 
This can be useful for a number of applications such as random packet drop or (in conjunction with .Xr dummynet 4 ) to simulate the effect of multiple paths leading to out-of-order packet delivery. .Pp Note: this condition is checked before any other condition, including ones such as keep-state or check-state which might have side effects. .It Cm log Op Cm logamount Ar number When a packet matches a rule with the .Cm log keyword, a message will be logged to .Xr syslogd 8 with a .Dv LOG_SECURITY facility. The logging only occurs if the sysctl variable .Em net.inet.ip.fw.verbose is set to 1 (which is the default when the kernel is compiled with .Dv IPFIREWALL_VERBOSE ) and the number of packets logged so far for that particular rule does not exceed the .Cm logamount parameter. If no .Cm logamount is specified, the limit is taken from the sysctl variable .Em net.inet.ip.fw.verbose_limit . In both cases, a value of 0 removes the logging limit. .Pp Once the limit is reached, logging can be re-enabled by clearing the logging counter or the packet counter for that entry, see the .Cm resetlog command. .Pp Note: logging is done after all other packet matching conditions have been successfully verified, and before performing the final action (accept, deny, etc.) on the packet. .El .Ss RULE ACTIONS A rule can be associated with one of the following actions, which will be executed when the packet matches the body of the rule. .Bl -tag -width indent .It Cm allow | accept | pass | permit Allow packets that match rule. The search terminates. .It Cm check-state Checks the packet against the dynamic ruleset. If a match is found, execute the action associated with the rule which generated this dynamic rule, otherwise move to the next rule. .br .Cm Check-state rules do not have a body. If no .Cm check-state rule is found, the dynamic ruleset is checked at the first .Cm keep-state or .Cm limit rule. .It Cm count Update counters for all packets that match rule. 
The search continues with the next rule. .It Cm deny | drop Discard packets that match this rule. The search terminates. .It Cm divert Ar port Divert packets that match this rule to the .Xr divert 4 socket bound to port .Ar port . The search terminates. .It Cm fwd | forward Ar ipaddr Ns Op , Ns Ar port Change the next-hop on matching packets to .Ar ipaddr , which can be an IP address in dotted quad format or a host name. The search terminates if this rule matches. .Pp If .Ar ipaddr is a local address, then matching packets will be forwarded to .Ar port (or the port number in the packet if one is not specified in the rule) on the local machine. .br If .Ar ipaddr is not a local address, then the port number (if specified) is ignored, and the packet will be forwarded to the remote address, using the route as found in the local routing table for that IP. .br A .Ar fwd rule will not match layer-2 packets (those received on ether_input, ether_output, or bridged). .br The .Cm fwd action does not change the contents of the packet at all. In particular, the destination address remains unmodified, so packets forwarded to another system will usually be rejected by that system unless there is a matching rule on that system to capture them. For packets forwarded locally, the local address of the socket will be set to the original destination address of the packet. This makes the .Xr netstat 1 entry look rather weird but is intended for use with transparent proxy servers. .It Cm pipe Ar pipe_nr Pass packet to a .Xr dummynet 4 .Dq pipe (for bandwidth limitation, delay, etc.). See the .Sx TRAFFIC SHAPER (DUMMYNET) CONFIGURATION Section for further information. The search terminates; however, on exit from the pipe and if the .Xr sysctl 8 variable .Em net.inet.ip.fw.one_pass is not set, the packet is passed again to the firewall code starting from the next rule. .It Cm queue Ar queue_nr Pass packet to a .Xr dummynet 4 .Dq queue (for bandwidth limitation using WF2Q+). 
.It Cm reject (Deprecated). Synonym for .Cm unreach host . .It Cm reset Discard packets that match this rule, and if the packet is a TCP packet, try to send a TCP reset (RST) notice. The search terminates. .It Cm skipto Ar number Skip all subsequent rules numbered less than .Ar number . The search continues with the first rule numbered .Ar number or higher. .It Cm tee Ar port Send a copy of packets matching this rule to the .Xr divert 4 socket bound to port .Ar port . The search terminates and the original packet is accepted (but see Section .Sx BUGS below). .It Cm unreach Ar code Discard packets that match this rule, and try to send an ICMP unreachable notice with code .Ar code , where .Ar code is a number from 0 to 255, or one of these aliases: .Cm net , host , protocol , port , .Cm needfrag , srcfail , net-unknown , host-unknown , .Cm isolated , net-prohib , host-prohib , tosnet , .Cm toshost , filter-prohib , host-precedence or .Cm precedence-cutoff . The search terminates. .El .Ss RULE BODY The body of a rule contains zero or more patterns (such as specific source and destination addresses or ports, protocol options, incoming or outgoing interfaces, etc.) that the packet must match in order to be recognised. In general, the patterns are connected by (implicit) .Cm and operators -- i.e. all must match in order for the rule to match. Individual patterns can be prefixed by the .Cm not operator to reverse the result of the match, as in .Pp .Dl "ipfw add 100 allow ip from not 1.2.3.4 to any" .Pp Additionally, sets of alternative match patterns ( .Em or-blocks ) can be constructed by putting the patterns in lists enclosed between parentheses ( ) or braces { }, and using the .Cm or operator as follows: .Pp .Dl "ipfw add 100 allow ip from { x or not y or z } to any" .Pp Only one level of parentheses is allowed. 
Beware that most shells have special meanings for parentheses or braces, so it is advisable to put a backslash \\ in front of them to prevent such interpretations. .Pp The body of a rule must in general include a source and destination address specifier. The keyword .Ar any can be used in various places to specify that the content of a required field is irrelevant. .Pp The rule body has the following format: .Bd -ragged -offset indent .Op Ar proto Cm from Ar src Cm to Ar dst .Op Ar options .Ed .Pp The first part (proto from src to dst) is for backward compatibility with .Nm ipfw1 . In .Nm ipfw2 any match pattern (including MAC headers, IPv4 protocols, addresses and ports) can be specified in the .Ar options section. .Pp Rule fields have the following meaning: .Bl -tag -width indent .It Ar proto : protocol | Cm { Ar protocol Cm or ... } .It Ar protocol : Oo Cm not Oc Ar protocol-name | protocol-number An IPv4 protocol specified by number or name (for a complete list see .Pa /etc/protocols ) . The .Cm ip or .Cm all keywords mean any protocol will match. .Pp The .Cm { Ar protocol Cm or ... } format (an .Em or-block ) is provided for convenience only but its use is deprecated. .It Ar src No and Ar dst : Bro Cm addr | Cm { Ar addr Cm or ... } Brc Op Oo Cm not Oc Ar ports An address (or a list, see below) optionally followed by .Ar ports specifiers. .Pp The second format ( .Em or-block with multiple addresses) is provided for convenience only and its use is discouraged. -.It Ar addr : Oo Cm not Oc Brq Cm any | me | Ar addr-list | Ar addr-set +.It Ar addr : Oo Cm not Oc Bro +.Cm any | me | +.Cm table Ns Pq Ar number Ns Op , Ns Ar value +.Ar | addr-list | addr-set +.Brc .It Cm any matches any IP address. .It Cm me matches any IP address configured on an interface in the system. The address list is evaluated at the time the packet is analysed. 
+.It Cm table Ns Pq Ar number Ns Op , Ns Ar value +Matches any IP address for which an entry exists in the lookup table +.Ar number . +If an optional 32-bit unsigned +.Ar value +is also specified, an entry will match only if it has this value. +See the +.Sx LOOKUP TABLES +section below for more information on lookup tables. .It Ar addr-list : ip-addr Ns Op Ns , Ns Ar addr-list .It Ar ip-addr : A host or subnet address specified in one of the following ways: .Bl -tag -width indent .It Ar numeric-ip | hostname Matches a single IPv4 address, specified as dotted-quad or a hostname. Hostnames are resolved at the time the rule is added to the firewall list. .It Ar addr Ns / Ns Ar masklen Matches all addresses with base .Ar addr (specified as a dotted quad or a hostname) and mask width of .Cm masklen bits. As an example, 1.2.3.4/25 will match all IP numbers from 1.2.3.0 to 1.2.3.127 . .It Ar addr Ns : Ns Ar mask Matches all addresses with base .Ar addr (specified as a dotted quad or a hostname) and the mask of .Ar mask , specified as a dotted quad. As an example, 1.2.3.4:255.0.255.0 will match 1.*.3.*. We suggest using this form only for non-contiguous masks, and resorting to the .Ar addr Ns / Ns Ar masklen format for contiguous masks, which is more compact and less error-prone. .El .It Ar addr-set : addr Ns Oo Ns / Ns Ar masklen Oc Ns Cm { Ns Ar list Ns Cm } .It Ar list : Bro Ar num | num-num Brc Ns Op Ns , Ns Ar list Matches all addresses with base address .Ar addr (specified as a dotted quad or a hostname) and whose last byte is in the list between braces { } . Note that there must be no spaces between braces and numbers (spaces after commas are allowed). Elements of the list can be specified as single entries or ranges. The .Ar masklen field is used to limit the size of the set of addresses, and can have any value between 24 and 32. If not specified, it is assumed to be 24. .br This format is particularly useful to handle sparse address sets within a single rule.
Because the matching occurs using a bitmask, it takes constant time and dramatically reduces the complexity of rulesets. .br As an example, an address specified as 1.2.3.4/24{128,35-55,89} will match the following IP addresses: .br 1.2.3.128, 1.2.3.35 to 1.2.3.55, 1.2.3.89 . .It Ar ports : Bro Ar port | port Ns \&- Ns Ar port Ns Brc Ns Op , Ns Ar ports For protocols which support port numbers (such as TCP and UDP), optional .Cm ports may be specified as one or more ports or port ranges, separated by commas but no spaces, and an optional .Cm not operator. The .Ql \&- notation specifies a range of ports (including boundaries). .Pp Service names (from .Pa /etc/services ) may be used instead of numeric port values. The length of the port list is limited to 30 ports or ranges, though one can specify larger ranges by using an .Em or-block in the .Cm options section of the rule. .Pp A backslash .Pq Ql \e can be used to escape the dash .Pq Ql - character in a service name (from a shell, the backslash must be typed twice to avoid the shell itself interpreting it as an escape character). .Pp .Dl "ipfw add count tcp from any ftp\e\e-data-ftp to any" .Pp Fragmented packets which have a non-zero offset (i.e. not the first fragment) will never match a rule which has one or more port specifications. See the .Cm frag option for details on matching fragmented packets. .El .Ss RULE OPTIONS (MATCH PATTERNS) Additional match patterns can be used within rules. Zero or more of these so-called .Em options can be present in a rule, optionally prefixed by the .Cm not operator, and possibly grouped into .Em or-blocks . .Pp The following match patterns can be used (listed in alphabetical order): .Bl -tag -width indent .It Cm // this is a comment. Inserts the specified text as a comment in the rule. Everything following // is considered as a comment and stored in the rule. You can have comment-only rules, which are listed as having a .Cm count action followed by the comment.
.It Cm bridged Matches only bridged packets. .It Cm dst-ip Ar ip-address Matches IP packets whose destination IP is one of the address(es) specified as argument. .It Cm dst-port Ar ports Matches IP packets whose destination port is one of the port(s) specified as argument. .It Cm established Matches TCP packets that have the RST or ACK bits set. .It Cm frag Matches packets that are fragments and not the first fragment of an IP datagram. Note that these packets will not have the next protocol header (e.g. TCP, UDP) so options that look into these headers cannot match. .It Cm gid Ar group Matches all TCP or UDP packets sent by or received for a .Ar group . A .Ar group may be specified by name or number. .It Cm icmptypes Ar types Matches ICMP packets whose ICMP type is in the list .Ar types . The list may be specified as any combination of individual types (numeric) separated by commas. .Em Ranges are not allowed. The supported ICMP types are: .Pp echo reply .Pq Cm 0 , destination unreachable .Pq Cm 3 , source quench .Pq Cm 4 , redirect .Pq Cm 5 , echo request .Pq Cm 8 , router advertisement .Pq Cm 9 , router solicitation .Pq Cm 10 , time-to-live exceeded .Pq Cm 11 , IP header bad .Pq Cm 12 , timestamp request .Pq Cm 13 , timestamp reply .Pq Cm 14 , information request .Pq Cm 15 , information reply .Pq Cm 16 , address mask request .Pq Cm 17 and address mask reply .Pq Cm 18 . .It Cm in | out Matches incoming or outgoing packets, respectively. .Cm in and .Cm out are mutually exclusive (in fact, .Cm out is implemented as .Cm not in Ns No ). .It Cm ipid Ar id-list Matches IP packets whose .Cm ip_id field has value included in .Ar id-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm iplen Ar len-list Matches IP packets whose total length, including header and data, is in the set .Ar len-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . 
.It Cm ipoptions Ar spec Matches packets whose IP header contains the comma separated list of options specified in .Ar spec . The supported IP options are: .Pp .Cm ssrr (strict source route), .Cm lsrr (loose source route), .Cm rr (record packet route) and .Cm ts (timestamp). The absence of a particular option may be denoted with a .Ql \&! . .It Cm ipprecedence Ar precedence Matches IP packets whose precedence field is equal to .Ar precedence . .It Cm ipsec Matches packets that have IPSEC history associated with them (i.e. the packet comes encapsulated in IPSEC, the kernel has IPSEC support and IPSEC_FILTERGIF option, and can correctly decapsulate it). .Pp Note that specifying .Cm ipsec is different from specifying .Cm proto Ar ipsec as the latter will only look at the specific IP protocol field, irrespective of IPSEC kernel support and the validity of the IPSEC data. .It Cm iptos Ar spec Matches IP packets whose .Cm tos field contains the comma separated list of service types specified in .Ar spec . The supported IP types of service are: .Pp .Cm lowdelay .Pq Dv IPTOS_LOWDELAY , .Cm throughput .Pq Dv IPTOS_THROUGHPUT , .Cm reliability .Pq Dv IPTOS_RELIABILITY , .Cm mincost .Pq Dv IPTOS_MINCOST , .Cm congestion .Pq Dv IPTOS_CE . The absence of a particular type may be denoted with a .Ql \&! . .It Cm ipttl Ar ttl-list Matches IP packets whose time to live is included in .Ar ttl-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm ipversion Ar ver Matches IP packets whose IP version field is .Ar ver . .It Cm keep-state Upon a match, the firewall will create a dynamic rule, whose default behaviour is to match bidirectional traffic between source and destination IP/port using the same protocol. The rule has a limited lifetime (controlled by a set of .Xr sysctl 8 variables), and the lifetime is refreshed every time a matching packet is found. .It Cm layer2 Matches only layer2 packets, i.e. 
those passed to .Nm from ether_demux() and ether_output_frame(). .It Cm limit Bro Cm src-addr | src-port | dst-addr | dst-port Brc Ar N The firewall will only allow .Ar N connections with the same set of parameters as specified in the rule. One or more of source and destination addresses and ports can be specified. .It Cm { MAC | mac } Ar dst-mac src-mac Match packets with a given .Ar dst-mac and .Ar src-mac addresses, specified as the .Cm any keyword (matching any MAC address), or six groups of hex digits separated by colons, and optionally followed by a mask indicating the significant bits. The mask may be specified using either of the following methods: .Bl -enum -width indent .It A slash .Pq / followed by the number of significant bits. For example, an address with 33 significant bits could be specified as: .Pp .Dl "MAC 10:20:30:40:50:60/33 any" .Pp .It An ampersand .Pq & followed by a bitmask specified as six groups of hex digits separated by colons. For example, an address in which the last 16 bits are significant could be specified as: .Pp .Dl "MAC 10:20:30:40:50:60&00:00:00:00:ff:ff any" .Pp Note that the ampersand character has a special meaning in many shells and should generally be escaped. .Pp .El Note that the order of MAC addresses (destination first, source second) is the same as on the wire, but the opposite of the one used for IP addresses. .It Cm mac-type Ar mac-type Matches packets whose Ethernet Type field corresponds to one of those specified as argument. .Ar mac-type is specified in the same way as .Cm port numbers (i.e. one or more comma-separated single values or ranges). You can use symbolic names for known values such as .Em vlan , ipv4, ipv6 . Values can be entered as decimal or hexadecimal (if prefixed by 0x), and they are always printed as hexadecimal (unless the .Cm -N option is used, in which case symbolic resolution will be attempted). .It Cm proto Ar protocol Matches packets with the corresponding IPv4 protocol. 
.It Cm recv | xmit | via Brq Ar ifX | Ar if Ns Cm * | Ar ipno | Ar any Matches packets received, transmitted or going through, respectively, the interface specified by exact name .Ns No ( Ar ifX Ns No ), by device name .Ns No ( Ar if Ns Ar * Ns No ), by IP address, or through some interface. .Pp The .Cm via keyword causes the interface to always be checked. If .Cm recv or .Cm xmit is used instead of .Cm via , then only the receive or transmit interface (respectively) is checked. By specifying both, it is possible to match packets based on both receive and transmit interface, e.g.: .Pp .Dl "ipfw add deny ip from any to any out recv ed0 xmit ed1" .Pp The .Cm recv interface can be tested on either incoming or outgoing packets, while the .Cm xmit interface can only be tested on outgoing packets. So .Cm out is required (and .Cm in is invalid) whenever .Cm xmit is used. .Pp A packet may not have a receive or transmit interface: packets originating from the local host have no receive interface, while packets destined for the local host have no transmit interface. .It Cm setup Matches TCP packets that have the SYN bit set but no ACK bit. This is the short form of .Dq Li tcpflags\ syn,!ack . .It Cm src-ip Ar ip-address Matches IP packets whose source IP is one of the address(es) specified as argument. .It Cm src-port Ar ports Matches IP packets whose source port is one of the port(s) specified as argument. .It Cm tcpack Ar ack TCP packets only. Match if the TCP header acknowledgment number field is set to .Ar ack . .It Cm tcpflags Ar spec TCP packets only. Match if the TCP header contains the comma separated list of flags specified in .Ar spec . The supported TCP flags are: .Pp .Cm fin , .Cm syn , .Cm rst , .Cm psh , .Cm ack and .Cm urg . The absence of a particular flag may be denoted with a .Ql \&! . A rule which contains a .Cm tcpflags specification can never match a fragmented packet which has a non-zero offset. 
See the .Cm frag option for details on matching fragmented packets. .It Cm tcpseq Ar seq TCP packets only. Match if the TCP header sequence number field is set to .Ar seq . .It Cm tcpwin Ar win TCP packets only. Match if the TCP header window field is set to .Ar win . .It Cm tcpoptions Ar spec TCP packets only. Match if the TCP header contains the comma separated list of options specified in .Ar spec . The supported TCP options are: .Pp .Cm mss (maximum segment size), .Cm window (tcp window advertisement), .Cm sack (selective ack), .Cm ts (rfc1323 timestamp) and .Cm cc (rfc1644 t/tcp connection count). The absence of a particular option may be denoted with a .Ql \&! . .It Cm uid Ar user Match all TCP or UDP packets sent by or received for a .Ar user . A .Ar user may be matched by name or identification number. .It Cm verrevpath For incoming packets, a routing table lookup is done on the packet's source address. If the interface on which the packet entered the system matches the outgoing interface for the route, the packet matches. If the interfaces do not match up, the packet does not match. All outgoing packets or packets with no incoming interface match. .Pp The name and functionality of the option is intentionally similar to the Cisco IOS command: .Pp .Dl ip verify unicast reverse-path .Pp This option can be used to make anti-spoofing rules. .El +.Sh LOOKUP TABLES +Lookup tables are useful to handle large sparse address sets, +typically from a hundred to several thousands of entries. +There could be 128 different lookup tables, numbered 0 to 127. +.Pp +Each entry is represented by an +.Ar addr Ns Op / Ns Ar masklen +and will match all addresses with base +.Ar addr +(specified as a dotted quad or a hostname) +and mask width of +.Ar masklen +bits. +If +.Ar masklen +is not specified, it defaults to 32. +When looking up an IP address in a table, the most specific +entry will match. 
+Associated with each entry is a 32-bit unsigned +.Ar value , +which can optionally be checked by a rule matching code. +When adding an entry, if +.Ar value +is not specified, it defaults to 0. +.Pp +An entry can be added to a table +.Pq Cm add , +removed from a table +.Pq Cm delete , +a table can be examined +.Pq Cm list +or flushed +.Pq Cm flush . +.Pp +Internally, each table is stored in a Radix tree, the same way as +the routing table (see +.Xr route 4 ) . .Sh SETS OF RULES Each rule belongs to one of 32 different .Em sets , numbered 0 to 31. Set 31 is reserved for the default rule. .Pp By default, rules are put in set 0, unless you use the .Cm set N attribute when entering a new rule. Sets can be individually and atomically enabled or disabled, so this mechanism permits an easy way to store multiple configurations of the firewall and quickly (and atomically) switch between them. The command to enable/disable sets is .Bd -ragged -offset indent .Nm .Cm set Oo Cm disable Ar number ... Oc Op Cm enable Ar number ... .Ed .Pp where multiple .Cm enable or .Cm disable sections can be specified. Command execution is atomic on all the sets specified in the command. By default, all sets are enabled. .Pp When you disable a set, its rules behave as if they do not exist in the firewall configuration, with only one exception: .Bd -ragged -offset indent dynamic rules created from a rule before it had been disabled will still be active until they expire. In order to delete dynamic rules you have to explicitly delete the parent rule which generated them. .Ed .Pp The set number of rules can be changed with the command .Bd -ragged -offset indent .Nm .Cm set move .Brq Cm rule Ar rule-number | old-set .Cm to Ar new-set .Ed .Pp Also, you can atomically swap two rulesets with the command .Bd -ragged -offset indent .Nm .Cm set swap Ar first-set second-set .Ed .Pp See the .Sx EXAMPLES Section on some possible uses of sets of rules. 
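.Pp
As a minimal sketch of the commands above (the rule number 1000 and the
set numbers 5 and 6 are arbitrary values chosen for illustration),
a rule can be parked in a disabled set and the contents of two sets
exchanged later:
.Bd -literal -offset indent
ipfw set disable 5              # rules in set 5 no longer match
ipfw set move rule 1000 to 5    # rule 1000 now belongs to set 5
ipfw set swap 5 6               # exchange the rules of sets 5 and 6
ipfw set show                   # report enabled and disabled sets
.Ed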
.Sh STATEFUL FIREWALL Stateful operation is a way for the firewall to dynamically create rules for specific flows when packets that match a given pattern are detected. Support for stateful operation comes through the .Cm check-state , keep-state and .Cm limit options of .Nm rules. .Pp Dynamic rules are created when a packet matches a .Cm keep-state or .Cm limit rule, causing the creation of a .Em dynamic rule which will match all and only packets with a given .Em protocol between a .Em src-ip/src-port dst-ip/dst-port pair of addresses ( .Em src and .Em dst are used here only to denote the initial match addresses, but they are completely equivalent afterwards). Dynamic rules will be checked at the first .Cm check-state, keep-state or .Cm limit occurrence, and the action performed upon a match will be the same as in the parent rule. .Pp Note that no additional attributes other than protocol and IP addresses and ports are checked on dynamic rules. .Pp The typical use of dynamic rules is to keep a closed firewall configuration, but let the first TCP SYN packet from the inside network install a dynamic rule for the flow so that packets belonging to that session will be allowed through the firewall: .Pp .Dl "ipfw add check-state" .Dl "ipfw add allow tcp from my-subnet to any setup keep-state" .Dl "ipfw add deny tcp from any to any" .Pp A similar approach can be used for UDP, where an UDP packet coming from the inside will install a dynamic rule to let the response through the firewall: .Pp .Dl "ipfw add check-state" .Dl "ipfw add allow udp from my-subnet to any keep-state" .Dl "ipfw add deny udp from any to any" .Pp Dynamic rules expire after some time, which depends on the status of the flow and the setting of some .Cm sysctl variables. See Section .Sx SYSCTL VARIABLES for more details. For TCP sessions, dynamic rules can be instructed to periodically send keepalive packets to refresh the state of the rule when it is about to expire. 
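.Pp
As a sketch, keepalive generation can be inspected and switched on with
the commands below; both forms appear elsewhere in this page, and which
one is preferable depends on local convention:
.Bd -literal -offset indent
sysctl net.inet.ip.fw.dyn_keepalive    # print the current setting
ipfw enable dyn_keepalive              # turn keepalives on
.Ed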
.Pp See Section .Sx EXAMPLES for more examples on how to use dynamic rules. .Sh TRAFFIC SHAPER (DUMMYNET) CONFIGURATION .Nm is also the user interface for the .Xr dummynet 4 traffic shaper. .Pp .Nm dummynet operates by first using the firewall to classify packets and divide them into .Em flows , using any match pattern that can be used in .Nm rules. Depending on local policies, a flow can contain packets for a single TCP connection, or from/to a given host, or an entire subnet, or a protocol type, etc. .Pp Packets belonging to the same flow are then passed to either of two different objects, which implement the traffic regulation: .Bl -hang -offset XXXX .It Em pipe A pipe emulates a link with given bandwidth, propagation delay, queue size and packet loss rate. Packets are queued in front of the pipe as they come out from the classifier, and then transferred to the pipe according to the pipe's parameters. .Pp .It Em queue A queue is an abstraction used to implement the WF2Q+ (Worst-case Fair Weighted Fair Queueing) policy, which is an efficient variant of the WFQ policy. .br The queue associates a .Em weight and a reference pipe to each flow, and then all backlogged (i.e., with packets queued) flows linked to the same pipe share the pipe's bandwidth proportionally to their weights. Note that weights are not priorities; a flow with a lower weight is still guaranteed to get its fraction of the bandwidth even if a flow with a higher weight is permanently backlogged. .Pp .El In practice, .Em pipes can be used to set hard limits on the bandwidth that a flow can use, whereas .Em queues can be used to determine how different flows share the available bandwidth.
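.Pp
As a rough sketch of the difference (the pipe and queue numbers, the
bandwidth, the weights and the host addresses are all assumptions made
for illustration), the first pair of commands gives one host a hard
1Mbit/s cap through a pipe, while the remaining commands let two other
hosts share a second pipe through weighted queues:
.Bd -literal -offset indent
ipfw pipe 1 config bw 1Mbit/s
ipfw add pipe 1 ip from 192.168.2.10 to any out

ipfw pipe 2 config bw 1Mbit/s
ipfw queue 1 config pipe 2 weight 30
ipfw queue 2 config pipe 2 weight 10
ipfw add queue 1 ip from 192.168.2.20 to any out
ipfw add queue 2 ip from 192.168.2.21 to any out
.Ed
.Pp
While both queues are backlogged, the weights 30 and 10 should split
pipe 2 in roughly a 3:1 ratio; the weights only set relative shares,
not absolute rates.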
.Pp The .Em pipe and .Em queue configuration commands are the following: .Bd -ragged -offset indent .Cm pipe Ar number Cm config Ar pipe-configuration .Pp .Cm queue Ar number Cm config Ar queue-configuration .Ed .Pp The following parameters can be configured for a pipe: .Pp .Bl -tag -width indent -compact .It Cm bw Ar bandwidth | device Bandwidth, measured in .Sm off .Op Cm K | M .Brq Cm bit/s | Byte/s . .Sm on .Pp A value of 0 (default) means unlimited bandwidth. The unit must immediately follow the number, as in .Pp .Dl "ipfw pipe 1 config bw 300Kbit/s" .Pp If a device name is specified instead of a numeric value, as in .Pp .Dl "ipfw pipe 1 config bw tun0" .Pp then the transmit clock is supplied by the specified device. At the moment only the .Xr tun 4 device supports this functionality, for use in conjunction with .Xr ppp 8 . .Pp .It Cm delay Ar ms-delay Propagation delay, measured in milliseconds. The value is rounded to the next multiple of the clock tick (typically 10ms, but it is a good practice to run kernels with .Dq "options HZ=1000" to reduce the granularity to 1ms or less). Default value is 0, meaning no delay. .El .Pp The following parameters can be configured for a queue: .Pp .Bl -tag -width indent -compact .It Cm pipe Ar pipe_nr Connects a queue to the specified pipe. Multiple queues (with the same or different weights) can be connected to the same pipe, which specifies the aggregate rate for the set of queues. .Pp .It Cm weight Ar weight Specifies the weight to be used for flows matching this queue. The weight must be in the range 1..100, and defaults to 1. .El .Pp Finally, the following parameters can be configured for both pipes and queues: .Pp .Bl -tag -width XXXX -compact .Pp .It Cm buckets Ar hash-table-size Specifies the size of the hash table used for storing the various queues. Default value is 64 controlled by the .Xr sysctl 8 variable .Em net.inet.ip.dummynet.hash_size , allowed range is 16 to 65536. 
.Pp .It Cm mask Ar mask-specifier Packets sent to a given pipe or queue by an .Nm rule can be further classified into multiple flows, each of which is then sent to a different .Em dynamic pipe or queue. A flow identifier is constructed by masking the IP addresses, ports and protocol types as specified with the .Cm mask options in the configuration of the pipe or queue. For each different flow identifier, a new pipe or queue is created with the same parameters as the original object, and matching packets are sent to it. .Pp Thus, when .Em dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when .Em dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe). .br Available mask specifiers are a combination of one or more of the following: .Pp .Cm dst-ip Ar mask , .Cm src-ip Ar mask , .Cm dst-port Ar mask , .Cm src-port Ar mask , .Cm proto Ar mask or .Cm all , .Pp where the latter means all bits in all fields are significant. .Pp .It Cm noerror When a packet is dropped by a dummynet queue or pipe, the error is normally reported to the caller routine in the kernel, in the same way as it happens when a device queue fills up. Setting this option reports the packet as successfully delivered, which can be needed for some experimental setups where you want to simulate loss or congestion at a remote router. .Pp .It Cm plr Ar packet-loss-rate Packet loss rate. Argument .Ar packet-loss-rate is a floating-point number between 0 and 1, with 0 meaning no loss, 1 meaning 100% loss. The loss rate is internally represented on 31 bits. .Pp .It Cm queue Brq Ar slots | size Ns Cm Kbytes Queue size, in .Ar slots or .Cm KBytes . Default value is 50 slots, which is the typical queue size for Ethernet devices. 
Note that for slow speed links you should keep the queue size short or your traffic might be affected by a significant queueing delay. E.g., 50 max-sized ethernet packets (1500 bytes) mean 600Kbit or 20s of queue on a 30Kbit/s pipe. Even worse effect can result if you get packets from an interface with a much larger MTU, e.g. the loopback interface with its 16KB packets. .Pp .It Cm red | gred Ar w_q Ns / Ns Ar min_th Ns / Ns Ar max_th Ns / Ns Ar max_p Make use of the RED (Random Early Detection) queue management algorithm. .Ar w_q and .Ar max_p are floating point numbers between 0 and 1 (0 not included), while .Ar min_th and .Ar max_th are integer numbers specifying thresholds for queue management (thresholds are computed in bytes if the queue has been defined in bytes, in slots otherwise). The .Xr dummynet 4 also supports the gentle RED variant (gred). Three .Xr sysctl 8 variables can be used to control the RED behaviour: .Bl -tag -width indent .It Em net.inet.ip.dummynet.red_lookup_depth specifies the accuracy in computing the average queue when the link is idle (defaults to 256, must be greater than zero) .It Em net.inet.ip.dummynet.red_avg_pkt_size specifies the expected average packet size (defaults to 512, must be greater than zero) .It Em net.inet.ip.dummynet.red_max_pkt_size specifies the expected maximum packet size, only used when queue thresholds are in bytes (defaults to 1500, must be greater than zero). .El .El .Sh CHECKLIST Here are some important points to consider when designing your rules: .Bl -bullet .It Remember that you filter both packets going .Cm in and .Cm out . Most connections need packets going in both directions. .It Remember to test very carefully. It is a good idea to be near the console when doing this. If you cannot be near the console, use an auto-recovery script such as the one in .Pa /usr/share/examples/ipfw/change_rules.sh . .It Don't forget the loopback interface. 
.El .Sh FINE POINTS .Bl -bullet .It There are circumstances where fragmented datagrams are unconditionally dropped. TCP packets are dropped if they do not contain at least 20 bytes of TCP header, UDP packets are dropped if they do not contain a full 8 byte UDP header, and ICMP packets are dropped if they do not contain 4 bytes of ICMP header, enough to specify the ICMP type, code, and checksum. These packets are simply logged as .Dq pullup failed since there may not be enough good data in the packet to produce a meaningful log entry. .It Another type of packet is unconditionally dropped, a TCP packet with a fragment offset of one. This is a valid packet, but it only has one use, to try to circumvent firewalls. When logging is enabled, these packets are reported as being dropped by rule -1. .It If you are logged in over a network, loading the .Xr kld 4 version of .Nm is probably not as straightforward as you would think. I recommend the following command line: .Bd -literal -offset indent kldload ipfw && \e ipfw add 32000 allow ip from any to any .Ed .Pp Along the same lines, doing an .Bd -literal -offset indent ipfw flush .Ed .Pp in similar surroundings is also a bad idea. .It The .Nm filter list may not be modified if the system security level is set to 3 or higher (see .Xr init 8 for information on system security levels). .El .Sh PACKET DIVERSION A .Xr divert 4 socket bound to the specified port will receive all packets diverted to that port. If no socket is bound to the destination port, or if the kernel wasn't compiled with divert socket support, the packets are dropped. .Sh SYSCTL VARIABLES A set of .Xr sysctl 8 variables controls the behaviour of the firewall and associated modules ( .Nm dummynet, bridge ). 
These are shown below together with their default value (but always check with the .Xr sysctl 8 command what value is actually in use) and meaning: .Bl -tag -width indent .It Em net.inet.ip.dummynet.expire : No 1 Lazily delete dynamic pipes/queue once they have no pending traffic. You can disable this by setting the variable to 0, in which case the pipes/queues will only be deleted when the threshold is reached. .It Em net.inet.ip.dummynet.hash_size : No 64 Default size of the hash table used for dynamic pipes/queues. This value is used when no .Cm buckets option is specified when configuring a pipe/queue. .It Em net.inet.ip.dummynet.max_chain_len : No 16 Target value for the maximum number of pipes/queues in a hash bucket. The product .Cm max_chain_len*hash_size is used to determine the threshold over which empty pipes/queues will be expired even when .Cm net.inet.ip.dummynet.expire=0 . .It Em net.inet.ip.dummynet.red_lookup_depth : No 256 .It Em net.inet.ip.dummynet.red_avg_pkt_size : No 512 .It Em net.inet.ip.dummynet.red_max_pkt_size : No 1500 Parameters used in the computations of the drop probability for the RED algorithm. .It Em net.inet.ip.fw.autoinc_step : No 100 Delta between rule numbers when auto-generating them. The value must be in the range 1..1000. This variable is only present in .Nm ipfw2 , the delta is hardwired to 100 in .Nm ipfw1 . .It Em net.inet.ip.fw.curr_dyn_buckets : Em net.inet.ip.fw.dyn_buckets The current number of buckets in the hash table for dynamic rules (readonly). .It Em net.inet.ip.fw.debug : No 1 Controls debugging messages produced by .Nm . .It Em net.inet.ip.fw.dyn_buckets : No 256 The number of buckets in the hash table for dynamic rules. Must be a power of 2, up to 65536. It only takes effect when all dynamic rules have expired, so you are advised to use a .Cm flush command to make sure that the hash table is resized. .It Em net.inet.ip.fw.dyn_count : No 3 Current number of dynamic rules (read-only). 
.It Em net.inet.ip.fw.dyn_keepalive : No 1 Enables generation of keepalive packets for .Cm keep-state rules on TCP sessions. A keepalive is sent to both sides of the connection every 5 seconds for the last 20 seconds of the lifetime of the rule. .It Em net.inet.ip.fw.dyn_max : No 8192 Maximum number of dynamic rules. When you hit this limit, no more dynamic rules can be installed until old ones expire. .It Em net.inet.ip.fw.dyn_ack_lifetime : No 300 .It Em net.inet.ip.fw.dyn_syn_lifetime : No 20 .It Em net.inet.ip.fw.dyn_fin_lifetime : No 1 .It Em net.inet.ip.fw.dyn_rst_lifetime : No 1 .It Em net.inet.ip.fw.dyn_udp_lifetime : No 5 .It Em net.inet.ip.fw.dyn_short_lifetime : No 30 These variables control the lifetime, in seconds, of dynamic rules. Upon the initial SYN exchange the lifetime is kept short, then increased after both SYNs have been seen, then decreased again during the final FIN exchange or when a RST is received. Both .Em dyn_fin_lifetime and .Em dyn_rst_lifetime must be strictly lower than 5 seconds, the period of repetition of keepalives. The firewall enforces that. .It Em net.inet.ip.fw.enable : No 1 Enables the firewall. Setting this variable to 0 lets you run your machine without a firewall even if it is compiled in. .It Em net.inet.ip.fw.one_pass : No 1 When set, the packet exiting from the .Xr dummynet 4 pipe is not passed through the firewall again. Otherwise, after a pipe action, the packet is reinjected into the firewall at the next rule. .It Em net.inet.ip.fw.verbose : No 1 Enables verbose messages. .It Em net.inet.ip.fw.verbose_limit : No 0 Limits the number of messages produced by a verbose firewall. .It Em net.link.ether.ipfw : No 0 Controls whether layer-2 packets are passed to .Nm . Default is no. .It Em net.link.ether.bridge_ipfw : No 0 Controls whether bridged packets are passed to .Nm . Default is no.
.El .Sh USING IPFW2 IN FreeBSD-STABLE .Nm ipfw2 is standard in .Fx CURRENT, whereas .Fx STABLE still uses .Nm ipfw1 unless the kernel is compiled with .Cm options IPFW2 , and .Nm /sbin/ipfw and .Nm /usr/lib/libalias are recompiled with .Cm -DIPFW2 and reinstalled (the same effect can be achieved by adding .Cm IPFW2=TRUE to .Nm /etc/make.conf before a buildworld). .Pp .Sh IPFW2 ENHANCEMENTS This Section lists the features that have been introduced in .Nm ipfw2 which were not present in .Nm ipfw1 . We list them in order of the potential impact that they can have in writing your rulesets. You might want to consider using these features in order to write your rulesets in a more efficient way. .Bl -tag -width indent .It Syntax and flags .Nm ipfw1 does not support the -n flag (only test syntax), nor does it allow spaces after commas or support all rule fields in a single argument. .Nm ipfw1 does not allow the -f flag (force) in conjunction with the -p flag (preprocessor). .Nm ipfw1 does not support the -c (compact) flag. .It Handling of non-IPv4 packets .Nm ipfw1 will silently accept all non-IPv4 packets (which .Nm ipfw1 will only see when .Em net.link.ether.bridge_ipfw=1 Ns ). .Nm ipfw2 will filter all packets (including non-IPv4 ones) according to the ruleset. To achieve the same behaviour as .Nm ipfw1 you can use the following as the very first rule in your ruleset: .Pp .Dl "ipfw add 1 allow layer2 not mac-type ip" .Pp The .Cm layer2 option might seem redundant, but it is necessary -- packets passed to the firewall from layer3 will not have a MAC header, so the .Cm mac-type ip pattern will always fail on them, and the .Cm not operator will make this rule into a pass-all. .It Addresses .Nm ipfw1 does not support address sets or lists of addresses. .Pp .It Port specifications .Nm ipfw1 only allows one port range when specifying TCP and UDP ports, and is limited to 10 entries instead of the 30 allowed by .Nm ipfw2 .
Also, in .Nm ipfw1 you can only specify ports when the rule is requesting .Cm tcp or .Cm udp packets. With .Nm ipfw2 you can put port specifications in rules matching all packets, and the match will be attempted only on those packets carrying protocols which include port identifiers. .Pp Finally, .Nm ipfw1 allowed the first port entry to be specified as .Ar port:mask where .Ar mask can be an arbitrary 16-bit mask. This syntax is of questionable usefulness and it is not supported anymore in .Nm ipfw2 . .It Or-blocks .Nm ipfw1 does not support Or-blocks. .It keepalives .Nm ipfw1 does not generate keepalives for stateful sessions. As a consequence, it might cause idle sessions to drop because the lifetime of the dynamic rules expires. .It Sets of rules .Nm ipfw1 does not implement sets of rules. .It MAC header filtering and Layer-2 firewalling. .Nm ipfw1 does not implement filtering on MAC header fields, nor is it invoked on packets from .Cm ether_demux() and .Cm ether_output_frame(). The sysctl variable .Em net.link.ether.ipfw has no effect there. .It Options In .Nm ipfw1 , the following options only accept a single value as an argument: .Pp .Cm ipid, iplen, ipttl .Pp The following options are not implemented by .Nm ipfw1 : .Pp .Cm dst-ip, dst-port, layer2, mac, mac-type, src-ip, src-port. .Pp Additionally, the RELENG_4 version of .Nm ipfw1 does not implement the following options: .Pp .Cm ipid, iplen, ipprecedence, iptos, ipttl, .Cm ipversion, tcpack, tcpseq, tcpwin . .It Dummynet options The following option for .Nm dummynet pipes/queues is not supported: .Cm noerror . .El .Sh EXAMPLES There are far too many possible uses of .Nm so this Section will only give a small set of examples. 
.Pp .Ss BASIC PACKET FILTERING This command adds an entry which denies all tcp packets from .Em cracker.evil.org to the telnet port of .Em wolf.tambov.su from being forwarded by the host: .Pp .Dl "ipfw add deny tcp from cracker.evil.org to wolf.tambov.su telnet" .Pp This one disallows any connection from the cracker's entire network to my host: .Pp .Dl "ipfw add deny ip from 123.45.67.0/24 to my.host.org" .Pp A first and efficient way to limit access (not using dynamic rules) is the use of the following rules: .Pp .Dl "ipfw add allow tcp from any to any established" .Dl "ipfw add allow tcp from net1 portlist1 to net2 portlist2 setup" .Dl "ipfw add allow tcp from net3 portlist3 to net3 portlist3 setup" .Dl "..." .Dl "ipfw add deny tcp from any to any" .Pp The first rule will be a quick match for normal TCP packets, but it will not match the initial SYN packet, which will be matched by the .Cm setup rules only for selected source/destination pairs. All other SYN packets will be rejected by the final .Cm deny rule. .Pp If you administer one or more subnets, you can take advantage of the .Nm ipfw2 syntax to specify address sets and or-blocks and write extremely compact rulesets which selectively enable services to blocks of clients, as below: .Pp .Dl "goodguys=\*q{ 10.1.2.0/24{20,35,66,18} or 10.2.3.0/28{6,3,11} }\*q" .Dl "badguys=\*q10.1.2.0/24{8,38,60}\*q" .Dl "" .Dl "ipfw add allow ip from ${goodguys} to any" .Dl "ipfw add deny ip from ${badguys} to any" .Dl "... normal policies ..." .Pp The .Nm ipfw1 syntax would require a separate rule for each IP in the above example. .Pp The .Cm verrevpath option could be used to do automated anti-spoofing by adding the following to the top of a ruleset: .Pp .Dl "ipfw add deny ip from any to any not verrevpath in" .Pp This rule drops all incoming packets that appear to be coming to the system on the wrong interface.
For example, a packet with a source address belonging to a host on a protected internal network would be dropped if it tried to enter the system from an external interface. .Ss DYNAMIC RULES In order to protect a site from flood attacks involving fake TCP packets, it is safer to use dynamic rules: .Pp .Dl "ipfw add check-state" .Dl "ipfw add deny tcp from any to any established" .Dl "ipfw add allow tcp from my-net to any setup keep-state" .Pp This will let the firewall install dynamic rules only for those connections which start with a regular SYN packet coming from the inside of our network. Dynamic rules are checked when encountering the first .Cm check-state or .Cm keep-state rule. A .Cm check-state rule should usually be placed near the beginning of the ruleset to minimize the amount of work scanning the ruleset. Your mileage may vary. .Pp To limit the number of connections a user can open you can use the following type of rules: .Pp .Dl "ipfw add allow tcp from my-net/24 to any setup limit src-addr 10" .Dl "ipfw add allow tcp from any to me setup limit src-addr 4" .Pp The former (assuming it runs on a gateway) will allow each host on a /24 network to open at most 10 TCP connections. The latter can be placed on a server to make sure that a single client does not use more than 4 simultaneous connections. .Pp .Em BEWARE : stateful rules can be subject to denial-of-service attacks by a SYN-flood which opens a huge number of dynamic rules. The effects of such attacks can be partially limited by acting on a set of .Xr sysctl 8 variables which control the operation of the firewall.
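.Pp
A sketch of such tuning, using variables described in Section
.Sx SYSCTL VARIABLES
(the values shown are illustrative assumptions, not recommendations):
.Bd -literal -offset indent
sysctl net.inet.ip.fw.dyn_count             # dynamic rules now in use
sysctl net.inet.ip.fw.dyn_max=4096          # cap on dynamic rules
sysctl net.inet.ip.fw.dyn_syn_lifetime=10   # expire half-open flows sooner
.Ed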
.Pp Here is a good usage of the .Cm list command to see accounting records and timestamp information: .Pp .Dl ipfw -at list .Pp or in short form without timestamps: .Pp .Dl ipfw -a list .Pp which is equivalent to: .Pp .Dl ipfw show .Pp Next rule diverts all incoming packets from 192.168.2.0/24 to divert port 5000: .Pp .Dl ipfw divert 5000 ip from 192.168.2.0/24 to any in .Pp .Ss TRAFFIC SHAPING The following rules show some of the applications of .Nm and .Xr dummynet 4 for simulations and the like. .Pp This rule drops random incoming packets with a probability of 5%: .Pp .Dl "ipfw add prob 0.05 deny ip from any to any in" .Pp A similar effect can be achieved making use of dummynet pipes: .Pp .Dl "ipfw add pipe 10 ip from any to any" .Dl "ipfw pipe 10 config plr 0.05" .Pp We can use pipes to artificially limit bandwidth, e.g. on a machine acting as a router, if we want to limit traffic from local clients on 192.168.2.0/24 we do: .Pp .Dl "ipfw add pipe 1 ip from 192.168.2.0/24 to any out" .Dl "ipfw pipe 1 config bw 300Kbit/s queue 50KBytes" .Pp note that we use the .Cm out modifier so that the rule is not used twice. Remember in fact that .Nm rules are checked both on incoming and outgoing packets. .Pp Should we want to simulate a bidirectional link with bandwidth limitations, the correct way is the following: .Pp .Dl "ipfw add pipe 1 ip from any to any out" .Dl "ipfw add pipe 2 ip from any to any in" .Dl "ipfw pipe 1 config bw 64Kbit/s queue 10Kbytes" .Dl "ipfw pipe 2 config bw 64Kbit/s queue 10Kbytes" .Pp The above can be very useful, e.g. if you want to see how your fancy Web page will look for a residential user who is connected only through a slow link. You should not use only one pipe for both directions, unless you want to simulate a half-duplex medium (e.g. AppleTalk, Ethernet, IRDA). It is not necessary that both pipes have the same configuration, so we can also simulate asymmetric links. 
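.Pp
For example, an asymmetric link could be sketched as follows (the
bandwidth figures are arbitrary and chosen only for illustration):
.Bd -literal -offset indent
ipfw add pipe 1 ip from any to any out
ipfw add pipe 2 ip from any to any in
ipfw pipe 1 config bw 128Kbit/s queue 10Kbytes
ipfw pipe 2 config bw 1Mbit/s queue 10Kbytes
.Ed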
.Pp Should we want to verify network performance with the RED queue management algorithm: .Pp .Dl "ipfw add pipe 1 ip from any to any" .Dl "ipfw pipe 1 config bw 500Kbit/s queue 100 red 0.002/30/80/0.1" .Pp Another typical application of the traffic shaper is to introduce some delay in the communication. This can significantly affect applications which do a lot of Remote Procedure Calls, and where the round-trip-time of the connection often becomes a limiting factor much more than bandwidth: .Pp .Dl "ipfw add pipe 1 ip from any to any out" .Dl "ipfw add pipe 2 ip from any to any in" .Dl "ipfw pipe 1 config delay 250ms bw 1Mbit/s" .Dl "ipfw pipe 2 config delay 250ms bw 1Mbit/s" .Pp Per-flow queueing can be useful for a variety of purposes. A very simple one is counting traffic: .Pp .Dl "ipfw add pipe 1 tcp from any to any" .Dl "ipfw add pipe 1 udp from any to any" .Dl "ipfw add pipe 1 ip from any to any" .Dl "ipfw pipe 1 config mask all" .Pp The above set of rules will create queues (and collect statistics) for all traffic. Because the pipes have no limitations, the only effect is collecting statistics. Note that we need 3 rules, not just the last one, because when .Nm tries to match IP packets it will not consider ports, so we would not see connections on separate ports as different ones. .Pp A more sophisticated example is limiting the outbound traffic on a net with per-host limits, rather than per-network limits: .Pp .Dl "ipfw add pipe 1 ip from 192.168.2.0/24 to any out" .Dl "ipfw add pipe 2 ip from any to 192.168.2.0/24 in" .Dl "ipfw pipe 1 config mask src-ip 0x000000ff bw 200Kbit/s queue 20Kbytes" .Dl "ipfw pipe 2 config mask dst-ip 0x000000ff bw 200Kbit/s queue 20Kbytes" .Ss SETS OF RULES To add a set of rules atomically, e.g. set 18: .Pp .Dl "ipfw set disable 18" .Dl "ipfw add NN set 18 ... 
# repeat as needed" .Dl "ipfw set enable 18" .Pp To delete a set of rules atomically the command is simply: .Pp .Dl "ipfw delete set 18" .Pp To test a ruleset and disable it and regain control if something goes wrong: .Pp .Dl "ipfw set disable 18" .Dl "ipfw add NN set 18 ... # repeat as needed" .Dl "ipfw set enable 18; echo done; sleep 30 && ipfw set disable 18" .Pp Here, if everything goes well, you press control-C before the "sleep" terminates, and your ruleset will be left active. Otherwise, e.g. if you cannot access your box, the ruleset will be disabled after the sleep terminates, thus restoring the previous situation. .Sh SEE ALSO .Xr cpp 1 , .Xr m4 1 , .Xr bridge 4 , .Xr divert 4 , .Xr dummynet 4 , .Xr ip 4 , .Xr ipfirewall 4 , .Xr protocols 5 , .Xr services 5 , .Xr init 8 , .Xr kldload 8 , .Xr reboot 8 , .Xr sysctl 8 , .Xr syslogd 8 .Sh BUGS The syntax has grown over the years and sometimes it might be confusing. Unfortunately, backward compatibility prevents cleaning up mistakes made in the definition of the syntax. .Pp .Em !!! WARNING !!! .Pp Misconfiguring the firewall can put your computer in an unusable state, possibly shutting down network services and requiring console access to regain control of it. .Pp Incoming packet fragments diverted by .Cm divert or .Cm tee are reassembled before delivery to the socket. The action used on those packets is the one from the rule which matches the first fragment of the packet. .Pp Packets that match a .Cm tee rule should not be immediately accepted, but should continue going through the rule list. This may be fixed in a later version. .Pp Packets diverted to userland, and then reinserted by a userland process may lose various packet attributes. The packet source interface name will be preserved if it is shorter than 8 bytes and the userland process saves and reuses the sockaddr_in (as does .Xr natd 8 ) ; otherwise, it may be lost.
If a packet is reinserted in this manner, later rules may be incorrectly applied, making the order of .Cm divert rules in the rule sequence very important. .Sh AUTHORS .An Ugen J. S. Antsilevich , .An Poul-Henning Kamp , .An Alex Nash , .An Archie Cobbs , .An Luigi Rizzo . .Pp .An -nosplit API based upon code written by .An Daniel Boulet for BSDI. .Pp Work on .Xr dummynet 4 traffic shaper supported by Akamba Corp. .Sh HISTORY The .Nm utility first appeared in .Fx 2.0 . .Xr dummynet 4 was introduced in .Fx 2.2.8 . Stateful extensions were introduced in .Fx 4.0 . .Nm ipfw2 was introduced in Summer 2002. Index: stable/4/sbin/ipfw/ipfw2.c =================================================================== --- stable/4/sbin/ipfw/ipfw2.c (revision 130570) +++ stable/4/sbin/ipfw/ipfw2.c (revision 130571) @@ -1,3970 +1,4076 @@ /* * Copyright (c) 2002-2003 Luigi Rizzo * Copyright (c) 1996 Alex Nash, Paul Traina, Poul-Henning Kamp * Copyright (c) 1994 Ugen J.S.Antsilevich * * Idea and grammar partially left from: * Copyright (c) 1993 Daniel Boulet * * Redistribution and use in source forms, with and without modification, * are permitted provided that this entire comment appears intact. * * Redistribution in binary form may occur without any restrictions. * Obviously, it would be nice if you gave credit where credit is due * but requiring it would be too onerous. * * This software is provided ``AS IS'' without any warranties of any kind. * * NEW command line interface for IP firewall facility * * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* def. 
of struct route */ #include #include #include int do_resolv, /* Would try to resolve all */ do_time, /* Show time stamps */ do_quiet, /* Be quiet in add and flush */ do_pipe, /* this cmd refers to a pipe */ do_sort, /* field to sort results (0 = no) */ do_dynamic, /* display dynamic rules */ do_expired, /* display expired dynamic rules */ do_compact, /* show rules in compact mode */ do_force, /* do not ask for confirmation */ show_sets, /* display rule sets */ test_only, /* only check syntax */ verbose; #define IP_MASK_ALL 0xffffffff /* * _s_x is a structure that stores string <-> token pairs, used in * various places in the parser. Entries are stored in arrays, * with an entry with s=NULL as terminator. * The search routines are match_token() and match_value(). * Often, an element with x=0 contains an error string. * */ struct _s_x { char const *s; int x; }; static struct _s_x f_tcpflags[] = { { "syn", TH_SYN }, { "fin", TH_FIN }, { "ack", TH_ACK }, { "psh", TH_PUSH }, { "rst", TH_RST }, { "urg", TH_URG }, { "tcp flag", 0 }, { NULL, 0 } }; static struct _s_x f_tcpopts[] = { { "mss", IP_FW_TCPOPT_MSS }, { "maxseg", IP_FW_TCPOPT_MSS }, { "window", IP_FW_TCPOPT_WINDOW }, { "sack", IP_FW_TCPOPT_SACK }, { "ts", IP_FW_TCPOPT_TS }, { "timestamp", IP_FW_TCPOPT_TS }, { "cc", IP_FW_TCPOPT_CC }, { "tcp option", 0 }, { NULL, 0 } }; /* * IP options span the range 0 to 255 so we need to remap them * (though in fact only the low 5 bits are significant).
*/ static struct _s_x f_ipopts[] = { { "ssrr", IP_FW_IPOPT_SSRR}, { "lsrr", IP_FW_IPOPT_LSRR}, { "rr", IP_FW_IPOPT_RR}, { "ts", IP_FW_IPOPT_TS}, { "ip option", 0 }, { NULL, 0 } }; static struct _s_x f_iptos[] = { { "lowdelay", IPTOS_LOWDELAY}, { "throughput", IPTOS_THROUGHPUT}, { "reliability", IPTOS_RELIABILITY}, { "mincost", IPTOS_MINCOST}, { "congestion", IPTOS_CE}, { "ecntransport", IPTOS_ECT}, { "ip tos option", 0}, { NULL, 0 } }; static struct _s_x limit_masks[] = { {"all", DYN_SRC_ADDR|DYN_SRC_PORT|DYN_DST_ADDR|DYN_DST_PORT}, {"src-addr", DYN_SRC_ADDR}, {"src-port", DYN_SRC_PORT}, {"dst-addr", DYN_DST_ADDR}, {"dst-port", DYN_DST_PORT}, {NULL, 0} }; /* * we use IPPROTO_ETHERTYPE as a fake protocol id to call the print routines * This is only used in this code. */ #define IPPROTO_ETHERTYPE 0x1000 static struct _s_x ether_types[] = { /* * Note, we cannot use "-:&/" in the names because they are field * separators in the type specifications. Also, we use s = NULL as * end-delimiter, because a type of 0 can be legal. 
*/ { "ip", 0x0800 }, { "ipv4", 0x0800 }, { "ipv6", 0x86dd }, { "arp", 0x0806 }, { "rarp", 0x8035 }, { "vlan", 0x8100 }, { "loop", 0x9000 }, { "trail", 0x1000 }, { "at", 0x809b }, { "atalk", 0x809b }, { "aarp", 0x80f3 }, { "pppoe_disc", 0x8863 }, { "pppoe_sess", 0x8864 }, { "ipx_8022", 0x00E0 }, { "ipx_8023", 0x0000 }, { "ipx_ii", 0x8137 }, { "ipx_snap", 0x8137 }, { "ipx", 0x8137 }, { "ns", 0x0600 }, { NULL, 0 } }; static void show_usage(void); enum tokens { TOK_NULL=0, TOK_OR, TOK_NOT, TOK_STARTBRACE, TOK_ENDBRACE, TOK_ACCEPT, TOK_COUNT, TOK_PIPE, TOK_QUEUE, TOK_DIVERT, TOK_TEE, TOK_FORWARD, TOK_SKIPTO, TOK_DENY, TOK_REJECT, TOK_RESET, TOK_UNREACH, TOK_CHECKSTATE, TOK_UID, TOK_GID, TOK_IN, TOK_LIMIT, TOK_KEEPSTATE, TOK_LAYER2, TOK_OUT, TOK_XMIT, TOK_RECV, TOK_VIA, TOK_FRAG, TOK_IPOPTS, TOK_IPLEN, TOK_IPID, TOK_IPPRECEDENCE, TOK_IPTOS, TOK_IPTTL, TOK_IPVER, TOK_ESTAB, TOK_SETUP, TOK_TCPFLAGS, TOK_TCPOPTS, TOK_TCPSEQ, TOK_TCPACK, TOK_TCPWIN, TOK_ICMPTYPES, TOK_MAC, TOK_MACTYPE, TOK_VERREVPATH, TOK_IPSEC, TOK_COMMENT, TOK_PLR, TOK_NOERROR, TOK_BUCKETS, TOK_DSTIP, TOK_SRCIP, TOK_DSTPORT, TOK_SRCPORT, TOK_ALL, TOK_MASK, TOK_BW, TOK_DELAY, TOK_RED, TOK_GRED, TOK_DROPTAIL, TOK_PROTO, TOK_WEIGHT, }; struct _s_x dummynet_params[] = { { "plr", TOK_PLR }, { "noerror", TOK_NOERROR }, { "buckets", TOK_BUCKETS }, { "dst-ip", TOK_DSTIP }, { "src-ip", TOK_SRCIP }, { "dst-port", TOK_DSTPORT }, { "src-port", TOK_SRCPORT }, { "proto", TOK_PROTO }, { "weight", TOK_WEIGHT }, { "all", TOK_ALL }, { "mask", TOK_MASK }, { "droptail", TOK_DROPTAIL }, { "red", TOK_RED }, { "gred", TOK_GRED }, { "bw", TOK_BW }, { "bandwidth", TOK_BW }, { "delay", TOK_DELAY }, { "pipe", TOK_PIPE }, { "queue", TOK_QUEUE }, { "dummynet-params", TOK_NULL }, { NULL, 0 } /* terminator */ }; struct _s_x rule_actions[] = { { "accept", TOK_ACCEPT }, { "pass", TOK_ACCEPT }, { "allow", TOK_ACCEPT }, { "permit", TOK_ACCEPT }, { "count", TOK_COUNT }, { "pipe", TOK_PIPE }, { "queue", TOK_QUEUE }, { "divert", TOK_DIVERT }, 
{ "tee", TOK_TEE }, { "fwd", TOK_FORWARD }, { "forward", TOK_FORWARD }, { "skipto", TOK_SKIPTO }, { "deny", TOK_DENY }, { "drop", TOK_DENY }, { "reject", TOK_REJECT }, { "reset", TOK_RESET }, { "unreach", TOK_UNREACH }, { "check-state", TOK_CHECKSTATE }, { "//", TOK_COMMENT }, { NULL, 0 } /* terminator */ }; struct _s_x rule_options[] = { { "uid", TOK_UID }, { "gid", TOK_GID }, { "in", TOK_IN }, { "limit", TOK_LIMIT }, { "keep-state", TOK_KEEPSTATE }, { "bridged", TOK_LAYER2 }, { "layer2", TOK_LAYER2 }, { "out", TOK_OUT }, { "xmit", TOK_XMIT }, { "recv", TOK_RECV }, { "via", TOK_VIA }, { "fragment", TOK_FRAG }, { "frag", TOK_FRAG }, { "ipoptions", TOK_IPOPTS }, { "ipopts", TOK_IPOPTS }, { "iplen", TOK_IPLEN }, { "ipid", TOK_IPID }, { "ipprecedence", TOK_IPPRECEDENCE }, { "iptos", TOK_IPTOS }, { "ipttl", TOK_IPTTL }, { "ipversion", TOK_IPVER }, { "ipver", TOK_IPVER }, { "estab", TOK_ESTAB }, { "established", TOK_ESTAB }, { "setup", TOK_SETUP }, { "tcpflags", TOK_TCPFLAGS }, { "tcpflgs", TOK_TCPFLAGS }, { "tcpoptions", TOK_TCPOPTS }, { "tcpopts", TOK_TCPOPTS }, { "tcpseq", TOK_TCPSEQ }, { "tcpack", TOK_TCPACK }, { "tcpwin", TOK_TCPWIN }, { "icmptype", TOK_ICMPTYPES }, { "icmptypes", TOK_ICMPTYPES }, { "dst-ip", TOK_DSTIP }, { "src-ip", TOK_SRCIP }, { "dst-port", TOK_DSTPORT }, { "src-port", TOK_SRCPORT }, { "proto", TOK_PROTO }, { "MAC", TOK_MAC }, { "mac", TOK_MAC }, { "mac-type", TOK_MACTYPE }, { "verrevpath", TOK_VERREVPATH }, { "ipsec", TOK_IPSEC }, { "//", TOK_COMMENT }, { "not", TOK_NOT }, /* pseudo option */ { "!", /* escape ? 
*/ TOK_NOT }, /* pseudo option */ { "or", TOK_OR }, /* pseudo option */ { "|", /* escape */ TOK_OR }, /* pseudo option */ { "{", TOK_STARTBRACE }, /* pseudo option */ { "(", TOK_STARTBRACE }, /* pseudo option */ { "}", TOK_ENDBRACE }, /* pseudo option */ { ")", TOK_ENDBRACE }, /* pseudo option */ { NULL, 0 } /* terminator */ }; static __inline uint64_t align_uint64(uint64_t *pll) { uint64_t ret; bcopy (pll, &ret, sizeof(ret)); return ret; }; /* * conditionally runs the command. */ static int do_cmd(int optname, void *optval, uintptr_t optlen) { static int s = -1; /* the socket */ int i; if (test_only) return 0; if (s == -1) s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW); if (s < 0) err(EX_UNAVAILABLE, "socket"); if (optname == IP_FW_GET || optname == IP_DUMMYNET_GET || - optname == IP_FW_ADD) + optname == IP_FW_ADD || optname == IP_FW_TABLE_LIST || + optname == IP_FW_TABLE_GETSIZE) i = getsockopt(s, IPPROTO_IP, optname, optval, (socklen_t *)optlen); else i = setsockopt(s, IPPROTO_IP, optname, optval, optlen); return i; } /** * match_token takes a table and a string, returns the value associated * with the string (-1 in case of failure). */ static int match_token(struct _s_x *table, char *string) { struct _s_x *pt; uint i = strlen(string); for (pt = table ; i && pt->s != NULL ; pt++) if (strlen(pt->s) == i && !bcmp(string, pt->s, i)) return pt->x; return -1; }; /** * match_value takes a table and a value, returns the string associated * with the value (NULL in case of failure). 
*/ static char const * match_value(struct _s_x *p, int value) { for (; p->s != NULL; p++) if (p->x == value) return p->s; return NULL; } /* * prints one port, symbolic or numeric */ static void print_port(int proto, uint16_t port) { if (proto == IPPROTO_ETHERTYPE) { char const *s; if (do_resolv && (s = match_value(ether_types, port)) ) printf("%s", s); else printf("0x%04x", port); } else { struct servent *se = NULL; if (do_resolv) { struct protoent *pe = getprotobynumber(proto); se = getservbyport(htons(port), pe ? pe->p_name : NULL); } if (se) printf("%s", se->s_name); else printf("%d", port); } } struct _s_x _port_name[] = { {"dst-port", O_IP_DSTPORT}, {"src-port", O_IP_SRCPORT}, {"ipid", O_IPID}, {"iplen", O_IPLEN}, {"ipttl", O_IPTTL}, {"mac-type", O_MAC_TYPE}, {NULL, 0} }; /* * Print the values in a list of 16-bit items of the types above. * XXX todo: add support for mask. */ static void print_newports(ipfw_insn_u16 *cmd, int proto, int opcode) { uint16_t *p = cmd->ports; int i; char const *sep; if (cmd->o.len & F_NOT) printf(" not"); if (opcode != 0) { sep = match_value(_port_name, opcode); if (sep == NULL) sep = "???"; printf (" %s", sep); } sep = " "; for (i = F_LEN((ipfw_insn *)cmd) - 1; i > 0; i--, p += 2) { printf(sep); print_port(proto, p[0]); if (p[0] != p[1]) { printf("-"); print_port(proto, p[1]); } sep = ","; } } /* * Like strtol, but also translates service names into port numbers * for some protocols. * In particular: * proto == -1 disables the protocol check; * proto == IPPROTO_ETHERTYPE looks up an internal table * proto == <some value in /etc/protocols> matches the values there. * Returns *end == s in case the parameter is not found. */ static int strtoport(char *s, char **end, int base, int proto) { char *p, *buf; char *s1; int i; *end = s; /* default - not found */ if (*s == '\0') return 0; /* not found */ if (isdigit(*s)) return strtol(s, end, base); /* * find separator. '\\' escapes the next char.
*/ for (s1 = s; *s1 && (isalnum(*s1) || *s1 == '\\') ; s1++) if (*s1 == '\\' && s1[1] != '\0') s1++; buf = malloc(s1 - s + 1); if (buf == NULL) return 0; /* * copy into a buffer skipping backslashes */ for (p = s, i = 0; p != s1 ; p++) if (*p != '\\') buf[i++] = *p; buf[i++] = '\0'; if (proto == IPPROTO_ETHERTYPE) { i = match_token(ether_types, buf); free(buf); if (i != -1) { /* found */ *end = s1; return i; } } else { struct protoent *pe = NULL; struct servent *se; if (proto != 0) pe = getprotobynumber(proto); setservent(1); se = getservbyname(buf, pe ? pe->p_name : NULL); free(buf); if (se != NULL) { *end = s1; return ntohs(se->s_port); } } return 0; /* not found */ } /* * Fill the body of the command with the list of port ranges. */ static int fill_newports(ipfw_insn_u16 *cmd, char *av, int proto) { uint16_t a, b, *p = cmd->ports; int i = 0; char *s = av; while (*s) { a = strtoport(av, &s, 0, proto); if (s == av) /* no parameter */ break; if (*s == '-') { /* a range */ av = s+1; b = strtoport(av, &s, 0, proto); if (s == av) /* no parameter */ break; p[0] = a; p[1] = b; } else if (*s == ',' || *s == '\0' ) p[0] = p[1] = a; else /* invalid separator */ errx(EX_DATAERR, "invalid separator <%c> in <%s>\n", *s, av); i++; p += 2; av = s+1; } if (i > 0) { if (i+1 > F_LEN_MASK) errx(EX_DATAERR, "too many ports/ranges\n"); cmd->o.len |= i+1; /* leave F_NOT and F_OR untouched */ } return i; } static struct _s_x icmpcodes[] = { { "net", ICMP_UNREACH_NET }, { "host", ICMP_UNREACH_HOST }, { "protocol", ICMP_UNREACH_PROTOCOL }, { "port", ICMP_UNREACH_PORT }, { "needfrag", ICMP_UNREACH_NEEDFRAG }, { "srcfail", ICMP_UNREACH_SRCFAIL }, { "net-unknown", ICMP_UNREACH_NET_UNKNOWN }, { "host-unknown", ICMP_UNREACH_HOST_UNKNOWN }, { "isolated", ICMP_UNREACH_ISOLATED }, { "net-prohib", ICMP_UNREACH_NET_PROHIB }, { "host-prohib", ICMP_UNREACH_HOST_PROHIB }, { "tosnet", ICMP_UNREACH_TOSNET }, { "toshost", ICMP_UNREACH_TOSHOST }, { "filter-prohib", ICMP_UNREACH_FILTER_PROHIB }, { 
"host-precedence", ICMP_UNREACH_HOST_PRECEDENCE }, { "precedence-cutoff", ICMP_UNREACH_PRECEDENCE_CUTOFF }, { NULL, 0 } }; static void fill_reject_code(u_short *codep, char *str) { int val; char *s; val = strtoul(str, &s, 0); if (s == str || *s != '\0' || val >= 0x100) val = match_token(icmpcodes, str); if (val < 0) errx(EX_DATAERR, "unknown ICMP unreachable code ``%s''", str); *codep = val; return; } static void print_reject_code(uint16_t code) { char const *s = match_value(icmpcodes, code); if (s != NULL) printf("unreach %s", s); else printf("unreach %u", code); } /* * Returns the number of bits set (from left) in a contiguous bitmask, * or -1 if the mask is not contiguous. * XXX this needs a proper fix. * This effectively works on masks in big-endian (network) format. * when compiled on little endian architectures. * * First bit is bit 7 of the first byte -- note, for MAC addresses, * the first bit on the wire is bit 0 of the first byte. * len is the max length in bits. */ static int contigmask(uint8_t *p, int len) { int i, n; for (i=0; iarg1 & 0xff; uint8_t clear = (cmd->arg1 >> 8) & 0xff; if (list == f_tcpflags && set == TH_SYN && clear == TH_ACK) { printf(" setup"); return; } printf(" %s ", name); for (i=0; list[i].x != 0; i++) { if (set & list[i].x) { set &= ~list[i].x; printf("%s%s", comma, list[i].s); comma = ","; } if (clear & list[i].x) { clear &= ~list[i].x; printf("%s!%s", comma, list[i].s); comma = ","; } } } /* * Print the ip address contained in a command. */ static void print_ip(ipfw_insn_ip *cmd, char const *s) { struct hostent *he = NULL; int len = F_LEN((ipfw_insn *)cmd); uint32_t *a = ((ipfw_insn_u32 *)cmd)->d; printf("%s%s ", cmd->o.len & F_NOT ? 
" not": "", s); if (cmd->o.opcode == O_IP_SRC_ME || cmd->o.opcode == O_IP_DST_ME) { printf("me"); return; } + if (cmd->o.opcode == O_IP_SRC_LOOKUP || + cmd->o.opcode == O_IP_DST_LOOKUP) { + printf("table(%u", ((ipfw_insn *)cmd)->arg1); + if (len == F_INSN_SIZE(ipfw_insn_u32)) + printf(",%u", *a); + printf(")"); + return; + } if (cmd->o.opcode == O_IP_SRC_SET || cmd->o.opcode == O_IP_DST_SET) { uint32_t x, *map = (uint32_t *)&(cmd->mask); int i, j; char comma = '{'; x = cmd->o.arg1 - 1; x = htonl( ~x ); cmd->addr.s_addr = htonl(cmd->addr.s_addr); printf("%s/%d", inet_ntoa(cmd->addr), contigmask((uint8_t *)&x, 32)); x = cmd->addr.s_addr = htonl(cmd->addr.s_addr); x &= 0xff; /* base */ /* * Print bits and ranges. * Locate first bit set (i), then locate first bit unset (j). * If we have 3+ consecutive bits set, then print them as a * range, otherwise only print the initial bit and rescan. */ for (i=0; i < cmd->o.arg1; i++) if (map[i/32] & (1<<(i & 31))) { for (j=i+1; j < cmd->o.arg1; j++) if (!(map[ j/32] & (1<<(j & 31)))) break; printf("%c%d", comma, i+x); if (j>i+2) { /* range has at least 3 elements */ printf("-%d", j-1+x); i = j-1; } comma = ','; } printf("}"); return; } /* * len == 2 indicates a single IP, whereas lists of 1 or more * addr/mask pairs have len = (2n+1). We convert len to n so we * use that to count the number of entries. */ for (len = len / 2; len > 0; len--, a += 2) { int mb = /* mask length */ (cmd->o.opcode == O_IP_SRC || cmd->o.opcode == O_IP_DST) ? 
32 : contigmask((uint8_t *)&(a[1]), 32); if (mb == 32 && do_resolv) he = gethostbyaddr((char *)&(a[0]), sizeof(u_long), AF_INET); if (he != NULL) /* resolved to name */ printf("%s", he->h_name); else if (mb == 0) /* any */ printf("any"); else { /* numeric IP followed by some kind of mask */ printf("%s", inet_ntoa( *((struct in_addr *)&a[0]) ) ); if (mb < 0) printf(":%s", inet_ntoa( *((struct in_addr *)&a[1]) ) ); else if (mb < 32) printf("/%d", mb); } if (len > 1) printf(","); } } /* * prints a MAC address/mask pair */ static void print_mac(uint8_t *addr, uint8_t *mask) { int l = contigmask(mask, 48); if (l == 0) printf(" any"); else { printf(" %02x:%02x:%02x:%02x:%02x:%02x", addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]); if (l == -1) printf("&%02x:%02x:%02x:%02x:%02x:%02x", mask[0], mask[1], mask[2], mask[3], mask[4], mask[5]); else if (l < 48) printf("/%d", l); } } static void fill_icmptypes(ipfw_insn_u32 *cmd, char *av) { uint8_t type; cmd->d[0] = 0; while (*av) { if (*av == ',') av++; type = strtoul(av, &av, 0); if (*av != ',' && *av != '\0') errx(EX_DATAERR, "invalid ICMP type"); if (type > 31) errx(EX_DATAERR, "ICMP type out of range"); cmd->d[0] |= 1 << type; } cmd->o.opcode = O_ICMPTYPE; cmd->o.len |= F_INSN_SIZE(ipfw_insn_u32); } static void print_icmptypes(ipfw_insn_u32 *cmd) { int i; char sep= ' '; printf(" icmptypes"); for (i = 0; i < 32; i++) { if ( (cmd->d[0] & (1 << (i))) == 0) continue; printf("%c%d", sep, i); sep = ','; } } /* * show_ipfw() prints the body of an ipfw rule. * Because the standard rule has at least proto src_ip dst_ip, we use * a helper function to produce these entries if not provided explicitly. * The first argument is the list of fields we have, the second is * the list of fields we want to be printed. 
* * Special cases if we have provided a MAC header: * + if the rule does not contain IP addresses/ports, do not print them; * + if the rule does not contain an IP proto, print "all" instead of "ip"; * * Once we have 'have_options', IP header fields are printed as options. */ #define HAVE_PROTO 0x0001 #define HAVE_SRCIP 0x0002 #define HAVE_DSTIP 0x0004 #define HAVE_MAC 0x0008 #define HAVE_MACTYPE 0x0010 #define HAVE_OPTIONS 0x8000 #define HAVE_IP (HAVE_PROTO | HAVE_SRCIP | HAVE_DSTIP) static void show_prerequisites(int *flags, int want, int cmd) { if ( (*flags & HAVE_IP) == HAVE_IP) *flags |= HAVE_OPTIONS; if ( (*flags & (HAVE_MAC|HAVE_MACTYPE|HAVE_OPTIONS)) == HAVE_MAC && cmd != O_MAC_TYPE) { /* * mac-type was optimized out by the compiler, * restore it */ printf(" any"); *flags |= HAVE_MACTYPE | HAVE_OPTIONS; return; } if ( !(*flags & HAVE_OPTIONS)) { if ( !(*flags & HAVE_PROTO) && (want & HAVE_PROTO)) printf(" ip"); if ( !(*flags & HAVE_SRCIP) && (want & HAVE_SRCIP)) printf(" from any"); if ( !(*flags & HAVE_DSTIP) && (want & HAVE_DSTIP)) printf(" to any"); } *flags |= want; } static void show_ipfw(struct ip_fw *rule, int pcwidth, int bcwidth) { static int twidth = 0; int l; ipfw_insn *cmd; char *comment = NULL; /* ptr to comment if we have one */ int proto = 0; /* default */ int flags = 0; /* prerequisites */ ipfw_insn_log *logptr = NULL; /* set if we find an O_LOG */ int or_block = 0; /* we are in an or block */ uint32_t set_disable; bcopy(&rule->next_rule, &set_disable, sizeof(set_disable)); if (set_disable & (1 << rule->set)) { /* disabled */ if (!show_sets) return; else printf("# DISABLED "); } printf("%05u ", rule->rulenum); if (pcwidth>0 || bcwidth>0) printf("%*llu %*llu ", pcwidth, align_uint64(&rule->pcnt), bcwidth, align_uint64(&rule->bcnt)); if (do_time == 2) printf("%10u ", rule->timestamp); else if (do_time == 1) { char timestr[30]; time_t t = (time_t)0; if (twidth == 0) { strcpy(timestr, ctime(&t)); *strchr(timestr, '\n') = '\0'; twidth = 
strlen(timestr); } if (rule->timestamp) { #if _FreeBSD_version < 500000 /* XXX check */ #define _long_to_time(x) (time_t)(x) #endif t = _long_to_time(rule->timestamp); strcpy(timestr, ctime(&t)); *strchr(timestr, '\n') = '\0'; printf("%s ", timestr); } else { printf("%*s", twidth, " "); } } if (show_sets) printf("set %d ", rule->set); /* * print the optional "match probability" */ if (rule->cmd_len > 0) { cmd = rule->cmd ; if (cmd->opcode == O_PROB) { ipfw_insn_u32 *p = (ipfw_insn_u32 *)cmd; double d = 1.0 * p->d[0]; d = (d / 0x7fffffff); printf("prob %f ", d); } } /* * first print actions */ for (l = rule->cmd_len - rule->act_ofs, cmd = ACTION_PTR(rule); l > 0 ; l -= F_LEN(cmd), cmd += F_LEN(cmd)) { switch(cmd->opcode) { case O_CHECK_STATE: printf("check-state"); flags = HAVE_IP; /* avoid printing anything else */ break; case O_ACCEPT: printf("allow"); break; case O_COUNT: printf("count"); break; case O_DENY: printf("deny"); break; case O_REJECT: if (cmd->arg1 == ICMP_REJECT_RST) printf("reset"); else if (cmd->arg1 == ICMP_UNREACH_HOST) printf("reject"); else print_reject_code(cmd->arg1); break; case O_SKIPTO: printf("skipto %u", cmd->arg1); break; case O_PIPE: printf("pipe %u", cmd->arg1); break; case O_QUEUE: printf("queue %u", cmd->arg1); break; case O_DIVERT: printf("divert %u", cmd->arg1); break; case O_TEE: printf("tee %u", cmd->arg1); break; case O_FORWARD_IP: { ipfw_insn_sa *s = (ipfw_insn_sa *)cmd; printf("fwd %s", inet_ntoa(s->sa.sin_addr)); if (s->sa.sin_port) printf(",%d", s->sa.sin_port); } break; case O_LOG: /* O_LOG is printed last */ logptr = (ipfw_insn_log *)cmd; break; default: printf("** unrecognized action %d len %d", cmd->opcode, cmd->len); } } if (logptr) { if (logptr->max_log > 0) printf(" log logamount %d", logptr->max_log); else printf(" log"); } /* * then print the body. 
*/ if (rule->_pad & 1) { /* empty rules before options */ if (!do_compact) printf(" ip from any to any"); flags |= HAVE_IP | HAVE_OPTIONS; } for (l = rule->act_ofs, cmd = rule->cmd ; l > 0 ; l -= F_LEN(cmd) , cmd += F_LEN(cmd)) { /* useful alias */ ipfw_insn_u32 *cmd32 = (ipfw_insn_u32 *)cmd; show_prerequisites(&flags, 0, cmd->opcode); switch(cmd->opcode) { case O_PROB: break; /* done already */ case O_PROBE_STATE: break; /* no need to print anything here */ case O_MACADDR2: { ipfw_insn_mac *m = (ipfw_insn_mac *)cmd; if ((cmd->len & F_OR) && !or_block) printf(" {"); if (cmd->len & F_NOT) printf(" not"); printf(" MAC"); flags |= HAVE_MAC; print_mac(m->addr, m->mask); print_mac(m->addr + 6, m->mask + 6); } break; case O_MAC_TYPE: if ((cmd->len & F_OR) && !or_block) printf(" {"); print_newports((ipfw_insn_u16 *)cmd, IPPROTO_ETHERTYPE, (flags & HAVE_OPTIONS) ? cmd->opcode : 0); flags |= HAVE_MAC | HAVE_MACTYPE | HAVE_OPTIONS; break; case O_IP_SRC: + case O_IP_SRC_LOOKUP: case O_IP_SRC_MASK: case O_IP_SRC_ME: case O_IP_SRC_SET: show_prerequisites(&flags, HAVE_PROTO, 0); if (!(flags & HAVE_SRCIP)) printf(" from"); if ((cmd->len & F_OR) && !or_block) printf(" {"); print_ip((ipfw_insn_ip *)cmd, (flags & HAVE_OPTIONS) ? " src-ip" : ""); flags |= HAVE_SRCIP; break; case O_IP_DST: + case O_IP_DST_LOOKUP: case O_IP_DST_MASK: case O_IP_DST_ME: case O_IP_DST_SET: show_prerequisites(&flags, HAVE_PROTO|HAVE_SRCIP, 0); if (!(flags & HAVE_DSTIP)) printf(" to"); if ((cmd->len & F_OR) && !or_block) printf(" {"); print_ip((ipfw_insn_ip *)cmd, (flags & HAVE_OPTIONS) ? " dst-ip" : ""); flags |= HAVE_DSTIP; break; case O_IP_DSTPORT: show_prerequisites(&flags, HAVE_IP, 0); case O_IP_SRCPORT: show_prerequisites(&flags, HAVE_PROTO|HAVE_SRCIP, 0); if ((cmd->len & F_OR) && !or_block) printf(" {"); print_newports((ipfw_insn_u16 *)cmd, proto, (flags & HAVE_OPTIONS) ? 
cmd->opcode : 0); break; case O_PROTO: { struct protoent *pe; if ((cmd->len & F_OR) && !or_block) printf(" {"); if (cmd->len & F_NOT) printf(" not"); proto = cmd->arg1; pe = getprotobynumber(cmd->arg1); if (flags & HAVE_OPTIONS) printf(" proto"); if (pe) printf(" %s", pe->p_name); else printf(" %u", cmd->arg1); } flags |= HAVE_PROTO; break; default: /*options ... */ show_prerequisites(&flags, HAVE_IP | HAVE_OPTIONS, 0); if ((cmd->len & F_OR) && !or_block) printf(" {"); if (cmd->len & F_NOT && cmd->opcode != O_IN) printf(" not"); switch(cmd->opcode) { case O_FRAG: printf(" frag"); break; case O_IN: printf(cmd->len & F_NOT ? " out" : " in"); break; case O_LAYER2: printf(" layer2"); break; case O_XMIT: case O_RECV: case O_VIA: { char const *s; ipfw_insn_if *cmdif = (ipfw_insn_if *)cmd; if (cmd->opcode == O_XMIT) s = "xmit"; else if (cmd->opcode == O_RECV) s = "recv"; else /* if (cmd->opcode == O_VIA) */ s = "via"; if (cmdif->name[0] == '\0') printf(" %s %s", s, inet_ntoa(cmdif->p.ip)); else if (cmdif->p.unit == -1) printf(" %s %s*", s, cmdif->name); else printf(" %s %s%d", s, cmdif->name, cmdif->p.unit); } break; case O_IPID: if (F_LEN(cmd) == 1) printf(" ipid %u", cmd->arg1 ); else print_newports((ipfw_insn_u16 *)cmd, 0, O_IPID); break; case O_IPTTL: if (F_LEN(cmd) == 1) printf(" ipttl %u", cmd->arg1 ); else print_newports((ipfw_insn_u16 *)cmd, 0, O_IPTTL); break; case O_IPVER: printf(" ipver %u", cmd->arg1 ); break; case O_IPPRECEDENCE: printf(" ipprecedence %u", (cmd->arg1) >> 5 ); break; case O_IPLEN: if (F_LEN(cmd) == 1) printf(" iplen %u", cmd->arg1 ); else print_newports((ipfw_insn_u16 *)cmd, 0, O_IPLEN); break; case O_IPOPT: print_flags("ipoptions", cmd, f_ipopts); break; case O_IPTOS: print_flags("iptos", cmd, f_iptos); break; case O_ICMPTYPE: print_icmptypes((ipfw_insn_u32 *)cmd); break; case O_ESTAB: printf(" established"); break; case O_TCPFLAGS: print_flags("tcpflags", cmd, f_tcpflags); break; case O_TCPOPTS: print_flags("tcpoptions", cmd, f_tcpopts); 
break; case O_TCPWIN: printf(" tcpwin %d", ntohs(cmd->arg1)); break; case O_TCPACK: printf(" tcpack %ld", ntohl(cmd32->d[0])); break; case O_TCPSEQ: printf(" tcpseq %ld", ntohl(cmd32->d[0])); break; case O_UID: { struct passwd *pwd = getpwuid(cmd32->d[0]); if (pwd) printf(" uid %s", pwd->pw_name); else printf(" uid %u", cmd32->d[0]); } break; case O_GID: { struct group *grp = getgrgid(cmd32->d[0]); if (grp) printf(" gid %s", grp->gr_name); else printf(" gid %u", cmd32->d[0]); } break; case O_VERREVPATH: printf(" verrevpath"); break; case O_IPSEC: printf(" ipsec"); break; case O_NOP: comment = (char *)(cmd + 1); break; case O_KEEP_STATE: printf(" keep-state"); break; case O_LIMIT: { struct _s_x *p = limit_masks; ipfw_insn_limit *c = (ipfw_insn_limit *)cmd; uint8_t x = c->limit_mask; char const *comma = " "; printf(" limit"); for (; p->x != 0 ; p++) if ((x & p->x) == p->x) { x &= ~p->x; printf("%s%s", comma, p->s); comma = ","; } printf(" %d", c->conn_limit); } break; default: printf(" [opcode %d len %d]", cmd->opcode, cmd->len); } } if (cmd->len & F_OR) { printf(" or"); or_block = 1; } else if (or_block) { printf(" }"); or_block = 0; } } show_prerequisites(&flags, HAVE_IP, 0); if (comment) printf(" // %s", comment); printf("\n"); } static void show_dyn_ipfw(ipfw_dyn_rule *d, int pcwidth, int bcwidth) { struct protoent *pe; struct in_addr a; uint16_t rulenum; if (!do_expired) { if (!d->expire && !(d->dyn_type == O_LIMIT_PARENT)) return; } bcopy(&d->rule, &rulenum, sizeof(rulenum)); printf("%05d", rulenum); if (pcwidth>0 || bcwidth>0) printf(" %*llu %*llu (%ds)", pcwidth, align_uint64(&d->pcnt), bcwidth, align_uint64(&d->bcnt), d->expire); switch (d->dyn_type) { case O_LIMIT_PARENT: printf(" PARENT %d", d->count); break; case O_LIMIT: printf(" LIMIT"); break; case O_KEEP_STATE: /* bidir, no mask */ printf(" STATE"); break; } if ((pe = getprotobynumber(d->id.proto)) != NULL) printf(" %s", pe->p_name); else printf(" proto %u", d->id.proto); a.s_addr = 
htonl(d->id.src_ip); printf(" %s %d", inet_ntoa(a), d->id.src_port); a.s_addr = htonl(d->id.dst_ip); printf(" <-> %s %d", inet_ntoa(a), d->id.dst_port); printf("\n"); } static int sort_q(const void *pa, const void *pb) { int rev = (do_sort < 0); int field = rev ? -do_sort : do_sort; long long res = 0; const struct dn_flow_queue *a = pa; const struct dn_flow_queue *b = pb; switch (field) { case 1: /* pkts */ res = a->len - b->len; break; case 2: /* bytes */ res = a->len_bytes - b->len_bytes; break; case 3: /* tot pkts */ res = a->tot_pkts - b->tot_pkts; break; case 4: /* tot bytes */ res = a->tot_bytes - b->tot_bytes; break; } if (res < 0) res = -1; if (res > 0) res = 1; return (int)(rev ? res : -res); } static void list_queues(struct dn_flow_set *fs, struct dn_flow_queue *q) { int l; printf(" mask: 0x%02x 0x%08x/0x%04x -> 0x%08x/0x%04x\n", fs->flow_mask.proto, fs->flow_mask.src_ip, fs->flow_mask.src_port, fs->flow_mask.dst_ip, fs->flow_mask.dst_port); if (fs->rq_elements == 0) return; printf("BKT Prot ___Source IP/port____ " "____Dest. 
IP/port____ Tot_pkt/bytes Pkt/Byte Drp\n"); if (do_sort != 0) heapsort(q, fs->rq_elements, sizeof *q, sort_q); for (l = 0; l < fs->rq_elements; l++) { struct in_addr ina; struct protoent *pe; ina.s_addr = htonl(q[l].id.src_ip); printf("%3d ", q[l].hash_slot); pe = getprotobynumber(q[l].id.proto); if (pe) printf("%-4s ", pe->p_name); else printf("%4u ", q[l].id.proto); printf("%15s/%-5d ", inet_ntoa(ina), q[l].id.src_port); ina.s_addr = htonl(q[l].id.dst_ip); printf("%15s/%-5d ", inet_ntoa(ina), q[l].id.dst_port); printf("%4qu %8qu %2u %4u %3u\n", q[l].tot_pkts, q[l].tot_bytes, q[l].len, q[l].len_bytes, q[l].drops); if (verbose) printf(" S %20qd F %20qd\n", q[l].S, q[l].F); } } static void print_flowset_parms(struct dn_flow_set *fs, char *prefix) { int l; char qs[30]; char plr[30]; char red[90]; /* Display RED parameters */ l = fs->qsize; if (fs->flags_fs & DN_QSIZE_IS_BYTES) { if (l >= 8192) sprintf(qs, "%d KB", l / 1024); else sprintf(qs, "%d B", l); } else sprintf(qs, "%3d sl.", l); if (fs->plr) sprintf(plr, "plr %f", 1.0 * fs->plr / (double)(0x7fffffff)); else plr[0] = '\0'; if (fs->flags_fs & DN_IS_RED) /* RED parameters */ sprintf(red, "\n\t %cRED w_q %f min_th %d max_th %d max_p %f", (fs->flags_fs & DN_IS_GENTLE_RED) ? 
'G' : ' ', 1.0 * fs->w_q / (double)(1 << SCALE_RED), SCALE_VAL(fs->min_th), SCALE_VAL(fs->max_th), 1.0 * fs->max_p / (double)(1 << SCALE_RED)); else sprintf(red, "droptail"); printf("%s %s%s %d queues (%d buckets) %s\n", prefix, qs, plr, fs->rq_elements, fs->rq_size, red); } static void list_pipes(void *data, uint nbytes, int ac, char *av[]) { int rulenum; void *next = data; struct dn_pipe *p = (struct dn_pipe *) data; struct dn_flow_set *fs; struct dn_flow_queue *q; int l; if (ac > 0) rulenum = strtoul(*av++, NULL, 10); else rulenum = 0; for (; nbytes >= sizeof *p; p = (struct dn_pipe *)next) { double b = p->bandwidth; char buf[30]; char prefix[80]; if (p->next != (struct dn_pipe *)DN_IS_PIPE) break; /* done with pipes, now queues */ /* * compute length, as pipes have variable size */ l = sizeof(*p) + p->fs.rq_elements * sizeof(*q); next = (char *)p + l; nbytes -= l; if (rulenum != 0 && rulenum != p->pipe_nr) continue; /* * Print rate (or clocking interface) */ if (p->if_name[0] != '\0') sprintf(buf, "%s", p->if_name); else if (b == 0) sprintf(buf, "unlimited"); else if (b >= 1000000) sprintf(buf, "%7.3f Mbit/s", b/1000000); else if (b >= 1000) sprintf(buf, "%7.3f Kbit/s", b/1000); else sprintf(buf, "%7.3f bit/s ", b); sprintf(prefix, "%05d: %s %4d ms ", p->pipe_nr, buf, p->delay); print_flowset_parms(&(p->fs), prefix); if (verbose) printf(" V %20qd\n", p->V >> MY_M); q = (struct dn_flow_queue *)(p+1); list_queues(&(p->fs), q); } for (fs = next; nbytes >= sizeof *fs; fs = next) { char prefix[80]; if (fs->next != (struct dn_flow_set *)DN_IS_QUEUE) break; l = sizeof(*fs) + fs->rq_elements * sizeof(*q); next = (char *)fs + l; nbytes -= l; q = (struct dn_flow_queue *)(fs+1); sprintf(prefix, "q%05d: weight %d pipe %d ", fs->fs_nr, fs->weight, fs->parent_nr); print_flowset_parms(fs, prefix); list_queues(fs, q); } } /* * This one handles all set-related commands * ipfw set { show | enable | disable } * ipfw set swap X Y * ipfw set move X to Y * ipfw set move rule X to Y
*/ static void sets_handler(int ac, char *av[]) { uint32_t set_disable, masks[2]; int i, nbytes; uint16_t rulenum; uint8_t cmd, new_set; ac--; av++; if (!ac) errx(EX_USAGE, "set needs command"); if (!strncmp(*av, "show", strlen(*av)) ) { void *data; char const *msg; nbytes = sizeof(struct ip_fw); if ((data = calloc(1, nbytes)) == NULL) err(EX_OSERR, "calloc"); if (do_cmd(IP_FW_GET, data, (uintptr_t)&nbytes) < 0) err(EX_OSERR, "getsockopt(IP_FW_GET)"); bcopy(&((struct ip_fw *)data)->next_rule, &set_disable, sizeof(set_disable)); for (i = 0, msg = "disable" ; i < RESVD_SET; i++) if ((set_disable & (1< RESVD_SET) errx(EX_DATAERR, "invalid set number %s\n", av[0]); if (!isdigit(*(av[1])) || new_set > RESVD_SET) errx(EX_DATAERR, "invalid set number %s\n", av[1]); masks[0] = (4 << 24) | (new_set << 16) | (rulenum); i = do_cmd(IP_FW_DEL, masks, sizeof(uint32_t)); } else if (!strncmp(*av, "move", strlen(*av))) { ac--; av++; if (ac && !strncmp(*av, "rule", strlen(*av))) { cmd = 2; ac--; av++; } else cmd = 3; if (ac != 3 || strncmp(av[1], "to", strlen(*av))) errx(EX_USAGE, "syntax: set move [rule] X to Y\n"); rulenum = atoi(av[0]); new_set = atoi(av[2]); if (!isdigit(*(av[0])) || (cmd == 3 && rulenum > RESVD_SET) || (cmd == 2 && rulenum == 65535) ) errx(EX_DATAERR, "invalid source number %s\n", av[0]); if (!isdigit(*(av[2])) || new_set > RESVD_SET) errx(EX_DATAERR, "invalid dest. set %s\n", av[1]); masks[0] = (cmd << 24) | (new_set << 16) | (rulenum); i = do_cmd(IP_FW_DEL, masks, sizeof(uint32_t)); } else if (!strncmp(*av, "disable", strlen(*av)) || !strncmp(*av, "enable", strlen(*av)) ) { int which = !strncmp(*av, "enable", strlen(*av)) ? 
1 : 0; ac--; av++; masks[0] = masks[1] = 0; while (ac) { if (isdigit(**av)) { i = atoi(*av); if (i < 0 || i > RESVD_SET) errx(EX_DATAERR, "invalid set number %d\n", i); masks[which] |= (1<= nalloc) { nalloc = nalloc * 2 + 200; nbytes = nalloc; if ((data = realloc(data, nbytes)) == NULL) err(EX_OSERR, "realloc"); if (do_cmd(ocmd, data, (uintptr_t)&nbytes) < 0) err(EX_OSERR, "getsockopt(IP_%s_GET)", do_pipe ? "DUMMYNET" : "FW"); } if (do_pipe) { list_pipes(data, nbytes, ac, av); goto done; } /* * Count static rules. They have variable size so we * need to scan the list to count them. */ for (nstat = 1, r = data, lim = (char *)data + nbytes; r->rulenum < 65535 && (char *)r < lim; ++nstat, r = NEXT(r) ) ; /* nothing */ /* * Count dynamic rules. This is easier as they have * fixed size. */ r = NEXT(r); dynrules = (ipfw_dyn_rule *)r ; n = (char *)r - (char *)data; ndyn = (nbytes - n) / sizeof *dynrules; /* if showing stats, figure out column widths ahead of time */ bcwidth = pcwidth = 0; if (show_counters) { for (n = 0, r = data; n < nstat; n++, r = NEXT(r)) { /* packet counter */ width = snprintf(NULL, 0, "%llu", align_uint64(&r->pcnt)); if (width > pcwidth) pcwidth = width; /* byte counter */ width = snprintf(NULL, 0, "%llu", align_uint64(&r->bcnt)); if (width > bcwidth) bcwidth = width; } } if (do_dynamic && ndyn) { for (n = 0, d = dynrules; n < ndyn; n++, d++) { width = snprintf(NULL, 0, "%llu", align_uint64(&d->pcnt)); if (width > pcwidth) pcwidth = width; width = snprintf(NULL, 0, "%llu", align_uint64(&d->bcnt)); if (width > bcwidth) bcwidth = width; } } /* if no rule numbers were specified, list all rules */ if (ac == 0) { for (n = 0, r = data; n < nstat; n++, r = NEXT(r) ) show_ipfw(r, pcwidth, bcwidth); if (do_dynamic && ndyn) { printf("## Dynamic rules (%d):\n", ndyn); for (n = 0, d = dynrules; n < ndyn; n++, d++) show_dyn_ipfw(d, pcwidth, bcwidth); } goto done; } /* display specific rules requested on command line */ for (lac = ac, lav = av; lac != 0; lac--) { 
/* convert command line rule # */ last = rnum = strtoul(*lav++, &endptr, 10); if (*endptr == '-') last = strtoul(endptr+1, &endptr, 10); if (*endptr) { exitval = EX_USAGE; warnx("invalid rule number: %s", *(lav - 1)); continue; } for (n = seen = 0, r = data; n < nstat; n++, r = NEXT(r) ) { if (r->rulenum > last) break; if (r->rulenum >= rnum && r->rulenum <= last) { show_ipfw(r, pcwidth, bcwidth); seen = 1; } } if (!seen) { /* give precedence to other error(s) */ if (exitval == EX_OK) exitval = EX_UNAVAILABLE; warnx("rule %lu does not exist", rnum); } } if (do_dynamic && ndyn) { printf("## Dynamic rules:\n"); for (lac = ac, lav = av; lac != 0; lac--) { rnum = strtoul(*lav++, &endptr, 10); if (*endptr == '-') last = strtoul(endptr+1, &endptr, 10); if (*endptr) /* already warned */ continue; for (n = 0, d = dynrules; n < ndyn; n++, d++) { uint16_t rulenum; bcopy(&d->rule, &rulenum, sizeof(rulenum)); if (rulenum > rnum) break; if (r->rulenum >= rnum && r->rulenum <= last) show_dyn_ipfw(d, pcwidth, bcwidth); } } } ac = 0; done: free(data); if (exitval != EX_OK) exit(exitval); #undef NEXT } static void show_usage(void) { fprintf(stderr, "usage: ipfw [options]\n" "do \"ipfw -h\" or see ipfw manpage for details\n" ); exit(EX_USAGE); } static void help(void) { fprintf(stderr, "ipfw syntax summary (but please do read the ipfw(8) manpage):\n" "ipfw [-acdeftTnNpqS] where is one of:\n" "add [num] [set N] [prob x] RULE-BODY\n" "{pipe|queue} N config PIPE-BODY\n" "[pipe|queue] {zero|delete|show} [N{,N}]\n" "set [disable N... enable N...] 
| move [rule] X to Y | swap X Y | show\n" +"table N {add ip[/bits] [value] | delete ip[/bits] | flush | list}\n" "\n" "RULE-BODY: check-state [LOG] | ACTION [LOG] ADDR [OPTION_LIST]\n" "ACTION: check-state | allow | count | deny | reject | skipto N |\n" " {divert|tee} PORT | forward ADDR | pipe N | queue N\n" "ADDR: [ MAC dst src ether_type ] \n" " [ from IPADDR [ PORT ] to IPADDR [ PORTLIST ] ]\n" -"IPADDR: [not] { any | me | ip/bits{x,y,z} | IPLIST }\n" +"IPADDR: [not] { any | me | ip/bits{x,y,z} | table(t[,v]) | IPLIST }\n" "IPLIST: { ip | ip/bits | ip:mask }[,IPLIST]\n" "OPTION_LIST: OPTION [OPTION_LIST]\n" "OPTION: bridged | {dst-ip|src-ip} ADDR | {dst-port|src-port} LIST |\n" " estab | frag | {gid|uid} N | icmptypes LIST | in | out | ipid LIST |\n" " iplen LIST | ipoptions SPEC | ipprecedence | ipsec | iptos SPEC |\n" " ipttl LIST | ipversion VER | keep-state | layer2 | limit ... |\n" " mac ... | mac-type LIST | proto LIST | {recv|xmit|via} {IF|IPADDR} |\n" " setup | {tcpack|tcpseq|tcpwin} NN | tcpflags SPEC | tcpoptions SPEC |\n" " verrevpath\n" ); exit(0); } static int lookup_host (char *host, struct in_addr *ipaddr) { struct hostent *he; if (!inet_aton(host, ipaddr)) { if ((he = gethostbyname(host)) == NULL) return(-1); *ipaddr = *(struct in_addr *)he->h_addr_list[0]; } return(0); } /* * fills the addr and mask fields in the instruction as appropriate from av. * Update length as appropriate. * The following formats are allowed: * any matches any IP. Actually returns an empty instruction. * me returns O_IP_*_ME * 1.2.3.4 single IP address * 1.2.3.4:5.6.7.8 address:mask * 1.2.3.4/24 address/mask * 1.2.3.4/26{1,6,5,4,23} set of addresses in a subnet * We can have multiple comma-separated address/mask entries. 
*/ static void fill_ip(ipfw_insn_ip *cmd, char *av) { int len = 0; uint32_t *d = ((ipfw_insn_u32 *)cmd)->d; cmd->o.len &= ~F_LEN_MASK; /* zero len */ if (!strncmp(av, "any", strlen(av))) return; if (!strncmp(av, "me", strlen(av))) { cmd->o.len |= F_INSN_SIZE(ipfw_insn); return; } + if (!strncmp(av, "table(", 6)) { + char *p = strchr(av + 6, ','); + + if (p) + *p++ = '\0'; + cmd->o.opcode = O_IP_DST_LOOKUP; + cmd->o.arg1 = strtoul(av + 6, NULL, 0); + if (p) { + cmd->o.len |= F_INSN_SIZE(ipfw_insn_u32); + d[0] = strtoul(p, NULL, 0); + } else + cmd->o.len |= F_INSN_SIZE(ipfw_insn); + return; + } + while (av) { /* * After the address we can have '/' or ':' indicating a mask, * ',' indicating another address follows, '{' indicating a * set of addresses of unspecified size. */ char *p = strpbrk(av, "/:,{"); int masklen; char md; if (p) { md = *p; *p++ = '\0'; } else md = '\0'; if (lookup_host(av, (struct in_addr *)&d[0]) != 0) errx(EX_NOHOST, "hostname ``%s'' unknown", av); switch (md) { case ':': if (!inet_aton(p, (struct in_addr *)&d[1])) errx(EX_DATAERR, "bad netmask ``%s''", p); break; case '/': masklen = atoi(p); if (masklen == 0) d[1] = htonl(0); /* mask */ else if (masklen > 32) errx(EX_DATAERR, "bad width ``%s''", p); else d[1] = htonl(~0 << (32 - masklen)); break; case '{': /* no mask, assume /24 and put back the '{' */ d[1] = htonl(~0 << (32 - 24)); *(--p) = md; break; case ',': /* single address plus continuation */ *(--p) = md; /* FALLTHROUGH */ case 0: /* initialization value */ default: d[1] = htonl(~0); /* force /32 */ break; } d[0] &= d[1]; /* mask base address with mask */ /* find next separator */ if (p) p = strpbrk(p, ",{"); if (p && *p == '{') { /* * We have a set of addresses. They are stored as follows: * arg1 is the set size (powers of 2, 2..256) * addr is the base address IN HOST FORMAT * mask.. is an array of arg1 bits (rounded up to * the next multiple of 32) with bits set * for each host in the map. 
*/ uint32_t *map = (uint32_t *)&cmd->mask; int low, high; int i = contigmask((uint8_t *)&(d[1]), 32); if (len > 0) errx(EX_DATAERR, "address set cannot be in a list"); if (i < 24 || i > 31) errx(EX_DATAERR, "invalid set with mask %d\n", i); cmd->o.arg1 = 1<<(32-i); /* map length */ d[0] = ntohl(d[0]); /* base addr in host format */ cmd->o.opcode = O_IP_DST_SET; /* default */ cmd->o.len |= F_INSN_SIZE(ipfw_insn_u32) + (cmd->o.arg1+31)/32; for (i = 0; i < (cmd->o.arg1+31)/32 ; i++) map[i] = 0; /* clear map */ av = p + 1; low = d[0] & 0xff; high = low + cmd->o.arg1 - 1; /* * Here, i stores the previous value when we specify a range * of addresses within a mask, e.g. 45-63. i = -1 means we * have no previous value. */ i = -1; /* previous value in a range */ while (isdigit(*av)) { char *s; int a = strtol(av, &s, 0); if (s == av) { /* no parameter */ if (*av != '}') errx(EX_DATAERR, "set not closed\n"); if (i != -1) errx(EX_DATAERR, "incomplete range %d-", i); break; } if (a < low || a > high) errx(EX_DATAERR, "addr %d out of range [%d-%d]\n", a, low, high); a -= low; if (i == -1) /* no previous in range */ i = a; else { /* check that range is valid */ if (i > a) errx(EX_DATAERR, "invalid range %d-%d", i+low, a+low); if (*s == '-') errx(EX_DATAERR, "double '-' in range"); } for (; i <= a; i++) map[i/32] |= 1<<(i & 31); i = -1; if (*s == '-') i = a; else if (*s == '}') break; av = s+1; } return; } av = p; if (av) /* then *av must be a ',' */ av++; /* Check this entry */ if (d[1] == 0) { /* "any", specified as x.x.x.x/0 */ /* * 'any' turns the entire list into a NOP. * 'not any' never matches, so it is removed from the * list unless it is the only item, in which case we * report an error. 
*/ if (cmd->o.len & F_NOT) { /* "not any" never matches */ if (av == NULL && len == 0) /* only this entry */ errx(EX_DATAERR, "not any never matches"); } /* else do nothing and return */ return; } /* A single IP can be stored in an optimized format */ if (d[1] == IP_MASK_ALL && av == NULL && len == 0) { cmd->o.len |= F_INSN_SIZE(ipfw_insn_u32); return; } len += 2; /* two words... */ d += 2; } /* end while */ cmd->o.len |= len+1; } /* * helper function to process a set of flags and set bits in the * appropriate masks. */ static void fill_flags(ipfw_insn *cmd, enum ipfw_opcodes opcode, struct _s_x *flags, char *p) { uint8_t set=0, clear=0; while (p && *p) { char *q; /* points to the separator */ int val; uint8_t *which; /* mask we are working on */ if (*p == '!') { p++; which = &clear; } else which = &set; q = strchr(p, ','); if (q) *q++ = '\0'; val = match_token(flags, p); if (val <= 0) errx(EX_DATAERR, "invalid flag %s", p); *which |= (uint8_t)val; p = q; } cmd->opcode = opcode; cmd->len = (cmd->len & (F_NOT | F_OR)) | 1; cmd->arg1 = (set & 0xff) | ( (clear & 0xff) << 8); } static void delete(int ac, char *av[]) { uint32_t rulenum; struct dn_pipe p; int i; int exitval = EX_OK; int do_set = 0; memset(&p, 0, sizeof p); av++; ac--; if (ac > 0 && !strncmp(*av, "set", strlen(*av))) { do_set = 1; /* delete set */ ac--; av++; } /* Rule number */ while (ac && isdigit(**av)) { i = atoi(*av); av++; ac--; if (do_pipe) { if (do_pipe == 1) p.pipe_nr = i; else p.fs.fs_nr = i; i = do_cmd(IP_DUMMYNET_DEL, &p, sizeof p); if (i) { exitval = 1; warn("rule %u: setsockopt(IP_DUMMYNET_DEL)", do_pipe == 1 ? p.pipe_nr : p.fs.fs_nr); } } else { rulenum = (i & 0xffff) | (do_set << 24); i = do_cmd(IP_FW_DEL, &rulenum, sizeof rulenum); if (i) { exitval = EX_UNAVAILABLE; warn("rule %u: setsockopt(IP_FW_DEL)", rulenum); } } } if (exitval != EX_OK) exit(exitval); } /* * fill the interface structure. 
We do not check the name as we can * create interfaces dynamically, so checking them at insert time * makes relatively little sense. * A '*' following the name means any unit. */ static void fill_iface(ipfw_insn_if *cmd, char *arg) { cmd->name[0] = '\0'; cmd->o.len |= F_INSN_SIZE(ipfw_insn_if); /* Parse the interface or address */ if (!strcmp(arg, "any")) cmd->o.len = 0; /* effectively ignore this command */ else if (!isdigit(*arg)) { char *q; strncpy(cmd->name, arg, sizeof(cmd->name)); cmd->name[sizeof(cmd->name) - 1] = '\0'; /* find first digit or wildcard */ for (q = cmd->name; *q && !isdigit(*q) && *q != '*'; q++) continue; cmd->p.unit = (*q == '*') ? -1 : atoi(q); *q = '\0'; } else if (!inet_aton(arg, &cmd->p.ip)) errx(EX_DATAERR, "bad ip address ``%s''", arg); } /* * the following macro returns an error message if we run out of * arguments. */ #define NEED1(msg) {if (!ac) errx(EX_USAGE, msg);} static void config_pipe(int ac, char **av) { struct dn_pipe p; int i; char *end; uint32_t a; void *par = NULL; memset(&p, 0, sizeof p); av++; ac--; /* Pipe number */ if (ac && isdigit(**av)) { i = atoi(*av); av++; ac--; if (do_pipe == 1) p.pipe_nr = i; else p.fs.fs_nr = i; } while (ac > 0) { double d; int tok = match_token(dummynet_params, *av); ac--; av++; switch(tok) { case TOK_NOERROR: p.fs.flags_fs |= DN_NOERROR; break; case TOK_PLR: NEED1("plr needs argument 0..1\n"); d = strtod(av[0], NULL); if (d > 1) d = 1; else if (d < 0) d = 0; p.fs.plr = (int)(d*0x7fffffff); ac--; av++; break; case TOK_QUEUE: NEED1("queue needs queue size\n"); end = NULL; p.fs.qsize = strtoul(av[0], &end, 0); if (*end == 'K' || *end == 'k') { p.fs.flags_fs |= DN_QSIZE_IS_BYTES; p.fs.qsize *= 1024; } else if (*end == 'B' || !strncmp(end, "by", 2)) { p.fs.flags_fs |= DN_QSIZE_IS_BYTES; } ac--; av++; break; case TOK_BUCKETS: NEED1("buckets needs argument\n"); p.fs.rq_size = strtoul(av[0], NULL, 0); ac--; av++; break; case TOK_MASK: NEED1("mask needs mask specifier\n"); /* * per-flow queue, mask 
is dst_ip, dst_port, * src_ip, src_port, proto measured in bits */ par = NULL; p.fs.flow_mask.dst_ip = 0; p.fs.flow_mask.src_ip = 0; p.fs.flow_mask.dst_port = 0; p.fs.flow_mask.src_port = 0; p.fs.flow_mask.proto = 0; end = NULL; while (ac >= 1) { uint32_t *p32 = NULL; uint16_t *p16 = NULL; tok = match_token(dummynet_params, *av); ac--; av++; switch(tok) { case TOK_ALL: /* * special case, all bits significant */ p.fs.flow_mask.dst_ip = ~0; p.fs.flow_mask.src_ip = ~0; p.fs.flow_mask.dst_port = ~0; p.fs.flow_mask.src_port = ~0; p.fs.flow_mask.proto = ~0; p.fs.flags_fs |= DN_HAVE_FLOW_MASK; goto end_mask; case TOK_DSTIP: p32 = &p.fs.flow_mask.dst_ip; break; case TOK_SRCIP: p32 = &p.fs.flow_mask.src_ip; break; case TOK_DSTPORT: p16 = &p.fs.flow_mask.dst_port; break; case TOK_SRCPORT: p16 = &p.fs.flow_mask.src_port; break; case TOK_PROTO: break; default: ac++; av--; /* backtrack */ goto end_mask; } if (ac < 1) errx(EX_USAGE, "mask: value missing"); if (*av[0] == '/') { a = strtoul(av[0]+1, &end, 0); a = (a == 32) ? 
~0 : (1 << a) - 1; } else a = strtoul(av[0], &end, 0); if (p32 != NULL) *p32 = a; else if (p16 != NULL) { if (a > 65535) errx(EX_DATAERR, "mask: must be 16 bit"); *p16 = (uint16_t)a; } else { if (a > 255) errx(EX_DATAERR, "mask: must be 8 bit"); p.fs.flow_mask.proto = (uint8_t)a; } if (a != 0) p.fs.flags_fs |= DN_HAVE_FLOW_MASK; ac--; av++; } /* end while, config masks */ end_mask: break; case TOK_RED: case TOK_GRED: NEED1("red/gred needs w_q/min_th/max_th/max_p\n"); p.fs.flags_fs |= DN_IS_RED; if (tok == TOK_GRED) p.fs.flags_fs |= DN_IS_GENTLE_RED; /* * the format for parameters is w_q/min_th/max_th/max_p */ if ((end = strsep(&av[0], "/"))) { double w_q = strtod(end, NULL); if (w_q > 1 || w_q <= 0) errx(EX_DATAERR, "0 < w_q <= 1"); p.fs.w_q = (int) (w_q * (1 << SCALE_RED)); } if ((end = strsep(&av[0], "/"))) { p.fs.min_th = strtoul(end, &end, 0); if (*end == 'K' || *end == 'k') p.fs.min_th *= 1024; } if ((end = strsep(&av[0], "/"))) { p.fs.max_th = strtoul(end, &end, 0); if (*end == 'K' || *end == 'k') p.fs.max_th *= 1024; } if ((end = strsep(&av[0], "/"))) { double max_p = strtod(end, NULL); if (max_p > 1 || max_p <= 0) errx(EX_DATAERR, "0 < max_p <= 1"); p.fs.max_p = (int)(max_p * (1 << SCALE_RED)); } ac--; av++; break; case TOK_DROPTAIL: p.fs.flags_fs &= ~(DN_IS_RED|DN_IS_GENTLE_RED); break; case TOK_BW: NEED1("bw needs bandwidth or interface\n"); if (do_pipe != 1) errx(EX_DATAERR, "bandwidth only valid for pipes"); /* * set clocking interface or bandwidth value */ if (av[0][0] >= 'a' && av[0][0] <= 'z') { int l = sizeof(p.if_name)-1; /* interface name */ strncpy(p.if_name, av[0], l); p.if_name[l] = '\0'; p.bandwidth = 0; } else { p.if_name[0] = '\0'; p.bandwidth = strtoul(av[0], &end, 0); if (*end == 'K' || *end == 'k') { end++; p.bandwidth *= 1000; } else if (*end == 'M') { end++; p.bandwidth *= 1000000; } if (*end == 'B' || !strncmp(end, "by", 2)) p.bandwidth *= 8; if (p.bandwidth < 0) errx(EX_DATAERR, "bandwidth too large"); } ac--; av++; break; case 
TOK_DELAY: if (do_pipe != 1) errx(EX_DATAERR, "delay only valid for pipes"); NEED1("delay needs argument 0..10000ms\n"); p.delay = strtoul(av[0], NULL, 0); ac--; av++; break; case TOK_WEIGHT: if (do_pipe == 1) errx(EX_DATAERR,"weight only valid for queues"); NEED1("weight needs argument 0..100\n"); p.fs.weight = strtoul(av[0], &end, 0); ac--; av++; break; case TOK_PIPE: if (do_pipe == 1) errx(EX_DATAERR,"pipe only valid for queues"); NEED1("pipe needs pipe_number\n"); p.fs.parent_nr = strtoul(av[0], &end, 0); ac--; av++; break; default: errx(EX_DATAERR, "unrecognised option ``%s''", av[-1]); } } if (do_pipe == 1) { if (p.pipe_nr == 0) errx(EX_DATAERR, "pipe_nr must be > 0"); if (p.delay > 10000) errx(EX_DATAERR, "delay must be < 10000"); } else { /* do_pipe == 2, queue */ if (p.fs.parent_nr == 0) errx(EX_DATAERR, "pipe must be > 0"); if (p.fs.weight >100) errx(EX_DATAERR, "weight must be <= 100"); } if (p.fs.flags_fs & DN_QSIZE_IS_BYTES) { if (p.fs.qsize > 1024*1024) errx(EX_DATAERR, "queue size must be < 1MB"); } else { if (p.fs.qsize > 100) errx(EX_DATAERR, "2 <= queue size <= 100"); } if (p.fs.flags_fs & DN_IS_RED) { size_t len; int lookup_depth, avg_pkt_size; double s, idle, weight, w_q; struct clockinfo ck; int t; if (p.fs.min_th >= p.fs.max_th) errx(EX_DATAERR, "min_th %d must be < than max_th %d", p.fs.min_th, p.fs.max_th); if (p.fs.max_th == 0) errx(EX_DATAERR, "max_th must be > 0"); len = sizeof(int); if (sysctlbyname("net.inet.ip.dummynet.red_lookup_depth", &lookup_depth, &len, NULL, 0) == -1) errx(1, "sysctlbyname(\"%s\")", "net.inet.ip.dummynet.red_lookup_depth"); if (lookup_depth == 0) errx(EX_DATAERR, "net.inet.ip.dummynet.red_lookup_depth" " must be greater than zero"); len = sizeof(int); if (sysctlbyname("net.inet.ip.dummynet.red_avg_pkt_size", &avg_pkt_size, &len, NULL, 0) == -1) errx(1, "sysctlbyname(\"%s\")", "net.inet.ip.dummynet.red_avg_pkt_size"); if (avg_pkt_size == 0) errx(EX_DATAERR, "net.inet.ip.dummynet.red_avg_pkt_size must" " be greater 
than zero"); len = sizeof(struct clockinfo); if (sysctlbyname("kern.clockrate", &ck, &len, NULL, 0) == -1) errx(1, "sysctlbyname(\"%s\")", "kern.clockrate"); /* * Ticks needed for sending a medium-sized packet. * Unfortunately, when we are configuring a WF2Q+ queue, we * do not have bandwidth information, because that is stored * in the parent pipe, and also we have multiple queues * competing for it. So we set s=0, which is not very * correct. But on the other hand, why do we want RED with * WF2Q+ ? */ if (p.bandwidth==0) /* this is a WF2Q+ queue */ s = 0; else s = ck.hz * avg_pkt_size * 8 / p.bandwidth; /* * max idle time (in ticks) before avg queue size becomes 0. * NOTA: (3/w_q) is approx the value x so that * (1-w_q)^x < 10^-3. */ w_q = ((double)p.fs.w_q) / (1 << SCALE_RED); idle = s * 3. / w_q; p.fs.lookup_step = (int)idle / lookup_depth; if (!p.fs.lookup_step) p.fs.lookup_step = 1; weight = 1 - w_q; for (t = p.fs.lookup_step; t > 0; --t) weight *= weight; p.fs.lookup_weight = (int)(weight * (1 << SCALE_RED)); } i = do_cmd(IP_DUMMYNET_CONFIGURE, &p, sizeof p); if (i) err(1, "setsockopt(%s)", "IP_DUMMYNET_CONFIGURE"); } static void get_mac_addr_mask(char *p, uint8_t *addr, uint8_t *mask) { int i, l; for (i=0; i<6; i++) addr[i] = mask[i] = 0; if (!strcmp(p, "any")) return; for (i=0; *p && i<6;i++, p++) { addr[i] = strtol(p, &p, 16); if (*p != ':') /* we start with the mask */ break; } if (*p == '/') { /* mask len */ l = strtol(p+1, &p, 0); for (i=0; l>0; l -=8, i++) mask[i] = (l >=8) ? 0xff : (~0) << (8-l); } else if (*p == '&') { /* mask */ for (i=0, p++; *p && i<6;i++, p++) { mask[i] = strtol(p, &p, 16); if (*p != ':') break; } } else if (*p == '\0') { for (i=0; i<6; i++) mask[i] = 0xff; } for (i=0; i<6; i++) addr[i] &= mask[i]; } /* * helper function, updates the pointer to cmd with the length * of the current command, and also cleans up the first word of * the new command in case it has been clobbered before. 
*/ static ipfw_insn * next_cmd(ipfw_insn *cmd) { cmd += F_LEN(cmd); bzero(cmd, sizeof(*cmd)); return cmd; } /* * Takes arguments and copies them into a comment */ static void fill_comment(ipfw_insn *cmd, int ac, char **av) { int i, l; char *p = (char *)(cmd + 1); cmd->opcode = O_NOP; cmd->len = (cmd->len & (F_NOT | F_OR)); /* Compute length of comment string. */ for (i = 0, l = 0; i < ac; i++) l += strlen(av[i]) + 1; if (l == 0) return; if (l > 84) errx(EX_DATAERR, "comment too long (max 80 chars)"); l = 1 + (l+3)/4; cmd->len = (cmd->len & (F_NOT | F_OR)) | l; for (i = 0; i < ac; i++) { strcpy(p, av[i]); p += strlen(av[i]); *p++ = ' '; } *(--p) = '\0'; } /* * A function to fill simple commands of size 1. * Existing flags are preserved. */ static void fill_cmd(ipfw_insn *cmd, enum ipfw_opcodes opcode, int flags, uint16_t arg) { cmd->opcode = opcode; cmd->len = ((cmd->len | flags) & (F_NOT | F_OR)) | 1; cmd->arg1 = arg; } /* * Fetch and add the MAC address and type, with masks. This generates one or * two microinstructions, and returns the pointer to the last one. */ static ipfw_insn * add_mac(ipfw_insn *cmd, int ac, char *av[]) { ipfw_insn_mac *mac; if (ac < 2) errx(EX_DATAERR, "MAC dst src"); cmd->opcode = O_MACADDR2; cmd->len = (cmd->len & (F_NOT | F_OR)) | F_INSN_SIZE(ipfw_insn_mac); mac = (ipfw_insn_mac *)cmd; get_mac_addr_mask(av[0], mac->addr, mac->mask); /* dst */ get_mac_addr_mask(av[1], &(mac->addr[6]), &(mac->mask[6])); /* src */ return cmd; } static ipfw_insn * add_mactype(ipfw_insn *cmd, int ac, char *av) { if (ac < 1) errx(EX_DATAERR, "missing MAC type"); if (strcmp(av, "any") != 0) { /* we have a non-null type */ fill_newports((ipfw_insn_u16 *)cmd, av, IPPROTO_ETHERTYPE); cmd->opcode = O_MAC_TYPE; return cmd; } else return NULL; } static ipfw_insn * add_proto(ipfw_insn *cmd, char *av) { struct protoent *pe; u_char proto = 0; if (!strncmp(av, "all", strlen(av))) ; /* same as "ip" */ else if ((proto = atoi(av)) > 0) ; /* all done! 
*/ else if ((pe = getprotobyname(av)) != NULL) proto = pe->p_proto; else return NULL; if (proto != IPPROTO_IP) fill_cmd(cmd, O_PROTO, 0, proto); return cmd; } static ipfw_insn * add_srcip(ipfw_insn *cmd, char *av) { fill_ip((ipfw_insn_ip *)cmd, av); if (cmd->opcode == O_IP_DST_SET) /* set */ cmd->opcode = O_IP_SRC_SET; + else if (cmd->opcode == O_IP_DST_LOOKUP) /* table */ + cmd->opcode = O_IP_SRC_LOOKUP; else if (F_LEN(cmd) == F_INSN_SIZE(ipfw_insn)) /* me */ cmd->opcode = O_IP_SRC_ME; else if (F_LEN(cmd) == F_INSN_SIZE(ipfw_insn_u32)) /* one IP */ cmd->opcode = O_IP_SRC; else /* addr/mask */ cmd->opcode = O_IP_SRC_MASK; return cmd; } static ipfw_insn * add_dstip(ipfw_insn *cmd, char *av) { fill_ip((ipfw_insn_ip *)cmd, av); if (cmd->opcode == O_IP_DST_SET) /* set */ ; + else if (cmd->opcode == O_IP_DST_LOOKUP) /* table */ + ; else if (F_LEN(cmd) == F_INSN_SIZE(ipfw_insn)) /* me */ cmd->opcode = O_IP_DST_ME; else if (F_LEN(cmd) == F_INSN_SIZE(ipfw_insn_u32)) /* one IP */ cmd->opcode = O_IP_DST; else /* addr/mask */ cmd->opcode = O_IP_DST_MASK; return cmd; } static ipfw_insn * add_ports(ipfw_insn *cmd, char *av, u_char proto, int opcode) { if (!strncmp(av, "any", strlen(av))) { return NULL; } else if (fill_newports((ipfw_insn_u16 *)cmd, av, proto)) { /* XXX todo: check that we have a protocol with ports */ cmd->opcode = opcode; return cmd; } return NULL; } /* * Parse arguments and assemble the microinstructions which make up a rule. * Rules are added into the 'rulebuf' and then copied in the correct order * into the actual rule. * * The syntax for a rule starts with the action, followed by an * optional log action, and the various match patterns. * In the assembled microcode, the first opcode must be an O_PROBE_STATE * (generated if the rule includes a keep-state option), then the * various match patterns, the "log" action, and the actual action. 
* */ static void add(int ac, char *av[]) { /* * rules are added into the 'rulebuf' and then copied in * the correct order into the actual rule. * Some things that need to go out of order (prob, action etc.) * go into actbuf[]. */ static uint32_t rulebuf[255], actbuf[255], cmdbuf[255]; ipfw_insn *src, *dst, *cmd, *action, *prev=NULL; ipfw_insn *first_cmd; /* first match pattern */ struct ip_fw *rule; /* * various flags used to record that we entered some fields. */ ipfw_insn *have_state = NULL; /* check-state or keep-state */ int i; int open_par = 0; /* open parenthesis ( */ /* proto is here because it is used to fetch ports */ u_char proto = IPPROTO_IP; /* default protocol */ double match_prob = 1; /* match probability, default is always match */ bzero(actbuf, sizeof(actbuf)); /* actions go here */ bzero(cmdbuf, sizeof(cmdbuf)); bzero(rulebuf, sizeof(rulebuf)); rule = (struct ip_fw *)rulebuf; cmd = (ipfw_insn *)cmdbuf; action = (ipfw_insn *)actbuf; av++; ac--; /* [rule N] -- Rule number optional */ if (ac && isdigit(**av)) { rule->rulenum = atoi(*av); av++; ac--; } /* [set N] -- set number (0..RESVD_SET), optional */ if (ac > 1 && !strncmp(*av, "set", strlen(*av))) { int set = strtoul(av[1], NULL, 10); if (set < 0 || set > RESVD_SET) errx(EX_DATAERR, "illegal set %s", av[1]); rule->set = set; av += 2; ac -= 2; } /* [prob D] -- match probability, optional */ if (ac > 1 && !strncmp(*av, "prob", strlen(*av))) { match_prob = strtod(av[1], NULL); if (match_prob <= 0 || match_prob > 1) errx(EX_DATAERR, "illegal match prob. 
%s", av[1]); av += 2; ac -= 2; } /* action -- mandatory */ NEED1("missing action"); i = match_token(rule_actions, *av); ac--; av++; action->len = 1; /* default */ switch(i) { case TOK_CHECKSTATE: have_state = action; action->opcode = O_CHECK_STATE; break; case TOK_ACCEPT: action->opcode = O_ACCEPT; break; case TOK_DENY: action->opcode = O_DENY; action->arg1 = 0; break; case TOK_REJECT: action->opcode = O_REJECT; action->arg1 = ICMP_UNREACH_HOST; break; case TOK_RESET: action->opcode = O_REJECT; action->arg1 = ICMP_REJECT_RST; break; case TOK_UNREACH: action->opcode = O_REJECT; NEED1("missing reject code"); fill_reject_code(&action->arg1, *av); ac--; av++; break; case TOK_COUNT: action->opcode = O_COUNT; break; case TOK_QUEUE: case TOK_PIPE: action->len = F_INSN_SIZE(ipfw_insn_pipe); case TOK_SKIPTO: if (i == TOK_QUEUE) action->opcode = O_QUEUE; else if (i == TOK_PIPE) action->opcode = O_PIPE; else if (i == TOK_SKIPTO) action->opcode = O_SKIPTO; NEED1("missing skipto/pipe/queue number"); action->arg1 = strtoul(*av, NULL, 10); av++; ac--; break; case TOK_DIVERT: case TOK_TEE: action->opcode = (i == TOK_DIVERT) ? 
O_DIVERT : O_TEE; NEED1("missing divert/tee port"); action->arg1 = strtoul(*av, NULL, 0); if (action->arg1 == 0) { struct servent *s; setservent(1); s = getservbyname(av[0], "divert"); if (s != NULL) action->arg1 = ntohs(s->s_port); else errx(EX_DATAERR, "illegal divert/tee port"); } ac--; av++; break; case TOK_FORWARD: { ipfw_insn_sa *p = (ipfw_insn_sa *)action; char *s, *end; NEED1("missing forward address[:port]"); action->opcode = O_FORWARD_IP; action->len = F_INSN_SIZE(ipfw_insn_sa); p->sa.sin_len = sizeof(struct sockaddr_in); p->sa.sin_family = AF_INET; p->sa.sin_port = 0; /* * locate the address-port separator (':' or ',') */ s = strchr(*av, ':'); if (s == NULL) s = strchr(*av, ','); if (s != NULL) { *(s++) = '\0'; i = strtoport(s, &end, 0 /* base */, 0 /* proto */); if (s == end) errx(EX_DATAERR, "illegal forwarding port ``%s''", s); p->sa.sin_port = (u_short)i; } lookup_host(*av, &(p->sa.sin_addr)); } ac--; av++; break; case TOK_COMMENT: /* pretend it is a 'count' rule followed by the comment */ action->opcode = O_COUNT; ac++; av--; /* go back... */ break; default: errx(EX_DATAERR, "invalid action %s\n", av[-1]); } action = next_cmd(action); /* * [log [logamount N]] -- log, optional * * If exists, it goes first in the cmdbuf, but then it is * skipped in the copy section to the end of the buffer. 
*/ if (ac && !strncmp(*av, "log", strlen(*av))) { ipfw_insn_log *c = (ipfw_insn_log *)cmd; int l; cmd->len = F_INSN_SIZE(ipfw_insn_log); cmd->opcode = O_LOG; av++; ac--; if (ac && !strncmp(*av, "logamount", strlen(*av))) { ac--; av++; NEED1("logamount requires argument"); l = atoi(*av); if (l < 0) errx(EX_DATAERR, "logamount must be positive"); c->max_log = l; ac--; av++; } cmd = next_cmd(cmd); } if (have_state) /* must be a check-state, we are done */ goto done; #define OR_START(target) \ if (ac && (*av[0] == '(' || *av[0] == '{')) { \ if (open_par) \ errx(EX_USAGE, "nested \"(\" not allowed\n"); \ prev = NULL; \ open_par = 1; \ if ( (av[0])[1] == '\0') { \ ac--; av++; \ } else \ (*av)++; \ } \ target: \ #define CLOSE_PAR \ if (open_par) { \ if (ac && ( \ !strncmp(*av, ")", strlen(*av)) || \ !strncmp(*av, "}", strlen(*av)) )) { \ prev = NULL; \ open_par = 0; \ ac--; av++; \ } else \ errx(EX_USAGE, "missing \")\"\n"); \ } #define NOT_BLOCK \ if (ac && !strncmp(*av, "not", strlen(*av))) { \ if (cmd->len & F_NOT) \ errx(EX_USAGE, "double \"not\" not allowed\n"); \ cmd->len |= F_NOT; \ ac--; av++; \ } #define OR_BLOCK(target) \ if (ac && !strncmp(*av, "or", strlen(*av))) { \ if (prev == NULL || open_par == 0) \ errx(EX_DATAERR, "invalid OR block"); \ prev->len |= F_OR; \ ac--; av++; \ goto target; \ } \ CLOSE_PAR; first_cmd = cmd; #if 0 /* * MAC addresses, optional. * If we have this, we skip the part "proto from src to dst" * and jump straight to the option parsing. 
*/ NOT_BLOCK; NEED1("missing protocol"); if (!strncmp(*av, "MAC", strlen(*av)) || !strncmp(*av, "mac", strlen(*av))) { ac--; av++; /* the "MAC" keyword */ add_mac(cmd, ac, av); /* exits in case of errors */ cmd = next_cmd(cmd); ac -= 2; av += 2; /* dst-mac and src-mac */ NOT_BLOCK; NEED1("missing mac type"); if (add_mactype(cmd, ac, av[0])) cmd = next_cmd(cmd); ac--; av++; /* any or mac-type */ goto read_options; } #endif /* * protocol, mandatory */ OR_START(get_proto); NOT_BLOCK; NEED1("missing protocol"); if (add_proto(cmd, *av)) { av++; ac--; if (F_LEN(cmd) == 0) /* plain IP */ proto = 0; else { proto = cmd->arg1; prev = cmd; cmd = next_cmd(cmd); } } else if (first_cmd != cmd) { errx(EX_DATAERR, "invalid protocol ``%s''", *av); } else goto read_options; OR_BLOCK(get_proto); /* * "from", mandatory */ if (!ac || strncmp(*av, "from", strlen(*av))) errx(EX_USAGE, "missing ``from''"); ac--; av++; /* * source IP, mandatory */ OR_START(source_ip); NOT_BLOCK; /* optional "not" */ NEED1("missing source address"); if (add_srcip(cmd, *av)) { ac--; av++; if (F_LEN(cmd) != 0) { /* ! any */ prev = cmd; cmd = next_cmd(cmd); } } OR_BLOCK(source_ip); /* * source ports, optional */ NOT_BLOCK; /* optional "not" */ if (ac) { if (!strncmp(*av, "any", strlen(*av)) || add_ports(cmd, *av, proto, O_IP_SRCPORT)) { ac--; av++; if (F_LEN(cmd) != 0) cmd = next_cmd(cmd); } } /* * "to", mandatory */ if (!ac || strncmp(*av, "to", strlen(*av))) errx(EX_USAGE, "missing ``to''"); av++; ac--; /* * destination, mandatory */ OR_START(dest_ip); NOT_BLOCK; /* optional "not" */ NEED1("missing dst address"); if (add_dstip(cmd, *av)) { ac--; av++; if (F_LEN(cmd) != 0) { /* ! any */ prev = cmd; cmd = next_cmd(cmd); } } OR_BLOCK(dest_ip); /* * dest. 
ports, optional */ NOT_BLOCK; /* optional "not" */ if (ac) { if (!strncmp(*av, "any", strlen(*av)) || add_ports(cmd, *av, proto, O_IP_DSTPORT)) { ac--; av++; if (F_LEN(cmd) != 0) cmd = next_cmd(cmd); } } read_options: if (ac && first_cmd == cmd) { /* * nothing specified so far, store in the rule to ease * printout later. */ rule->_pad = 1; } prev = NULL; while (ac) { char *s; ipfw_insn_u32 *cmd32; /* alias for cmd */ s = *av; cmd32 = (ipfw_insn_u32 *)cmd; if (*s == '!') { /* alternate syntax for NOT */ if (cmd->len & F_NOT) errx(EX_USAGE, "double \"not\" not allowed\n"); cmd->len = F_NOT; s++; } i = match_token(rule_options, s); ac--; av++; switch(i) { case TOK_NOT: if (cmd->len & F_NOT) errx(EX_USAGE, "double \"not\" not allowed\n"); cmd->len = F_NOT; break; case TOK_OR: if (open_par == 0 || prev == NULL) errx(EX_USAGE, "invalid \"or\" block\n"); prev->len |= F_OR; break; case TOK_STARTBRACE: if (open_par) errx(EX_USAGE, "+nested \"(\" not allowed\n"); open_par = 1; break; case TOK_ENDBRACE: if (!open_par) errx(EX_USAGE, "+missing \")\"\n"); open_par = 0; prev = NULL; break; case TOK_IN: fill_cmd(cmd, O_IN, 0, 0); break; case TOK_OUT: cmd->len ^= F_NOT; /* toggle F_NOT */ fill_cmd(cmd, O_IN, 0, 0); break; case TOK_FRAG: fill_cmd(cmd, O_FRAG, 0, 0); break; case TOK_LAYER2: fill_cmd(cmd, O_LAYER2, 0, 0); break; case TOK_XMIT: case TOK_RECV: case TOK_VIA: NEED1("recv, xmit, via require interface name" " or address"); fill_iface((ipfw_insn_if *)cmd, av[0]); ac--; av++; if (F_LEN(cmd) == 0) /* not a valid address */ break; if (i == TOK_XMIT) cmd->opcode = O_XMIT; else if (i == TOK_RECV) cmd->opcode = O_RECV; else if (i == TOK_VIA) cmd->opcode = O_VIA; break; case TOK_ICMPTYPES: NEED1("icmptypes requires list of types"); fill_icmptypes((ipfw_insn_u32 *)cmd, *av); av++; ac--; break; case TOK_IPTTL: NEED1("ipttl requires TTL"); if (strpbrk(*av, "-,")) { if (!add_ports(cmd, *av, 0, O_IPTTL)) errx(EX_DATAERR, "invalid ipttl %s", *av); } else fill_cmd(cmd, O_IPTTL, 0, 
strtoul(*av, NULL, 0)); ac--; av++; break; case TOK_IPID: NEED1("ipid requires id"); if (strpbrk(*av, "-,")) { if (!add_ports(cmd, *av, 0, O_IPID)) errx(EX_DATAERR, "invalid ipid %s", *av); } else fill_cmd(cmd, O_IPID, 0, strtoul(*av, NULL, 0)); ac--; av++; break; case TOK_IPLEN: NEED1("iplen requires length"); if (strpbrk(*av, "-,")) { if (!add_ports(cmd, *av, 0, O_IPLEN)) errx(EX_DATAERR, "invalid ip len %s", *av); } else fill_cmd(cmd, O_IPLEN, 0, strtoul(*av, NULL, 0)); ac--; av++; break; case TOK_IPVER: NEED1("ipver requires version"); fill_cmd(cmd, O_IPVER, 0, strtoul(*av, NULL, 0)); ac--; av++; break; case TOK_IPPRECEDENCE: NEED1("ipprecedence requires value"); fill_cmd(cmd, O_IPPRECEDENCE, 0, (strtoul(*av, NULL, 0) & 7) << 5); ac--; av++; break; case TOK_IPOPTS: NEED1("missing argument for ipoptions"); fill_flags(cmd, O_IPOPT, f_ipopts, *av); ac--; av++; break; case TOK_IPTOS: NEED1("missing argument for iptos"); fill_flags(cmd, O_IPTOS, f_iptos, *av); ac--; av++; break; case TOK_UID: NEED1("uid requires argument"); { char *end; uid_t uid; struct passwd *pwd; cmd->opcode = O_UID; uid = strtoul(*av, &end, 0); pwd = (*end == '\0') ? getpwuid(uid) : getpwnam(*av); if (pwd == NULL) errx(EX_DATAERR, "uid \"%s\" nonexistent", *av); cmd32->d[0] = pwd->pw_uid; cmd->len = F_INSN_SIZE(ipfw_insn_u32); ac--; av++; } break; case TOK_GID: NEED1("gid requires argument"); { char *end; gid_t gid; struct group *grp; cmd->opcode = O_GID; gid = strtoul(*av, &end, 0); grp = (*end == '\0') ? 
getgrgid(gid) : getgrnam(*av); if (grp == NULL) errx(EX_DATAERR, "gid \"%s\" nonexistent", *av); cmd32->d[0] = grp->gr_gid; cmd->len = F_INSN_SIZE(ipfw_insn_u32); ac--; av++; } break; case TOK_ESTAB: fill_cmd(cmd, O_ESTAB, 0, 0); break; case TOK_SETUP: fill_cmd(cmd, O_TCPFLAGS, 0, (TH_SYN) | ( (TH_ACK) & 0xff) <<8 ); break; case TOK_TCPOPTS: NEED1("missing argument for tcpoptions"); fill_flags(cmd, O_TCPOPTS, f_tcpopts, *av); ac--; av++; break; case TOK_TCPSEQ: case TOK_TCPACK: NEED1("tcpseq/tcpack requires argument"); cmd->len = F_INSN_SIZE(ipfw_insn_u32); cmd->opcode = (i == TOK_TCPSEQ) ? O_TCPSEQ : O_TCPACK; cmd32->d[0] = htonl(strtoul(*av, NULL, 0)); ac--; av++; break; case TOK_TCPWIN: NEED1("tcpwin requires length"); fill_cmd(cmd, O_TCPWIN, 0, htons(strtoul(*av, NULL, 0))); ac--; av++; break; case TOK_TCPFLAGS: NEED1("missing argument for tcpflags"); cmd->opcode = O_TCPFLAGS; fill_flags(cmd, O_TCPFLAGS, f_tcpflags, *av); ac--; av++; break; case TOK_KEEPSTATE: if (open_par) errx(EX_USAGE, "keep-state cannot be part " "of an or block"); if (have_state) errx(EX_USAGE, "only one of keep-state " "and limit is allowed"); have_state = cmd; fill_cmd(cmd, O_KEEP_STATE, 0, 0); break; case TOK_LIMIT: if (open_par) errx(EX_USAGE, "limit cannot be part " "of an or block"); if (have_state) errx(EX_USAGE, "only one of keep-state " "and limit is allowed"); NEED1("limit needs mask and # of connections"); have_state = cmd; { ipfw_insn_limit *c = (ipfw_insn_limit *)cmd; cmd->len = F_INSN_SIZE(ipfw_insn_limit); cmd->opcode = O_LIMIT; c->limit_mask = 0; c->conn_limit = 0; for (; ac >1 ;) { int val; val = match_token(limit_masks, *av); if (val <= 0) break; c->limit_mask |= val; ac--; av++; } c->conn_limit = atoi(*av); if (c->conn_limit == 0) errx(EX_USAGE, "limit: limit must be >0"); if (c->limit_mask == 0) errx(EX_USAGE, "missing limit mask"); ac--; av++; } break; case TOK_PROTO: NEED1("missing protocol"); if (add_proto(cmd, *av)) { proto = cmd->arg1; ac--; av++; } else 
errx(EX_DATAERR, "invalid protocol ``%s''", *av); break; case TOK_SRCIP: NEED1("missing source IP"); if (add_srcip(cmd, *av)) { ac--; av++; } break; case TOK_DSTIP: NEED1("missing destination IP"); if (add_dstip(cmd, *av)) { ac--; av++; } break; case TOK_SRCPORT: NEED1("missing source port"); if (!strncmp(*av, "any", strlen(*av)) || add_ports(cmd, *av, proto, O_IP_SRCPORT)) { ac--; av++; } else errx(EX_DATAERR, "invalid source port %s", *av); break; case TOK_DSTPORT: NEED1("missing destination port"); if (!strncmp(*av, "any", strlen(*av)) || add_ports(cmd, *av, proto, O_IP_DSTPORT)) { ac--; av++; } else errx(EX_DATAERR, "invalid destination port %s", *av); break; case TOK_MAC: if (ac < 2) errx(EX_USAGE, "MAC dst-mac src-mac"); if (add_mac(cmd, ac, av)) { ac -= 2; av += 2; } break; case TOK_MACTYPE: NEED1("missing mac type"); if (!add_mactype(cmd, ac, *av)) errx(EX_DATAERR, "invalid mac type %s", *av); ac--; av++; break; case TOK_VERREVPATH: fill_cmd(cmd, O_VERREVPATH, 0, 0); break; case TOK_IPSEC: fill_cmd(cmd, O_IPSEC, 0, 0); break; case TOK_COMMENT: fill_comment(cmd, ac, av); av += ac; ac = 0; break; default: errx(EX_USAGE, "unrecognised option [%d] %s\n", i, s); } if (F_LEN(cmd) > 0) { /* prepare to advance */ prev = cmd; cmd = next_cmd(cmd); } } done: /* * Now copy stuff into the rule. * If we have a keep-state option, the first instruction * must be a PROBE_STATE (which is generated here). * If we have a LOG option, it was stored as the first command, * and now must be moved to the top of the action part. */ dst = (ipfw_insn *)rule->cmd; /* * First thing to write into the command stream is the match probability. 
*/ if (match_prob != 1) { /* 1 means always match */ dst->opcode = O_PROB; dst->len = 2; *((int32_t *)(dst+1)) = (int32_t)(match_prob * 0x7fffffff); dst += dst->len; } /* * generate O_PROBE_STATE if necessary */ if (have_state && have_state->opcode != O_CHECK_STATE) { fill_cmd(dst, O_PROBE_STATE, 0, 0); dst = next_cmd(dst); } /* * copy all commands but O_LOG, O_KEEP_STATE, O_LIMIT */ for (src = (ipfw_insn *)cmdbuf; src != cmd; src += i) { i = F_LEN(src); switch (src->opcode) { case O_LOG: case O_KEEP_STATE: case O_LIMIT: break; default: bcopy(src, dst, i * sizeof(uint32_t)); dst += i; } } /* * put back the have_state command as last opcode */ if (have_state && have_state->opcode != O_CHECK_STATE) { i = F_LEN(have_state); bcopy(have_state, dst, i * sizeof(uint32_t)); dst += i; } /* * start action section */ rule->act_ofs = dst - rule->cmd; /* * put back O_LOG if necessary */ src = (ipfw_insn *)cmdbuf; if (src->opcode == O_LOG) { i = F_LEN(src); bcopy(src, dst, i * sizeof(uint32_t)); dst += i; } /* * copy all other actions */ for (src = (ipfw_insn *)actbuf; src != action; src += i) { i = F_LEN(src); bcopy(src, dst, i * sizeof(uint32_t)); dst += i; } rule->cmd_len = (uint32_t *)dst - (uint32_t *)(rule->cmd); i = (char *)dst - (char *)rule; if (do_cmd(IP_FW_ADD, rule, (uintptr_t)&i) == -1) err(EX_UNAVAILABLE, "getsockopt(%s)", "IP_FW_ADD"); if (!do_quiet) show_ipfw(rule, 0, 0); } static void zero(int ac, char *av[], int optname /* IP_FW_ZERO or IP_FW_RESETLOG */) { int rulenum; int failed = EX_OK; char const *name = optname == IP_FW_ZERO ? "ZERO" : "RESETLOG"; av++; ac--; if (!ac) { /* clear all entries */ if (do_cmd(optname, NULL, 0) < 0) err(EX_UNAVAILABLE, "setsockopt(IP_FW_%s)", name); if (!do_quiet) printf("%s.\n", optname == IP_FW_ZERO ? 
"Accounting cleared":"Logging counts reset"); return; } while (ac) { /* Rule number */ if (isdigit(**av)) { rulenum = atoi(*av); av++; ac--; if (do_cmd(optname, &rulenum, sizeof rulenum)) { warn("rule %u: setsockopt(IP_FW_%s)", rulenum, name); failed = EX_UNAVAILABLE; } else if (!do_quiet) printf("Entry %d %s.\n", rulenum, optname == IP_FW_ZERO ? "cleared" : "logging count reset"); } else { errx(EX_USAGE, "invalid rule number ``%s''", *av); } } if (failed != EX_OK) exit(failed); } static void flush(int force) { int cmd = do_pipe ? IP_DUMMYNET_FLUSH : IP_FW_FLUSH; if (!force && !do_quiet) { /* need to ask user */ int c; printf("Are you sure? [yn] "); fflush(stdout); do { c = toupper(getc(stdin)); while (c != '\n' && getc(stdin) != '\n') if (feof(stdin)) return; /* and do not flush */ } while (c != 'Y' && c != 'N'); printf("\n"); if (c == 'N') /* user said no */ return; } if (do_cmd(cmd, NULL, 0) < 0) err(EX_UNAVAILABLE, "setsockopt(IP_%s_FLUSH)", do_pipe ? "DUMMYNET" : "FW"); if (!do_quiet) printf("Flushed all %s.\n", do_pipe ? "pipes" : "rules"); } /* * Free a the (locally allocated) copy of command line arguments. 
*/ static void free_args(int ac, char **av) { int i; for (i=0; i < ac; i++) free(av[i]); free(av); } /* + * This one handles all table-related commands + * ipfw table N add addr[/masklen] [value] + * ipfw table N delete addr[/masklen] + * ipfw table N flush + * ipfw table N list + */ +static void +table_handler(int ac, char *av[]) +{ + ipfw_table_entry ent; + ipfw_table *tbl; + int do_add; + char *p; + socklen_t l; + uint32_t a; + + ac--; av++; + if (ac && isdigit(**av)) { + ent.tbl = atoi(*av); + ac--; av++; + } else + errx(EX_USAGE, "table number required"); + NEED1("table needs command"); + if (strncmp(*av, "add", strlen(*av)) == 0 || + strncmp(*av, "delete", strlen(*av)) == 0) { + do_add = **av == 'a'; + ac--; av++; + if (!ac) + errx(EX_USAGE, "IP address required"); + p = strchr(*av, '/'); + if (p) { + *p++ = '\0'; + ent.masklen = atoi(p); + if (ent.masklen > 32) + errx(EX_DATAERR, "bad width ``%s''", p); + } else + ent.masklen = 32; + if (lookup_host(*av, (struct in_addr *)&ent.addr) != 0) + errx(EX_NOHOST, "hostname ``%s'' unknown", *av); + ac--; av++; + if (do_add && ac) + ent.value = strtoul(*av, NULL, 0); + else + ent.value = 0; + if (do_cmd(do_add ? IP_FW_TABLE_ADD : IP_FW_TABLE_DEL, + &ent, sizeof(ent)) < 0) + err(EX_OSERR, "setsockopt(IP_FW_TABLE_%s)", + do_add ? 
"ADD" : "DEL"); + } else if (strncmp(*av, "flush", strlen(*av)) == 0) { + if (do_cmd(IP_FW_TABLE_FLUSH, &ent.tbl, sizeof(ent.tbl)) < 0) + err(EX_OSERR, "setsockopt(IP_FW_TABLE_FLUSH)"); + } else if (strncmp(*av, "list", strlen(*av)) == 0) { + a = ent.tbl; + l = sizeof(a); + if (do_cmd(IP_FW_TABLE_GETSIZE, &a, (uintptr_t)&l) < 0) + err(EX_OSERR, "getsockopt(IP_FW_TABLE_GETSIZE)"); + l = sizeof(*tbl) + a * sizeof(ipfw_table_entry); + tbl = malloc(l); + if (tbl == NULL) + err(EX_OSERR, "malloc"); + tbl->tbl = ent.tbl; + if (do_cmd(IP_FW_TABLE_LIST, tbl, (uintptr_t)&l) < 0) + err(EX_OSERR, "getsockopt(IP_FW_TABLE_LIST)"); + for (a = 0; a < tbl->cnt; a++) { + printf("%s/%u %u\n", + inet_ntoa(*(struct in_addr *)&tbl->ent[a].addr), + tbl->ent[a].masklen, tbl->ent[a].value); + } + } else + errx(EX_USAGE, "invalid table command %s", *av); +} + +/* * Called with the arguments (excluding program name). * Returns 0 if successful, 1 if empty command, errx() in case of errors. */ static int ipfw_main(int oldac, char **oldav) { int ch, ac, save_ac; char **av, **save_av; int do_acct = 0; /* Show packet/byte count */ #define WHITESP " \t\f\v\n\r" if (oldac == 0) return 1; else if (oldac == 1) { /* * If we are called with a single string, try to split it into * arguments for subsequent parsing. * But first, remove spaces after a ',', by copying the string * in-place. */ char *arg = oldav[0]; /* The string... */ int l = strlen(arg); int copy = 0; /* 1 if we need to copy, 0 otherwise */ int i, j; for (i = j = 0; i < l; i++) { if (arg[i] == '#') /* comment marker */ break; if (copy) { arg[j++] = arg[i]; copy = !index("," WHITESP, arg[i]); } else { copy = !index(WHITESP, arg[i]); if (copy) arg[j++] = arg[i]; } } if (!copy && j > 0) /* last char was a 'blank', remove it */ j--; l = j; /* the new argument length */ arg[j++] = '\0'; if (l == 0) /* empty string! */ return 1; /* * First, count number of arguments. 
Because of the previous * processing, this is just the number of blanks plus 1. */ for (i = 0, ac = 1; i < l; i++) if (index(WHITESP, arg[i]) != NULL) ac++; av = calloc(ac, sizeof(char *)); /* * Second, copy arguments from cmd[] to av[]. For each one, * j is the initial character, i is the one past the end. */ for (ac = 0, i = j = 0; i < l; i++) if (index(WHITESP, arg[i]) != NULL || i == l-1) { if (i == l-1) i++; av[ac] = calloc(i-j+1, 1); bcopy(arg+j, av[ac], i-j); ac++; j = i + 1; } } else { /* * If an argument ends with ',' join with the next one. */ int first, i, l; av = calloc(oldac, sizeof(char *)); for (first = i = ac = 0, l = 0; i < oldac; i++) { char *arg = oldav[i]; int k = strlen(arg); l += k; if (arg[k-1] != ',' || i == oldac-1) { /* Time to copy. */ av[ac] = calloc(l+1, 1); for (l=0; first <= i; first++) { strcat(av[ac]+l, oldav[first]); l += strlen(oldav[first]); } ac++; l = 0; first = i+1; } } } /* Set the force flag for non-interactive processes */ if (!do_force) do_force = !isatty(STDIN_FILENO); /* Save arguments for final freeing of memory. */ save_ac = ac; save_av = av; optind = optreset = 0; while ((ch = getopt(ac, av, "acdefhnNqs:STtv")) != -1) switch (ch) { case 'a': do_acct = 1; break; case 'c': do_compact = 1; break; case 'd': do_dynamic = 1; break; case 'e': do_expired = 1; break; case 'f': do_force = 1; break; case 'h': /* help */ free_args(save_ac, save_av); help(); break; /* NOTREACHED */ case 'n': test_only = 1; break; case 'N': do_resolv = 1; break; case 'q': do_quiet = 1; break; case 's': /* sort */ do_sort = atoi(optarg); break; case 'S': show_sets = 1; break; case 't': do_time = 1; break; case 'T': do_time = 2; /* numeric timestamp */ break; case 'v': /* verbose */ verbose = 1; break; default: free_args(save_ac, save_av); return 1; } ac -= optind; av += optind; NEED1("bad arguments, for usage summary ``ipfw''"); /* * An undocumented behaviour of ipfw1 was to allow rule numbers first, * e.g. "100 add allow ..." 
instead of "add 100 allow ...". * In case, swap first and second argument to get the normal form. */ if (ac > 1 && isdigit(*av[0])) { char *p = av[0]; av[0] = av[1]; av[1] = p; } /* * optional: pipe or queue */ do_pipe = 0; if (!strncmp(*av, "pipe", strlen(*av))) do_pipe = 1; else if (!strncmp(*av, "queue", strlen(*av))) do_pipe = 2; if (do_pipe) { ac--; av++; } NEED1("missing command"); /* * For pipes and queues we normally say 'pipe NN config' * but the code is easier to parse as 'pipe config NN' * so we swap the two arguments. */ if (do_pipe > 0 && ac > 1 && isdigit(*av[0])) { char *p = av[0]; av[0] = av[1]; av[1] = p; } if (!strncmp(*av, "add", strlen(*av))) add(ac, av); else if (do_pipe && !strncmp(*av, "config", strlen(*av))) config_pipe(ac, av); else if (!strncmp(*av, "delete", strlen(*av))) delete(ac, av); else if (!strncmp(*av, "flush", strlen(*av))) flush(do_force); else if (!strncmp(*av, "zero", strlen(*av))) zero(ac, av, IP_FW_ZERO); else if (!strncmp(*av, "resetlog", strlen(*av))) zero(ac, av, IP_FW_RESETLOG); else if (!strncmp(*av, "print", strlen(*av)) || !strncmp(*av, "list", strlen(*av))) list(ac, av, do_acct); else if (!strncmp(*av, "set", strlen(*av))) sets_handler(ac, av); + else if (!strncmp(*av, "table", strlen(*av))) + table_handler(ac, av); else if (!strncmp(*av, "enable", strlen(*av))) sysctl_handler(ac, av, 1); else if (!strncmp(*av, "disable", strlen(*av))) sysctl_handler(ac, av, 0); else if (!strncmp(*av, "show", strlen(*av))) list(ac, av, 1 /* show counters */); else errx(EX_USAGE, "bad command `%s'", *av); /* Free memory allocated in the argument parsing. 
*/ free_args(save_ac, save_av); return 0; } static void ipfw_readfile(int ac, char *av[]) { #define MAX_ARGS 32 char buf[BUFSIZ]; char *cmd = NULL, *filename = av[ac-1]; int c, lineno=0; FILE *f = NULL; pid_t preproc = 0; filename = av[ac-1]; while ((c = getopt(ac, av, "cfNnp:qS")) != -1) { switch(c) { case 'c': do_compact = 1; break; case 'f': do_force = 1; break; case 'N': do_resolv = 1; break; case 'n': test_only = 1; break; case 'p': cmd = optarg; /* * Skip previous args and delete last one, so we * pass all but the last argument to the preprocessor * via av[optind-1] */ av += optind - 1; ac -= optind - 1; av[ac-1] = NULL; fprintf(stderr, "command is %s\n", av[0]); break; case 'q': do_quiet = 1; break; case 'S': show_sets = 1; break; default: errx(EX_USAGE, "bad arguments, for usage" " summary ``ipfw''"); } if (cmd != NULL) break; } if (cmd == NULL && ac != optind + 1) { fprintf(stderr, "ac %d, optind %d\n", ac, optind); errx(EX_USAGE, "extraneous filename arguments"); } if ((f = fopen(filename, "r")) == NULL) err(EX_UNAVAILABLE, "fopen: %s", filename); if (cmd != NULL) { /* pipe through preprocessor */ int pipedes[2]; if (pipe(pipedes) == -1) err(EX_OSERR, "cannot create pipe"); preproc = fork(); if (preproc == -1) err(EX_OSERR, "cannot fork"); if (preproc == 0) { /* * Child, will run the preprocessor with the * file on stdin and the pipe on stdout. 
*/ if (dup2(fileno(f), 0) == -1 || dup2(pipedes[1], 1) == -1) err(EX_OSERR, "dup2()"); fclose(f); close(pipedes[1]); close(pipedes[0]); execvp(cmd, av); err(EX_OSERR, "execvp(%s) failed", cmd); } else { /* parent, will reopen f as the pipe */ fclose(f); close(pipedes[1]); if ((f = fdopen(pipedes[0], "r")) == NULL) { int savederrno = errno; (void)kill(preproc, SIGTERM); errno = savederrno; err(EX_OSERR, "fdopen()"); } } } while (fgets(buf, BUFSIZ, f)) { /* read commands */ char linename[10]; char *args[1]; lineno++; sprintf(linename, "Line %d", lineno); setprogname(linename); /* XXX */ args[0] = buf; ipfw_main(1, args); } fclose(f); if (cmd != NULL) { int status; if (waitpid(preproc, &status, 0) == -1) errx(EX_OSERR, "waitpid()"); if (WIFEXITED(status) && WEXITSTATUS(status) != EX_OK) errx(EX_UNAVAILABLE, "preprocessor exited with status %d", WEXITSTATUS(status)); else if (WIFSIGNALED(status)) errx(EX_UNAVAILABLE, "preprocessor exited with signal %d", WTERMSIG(status)); } } int main(int ac, char *av[]) { /* * If the last argument is an absolute pathname, interpret it * as a file to be preprocessed. */ if (ac > 1 && av[ac - 1][0] == '/' && access(av[ac - 1], R_OK) == 0) ipfw_readfile(ac, av); else { if (ipfw_main(ac-1, av+1)) show_usage(); } return EX_OK; } Index: stable/4/sys/netinet/in.h =================================================================== --- stable/4/sys/netinet/in.h (revision 130570) +++ stable/4/sys/netinet/in.h (revision 130571) @@ -1,510 +1,516 @@ /* * Copyright (c) 1982, 1986, 1990, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)in.h 8.3 (Berkeley) 1/3/94 * $FreeBSD$ */ #ifndef _NETINET_IN_H_ #define _NETINET_IN_H_ /* * Constants and structures defined by the internet system, * Per RFC 790, September 1981, and numerous additions. 
*/ /* * Protocols (RFC 1700) */ #define IPPROTO_IP 0 /* dummy for IP */ #define IPPROTO_HOPOPTS 0 /* IP6 hop-by-hop options */ #define IPPROTO_ICMP 1 /* control message protocol */ #define IPPROTO_IGMP 2 /* group mgmt protocol */ #define IPPROTO_GGP 3 /* gateway^2 (deprecated) */ #define IPPROTO_IPV4 4 /* IPv4 encapsulation */ #define IPPROTO_IPIP IPPROTO_IPV4 /* for compatibility */ #define IPPROTO_TCP 6 /* tcp */ #define IPPROTO_ST 7 /* Stream protocol II */ #define IPPROTO_EGP 8 /* exterior gateway protocol */ #define IPPROTO_PIGP 9 /* private interior gateway */ #define IPPROTO_RCCMON 10 /* BBN RCC Monitoring */ #define IPPROTO_NVPII 11 /* network voice protocol*/ #define IPPROTO_PUP 12 /* pup */ #define IPPROTO_ARGUS 13 /* Argus */ #define IPPROTO_EMCON 14 /* EMCON */ #define IPPROTO_XNET 15 /* Cross Net Debugger */ #define IPPROTO_CHAOS 16 /* Chaos*/ #define IPPROTO_UDP 17 /* user datagram protocol */ #define IPPROTO_MUX 18 /* Multiplexing */ #define IPPROTO_MEAS 19 /* DCN Measurement Subsystems */ #define IPPROTO_HMP 20 /* Host Monitoring */ #define IPPROTO_PRM 21 /* Packet Radio Measurement */ #define IPPROTO_IDP 22 /* xns idp */ #define IPPROTO_TRUNK1 23 /* Trunk-1 */ #define IPPROTO_TRUNK2 24 /* Trunk-2 */ #define IPPROTO_LEAF1 25 /* Leaf-1 */ #define IPPROTO_LEAF2 26 /* Leaf-2 */ #define IPPROTO_RDP 27 /* Reliable Data */ #define IPPROTO_IRTP 28 /* Reliable Transaction */ #define IPPROTO_TP 29 /* tp-4 w/ class negotiation */ #define IPPROTO_BLT 30 /* Bulk Data Transfer */ #define IPPROTO_NSP 31 /* Network Services */ #define IPPROTO_INP 32 /* Merit Internodal */ #define IPPROTO_SEP 33 /* Sequential Exchange */ #define IPPROTO_3PC 34 /* Third Party Connect */ #define IPPROTO_IDPR 35 /* InterDomain Policy Routing */ #define IPPROTO_XTP 36 /* XTP */ #define IPPROTO_DDP 37 /* Datagram Delivery */ #define IPPROTO_CMTP 38 /* Control Message Transport */ #define IPPROTO_TPXX 39 /* TP++ Transport */ #define IPPROTO_IL 40 /* IL transport protocol */ #define 
IPPROTO_IPV6 41 /* IP6 header */ #define IPPROTO_SDRP 42 /* Source Demand Routing */ #define IPPROTO_ROUTING 43 /* IP6 routing header */ #define IPPROTO_FRAGMENT 44 /* IP6 fragmentation header */ #define IPPROTO_IDRP 45 /* InterDomain Routing*/ #define IPPROTO_RSVP 46 /* resource reservation */ #define IPPROTO_GRE 47 /* General Routing Encap. */ #define IPPROTO_MHRP 48 /* Mobile Host Routing */ #define IPPROTO_BHA 49 /* BHA */ #define IPPROTO_ESP 50 /* IP6 Encap Sec. Payload */ #define IPPROTO_AH 51 /* IP6 Auth Header */ #define IPPROTO_INLSP 52 /* Integ. Net Layer Security */ #define IPPROTO_SWIPE 53 /* IP with encryption */ #define IPPROTO_NHRP 54 /* Next Hop Resolution */ #define IPPROTO_MOBILE 55 /* IP Mobility */ #define IPPROTO_TLSP 56 /* Transport Layer Security */ #define IPPROTO_SKIP 57 /* SKIP */ #define IPPROTO_ICMPV6 58 /* ICMP6 */ #define IPPROTO_NONE 59 /* IP6 no next header */ #define IPPROTO_DSTOPTS 60 /* IP6 destination option */ #define IPPROTO_AHIP 61 /* any host internal protocol */ #define IPPROTO_CFTP 62 /* CFTP */ #define IPPROTO_HELLO 63 /* "hello" routing protocol */ #define IPPROTO_SATEXPAK 64 /* SATNET/Backroom EXPAK */ #define IPPROTO_KRYPTOLAN 65 /* Kryptolan */ #define IPPROTO_RVD 66 /* Remote Virtual Disk */ #define IPPROTO_IPPC 67 /* Pluribus Packet Core */ #define IPPROTO_ADFS 68 /* Any distributed FS */ #define IPPROTO_SATMON 69 /* Satnet Monitoring */ #define IPPROTO_VISA 70 /* VISA Protocol */ #define IPPROTO_IPCV 71 /* Packet Core Utility */ #define IPPROTO_CPNX 72 /* Comp. Prot. Net. Executive */ #define IPPROTO_CPHB 73 /* Comp. Prot. HeartBeat */ #define IPPROTO_WSN 74 /* Wang Span Network */ #define IPPROTO_PVP 75 /* Packet Video Protocol */ #define IPPROTO_BRSATMON 76 /* BackRoom SATNET Monitoring */ #define IPPROTO_ND 77 /* Sun net disk proto (temp.) 
*/ #define IPPROTO_WBMON 78 /* WIDEBAND Monitoring */ #define IPPROTO_WBEXPAK 79 /* WIDEBAND EXPAK */ #define IPPROTO_EON 80 /* ISO cnlp */ #define IPPROTO_VMTP 81 /* VMTP */ #define IPPROTO_SVMTP 82 /* Secure VMTP */ #define IPPROTO_VINES 83 /* Banyan VINES */ #define IPPROTO_TTP 84 /* TTP */ #define IPPROTO_IGP 85 /* NSFNET-IGP */ #define IPPROTO_DGP 86 /* dissimilar gateway prot. */ #define IPPROTO_TCF 87 /* TCF */ #define IPPROTO_IGRP 88 /* Cisco/GXS IGRP */ #define IPPROTO_OSPFIGP 89 /* OSPFIGP */ #define IPPROTO_SRPC 90 /* Strite RPC protocol */ #define IPPROTO_LARP 91 /* Locus Address Resolution */ #define IPPROTO_MTP 92 /* Multicast Transport */ #define IPPROTO_AX25 93 /* AX.25 Frames */ #define IPPROTO_IPEIP 94 /* IP encapsulated in IP */ #define IPPROTO_MICP 95 /* Mobile Int.ing control */ #define IPPROTO_SCCSP 96 /* Semaphore Comm. security */ #define IPPROTO_ETHERIP 97 /* Ethernet IP encapsulation */ #define IPPROTO_ENCAP 98 /* encapsulation header */ #define IPPROTO_APES 99 /* any private encr. scheme */ #define IPPROTO_GMTP 100 /* GMTP*/ #define IPPROTO_IPCOMP 108 /* payload compression (IPComp) */ /* 101-254: Partly Unassigned */ #define IPPROTO_PIM 103 /* Protocol Independent Mcast */ #define IPPROTO_PGM 113 /* PGM */ /* 255: Reserved */ /* BSD Private, local use, namespace incursion */ #define IPPROTO_DIVERT 254 /* divert pseudo-protocol */ #define IPPROTO_RAW 255 /* raw IP packet */ #define IPPROTO_MAX 256 /* last return value of *_input(), meaning "all job for this pkt is done". */ #define IPPROTO_DONE 257 /* * Local port number conventions: * * When a user does a bind(2) or connect(2) with a port number of zero, * a non-conflicting local port address is chosen. * The default range is IPPORT_RESERVED through * IPPORT_USERRESERVED, although that is settable by sysctl. * * A user may set the IPPROTO_IP option IP_PORTRANGE to change this * default assignment range. * * The value IP_PORTRANGE_DEFAULT causes the default behavior.
* * The value IP_PORTRANGE_HIGH changes the range of candidate port numbers * into the "high" range. These are reserved for client outbound connections * which do not want to be filtered by any firewalls. * * The value IP_PORTRANGE_LOW changes the range to the "low" area * that is (by convention) restricted to privileged processes. This * convention is based on "vouchsafe" principles only. It is only secure * if you trust the remote host to restrict these ports. * * The default range of ports and the high range can be changed by * sysctl(3). (net.inet.ip.port{hi,low}{first,last}_auto) * * Changing those values has bad security implications if you are * using a stateless firewall that is allowing packets outside of that * range in order to allow transparent outgoing connections. * * Such a firewall configuration will generally depend on the use of these * default values. If you change them, you may find your Security * Administrator looking for you with a heavy object. * * For a slightly more orthodox text view on this: * * ftp://ftp.isi.edu/in-notes/iana/assignments/port-numbers * * port numbers are divided into three ranges: * * 0 - 1023 Well Known Ports * 1024 - 49151 Registered Ports * 49152 - 65535 Dynamic and/or Private Ports * */ /* * Ports < IPPORT_RESERVED are reserved for * privileged processes (e.g. root). (IP_PORTRANGE_LOW) * Ports > IPPORT_USERRESERVED are reserved * for servers, not necessarily privileged. (IP_PORTRANGE_DEFAULT) */ #define IPPORT_RESERVED 1024 #define IPPORT_USERRESERVED 5000 /* * Default local port range to use by setting IP_PORTRANGE_HIGH */ #define IPPORT_HIFIRSTAUTO 49152 #define IPPORT_HILASTAUTO 65535 /* * Scanning for a free reserved port returns a value below IPPORT_RESERVED, * but higher than IPPORT_RESERVEDSTART. Traditionally the start value was * 512, but that conflicts with some well-known-services that firewalls may * have a fit if we use.
*/ #define IPPORT_RESERVEDSTART 600 /* * Internet address (a structure for historical reasons) */ struct in_addr { in_addr_t s_addr; }; /* * Definitions of bits in internet address integers. * On subnets, the decomposition of addresses to host and net parts * is done according to subnet mask, not the masks here. */ #define IN_CLASSA(i) (((u_int32_t)(i) & 0x80000000) == 0) #define IN_CLASSA_NET 0xff000000 #define IN_CLASSA_NSHIFT 24 #define IN_CLASSA_HOST 0x00ffffff #define IN_CLASSA_MAX 128 #define IN_CLASSB(i) (((u_int32_t)(i) & 0xc0000000) == 0x80000000) #define IN_CLASSB_NET 0xffff0000 #define IN_CLASSB_NSHIFT 16 #define IN_CLASSB_HOST 0x0000ffff #define IN_CLASSB_MAX 65536 #define IN_CLASSC(i) (((u_int32_t)(i) & 0xe0000000) == 0xc0000000) #define IN_CLASSC_NET 0xffffff00 #define IN_CLASSC_NSHIFT 8 #define IN_CLASSC_HOST 0x000000ff #define IN_CLASSD(i) (((u_int32_t)(i) & 0xf0000000) == 0xe0000000) #define IN_CLASSD_NET 0xf0000000 /* These ones aren't really */ #define IN_CLASSD_NSHIFT 28 /* net and host fields, but */ #define IN_CLASSD_HOST 0x0fffffff /* routing needn't know. */ #define IN_MULTICAST(i) IN_CLASSD(i) #define IN_EXPERIMENTAL(i) (((u_int32_t)(i) & 0xf0000000) == 0xf0000000) #define IN_BADCLASS(i) (((u_int32_t)(i) & 0xf0000000) == 0xf0000000) #define INADDR_ANY (u_int32_t)0x00000000 #define INADDR_LOOPBACK (u_int32_t)0x7f000001 #define INADDR_BROADCAST (u_int32_t)0xffffffff /* must be masked */ #ifndef _KERNEL #define INADDR_NONE 0xffffffff /* -1 return */ #endif #define INADDR_UNSPEC_GROUP (u_int32_t)0xe0000000 /* 224.0.0.0 */ #define INADDR_ALLHOSTS_GROUP (u_int32_t)0xe0000001 /* 224.0.0.1 */ #define INADDR_ALLRTRS_GROUP (u_int32_t)0xe0000002 /* 224.0.0.2 */ #define INADDR_MAX_LOCAL_GROUP (u_int32_t)0xe00000ff /* 224.0.0.255 */ #define IN_LOOPBACKNET 127 /* official! */ /* * Socket address, internet style. 
*/ struct sockaddr_in { u_char sin_len; u_char sin_family; u_short sin_port; struct in_addr sin_addr; char sin_zero[8]; }; #define INET_ADDRSTRLEN 16 /* * Options for use with [gs]etsockopt at the IP level. * First word of comment is data type; bool is stored in int. */ #define IP_OPTIONS 1 /* buf/ip_opts; set/get IP options */ #define IP_HDRINCL 2 /* int; header is included with data */ #define IP_TOS 3 /* int; IP type of service and preced. */ #define IP_TTL 4 /* int; IP time to live */ #define IP_RECVOPTS 5 /* bool; receive all IP opts w/dgram */ #define IP_RECVRETOPTS 6 /* bool; receive IP opts for response */ #define IP_RECVDSTADDR 7 /* bool; receive IP dst addr w/dgram */ #define IP_RETOPTS 8 /* ip_opts; set/get IP options */ #define IP_MULTICAST_IF 9 /* u_char; set/get IP multicast i/f */ #define IP_MULTICAST_TTL 10 /* u_char; set/get IP multicast ttl */ #define IP_MULTICAST_LOOP 11 /* u_char; set/get IP multicast loopback */ #define IP_ADD_MEMBERSHIP 12 /* ip_mreq; add an IP group membership */ #define IP_DROP_MEMBERSHIP 13 /* ip_mreq; drop an IP group membership */ #define IP_MULTICAST_VIF 14 /* set/get IP mcast virt. 
iface */ #define IP_RSVP_ON 15 /* enable RSVP in kernel */ #define IP_RSVP_OFF 16 /* disable RSVP in kernel */ #define IP_RSVP_VIF_ON 17 /* set RSVP per-vif socket */ #define IP_RSVP_VIF_OFF 18 /* unset RSVP per-vif socket */ #define IP_PORTRANGE 19 /* int; range to choose for unspec port */ #define IP_RECVIF 20 /* bool; receive reception if w/dgram */ /* for IPSEC */ #define IP_IPSEC_POLICY 21 /* int; set/get security policy */ #define IP_FAITH 22 /* bool; accept FAITH'ed connections */ #define IP_ONESBCAST 23 /* bool: send all-ones broadcast */ +#define IP_FW_TABLE_ADD 40 /* add entry */ +#define IP_FW_TABLE_DEL 41 /* delete entry */ +#define IP_FW_TABLE_FLUSH 42 /* flush table */ +#define IP_FW_TABLE_GETSIZE 43 /* get table size */ +#define IP_FW_TABLE_LIST 44 /* list table contents */ + #define IP_FW_ADD 50 /* add a firewall rule to chain */ #define IP_FW_DEL 51 /* delete a firewall rule from chain */ #define IP_FW_FLUSH 52 /* flush firewall rule chain */ #define IP_FW_ZERO 53 /* clear single/all firewall counter(s) */ #define IP_FW_GET 54 /* get entire firewall rule chain */ #define IP_FW_RESETLOG 55 /* reset logging counters */ #define IP_DUMMYNET_CONFIGURE 60 /* add/configure a dummynet pipe */ #define IP_DUMMYNET_DEL 61 /* delete a dummynet pipe from chain */ #define IP_DUMMYNET_FLUSH 62 /* flush dummynet */ #define IP_DUMMYNET_GET 64 /* get entire dummynet pipes */ /* * Defaults and limits for options */ #define IP_DEFAULT_MULTICAST_TTL 1 /* normally limit m'casts to 1 hop */ #define IP_DEFAULT_MULTICAST_LOOP 1 /* normally hear sends if a member */ #define IP_MAX_MEMBERSHIPS 20 /* per socket */ /* * Argument structure for IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP. 
*/ struct ip_mreq { struct in_addr imr_multiaddr; /* IP multicast address of group */ struct in_addr imr_interface; /* local IP address of interface */ }; /* * Argument for IP_PORTRANGE: * - which range to search when port is unspecified at bind() or connect() */ #define IP_PORTRANGE_DEFAULT 0 /* default range */ #define IP_PORTRANGE_HIGH 1 /* "high" - request firewall bypass */ #define IP_PORTRANGE_LOW 2 /* "low" - vouchsafe security */ /* * Definitions for inet sysctl operations. * * Third level is protocol number. * Fourth level is desired variable within that protocol. */ #define IPPROTO_MAXID (IPPROTO_AH + 1) /* don't list to IPPROTO_MAX */ #define CTL_IPPROTO_NAMES { \ { "ip", CTLTYPE_NODE }, \ { "icmp", CTLTYPE_NODE }, \ { "igmp", CTLTYPE_NODE }, \ { "ggp", CTLTYPE_NODE }, \ { 0, 0 }, \ { 0, 0 }, \ { "tcp", CTLTYPE_NODE }, \ { 0, 0 }, \ { "egp", CTLTYPE_NODE }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { "pup", CTLTYPE_NODE }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { "udp", CTLTYPE_NODE }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { "idp", CTLTYPE_NODE }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { "ipsec", CTLTYPE_NODE }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, \ { 0, 0 }, 
\ { "pim", CTLTYPE_NODE }, \ } /* * Names for IP sysctl objects */ #define IPCTL_FORWARDING 1 /* act as router */ #define IPCTL_SENDREDIRECTS 2 /* may send redirects when forwarding */ #define IPCTL_DEFTTL 3 /* default TTL */ #ifdef notyet #define IPCTL_DEFMTU 4 /* default MTU */ #endif #define IPCTL_RTEXPIRE 5 /* cloned route expiration time */ #define IPCTL_RTMINEXPIRE 6 /* min value for expiration time */ #define IPCTL_RTMAXCACHE 7 /* trigger level for dynamic expire */ #define IPCTL_SOURCEROUTE 8 /* may perform source routes */ #define IPCTL_DIRECTEDBROADCAST 9 /* may re-broadcast received packets */ #define IPCTL_INTRQMAXLEN 10 /* max length of netisr queue */ #define IPCTL_INTRQDROPS 11 /* number of netisr q drops */ #define IPCTL_STATS 12 /* ipstat structure */ #define IPCTL_ACCEPTSOURCEROUTE 13 /* may accept source routed packets */ #define IPCTL_FASTFORWARDING 14 /* use fast IP forwarding code */ #define IPCTL_KEEPFAITH 15 /* FAITH IPv4->IPv6 translater ctl */ #define IPCTL_GIF_TTL 16 /* default TTL for gif encap packet */ #define IPCTL_MAXID 17 #define IPCTL_NAMES { \ { 0, 0 }, \ { "forwarding", CTLTYPE_INT }, \ { "redirect", CTLTYPE_INT }, \ { "ttl", CTLTYPE_INT }, \ { "mtu", CTLTYPE_INT }, \ { "rtexpire", CTLTYPE_INT }, \ { "rtminexpire", CTLTYPE_INT }, \ { "rtmaxcache", CTLTYPE_INT }, \ { "sourceroute", CTLTYPE_INT }, \ { "directed-broadcast", CTLTYPE_INT }, \ { "intr-queue-maxlen", CTLTYPE_INT }, \ { "intr-queue-drops", CTLTYPE_INT }, \ { "stats", CTLTYPE_STRUCT }, \ { "accept_sourceroute", CTLTYPE_INT }, \ { "fastforwarding", CTLTYPE_INT }, \ } #ifdef _KERNEL struct ifnet; struct mbuf; /* forward declarations for Standard C */ #endif /* INET6 stuff */ #define __KAME_NETINET_IN_H_INCLUDED_ #include #undef __KAME_NETINET_IN_H_INCLUDED_ #ifdef _KERNEL int in_broadcast __P((struct in_addr, struct ifnet *)); int in_canforward __P((struct in_addr)); int in_cksum __P((struct mbuf *, int)); int in_localaddr __P((struct in_addr)); char *inet_ntoa __P((struct 
in_addr)); /* in libkern */ int prison_ip __P((struct proc *p, int flag, u_int32_t *ip)); void prison_remote_ip __P((struct proc *p, int flag, u_int32_t *ip)); #define in_hosteq(s, t) ((s).s_addr == (t).s_addr) #define in_nullhost(x) ((x).s_addr == INADDR_ANY) #define satosin(sa) ((struct sockaddr_in *)(sa)) #define sintosa(sin) ((struct sockaddr *)(sin)) #define ifatoia(ifa) ((struct in_ifaddr *)(ifa)) #endif #endif Index: stable/4/sys/netinet/ip_fw2.c =================================================================== --- stable/4/sys/netinet/ip_fw2.c (revision 130570) +++ stable/4/sys/netinet/ip_fw2.c (revision 130571) @@ -1,2869 +1,3204 @@ /* * Copyright (c) 2002 Luigi Rizzo, Universita` di Pisa * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * $FreeBSD$ */ #define DEB(x) #define DDB(x) x /* * Implement IP packet firewall (new version) */ #if !defined(KLD_MODULE) #include "opt_ipfw.h" #include "opt_ipdn.h" #include "opt_ipdivert.h" #include "opt_inet.h" #include "opt_ipsec.h" #ifndef INET #error IPFIREWALL requires INET. #endif /* INET */ #endif #if IPFW2 #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef IPSEC #include #endif #include /* XXX for ETHERTYPE_IP */ #include /* XXX for in_cksum */ /* * XXX This one should go in sys/mbuf.h. It is used to avoid that * a firewall-generated packet loops forever through the firewall. */ #ifndef M_SKIP_FIREWALL #define M_SKIP_FIREWALL 0x4000 #endif /* * set_disable contains one bit per set value (0..31). * If the bit is set, all rules with the corresponding set * are disabled. Set RESVD_SET(31) is reserved for the default rule * and rules that are not deleted by the flush command, * and CANNOT be disabled. * Rules in set RESVD_SET can only be deleted explicitly. 
*/ static u_int32_t set_disable; static int fw_verbose; static int verbose_limit; static struct callout_handle ipfw_timeout_h; #define IPFW_DEFAULT_RULE 65535 /* * list of rules for layer 3 */ static struct ip_fw *layer3_chain; MALLOC_DEFINE(M_IPFW, "IpFw/IpAcct", "IpFw/IpAcct chain's"); +MALLOC_DEFINE(M_IPFW_TBL, "ipfw_tbl", "IpFw tables"); +struct table_entry { + struct radix_node rn[2]; + struct sockaddr_in addr, mask; + u_int32_t value; +}; + +#define IPFW_TABLES_MAX 128 +static struct { + struct radix_node_head *rnh; + int modified; +} ipfw_tables[IPFW_TABLES_MAX]; + static int fw_debug = 1; static int autoinc_step = 100; /* bounded to 1..1000 in add_rule() */ #ifdef SYSCTL_NODE SYSCTL_NODE(_net_inet_ip, OID_AUTO, fw, CTLFLAG_RW, 0, "Firewall"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, enable, CTLFLAG_RW|CTLFLAG_SECURE, &fw_enable, 0, "Enable ipfw"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, autoinc_step, CTLFLAG_RW, &autoinc_step, 0, "Rule number autincrement step"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, one_pass, CTLFLAG_RW|CTLFLAG_SECURE, &fw_one_pass, 0, "Only do a single pass through ipfw when using dummynet(4)"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, debug, CTLFLAG_RW, &fw_debug, 0, "Enable printing of debug ip_fw statements"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, verbose, CTLFLAG_RW|CTLFLAG_SECURE, &fw_verbose, 0, "Log matches to ipfw rules"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, verbose_limit, CTLFLAG_RW, &verbose_limit, 0, "Set upper limit of matches of ipfw rules logged"); /* * Description of dynamic rules. * * Dynamic rules are stored in lists accessed through a hash table * (ipfw_dyn_v) whose size is curr_dyn_buckets. This value can * be modified through the sysctl variable dyn_buckets which is * updated when the table becomes empty. * * XXX currently there is only one list, ipfw_dyn. 
* * When a packet is received, its address fields are first masked * with the mask defined for the rule, then hashed, then matched * against the entries in the corresponding list. * Dynamic rules can be used for different purposes: * + stateful rules; * + enforcing limits on the number of sessions; * + in-kernel NAT (not implemented yet) * * The lifetime of dynamic rules is regulated by dyn_*_lifetime, * measured in seconds and depending on the flags. * * The total number of dynamic rules is stored in dyn_count. * The max number of dynamic rules is dyn_max. When we reach * the maximum number of rules we do not create any more. This is * done to avoid consuming too much memory, but also too much * time when searching on each packet (ideally, we should try instead * to put a limit on the length of the list on each bucket...). * * Each dynamic rule holds a pointer to the parent ipfw rule so * we know what action to perform. Dynamic rules are removed when * the parent rule is deleted. XXX we should make them survive. * * There are some limitations with dynamic rules -- we do not * obey the 'randomized match', and we do not do multiple * passes through the firewall. XXX check the latter!!! */ static ipfw_dyn_rule **ipfw_dyn_v = NULL; static u_int32_t dyn_buckets = 256; /* must be power of 2 */ static u_int32_t curr_dyn_buckets = 256; /* must be power of 2 */ /* * Timeouts for various events in handling dynamic rules. */ static u_int32_t dyn_ack_lifetime = 300; static u_int32_t dyn_syn_lifetime = 20; static u_int32_t dyn_fin_lifetime = 1; static u_int32_t dyn_rst_lifetime = 1; static u_int32_t dyn_udp_lifetime = 10; static u_int32_t dyn_short_lifetime = 5; /* * Keepalives are sent if dyn_keepalive is set. They are sent every * dyn_keepalive_period seconds, in the last dyn_keepalive_interval * seconds of lifetime of a rule. * dyn_rst_lifetime and dyn_fin_lifetime should be strictly lower * than dyn_keepalive_period.
*/ static u_int32_t dyn_keepalive_interval = 20; static u_int32_t dyn_keepalive_period = 5; static u_int32_t dyn_keepalive = 1; /* do send keepalives */ static u_int32_t static_count; /* # of static rules */ static u_int32_t static_len; /* size in bytes of static rules */ static u_int32_t dyn_count; /* # of dynamic rules */ static u_int32_t dyn_max = 4096; /* max # of dynamic rules */ SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_buckets, CTLFLAG_RW, &dyn_buckets, 0, "Number of dyn. buckets"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, curr_dyn_buckets, CTLFLAG_RD, &curr_dyn_buckets, 0, "Current Number of dyn. buckets"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_count, CTLFLAG_RD, &dyn_count, 0, "Number of dyn. rules"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_max, CTLFLAG_RW, &dyn_max, 0, "Max number of dyn. rules"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, static_count, CTLFLAG_RD, &static_count, 0, "Number of static rules"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_ack_lifetime, CTLFLAG_RW, &dyn_ack_lifetime, 0, "Lifetime of dyn. rules for acks"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_syn_lifetime, CTLFLAG_RW, &dyn_syn_lifetime, 0, "Lifetime of dyn. rules for syn"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_fin_lifetime, CTLFLAG_RW, &dyn_fin_lifetime, 0, "Lifetime of dyn. rules for fin"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_rst_lifetime, CTLFLAG_RW, &dyn_rst_lifetime, 0, "Lifetime of dyn. rules for rst"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_udp_lifetime, CTLFLAG_RW, &dyn_udp_lifetime, 0, "Lifetime of dyn. rules for UDP"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_short_lifetime, CTLFLAG_RW, &dyn_short_lifetime, 0, "Lifetime of dyn. rules for other situations"); SYSCTL_INT(_net_inet_ip_fw, OID_AUTO, dyn_keepalive, CTLFLAG_RW, &dyn_keepalive, 0, "Enable keepalives for dyn. 
rules"); #endif /* SYSCTL_NODE */ static ip_fw_chk_t ipfw_chk; ip_dn_ruledel_t *ip_dn_ruledel_ptr = NULL; /* hook into dummynet */ /* * This macro maps an ip pointer into a layer3 header pointer of type T */ #define L3HDR(T, ip) ((T *)((u_int32_t *)(ip) + (ip)->ip_hl)) static __inline int icmptype_match(struct ip *ip, ipfw_insn_u32 *cmd) { int type = L3HDR(struct icmp,ip)->icmp_type; return (type <= ICMP_MAXTYPE && (cmd->d[0] & (1<<type)) ); } #define TT ( (1 << ICMP_ECHO) | (1 << ICMP_ROUTERSOLICIT) | \ (1 << ICMP_TSTAMP) | (1 << ICMP_IREQ) | (1 << ICMP_MASKREQ) ) static int is_icmp_query(struct ip *ip) { int type = L3HDR(struct icmp, ip)->icmp_type; return (type <= ICMP_MAXTYPE && (TT & (1<<type)) ); } #undef TT /* * The following checks use two arrays of 8 or 16 bits to store the * bits that we want set or clear, respectively. They are in the * low and high half of cmd->arg1 or cmd->d[0]. * * We scan options and store the bits we find set. We succeed if * * (want_set & ~bits) == 0 && (want_clear & ~bits) == want_clear * * The code is sometimes optimized not to store additional variables. */ static int flags_match(ipfw_insn *cmd, u_int8_t bits) { u_char want_clear; bits = ~bits; if ( ((cmd->arg1 & 0xff) & bits) != 0) return 0; /* some bits we want set were clear */ want_clear = (cmd->arg1 >> 8) & 0xff; if ( (want_clear & bits) != want_clear) return 0; /* some bits we want clear were set */ return 1; } static int ipopts_match(struct ip *ip, ipfw_insn *cmd) { int optlen, bits = 0; u_char *cp = (u_char *)(ip + 1); int x = (ip->ip_hl << 2) - sizeof (struct ip); for (; x > 0; x -= optlen, cp += optlen) { int opt = cp[IPOPT_OPTVAL]; if (opt == IPOPT_EOL) break; if (opt == IPOPT_NOP) optlen = 1; else { optlen = cp[IPOPT_OLEN]; if (optlen <= 0 || optlen > x) return 0; /* invalid or truncated */ } switch (opt) { default: break; case IPOPT_LSRR: bits |= IP_FW_IPOPT_LSRR; break; case IPOPT_SSRR: bits |= IP_FW_IPOPT_SSRR; break; case IPOPT_RR: bits |= IP_FW_IPOPT_RR; break; case IPOPT_TS: bits |= IP_FW_IPOPT_TS; break; } } return (flags_match(cmd, bits)); } static int tcpopts_match(struct ip *ip, ipfw_insn *cmd) { int optlen, bits = 0; struct tcphdr *tcp = L3HDR(struct tcphdr,ip); u_char *cp = (u_char *)(tcp + 1); int x = (tcp->th_off << 2) - sizeof(struct tcphdr); for (; x > 0; x -= optlen, cp += optlen) { int opt = cp[0]; if (opt == TCPOPT_EOL) break; if (opt == 
TCPOPT_NOP) optlen = 1; else { optlen = cp[1]; if (optlen <= 0) break; } switch (opt) { default: break; case TCPOPT_MAXSEG: bits |= IP_FW_TCPOPT_MSS; break; case TCPOPT_WINDOW: bits |= IP_FW_TCPOPT_WINDOW; break; case TCPOPT_SACK_PERMITTED: case TCPOPT_SACK: bits |= IP_FW_TCPOPT_SACK; break; case TCPOPT_TIMESTAMP: bits |= IP_FW_TCPOPT_TS; break; case TCPOPT_CC: case TCPOPT_CCNEW: case TCPOPT_CCECHO: bits |= IP_FW_TCPOPT_CC; break; } } return (flags_match(cmd, bits)); } static int iface_match(struct ifnet *ifp, ipfw_insn_if *cmd) { if (ifp == NULL) /* no iface with this packet, match fails */ return 0; /* Check by name or by IP address */ if (cmd->name[0] != '\0') { /* match by name */ /* Check unit number (-1 is wildcard) */ if (cmd->p.unit != -1 && cmd->p.unit != ifp->if_unit) return(0); /* Check name */ if (!strncmp(ifp->if_name, cmd->name, IFNAMSIZ)) return(1); } else { struct ifaddr *ia; TAILQ_FOREACH(ia, &ifp->if_addrhead, ifa_link) { if (ia->ifa_addr == NULL) continue; if (ia->ifa_addr->sa_family != AF_INET) continue; if (cmd->p.ip.s_addr == ((struct sockaddr_in *) (ia->ifa_addr))->sin_addr.s_addr) return(1); /* match */ } } return(0); /* no match, fail ... */ } /* * The 'verrevpath' option checks that the interface that an IP packet * arrives on is the same interface that traffic destined for the * packet's source address would be routed out of. This is a measure * to block forged packets. This is also commonly known as "anti-spoofing" * or Unicast Reverse Path Forwarding (Unicast RPF) in Cisco-ese. The * name of the knob is purposely reminiscent of the Cisco IOS command, * * ip verify unicast reverse-path * * which implements the same functionality. But note that syntax is * misleading. The check may be performed on all IP packets whether unicast, * multicast, or broadcast. 
*/ static int verify_rev_path(struct in_addr src, struct ifnet *ifp) { struct route ro; struct sockaddr_in *dst; bzero(&ro, sizeof(ro)); dst = (struct sockaddr_in *)&(ro.ro_dst); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = src; rtalloc_ign(&ro, RTF_CLONING|RTF_PRCLONING); if (ro.ro_rt == NULL) return 0; if ((ifp == NULL) || (ro.ro_rt->rt_ifp->if_index != ifp->if_index)) { RTFREE(ro.ro_rt); return 0; } RTFREE(ro.ro_rt); return 1; } static u_int64_t norule_counter; /* counter for ipfw_log(NULL...) */ #define SNPARGS(buf, len) buf + len, sizeof(buf) > len ? sizeof(buf) - len : 0 #define SNP(buf) buf, sizeof(buf) /* * We enter here when we have a rule with O_LOG. * XXX this function alone takes about 2Kbytes of code! */ static void ipfw_log(struct ip_fw *f, u_int hlen, struct ether_header *eh, struct mbuf *m, struct ifnet *oif) { char *action; int limit_reached = 0; char action2[40], proto[48], fragment[28]; fragment[0] = '\0'; proto[0] = '\0'; if (f == NULL) { /* bogus pkt */ if (verbose_limit != 0 && norule_counter >= verbose_limit) return; norule_counter++; if (norule_counter == verbose_limit) limit_reached = verbose_limit; action = "Refuse"; } else { /* O_LOG is the first action, find the real one */ ipfw_insn *cmd = ACTION_PTR(f); ipfw_insn_log *l = (ipfw_insn_log *)cmd; if (l->max_log != 0 && l->log_left == 0) return; l->log_left--; if (l->log_left == 0) limit_reached = l->max_log; cmd += F_LEN(cmd); /* point to first action */ if (cmd->opcode == O_PROB) cmd += F_LEN(cmd); action = action2; switch (cmd->opcode) { case O_DENY: action = "Deny"; break; case O_REJECT: if (cmd->arg1==ICMP_REJECT_RST) action = "Reset"; else if (cmd->arg1==ICMP_UNREACH_HOST) action = "Reject"; else snprintf(SNPARGS(action2, 0), "Unreach %d", cmd->arg1); break; case O_ACCEPT: action = "Accept"; break; case O_COUNT: action = "Count"; break; case O_DIVERT: snprintf(SNPARGS(action2, 0), "Divert %d", cmd->arg1); break; case O_TEE: snprintf(SNPARGS(action2, 0), 
"Tee %d", cmd->arg1); break; case O_SKIPTO: snprintf(SNPARGS(action2, 0), "SkipTo %d", cmd->arg1); break; case O_PIPE: snprintf(SNPARGS(action2, 0), "Pipe %d", cmd->arg1); break; case O_QUEUE: snprintf(SNPARGS(action2, 0), "Queue %d", cmd->arg1); break; case O_FORWARD_IP: { ipfw_insn_sa *sa = (ipfw_insn_sa *)cmd; int len; len = snprintf(SNPARGS(action2, 0), "Forward to %s", inet_ntoa(sa->sa.sin_addr)); if (sa->sa.sin_port) snprintf(SNPARGS(action2, len), ":%d", sa->sa.sin_port); } break; default: action = "UNKNOWN"; break; } } if (hlen == 0) { /* non-ip */ snprintf(SNPARGS(proto, 0), "MAC"); } else { struct ip *ip = mtod(m, struct ip *); /* these three are all aliases to the same thing */ struct icmp *const icmp = L3HDR(struct icmp, ip); struct tcphdr *const tcp = (struct tcphdr *)icmp; struct udphdr *const udp = (struct udphdr *)icmp; int ip_off, offset, ip_len; int len; if (eh != NULL) { /* layer 2 packets are as on the wire */ ip_off = ntohs(ip->ip_off); ip_len = ntohs(ip->ip_len); } else { ip_off = ip->ip_off; ip_len = ip->ip_len; } offset = ip_off & IP_OFFMASK; switch (ip->ip_p) { case IPPROTO_TCP: len = snprintf(SNPARGS(proto, 0), "TCP %s", inet_ntoa(ip->ip_src)); if (offset == 0) snprintf(SNPARGS(proto, len), ":%d %s:%d", ntohs(tcp->th_sport), inet_ntoa(ip->ip_dst), ntohs(tcp->th_dport)); else snprintf(SNPARGS(proto, len), " %s", inet_ntoa(ip->ip_dst)); break; case IPPROTO_UDP: len = snprintf(SNPARGS(proto, 0), "UDP %s", inet_ntoa(ip->ip_src)); if (offset == 0) snprintf(SNPARGS(proto, len), ":%d %s:%d", ntohs(udp->uh_sport), inet_ntoa(ip->ip_dst), ntohs(udp->uh_dport)); else snprintf(SNPARGS(proto, len), " %s", inet_ntoa(ip->ip_dst)); break; case IPPROTO_ICMP: if (offset == 0) len = snprintf(SNPARGS(proto, 0), "ICMP:%u.%u ", icmp->icmp_type, icmp->icmp_code); else len = snprintf(SNPARGS(proto, 0), "ICMP "); len += snprintf(SNPARGS(proto, len), "%s", inet_ntoa(ip->ip_src)); snprintf(SNPARGS(proto, len), " %s", inet_ntoa(ip->ip_dst)); break; default: len = 
snprintf(SNPARGS(proto, 0), "P:%d %s", ip->ip_p, inet_ntoa(ip->ip_src)); snprintf(SNPARGS(proto, len), " %s", inet_ntoa(ip->ip_dst)); break; } if (ip_off & (IP_MF | IP_OFFMASK)) snprintf(SNPARGS(fragment, 0), " (frag %d:%d@%d%s)", ntohs(ip->ip_id), ip_len - (ip->ip_hl << 2), offset << 3, (ip_off & IP_MF) ? "+" : ""); } if (oif || m->m_pkthdr.rcvif) log(LOG_SECURITY | LOG_INFO, "ipfw: %d %s %s %s via %s%d%s\n", f ? f->rulenum : -1, action, proto, oif ? "out" : "in", oif ? oif->if_name : m->m_pkthdr.rcvif->if_name, oif ? oif->if_unit : m->m_pkthdr.rcvif->if_unit, fragment); else log(LOG_SECURITY | LOG_INFO, "ipfw: %d %s %s [no if info]%s\n", f ? f->rulenum : -1, action, proto, fragment); if (limit_reached) log(LOG_SECURITY | LOG_NOTICE, "ipfw: limit %d reached on entry %d\n", limit_reached, f ? f->rulenum : -1); } /* * IMPORTANT: the hash function for dynamic rules must be commutative * in source and destination (ip,port), because rules are bidirectional * and we want to find both in the same bucket. */ static __inline int hash_packet(struct ipfw_flow_id *id) { u_int32_t i; i = (id->dst_ip) ^ (id->src_ip) ^ (id->dst_port) ^ (id->src_port); i &= (curr_dyn_buckets - 1); return i; } /** * unlink a dynamic rule from a chain. prev is a pointer to * the previous one, q is a pointer to the rule to delete, * head is a pointer to the head of the queue. * Modifies q and potentially also head. */ #define UNLINK_DYN_RULE(prev, head, q) { \ ipfw_dyn_rule *old_q = q; \ \ /* remove a refcount to the parent */ \ if (q->dyn_type == O_LIMIT) \ q->parent->count--; \ DEB(printf("ipfw: unlink entry 0x%08x %d -> 0x%08x %d, %d left\n",\ (q->id.src_ip), (q->id.src_port), \ (q->id.dst_ip), (q->id.dst_port), dyn_count-1 ); ) \ if (prev != NULL) \ prev->next = q = q->next; \ else \ head = q = q->next; \ dyn_count--; \ free(old_q, M_IPFW); } #define TIME_LEQ(a,b) ((int)((a)-(b)) <= 0) /** * Remove dynamic rules pointing to "rule", or all of them if rule == NULL. 
* * If keep_me == NULL, rules are deleted even if not expired, * otherwise only expired rules are removed. * * The value of the second parameter is also used to point to identify * a rule we absolutely do not want to remove (e.g. because we are * holding a reference to it -- this is the case with O_LIMIT_PARENT * rules). The pointer is only used for comparison, so any non-null * value will do. */ static void remove_dyn_rule(struct ip_fw *rule, ipfw_dyn_rule *keep_me) { static u_int32_t last_remove = 0; #define FORCE (keep_me == NULL) ipfw_dyn_rule *prev, *q; int i, pass = 0, max_pass = 0; if (ipfw_dyn_v == NULL || dyn_count == 0) return; /* do not expire more than once per second, it is useless */ if (!FORCE && last_remove == time_second) return; last_remove = time_second; /* * because O_LIMIT refer to parent rules, during the first pass only * remove child and mark any pending LIMIT_PARENT, and remove * them in a second pass. */ next_pass: for (i = 0 ; i < curr_dyn_buckets ; i++) { for (prev=NULL, q = ipfw_dyn_v[i] ; q ; ) { /* * Logic can become complex here, so we split tests. */ if (q == keep_me) goto next; if (rule != NULL && rule != q->rule) goto next; /* not the one we are looking for */ if (q->dyn_type == O_LIMIT_PARENT) { /* * handle parent in the second pass, * record we need one. */ max_pass = 1; if (pass == 0) goto next; if (FORCE && q->count != 0 ) { /* XXX should not happen! */ printf("ipfw: OUCH! cannot remove rule," " count %d\n", q->count); } } else { if (!FORCE && !TIME_LEQ( q->expire, time_second )) goto next; } if (q->dyn_type != O_LIMIT_PARENT || !q->count) { UNLINK_DYN_RULE(prev, ipfw_dyn_v[i], q); continue; } next: prev=q; q=q->next; } } if (pass++ < max_pass) goto next_pass; } /** * lookup a dynamic rule. */ static ipfw_dyn_rule * lookup_dyn_rule(struct ipfw_flow_id *pkt, int *match_direction, struct tcphdr *tcp) { /* * stateful ipfw extensions. 
* Lookup into dynamic session queue */ #define MATCH_REVERSE 0 #define MATCH_FORWARD 1 #define MATCH_NONE 2 #define MATCH_UNKNOWN 3 int i, dir = MATCH_NONE; ipfw_dyn_rule *prev, *q=NULL; if (ipfw_dyn_v == NULL) goto done; /* not found */ i = hash_packet( pkt ); for (prev=NULL, q = ipfw_dyn_v[i] ; q != NULL ; ) { if (q->dyn_type == O_LIMIT_PARENT && q->count) goto next; if (TIME_LEQ( q->expire, time_second)) { /* expire entry */ UNLINK_DYN_RULE(prev, ipfw_dyn_v[i], q); continue; } if (pkt->proto == q->id.proto && q->dyn_type != O_LIMIT_PARENT) { if (pkt->src_ip == q->id.src_ip && pkt->dst_ip == q->id.dst_ip && pkt->src_port == q->id.src_port && pkt->dst_port == q->id.dst_port ) { dir = MATCH_FORWARD; break; } if (pkt->src_ip == q->id.dst_ip && pkt->dst_ip == q->id.src_ip && pkt->src_port == q->id.dst_port && pkt->dst_port == q->id.src_port ) { dir = MATCH_REVERSE; break; } } next: prev = q; q = q->next; } if (q == NULL) goto done; /* q = NULL, not found */ if ( prev != NULL) { /* found and not in front */ prev->next = q->next; q->next = ipfw_dyn_v[i]; ipfw_dyn_v[i] = q; } if (pkt->proto == IPPROTO_TCP) { /* update state according to flags */ u_char flags = pkt->flags & (TH_FIN|TH_SYN|TH_RST); #define BOTH_SYN (TH_SYN | (TH_SYN << 8)) #define BOTH_FIN (TH_FIN | (TH_FIN << 8)) q->state |= (dir == MATCH_FORWARD ) ? 
flags : (flags << 8); switch (q->state) { case TH_SYN: /* opening */ q->expire = time_second + dyn_syn_lifetime; break; case BOTH_SYN: /* move to established */ case BOTH_SYN | TH_FIN : /* one side tries to close */ case BOTH_SYN | (TH_FIN << 8) : if (tcp) { #define _SEQ_GE(a,b) ((int)(a) - (int)(b) >= 0) u_int32_t ack = ntohl(tcp->th_ack); if (dir == MATCH_FORWARD) { if (q->ack_fwd == 0 || _SEQ_GE(ack, q->ack_fwd)) q->ack_fwd = ack; else { /* ignore out-of-sequence */ break; } } else { if (q->ack_rev == 0 || _SEQ_GE(ack, q->ack_rev)) q->ack_rev = ack; else { /* ignore out-of-sequence */ break; } } } q->expire = time_second + dyn_ack_lifetime; break; case BOTH_SYN | BOTH_FIN: /* both sides closed */ if (dyn_fin_lifetime >= dyn_keepalive_period) dyn_fin_lifetime = dyn_keepalive_period - 1; q->expire = time_second + dyn_fin_lifetime; break; default: #if 0 /* * reset or some invalid combination, but can also * occur if we use keep-state the wrong way. */ if ( (q->state & ((TH_RST << 8)|TH_RST)) == 0) printf("invalid state: 0x%x\n", q->state); #endif if (dyn_rst_lifetime >= dyn_keepalive_period) dyn_rst_lifetime = dyn_keepalive_period - 1; q->expire = time_second + dyn_rst_lifetime; break; } } else if (pkt->proto == IPPROTO_UDP) { q->expire = time_second + dyn_udp_lifetime; } else { /* other protocols */ q->expire = time_second + dyn_short_lifetime; } done: if (match_direction) *match_direction = dir; return q; } static void realloc_dynamic_table(void) { /* * Try reallocation, make sure we have a power of 2 and do * not allow more than 64k entries. In case of overflow, * default to 1024. 
*/ if (dyn_buckets > 65536) dyn_buckets = 1024; if ((dyn_buckets & (dyn_buckets-1)) != 0) { /* not a power of 2 */ dyn_buckets = curr_dyn_buckets; /* reset */ return; } curr_dyn_buckets = dyn_buckets; if (ipfw_dyn_v != NULL) free(ipfw_dyn_v, M_IPFW); for (;;) { ipfw_dyn_v = malloc(curr_dyn_buckets * sizeof(ipfw_dyn_rule *), M_IPFW, M_NOWAIT | M_ZERO); if (ipfw_dyn_v != NULL || curr_dyn_buckets <= 2) break; curr_dyn_buckets /= 2; } } /** * Install state of type 'type' for a dynamic session. * The hash table contains two type of rules: * - regular rules (O_KEEP_STATE) * - rules for sessions with limited number of sess per user * (O_LIMIT). When they are created, the parent is * increased by 1, and decreased on delete. In this case, * the third parameter is the parent rule and not the chain. * - "parent" rules for the above (O_LIMIT_PARENT). */ static ipfw_dyn_rule * add_dyn_rule(struct ipfw_flow_id *id, u_int8_t dyn_type, struct ip_fw *rule) { ipfw_dyn_rule *r; int i; if (ipfw_dyn_v == NULL || (dyn_count == 0 && dyn_buckets != curr_dyn_buckets)) { realloc_dynamic_table(); if (ipfw_dyn_v == NULL) return NULL; /* failed ! */ } i = hash_packet(id); r = malloc(sizeof *r, M_IPFW, M_NOWAIT | M_ZERO); if (r == NULL) { printf ("ipfw: sorry cannot allocate state\n"); return NULL; } /* increase refcount on parent, and set pointer */ if (dyn_type == O_LIMIT) { ipfw_dyn_rule *parent = (ipfw_dyn_rule *)rule; if ( parent->dyn_type != O_LIMIT_PARENT) panic("invalid parent"); parent->count++; r->parent = parent; rule = parent->rule; } r->id = *id; r->expire = time_second + dyn_syn_lifetime; r->rule = rule; r->dyn_type = dyn_type; r->pcnt = r->bcnt = 0; r->count = 0; r->bucket = i; r->next = ipfw_dyn_v[i]; ipfw_dyn_v[i] = r; dyn_count++; DEB(printf("ipfw: add dyn entry ty %d 0x%08x %d -> 0x%08x %d, total %d\n", dyn_type, (r->id.src_ip), (r->id.src_port), (r->id.dst_ip), (r->id.dst_port), dyn_count ); ) return r; } /** * lookup dynamic parent rule using pkt and rule as search keys. 
* If the lookup fails, then install one. */ static ipfw_dyn_rule * lookup_dyn_parent(struct ipfw_flow_id *pkt, struct ip_fw *rule) { ipfw_dyn_rule *q; int i; if (ipfw_dyn_v) { i = hash_packet( pkt ); for (q = ipfw_dyn_v[i] ; q != NULL ; q=q->next) if (q->dyn_type == O_LIMIT_PARENT && rule== q->rule && pkt->proto == q->id.proto && pkt->src_ip == q->id.src_ip && pkt->dst_ip == q->id.dst_ip && pkt->src_port == q->id.src_port && pkt->dst_port == q->id.dst_port) { q->expire = time_second + dyn_short_lifetime; DEB(printf("ipfw: lookup_dyn_parent found 0x%p\n",q);) return q; } } return add_dyn_rule(pkt, O_LIMIT_PARENT, rule); } /** * Install dynamic state for rule type cmd->o.opcode * * Returns 1 (failure) if state is not installed because of errors or because * session limitations are enforced. */ static int install_state(struct ip_fw *rule, ipfw_insn_limit *cmd, struct ip_fw_args *args) { static int last_log; ipfw_dyn_rule *q; DEB(printf("ipfw: install state type %d 0x%08x %u -> 0x%08x %u\n", cmd->o.opcode, (args->f_id.src_ip), (args->f_id.src_port), (args->f_id.dst_ip), (args->f_id.dst_port) );) q = lookup_dyn_rule(&args->f_id, NULL, NULL); if (q != NULL) { /* should never occur */ if (last_log != time_second) { last_log = time_second; printf("ipfw: install_state: entry already present, done\n"); } return 0; } if (dyn_count >= dyn_max) /* * Run out of slots, try to remove any expired rule. 
*/ remove_dyn_rule(NULL, (ipfw_dyn_rule *)1); if (dyn_count >= dyn_max) { if (last_log != time_second) { last_log = time_second; printf("ipfw: install_state: Too many dynamic rules\n"); } return 1; /* cannot install, notify caller */ } switch (cmd->o.opcode) { case O_KEEP_STATE: /* bidir rule */ add_dyn_rule(&args->f_id, O_KEEP_STATE, rule); break; case O_LIMIT: /* limit number of sessions */ { u_int16_t limit_mask = cmd->limit_mask; struct ipfw_flow_id id; ipfw_dyn_rule *parent; DEB(printf("ipfw: installing dyn-limit rule %d\n", cmd->conn_limit);) id.dst_ip = id.src_ip = 0; id.dst_port = id.src_port = 0; id.proto = args->f_id.proto; if (limit_mask & DYN_SRC_ADDR) id.src_ip = args->f_id.src_ip; if (limit_mask & DYN_DST_ADDR) id.dst_ip = args->f_id.dst_ip; if (limit_mask & DYN_SRC_PORT) id.src_port = args->f_id.src_port; if (limit_mask & DYN_DST_PORT) id.dst_port = args->f_id.dst_port; parent = lookup_dyn_parent(&id, rule); if (parent == NULL) { printf("ipfw: add parent failed\n"); return 1; } if (parent->count >= cmd->conn_limit) { /* * See if we can remove some expired rule. */ remove_dyn_rule(rule, parent); if (parent->count >= cmd->conn_limit) { if (fw_verbose && last_log != time_second) { last_log = time_second; log(LOG_SECURITY | LOG_DEBUG, "drop session, too many entries\n"); } return 1; } } add_dyn_rule(&args->f_id, O_LIMIT, (struct ip_fw *)parent); } break; default: printf("ipfw: unknown dynamic rule type %u\n", cmd->o.opcode); return 1; } lookup_dyn_rule(&args->f_id, NULL, NULL); /* XXX just set lifetime */ return 0; } /* * Transmit a TCP packet, containing either a RST or a keepalive. * When flags & TH_RST, we are sending a RST packet, because of a * "reset" action matched the packet. 
* Otherwise we are sending a keepalive, and flags & TH_ */ static void send_pkt(struct ipfw_flow_id *id, u_int32_t seq, u_int32_t ack, int flags) { struct mbuf *m; struct ip *ip; struct tcphdr *tcp; struct route sro; /* fake route */ MGETHDR(m, M_DONTWAIT, MT_HEADER); if (m == 0) return; m->m_pkthdr.rcvif = (struct ifnet *)0; m->m_pkthdr.len = m->m_len = sizeof(struct ip) + sizeof(struct tcphdr); m->m_data += max_linkhdr; ip = mtod(m, struct ip *); bzero(ip, m->m_len); tcp = (struct tcphdr *)(ip + 1); /* no IP options */ ip->ip_p = IPPROTO_TCP; tcp->th_off = 5; /* * Assume we are sending a RST (or a keepalive in the reverse * direction), swap src and destination addresses and ports. */ ip->ip_src.s_addr = htonl(id->dst_ip); ip->ip_dst.s_addr = htonl(id->src_ip); tcp->th_sport = htons(id->dst_port); tcp->th_dport = htons(id->src_port); if (flags & TH_RST) { /* we are sending a RST */ if (flags & TH_ACK) { tcp->th_seq = htonl(ack); tcp->th_ack = htonl(0); tcp->th_flags = TH_RST; } else { if (flags & TH_SYN) seq++; tcp->th_seq = htonl(0); tcp->th_ack = htonl(seq); tcp->th_flags = TH_RST | TH_ACK; } } else { /* * We are sending a keepalive. flags & TH_SYN determines * the direction, forward if set, reverse if clear. * NOTE: seq and ack are always assumed to be correct * as set by the caller. This may be confusing... */ if (flags & TH_SYN) { /* * we have to rewrite the correct addresses! */ ip->ip_dst.s_addr = htonl(id->dst_ip); ip->ip_src.s_addr = htonl(id->src_ip); tcp->th_dport = htons(id->dst_port); tcp->th_sport = htons(id->src_port); } tcp->th_seq = htonl(seq); tcp->th_ack = htonl(ack); tcp->th_flags = TH_ACK; } /* * set ip_len to the payload size so we can compute * the tcp checksum on the pseudoheader * XXX check this, could save a couple of words ? 
*/ ip->ip_len = htons(sizeof(struct tcphdr)); tcp->th_sum = in_cksum(m, m->m_pkthdr.len); /* * now fill fields left out earlier */ ip->ip_ttl = ip_defttl; ip->ip_len = m->m_pkthdr.len; bzero (&sro, sizeof (sro)); ip_rtaddr(ip->ip_dst, &sro); m->m_flags |= M_SKIP_FIREWALL; ip_output(m, NULL, &sro, 0, NULL, NULL); if (sro.ro_rt) RTFREE(sro.ro_rt); } /* * sends a reject message, consuming the mbuf passed as an argument. */ static void send_reject(struct ip_fw_args *args, int code, int offset, int ip_len) { if (code != ICMP_REJECT_RST) { /* Send an ICMP unreach */ /* We need the IP header in host order for icmp_error(). */ if (args->eh != NULL) { struct ip *ip = mtod(args->m, struct ip *); ip->ip_len = ntohs(ip->ip_len); ip->ip_off = ntohs(ip->ip_off); } icmp_error(args->m, ICMP_UNREACH, code, 0L, 0); } else if (offset == 0 && args->f_id.proto == IPPROTO_TCP) { struct tcphdr *const tcp = L3HDR(struct tcphdr, mtod(args->m, struct ip *)); if ( (tcp->th_flags & TH_RST) == 0) send_pkt(&(args->f_id), ntohl(tcp->th_seq), ntohl(tcp->th_ack), tcp->th_flags | TH_RST); m_freem(args->m); } else m_freem(args->m); args->m = NULL; } /** * * Given an ip_fw *, lookup_next_rule will return a pointer * to the next rule, which can be either the jump * target (for skipto instructions) or the next one in the list (in * all other cases including a missing jump target). * The result is also written in the "next_rule" field of the rule. * Backward jumps are not allowed, so start looking from the next * rule... * * This never returns NULL -- in case we do not have an exact match, * the next rule is returned. When the ruleset is changed, * pointers are flushed so we are always correct. 
*/ static struct ip_fw * lookup_next_rule(struct ip_fw *me) { struct ip_fw *rule = NULL; ipfw_insn *cmd; /* look for action, in case it is a skipto */ cmd = ACTION_PTR(me); if (cmd->opcode == O_LOG) cmd += F_LEN(cmd); if ( cmd->opcode == O_SKIPTO ) for (rule = me->next; rule ; rule = rule->next) if (rule->rulenum >= cmd->arg1) break; if (rule == NULL) /* failure or not a skipto */ rule = me->next; me->next_rule = rule; return rule; } /* * The main check routine for the firewall. * * All arguments are in args so we can modify them and return them * back to the caller. * * Parameters: * * args->m (in/out) The packet; we set to NULL when/if we nuke it. * Starts with the IP header. * args->eh (in) Mac header if present, or NULL for layer3 packet. * args->oif Outgoing interface, or NULL if packet is incoming. * The incoming interface is in the mbuf. (in) * args->divert_rule (in/out) * Skip up to the first rule past this rule number; * upon return, non-zero port number for divert or tee. * * args->rule Pointer to the last matching rule (in/out) * args->next_hop Socket we are forwarding to (out). * args->f_id Addresses grabbed from the packet (out) * * Return value: * * IP_FW_PORT_DENY_FLAG the packet must be dropped. * 0 The packet is to be accepted and routed normally OR * the packet was denied/rejected and has been dropped; * in the latter case, *m is equal to NULL upon return. * port Divert the packet to port, with these caveats: * * - If IP_FW_PORT_TEE_FLAG is set, tee the packet instead * of diverting it (ie, 'ipfw tee'). 
  *
  *	- If IP_FW_PORT_DYNT_FLAG is set, interpret the lower
  *	  16 bits as a dummynet pipe number instead of diverting
  */
+static void
+init_tables(void)
+{
+	int i;
+
+	for (i = 0; i < IPFW_TABLES_MAX; i++) {
+		rn_inithead((void **)&ipfw_tables[i].rnh, 32);
+		ipfw_tables[i].modified = 1;
+	}
+}
+
 static int
+add_table_entry(u_int16_t tbl, in_addr_t addr, u_int8_t mlen, u_int32_t value)
+{
+	struct radix_node_head *rnh;
+	struct table_entry *ent;
+	int s;
+
+	if (tbl >= IPFW_TABLES_MAX)
+		return (EINVAL);
+	rnh = ipfw_tables[tbl].rnh;
+	ent = malloc(sizeof(*ent), M_IPFW_TBL, M_NOWAIT | M_ZERO);
+	if (ent == NULL)
+		return (ENOMEM);
+	ent->value = value;
+	ent->addr.sin_len = ent->mask.sin_len = 8;
+	ent->mask.sin_addr.s_addr = htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
+	ent->addr.sin_addr.s_addr = addr & ent->mask.sin_addr.s_addr;
+	s = splimp();
+	if (rnh->rnh_addaddr(&ent->addr, &ent->mask, rnh, (void *)ent) ==
+	    NULL) {
+		splx(s);
+		free(ent, M_IPFW_TBL);
+		return (EEXIST);
+	}
+	ipfw_tables[tbl].modified = 1;
+	splx(s);
+	return (0);
+}
+
+static int
+del_table_entry(u_int16_t tbl, in_addr_t addr, u_int8_t mlen)
+{
+	struct radix_node_head *rnh;
+	struct table_entry *ent;
+	struct sockaddr_in sa, mask;
+	int s;
+
+	if (tbl >= IPFW_TABLES_MAX)
+		return (EINVAL);
+	rnh = ipfw_tables[tbl].rnh;
+	sa.sin_len = mask.sin_len = 8;
+	mask.sin_addr.s_addr = htonl(mlen ? ~((1 << (32 - mlen)) - 1) : 0);
+	sa.sin_addr.s_addr = addr & mask.sin_addr.s_addr;
+	s = splimp();
+	ent = (struct table_entry *)rnh->rnh_deladdr(&sa, &mask, rnh);
+	if (ent == NULL) {
+		splx(s);
+		return (ESRCH);
+	}
+	ipfw_tables[tbl].modified = 1;
+	splx(s);
+	free(ent, M_IPFW_TBL);
+	return (0);
+}
+
+static int
+flush_table_entry(struct radix_node *rn, void *arg)
+{
+	struct radix_node_head * const rnh = arg;
+	struct table_entry *ent;
+
+	ent = (struct table_entry *)
+	    rnh->rnh_deladdr(rn->rn_key, rn->rn_mask, rnh);
+	if (ent != NULL)
+		free(ent, M_IPFW_TBL);
+	return (0);
+}
+
+static int
+flush_table(u_int16_t tbl)
+{
+	struct radix_node_head *rnh;
+	int s;
+
+	if (tbl >= IPFW_TABLES_MAX)
+		return (EINVAL);
+	rnh = ipfw_tables[tbl].rnh;
+	s = splimp();
+	rnh->rnh_walktree(rnh, flush_table_entry, rnh);
+	ipfw_tables[tbl].modified = 1;
+	splx(s);
+	return (0);
+}
+
+#if defined(KLD_MODULE)
+static void
+flush_tables(void)
+{
+	u_int16_t tbl;
+
+	for (tbl = 0; tbl < IPFW_TABLES_MAX; tbl++)
+		flush_table(tbl);
+}
+#endif
+
+static int
+lookup_table(u_int16_t tbl, in_addr_t addr, u_int32_t *val)
+{
+	struct radix_node_head *rnh;
+	struct table_entry *ent;
+	struct sockaddr_in sa;
+	static in_addr_t last_addr;
+	static int last_tbl;
+	static int last_match;
+	static u_int32_t last_value;
+	int s;
+
+	if (tbl >= IPFW_TABLES_MAX)
+		return (0);
+	if (tbl == last_tbl && addr == last_addr &&
+	    !ipfw_tables[tbl].modified) {
+		if (last_match)
+			*val = last_value;
+		return (last_match);
+	}
+	rnh = ipfw_tables[tbl].rnh;
+	sa.sin_len = 8;
+	sa.sin_addr.s_addr = addr;
+	s = splimp();
+	ipfw_tables[tbl].modified = 0;
+	ent = (struct table_entry *)(rnh->rnh_lookup(&sa, NULL, rnh));
+	splx(s);
+	last_addr = addr;
+	last_tbl = tbl;
+	if (ent != NULL) {
+		last_value = *val = ent->value;
+		last_match = 1;
+		return (1);
+	}
+	last_match = 0;
+	return (0);
+}
+
+static int
+count_table_entry(struct radix_node *rn, void *arg)
+{
+	u_int32_t * const cnt = arg;
+
+	(*cnt)++;
+	return (0);
+}
+
+static int
+count_table(u_int32_t tbl, u_int32_t *cnt)
+{
+	struct radix_node_head *rnh;
+	int s;
+
+	if (tbl >= IPFW_TABLES_MAX)
+		return (EINVAL);
+	rnh = ipfw_tables[tbl].rnh;
+	*cnt = 0;
+	s = splimp();
+	rnh->rnh_walktree(rnh, count_table_entry, cnt);
+	splx(s);
+	return (0);
+}
+
+static int
+dump_table_entry(struct radix_node *rn, void *arg)
+{
+	struct table_entry * const n = (struct table_entry *)rn;
+	ipfw_table * const tbl = arg;
+	ipfw_table_entry *ent;
+
+	if (tbl->cnt == tbl->size)
+		return (1);
+	ent = &tbl->ent[tbl->cnt];
+	ent->tbl = tbl->tbl;
+	if (in_nullhost(n->mask.sin_addr))
+		ent->masklen = 0;
+	else
+		ent->masklen = 33 - ffs(ntohl(n->mask.sin_addr.s_addr));
+	ent->addr = n->addr.sin_addr.s_addr;
+	ent->value = n->value;
+	tbl->cnt++;
+	return (0);
+}
+
+static int
+dump_table(ipfw_table *tbl)
+{
+	struct radix_node_head *rnh;
+	int s;
+
+	if (tbl->tbl >= IPFW_TABLES_MAX)
+		return (EINVAL);
+	rnh = ipfw_tables[tbl->tbl].rnh;
+	tbl->cnt = 0;
+	s = splimp();
+	rnh->rnh_walktree(rnh, dump_table_entry, tbl);
+	splx(s);
+	return (0);
+}
+
+static int
 ipfw_chk(struct ip_fw_args *args)
 {
 	/*
 	 * Local variables hold state during the processing of a packet.
 	 *
 	 * IMPORTANT NOTE: to speed up the processing of rules, there
 	 * are some assumptions on the values of the variables, which
 	 * are documented here. Should you change them, please check
 	 * the implementation of the various instructions to make sure
 	 * that they still work.
 	 *
 	 * args->eh	The MAC header. It is non-NULL for a layer-2
 	 *	packet and NULL for a layer-3 packet.
 	 *
 	 * m | args->m	Pointer to the mbuf, as received from the caller.
 	 *	It may change if ipfw_chk() does an m_pullup, or if it
 	 *	consumes the packet because it calls send_reject().
 	 *	XXX This has to change, so that ipfw_chk() never modifies
 	 *	or consumes the buffer.
 	 * ip	is simply an alias of the value of m, and it is kept
 	 *	in sync with it (the packet is supposed to start with
 	 *	the ip header).
*/ struct mbuf *m = args->m; struct ip *ip = mtod(m, struct ip *); /* * oif | args->oif If NULL, ipfw_chk has been called on the * inbound path (ether_input, bdg_forward, ip_input). * If non-NULL, ipfw_chk has been called on the outbound path * (ether_output, ip_output). */ struct ifnet *oif = args->oif; struct ip_fw *f = NULL; /* matching rule */ int retval = 0; /* * hlen The length of the IPv4 header. * hlen >0 means we have an IPv4 packet. */ u_int hlen = 0; /* hlen >0 means we have an IP pkt */ /* * offset The offset of a fragment. offset != 0 means that * we have a fragment at this offset of an IPv4 packet. * offset == 0 means that (if this is an IPv4 packet) * this is the first or only fragment. */ u_short offset = 0; /* * Local copies of addresses. They are only valid if we have * an IP packet. * * proto The protocol. Set to 0 for non-ip packets, * or to the protocol read from the packet otherwise. * proto != 0 means that we have an IPv4 packet. * * src_port, dst_port port numbers, in HOST format. Only * valid for TCP and UDP packets. * * src_ip, dst_ip ip addresses, in NETWORK format. * Only valid for IPv4 packets. */ u_int8_t proto; u_int16_t src_port = 0, dst_port = 0; /* NOTE: host format */ struct in_addr src_ip, dst_ip; /* NOTE: network format */ u_int16_t ip_len=0; int pktlen; int dyn_dir = MATCH_UNKNOWN; ipfw_dyn_rule *q = NULL; if (m->m_flags & M_SKIP_FIREWALL) return 0; /* accept */ /* * dyn_dir = MATCH_UNKNOWN when rules unchecked, * MATCH_NONE when checked and not matched (q = NULL), * MATCH_FORWARD or MATCH_REVERSE otherwise (q != NULL) */ pktlen = m->m_pkthdr.len; if (args->eh == NULL || /* layer 3 packet */ ( m->m_pkthdr.len >= sizeof(struct ip) && ntohs(args->eh->ether_type) == ETHERTYPE_IP)) hlen = ip->ip_hl << 2; /* * Collect parameters into local variables for faster matching. 
*/ if (hlen == 0) { /* do not grab addresses for non-ip pkts */ proto = args->f_id.proto = 0; /* mark f_id invalid */ goto after_ip_checks; } proto = args->f_id.proto = ip->ip_p; src_ip = ip->ip_src; dst_ip = ip->ip_dst; if (args->eh != NULL) { /* layer 2 packets are as on the wire */ offset = ntohs(ip->ip_off) & IP_OFFMASK; ip_len = ntohs(ip->ip_len); } else { offset = ip->ip_off & IP_OFFMASK; ip_len = ip->ip_len; } pktlen = ip_len < pktlen ? ip_len : pktlen; #define PULLUP_TO(len) \ do { \ if ((m)->m_len < (len)) { \ args->m = m = m_pullup(m, (len)); \ if (m == 0) \ goto pullup_failed; \ ip = mtod(m, struct ip *); \ } \ } while (0) if (offset == 0) { switch (proto) { case IPPROTO_TCP: { struct tcphdr *tcp; PULLUP_TO(hlen + sizeof(struct tcphdr)); tcp = L3HDR(struct tcphdr, ip); dst_port = tcp->th_dport; src_port = tcp->th_sport; args->f_id.flags = tcp->th_flags; } break; case IPPROTO_UDP: { struct udphdr *udp; PULLUP_TO(hlen + sizeof(struct udphdr)); udp = L3HDR(struct udphdr, ip); dst_port = udp->uh_dport; src_port = udp->uh_sport; } break; case IPPROTO_ICMP: PULLUP_TO(hlen + 4); /* type, code and checksum. */ args->f_id.flags = L3HDR(struct icmp, ip)->icmp_type; break; default: break; } #undef PULLUP_TO } args->f_id.src_ip = ntohl(src_ip.s_addr); args->f_id.dst_ip = ntohl(dst_ip.s_addr); args->f_id.src_port = src_port = ntohs(src_port); args->f_id.dst_port = dst_port = ntohs(dst_port); after_ip_checks: if (args->rule) { /* * Packet has already been tagged. Look for the next rule * to restart processing. * * If fw_one_pass != 0 then just accept it. * XXX should not happen here, but optimized out in * the caller. */ if (fw_one_pass) return 0; f = args->rule->next_rule; if (f == NULL) f = lookup_next_rule(args->rule); } else { /* * Find the starting rule. It can be either the first * one, or the one after divert_rule if asked so. 
*/ int skipto = args->divert_rule; f = layer3_chain; if (args->eh == NULL && skipto != 0) { if (skipto >= IPFW_DEFAULT_RULE) return(IP_FW_PORT_DENY_FLAG); /* invalid */ while (f && f->rulenum <= skipto) f = f->next; if (f == NULL) /* drop packet */ return(IP_FW_PORT_DENY_FLAG); } } args->divert_rule = 0; /* reset to avoid confusion later */ /* * Now scan the rules, and parse microinstructions for each rule. */ for (; f; f = f->next) { int l, cmdlen; ipfw_insn *cmd; int skip_or; /* skip rest of OR block */ again: if (set_disable & (1 << f->set) ) continue; skip_or = 0; for (l = f->cmd_len, cmd = f->cmd ; l > 0 ; l -= cmdlen, cmd += cmdlen) { int match; /* * check_body is a jump target used when we find a * CHECK_STATE, and need to jump to the body of * the target rule. */ check_body: cmdlen = F_LEN(cmd); /* * An OR block (insn_1 || .. || insn_n) has the * F_OR bit set in all but the last instruction. * The first match will set "skip_or", and cause * the following instructions to be skipped until * past the one with the F_OR bit clear. */ if (skip_or) { /* skip this instruction */ if ((cmd->len & F_OR) == 0) skip_or = 0; /* next one is good */ continue; } match = 0; /* set to 1 if we succeed */ switch (cmd->opcode) { /* * The first set of opcodes compares the packet's * fields with some pattern, setting 'match' if a * match is found. At the end of the loop there is * logic to deal with F_NOT and F_OR flags associated * with the opcode. */ case O_NOP: match = 1; break; case O_FORWARD_MAC: printf("ipfw: opcode %d unimplemented\n", cmd->opcode); break; case O_GID: case O_UID: /* * We only check offset == 0 && proto != 0, * as this ensures that we have an IPv4 * packet with the ports info. */ if (offset!=0) break; { struct inpcbinfo *pi; int wildcard; struct inpcb *pcb; if (proto == IPPROTO_TCP) { wildcard = 0; pi = &tcbinfo; } else if (proto == IPPROTO_UDP) { wildcard = 1; pi = &udbinfo; } else break; pcb = (oif) ? 
in_pcblookup_hash(pi, dst_ip, htons(dst_port), src_ip, htons(src_port), wildcard, oif) : in_pcblookup_hash(pi, src_ip, htons(src_port), dst_ip, htons(dst_port), wildcard, NULL); if (pcb == NULL || pcb->inp_socket == NULL) break; #if __FreeBSD_version < 500034 #define socheckuid(a,b) ((a)->so_cred->cr_uid != (b)) #endif if (cmd->opcode == O_UID) { match = !socheckuid(pcb->inp_socket, (uid_t)((ipfw_insn_u32 *)cmd)->d[0]); } else { match = groupmember( (uid_t)((ipfw_insn_u32 *)cmd)->d[0], pcb->inp_socket->so_cred); } } break; case O_RECV: match = iface_match(m->m_pkthdr.rcvif, (ipfw_insn_if *)cmd); break; case O_XMIT: match = iface_match(oif, (ipfw_insn_if *)cmd); break; case O_VIA: match = iface_match(oif ? oif : m->m_pkthdr.rcvif, (ipfw_insn_if *)cmd); break; case O_MACADDR2: if (args->eh != NULL) { /* have MAC header */ u_int32_t *want = (u_int32_t *) ((ipfw_insn_mac *)cmd)->addr; u_int32_t *mask = (u_int32_t *) ((ipfw_insn_mac *)cmd)->mask; u_int32_t *hdr = (u_int32_t *)args->eh; match = ( want[0] == (hdr[0] & mask[0]) && want[1] == (hdr[1] & mask[1]) && want[2] == (hdr[2] & mask[2]) ); } break; case O_MAC_TYPE: if (args->eh != NULL) { u_int16_t t = ntohs(args->eh->ether_type); u_int16_t *p = ((ipfw_insn_u16 *)cmd)->ports; int i; for (i = cmdlen - 1; !match && i>0; i--, p += 2) match = (t>=p[0] && t<=p[1]); } break; case O_FRAG: match = (hlen > 0 && offset != 0); break; case O_IN: /* "out" is "not in" */ match = (oif == NULL); break; case O_LAYER2: match = (args->eh != NULL); break; case O_PROTO: /* * We do not allow an arg of 0 so the * check of "proto" only suffices. */ match = (proto == cmd->arg1); break; case O_IP_SRC: match = (hlen > 0 && ((ipfw_insn_ip *)cmd)->addr.s_addr == src_ip.s_addr); break; + case O_IP_SRC_LOOKUP: + case O_IP_DST_LOOKUP: + if (hlen > 0) { + uint32_t a = + (cmd->opcode == O_IP_DST_LOOKUP) ? 
+ dst_ip.s_addr : src_ip.s_addr; + uint32_t v; + + match = lookup_table(cmd->arg1, a, &v); + if (!match) + break; + if (cmdlen == F_INSN_SIZE(ipfw_insn_u32)) + match = + ((ipfw_insn_u32 *)cmd)->d[0] == v; + } + break; + case O_IP_SRC_MASK: case O_IP_DST_MASK: if (hlen > 0) { uint32_t a = (cmd->opcode == O_IP_DST_MASK) ? dst_ip.s_addr : src_ip.s_addr; uint32_t *p = ((ipfw_insn_u32 *)cmd)->d; int i = cmdlen-1; for (; !match && i>0; i-= 2, p+= 2) match = (p[0] == (a & p[1])); } break; case O_IP_SRC_ME: if (hlen > 0) { struct ifnet *tif; INADDR_TO_IFP(src_ip, tif); match = (tif != NULL); } break; case O_IP_DST_SET: case O_IP_SRC_SET: if (hlen > 0) { u_int32_t *d = (u_int32_t *)(cmd+1); u_int32_t addr = cmd->opcode == O_IP_DST_SET ? args->f_id.dst_ip : args->f_id.src_ip; if (addr < d[0]) break; addr -= d[0]; /* subtract base */ match = (addr < cmd->arg1) && ( d[ 1 + (addr>>5)] & (1<<(addr & 0x1f)) ); } break; case O_IP_DST: match = (hlen > 0 && ((ipfw_insn_ip *)cmd)->addr.s_addr == dst_ip.s_addr); break; case O_IP_DST_ME: if (hlen > 0) { struct ifnet *tif; INADDR_TO_IFP(dst_ip, tif); match = (tif != NULL); } break; case O_IP_SRCPORT: case O_IP_DSTPORT: /* * offset == 0 && proto != 0 is enough * to guarantee that we have an IPv4 * packet with port info. */ if ((proto==IPPROTO_UDP || proto==IPPROTO_TCP) && offset == 0) { u_int16_t x = (cmd->opcode == O_IP_SRCPORT) ? 
src_port : dst_port ; u_int16_t *p = ((ipfw_insn_u16 *)cmd)->ports; int i; for (i = cmdlen - 1; !match && i>0; i--, p += 2) match = (x>=p[0] && x<=p[1]); } break; case O_ICMPTYPE: match = (offset == 0 && proto==IPPROTO_ICMP && icmptype_match(ip, (ipfw_insn_u32 *)cmd) ); break; case O_IPOPT: match = (hlen > 0 && ipopts_match(ip, cmd) ); break; case O_IPVER: match = (hlen > 0 && cmd->arg1 == ip->ip_v); break; case O_IPID: case O_IPLEN: case O_IPTTL: if (hlen > 0) { /* only for IP packets */ uint16_t x; uint16_t *p; int i; if (cmd->opcode == O_IPLEN) x = ip_len; else if (cmd->opcode == O_IPTTL) x = ip->ip_ttl; else /* must be IPID */ x = ntohs(ip->ip_id); if (cmdlen == 1) { match = (cmd->arg1 == x); break; } /* otherwise we have ranges */ p = ((ipfw_insn_u16 *)cmd)->ports; i = cmdlen - 1; for (; !match && i>0; i--, p += 2) match = (x >= p[0] && x <= p[1]); } break; case O_IPPRECEDENCE: match = (hlen > 0 && (cmd->arg1 == (ip->ip_tos & 0xe0)) ); break; case O_IPTOS: match = (hlen > 0 && flags_match(cmd, ip->ip_tos)); break; case O_TCPFLAGS: match = (proto == IPPROTO_TCP && offset == 0 && flags_match(cmd, L3HDR(struct tcphdr,ip)->th_flags)); break; case O_TCPOPTS: match = (proto == IPPROTO_TCP && offset == 0 && tcpopts_match(ip, cmd)); break; case O_TCPSEQ: match = (proto == IPPROTO_TCP && offset == 0 && ((ipfw_insn_u32 *)cmd)->d[0] == L3HDR(struct tcphdr,ip)->th_seq); break; case O_TCPACK: match = (proto == IPPROTO_TCP && offset == 0 && ((ipfw_insn_u32 *)cmd)->d[0] == L3HDR(struct tcphdr,ip)->th_ack); break; case O_TCPWIN: match = (proto == IPPROTO_TCP && offset == 0 && cmd->arg1 == L3HDR(struct tcphdr,ip)->th_win); break; case O_ESTAB: /* reject packets which have SYN only */ /* XXX should i also check for TH_ACK ? 
*/ match = (proto == IPPROTO_TCP && offset == 0 && (L3HDR(struct tcphdr,ip)->th_flags & (TH_RST | TH_ACK | TH_SYN)) != TH_SYN); break; case O_LOG: if (fw_verbose) ipfw_log(f, hlen, args->eh, m, oif); match = 1; break; case O_PROB: match = (random()<((ipfw_insn_u32 *)cmd)->d[0]); break; case O_VERREVPATH: /* Outgoing packets automatically pass/match */ match = ((oif != NULL) || (m->m_pkthdr.rcvif == NULL) || verify_rev_path(src_ip, m->m_pkthdr.rcvif)); break; case O_IPSEC: #ifdef FAST_IPSEC match = (m_tag_find(m, PACKET_TAG_IPSEC_IN_DONE, NULL) != NULL); #endif #ifdef IPSEC match = (ipsec_gethist(m, NULL) != NULL); #endif /* otherwise no match */ break; /* * The second set of opcodes represents 'actions', * i.e. the terminal part of a rule once the packet * matches all previous patterns. * Typically there is only one action for each rule, * and the opcode is stored at the end of the rule * (but there are exceptions -- see below). * * In general, here we set retval and terminate the * outer loop (would be a 'break 3' in some language, * but we need to do a 'goto done'). * * Exceptions: * O_COUNT and O_SKIPTO actions: * instead of terminating, we jump to the next rule * ('goto next_rule', equivalent to a 'break 2'), * or to the SKIPTO target ('goto again' after * having set f, cmd and l), respectively. * * O_LIMIT and O_KEEP_STATE: these opcodes are * not real 'actions', and are stored right * before the 'action' part of the rule. * These opcodes try to install an entry in the * state tables; if successful, we continue with * the next opcode (match=1; break;), otherwise * the packet * must be dropped * ('goto done' after setting retval); * * O_PROBE_STATE and O_CHECK_STATE: these opcodes * cause a lookup of the state table, and a jump * to the 'action' part of the parent rule * ('goto check_body') if an entry is found, or * (CHECK_STATE only) a jump to the next rule if * the entry is not found ('goto next_rule'). 
 			 *   The result of the lookup is cached so that
 			 *   further instances of these opcodes are
 			 *   effectively NOPs.
 			 */
 			case O_LIMIT:
 			case O_KEEP_STATE:
 				if (install_state(f,
 				    (ipfw_insn_limit *)cmd, args)) {
 					retval = IP_FW_PORT_DENY_FLAG;
 					goto done; /* error/limit violation */
 				}
 				match = 1;
 				break;

 			case O_PROBE_STATE:
 			case O_CHECK_STATE:
 				/*
 				 * dynamic rules are checked at the first
 				 * keep-state or check-state occurrence,
 				 * with the result being stored in dyn_dir.
 				 * The compiler introduces a PROBE_STATE
 				 * instruction for us when we have a
 				 * KEEP_STATE (because PROBE_STATE needs
 				 * to be run first).
 				 */
 				if (dyn_dir == MATCH_UNKNOWN &&
 				    (q = lookup_dyn_rule(&args->f_id,
 				     &dyn_dir, proto == IPPROTO_TCP ?
 					L3HDR(struct tcphdr, ip) : NULL))
 					!= NULL) {
 					/*
 					 * Found dynamic entry, update stats
 					 * and jump to the 'action' part of
 					 * the parent rule.
 					 */
 					q->pcnt++;
 					q->bcnt += pktlen;
 					f = q->rule;
 					cmd = ACTION_PTR(f);
 					l = f->cmd_len - f->act_ofs;
 					goto check_body;
 				}
 				/*
 				 * Dynamic entry not found. If CHECK_STATE,
 				 * skip to next rule, if PROBE_STATE just
 				 * ignore and continue with next opcode.
 				 */
 				if (cmd->opcode == O_CHECK_STATE)
 					goto next_rule;
 				match = 1;
 				break;

 			case O_ACCEPT:
 				retval = 0;	/* accept */
 				goto done;

 			case O_PIPE:
 			case O_QUEUE:
 				args->rule = f; /* report matching rule */
 				retval = cmd->arg1 | IP_FW_PORT_DYNT_FLAG;
 				goto done;

 			case O_DIVERT:
 			case O_TEE:
 				if (args->eh) /* not on layer 2 */
 					break;
 				args->divert_rule = f->rulenum;
 				retval = (cmd->opcode == O_DIVERT) ?
 				    cmd->arg1 :
 				    cmd->arg1 | IP_FW_PORT_TEE_FLAG;
 				goto done;

 			case O_COUNT:
 			case O_SKIPTO:
 				f->pcnt++;	/* update stats */
 				f->bcnt += pktlen;
 				f->timestamp = time_second;
 				if (cmd->opcode == O_COUNT)
 					goto next_rule;
 				/* handle skipto */
 				if (f->next_rule == NULL)
 					lookup_next_rule(f);
 				f = f->next_rule;
 				goto again;

 			case O_REJECT:
 				/*
 				 * Drop the packet and send a reject notice
 				 * if the packet is not ICMP (or is an ICMP
 				 * query), and it is not multicast/broadcast.
*/ if (hlen > 0 && (proto != IPPROTO_ICMP || is_icmp_query(ip)) && !(m->m_flags & (M_BCAST|M_MCAST)) && !IN_MULTICAST(ntohl(dst_ip.s_addr))) { send_reject(args, cmd->arg1, offset,ip_len); m = args->m; } /* FALLTHROUGH */ case O_DENY: retval = IP_FW_PORT_DENY_FLAG; goto done; case O_FORWARD_IP: if (args->eh) /* not valid on layer2 pkts */ break; if (!q || dyn_dir == MATCH_FORWARD) args->next_hop = &((ipfw_insn_sa *)cmd)->sa; retval = 0; goto done; default: panic("-- unknown opcode %d\n", cmd->opcode); } /* end of switch() on opcodes */ if (cmd->len & F_NOT) match = !match; if (match) { if (cmd->len & F_OR) skip_or = 1; } else { if (!(cmd->len & F_OR)) /* not an OR block, */ break; /* try next rule */ } } /* end of inner for, scan opcodes */ next_rule:; /* try next rule */ } /* end of outer for, scan rules */ printf("ipfw: ouch!, skip past end of rules, denying packet\n"); return(IP_FW_PORT_DENY_FLAG); done: /* Update statistics */ f->pcnt++; f->bcnt += pktlen; f->timestamp = time_second; return retval; pullup_failed: if (fw_verbose) printf("ipfw: pullup failed\n"); return(IP_FW_PORT_DENY_FLAG); } /* * When a rule is added/deleted, clear the next_rule pointers in all rules. * These will be reconstructed on the fly as packets are matched. * Must be called at splimp(). */ static void flush_rule_ptrs(void) { struct ip_fw *rule; for (rule = layer3_chain; rule; rule = rule->next) rule->next_rule = NULL; } /* * When pipes/queues are deleted, clear the "pipe_ptr" pointer to a given * pipe/queue, or to all of them (match == NULL). * Must be called at splimp(). */ void flush_pipe_ptrs(struct dn_flow_set *match) { struct ip_fw *rule; for (rule = layer3_chain; rule; rule = rule->next) { ipfw_insn_pipe *cmd = (ipfw_insn_pipe *)ACTION_PTR(rule); if (cmd->o.opcode != O_PIPE && cmd->o.opcode != O_QUEUE) continue; /* * XXX Use bcmp/bzero to handle pipe_ptr to overcome * possible alignment problems on 64-bit architectures. 
* This code is seldom used so we do not worry too * much about efficiency. */ if (match == NULL || !bcmp(&cmd->pipe_ptr, &match, sizeof(match)) ) bzero(&cmd->pipe_ptr, sizeof(cmd->pipe_ptr)); } } /* * Add a new rule to the list. Copy the rule into a malloc'ed area, then * possibly create a rule number and add the rule to the list. * Update the rule_number in the input struct so the caller knows it as well. */ static int add_rule(struct ip_fw **head, struct ip_fw *input_rule) { struct ip_fw *rule, *f, *prev; int s; int l = RULESIZE(input_rule); if (*head == NULL && input_rule->rulenum != IPFW_DEFAULT_RULE) return (EINVAL); rule = malloc(l, M_IPFW, M_NOWAIT | M_ZERO); if (rule == NULL) return (ENOSPC); bcopy(input_rule, rule, l); rule->next = NULL; rule->next_rule = NULL; rule->pcnt = 0; rule->bcnt = 0; rule->timestamp = 0; s = splimp(); if (*head == NULL) { /* default rule */ *head = rule; goto done; } /* * If rulenum is 0, find highest numbered rule before the * default rule, and add autoinc_step */ if (autoinc_step < 1) autoinc_step = 1; else if (autoinc_step > 1000) autoinc_step = 1000; if (rule->rulenum == 0) { /* * locate the highest numbered rule before default */ for (f = *head; f; f = f->next) { if (f->rulenum == IPFW_DEFAULT_RULE) break; rule->rulenum = f->rulenum; } if (rule->rulenum < IPFW_DEFAULT_RULE - autoinc_step) rule->rulenum += autoinc_step; input_rule->rulenum = rule->rulenum; } /* * Now insert the new rule in the right place in the sorted list. */ for (prev = NULL, f = *head; f; prev = f, f = f->next) { if (f->rulenum > rule->rulenum) { /* found the location */ if (prev) { rule->next = f; prev->next = rule; } else { /* head insert */ rule->next = *head; *head = rule; } break; } } flush_rule_ptrs(); done: static_count++; static_len += l; splx(s); DEB(printf("ipfw: installed rule %d, static count now %d\n", rule->rulenum, static_count);) return (0); } /** * Free storage associated with a static rule (including derived * dynamic rules). 
* The caller is in charge of clearing rule pointers to avoid * dangling pointers. * @return a pointer to the next entry. * Arguments are not checked, so they better be correct. * Must be called at splimp(). */ static struct ip_fw * delete_rule(struct ip_fw **head, struct ip_fw *prev, struct ip_fw *rule) { struct ip_fw *n; int l = RULESIZE(rule); n = rule->next; remove_dyn_rule(rule, NULL /* force removal */); if (prev == NULL) *head = n; else prev->next = n; static_count--; static_len -= l; if (DUMMYNET_LOADED) ip_dn_ruledel_ptr(rule); free(rule, M_IPFW); return n; } /* * Deletes all rules from a chain (except rules in set RESVD_SET * unless kill_default = 1). * Must be called at splimp(). */ static void free_chain(struct ip_fw **chain, int kill_default) { struct ip_fw *prev, *rule; flush_rule_ptrs(); /* more efficient to do outside the loop */ for (prev = NULL, rule = *chain; rule ; ) if (kill_default || rule->set != RESVD_SET) rule = delete_rule(chain, prev, rule); else { prev = rule; rule = rule->next; } } /** * Remove all rules with given number, and also do set manipulation. * Assumes chain != NULL && *chain != NULL. * * The argument is an u_int32_t. 
The low 16 bits are the rule or set number,
  * the next 8 bits are the new set, the top 8 bits are the command:
  *
  *	0	delete rules with given number
  *	1	delete rules with given set number
  *	2	move rules with given number to new set
  *	3	move rules with given set number to new set
  *	4	swap sets with given numbers
  */
 static int
 del_entry(struct ip_fw **chain, u_int32_t arg)
 {
 	struct ip_fw *prev = NULL, *rule = *chain;
 	int s;
 	u_int16_t rulenum;	/* rule or old_set */
 	u_int8_t cmd, new_set;

 	rulenum = arg & 0xffff;
 	cmd = (arg >> 24) & 0xff;
 	new_set = (arg >> 16) & 0xff;

 	if (cmd > 4)
 		return EINVAL;
 	if (new_set > RESVD_SET)
 		return EINVAL;
 	if (cmd == 0 || cmd == 2) {
 		if (rulenum >= IPFW_DEFAULT_RULE)
 			return EINVAL;
 	} else {
 		if (rulenum > RESVD_SET)	/* old_set */
 			return EINVAL;
 	}

 	switch (cmd) {
 	case 0:	/* delete rules with given number */
 		/*
 		 * locate first rule to delete
 		 */
 		for (; rule->rulenum < rulenum; prev = rule, rule = rule->next)
 			;
 		if (rule->rulenum != rulenum)
 			return EINVAL;

 		s = splimp(); /* no access to rules while removing */
 		/*
 		 * flush pointers outside the loop, then delete all matching
 		 * rules. prev remains the same throughout the cycle.
 		 */
 		flush_rule_ptrs();
 		while (rule->rulenum == rulenum)
 			rule = delete_rule(chain, prev, rule);
 		splx(s);
 		break;

 	case 1:	/* delete all rules with given set number */
 		s = splimp();
 		flush_rule_ptrs();
 		while (rule->rulenum < IPFW_DEFAULT_RULE)
 			if (rule->set == rulenum)
 				rule = delete_rule(chain, prev, rule);
 			else {
 				prev = rule;
 				rule = rule->next;
 			}
 		splx(s);
 		break;

 	case 2:	/* move rules with given number to new set */
 		s = splimp();
 		for (; rule->rulenum < IPFW_DEFAULT_RULE; rule = rule->next)
 			if (rule->rulenum == rulenum)
 				rule->set = new_set;
 		splx(s);
 		break;

 	case 3: /* move rules with given set number to new set */
 		s = splimp();
 		for (; rule->rulenum < IPFW_DEFAULT_RULE; rule = rule->next)
 			if (rule->set == rulenum)
 				rule->set = new_set;
 		splx(s);
 		break;

 	case 4: /* swap two sets */
 		s = splimp();
 		for (; rule->rulenum < IPFW_DEFAULT_RULE; rule = rule->next)
 			if (rule->set == rulenum)
 				rule->set = new_set;
 			else if (rule->set == new_set)
 				rule->set = rulenum;
 		splx(s);
 		break;
 	}
 	return 0;
 }

 /*
  * Clear counters for a specific rule.
  */
 static void
 clear_counters(struct ip_fw *rule, int log_only)
 {
 	ipfw_insn_log *l = (ipfw_insn_log *)ACTION_PTR(rule);

 	if (log_only == 0) {
 		rule->bcnt = rule->pcnt = 0;
 		rule->timestamp = 0;
 	}
 	if (l->o.opcode == O_LOG)
 		l->log_left = l->max_log;
 }

 /**
  * Reset some or all counters on firewall rules.
  * @arg rulenum is 0 to clear all entries, or a specific
  * rule number.
  * @arg log_only is 1 if we only want to reset logs, zero otherwise.
  */
 static int
 zero_entry(int rulenum, int log_only)
 {
 	struct ip_fw *rule;
 	int s;
 	char *msg;

 	if (rulenum == 0) {
 		s = splimp();
 		norule_counter = 0;
 		for (rule = layer3_chain; rule; rule = rule->next)
 			clear_counters(rule, log_only);
 		splx(s);
 		msg = log_only ? "ipfw: All logging counts reset.\n" :
 		    "ipfw: Accounting cleared.\n";
 	} else {
 		int cleared = 0;
 		/*
 		 * We can have multiple rules with the same number, so we
 		 * need to clear them all.
*/ for (rule = layer3_chain; rule; rule = rule->next) if (rule->rulenum == rulenum) { s = splimp(); while (rule && rule->rulenum == rulenum) { clear_counters(rule, log_only); rule = rule->next; } splx(s); cleared = 1; break; } if (!cleared) /* we did not find any matching rules */ return (EINVAL); msg = log_only ? "ipfw: Entry %d logging count reset.\n" : "ipfw: Entry %d cleared.\n"; } if (fw_verbose) log(LOG_SECURITY | LOG_NOTICE, msg, rulenum); return (0); } /* * Check validity of the structure before insert. * Fortunately rules are simple, so this mostly needs to check rule sizes. */ static int check_ipfw_struct(struct ip_fw *rule, int size) { int l, cmdlen = 0; int have_action=0; ipfw_insn *cmd; if (size < sizeof(*rule)) { printf("ipfw: rule too short\n"); return (EINVAL); } /* first, check for valid size */ l = RULESIZE(rule); if (l != size) { printf("ipfw: size mismatch (have %d want %d)\n", size, l); return (EINVAL); } /* * Now go for the individual checks. Very simple ones, basically only * instruction sizes.
*/ for (l = rule->cmd_len, cmd = rule->cmd ; l > 0 ; l -= cmdlen, cmd += cmdlen) { cmdlen = F_LEN(cmd); if (cmdlen > l) { printf("ipfw: opcode %d size truncated\n", cmd->opcode); return EINVAL; } DEB(printf("ipfw: opcode %d\n", cmd->opcode);) switch (cmd->opcode) { case O_PROBE_STATE: case O_KEEP_STATE: case O_PROTO: case O_IP_SRC_ME: case O_IP_DST_ME: case O_LAYER2: case O_IN: case O_FRAG: case O_IPOPT: case O_IPTOS: case O_IPPRECEDENCE: case O_IPVER: case O_TCPWIN: case O_TCPFLAGS: case O_TCPOPTS: case O_ESTAB: case O_VERREVPATH: case O_IPSEC: if (cmdlen != F_INSN_SIZE(ipfw_insn)) goto bad_size; break; case O_UID: case O_GID: case O_IP_SRC: case O_IP_DST: case O_TCPSEQ: case O_TCPACK: case O_PROB: case O_ICMPTYPE: if (cmdlen != F_INSN_SIZE(ipfw_insn_u32)) goto bad_size; break; case O_LIMIT: if (cmdlen != F_INSN_SIZE(ipfw_insn_limit)) goto bad_size; break; case O_LOG: if (cmdlen != F_INSN_SIZE(ipfw_insn_log)) goto bad_size; ((ipfw_insn_log *)cmd)->log_left = ((ipfw_insn_log *)cmd)->max_log; break; case O_IP_SRC_MASK: case O_IP_DST_MASK: /* only odd command lengths */ if ( !(cmdlen & 1) || cmdlen > 31) goto bad_size; break; case O_IP_SRC_SET: case O_IP_DST_SET: if (cmd->arg1 == 0 || cmd->arg1 > 256) { printf("ipfw: invalid set size %d\n", cmd->arg1); return EINVAL; } if (cmdlen != F_INSN_SIZE(ipfw_insn_u32) + (cmd->arg1+31)/32 ) goto bad_size; break; + case O_IP_SRC_LOOKUP: + case O_IP_DST_LOOKUP: + if (cmd->arg1 >= IPFW_TABLES_MAX) { + printf("ipfw: invalid table number %d\n", + cmd->arg1); + return (EINVAL); + } + if (cmdlen != F_INSN_SIZE(ipfw_insn) && + cmdlen != F_INSN_SIZE(ipfw_insn_u32)) + goto bad_size; + break; + case O_MACADDR2: if (cmdlen != F_INSN_SIZE(ipfw_insn_mac)) goto bad_size; break; case O_NOP: case O_IPID: case O_IPTTL: case O_IPLEN: if (cmdlen < 1 || cmdlen > 31) goto bad_size; break; case O_MAC_TYPE: case O_IP_SRCPORT: case O_IP_DSTPORT: /* XXX artificial limit, 30 port pairs */ if (cmdlen < 2 || cmdlen > 31) goto bad_size; break; case O_RECV: 
case O_XMIT: case O_VIA: if (cmdlen != F_INSN_SIZE(ipfw_insn_if)) goto bad_size; break; case O_PIPE: case O_QUEUE: if (cmdlen != F_INSN_SIZE(ipfw_insn_pipe)) goto bad_size; goto check_action; case O_FORWARD_IP: if (cmdlen != F_INSN_SIZE(ipfw_insn_sa)) goto bad_size; goto check_action; case O_FORWARD_MAC: /* XXX not implemented yet */ case O_CHECK_STATE: case O_COUNT: case O_ACCEPT: case O_DENY: case O_REJECT: case O_SKIPTO: case O_DIVERT: case O_TEE: if (cmdlen != F_INSN_SIZE(ipfw_insn)) goto bad_size; check_action: if (have_action) { printf("ipfw: opcode %d, multiple actions" " not allowed\n", cmd->opcode); return EINVAL; } have_action = 1; if (l != cmdlen) { printf("ipfw: opcode %d, action must be" " last opcode\n", cmd->opcode); return EINVAL; } break; default: printf("ipfw: opcode %d, unknown opcode\n", cmd->opcode); return EINVAL; } } if (have_action == 0) { printf("ipfw: missing action\n"); return EINVAL; } return 0; bad_size: printf("ipfw: opcode %d size %d wrong\n", cmd->opcode, cmdlen); return EINVAL; } /** * {set|get}sockopt parser. */ static int ipfw_ctl(struct sockopt *sopt) { int error, s, rulenum; size_t size; struct ip_fw *bp , *buf, *rule; static u_int32_t rule_buf[255]; /* we copy the data here */ /* * Disallow modifications in really-really secure mode, but still allow * the logging counters to be reset. */ if (sopt->sopt_name == IP_FW_ADD || (sopt->sopt_dir == SOPT_SET && sopt->sopt_name != IP_FW_RESETLOG)) { #if __FreeBSD_version >= 500034 error = securelevel_ge(sopt->sopt_td->td_ucred, 3); if (error) return (error); #else /* FreeBSD 4.x */ if (securelevel >= 3) return (EPERM); #endif } error = 0; switch (sopt->sopt_name) { case IP_FW_GET: /* * pass up a copy of the current rules. Static rules * come first (the last of which has number IPFW_DEFAULT_RULE), * followed by a possibly empty list of dynamic rules. * The last dynamic rule has NULL in the "next" field.
*/ s = splimp(); size = static_len; /* size of static rules */ if (ipfw_dyn_v) /* add size of dyn.rules */ size += (dyn_count * sizeof(ipfw_dyn_rule)); /* * XXX todo: if the user passes a short length just to know * how much room is needed, do not bother filling up the * buffer, just jump to the sooptcopyout. */ buf = malloc(size, M_TEMP, M_WAITOK); if (buf == 0) { splx(s); error = ENOBUFS; break; } bp = buf; for (rule = layer3_chain; rule ; rule = rule->next) { int i = RULESIZE(rule); bcopy(rule, bp, i); bcopy(&set_disable, &(bp->next_rule), sizeof(set_disable)); bp = (struct ip_fw *)((char *)bp + i); } if (ipfw_dyn_v) { int i; ipfw_dyn_rule *p, *dst, *last = NULL; dst = (ipfw_dyn_rule *)bp; for (i = 0 ; i < curr_dyn_buckets ; i++ ) for ( p = ipfw_dyn_v[i] ; p != NULL ; p = p->next, dst++ ) { bcopy(p, dst, sizeof *p); bcopy(&(p->rule->rulenum), &(dst->rule), sizeof(p->rule->rulenum)); /* * store a non-null value in "next". * The userland code will interpret a * NULL here as a marker * for the last dynamic rule. */ bcopy(&dst, &dst->next, sizeof(dst)); last = dst ; dst->expire = TIME_LEQ(dst->expire, time_second) ? 0 : dst->expire - time_second ; } if (last != NULL) /* mark last dynamic rule */ bzero(&last->next, sizeof(last)); } splx(s); error = sooptcopyout(sopt, buf, size); free(buf, M_TEMP); break; case IP_FW_FLUSH: /* * Normally we cannot release the lock on each iteration. * We could do it here only because we start from the head all * the times so there is no risk of missing some entries. * On the other hand, the risk is that we end up with * a very inconsistent ruleset, so better keep the lock * around the whole cycle. * * XXX this code can be improved by resetting the head of * the list to point to the default rule, and then freeing * the old list without the need for a lock. 
*/ s = splimp(); free_chain(&layer3_chain, 0 /* keep default rule */); splx(s); break; case IP_FW_ADD: rule = (struct ip_fw *)rule_buf; /* XXX do a malloc */ error = sooptcopyin(sopt, rule, sizeof(rule_buf), sizeof(struct ip_fw) ); size = sopt->sopt_valsize; if (error || (error = check_ipfw_struct(rule, size))) break; error = add_rule(&layer3_chain, rule); size = RULESIZE(rule); if (!error && sopt->sopt_dir == SOPT_GET) error = sooptcopyout(sopt, rule, size); break; case IP_FW_DEL: /* * IP_FW_DEL is used for deleting single rules or sets, * and (ab)used to atomically manipulate sets. Argument size * is used to distinguish between the two: * sizeof(u_int32_t) * delete single rule or set of rules, * or reassign rules (or sets) to a different set. * 2*sizeof(u_int32_t) * atomic disable/enable sets. * first u_int32_t contains sets to be disabled, * second u_int32_t contains sets to be enabled. */ error = sooptcopyin(sopt, rule_buf, 2*sizeof(u_int32_t), sizeof(u_int32_t)); if (error) break; size = sopt->sopt_valsize; if (size == sizeof(u_int32_t)) /* delete or reassign */ error = del_entry(&layer3_chain, rule_buf[0]); else if (size == 2*sizeof(u_int32_t)) /* set enable/disable */ set_disable = (set_disable | rule_buf[0]) & ~rule_buf[1] & ~(1<<RESVD_SET); else error = EINVAL; break; case IP_FW_ZERO: case IP_FW_RESETLOG: rulenum = 0; if (sopt->sopt_val != 0) { error = sooptcopyin(sopt, &rulenum, sizeof(int), sizeof(int)); if (error) break; } error = zero_entry(rulenum, sopt->sopt_name == IP_FW_RESETLOG); break; + case IP_FW_TABLE_ADD: + { + ipfw_table_entry ent; + + error = sooptcopyin(sopt, &ent, + sizeof(ent), sizeof(ent)); + if (error) + break; + error = add_table_entry(ent.tbl, ent.addr, + ent.masklen, ent.value); + } + break; + + case IP_FW_TABLE_DEL: + { + ipfw_table_entry ent; + + error = sooptcopyin(sopt, &ent, + sizeof(ent), sizeof(ent)); + if (error) + break; + error = del_table_entry(ent.tbl, ent.addr, ent.masklen); + } + break; + + case IP_FW_TABLE_FLUSH: + { + u_int16_t tbl; + + error = sooptcopyin(sopt, &tbl, + sizeof(tbl), sizeof(tbl)); + if (error) +
break; + error = flush_table(tbl); + } + break; + + case IP_FW_TABLE_GETSIZE: + { + u_int32_t tbl, cnt; + + if ((error = sooptcopyin(sopt, &tbl, sizeof(tbl), + sizeof(tbl)))) + break; + if ((error = count_table(tbl, &cnt))) + break; + error = sooptcopyout(sopt, &cnt, sizeof(cnt)); + } + break; + + case IP_FW_TABLE_LIST: + { + ipfw_table *tbl; + + if (sopt->sopt_valsize < sizeof(*tbl)) { + error = EINVAL; + break; + } + size = sopt->sopt_valsize; + tbl = malloc(size, M_TEMP, M_WAITOK); + if (tbl == NULL) { + error = ENOMEM; + break; + } + error = sooptcopyin(sopt, tbl, size, sizeof(*tbl)); + if (error) { + free(tbl, M_TEMP); + break; + } + tbl->size = (size - sizeof(*tbl)) / + sizeof(ipfw_table_entry); + error = dump_table(tbl); + if (error) { + free(tbl, M_TEMP); + break; + } + error = sooptcopyout(sopt, tbl, size); + free(tbl, M_TEMP); + } + break; + default: printf("ipfw: ipfw_ctl invalid option %d\n", sopt->sopt_name); error = EINVAL; } return (error); } /** * dummynet needs a reference to the default rule, because rules can be * deleted while packets hold a reference to them. When this happens, * dummynet changes the reference to the default rule (it could well be a * NULL pointer, but this way we do not need to check for the special * case, plus here we have info on the default behaviour). */ struct ip_fw *ip_fw_default_rule; /* * This procedure is only used to handle keepalives.
It is invoked * every dyn_keepalive_period */ static void ipfw_tick(void * __unused unused) { int i; int s; ipfw_dyn_rule *q; if (dyn_keepalive == 0 || ipfw_dyn_v == NULL || dyn_count == 0) goto done; s = splimp(); for (i = 0 ; i < curr_dyn_buckets ; i++) { for (q = ipfw_dyn_v[i] ; q ; q = q->next ) { if (q->dyn_type == O_LIMIT_PARENT) continue; if (q->id.proto != IPPROTO_TCP) continue; if ( (q->state & BOTH_SYN) != BOTH_SYN) continue; if (TIME_LEQ( time_second+dyn_keepalive_interval, q->expire)) continue; /* too early */ if (TIME_LEQ(q->expire, time_second)) continue; /* too late, rule expired */ send_pkt(&(q->id), q->ack_rev - 1, q->ack_fwd, TH_SYN); send_pkt(&(q->id), q->ack_fwd - 1, q->ack_rev, 0); } } splx(s); done: ipfw_timeout_h = timeout(ipfw_tick, NULL, dyn_keepalive_period*hz); } static void ipfw_init(void) { struct ip_fw default_rule; ip_fw_chk_ptr = ipfw_chk; ip_fw_ctl_ptr = ipfw_ctl; layer3_chain = NULL; bzero(&default_rule, sizeof default_rule); default_rule.act_ofs = 0; default_rule.rulenum = IPFW_DEFAULT_RULE; default_rule.cmd_len = 1; default_rule.set = RESVD_SET; default_rule.cmd[0].len = 1; default_rule.cmd[0].opcode = #ifdef IPFIREWALL_DEFAULT_TO_ACCEPT 1 ? O_ACCEPT : #endif O_DENY; add_rule(&layer3_chain, &default_rule); ip_fw_default_rule = layer3_chain; printf("ipfw2 initialized, divert %s, " "rule-based forwarding enabled, default to %s, logging ", #ifdef IPDIVERT "enabled", #else "disabled", #endif default_rule.cmd[0].opcode == O_ACCEPT ? 
"accept" : "deny"); #ifdef IPFIREWALL_VERBOSE fw_verbose = 1; #endif #ifdef IPFIREWALL_VERBOSE_LIMIT verbose_limit = IPFIREWALL_VERBOSE_LIMIT; #endif if (fw_verbose == 0) printf("disabled\n"); else if (verbose_limit == 0) printf("unlimited\n"); else printf("limited to %d packets/entry by default\n", verbose_limit); bzero(&ipfw_timeout_h, sizeof(struct callout_handle)); ipfw_timeout_h = timeout(ipfw_tick, NULL, hz); } static int ipfw_modevent(module_t mod, int type, void *unused) { int s; int err = 0; switch (type) { case MOD_LOAD: s = splimp(); if (IPFW_LOADED) { splx(s); printf("IP firewall already loaded\n"); err = EEXIST; } else { ipfw_init(); splx(s); } break; case MOD_UNLOAD: #if !defined(KLD_MODULE) printf("ipfw statically compiled, cannot unload\n"); err = EBUSY; #else s = splimp(); untimeout(ipfw_tick, NULL, ipfw_timeout_h); ip_fw_chk_ptr = NULL; ip_fw_ctl_ptr = NULL; free_chain(&layer3_chain, 1 /* kill default rule */); + flush_tables(); splx(s); printf("IP firewall unloaded\n"); #endif break; default: break; } return err; } static moduledata_t ipfwmod = { "ipfw", ipfw_modevent, 0 }; DECLARE_MODULE(ipfw, ipfwmod, SI_SUB_PSEUDO, SI_ORDER_ANY); MODULE_VERSION(ipfw, 1); + +/* Must be run after route_init(). */ +SYSINIT(ipfw, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY, init_tables, 0) + #endif /* IPFW2 */ Index: stable/4/sys/netinet/ip_fw2.h =================================================================== --- stable/4/sys/netinet/ip_fw2.h (revision 130570) +++ stable/4/sys/netinet/ip_fw2.h (revision 130571) @@ -1,426 +1,445 @@ /* * Copyright (c) 2002 Luigi Rizzo, Universita` di Pisa * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _IPFW2_H #define _IPFW2_H /* * The kernel representation of ipfw rules is made of a list of * 'instructions' (for all practical purposes equivalent to BPF * instructions), which specify which fields of the packet * (or its metadata) should be analysed. * * Each instruction is stored in a structure which begins with * "ipfw_insn", and can contain extra fields depending on the * instruction type (listed below). * Note that the code is written so that individual instructions * have a size which is a multiple of 32 bits. This means that, if * such structures contain pointers or other 64-bit entities, * (there is just one instance now) they may end up unaligned on * 64-bit architectures, so they must be handled with care. * * "enum ipfw_opcodes" are the opcodes supported. We can have up * to 256 different opcodes.
*/ enum ipfw_opcodes { /* arguments (4 byte each) */ O_NOP, O_IP_SRC, /* u32 = IP */ O_IP_SRC_MASK, /* ip = IP/mask */ O_IP_SRC_ME, /* none */ O_IP_SRC_SET, /* u32=base, arg1=len, bitmap */ O_IP_DST, /* u32 = IP */ O_IP_DST_MASK, /* ip = IP/mask */ O_IP_DST_ME, /* none */ O_IP_DST_SET, /* u32=base, arg1=len, bitmap */ O_IP_SRCPORT, /* (n)port list:mask 4 byte ea */ O_IP_DSTPORT, /* (n)port list:mask 4 byte ea */ O_PROTO, /* arg1=protocol */ O_MACADDR2, /* 2 mac addr:mask */ O_MAC_TYPE, /* same as srcport */ O_LAYER2, /* none */ O_IN, /* none */ O_FRAG, /* none */ O_RECV, /* none */ O_XMIT, /* none */ O_VIA, /* none */ O_IPOPT, /* arg1 = 2*u8 bitmap */ O_IPLEN, /* arg1 = len */ O_IPID, /* arg1 = id */ O_IPTOS, /* arg1 = id */ O_IPPRECEDENCE, /* arg1 = precedence << 5 */ O_IPTTL, /* arg1 = TTL */ O_IPVER, /* arg1 = version */ O_UID, /* u32 = id */ O_GID, /* u32 = id */ O_ESTAB, /* none (tcp established) */ O_TCPFLAGS, /* arg1 = 2*u8 bitmap */ O_TCPWIN, /* arg1 = desired win */ O_TCPSEQ, /* u32 = desired seq. */ O_TCPACK, /* u32 = desired seq. */ O_ICMPTYPE, /* u32 = icmp bitmap */ O_TCPOPTS, /* arg1 = 2*u8 bitmap */ O_VERREVPATH, /* none */ O_PROBE_STATE, /* none */ O_KEEP_STATE, /* none */ O_LIMIT, /* ipfw_insn_limit */ O_LIMIT_PARENT, /* dyn_type, not an opcode. */ /* * These are really 'actions'. */ O_LOG, /* ipfw_insn_log */ O_PROB, /* u32 = match probability */ O_CHECK_STATE, /* none */ O_ACCEPT, /* none */ O_DENY, /* none */ O_REJECT, /* arg1=icmp arg (same as deny) */ O_COUNT, /* none */ O_SKIPTO, /* arg1=next rule number */ O_PIPE, /* arg1=pipe number */ O_QUEUE, /* arg1=queue number */ O_DIVERT, /* arg1=port number */ O_TEE, /* arg1=port number */ O_FORWARD_IP, /* fwd sockaddr */ O_FORWARD_MAC, /* fwd mac */ /* * More opcodes. */ O_IPSEC, /* has ipsec history */ + O_IP_SRC_LOOKUP, /* arg1=table number, u32=value */ + O_IP_DST_LOOKUP, /* arg1=table number, u32=value */ O_LAST_OPCODE /* not an opcode! */ }; /* * Template for instructions. 
* * ipfw_insn is used for all instructions which require no operands, * a single 16-bit value (arg1), or a couple of 8-bit values. * * For other instructions which require different/larger arguments * we have derived structures, ipfw_insn_*. * * The size of the instruction (in 32-bit words) is in the low * 6 bits of "len". The 2 remaining bits are used to implement * NOT and OR on individual instructions. Given a type, you can * compute the length to be put in "len" using F_INSN_SIZE(t) * * F_NOT negates the match result of the instruction. * * F_OR is used to build or blocks. By default, instructions * are evaluated as part of a logical AND. An "or" block * { X or Y or Z } contains F_OR set in all but the last * instruction of the block. A match will cause the code * to skip past the last instruction of the block. * * NOTA BENE: in a couple of places we assume that * sizeof(ipfw_insn) == sizeof(u_int32_t) * this needs to be fixed. * */ typedef struct _ipfw_insn { /* template for instructions */ enum ipfw_opcodes opcode:8; u_int8_t len; /* number of 32-bit words */ #define F_NOT 0x80 #define F_OR 0x40 #define F_LEN_MASK 0x3f #define F_LEN(cmd) ((cmd)->len & F_LEN_MASK) u_int16_t arg1; } ipfw_insn; /* * The F_INSN_SIZE(type) computes the size, in 4-byte words, of * a given type. */ #define F_INSN_SIZE(t) ((sizeof (t))/sizeof(u_int32_t)) /* * This is used to store an array of 16-bit entries (ports etc.) */ typedef struct _ipfw_insn_u16 { ipfw_insn o; u_int16_t ports[2]; /* there may be more */ } ipfw_insn_u16; /* * This is used to store an array of 32-bit entries * (uid, single IPv4 addresses etc.) */ typedef struct _ipfw_insn_u32 { ipfw_insn o; u_int32_t d[1]; /* one or more */ } ipfw_insn_u32; /* * This is used to store IP addr-mask pairs. */ typedef struct _ipfw_insn_ip { ipfw_insn o; struct in_addr addr; struct in_addr mask; } ipfw_insn_ip; /* * This is used to forward to a given address (ip).
*/ typedef struct _ipfw_insn_sa { ipfw_insn o; struct sockaddr_in sa; } ipfw_insn_sa; /* * This is used for MAC addr-mask pairs. */ typedef struct _ipfw_insn_mac { ipfw_insn o; u_char addr[12]; /* dst[6] + src[6] */ u_char mask[12]; /* dst[6] + src[6] */ } ipfw_insn_mac; /* * This is used for interface match rules (recv xx, xmit xx). */ typedef struct _ipfw_insn_if { ipfw_insn o; union { struct in_addr ip; int32_t unit; } p; char name[IFNAMSIZ]; } ipfw_insn_if; /* * This is used for pipe and queue actions, which need to store * a single pointer (which can have a different size on different * architectures). * Note that, because of previous instructions, pipe_ptr might * be unaligned in the overall structure, so it needs to be * manipulated with care. */ typedef struct _ipfw_insn_pipe { ipfw_insn o; void *pipe_ptr; /* XXX */ } ipfw_insn_pipe; /* * This is used for limit rules. */ typedef struct _ipfw_insn_limit { ipfw_insn o; u_int8_t _pad; u_int8_t limit_mask; /* combination of DYN_* below */ #define DYN_SRC_ADDR 0x1 #define DYN_SRC_PORT 0x2 #define DYN_DST_ADDR 0x4 #define DYN_DST_PORT 0x8 u_int16_t conn_limit; } ipfw_insn_limit; /* * This is used for log instructions. */ typedef struct _ipfw_insn_log { ipfw_insn o; u_int32_t max_log; /* how many do we log -- 0 = all */ u_int32_t log_left; /* how many left to log */ } ipfw_insn_log; /* * Here we have the structure representing an ipfw rule. * * It starts with a general area (with link fields and counters) * followed by an array of one or more instructions, which the code * accesses as an array of 32-bit values. * * Given a rule pointer r: * * r->cmd is the start of the first instruction. * ACTION_PTR(r) is the start of the first action (things to do * once a rule matched).
* * When assembling instruction, remember the following: * * + if a rule has a "keep-state" (or "limit") option, then the * first instruction (at r->cmd) MUST BE an O_PROBE_STATE * + if a rule has a "log" option, then the first action * (at ACTION_PTR(r)) MUST be O_LOG * * NOTE: we use a simple linked list of rules because we never need * to delete a rule without scanning the list. We do not use * queue(3) macros for portability and readability. */ struct ip_fw { struct ip_fw *next; /* linked list of rules */ struct ip_fw *next_rule; /* ptr to next [skipto] rule */ /* 'next_rule' is used to pass up 'set_disable' status */ u_int16_t act_ofs; /* offset of action in 32-bit units */ u_int16_t cmd_len; /* # of 32-bit words in cmd */ u_int16_t rulenum; /* rule number */ u_int8_t set; /* rule set (0..31) */ #define RESVD_SET 31 /* set for default and persistent rules */ u_int8_t _pad; /* padding */ /* These fields are present in all rules. */ u_int64_t pcnt; /* Packet counter */ u_int64_t bcnt; /* Byte counter */ u_int32_t timestamp; /* tv_sec of last match */ ipfw_insn cmd[1]; /* storage for commands */ }; #define ACTION_PTR(rule) \ (ipfw_insn *)( (u_int32_t *)((rule)->cmd) + ((rule)->act_ofs) ) #define RULESIZE(rule) (sizeof(struct ip_fw) + \ ((struct ip_fw *)(rule))->cmd_len * 4 - 4) /* * This structure is used as a flow mask and a flow id for various * parts of the code. */ struct ipfw_flow_id { u_int32_t dst_ip; u_int32_t src_ip; u_int16_t dst_port; u_int16_t src_port; u_int8_t proto; u_int8_t flags; /* protocol-specific flags */ }; /* * Dynamic ipfw rule. */ typedef struct _ipfw_dyn_rule ipfw_dyn_rule; struct _ipfw_dyn_rule { ipfw_dyn_rule *next; /* linked list of rules. 
*/ struct ip_fw *rule; /* pointer to rule */ /* 'rule' is used to pass up the rule number (from the parent) */ ipfw_dyn_rule *parent; /* pointer to parent rule */ u_int64_t pcnt; /* packet match counter */ u_int64_t bcnt; /* byte match counter */ struct ipfw_flow_id id; /* (masked) flow id */ u_int32_t expire; /* expire time */ u_int32_t bucket; /* which bucket in hash table */ u_int32_t state; /* state of this rule (typically a * combination of TCP flags) */ u_int32_t ack_fwd; /* most recent ACKs in forward */ u_int32_t ack_rev; /* and reverse directions (used */ /* to generate keepalives) */ u_int16_t dyn_type; /* rule type */ u_int16_t count; /* refcount */ }; /* * Definitions for IP option names. */ #define IP_FW_IPOPT_LSRR 0x01 #define IP_FW_IPOPT_SSRR 0x02 #define IP_FW_IPOPT_RR 0x04 #define IP_FW_IPOPT_TS 0x08 /* * Definitions for TCP option names. */ #define IP_FW_TCPOPT_MSS 0x01 #define IP_FW_TCPOPT_WINDOW 0x02 #define IP_FW_TCPOPT_SACK 0x04 #define IP_FW_TCPOPT_TS 0x08 #define IP_FW_TCPOPT_CC 0x10 #define ICMP_REJECT_RST 0x100 /* fake ICMP code (send a TCP RST) */ + +/* + * These are used for lookup tables. + */ +typedef struct _ipfw_table_entry { + in_addr_t addr; /* network address */ + u_int32_t value; /* value */ + u_int16_t tbl; /* table number */ + u_int8_t masklen; /* mask length */ +} ipfw_table_entry; + +typedef struct _ipfw_table { + u_int32_t size; /* size of entries in bytes */ + u_int32_t cnt; /* # of entries */ + u_int16_t tbl; /* table number */ + ipfw_table_entry ent[0]; /* entries */ +} ipfw_table; /* * Main firewall chains definitions and global var's definitions. */ #ifdef _KERNEL #define IP_FW_PORT_DYNT_FLAG 0x10000 #define IP_FW_PORT_TEE_FLAG 0x20000 #define IP_FW_PORT_DENY_FLAG 0x40000 /* * Arguments for calling ipfw_chk() and dummynet_io(). We put them * all into a structure because this way it is easier and more * efficient to pass variables around and extend the interface. 
*/ struct ip_fw_args { struct mbuf *m; /* the mbuf chain */ struct ifnet *oif; /* output interface */ struct sockaddr_in *next_hop; /* forward address */ struct ip_fw *rule; /* matching rule */ struct ether_header *eh; /* for bridged packets */ struct route *ro; /* for dummynet */ struct sockaddr_in *dst; /* for dummynet */ int flags; /* for dummynet */ struct ipfw_flow_id f_id; /* grabbed from IP header */ u_int16_t divert_rule; /* divert cookie */ u_int32_t retval; }; /* * Function definitions. */ /* Firewall hooks */ struct sockopt; struct dn_flow_set; void flush_pipe_ptrs(struct dn_flow_set *match); /* used by dummynet */ typedef int ip_fw_chk_t (struct ip_fw_args *args); typedef int ip_fw_ctl_t (struct sockopt *); extern ip_fw_chk_t *ip_fw_chk_ptr; extern ip_fw_ctl_t *ip_fw_ctl_ptr; extern int fw_one_pass; extern int fw_enable; #define IPFW_LOADED (ip_fw_chk_ptr != NULL) #endif /* _KERNEL */ #endif /* _IPFW2_H */ Index: stable/4/sys/netinet/raw_ip.c =================================================================== --- stable/4/sys/netinet/raw_ip.c (revision 130570) +++ stable/4/sys/netinet/raw_ip.c (revision 130571) @@ -1,720 +1,725 @@ /* * Copyright (c) 1982, 1986, 1988, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. 
* 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)raw_ip.c 8.7 (Berkeley) 5/15/95 * $FreeBSD$ */ #include "opt_inet6.h" #include "opt_ipsec.h" #include "opt_random_ip_id.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #define _IP_VHL #include #include #include #include #include #include #include #include #include #ifdef FAST_IPSEC #include #endif /*FAST_IPSEC*/ #ifdef IPSEC #include #endif /*IPSEC*/ struct inpcbhead ripcb; struct inpcbinfo ripcbinfo; /* control hooks for ipfw and dummynet */ ip_fw_ctl_t *ip_fw_ctl_ptr; ip_dn_ctl_t *ip_dn_ctl_ptr; /* * hooks for multicast routing. They all default to NULL, * so leave them not initialized and rely on BSS being set to 0. */ /* The socket used to communicate with the multicast routing daemon. 
*/ struct socket *ip_mrouter; /* The various mrouter and rsvp functions */ int (*ip_mrouter_set)(struct socket *, struct sockopt *); int (*ip_mrouter_get)(struct socket *, struct sockopt *); int (*ip_mrouter_done)(void); int (*ip_mforward)(struct ip *, struct ifnet *, struct mbuf *, struct ip_moptions *); int (*mrt_ioctl)(int, caddr_t); int (*legal_vif_num)(int); u_long (*ip_mcast_src)(int); void (*rsvp_input_p)(struct mbuf *m, int off, int proto); int (*ip_rsvp_vif)(struct socket *, struct sockopt *); void (*ip_rsvp_force_done)(struct socket *); /* * Nominal space allocated to a raw ip socket. */ #define RIPSNDQ 8192 #define RIPRCVQ 8192 /* * Raw interface to IP protocol. */ /* * Initialize raw connection block queue. */ void rip_init(void) { LIST_INIT(&ripcb); ripcbinfo.listhead = &ripcb; /* * XXX We don't use the hash list for raw IP, but it's easier * to allocate a one entry hash list than it is to check all * over the place for hashbase == NULL. */ ripcbinfo.hashbase = hashinit(1, M_PCB, &ripcbinfo.hashmask); ripcbinfo.porthashbase = hashinit(1, M_PCB, &ripcbinfo.porthashmask); ripcbinfo.ipi_zone = zinit("ripcb", sizeof(struct inpcb), maxsockets, ZONE_INTERRUPT, 0); } /* * XXX ripsrc is modified in rip_input, so we must fix this * when we want to make this code smp-friendly. */ static struct sockaddr_in ripsrc = { sizeof(ripsrc), AF_INET }; /* * Setup generic address and protocol structures * for raw_input routine, then pass them along with * mbuf chain.
*/ void rip_input(struct mbuf *m, int off, int proto) { struct ip *ip = mtod(m, struct ip *); struct inpcb *inp; struct inpcb *last = NULL; struct mbuf *opts = NULL; ripsrc.sin_addr = ip->ip_src; LIST_FOREACH(inp, &ripcb, inp_list) { #ifdef INET6 if ((inp->inp_vflag & INP_IPV4) == 0) continue; #endif if (inp->inp_ip_p && inp->inp_ip_p != proto) continue; if (inp->inp_laddr.s_addr != INADDR_ANY && inp->inp_laddr.s_addr != ip->ip_dst.s_addr) continue; if (inp->inp_faddr.s_addr != INADDR_ANY && inp->inp_faddr.s_addr != ip->ip_src.s_addr) continue; if (last) { struct mbuf *n = m_copypacket(m, M_DONTWAIT); #ifdef IPSEC /* check AH/ESP integrity. */ if (n && ipsec4_in_reject_so(n, last->inp_socket)) { m_freem(n); ipsecstat.in_polvio++; /* do not inject data to pcb */ } else #endif /*IPSEC*/ #ifdef FAST_IPSEC /* check AH/ESP integrity. */ if (ipsec4_in_reject(n, last)) { m_freem(n); /* do not inject data to pcb */ } else #endif /*FAST_IPSEC*/ if (n) { if (last->inp_flags & INP_CONTROLOPTS || last->inp_socket->so_options & SO_TIMESTAMP) ip_savecontrol(last, &opts, ip, n); if (sbappendaddr(&last->inp_socket->so_rcv, (struct sockaddr *)&ripsrc, n, opts) == 0) { /* should notify about lost packet */ m_freem(n); if (opts) m_freem(opts); } else sorwakeup(last->inp_socket); opts = 0; } } last = inp; } #ifdef IPSEC /* check AH/ESP integrity. */ if (last && ipsec4_in_reject_so(m, last->inp_socket)) { m_freem(m); ipsecstat.in_polvio++; ipstat.ips_delivered--; /* do not inject data to pcb */ } else #endif /*IPSEC*/ #ifdef FAST_IPSEC /* check AH/ESP integrity. 
*/ if (last && ipsec4_in_reject(m, last)) { m_freem(m); ipstat.ips_delivered--; /* do not inject data to pcb */ } else #endif /*FAST_IPSEC*/ if (last) { if (last->inp_flags & INP_CONTROLOPTS || last->inp_socket->so_options & SO_TIMESTAMP) ip_savecontrol(last, &opts, ip, m); if (sbappendaddr(&last->inp_socket->so_rcv, (struct sockaddr *)&ripsrc, m, opts) == 0) { m_freem(m); if (opts) m_freem(opts); } else sorwakeup(last->inp_socket); } else { m_freem(m); ipstat.ips_noproto++; ipstat.ips_delivered--; } } /* * Generate IP header and pass packet to ip_output. * Tack on options user may have setup with control call. */ int rip_output(struct mbuf *m, struct socket *so, u_long dst) { struct ip *ip; struct inpcb *inp = sotoinpcb(so); int flags = (so->so_options & SO_DONTROUTE) | IP_ALLOWBROADCAST; /* * If the user handed us a complete IP packet, use it. * Otherwise, allocate an mbuf for a header and fill it in. */ if ((inp->inp_flags & INP_HDRINCL) == 0) { if (m->m_pkthdr.len + sizeof(struct ip) > IP_MAXPACKET) { m_freem(m); return(EMSGSIZE); } M_PREPEND(m, sizeof(struct ip), M_WAIT); if (m == NULL) return(ENOBUFS); ip = mtod(m, struct ip *); ip->ip_tos = inp->inp_ip_tos; ip->ip_off = 0; ip->ip_p = inp->inp_ip_p; ip->ip_len = m->m_pkthdr.len; ip->ip_src = inp->inp_laddr; ip->ip_dst.s_addr = dst; ip->ip_ttl = inp->inp_ip_ttl; } else { if (m->m_pkthdr.len > IP_MAXPACKET) { m_freem(m); return(EMSGSIZE); } ip = mtod(m, struct ip *); /* don't allow both user specified and setsockopt options, and don't allow packet length sizes that will crash */ if (((IP_VHL_HL(ip->ip_vhl) != (sizeof (*ip) >> 2)) && inp->inp_options) || (ip->ip_len > m->m_pkthdr.len) || (ip->ip_len < (IP_VHL_HL(ip->ip_vhl) << 2))) { m_freem(m); return EINVAL; } if (ip->ip_id == 0) #ifdef RANDOM_IP_ID ip->ip_id = ip_randomid(); #else ip->ip_id = htons(ip_id++); #endif /* XXX prevent ip_output from overwriting header fields */ flags |= IP_RAWOUTPUT; ipstat.ips_rawout++; } if (inp->inp_flags & INP_ONESBCAST) flags 
|= IP_SENDONES; return (ip_output(m, inp->inp_options, &inp->inp_route, flags, inp->inp_moptions, inp)); } /* * Raw IP socket option processing. */ int rip_ctloutput(struct socket *so, struct sockopt *sopt) { struct inpcb *inp = sotoinpcb(so); int error, optval; if (sopt->sopt_level != IPPROTO_IP) return (EINVAL); error = 0; switch (sopt->sopt_dir) { case SOPT_GET: switch (sopt->sopt_name) { case IP_HDRINCL: optval = inp->inp_flags & INP_HDRINCL; error = sooptcopyout(sopt, &optval, sizeof optval); break; case IP_FW_ADD: /* ADD actually returns the body... */ case IP_FW_GET: + case IP_FW_TABLE_GETSIZE: + case IP_FW_TABLE_LIST: if (IPFW_LOADED) error = ip_fw_ctl_ptr(sopt); else error = ENOPROTOOPT; break; case IP_DUMMYNET_GET: if (DUMMYNET_LOADED) error = ip_dn_ctl_ptr(sopt); else error = ENOPROTOOPT; break ; case MRT_INIT: case MRT_DONE: case MRT_ADD_VIF: case MRT_DEL_VIF: case MRT_ADD_MFC: case MRT_DEL_MFC: case MRT_VERSION: case MRT_ASSERT: case MRT_API_SUPPORT: case MRT_API_CONFIG: case MRT_ADD_BW_UPCALL: case MRT_DEL_BW_UPCALL: error = ip_mrouter_get ? ip_mrouter_get(so, sopt) : EOPNOTSUPP; break; default: error = ip_ctloutput(so, sopt); break; } break; case SOPT_SET: switch (sopt->sopt_name) { case IP_HDRINCL: error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; if (optval) inp->inp_flags |= INP_HDRINCL; else inp->inp_flags &= ~INP_HDRINCL; break; case IP_FW_ADD: case IP_FW_DEL: case IP_FW_FLUSH: case IP_FW_ZERO: case IP_FW_RESETLOG: + case IP_FW_TABLE_ADD: + case IP_FW_TABLE_DEL: + case IP_FW_TABLE_FLUSH: if (IPFW_LOADED) error = ip_fw_ctl_ptr(sopt); else error = ENOPROTOOPT; break; case IP_DUMMYNET_CONFIGURE: case IP_DUMMYNET_DEL: case IP_DUMMYNET_FLUSH: if (DUMMYNET_LOADED) error = ip_dn_ctl_ptr(sopt); else error = ENOPROTOOPT ; break ; case IP_RSVP_ON: error = ip_rsvp_init(so); break; case IP_RSVP_OFF: error = ip_rsvp_done(); break; case IP_RSVP_VIF_ON: case IP_RSVP_VIF_OFF: error = ip_rsvp_vif ? 
ip_rsvp_vif(so, sopt) : EINVAL; break; case MRT_INIT: case MRT_DONE: case MRT_ADD_VIF: case MRT_DEL_VIF: case MRT_ADD_MFC: case MRT_DEL_MFC: case MRT_VERSION: case MRT_ASSERT: case MRT_API_SUPPORT: case MRT_API_CONFIG: case MRT_ADD_BW_UPCALL: case MRT_DEL_BW_UPCALL: error = ip_mrouter_set ? ip_mrouter_set(so, sopt) : EOPNOTSUPP; break; default: error = ip_ctloutput(so, sopt); break; } break; } return (error); } /* * This function exists solely to receive the PRC_IFDOWN messages which * are sent by if_down(). It looks for an ifaddr whose ifa_addr is sa, * and calls in_ifadown() to remove all routes corresponding to that address. * It also receives the PRC_IFUP messages from if_up() and reinstalls the * interface routes. */ void rip_ctlinput(int cmd, struct sockaddr *sa, void *vip) { struct in_ifaddr *ia; struct ifnet *ifp; int err; int flags; switch (cmd) { case PRC_IFDOWN: TAILQ_FOREACH(ia, &in_ifaddrhead, ia_link) { if (ia->ia_ifa.ifa_addr == sa && (ia->ia_flags & IFA_ROUTE)) { /* * in_ifscrub kills the interface route. */ in_ifscrub(ia->ia_ifp, ia); /* * in_ifadown gets rid of all the rest of * the routes. This is not quite the right * thing to do, but at least if we are running * a routing process they will come back. 
*/ in_ifadown(&ia->ia_ifa, 0); break; } } break; case PRC_IFUP: TAILQ_FOREACH(ia, &in_ifaddrhead, ia_link) { if (ia->ia_ifa.ifa_addr == sa) break; } if (ia == 0 || (ia->ia_flags & IFA_ROUTE)) return; flags = RTF_UP; ifp = ia->ia_ifa.ifa_ifp; if ((ifp->if_flags & IFF_LOOPBACK) || (ifp->if_flags & IFF_POINTOPOINT)) flags |= RTF_HOST; err = rtinit(&ia->ia_ifa, RTM_ADD, flags); if (err == 0) ia->ia_flags |= IFA_ROUTE; break; } } u_long rip_sendspace = RIPSNDQ; u_long rip_recvspace = RIPRCVQ; SYSCTL_INT(_net_inet_raw, OID_AUTO, maxdgram, CTLFLAG_RW, &rip_sendspace, 0, "Maximum outgoing raw IP datagram size"); SYSCTL_INT(_net_inet_raw, OID_AUTO, recvspace, CTLFLAG_RW, &rip_recvspace, 0, "Maximum incoming raw IP datagram size"); static int rip_attach(struct socket *so, int proto, struct proc *p) { struct inpcb *inp; int error, s; inp = sotoinpcb(so); if (inp) panic("rip_attach"); if (p && (error = suser(p)) != 0) return error; error = soreserve(so, rip_sendspace, rip_recvspace); if (error) return error; s = splnet(); error = in_pcballoc(so, &ripcbinfo, p); splx(s); if (error) return error; inp = (struct inpcb *)so->so_pcb; inp->inp_vflag |= INP_IPV4; inp->inp_ip_p = proto; inp->inp_ip_ttl = ip_defttl; return 0; } static int rip_detach(struct socket *so) { struct inpcb *inp; inp = sotoinpcb(so); if (inp == 0) panic("rip_detach"); if (so == ip_mrouter && ip_mrouter_done) ip_mrouter_done(); if (ip_rsvp_force_done) ip_rsvp_force_done(so); if (so == ip_rsvpd) ip_rsvp_done(); in_pcbdetach(inp); return 0; } static int rip_abort(struct socket *so) { soisdisconnected(so); if (so->so_state & SS_NOFDREF) return rip_detach(so); return 0; } static int rip_disconnect(struct socket *so) { if ((so->so_state & SS_ISCONNECTED) == 0) return ENOTCONN; return rip_abort(so); } static int rip_bind(struct socket *so, struct sockaddr *nam, struct proc *p) { struct inpcb *inp = sotoinpcb(so); struct sockaddr_in *addr = (struct sockaddr_in *)nam; if (nam->sa_len != sizeof(*addr)) return EINVAL; if 
(TAILQ_EMPTY(&ifnet) || ((addr->sin_family != AF_INET) && (addr->sin_family != AF_IMPLINK)) || (addr->sin_addr.s_addr != INADDR_ANY && ifa_ifwithaddr((struct sockaddr *)addr) == 0)) return EADDRNOTAVAIL; inp->inp_laddr = addr->sin_addr; return 0; } static int rip_connect(struct socket *so, struct sockaddr *nam, struct proc *p) { struct inpcb *inp = sotoinpcb(so); struct sockaddr_in *addr = (struct sockaddr_in *)nam; if (nam->sa_len != sizeof(*addr)) return EINVAL; if (TAILQ_EMPTY(&ifnet)) return EADDRNOTAVAIL; if ((addr->sin_family != AF_INET) && (addr->sin_family != AF_IMPLINK)) return EAFNOSUPPORT; inp->inp_faddr = addr->sin_addr; soisconnected(so); return 0; } static int rip_shutdown(struct socket *so) { socantsendmore(so); return 0; } static int rip_send(struct socket *so, int flags, struct mbuf *m, struct sockaddr *nam, struct mbuf *control, struct proc *p) { struct inpcb *inp = sotoinpcb(so); u_long dst; if (so->so_state & SS_ISCONNECTED) { if (nam) { m_freem(m); return EISCONN; } dst = inp->inp_faddr.s_addr; } else { if (nam == NULL) { m_freem(m); return ENOTCONN; } dst = ((struct sockaddr_in *)nam)->sin_addr.s_addr; } return rip_output(m, so, dst); } static int rip_pcblist(SYSCTL_HANDLER_ARGS) { int error, i, n, s; struct inpcb *inp, **inp_list; inp_gen_t gencnt; struct xinpgen xig; /* * The process of preparing the TCB list is too time-consuming and * resource-intensive to repeat twice on every request. */ if (req->oldptr == 0) { n = ripcbinfo.ipi_count; req->oldidx = 2 * (sizeof xig) + (n + n/8) * sizeof(struct xinpcb); return 0; } if (req->newptr != 0) return EPERM; /* * OK, now we're committed to doing something. 
*/ s = splnet(); gencnt = ripcbinfo.ipi_gencnt; n = ripcbinfo.ipi_count; splx(s); xig.xig_len = sizeof xig; xig.xig_count = n; xig.xig_gen = gencnt; xig.xig_sogen = so_gencnt; error = SYSCTL_OUT(req, &xig, sizeof xig); if (error) return error; inp_list = malloc(n * sizeof *inp_list, M_TEMP, M_WAITOK); if (inp_list == 0) return ENOMEM; s = splnet(); for (inp = LIST_FIRST(ripcbinfo.listhead), i = 0; inp && i < n; inp = LIST_NEXT(inp, inp_list)) { if (inp->inp_gencnt <= gencnt) inp_list[i++] = inp; } splx(s); n = i; error = 0; for (i = 0; i < n; i++) { inp = inp_list[i]; if (inp->inp_gencnt <= gencnt) { struct xinpcb xi; xi.xi_len = sizeof xi; /* XXX should avoid extra copy */ bcopy(inp, &xi.xi_inp, sizeof *inp); if (inp->inp_socket) sotoxsocket(inp->inp_socket, &xi.xi_socket); error = SYSCTL_OUT(req, &xi, sizeof xi); } } if (!error) { /* * Give the user an updated idea of our state. * If the generation differs from what we told * her before, she knows that something happened * while we were processing this request, and it * might be necessary to retry. */ s = splnet(); xig.xig_gen = ripcbinfo.ipi_gencnt; xig.xig_sogen = so_gencnt; xig.xig_count = ripcbinfo.ipi_count; splx(s); error = SYSCTL_OUT(req, &xig, sizeof xig); } free(inp_list, M_TEMP); return error; } SYSCTL_PROC(_net_inet_raw, OID_AUTO/*XXX*/, pcblist, CTLFLAG_RD, 0, 0, rip_pcblist, "S,xinpcb", "List of active raw IP sockets"); struct pr_usrreqs rip_usrreqs = { rip_abort, pru_accept_notsupp, rip_attach, rip_bind, rip_connect, pru_connect2_notsupp, in_control, rip_detach, rip_disconnect, pru_listen_notsupp, in_setpeeraddr, pru_rcvd_notsupp, pru_rcvoob_notsupp, rip_send, pru_sense_null, rip_shutdown, in_setsockaddr, sosend, soreceive, sopoll };
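As a reading aid for the `rip_output` hunk above, the following userland sketch mirrors the sanity checks applied when the caller supplies its own IP header (`INP_HDRINCL` set). It is a minimal sketch under stated assumptions, not the kernel code: `struct sketch_hdr`, `hdrincl_check`, `hdrincl_selfcheck`, and the `SKETCH_*` constants are hypothetical names introduced here, and `ip_len` is assumed to be in host byte order, as the comparison against `m->m_pkthdr.len` in the hunk suggests.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed stand-ins for IP_MAXPACKET and (sizeof(struct ip) >> 2). */
#define SKETCH_IP_MAXPACKET 65535
#define SKETCH_MIN_HL_WORDS 5

/* Hypothetical, reduced view of the two header fields the checks read. */
struct sketch_hdr {
	unsigned hl_words;	/* header length in 32-bit words */
	unsigned ip_len;	/* total datagram length, host order (assumed) */
};

/*
 * Returns 0 if the user-supplied header would pass the checks,
 * EMSGSIZE if the packet is too large, EINVAL otherwise.
 * has_sockopts models a non-NULL inp->inp_options.
 */
int
hdrincl_check(const struct sketch_hdr *h, size_t pkt_len, int has_sockopts)
{
	if (pkt_len > SKETCH_IP_MAXPACKET)
		return (EMSGSIZE);

	/*
	 * Reject: header options in the packet combined with setsockopt
	 * options, a claimed length larger than the data supplied, or a
	 * claimed length smaller than the header itself.
	 */
	if ((h->hl_words != SKETCH_MIN_HL_WORDS && has_sockopts) ||
	    h->ip_len > pkt_len ||
	    h->ip_len < (h->hl_words << 2))
		return (EINVAL);

	return (0);
}

/* Small self-check exercising each branch of the sketch. */
void
hdrincl_selfcheck(void)
{
	struct sketch_hdr ok = { 5, 40 };
	struct sketch_hdr with_opts = { 6, 40 };
	struct sketch_hdr too_short = { 5, 16 };

	assert(hdrincl_check(&ok, 40, 0) == 0);
	assert(hdrincl_check(&with_opts, 40, 1) == EINVAL);
	assert(hdrincl_check(&with_opts, 40, 0) == 0);
	assert(hdrincl_check(&ok, 20, 0) == EINVAL);
	assert(hdrincl_check(&too_short, 40, 0) == EINVAL);
	assert(hdrincl_check(&ok, 70000, 0) == EMSGSIZE);
}
```

Calling `hdrincl_selfcheck()` from a test harness exercises each rejection path. The sketch keeps the order of the tests in the hunk, so an oversized packet is reported as `EMSGSIZE` before any header field is inspected.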