freebsd-dev

Author	SHA1	Message	Date
Andrey V. Elsukov	90ecb41fba	Add IPv6 support for O_IPLEN opcode. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2019-04-29 09:33:16 +00:00
Gleb Smirnoff	97245d4074	Always create ipfw(4) hooks as long as module is loaded. Now enabling ipfw(4) with sysctls controls only linkage of hooks to default heads. When module is loaded fetch sysctls as tunables, to make it possible to boot with ipfw(4) in kernel, but not linked to any pfil(9) hooks.	2019-03-21 16:15:29 +00:00
Gleb Smirnoff	f355cb3e6f	PFIL_MEMPTR for ipfw link level hook With new pfil(9) KPI it is possible to pass a void pointer with length instead of mbuf pointer to a packet filter. Until this commit no filters supported that, so pfil run through a shim function pfil_fake_mbuf(). Now the ipfw(4) hook named "default-link", that is instantiated when net.link.ether.ipfw sysctl is on, supports processing pointer/length packets natively. - ip_fw_args now has union for either mbuf or void , and if flags have non-zero length, then we use the void . - through ipfw_chk() we handle mem/mbuf cases differently. - ether_header goes away from args. It is ipfw_chk() responsibility to do parsing of Ethernet header. - ipfw_log() now uses different bpf APIs to log packets. Although ipfw_chk() is now capable to process pointer/length packets, this commit adds support for the link level hook only, see ipfw_check_frame(). Potentially the IP processing hook ipfw_check_packet() can be improved too, but that requires more changes since the hook supports more complex actions: NAT, divert, etc. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D19357	2019-03-14 22:52:16 +00:00
Gleb Smirnoff	dc0fa4f712	Remove 'dir' argument from dummynet_io(). This makes it possible to make dn_dir flags private to dummynet. There is still some room for improvement.	2019-03-14 22:32:50 +00:00
Gleb Smirnoff	b7795b6746	- Add more flags to ip_fw_args. At this changeset only IPFW_ARGS_IN and IPFW_ARGS_OUT are utilized. They are intented to substitute the "dir" parameter that is often passes together with args. - Rename ip_fw_args.oif to ifp and now it is set to either input or output interface, depending on IPFW_ARGS_IN/OUT bit set.	2019-03-14 22:28:50 +00:00
Gleb Smirnoff	f712b16127	Revert r316461: Remove "IPFW static rules" rmlock, and use pfil's global lock. The pfil(9) system is about to be converted to epoch(9) synchronization, so we need [temporarily] go back with ipfw internal locking. Discussed with: ae	2019-01-31 21:04:50 +00:00
Andrey V. Elsukov	7664b71b62	Fix the bug introduced in r342908, that causes problems with dynamic handling for protocols without ports numbers. Since port numbers were uninitialized for protocols like ICMP/ICMPv6, ipfw_chk() used some non-zero values to create dynamic states, and due this it failed to match replies with created states. Reported by: Oliver Hartmann, Boris Lytochkin Obtained from: Yandex LLC X-MFC after: r342908	2019-01-29 11:18:41 +00:00
Andrey V. Elsukov	48266154de	Relax requirement to packet size of CARP protocol and remove version check. CARP shares protocol number 112 with VRRP (RFC 5798). And the size of VRRP packet may be smaller than CARP. ipfw_chk() does m_pullup() to at least sizeof(struct carp_header) and can fail when packet is VRRP. This leads to packet drop and message about failed pullup attempt. Also, RFC 5798 defines version 3 of VRRP protocol, this version number also unsupported by CARP and such check leads to packet drop. carp_input() does its own checks for protocol version and packet size, so we can remove these checks to be able pass VRRP packets. PR: 234207 MFC after: 1 week	2019-01-11 01:54:15 +00:00
Andrey V. Elsukov	1cdf23bc03	Reduce the size of struct ip_fw_args from 240 to 128 bytes on amd64. And refactor the code to avoid unneeded initialization to reduce overhead of per-packet processing. ipfw(4) can be invoked by pfil(9) framework for each packet several times. Each call uses on-stack variable of type struct ip_fw_args to keep the state of ipfw(4) processing. Currently this variable has 240 bytes size on amd64. Each time ipfw(4) does bzero() on it, and then it initializes some fields. glebius@ has reported that they at Netflix discovered, that initialization of this variable produces significant overhead on packet processing. After patching I managed to increase performance of packet processing on simple routing with ipfw(4) firewalling to about 11% from 9.8Mpps up to 11Mpps (Xeon E5-2660 v4@ + Mellanox 100G card). Introduced new field flags, it is used to keep track of what fields was initialized. Some fields were moved into the anonymous union, to reduce the size. They all are mutually exclusive. dummypar field was unused, and therefore it is removed. The hopstore6 field type was changed from sockaddr_in6 to a bit smaller struct ip_fw_nh6. And now the size of struct ip_fw_args is 128 bytes. ipfw_chk() was modified to properly handle ip_fw_args.flags instead of rely on checking for NULL pointers. Reviewed by: gallatin Obtained from: Yandex LLC MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D18690	2019-01-10 01:47:57 +00:00
Andrey V. Elsukov	986368d85d	Add extra parentheses to fix "versrcreach" opcode, (oif != NULL) should not be used as condition for ternary operator. Submitted by: Tatsuki Makino <tatsuki_makino at hotmail dot com> Approved by: re (kib) MFC after: 1 week	2018-10-15 10:25:34 +00:00
Andrew Turner	5f901c92a8	Use the new VNET_DEFINE_STATIC macro when we are defining static VNET variables. Reviewed by: bz Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16147	2018-07-24 16:35:52 +00:00
Andrey V. Elsukov	f7c4fdee1a	Add "record-state", "set-limit" and "defer-action" rule options to ipfw. "record-state" is similar to "keep-state", but it doesn't produce implicit O_PROBE_STATE opcode in a rule. "set-limit" is like "limit", but it has the same feature as "record-state", it is single opcode without implicit O_PROBE_STATE opcode. "defer-action" is targeted to be used with dynamic states. When rule with this opcode is matched, the rule's action will not be executed, instead dynamic state will be created. And when this state will be matched by "check-state", then rule action will be executed. This allows create a more complicated rulesets. Submitted by: lev MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D1776	2018-07-09 11:35:18 +00:00
Matt Macy	d7c5a620e2	ifnet: Replace if_addr_lock rwlock with epoch + mutex Run on LLNW canaries and tested by pho@ gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace. When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch: InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32 4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32 4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32 4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32 4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32 4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32 4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32 After the patch InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51 5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51 5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51 5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51 5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52 5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52 Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366	2018-05-18 20:13:34 +00:00
Andrey V. Elsukov	12c080e613	Do not try to reassemble IPv6 fragments in "reass" rule. ip_reass() expects IPv4 packet and will just corrupt any IPv6 packets that it gets. Until proper IPv6 fragments handling function will be implemented, pass IPv6 packets to next rule. PR: 170604 MFC after: 1 week	2018-03-12 09:40:46 +00:00
Andrey V. Elsukov	b99a682320	Rework ipfw dynamic states implementation to be lockless on fast path. o added struct ipfw_dyn_info that keeps all needed for ipfw_chk and for dynamic states implementation information; o added DYN_LOOKUP_NEEDED() macro that can be used to determine the need of new lookup of dynamic states; o ipfw_dyn_rule now becomes obsolete. Currently it used to pass information from kernel to userland only. o IPv4 and IPv6 states now described by different structures dyn_ipv4_state and dyn_ipv6_state; o IPv6 scope zones support is added; o ipfw(4) now depends from Concurrency Kit; o states are linked with "entry" field using CK_SLIST. This allows lockless lookup and protected by mutex modifications. o the "expired" SLIST field is used for states expiring. o struct dyn_data is used to keep generic information for both IPv4 and IPv6; o struct dyn_parent is used to keep O_LIMIT_PARENT information; o IPv4 and IPv6 states are stored in different hash tables; o O_LIMIT_PARENT states now are kept separately from O_LIMIT and O_KEEP_STATE states; o per-cpu dyn_hp pointers are used to implement hazard pointers and they prevent freeing states that are locklessly used by lookup threads; o mutexes to protect modification of lists in hash tables now kept in separate arrays. 65535 limit to maximum number of hash buckets now removed. o Separate lookup and install functions added for IPv4 and IPv6 states and for parent states. o By default now is used Jenkinks hash function. Obtained from: Yandex LLC MFC after: 42 days Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D12685	2018-02-07 18:59:54 +00:00
Andrey V. Elsukov	14a6bab1da	When IPv6 packet is handled by O_REJECT opcode, convert ICMP code specified in the arg1 into ICMPv6 destination unreachable code according to RFC7915. Obtained from: Yandex LLC MFC after: 2 weeks Sponsored by: Yandex LLC	2018-01-24 12:40:28 +00:00
Andrey V. Elsukov	d38344208e	Add UDPLite support to ipfw(4). Now it is possible to use UDPLite's port numbers in rules, create dynamic states for UDPLite packets and see "UDPLite" for matched packets in log. Obtained from: Yandex LLC MFC after: 2 weeks Sponsored by: Yandex LLC	2018-01-19 12:50:03 +00:00
Pedro F. Giffuni	fe267a5590	sys: general adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. No functional change intended.	2017-11-27 15:23:17 +00:00
Michael Tuexen	665c8a2ee5	Add to ipfw support for sending an SCTP packet containing an ABORT chunk. This is similar to the TCP case. where a TCP RST segment can be sent. There is one limitation: When sending an ABORT in response to an incoming packet, it should be tested if there is no ABORT chunk in the received packet. Currently, it is only checked if the first chunk is an ABORT chunk to avoid parsing the whole packet, which could result in a DOS attack. Thanks to Timo Voelker for helping me to test this patch. Reviewed by: bcr@ (man page part), ae@ (generic, non-SCTP part) Differential Revision: https://reviews.freebsd.org/D13239	2017-11-26 18:19:01 +00:00
Andrey V. Elsukov	1719df1bb4	Modify ipfw's dynamic states KPI. Hide the locking logic used in the dynamic states implementation from generic code. Rename ipfw_install_state() and ipfw_lookup_dyn_rule() function to have similar names: ipfw_dyn_install_state() and ipfw_dyn_lookup_state(). Move dynamic rule counters updating to the ipfw_dyn_lookup_state() function. Now this function return NULL when there is no state and pointer to the parent rule when state is found. Thus now there is no need to return pointer to dynamic rule, and no need to hold bucket lock for this state. Remove ipfw_dyn_unlock() function. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D11657	2017-11-23 08:02:02 +00:00
Andrey V. Elsukov	30df59d581	Move ipfw_send_pkt() from ip_fw_dynamic.c into ip_fw2.c. It is not specific for dynamic states function and called also from generic code. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2017-11-23 06:04:57 +00:00
Andrey V. Elsukov	7143bb7626	Add ipfw_add_protected_rule() function that creates rule with 65535 number in the reserved set 31. Use this function to create default rule. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2017-11-22 05:49:21 +00:00
Andrey V. Elsukov	66f84fabb3	Add comment for accidentally committed unrelated change in r325960. Do not invoke IPv4 NAT handler for non IPv4 packets. Libalias expects a packet is IPv4. And in case when it is IPv6, it just translates them as IPv4. This leads to corruption and in some cases to panics. In particular a panic can happen when value of ip6_plen modified to something that leads to IP fragmentation, but actual packet length does not match the IP length. Packets that are not IPv4 will be dropped by NAT rule. Reported by: Viktor Dukhovni <freebsd at dukhovni dot org> MFC after: 1 week	2017-11-17 23:25:06 +00:00
Andrey V. Elsukov	e11f0a0c4c	Unconditionally enable support for O_IPSEC opcode. IPsec support can be loaded as kernel module, thus do not depend from kernel option IPSEC and always build O_IPSEC opcode implementation as enabled. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2017-11-17 22:40:02 +00:00
Andrey V. Elsukov	5c70ebfa57	Add IPv6 support for O_TCPDATALEN opcode. PR: 222746 MFC after: 1 week	2017-10-24 08:39:05 +00:00
Andrey V. Elsukov	ff0a137952	Fix regression in handling O_FORWARD_IP opcode after r279948. To properly handle 'fwd tablearg,port' opcode, copy sin_port value from sockaddr_in structure stored in the opcode into corresponding hopstore field. PR: 222953 MFC after: 1 week	2017-10-13 11:11:53 +00:00
Michael Tuexen	945906384d	Fix a bug which avoided that rules for matching port numbers for SCTP packets where actually matched. While there, make clean in the man-page that SCTP port numbers are supported in rules. MFC after: 1 month	2017-10-02 18:25:30 +00:00
Andrey V. Elsukov	5df8171da3	Use in_localip() function instead of unlocked access to addresses hash to determine that an address is our local. PR: 220078 MFC after: 1 week	2017-09-20 22:35:28 +00:00
Philip Paeps	b0e1660d53	Fix GRE over IPv6 tunnels with IPFW Previously, GRE packets in IPv6 tunnels would be dropped by IPFW (unless net.inet6.ip6.fw.deny_unknown_exthdrs was unset). PR: 220640 Submitted by: Kun Xie <kxie@xiplink.com> MFC after: 1 week	2017-07-13 09:01:22 +00:00
Andrey V. Elsukov	88d950a650	Remove "IPFW static rules" rmlock. Make PFIL's lock global and use it for this purpose. This reduces the number of locks needed to acquire for each packet. Obtained from: Yandex LLC MFC after: 2 weeks Sponsored by: Yandex LLC No objection from: #network Differential Revision: https://reviews.freebsd.org/D10154	2017-04-03 13:35:04 +00:00
Andrey V. Elsukov	788e62864f	Reset the cached state of last lookup in the dynamic states when an external action is completed, but the rule search is continued. External action handler can change the content of @args argument, that is used for dynamic state lookup. Enforce the new lookup to be able install new state, when the search is continued. Obtained from: Yandex LLC MFC after: 1 week Sponsored by: Yandex LLC	2017-03-31 09:26:08 +00:00
Andrey V. Elsukov	54e5669d8c	Add IPv6 support to O_IP_DST_LOOKUP opcode. o check the size of O_IP_SRC_LOOKUP opcode, it can not exceed the size of ipfw_insn_u32; o rename ipfw_lookup_table_extended() function into ipfw_lookup_table() and remove old ipfw_lookup_table(); o use args->f_id.flow_id6 that is in host byte order to get DSCP value; o add SCTP ports support to 'lookup src/dst-port' opcode; o add IPv6 support to 'lookup src/dst-ip' opcode. PR: 217292 Reviewed by: melifaro MFC after: 2 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9873	2017-03-05 23:48:24 +00:00
Andrey V. Elsukov	43b294a4db	Fix matching table entry value. Use real table value instead of its index in valuestate array. When opcode has size equal to ipfw_insn_u32, this means that it should additionally match value specified in d[0] with table entry value. ipfw_table_lookup() returns table value index, use TARG_VAL() macro to convert it to its value. The actual 32-bit value stored in the tag field of table_value structure, where all unspecified u32 values are kept. PR: 217262 Reviewed by: melifaro MFC after: 1 week Sponsored by: Yandex LLC	2017-03-03 20:22:42 +00:00
Andrey V. Elsukov	576429f04b	Fix NPTv6 rule counters when one_pass is not enabled. Consider the rule matching when both @done and @retval values returned from ipfw_run_eaction() are zero. And modify ipfw_nptv6() to return IP_FW_DENY and @done=0 when addresses do not match. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2017-03-01 20:00:19 +00:00
Andrey V. Elsukov	0d9cbb874c	Move opcode rewriter init and destroy handlers into non-VENT code. PR: 212576,212649,212077 Submitted by: John Zielinski MFC after: 1 week	2016-09-18 17:35:17 +00:00
Andrey V. Elsukov	56132dcc0d	Move logging via BPF support into separate file. * make interface cloner VNET-aware; * simplify cloner code and use if_clone_simple(); * migrate LOGIF_LOCK() to rmlock; * add ipfw_bpf_mtap2() function to pass mbuf to BPF; * introduce new additional ipfwlog0 pseudo interface. It differs from ipfw0 by DLT type used in bpfattach. This interface is intended to used by ipfw modules to dump packets with additional info attached. Currently pflog format is used. ipfw_bpf_mtap2() function uses second argument to determine which interface use for dumping. If dlen is equal to ETHER_HDR_LEN it uses old ipfw0 interface, if dlen is equal to PFLOG_HDRLEN - ipfwlog0 will be used. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-08-13 15:41:04 +00:00
Andrey V. Elsukov	d6eb9b0249	Restore "nat global" support. Now zero value of arg1 used to specify "tablearg", use the old "tablearg" value for "nat global". Introduce new macro IP_FW_NAT44_GLOBAL to replace hardcoded magic number to specify "nat global". Also replace 65535 magic number with corresponding macro. Fix typo in comments. PR: 211256 Tested by: Victor Chernov MFC after: 3 days	2016-08-11 10:10:10 +00:00
Andrey V. Elsukov	ed22e564b8	Add named dynamic states support to ipfw(4). The keep-state, limit and check-state now will have additional argument flowname. This flowname will be assigned to dynamic rule by keep-state or limit opcode. And then can be matched by check-state opcode or O_PROBE_STATE internal opcode. To reduce possible breakage and to maximize compatibility with old rulesets default flowname introduced. It will be assigned to the rules when user has omitted state name in keep-state and check-state opcodes. Also if name is ambiguous (can be evaluated as rule opcode) it will be replaced to default. Reviewed by: julian Obtained from: Yandex LLC MFC after: 1 month Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D6674	2016-07-19 04:56:59 +00:00
Bjoern A. Zeeb	31fe4e62fa	Move the ipfw_log_bpf() calls from global module initialisation to per-VNET initialisation and virtualise the interface cloning to allow a dedicated ipfw log interface per VNET. Approved by: re (gjb) MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2016-06-30 01:33:14 +00:00
Bjoern A. Zeeb	89856f7e2d	Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated. Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC. Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet. For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along (or before) the higher layer protocol is shutdown. Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers. For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will cleanup their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()). Provide or enahnce helper functions to do proper cleanup at a protocol rather than at an interface level. Approved by: re (hrs) Obtained from: projects/vnet Reviewed by: gnn, jhb Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6747	2016-06-21 13:48:49 +00:00
Pedro F. Giffuni	a4641f4eaa	sys/net*: minor spelling fixes. No functional change.	2016-05-03 18:05:43 +00:00
Andrey V. Elsukov	2acdf79f53	Add External Actions KPI to ipfw(9). It allows implementing loadable kernel modules with new actions and without needing to modify kernel headers and ipfw(8). The module registers its action handler and keyword string, that will be used as action name. Using generic syntax user can add rules with this action. Also ipfw(8) can be easily modified to extend basic syntax for external actions, that become a part base system. Sample modules will coming soon. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2016-04-14 22:51:23 +00:00
Andrey V. Elsukov	23a6c7330c	Fix bug in filling and handling ipfw's O_DSCP opcode. Due to integer overflow CS4 token was handled as BE. PR: 207459 MFC after: 1 week	2016-02-24 13:16:03 +00:00
Hans Petter Selasky	c8cfbc066f	Properly drain callouts in the IPFW subsystem to avoid use after free panics when unloading the dummynet and IPFW modules: - The callout drain function can sleep and should not be called having a non-sleepable lock locked. Remove locks around "ipfw_dyn_uninit(0)". - Add a new "dn_gone" variable to prevent asynchronous restart of dummynet callouts when unloading the dummynet kernel module. - Call "dn_reschedule()" locked so that "dn_gone" can be set and checked atomically with regard to starting a new callout. Reviewed by: hiren MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D3855	2015-12-15 09:02:05 +00:00
Alexander V. Chernikov	65ff3638df	Merge helper fib* functions used for basic lookups. Vast majority of rtalloc(9) users require only basic info from route table (e.g. "does the rtentry interface match with the interface I have?". "what is the MTU?", "Give me the IPv4 source address to use", etc..). Instead of hand-rolling lookups, checking if rtentry is up, valid, dealing with IPv6 mtu, finding "address" ifp (almost never done right), provide easy-to-use API hiding all the complexity and returning the needed info into small on-stack structure. This change also helps hiding route subsystem internals (locking, direct rtentry accesses). Additionaly, using this API improves lookup performance since rtentry is not locked. (This is safe, since all the rtentry changes happens under both radix WLOCK and rtentry WLOCK). Sponsored by: Yandex LLC	2015-12-08 10:50:03 +00:00
Alexander V. Chernikov	b554a27822	Fix setfib target. Problem was introduced in r272840 when converting tablearg value to 0. Submitted by: Denis Schneider <v1ne2go at gmail>	2015-11-08 12:24:19 +00:00
Andrey V. Elsukov	b13653baf9	Reduce overhead of ipfw's me6 opcode. Skip checks for IPv6 multicast addresses. Use in6_localip() for global unicast. And for IPv6 link-local addresses do search in the IPv6 addresses list. Since LLA are stored in the kernel internal form, use IN6_ARE_MASKED_ADDR_EQUAL() macro with lla_mask for addresses comparison. lla_mask has zero bits in the second word, where we keep sin6_scope_id. Obtained from: Yandex LLC Sponsored by: Yandex LLC	2015-07-29 10:53:42 +00:00
Andrey V. Elsukov	af9aa0a837	Add helper functions for IP checksum adjusting. Use these functions in dummynet code and for setdscp. This fixes wrong checksums in some cases. Obtained from: Yandex LLC MFC after: 2 weeks Sponsored by: Yandex LLC	2015-07-20 07:26:31 +00:00
Luigi Rizzo	8ff71b031e	bugfix (only affecting the "lookup" option in the userspace version of ipfw): the conditional block should not include the 'else' otherwise the code does a 'break;' without completing the check	2015-05-13 11:53:25 +00:00
Alexander V. Chernikov	74b22066b0	Make rule table kernel-index rewriting support any kind of objects. Currently we have tables identified by their names in userland with internal kernel-assigned indices. This works the following way: When userland wishes to communicate with kernel to add or change rule(s), it makes indexed sorted array of table names (internally ipfw_obj_ntlv entries), and refer to indices in that array in rule manipulation. Prior to committing new rule to the ruleset kernel a) finds all referenced tables, bump their refcounts and change values inside the opcodes to be real kernel indices b) auto-creates all referenced but not existing tables and then do a) for them. Kernel does almost the same when exporting rules to userland: prepares array of used tables in all rules in range, and prepends it before the actual ruleset retaining actual in-kernel indexes for that. There is also special translation layer for legacy clients which is able to provide 'real' indices for table names (basically doing atoi()). While it is arguable that every subsystem really needs names instead of numbers, there are several things that should be noted: 1) every non-singleton subsystem needs to store its runtime state somewhere inside ipfw chain (and be able to get it fast) 2) we can't assume object numbers provided by humans will be dense. Existing nat implementation (O(n) access and LIST inside chain) is a good example. Hence the following: * Convert table-centric rewrite code to be more generic, callback-based * Move most of the code from ip_fw_table.c to ip_fw_sockopt.c * Provide abstract API to permit subsystems convert their objects between userland string identifier and in-kernel index. (See struct opcode_obj_rewrite) for more details * Create another per-chain index (in next commit) shared among all subsystems * Convert current NAT44 implementation to use new API, O(1) lookups, shared index and names instead of numbers (in next commit). Sponsored by: Yandex LLC	2015-04-27 08:29:39 +00:00

1 2 3

103 Commits