freebsd-skq

Author	SHA1	Message	Date
rwatson	f2d5b39895	Add missing unlock of if_addr_mtx when an unmatched ARP packet is received. Reported by: lstewart MFC after: 6 weeks	2009-06-24 14:49:26 +00:00
rwatson	b1ee9361a4	Clear 'ia' after iterating if_addrhead for unicast address matching: since 'ifa' was used as the TAILQ_FOREACH() iterator argument, and 'ia' was just derived form it, it could be left non-NULL which confused later conditional freeing code. This could cause kernel panics if multicast IP packets were received. [1] Call 'struct in_ifaddr *' in ip_rtaddr() 'ia', not 'ifa' in keeping with normal conventions. When 'ipstealth' is enabled returns from ip_input early, properly release the 'ia' reference. Reported by: lstewart, sam [1] MFC after: 6 weeks	2009-06-24 14:29:40 +00:00
rwatson	4bbf479ced	In ARP input, more consistently acquire and release ifaddr references. MFC after: 6 weeks	2009-06-24 10:33:35 +00:00
bz	a8839212d2	Make callers to in6_selectsrc() and in6_pcbladdr() pass in memory to save the selected source address rather than returning an unreferenced copy to a pointer that might long be gone by the time we use the pointer for anything meaningful. Asked for by: rwatson Reviewed by: rwatson	2009-06-23 22:08:55 +00:00
rwatson	c9ef486fe1	Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references: ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr Remove unused macro which didn't have required referencing: IFP_TO_IA6 This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references. Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed. Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)	2009-06-23 20:19:09 +00:00
bz	0808d0b1a6	After cleaning up rt_tables from vnet.h and cleaning up opt_route.h a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.	2009-06-23 17:03:45 +00:00
andre	e66ed06df4	Add soreceive_stream(), an optimized version of soreceive() for stream (TCP) sockets. It is functionally identical to generic soreceive() but has a number stream specific optimizations: o does only one sockbuf unlock/lock per receive independent of the length of data to be moved into the uio compared to soreceive() which unlocks/locks per mbuf. o uses m_mbuftouio() instead of its own copy(out) variant. o much more compact code flow as a large number of special cases is removed. o much improved reability. It offers significantly reduced CPU usage and lock contention when receiving fast TCP streams. Additional gains are obtained when the receiving application is using SO_RCVLOWAT to batch up some data before a read (and wakeup) is done. This function was written by "reverse engineering" and is not just a stripped down variant of soreceive(). It is not yet enabled by default on TCP sockets. Instead it is commented out in the protocol initialization in tcp_usrreq.c until more widespread testing has been done. Testers, especially with 10GigE gear, are welcome. MFP4: r164817 //depot/user/andre/soreceive_stream/	2009-06-22 23:08:05 +00:00
zec	f9dfbad808	V_irtualize flowtable state. This change should make options VIMAGE kernel builds usable again, to some extent at least. Note that the size of struct vnet_inet has changed, though in accordance with one-bump-per-day policy we didn't update the __FreeBSD_version number, given that it has already been touched by r194640 a few hours ago. Reviewed by: bz Approved by: julian (mentor)	2009-06-22 21:19:24 +00:00
rwatson	3c02410d55	Add a new function, ifa_ifwithaddr_check(), which rather than returning a pointer to an ifaddr matching the passed socket address, returns a boolean indicating whether one was present. In the (near) future, ifa_ifwithaddr() will return a referenced ifaddr rather than a raw ifaddr pointer, and the new wrapper will allow callers that care only about the boolean condition to avoid having to free that reference. MFC after: 3 weeks	2009-06-22 10:59:34 +00:00
bz	71e36df28b	Remove a hack from r186086 so that IPsec via loopback routes continued working. It was targeted for stable/7 compatibility and actually never did anything in HEAD. Reminded by: rwatson X-MFC after: never	2009-06-22 09:24:46 +00:00
rwatson	1f7e54e8c5	Clean up common ifaddr management: - Unify reference count and lock initialization in a single function, ifa_init(). - Move tear-down from a macro (IFAFREE) to a function ifa_free(). - Move reference count bump from a macro (IFAREF) to a function ifa_ref(). - Instead of using a u_int protected by a mutex to refcount(9) for reference count management. The ifa_mtx is now used for exactly one ioctl, and possibly should be removed. MFC after: 3 weeks	2009-06-21 19:30:33 +00:00
rdivacky	9992cf9aeb	Switch cmd argument to u_long. This matches what if_ethersubr.c does and allows the code to compile cleanly on amd64 with clang. Reviewed by: rwatson Approved by: ed (mentor)	2009-06-21 10:29:31 +00:00
brooks	f53c1c309d	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
bz	48dc6805f8	Add explicit includes for jail.h to the files that need them and remove the "hidden" one from vimage.h.	2009-06-17 15:01:01 +00:00
bz	b017972f11	Add the explicit include of vimage.h to another five .c files still missing it. Remove the "hidden" kernel only include of vimage.h from ip_var.h added with the very first Vimage commit r181803 to avoid further kernel poisoning.	2009-06-17 12:44:11 +00:00
rrs	8a5e849118	Changes to the NR-Sack code so that: 1) All bit disappears 2) The two sets of gaps (nr and non-nr) are disjointed, you don't have gaps struck in both places. This adjusts us to coorespond to the new draft. Still to-do, cleanup the code so that there are only one set of sack routines (original NR-Sack done by E cloned all sack code).	2009-06-17 12:34:56 +00:00
jhb	8215fabe39	Trim extra sets of ()'s. Requested by: bde	2009-06-16 19:00:48 +00:00
jhb	8869d432a2	Fix edge cases with ticks wrapping from INT_MAX to INT_MIN in the handling of the per-tcpcb t_badtrxtwin. Submitted by: bde	2009-06-16 19:00:12 +00:00
jhb	43c09e3d00	- Change members of tcpcb that cache values of ticks from int to u_int: t_rcvtime, t_starttime, t_rtttime, t_bw_rtttime, ts_recent_age, t_badrxtwin. - Change t_recent in struct timewait from u_long to u_int32_t to match the type of the field it shadows from tcpcb: ts_recent. - Change t_starttime in struct timewait from u_long to u_int to match the t_starttime field in tcpcb. Requested by: bde (1, 3)	2009-06-16 18:58:50 +00:00
jamie	f950eed7d7	Get vnets from creds instead of threads where they're available, and from passed threads instead of curthread. Reviewed by: zec, julian Approved by: bz (mentor)	2009-06-15 19:01:53 +00:00
oleg	84d5b1472a	Since dn_pipe.numbytes is int64_t now - remove unnecessary overflow detection code in ready_event_wfq().	2009-06-15 17:14:47 +00:00
bz	64d1a84073	Move the kernel option FLOWTABLE chacking from the header file to the actual implementation. Remove the accessor functions for the compiled out case, just returning "unavail" values. Remove the kernel conditional from the header file as it is no longer needed, only leaving the externs. Hide the improperly virtualized SYSCTL/TUNABLE for the flowtable size under the kernel option as well. Reviewed by: rwatson	2009-06-12 20:46:36 +00:00
vanhu	16c1346b9a	Added support for NAT-Traversal (RFC 3948) in IPsec stack. Thanks to (no special order) Emmanuel Dreyfus (manu@netbsd.org), Larry Baird (lab@gta.com), gnn, bz, and other FreeBSD devs, Julien Vanherzeele (julien.vanherzeele@netasq.com, for years of bug reporting), the PFSense team, and all people who used / tried the NAT-T patch for years and reported bugs, patches, etc... X-MFC: never Reviewed by: bz Approved by: gnn(mentor) Obtained from: NETASQ	2009-06-12 15:44:35 +00:00
jhb	ea832e9f39	Correct printf format type mismatches.	2009-06-11 14:37:18 +00:00
jhb	59765b88ce	Trim extra ()'s. Submitted by: bde	2009-06-11 14:36:13 +00:00
jhb	d5b1c989db	Change a few members of tcpcb that store cached copies of ticks to be ints instead of unsigned longs. This fixes a few overflow edge cases on 64-bit platforms. Specifically, if an idle connection receives a packet shortly before 2^31 clock ticks of uptime (about 25 days with hz=1000) and the keep alive timer fires after 2^31 clock ticks, the keep alive timer will think that the connection has been idle for a very long time and will immediately drop the connection instead of sending a keep alive probe. Reviewed by: silby, gnn, lstewart MFC after: 1 week	2009-06-10 18:27:15 +00:00
imp	a18572086b	These are no longer referenced in the tree, so can be safely removed. Reviewed by: bms@	2009-06-10 18:12:15 +00:00
luigi	9f16f6c36d	in ip_dn_ctl(), do not allocate a large structure on the stack, and use malloc() instead if/when it is necessary. The problem is less relevant in previous versions because the variable involved (tmp_pipe) is much smaller there. Still worth fixing though. Submitted by: Marta Carbone (GSOC) MFC after: 3 days	2009-06-10 10:47:31 +00:00
bz	5cd78ef442	Remove the "The option TCPDEBUG requires option INET." requirement. In case of !INET we will not have a timestamp on the trace for now but that might only affect spx debugging as long as INET6 requires INET. Reviewed by: rwatson (earlier version)	2009-06-10 10:39:41 +00:00
luigi	b465834b3e	small simplifications to the code in charge of reaping deleted rules: - clear the head pointer immediately before using it, so there is no chance of mistakes; - call reap_rules() unconditionally. The function can handle a NULL argument just fine, and the cost of the extra call is hardly significant given that we do it rarely and outside the lock. MFC after: 3 days	2009-06-10 10:34:59 +00:00
oleg	1980405dfd	Close long existed race with net.inet.ip.fw.one_pass = 0: If packet leaves ipfw to other kernel subsystem (dummynet, netgraph, etc) it carries pointer to matching ipfw rule. If this packet then reinjected back to ipfw, ruleset processing starts from that rule. If rule was deleted meanwhile, due to existed race condition panic was possible (as well as other odd effects like parsing rules in 'reap list'). P.S. this commit changes ABI so userland ipfw related binaries should be recompiled. MFC after: 1 month Tested by: Mikolaj Golub	2009-06-09 21:27:11 +00:00
bz	b7ff2bdc20	After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.	2009-06-08 19:57:35 +00:00
zec	8b1f38241a	Introduce an infrastructure for dismantling vnet instances. Vnet modules and protocol domains may now register destructor functions to clean up and release per-module state. The destructor mechanisms can be triggered by invoking "vimage -d", or a future equivalent command which will be provided via the new jail framework. While this patch introduces numerous placeholder destructor functions, many of those are currently incomplete, thus leaking memory or (even worse) failing to stop all running timers. Many of such issues are already known and will be incrementaly fixed over the next weeks in smaller incremental commits. Apart from introducing new fields in structs ifnet, domain, protosw and vnet_net, which requires the kernel and modules to be rebuilt, this change should have no impact on nooptions VIMAGE builds, since vnet destructors can only be called in VIMAGE kernels. Moreover, destructor functions should be in general compiled in only in options VIMAGE builds, except for kernel modules which can be safely kldunloaded at run time. Bump __FreeBSD_version to 800097. Reviewed by: bz, julian Approved by: rwatson, kib (re), julian (mentor)	2009-06-08 17:15:40 +00:00
hrs	9bf362d0cc	Fix and add a workaround on an issue of EtherIP packet with reversed version field sent via gif(4)+if_bridge(4). The EtherIP implementation found on FreeBSD 6.1, 6.2, 6.3, 7.0, 7.1, and 7.2 had an interoperability issue because it sent the incorrect EtherIP packets and discarded the correct ones. This change introduces the following two flags to gif(4): accept_rev_ethip_ver: accepts both correct EtherIP packets and ones with reversed version field, if enabled. If disabled, the gif accepts the correct packets only. This flag is enabled by default. send_rev_ethip_ver: sends EtherIP packets with reversed version field intentionally, if enabled. If disabled, the gif sends the correct packets only. This flag is disabled by default. These flags are stored in struct gif_softc and can be set by ifconfig(8) on per-interface basis. Note that this is an incompatible change of EtherIP with the older FreeBSD releases. If you need to interoperate older FreeBSD boxes and new versions after this commit, setting "send_rev_ethip_ver" is needed. Reviewed by: thompsa and rwatson Spotted by: Shunsuke SHINOMIYA PR: kern/125003 MFC after: 2 weeks	2009-06-07 23:00:40 +00:00
zec	d9f2402f73	Unbreak options VIMAGE build. Submitted by: julian (mentor) Approved by: julian (mentor)	2009-06-06 12:43:13 +00:00
pjd	80ee56b38b	Only four out of nine arguments for ip_ipsec_output() are actually used. Kill unused arguments except for 'ifp' as it might be used in the future for detecting IPsec-capable interfaces.	2009-06-05 23:53:17 +00:00
luigi	82f1e32a7f	move kernel ipfw-related sources to a separate directory, adjust conf/files and modules' Makefiles accordingly. No code or ABI changes so this and most of previous related changes can be easily MFC'ed MFC after: 5 days	2009-06-05 19:22:47 +00:00
luigi	78a4bbf287	Several ipfw options and actions use a 16-bit argument to indicate pipes, queues, tags, rule numbers and so on. These are all different namespaces, and the only thing they have in common is the fact they use a 16-bit slot to represent the argument. There is some confusion in the code, mostly for historical reasons, on how the values 0 and 65535 should be used. At the moment, 0 is forbidden almost everywhere, while 65535 is used to represent a 'tablearg' argument, i.e. the result of the most recent table() lookup. For now, try to use explicit constants for the min and max allowed values, and do not overload the default rule number for that. Also, make the MTAG_IPFW declaration only visible to the kernel. NOTE: I think the issue needs to be revisited before 8.0 is out: the 2^16 namespace limit for rule numbers and pipe/queue is annoying, and we can easily bump the limit to 2^32 which gives a lot more flexibility in partitioning the namespace. MFC after: 5 days	2009-06-05 16:16:07 +00:00
rwatson	f4934662e5	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
rwatson	4df1a4e912	Unifdef MAC label pointer in syncache entries -- in general, ifdef'd structure contents are a bad idea in the kernel for binary compatibility reasons, and this is a single pointer that is now included in compiles by default anyway due to options MAC being in GENERIC.	2009-06-05 14:31:03 +00:00
luigi	1384b62c4b	More cleanup in preparation of ipfw relocation (no actual code change): + move ipfw and dummynet hooks declarations to raw_ip.c (definitions in ip_var.h) same as for most other global variables. This removes some dependencies from ip_input.c; + remove the IPFW_LOADED macro, just test ip_fw_chk_ptr directly; + remove the DUMMYNET_LOADED macro, just test ip_dn_io_ptr directly; + move ip_dn_ruledel_ptr to ip_fw2.c which is the only file using it; To be merged together with rev 193497 MFC after: 5 days	2009-06-05 13:44:30 +00:00
luigi	c59f82aeab	Small changes (no actual code changes) in preparation of moving ipfw-related stuff to its own directory, and cleaning headers and dependencies: In this commit: + remove one use of a typedef; + document dn_rule_delete(); + replace one usage of the DUMMYNET_LOADED macro with its value; No MFC planned until the cleanup is complete.	2009-06-05 12:49:54 +00:00
luigi	88c056a07b	fix a bug introduced in rev.190865 related to the signedness of the credit of a pipe. On passing, also use explicit signed/unsigned types for two other fields. Noticed by Oleg Bulyzhin and Maxim Ignatenko long ago, i forgot to commit the fix. Does not affect RELENG_7.	2009-06-04 12:27:57 +00:00
rwatson	849a8ce20b	Continue work to optimize performance of "options MAC" when no MAC policy modules are loaded by avoiding mbuf label lookups when policies aren't loaded, pushing further socket locking into MAC policy modules, and avoiding locking MAC ifnet locks when no policies are loaded: - Check mac_policies_count before looking for mbuf MAC label m_tags in MAC Framework entry points. We will still pay label lookup costs if MAC policies are present but don't require labels (typically a single mbuf header field read, but perhaps further indirection if IPSEC or other m_tag consumers are in use). - Further push socket locking for socket-related access control checks and events into MAC policies from the MAC Framework, so that sockets are only locked if a policy specifically requires a lock to protect a label. This resolves lock order issues during sonewconn() and also in local domain socket cross-connect where multiple socket locks could not be held at once for the purposes of propagatig MAC labels across multiple sockets. Eliminate mac_policy_count check in some entry points where it no longer avoids locking. - Add mac_policy_count checking in some entry points relating to network interfaces that otherwise lock a global MAC ifnet lock used to protect ifnet labels. Obtained from: TrustedBSD Project	2009-06-03 18:46:28 +00:00
rwatson	0f9e858440	Add internal 'mac_policy_count' counter to the MAC Framework, which is a count of the number of registered policies. Rather than unconditionally locking sockets before passing them into MAC, lock them in the MAC entry points only if mac_policy_count is non-zero. This avoids locking overhead for a number of socket system calls when no policies are registered, eliminating measurable overhead for the MAC Framework for the socket subsystem when there are no active policies. Possibly socket locks should be acquired by policies if they are required for socket labels, which would further avoid locking overhead when there are policies but they don't require labeling of sockets, or possibly don't even implement socket controls. Obtained from: TrustedBSD Project	2009-06-02 18:26:17 +00:00
jhb	a1af9ecca4	Rework socket upcalls to close some races with setup/teardown of upcalls. - Each socket upcall is now invoked with the appropriate socket buffer locked. It is not permissible to call soisconnected() with this lock held; however, so socket upcalls now return an integer value. The two possible values are SU_OK and SU_ISCONNECTED. If an upcall returns SU_ISCONNECTED, then the soisconnected() will be invoked on the socket after the socket buffer lock is dropped. - A new API is provided for setting and clearing socket upcalls. The API consists of soupcall_set() and soupcall_clear(). - To simplify locking, each socket buffer now has a separate upcall. - When a socket upcall returns SU_ISCONNECTED, the upcall is cleared from the receive socket buffer automatically. Note that a SO_SND upcall should never return SU_ISCONNECTED. - All this means that accept filters should now return SU_ISCONNECTED instead of calling soisconnected() directly. They also no longer need to explicitly clear the upcall on the new socket. - The HTTP accept filter still uses soupcall_set() to manage its internal state machine, but other accept filters no longer have any explicit knowlege of socket upcall internals aside from their return value. - The various RPC client upcalls currently drop the socket buffer lock while invoking soreceive() as a temporary band-aid. The plan for the future is to add a new flag to allow soreceive() to be called with the socket buffer locked. - The AIO callback for socket I/O is now also invoked with the socket buffer locked. Previously sowakeup() would drop the socket buffer lock only to call aio_swake() which immediately re-acquired the socket buffer lock for the duration of the function call. Discussed with: rwatson, rmacklem	2009-06-01 21:17:03 +00:00
bz	c62e99f85d	Convert the two dimensional array to be malloced and introduce an accessor function to get the correct rnh pointer back. Update netstat to get the correct pointer using kvm_read() as well. This not only fixes the ABI problem depending on the kernel option but also permits the tunable to overwrite the kernel option at boot time up to MAXFIBS, enlarging the number of FIBs without having to recompile. So people could just use GENERIC now. Reviewed by: julian, rwatson, zec X-MFC: not possible	2009-06-01 15:49:42 +00:00
bms	c731979fff	Merge fixes from p4: * Tighten v1 query input processing. * Borrow changes from MLDv2 for how general queries are processed. * Do address field validation upfront before accepting input. * Do NOT switch protocol version if old querier present timer active. * Always clear IGMPv3 state in igmp_v3_cancel_link_timers(). * Update comments. Tested by: deeptech71 at gmail dot com	2009-06-01 15:30:18 +00:00
rwatson	2bab695560	Reimplement the netisr framework in order to support parallel netisr threads: - Support up to one netisr thread per CPU, each processings its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy. In the future it would be desirable to support topology-centric policies, such as "one netisr per package". - Allow each protocol to advertise an ordering policy, which can currently be one of: NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket). NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifers (m2flow). Falls back on NETISR_POLICY_SOURCE if now flow ID is available. NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid). - Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions. - Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams. - Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used. - Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256. - All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present. - Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration. In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible. Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue. An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime. A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated. This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE. Bump __FreeBSD_version. Reviewed by: bz	2009-06-01 10:41:38 +00:00
pjd	5243d2d206	- Rename IP_NONLOCALOK IP socket option to IP_BINDANY, to be more consistent with OpenBSD (and BSD/OS originally). We can't easly do it SOL_SOCKET option as there is no more space for more SOL_SOCKET options, but this option also fits better as an IP socket option, it seems. - Implement this functionality also for IPv6 and RAW IP sockets. - Always compile it in (don't use additional kernel options). - Remove sysctl to turn this functionality on and off. - Introduce new privilege - PRIV_NETINET_BINDANY, which allows to use this functionality (currently only unjail root can use it). Discussed with: julian, adrian, jhb, rwatson, kmacy	2009-06-01 10:30:00 +00:00

1 2 3 4 5 ...

3510 Commits