freebsd-skq

Author	SHA1	Message	Date
Julian Elischer	0b4b0b0fee	Virtualize the pfil hooks so that different jails may chose different packet filters. ALso allows ipfw to be enabled on on ejail and disabled on another. In 8.0 it's a global setting. Sitting aroung in tree waiting to commit for: 2 months MFC after: 2 months	2009-10-11 05:59:43 +00:00
Robert Watson	530c006014	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)	2009-08-01 19:26:27 +00:00
Qing Li	9fca4f79c7	The new flow table caches both the routing table entry as well as the L2 information. For an indirect route the cached L2 entry contains the MAC address of the gateway. Typically the default route is used to transmit multicast packets when explicit multicast routes are not available. The ether_output() function bypasses L2 resolution function if it verifies the L2 cache is valid, because the cached L2 address (a unicast MAC address) is copied into the packets as the destination MAC address. This validation, however, does not apply to broadcast and multicast packets because the destination MAC address is mapped according to a standard method instead. Submitted by: Xin Li Reviewed by: bz Approved by: re	2009-07-28 17:16:54 +00:00
Robert Watson	1e77c1056a	Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references. Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)	2009-07-16 21:13:04 +00:00
Robert Watson	eddfbb763d	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
Robert Watson	6c7ffe9340	Break at_ifawithnet() into two variants: - at_ifawithnet(), which acquires an locks it needs and returns an at_ifaddr reference. - at_ifawithnet_locked(), which relies on the caller locking at_ifaddr_list, and returns a pointer rather than a reference. Update various consumers to prefer one or the other, including ether and fddi output, to properly release at_ifaddr references. Rework at_control() to manage locking and references in a manner identical to in_control(). MFC after: 6 weeks	2009-06-24 10:32:44 +00:00
Bjoern A. Zeeb	259d2d5431	carp(4) allows people to share a set of IP addresses and can only use IPv4/v6 for inter-node communication (according to my reading). Properly wrap the carp callouts in INET \|\| INET6 and refelect this in sys/conf/files as well. While in theory this should be ok, it might be a bit optimistic to think that carp could build with inet6 only[1]. Discussed with: mlaier [1]	2009-06-11 10:26:38 +00:00
Oleg Bulyzhin	dda10d624c	Close long existed race with net.inet.ip.fw.one_pass = 0: If packet leaves ipfw to other kernel subsystem (dummynet, netgraph, etc) it carries pointer to matching ipfw rule. If this packet then reinjected back to ipfw, ruleset processing starts from that rule. If rule was deleted meanwhile, due to existed race condition panic was possible (as well as other odd effects like parsing rules in 'reap list'). P.S. this commit changes ABI so userland ipfw related binaries should be recompiled. MFC after: 1 month Tested by: Mikolaj Golub	2009-06-09 21:27:11 +00:00
Bjoern A. Zeeb	8d8bc0182e	After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.	2009-06-08 19:57:35 +00:00
Robert Watson	bcf11e8d00	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
Luigi Rizzo	115a40c7bf	More cleanup in preparation of ipfw relocation (no actual code change): + move ipfw and dummynet hooks declarations to raw_ip.c (definitions in ip_var.h) same as for most other global variables. This removes some dependencies from ip_input.c; + remove the IPFW_LOADED macro, just test ip_fw_chk_ptr directly; + remove the DUMMYNET_LOADED macro, just test ip_dn_io_ptr directly; + move ip_dn_ruledel_ptr to ip_fw2.c which is the only file using it; To be merged together with rev 193497 MFC after: 5 days	2009-06-05 13:44:30 +00:00
Marko Zec	21ca7b57bd	Change the curvnet variable from a global const struct vnet , previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discuouraged. This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_ macros expand to whitespace. The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another. The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, whith an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typicall cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions. This change also introduces a DDB subcommand to show the list of all vnet instances. Approved by: julian (mentor)	2009-05-05 10:56:12 +00:00
Kip Macy	279aa3d419	Change if_output to take a struct route as its fourth argument in order to allow passing a cached struct llentry * down to L2 Reviewed by: rwatson	2009-04-16 20:30:28 +00:00
Bjoern A. Zeeb	33553d6e99	For all files including net/vnet.h directly include opt_route.h and net/route.h. Remove the hidden include of opt_route.h and net/route.h from net/vnet.h. We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong. This does not change the list of files which depend on opt_route.h but we can identify them now more easily.	2009-02-27 14:12:05 +00:00
Randall Stewart	2f4afd2125	Adds support for SCTP checksum offload. This means we, like TCP and UDP, move the checksum calculation into the IP routines when there is no hardware support we call into the normal SCTP checksum routine. The next round of SCTP updates will use this functionality. Of course the IGB driver needs a few updates to support the new intel controller set that actually does SCTP csum offload too. Reviewed by: gnn, rwatson, kmacy	2009-02-03 11:00:43 +00:00
Qing Li	9928dafbb8	Remove the rt argument from nd6_storelladdr() because rt is no longer accessed.	2008-12-17 10:27:34 +00:00
Qing Li	6e6b3f7cbc	This main goals of this project are: 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion	2008-12-15 06:10:57 +00:00
Marko Zec	385195c062	Conditionally compile out V_ globals while instantiating the appropriate container structures, depending on VIMAGE_GLOBALS compile time option. Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instatiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0. Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively #ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs. Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c. Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS. De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import. Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-12-10 23:12:39 +00:00
Bjoern A. Zeeb	4b79449e2f	Rather than using hidden includes (with cicular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation	2008-12-02 21:37:28 +00:00
Kip Macy	aea78d2094	convert calls to IFQ_HANDOFF to if_transmit	2008-11-22 07:35:45 +00:00
Marko Zec	44e33a0758	Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instatiation, assign initial values to them in initializer functions. As a rule, initialization at instatiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentialy, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to intialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-19 09:39:34 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(). () netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Ed Maste	d54d93ac7f	Move CTASSERT of ether header sizes out of the header file and into if_ethersubr.c. CTASSERT is implemented using a dummy typedef, which if used in a header file may conflict with another CTASSERT in a source file using that header. I'll make a note of this in CTASSERT's man page. Approved by: imp	2008-08-27 17:10:37 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Antoine Brodin	933dad75e3	Add missing braces in #if 0ed code. Approved by: rwatson (mentor) MFC after: 1 month	2008-05-10 18:33:38 +00:00
Julian Elischer	6eeac1d921	Add an option (compiled out by default) to profile outoing packets for a number of mbuf chain related parameters e.g. number of mbufs, wasted space. probably will do with further work later. Reviewed by: various	2008-04-29 21:23:21 +00:00
Kip Macy	879773c18b	back out last change as Sam believes that it breaks multicast - need to revisit after following up with pyun	2008-03-20 06:19:34 +00:00
Kip Macy	83631568fe	Don't re-initialize the interface if it is already running. This one line change makes the following code found in many ethernet device drivers (at least em, igb, ixgbe, and cxgb) gratuitous case SIOCSIFADDR: if (ifa->ifa_addr->sa_family == AF_INET) { /* * XXX * Since resetting hardware takes a very long time * and results in link renegotiation we only * initialize the hardware only when it is absolutely * required. */ ifp->if_flags \|= IFF_UP; if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) { EM_CORE_LOCK(adapter); em_init_locked(adapter); EM_CORE_UNLOCK(adapter); } arp_ifinit(ifp, ifa); } else error = ether_ioctl(ifp, command, data); break;	2008-03-20 05:35:02 +00:00
Robert Watson	b9175c4556	Move IFF_NEEDSGIANT warning from if_ethersubr.c to if.c so it is displayed for all network interfaces, not just ethernet-like ones. Upgrade it to a louder WARNING and be explicit that the flag is obsolete. Support for IFF_NEEDSGIANT will be removed in a few months (see arch@ for details) and will not appear in 8.0. Upgrade if_watchdog to a WARNING.	2008-03-07 16:00:44 +00:00
Oleg Bulyzhin	897c0f57d4	1) dummynet_io() declaration has changed. 2) Alter packet flow inside dummynet: allow certain packets to bypass dummynet scheduler. Benefits are: - lower latency: if packet flow does not exceed pipe bandwidth, packets will not be (up to tick) delayed (due to dummynet's scheduler granularity). - lower overhead: if packet avoids dummynet scheduler it shouldn't reenter ip stack later. Such packets can be fastforwarded. - recursion (which can lead to kernel stack exhaution) eliminated. This fix long existed panic, which can be triggered this way: kldload dummynet sysctl net.inet.ip.fw.one_pass=0 ipfw pipe 1 config bw 0 for i in `jot 30`; do ipfw add 1 pipe 1 icmp from any to any; done ping -c 1 localhost 3) Three new sysctl nodes are added: net.inet.ip.dummynet.io_pkt - packets passed to dummynet net.inet.ip.dummynet.io_pkt_fast - packets avoided dummynet scheduler net.inet.ip.dummynet.io_pkt_drop - packets dropped by dummynet P.S. Above comments are true only for layer 3 packets. Layer 2 packet flow is not changed yet. MFC after: 3 month	2007-11-06 23:01:42 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
Andrew Thompson	a0cf818670	Use a uint16_t type for the vlan tag rather an int.	2007-10-18 21:52:31 +00:00
Andrew Thompson	60e87ca8df	The bridging output function puts the mbuf directly on the interfaces send queue so the output network card must support the same tagging mechanism as how the frame was input (prepended Ethernet header tag or stripped HW mflag). Now the vlan Ethernet header is _always_ stripped in ether_input and the mbuf flagged, only only network cards with VLAN_HWTAGGING enabled would properly re-tag any outgoing vlan frames. If the outgoing interface does not support hardware tagging then readd the vlan header to the front of the frame. Move the common vlan encapsulation in to ether_vlanencap(). Reported by: Erik Osterholm, Jon Otterholm MFC after: 1 week	2007-10-18 21:22:15 +00:00
Julian Elischer	d3a31a9562	Remove DIAG code that discards oversized packets. There has been general consensus that this was a bad idea/ Approved by: re (bmah)	2007-09-14 06:57:28 +00:00
Robert Watson	33d2bb9ca3	First in a series of changes to remove the now-unused Giant compatibility framework for non-MPSAFE network protocols: - Remove debug_mpsafenet variable, sysctl, and tunable. - Remove NET_NEEDS_GIANT() and associate SYSINITSs used by it to force debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel. - Remove logic to automatically flag interrupt handlers as non-MPSAFE if debug.mpsafenet is set for an INTR_TYPE_NET handler. - Remove logic to automatically flag netisr handlers as non-MPSAFE if debug.mpsafenet is set. - Remove references in a few subsystems, including NFS and Cronyx drivers, which keyed off debug_mpsafenet to determine various aspects of their own locking behavior. - Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into no-op's, as their entire behavior was determined by the value in debug_mpsafenet. - Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE. Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still present in subsystems, and will be removed in followup commits. Reviewed by: bz, jhb Approved by: re (kensmith)	2007-07-27 11:59:57 +00:00
Max Laier	60ee384760	Link pf 4.1 to the build: - move ftp-proxy from libexec to usr.sbin - add tftp-proxy - new altq mtag link Approved by: re (kensmith)	2007-07-03 12:46:08 +00:00
Andrew Gallatin	ead7661da1	Use if_capenable to allow LRO enabled drivers to bypass the MTU check in ether_input().	2007-06-12 19:53:44 +00:00
Andrew Gallatin	2144e288fa	Move the oversize ethernet frame size check into DIAGNOSTIC, as was proposed when it was originally added. This allows LRO to work on non-DIAGNOSTIC kernels without consuming any mbuf flags. Discussed with: sam	2007-06-11 18:45:49 +00:00
Andrew Gallatin	56b8f0b02d	Back out the previous commit which added an M_LRO mbuf flag to defeat the mtu check in ether_input. Mbuf flags are too scarce. Discussed with: sam	2007-06-11 18:26:25 +00:00
Andrew Gallatin	b6af0abdc4	Allow drivers, such as cxgb and mxge, which support LRO to bypass the MTU check in ether_input() on LRO merged frames. Discussed with: kmacy	2007-06-11 14:59:56 +00:00
Yaroslav Tykhiy	995c7fd1bf	Sync ether_ioctl() with ioctl(2) and ifnet.if_ioctl as to the type of the command argument: int -> u_long. These types have different widths in the 64-bit world. Add a note to UPDATING because the change breaks KBI on 64-bit platforms. Discussed on: -net, -current Reviewed by: bms, ru	2007-05-29 12:40:45 +00:00
Andrew Thompson	18242d3b09	Rename the trunk(4) driver to lagg(4) as it is too similar to vlan trunking. The name trunk is misused as the networking term trunk means carrying multiple VLANs over a single connection. The IEEE standard for link aggregation (802.3 section 3) does not talk about 'trunk' at all while it is used throughout IEEE 802.1Q in describing vlans. The lagg(4) driver provides link aggregation, failover and fault tolerance. Discussed on: current@	2007-04-17 00:35:11 +00:00
Andrew Thompson	b47888ceba	Add the trunk(4) driver for providing link aggregation, failover and fault tolerance. This driver allows aggregation of multiple network interfaces as one virtual interface using a number of different protocols/algorithms. failover - Sends traffic through the secondary port if the master becomes inactive. fec - Supports Cisco Fast EtherChannel. lacp - Supports the IEEE 802.3ad Link Aggregation Control Protocol (LACP) and the Marker Protocol. loadbalance - Static loadbalancing using an outgoing hash. roundrobin - Distributes outgoing traffic using a round-robin scheduler through all active ports. This code was obtained from OpenBSD and this also includes 802.3ad LACP support from agr(4) in NetBSD.	2007-04-10 00:27:25 +00:00
Bruce M Simpson	e7f8c8339c	Fix a typo, and update a comment. Submitted by: yar	2007-03-22 19:08:39 +00:00
Bruce M Simpson	402d5e2755	Make the m_pullup() diagnostic message compile-time conditional on DIAGNOSTIC. Requested by: glebius	2007-03-20 14:29:54 +00:00
Bruce M Simpson	69462a8297	Clean up the ether_input() path by using the M_PROMISC flag. Main points of this change: * Drop frames immediately if the interface is not marked IFF_UP. * Always trim off the frame checksum if present. * Always use M_VLANTAG in preference to passing 802.1Q frames to consumers. * Use __func__ consistently for KASSERT(). * Use the M_PROMISC flag to detect situations where ether_input() may reenter itself on the same call graph with the same mbuf which was promiscuously received on behalf of subsystems such as netgraph, carp, and vlan. * 802.1P frames (that is, VLAN frames with an ID of 0) will now be passed to layer 3 input paths. * Deal with the special case for CARP in a sane way. This is a significant rewrite of code on the critical path. Please report any issues to me if they arise. Frames will now only pass through dummynet if M_PROMISC is cleared, to avoid problems with re-entry. The handling of CARP needs to be revisited architecturally. The M_PROMISC flag may potentially be demoted to a link-layer flag only as it is in NetBSD, where the idea originated. Discussed on: net Idea from: NetBSD Reviewed by: yar MFC after: 1 month	2007-03-19 18:39:36 +00:00
Christian S.J. Peron	d05d46160a	Use ETHER_BPF_MTAP() instead of BPF_MTAP() here. It's possible incoming packets have had their 802.1Q tags processed by the hardware, resulting in them being stripped from the packets, and placed on the mbuf. This fixes the processing of 802.1Q tags when hardware offload of 802.1Q tags is enabled.	2007-02-22 14:50:31 +00:00
Yaroslav Tykhiy	9983b3c02d	Note that rev. 1.221 introduced a local workaround for a general problem. Add a pointer to the relevant PR for future reference. The whole comment will be OK to remove as soon as the general solution is applied. PR: kern/105943	2006-12-24 08:52:13 +00:00
Luigi Rizzo	294dd290c6	Fix an oscure bug triggered by a recent change in kern_socket.c. The symptoms were that outgoing DHCP requests for diskless kernels had the IP header corrupt. After long investigations, the source of the problem was found in ether_output() - for SIMPLEX interfaces and broadcast traffic, a copy of the packet is passed back to the kernel through if_simloop(). However if_simloop() modifies the mbuf, while the copy obtained through m_copym() is a readonly one. The bug has been there forever, but it has been triggered only recently by a change in sosend_dgram() which passed down mbufs with sufficient space to prepend the header. This fix is trivial - use m_dup() instead of m_copy() to create the copy. As an alternative, we could try and modify if_simloop() to play safely with readonly mbufs, but i don't think it is worthwhile because 1) this is a relatively infrequent code path so we do not need to worry too much about performance, and 2) the cost of doing an extra m_pullup in if_simloop() is probably the same as doing the copy of the cluster, anyways. MFC after: 1 week	2006-12-08 10:36:45 +00:00

1 2 3 4 5 ...

269 Commits