freebsd-dev

Author	SHA1	Message	Date
Marko Zec	4d79e3d5e8	In the !VIMAGE_GLOBALS case, make sure not to call vnet_net_iattach() both via the vnet_mod_register() framework and then directly, but only once. Reviewed by: bz Approved by: julian (mentor)	2009-04-15 18:15:29 +00:00
Kip Macy	aee3056f64	call default if_qflush on ifq if default method isn't used by the driver	2009-04-14 03:17:44 +00:00
Marko Zec	bfe1aba468	Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)	2009-04-11 05:58:58 +00:00
Max Laier	8623f9fd7a	Follow up for r190895 It's not only the "all" group that is affected, but all groups on the given interface. PR: kern/130977, kern/131310 MFC after: 3 days (%vnet)	2009-04-10 19:16:14 +00:00
Max Laier	876d5c038f	Remove interfaces from IFG_ALL on detach. This cures a couple of pf panics when using the "self" keyword in tables or as ()-style host address and fixes "ifconfig -g all" output. PR: kern/130977, kern/131310 Submitted by: Mikolaj Golub MFC after: 3 days	2009-04-10 14:41:51 +00:00
Marko Zec	1ed81b739e	First pass at separating per-vnet initializer functions from existing functions for initializing global state. At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change. Modify the existing initializer functions which are invoked via protosw, like ip_init() et. al., to allow them to be invoked multiple times, i.e. per each vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true). While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered. Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net. Approved by: julian (mentor)	2009-04-06 22:29:41 +00:00
Sam Leffler	a51f44a7a6	enable setting the mac address of 802.11 devices	2009-03-28 17:36:56 +00:00
Jamie Gritton	bc3977f122	Call the interface's if_ioctl from ifioctl(), if the protocol didn't handle the ioctl. There are other paths that already call it, but this allows for a non-interface socket (like AF_LOCAL which ifconfig now uses) to use a broader class of interface ioctls. Approved by: bz (mentor), rwatson	2009-03-20 13:41:23 +00:00
Robert Watson	e5adda3d51	Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced in FreeBSD 5.x to allow network device drivers to run with Giant despite the network stack being Giant-free. This significantly simplifies calls into ioctl() on network interfaces, especially in the multicast code, as well as eliminates deferred invocation of interface if_start routines. Disable the build on device drivers still depending on IFF_NEEDSGIANT as they no longer compile. They will be removed in a few weeks if they haven't been made MPSAFE in that time. Disabled drivers: if_ar if_axe if_aue if_cdce if_cue if_kue if_ray if_rue if_rum if_sr if_udav if_ural if_zyd Drivers that were already disabled because of tty changes: if_ppp if_sl Discussed on: arch@	2009-03-15 14:21:05 +00:00
Sam Leffler	c0c9ea90a8	remove stray ;	2009-03-14 17:54:58 +00:00
Bjoern A. Zeeb	33553d6e99	For all files including net/vnet.h directly include opt_route.h and net/route.h. Remove the hidden include of opt_route.h and net/route.h from net/vnet.h. We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong. This does not change the list of files which depend on opt_route.h but we can identify them now more easily.	2009-02-27 14:12:05 +00:00
Jamie Gritton	b89e82dd87	Standardize the various prison_foo_ip[46] functions and prison_if to return zero on success and an error code otherwise. The possible errors are EADDRNOTAVAIL if an address being checked for doesn't match the prison, and EAFNOSUPPORT if the prison doesn't have any addresses in that address family. For most callers of these functions, use the returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or EINVAL. Always include a jailed() check in these functions, where a non-jailed cred always returns success (and makes no changes). Remove the explicit jailed() checks that preceded many of the function calls. Approved by: bz (mentor)	2009-02-05 14:06:09 +00:00
John Baldwin	eb322a6f77	Only start the if_slowtimo timer (which drives the if_watchdog methods of network interfaces) if we have at least one interface with an if_watchdog routine. MFC after: 2 weeks	2009-01-23 20:53:01 +00:00
Kip Macy	6241d13a1b	if_rtdel is always called with the RADIX_NODE_HEAD lock held	2008-12-18 09:59:24 +00:00
Kip Macy	d24c444ca0	add ifnet_byindex_locked to allow for use of IFNET_RLOCK	2008-12-18 04:50:44 +00:00
Kip Macy	c368cff776	avoid trying to acquire a shared lock while holding an exclusive lock by making the ifnet lock acquisition exclusive	2008-12-17 04:33:52 +00:00
Kip Macy	991f8615e4	convert ifnet and afdata locks from mutexes to rwlocks	2008-12-17 00:11:56 +00:00
Qing Li	6e6b3f7cbc	This main goals of this project are: 1. separating L2 tables (ARP, NDP) from the L3 routing tables 2. removing as much locking dependencies among these layers as possible to allow for some parallelism in the search operations 3. simplify the logic in the routing code, The most notable end result is the obsolescent of the route cloning (RTF_CLONING) concept, which translated into code reduction in both IPv4 ARP and IPv6 NDP related modules, and size reduction in struct rtentry{}. The change in design obsoletes the semantics of RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland applications such as "arp" and "ndp" have been modified to reflect those changes. The output from "netstat -r" shows only the routing entries. Quite a few developers have contributed to this project in the past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and Andre Oppermann. And most recently: - Kip Macy revised the locking code completely, thus completing the last piece of the puzzle, Kip has also been conducting active functional testing - Sam Leffler has helped me improving/refactoring the code, and provided valuable reviews - Julian Elischer setup the perforce tree for me and has helped me maintaining that branch before the svn conversion	2008-12-15 06:10:57 +00:00
Bjoern A. Zeeb	40eb85e75e	Whitespace changes only - tabs must have been converted to spaces somehow, when moving the code from p4 to svn. Sponsored by: The FreeBSD Foundation	2008-12-11 15:42:59 +00:00
Marko Zec	385195c062	Conditionally compile out V_ globals while instantiating the appropriate container structures, depending on VIMAGE_GLOBALS compile time option. Make VIMAGE_GLOBALS a new compile-time option, which by default will not be defined, resulting in instatiations of global variables selected for V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be effectively compiled out. Instantiate new global container structures to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0, vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0. Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_ macros resolve either to the original globals, or to fields inside container structures, i.e. effectively #ifdef VIMAGE_GLOBALS #define V_rt_tables rt_tables #else #define V_rt_tables vnet_net_0._rt_tables #endif Update SYSCTL_V_*() macros to operate either on globals or on fields inside container structs. Extend the internal kldsym() lookups with the ability to resolve selected fields inside the virtualization container structs. This applies only to the fields which are explicitly registered for kldsym() visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently this is done only in sys/net/if.c. Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code, and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in turn result in proper code being generated depending on VIMAGE_GLOBALS. De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c which were prematurely V_irtualized by automated V_ prepending scripts during earlier merging steps. PF virtualization will be done separately, most probably after next PF import. Convert a few variable initializations at instantiation to initialization in init functions, most notably in ipfw. Also convert TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in initializer functions. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-12-10 23:12:39 +00:00
Bjoern A. Zeeb	21b14a75f6	It does not make much sense to include net/route.h twice. Remove one #include.	2008-12-09 21:09:05 +00:00
Bjoern A. Zeeb	653735c44c	Add rwlock.h (and lock.h for that) to keep no-INET kernels compiling after RADIX_NODE_HEAD_{,UN}LOCK() were added. Must have been "learned" by pollution before (most likely: route.h -> radix.h -> rwlock.h)	2008-12-09 20:05:58 +00:00
Bjoern A. Zeeb	4b79449e2f	Rather than using hidden includes (with cicular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation	2008-12-02 21:37:28 +00:00
Bjoern A. Zeeb	413628a7e3	MFp4: Bring in updated jail support from bz_jail branch. This enhances the current jail implementation to permit multiple addresses per jail. In addtion to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,.. SCTP support was updated and supports IPv6 in jails as well. Cpuset support permits jails to be bound to specific processor sets after creation. Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future. DDB 'show jails' command was added to aid debugging. Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities. Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years. Bump __FreeBSD_version for the afore mentioned and in kernel changes. Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this. Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible	2008-11-29 14:32:14 +00:00
Marko Zec	97021c2464	Merge more of currently non-functional (i.e. resolving to whitespace) macros from p4/vimage branch. Do a better job at enclosing all instantiations of globals scheduled for virtualization in #ifdef VIMAGE_GLOBALS blocks. De-virtualize and mark as const saorder_state_alive and saorder_state_any arrays from ipsec code, given that they are never updated at runtime, so virtualizing them would be pointless. Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-26 22:32:07 +00:00
Sam Leffler	1444358966	use consistent style	2008-11-24 17:34:00 +00:00
Kip Macy	db7f0b974f	- bump __FreeBSD version to reflect added buf_ring, memory barriers, and ifnet functions - add memory barriers to <machine/atomic.h> - update drivers to only conditionally define their own - add lockless producer / consumer ring buffer - remove ring buffer implementation from cxgb and update its callers - add if_transmit(struct ifnet ifp, struct mbuf m) to ifnet to allow drivers to efficiently manage multiple hardware queues (i.e. not serialize all packets through one ifq) - expose if_qflush to allow drivers to flush any driver managed queues This work was supported by Bitgravity Inc. and Chelsio Inc.	2008-11-22 05:55:56 +00:00
Marko Zec	44e33a0758	Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instatiation, assign initial values to them in initializer functions. As a rule, initialization at instatiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentialy, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to intialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-11-19 09:39:34 +00:00
Bjoern A. Zeeb	5a97c9d46c	Include if_arp.h for IFP2AC so that the netgraph parts in if.c are happy even if compiled without INET or INET6. MFC after: 2 months	2008-11-06 15:26:09 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(). () netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Ed Schouten	6bfa9a2d66	Replace all calls to minor() with dev2unit(). After I removed all the unit2minor()/minor2unit() calls from the kernel yesterday, I realised calling minor() everywhere is quite confusing. Character devices now only have the ability to store a unit number, not a minor number. Remove the confusion by using dev2unit() everywhere. This commit could also be considered as a bug fix. A lot of drivers call minor(), while they should actually be calling dev2unit(). In -CURRENT this isn't a problem, but it turns out we never had any problem reports related to that issue in the past. I suspect not many people connect more than 256 pieces of the same hardware. Reviewed by: kib	2008-09-27 08:51:18 +00:00
Ed Schouten	d3ce832719	Remove unit2minor() use from kernel code. When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after I turned dev2unit(), minor(), unit2minor() and minor2unit() into macro's. The unit2minor() and minor2unit() macro's were no-ops. We'd better not remove these four macro's from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit(). Reviewed by: kib	2008-09-26 14:19:52 +00:00
Bjoern A. Zeeb	f0c042211b	Make the checks for ptp interfaces in ifa_ifwithdstaddr() and ifa_ifwithnet() look more similar by comparing the pointer to NULL in both cases. MFC after: 3 months	2008-08-24 11:03:43 +00:00
Andrew Thompson	516993d48e	ifnet_setbyindex() is only used locally, go back to being static.	2008-08-20 05:00:18 +00:00
Julian Elischer	ac957cd271	A bunch of formatting fixes brough to light by, or created by the Vimage commit a few days ago.	2008-08-20 01:05:56 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Robert Watson	02f4879d3a	Introduce locking around use of ifindex_table, whose use was previously unsynchronized. While races were extremely rare, we've now had a couple of reports of panics in environments involving large numbers of IPSEC tunnels being added very quickly on an active system. - Add accessor functions ifnet_byindex(), ifaddr_byindex(), ifdev_byindex() to replace existing accessor macros. These functions now acquire the ifnet lock before derefencing the table. - Add IFNET_WLOCK_ASSERT(). - Add static accessor functions ifnet_setbyindex(), ifdev_setbyindex(), which set values in the table either asserting of acquiring the ifnet lock. - Use accessor functions throughout if.c to modify and read ifindex_table. - Rework ifnet attach/detach to lock around ifindex_table modification. Note that these changes simply close races around use of ifindex_table, and make no attempt to solve the probem of disappearing ifnets. Further refinement of this work, including with respect to ifindex_table resizing, is still required. In a future change, the ifnet lock should be converted from a mutex to an rwlock in order to reduce contention. Reviewed and tested by: brooks	2008-06-26 23:05:28 +00:00
Brooks Davis	d94ccb096b	The if_check() function performed three actions: - verified that the ifp->if_snd.ifq_mtx was initalized for all attached interfaces. This was pointless because it was initalized for all interfaces in if_attach() so I've removed it. - Checked that ifp->if_snd.ifq_maxlen is initalized and set it to ifqmaxlen if unset. This makes more sense in if_attach() so I moved it there. - The first call of if_slowtimo(). Delete if_check() and call if_slowtimo() directly from the SYSINIT().	2008-05-17 03:38:13 +00:00
Julian Elischer	8b07e49a00	Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x) Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux. From my notes: ----- One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address. Constraints: ------------ I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need. One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing". One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch. This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs. To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family. The basic change in the ABI compatible version of the change is to extent that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before. The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row. In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later. One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically). You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it. This brings us as to how the correct FIB is selected for an outgoing IPV4 packet. Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways. Packets fall into one of a number of classes. 1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility call setfib that acts a bit like nice.. setfib -3 ping target.example.com # will use fib 3 for ping. It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands. 2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.) 3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2). 4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib. 5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being reponded to. 6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect withthe proces that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1. Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented) In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB. In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process. Early testing experience: ------------------------- Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks. For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done. Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routes accordingly. ipfw has grown 2 new keywords: setfib N ip from anay to any count ip from any to any fib N In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required. SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something. Where to next: -------------------- After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code. Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code. My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it. When the ABI can be changed it raises the possibilty of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the desination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry. Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already. This work was sponsored by Ironport Systems/Cisco Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco	2008-05-09 23:03:00 +00:00
Brooks Davis	ae0615f633	Delay the global registration of the struct ifnet in if_alloc() until after we're certain the allocation will entierly succeed. This fixes a leak in a fairly unlikely case. Reported by: vijay singh <vijjus at rocketmail dot com> MFC after: 1 week	2008-04-19 22:04:51 +00:00
Sam Leffler	fb27dd1db3	expose if_purgemaddrs, it will be used by the vap code unless someone redesigns the mcast support code in the next few weeks MFC after: 3 weeks	2008-03-25 21:23:32 +00:00
Robert Watson	237fdd787b	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
Robert Watson	b9175c4556	Move IFF_NEEDSGIANT warning from if_ethersubr.c to if.c so it is displayed for all network interfaces, not just ethernet-like ones. Upgrade it to a louder WARNING and be explicit that the flag is obsolete. Support for IFF_NEEDSGIANT will be removed in a few months (see arch@ for details) and will not appear in 8.0. Upgrade if_watchdog to a WARNING.	2008-03-07 16:00:44 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
Robert Watson	33d2bb9ca3	First in a series of changes to remove the now-unused Giant compatibility framework for non-MPSAFE network protocols: - Remove debug_mpsafenet variable, sysctl, and tunable. - Remove NET_NEEDS_GIANT() and associate SYSINITSs used by it to force debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel. - Remove logic to automatically flag interrupt handlers as non-MPSAFE if debug.mpsafenet is set for an INTR_TYPE_NET handler. - Remove logic to automatically flag netisr handlers as non-MPSAFE if debug.mpsafenet is set. - Remove references in a few subsystems, including NFS and Cronyx drivers, which keyed off debug_mpsafenet to determine various aspects of their own locking behavior. - Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into no-op's, as their entire behavior was determined by the value in debug_mpsafenet. - Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE. Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still present in subsystems, and will be removed in followup commits. Reviewed by: bz, jhb Approved by: re (kensmith)	2007-07-27 11:59:57 +00:00
Brooks Davis	a45cbf12c8	Update the comments on if_alloc(), if_free(), if_free_type(), and if_attach. Remove a comment about pre-3.0 network drivers from if_attach(). Be a bit more consistant about whitespace near comments.	2007-05-16 19:59:01 +00:00
Andrew Thompson	18242d3b09	Rename the trunk(4) driver to lagg(4) as it is too similar to vlan trunking. The name trunk is misused as the networking term trunk means carrying multiple VLANs over a single connection. The IEEE standard for link aggregation (802.3 section 3) does not talk about 'trunk' at all while it is used throughout IEEE 802.1Q in describing vlans. The lagg(4) driver provides link aggregation, failover and fault tolerance. Discussed on: current@	2007-04-17 00:35:11 +00:00
Andrew Thompson	b47888ceba	Add the trunk(4) driver for providing link aggregation, failover and fault tolerance. This driver allows aggregation of multiple network interfaces as one virtual interface using a number of different protocols/algorithms. failover - Sends traffic through the secondary port if the master becomes inactive. fec - Supports Cisco Fast EtherChannel. lacp - Supports the IEEE 802.3ad Link Aggregation Control Protocol (LACP) and the Marker Protocol. loadbalance - Static loadbalancing using an outgoing hash. roundrobin - Distributes outgoing traffic using a round-robin scheduler through all active ports. This code was obtained from OpenBSD and this also includes 802.3ad LACP support from agr(4) in NetBSD.	2007-04-10 00:27:25 +00:00
Bruce M Simpson	75ae0c016b	Fix a case where hardware removal of an interface caused an attempt to announce an ll_ifma which has gone away. Add a KASSERT to catch regressions. Bug found by: Tom Uffner	2007-03-27 16:11:28 +00:00
Bruce M Simpson	5896d12465	Fix tinderbox; ng_ether needs to see if_findmulti().	2007-03-20 03:15:43 +00:00
Bruce M Simpson	ec002fee99	Implement reference counting for ifmultiaddr, in_multi, and in6_multi structures. Detect when ifnet instances are detached from the network stack and perform appropriate cleanup to prevent memory leaks. This has been implemented in such a way as to be backwards ABI compatible. Kernel consumers are changed to use if_delmulti_ifma(); in_delmulti() is unable to detect interface removal by design, as it performs searches on structures which are removed with the interface. With this architectural change, the panics FreeBSD users have experienced with carp and pfsync should be resolved. Obtained from: p4 branch bms_netdev Reviewed by: andre Sponsored by: Garance A Drosehn Idea from: NetBSD MFC after: 1 month	2007-03-20 00:36:10 +00:00
Bruce M Simpson	40d8a30241	Fix a bug in if_findmulti(), whereby it would not find (and thus delete) a link-layer multicast group membership. Such memberships are needed in order to support protocols such as IS-IS without putting the interface into PROMISC or ALLMULTI modes. sa_equal() is not OK for comparing sockaddr_dl as it has deeper structure than a simple byte array, so add sa_dl_equal() and use that instead. Reviewed by: rwatson Verified with: /usr/sbin/mtest Bug found by: Jouke Witteveen MFC after: 2 weeks	2007-02-22 00:14:02 +00:00
Gleb Smirnoff	c18ffdc87d	The recent issues with em(4) interface has shown that the old 4.4BSD if_watchdog/if_timer interface doesn't fit modern SMP network stack design. Device drivers that need watchdog to monitor their hardware should implement it theirselves. Eventually the if_watchdog/if_timer API will be removed. For now, warn that driver uses it. Reviewed by: scottl	2006-11-30 15:02:01 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Andre Oppermann	773725a255	Fix the socket option IP_ONESBCAST by giving it its own case in ip_output() and skip over the normal IP processing. Add a supporting function ifa_ifwithbroadaddr() to verify and validate the supplied subnet broadcast address. PR: kern/99558 Tested by: Andrey V. Elsukov <bu7cher-at-yandex.ru> Sponsored by: TCP/IP Optimization Fundraise 2005 MFC after: 3 days	2006-09-06 17:12:10 +00:00
Sam Leffler	6b7330e2d4	Revise network interface cloning to take an optional opaque parameter that can specify configuration parameters: o rev cloner api's to add optional parameter block o add SIOCCREATE2 that accepts parameter data o rev vlan support to use new api (maintain old code) Reviewed by: arch@	2006-07-09 06:04:01 +00:00
Yaroslav Tykhiy	4b97d7affd	There is a consensus that ifaddr.ifa_addr should never be NULL, except in places dealing with ifaddr creation or destruction; and in such special places incomplete ifaddrs should never be linked to system-wide data structures. Therefore we can eliminate all the superfluous checks for "ifa->ifa_addr != NULL" and get ready to the system crashing honestly instead of masking possible bugs. Suggested by: glebius, jhb, ru	2006-06-29 19:22:05 +00:00
Gleb Smirnoff	457f48e65c	- First initialize ifnet, and then insert it into global list. - First remove from global list, then start destroying. PR: kern/97679 Submitted by: Alex Lyashkov <shadow itt.net.ru> Reviewed by: rwatson, brooks	2006-06-21 06:02:35 +00:00
Max Laier	0dad3f0e15	Import interface groups from OpenBSD. This allows to group interfaces in order to - for example - apply firewall rules to a whole group of interfaces. This is required for importing pf from OpenBSD 3.9 Obtained from: OpenBSD (with changes) Discussed on: -net (back in April)	2006-06-19 22:20:45 +00:00
Max Khon	affcaf7871	Fix KASSERT conditions in if_deregister_com_alloc().	2006-06-11 22:09:28 +00:00
Andrew Thompson	f3b90d48bb	Announce all interfaces to devd on attach/detach. This adds a new devctl notification so all interfaces including pseudo are reported. When netif creates the clones at startup devctl_disable has not been turned off yet so the interfaces will not be initialised twice, enforce this by adding an explicit order between rc.d/netif and rc.d/devd. This change allows actions to taken in userland when an interface is cloned and the pseudo interface will be automatically configured if a ifconfig_<int>="" line exists in rc.conf. Reviewed by: brooks No objections on: net	2006-06-01 00:41:07 +00:00
Gleb Smirnoff	93a69f5703	No direct call to carp_ifdetach() anymore. It is called by event handler. PR: kern/82908 Submitted by: Dan Lukes <dan obluda.cz>	2006-03-21 14:31:18 +00:00
Paul Saab	19cf04981a	Implement SIOCGIFCONF for 32bit binaries.	2006-02-02 19:58:37 +00:00
Gleb Smirnoff	75ee267c22	Merge the //depot/user/yar/vlan branch into CVS. It contains some collective work by yar, thompsa and myself. The checksum offloading part also involves work done by Mihail Balikov. The most important changes: o Instead of global linked list of all vlan softc use a per-trunk hash. The size of hash is dynamically adjusted, depending on number of entries. This changes struct ifnet, replacing counter of vlans with a pointer to trunk structure. This change is an improvement for setups with big number of VLANs, several interfaces and several CPUs. It is a small regression for a setup with a single VLAN interface. An alternative to dynamic hash is a per-trunk static array with 4096 entries, which is a compile time option - VLAN_ARRAY. In my experiments the array is not an improvement, probably because such a big trunk structure doesn't fit into CPU cache. o Introduce an UMA zone for VLAN tags. Since drivers depend on it, the zone is declared in kern_mbuf.c, not in optional vlan(4) driver. This change is a big improvement for any setup utilizing vlan(4). o Use rwlock(9) instead of mutex(9) for locking. We are the first ones to do this! :) o Some drivers can do hardware VLAN tagging + hardware checksum offloading. Add an infrastructure for this. Whenever vlan(4) is attached to a parent or parent configuration is changed, the flags on vlan(4) interface are updated. In collaboration with: yar, thompsa In collaboration with: Mihail Balikov <mihail.balikov interbgc.com>	2006-01-30 13:45:15 +00:00
Yaroslav Tykhiy	83ec464f61	Be consistent in checking ifa->ifa_addr for NULL. Found by: Coverity Prevent (tm) MFC after: 3 days	2006-01-23 10:30:34 +00:00
Ruslan Ermilov	4a0d6638b3	- Store pointer to the link-level address right in "struct ifnet" rather than in ifindex_table[]; all (except one) accesses are through ifp anyway. IF_LLADDR() works faster, and all (except one) ifaddr_byindex() users were converted to use ifp->if_addr. - Stop storing a (pointer to) Ethernet address in "struct arpcom", and drop the IFP2ENADDR() macro; all users have been converted to use IF_LLADDR() instead.	2005-11-11 16:04:59 +00:00
Ruslan Ermilov	d09ed26fd8	- Make IFP2ENADDR() a pointer to IF_LLADDR() rather than another copy of Ethernet address. - Change iso88025_ifattach() and fddi_ifattach() to accept MAC address as an argument, similar to ether_ifattach(), to make this work.	2005-11-11 07:36:14 +00:00
Yaroslav Tykhiy	b5c8bd5924	Clean up consistency checks in if_setflag(): . use KASSERT for all checks so that the source of an error can be detected; . use __func__ instead of spelling function name each time; . fix a typo.	2005-10-03 02:14:51 +00:00
Yaroslav Tykhiy	7aebc5e86e	Log a message about entering or leaving permanently promiscuous mode, as it is done for usual promiscuous mode already. This info is important because promiscuous mode in the hands of a malicious party can jeopardize the whole network.	2005-10-03 01:47:43 +00:00
Robert Watson	b1c53bc9c0	Take a first cut at cleaning up ifnet removal and multicast socket panics, which occur when stale ifnet pointers are left in struct moptions hung off of inpcbs: - Add in_ifdetach(), which matches in6_ifdetach(), and allows the protocol to perform early tear-down on the interface early in if_detach(). - Annotate that if_detach() needs careful consideration. - Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR -- this is not the place to detect interface removal! This also removes what is basically a nasty (and now unnecessary) hack. - Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP IPv4 sockets. It is now possible to run the msocket_ifnet_remove regression test using HEAD without panicking. MFC after: 3 days	2005-09-18 17:36:28 +00:00
Robert Watson	0a53be4671	In netkqfilter(), return EINVAL instead of 1 (EPERM) when a filter type is requested on a network interface file descriptor that is non-applicable. MFC after: 3 days	2005-09-12 19:26:03 +00:00
Sam Leffler	62313e4c3f	reclaim sbuf and clear lock on error in ifconf Submitted by: Ted Unangst Reviewed by: rwatson MFC after: 3 days	2005-09-04 17:32:47 +00:00
Brooks Davis	dc7c539e33	When we started calling if_findindex() from if_alloc() with an empty struct ifnet most of if_findindex() become a complex no-op. Remove it and replace it with a corrected version of the four line for loop it devolved to plus some error handling. This should probably be replaced with subr_unit at some point. Switch from checking ifaddr_byindex to ifnet_byindex when looking for empty indexes. Since we're doing this from if_alloc/if_free, we can only be sure that ifnet_byindex will be correct. This fixes panics when loading the ef(4) module. The panics were caused by the fact that if_alloc was called four time before if_attach was called and thus ifaddr_byindex was not set and the same unit was allocated again. This in turn caused the first if_attach to fail because the ifp was not the one in ifnet_byindex(ifp->if_index). Reported by: "Wojciech A. Koszek" <dunstan at freebsd dot czest dot pl> PR: kern/84987 MFC After: 1 day	2005-08-18 18:36:40 +00:00
Brooks Davis	7cf30146f0	- Move IF_ADDR_LOCK_DESTROY(ifp) from if_free to if_free_type. - Add a note that additions should be made to if_free_type and not if_free to help avoid this in the future. This apparently fixes a use after free in if_bridge and may fix bugs in other direct if_free_type consumers. Reported by: thompsa	2005-08-16 17:02:35 +00:00
Robert Watson	292ee7be1c	Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE, and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making and documenting the locking of these flags the responsibility of the device driver, not the network stack. The flags for these two fields will be mutually exclusive so that they can be exposed to user space as though they were stored in the same variable. Provide #defines to provide the old names #ifndef _KERNEL, so that user applications (such as ifconfig) can use the old flag names. Using the old names in a device driver will result in a compile error in order to help device driver writers adopt the new model. When exposing the interface flags to user space, via interface ioctls or routing sockets, or the two fields together. Since the driver flags cannot currently be set for user space, no new logic is currently required to handle this case. Add some assertions that general purpose network stack routines, such as if_setflags(), are not improperly used on driver-owned flags. With this change, a large number of very minor network stack races are closed, subject to correct device driver locking. Most were likely never triggered. Driver sweep to follow; many thanks to pjd and bz for the line-by-line review they gave this patch. Reviewed by: pjd, bz MFC after: 7 days	2005-08-09 10:16:17 +00:00
Sam Leffler	456d182d5b	destroy lock _before_ free'ing the structure it resides in	2005-08-06 18:42:01 +00:00
John Baldwin	6da3131abd	Initialize the if_addr mutex in if_alloc() rather than waiting until if_attach(). This allows ethernet drivers to use it in their routines to program their MAC filters before ether_ifattach() is called (de(4) is one such driver). Also, the if_addr mutex is destroyed in if_free() rather than if_detach(), so there was another potential bug in that a driver that failed during attach and called if_free() without having called ether_ifattach() would have tried to destroy an uninitialized mutex. Reported by: Holm Tiffe holm at freibergnet dot de Discussed with: rwatson	2005-08-04 14:39:47 +00:00
Robert Watson	c3b31afd92	Protect link layer network interface multicast address list manipulation using ifp->if_addr_mtx: - Initialize if_addr_mtx when ifnet is initialized. - Destroy if_addr_mtx when ifnet is torn down. - Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx. Staticize. - Extract ifmultiaddr allocation and initialization into if_allocmulti(); accept a 'mflags' argument to indicate whether or not sleeping is permitted. This centralizes error handling and address duplication. - Extract ifmultiaddr tear-down and deallocation in if_freemulti(). - Re-structure if_addmulti() to hold if_addr_mtx around manipulation of the ifnet multicast address list and reference count manipulation. Make use of non-sleeping allocations. Annotate the fact that we only generate routing socket events for explicit address addition, not implicit link layer address addition. - Re-structure if_delmulti() to hold if_addr_mtx around manipulation of the ifnet multicast address list and reference count manipulation. Annotate the lack of a routing socket event for implicit link layer address removal. - De-spl all and sundry. Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca> MFC after: 1 week	2005-08-02 23:23:26 +00:00
Robert Watson	2432c31c8b	In multicast routines: Compare pointers with NULL rather than treating them as booleans. Compare pointers with NULL rather than 0 to make it more clear they are pointers. Assign pointers value of NULL rather than 0 to make it more clear they are pointers. MFC after: 3 days	2005-07-19 10:12:58 +00:00
Robert Watson	d8d5b10e84	Rename equal() macro to sa_equal(), which matches the definitions of sa_equal() in other files, and makes it more clear what equal() is comparing. MFC after: 3 days	2005-07-19 10:03:47 +00:00
Max Laier	52023244de	Move eventhandler for 'ifnet_departure_event' at the end of the progress. Some of the (IPv6) cleanup functions send packets to inform peers of the departure. These packets confused users of ifnet_departure_event (pf at the moment). PR: kern/80627 Tested by: Divacky Roman MFC after: 1 week	2005-07-14 20:26:43 +00:00
Yaroslav Tykhiy	1a3b685942	MFp4: - Introduce a helper function if_setflag() containing the code common to ifpromisc() and if_allmulti() instead of duplicating the code poorly, with different bugs. - Call ifp->if_ioctl() in a consistent way: always use more compatible C syntax and check whether ifp->if_ioctl is not NULL prior to the call. MFC after: 1 month	2005-07-14 13:56:51 +00:00
Suleiman Souhlal	571dcd15e2	Fix the recent panics/LORs/hangs created by my kqueue commit by: - Introducing the possibility of using locks different than mutexes for the knlist locking. In order to do this, we add three arguments to knlist_init() to specify the functions to use to lock, unlock and check if the lock is owned. If these arguments are NULL, we assume mtx_lock, mtx_unlock and mtx_owned, respectively. - Using the vnode lock for the knlist locking, when doing kqueue operations on a vnode. This way, we don't have to lock the vnode while holding a mutex, in filt_vfsread. Reviewed by: jmg Approved by: re (scottl), scottl (mentor override) Pointyhat to: ssouhlal Will be happy: everyone	2005-07-01 16:28:32 +00:00
Brooks Davis	1436936ab0	Spelling/grammer fixes in comment. Reported by: Hans Petter Selasky <hselasky at c2i dot net> Approved by: re (ifnet blanked)	2005-06-17 17:19:34 +00:00
Brooks Davis	28ef2db496	Return NULL instead of a bogus pointer from if_alloc when if_com_alloc fails. Move detaching the ifnet from the ifindex_table into if_free so we can both keep the sanity checks and actually delete the ifnets. [0] Reported by: gallatin [0] Approved by: re (blanket)	2005-06-12 00:53:03 +00:00
Brooks Davis	fc74a9f93a	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam	2005-06-10 16:49:24 +00:00
Brooks Davis	9d80a3307a	Send link state change notifications to /dev/devctl. This is needed to start the OpenBSD dhclient when links come up.	2005-06-06 19:08:11 +00:00
Andrew Thompson	8f86751705	Add hooks into the networking layer to support if_bridge. This changes struct ifnet so a buildworld is necessary. Approved by: mlaier (mentor) Obtained from: NetBSD	2005-06-05 03:13:13 +00:00
Peter Edwards	45778b37b2	Separate out address-detaching part of if_detach into if_purgeaddrs, so if_tap doesn't need to rely on locally-rolled code to do same. The observable symptom of if_tap's bzero'ing the address details was a crash in "ifconfig tap0" after an if_tap device was closed. Reported By: Matti Saarinen (mjsaarin at cc dot helsinki dot fi)	2005-05-25 13:52:03 +00:00
Gleb Smirnoff	68a3482f69	Do not call all link state callbacks directly, but schedule a taskqueue(9) task. This fixes LORs and adds possibility to serve such events pseudorecursively, when link state change of interface causes subsequent change on other interfaces. Sponsored by: Rambler Reviewed by: sam, brooks, mux	2005-04-20 09:30:54 +00:00
Colin Percival	fbd24c5ed6	Zero the ifr.ifr_name buffer in ifconf() in order to avoid accidental disclosure of kernel memory to userland. Security: FreeBSD-SA-05:04.ifconf	2005-04-15 01:52:40 +00:00
Gleb Smirnoff	d4d2297060	ifma_protospec is a pointer. Use NULL when assigning or compating it.	2005-03-20 14:31:45 +00:00
Gleb Smirnoff	5515c2e793	Add a sysctl net.link.log_link_state_change, which allows to suppress logging of interface link state changes. Requested by: sam, kan	2005-03-12 12:58:03 +00:00
Brooks Davis	bc9d299133	Change the definition of struct if_data's member ifi_epoch from wall clock time to uptime because wall clock time may go backwards. This is a change in the API which will impact SNMP agents who are using ifi_epoch to set RFC2233's ifCounterDiscontinuityTime. None are know to exist today. This will not impact applications that are using the <index, epoch> tuple to verify interface uniqueness except that it eliminates a race which could lead to a false assumption of uniqueness. Because this is a behavior change, bump __FreeBSD_version. Discussed with: re (jhb, scottl) MFC after: 3 days Pointed out by: pkh (way back at EuroBSDCon) Pointy hat: brooks	2005-02-25 19:46:41 +00:00
Gleb Smirnoff	8b25904e36	Typo in comment.	2005-02-22 15:29:29 +00:00
Gleb Smirnoff	4d96314f88	- In if_link_state_change() extract function body from if-block, to improve readability. - Call carp_carpdev_state() from if_link_state_change() if interface has associated CARP interface. Sponsored by: Rambler	2005-02-22 14:21:59 +00:00
Gleb Smirnoff	a97719482d	Add CARP (Common Address Redundancy Protocol), which allows multiple hosts to share an IP address, providing high availability and load balancing. Original work on CARP done by Michael Shalayeff, with many additions by Marco Pfatschbacher and Ryan McBride. FreeBSD port done solely by Max Laier. Patch by: mlaier Obtained from: OpenBSD (mickey, mcbride)	2005-02-22 13:04:05 +00:00
Xin LI	b0b4b28bf1	Validate ifc->ifc_len before submitting its incarnation to sbuf_new, which will finally lead to kernel panic. Security: This prevents a local (root-launched) DoS Submitted by: Wojciech A. Koszek [dunstan at freebsd czest pl] PR: 77421 MFC After: 1 week	2005-02-12 17:51:12 +00:00
Gleb Smirnoff	8b02df2485	Log changes of link state. Reviewed by: rwatson	2005-01-30 12:57:47 +00:00
Gleb Smirnoff	1c7899c74e	This change adds reliability for Ethernet trunks built with ng_one2many: - Introduce another ng_ether(4) callback ng_ether_link_state_p, which is called from if_link_state_change(), every time link is changed. - In ng_ether_link_state() send netgraph control message notifying of link state change to a node connected to "lower" hook. Reviewed by: sam MFC after: 2 weeks	2005-01-08 12:42:03 +00:00
Warner Losh	c398230b64	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
Sam Leffler	94f5c9cfc0	Cleanup link state change notification: o add new if_link_state_change routine that deals with link state changes o change mii to use if_link_state_change	2004-12-08 05:45:59 +00:00
Max Laier	69fb23b73d	Implement the check I was talking about in the previous message already. Introduce domain_init_status to keep track of the init status of the domains list (surprise). 0 = uninitialized, 1 = initialized/unpopulated, 2 = initialized/done. Higher values can be used to support late addition of domains which right now "works", but is potential dangerous. I choose to only give a warning when doing so. Use domain_init_status with if_attachdomain[1]() to ensure that we have a complete domains list when we init the if_afdata array. Store the current value of domain_init_status in if_afdata_initialized. This way we can update if_afdata after a new protocol has been added (once that is allowed). Submitted by: se (with changes) Reviewed by: julian, glebius, se PR: kern/73321 (partly)	2004-11-30 22:38:37 +00:00
Robert Watson	6237419d5c	Assign if_broadcastaddr to NULL not 0 in if_attach(). Printf() a warning if if_attachdomain() is called more than once on an interface to generate some noise on mailing lists when this occurs. Fix up style in if_start(), where spaces crept in instead of tabs at some point. MFC after: 1 week MFC note: Not the printf().	2004-11-23 23:31:33 +00:00
Robert Watson	0b762445b9	Move if_handoff() from an inline in if_var.h to a function to if.c in orden to harden the ABI for 5.x; this will permit us to modify the locking in the ifnet packet dispatch without requiring drivers to be recompiled. MFC after: 3 days Discussed at: EuroBSDCon Developer's Summit	2004-10-30 09:39:13 +00:00
Robert Watson	31302ebf9d	Define IFF_LOCKGIANT() and IFF_UNLOCKGIANT() macros, which conditionally acquire Giant if the passed interface has IFF_NEEDSGIANT set on it. Modify calls into (ifp)->if_ioctl() in if.c to use these macros in order to ensure that Giant is held. MFC after: 3 days Bumped into by: jmg	2004-10-19 18:11:55 +00:00
Brian Feldman	5ed8cedc83	Call sbuf_finish() before sbuf_data() so as to not panic the system.	2004-09-22 12:53:27 +00:00
Brooks Davis	4dcf2bbbff	Fix a LOR where ifconf() used copyout while holding a mutex. This LOR was seen when configuring addresses on interfaces using ifconfig. This patch has been verified to work with over eight thousand addresses assigned to an interface. LOR id: 031	2004-09-22 08:59:41 +00:00
Brooks Davis	71672bb6f6	Log the renaming of an interface. This should make it easier to follow kernel log files.	2004-09-18 05:02:08 +00:00
Brooks Davis	55287f2a60	Re-add ifi_epoch, to struct if_data, this time replacing ifi_unused to avoid ABI changes. It is set to the last time the interface counters were zeroed, currently the time if_attach() was called. It is intentended to be a valid value for RFC2233's ifCounterDiscontinuityTime and to make it easier for applications to verify that the interface they find at a given index is the one that was there last time they looked. Due to space constraints ifi_epoch is a time_t rather then a struct timeval. SNMP would prefer higher precision, but this unlikely to be useful in practice.	2004-09-08 04:50:55 +00:00
John-Mark Gurney	9b90387dcf	don't call f_detach if the filter has alread removed the knote.. This happens when a proc exits, but needs to inform the user that this has happened.. This also means we can remove the check for detached from proc and sig f_detach functions as this is doing in kqueue now... MFC after: 5 days	2004-09-06 19:02:42 +00:00
Brooks Davis	4ff62bd97b	Back out ifi_epoch. The ABI breakage is too disruptive this close to 5-STABLE. ifi_epoch will shortly be reintroduced with less precistion using the space currently allocated to ifi_unused.	2004-09-02 05:07:29 +00:00
Max Laier	7b21048cea	Fix an assertion when if_down()ing a ALTQ managed interface. The lock should have been in place all the time the mtx_assert in the ALTQ code just discovered the shortcoming. PR: i386/71195 Tested by: Bettan (PR originator), myself MFC after: 5 days	2004-09-01 19:56:47 +00:00
Brooks Davis	9e734b4468	Use a spare byte in struct if_data to store the structure size without increasing it. Add code to ifconfig to use this size to find the sockaddr_dl after the struct if_data in the routing message. This allows struct if_data to grow (up to 255 bytes) without breaking ifconfig. Submitted by: peter	2004-09-01 18:22:14 +00:00
Brooks Davis	1fc4519b1d	Add a new variable, ifi_epoch, to struct if_data. It is set to the last time the interface counters were zeroed, currently the time if_attach() was called. It is indentended to be a valid value for RFC2233's ifCounterDiscontinuityTime and to make it easier for applications to verify that the interface they find at a given index is the one that was there last time they looked. An if_epoch "compatability" macro has not been created as ifi_epoch has never been a member of struct ifnet. Approved by: andre, bms, wollman	2004-08-30 06:29:26 +00:00
Brooks Davis	b9907cd45b	When detaching an interface, don't leave an obsolete pointer to the soon to be deleted struct ifnet around. PR: kern/52260 MFC After: 3 days	2004-08-27 19:42:40 +00:00
John-Mark Gurney	ad3b9257c2	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowlege of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using it's filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to aquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00
Peter Pentchev	3f35d5150b	Do not attempt to clean up data that has not been initialized yet. This fixes two kernel panics on boot when the xl driver fails to allocate bus/port/memory resources. Reviewed by: silence on -net	2004-08-06 09:08:33 +00:00
Robert Watson	af5e59bf28	Add a new network interface flag, IFF_NEEDSGIANT, which will allow device drivers to declare that the ifp->if_start() method implemented by the driver requires Giant in order to operate correctly. Add a 'struct task' to 'struct ifnet' that can be used to execute a deferred ifp->if_start() in the event that if_start needs to be called in a Giant-free environment. To do this, introduce if_start(), a wrapper function for ifp->if_start(). If the interface can run MPSAFE, it directly dispatches into the interface start routine. If it can't run MPSAFE, we're running with debug.mpsafenet != 0, and Giant isn't currently held, the task is queued to execute in a swi holding Giant via if_start_deferred(). Modify if_handoff() to use if_start() instead of direct dispatch. Modify 802.11 to use if_start() instead of direct dispatch. This is intended to provide increased compatibility for non-MPSAFE network device drivers in the presence of Giant-free operation via asynchronous dispatch. However, this commit does not mark any network interfaces as IFF_NEEDSGIANT.	2004-07-27 23:20:45 +00:00
Robert Watson	8bbfdc98e4	Gratuitous whitespace change to un-wrap a short line.	2004-07-18 19:53:35 +00:00
Brooks Davis	f889d2ef8d	Major overhaul of pseudo-interface cloning. Highlights include: - Split the code out into if_clone.[ch]. - Locked struct if_clone. [1] - Add a per-cloner match function rather then simply matching names of the form <name><unit> and <name>. - Use the match function to allow creation of <interface>.<tag> vlan interfaces. The old way is preserved unchanged! - Also the match function to allow creation of stf(4) interfaces named stf0, stf, or 6to4. This is the only major user visible change in that "ifconfig stf" creates the interface stf rather then stf0 and does not print "stf0" to stdout. - Allow destroy functions to fail so they can refuse to delete interfaces. Currently, we forbid the deletion of interfaces which were created in the init function, particularly lo0, pflog0, and pfsync0. In the case of lo0 this was a panic implementation so it does not count as a user visiable change. :-) - Since most interfaces do not need the new functionality, an family of wrapper functions, ifc_simple_*(), were created to wrap old style cloner functions. - The IF_CLONE_INITIALIZER macro is replaced with a new incompatible IFC_CLONE_INITIALIZER and ifc_simple consumers use IFC_SIMPLE_DECLARE instead. Submitted by: Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1] Reviewed by: andre, mlaier Discussed on: net	2004-06-22 20:13:25 +00:00
Poul-Henning Kamp	89c9c53da0	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
Max Laier	4cb655c020	Transform tbr_dequeue into a function pointer in order to build drivers with ALTQ enabled versions of IFQ_* macros by default, as requested by serveral others. This is a follow-up to the quick fix I committed yesterday which turned off the ALTQ checks for non-ALTQ kernels.	2004-06-15 01:45:19 +00:00
Max Laier	02b199f158	Link ALTQ to the build and break with ABI for struct ifnet. Please recompile your (network) modules as well as any userland that might make sense of sizeof(struct ifnet). This does not change the queueing yet. These changes will follow in a seperate commit. Same with the driver changes, which need case by case evaluation. __FreeBSD_version bump will follow. Tested-by: (i386)LINT	2004-06-13 17:29:10 +00:00
Luigi Rizzo	3fefbff0c2	arpcom untangling: consistently with the rest of the code, use IFP2AC(ifp) to access the arpcom structure given the ifp. In this case also fix a difference in assumptions WRT the rest of the net/ sources: it is not the 'struct *softc' that starts with a 'struct arpcom', but a 'struct arpcom' that starts with a 'struct ifnet'	2004-04-24 22:24:48 +00:00
Luigi Rizzo	f4247b5934	Fix a recently introduced panic in if_detach() by delaying the invalidation of ifindex_table[] entry. Probably this code should be moved even further down, but for the time being let's do it this way.	2004-04-19 17:28:15 +00:00
Max Laier	8614fb12a0	Make if_(un)route static in if.c as they are called from if_up/if_down only. This is also cleanup to make locking easier. Reviewed by: luigi Approved by: bms(mentor)	2004-04-18 18:59:44 +00:00
Luigi Rizzo	9046571f1c	Use if_link instead of the alias if_list, and change a for() into the TAILQ_FOREACH() form. Comment the need to store the same info (mac address for ethernet-type devices) in two different places. No functional changes. Even the compiler output should be unmodified by this change.	2004-04-16 10:32:13 +00:00
Luigi Rizzo	9b98ee2c4f	Consistently use ifaddr_byindex() to access the link-level address of an interface. No functional change. On passing, comment a likely bug in net/rtsock.c:sysctl_ifmalist() which, if confirmed, would deserve to be fixed and MFC'ed	2004-04-16 08:14:34 +00:00
Warner Losh	f36cfd49ad	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 20:46:16 +00:00
Brooks Davis	bc1470f1f1	Don't allow interfaces to be renamed to the empty string. While I'm here, errors aren't bools. Pointed out by: hmp	2004-03-13 02:35:03 +00:00
Brooks Davis	196f7f54d2	Remove if_withname. It came in with the KAME import, but never got used. Should someone need its functionality, it's a really expensive implementation of: ifnet_byindex(sdl->sdl_index) Reviewed by: bde, ume	2004-03-13 02:31:40 +00:00
Max Laier	25a4adcec4	Bring eventhandler callbacks for pf. This enables pf to track dynamic address changes on interfaces (dailup) with the "on (<ifname>)"-syntax. This also brings hooks in anticipation of tracking cloned interfaces, which will be in future versions of pf. Approved by: bms(mentor)	2004-02-26 04:27:55 +00:00
Poul-Henning Kamp	dc08ffec87	Device megapatch 4/6: Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION. Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.	2004-02-21 21:10:55 +00:00
Yaroslav Tykhiy	913e410e29	Minor beautifications related to style(9) and code consistency. No functional changes.	2004-02-21 12:56:09 +00:00
Yaroslav Tykhiy	efb4018be7	Improve the SIOCSIFCAP handler a bit: - allow for ifp->if_ioctl being NULL, as the rest of ifioctl() does; - give the interface driver a chance to report a error to the caller; - don't forget to update ifp->if_lastchange upon successful modification of interface operation parameters.	2004-02-21 12:48:25 +00:00
Brooks Davis	36c19a572a	Add the kernel side of network interface renaming support. The basic process is to send a routing socket announcement that the interface has departed, change if_xname, update the sockaddr_dl associated with the interface, and announce the arrival of the interface on the routing socket. As part of this change, ifunit() is greatly simplified by testing if_xname directly. if_clone_destroy() now uses if_dname to look up the cloner for the interface and if_dunit to identify the unit number. Reviewed by: ru, sam (concept) Vincent Jardin <vjardin AT free.fr> Max Laier <max AT love2party.net>	2004-02-04 02:54:25 +00:00
Brooks Davis	ccb82468ac	More macro cleanup. Use the system roundup2() macro instead of making our own ROUNDUP() macro. Suggested by: bde	2004-02-02 21:55:34 +00:00
Brooks Davis	a8773564ca	Cleanup malloc() use in if_attach(): - malloc() returns a void* and does not need a cast - when called with M_WAITOK, malloc() can not return NULL so don't check for that case. The result of the check was bogus anyway since it would leave the interface broken.	2004-01-27 19:35:05 +00:00
Brooks Davis	8abaf58586	Clean up macro usage in if_attach(): - Use the system offsetof macro rather then making out own. - undef ROUND after we use it rather then polluting the whole file.	2004-01-27 03:15:09 +00:00
Ruslan Ermilov	12b8b80e45	Don't panic if there are more than 255 interfaces in the system.	2004-01-23 15:53:23 +00:00
Brian Feldman	5d7252afab	Don't truncate the interface name in ifunit(). It's now possible to query "very long interface names", e.g.: ndis_atheros0: flags=8847<UP,BROADCAST,DEBUG,RUNNING,SIMPLEX,MULTICAST> mtu 1500	2003-12-26 18:09:35 +00:00
Brooks Davis	9bf40ede4a	Replace the if_name and if_unit members of struct ifnet with new members if_xname, if_dname, and if_dunit. if_xname is the name of the interface and if_dname/unit are the driver name and instance. This change paves the way for interface renaming and enhanced pseudo device creation and configuration symantics. Approved By: re (in principle) Reviewed By: njl, imp Tested On: i386, amd64, sparc64 Obtained From: NetBSD (if_xname)	2003-10-31 18:32:15 +00:00
Brooks Davis	13fb40df0a	Replace a couple printfs with if_printfs.	2003-10-31 01:35:07 +00:00
Hajimu UMEMOTO	234a35c714	Since dp->dom_ifattach calls malloc() with M_WAITOK, we cannot use mutex lock directly here. Protect ifp->if_afdata instead. Reported by: grehan	2003-10-24 16:57:59 +00:00
Dag-Erling Smørgrav	72fd1b6a20	Clean up whitespace, remove "register" keyword, ANSIfy. No functional changes.	2003-10-23 13:49:10 +00:00
Hajimu UMEMOTO	e115574c1d	protect by IFNET_RLOCK.	2003-10-22 15:10:39 +00:00
Hajimu UMEMOTO	31b1bfe1b0	- add dom_if{attach,detach} framework. - transition to use ifp->if_afdata. Obtained from: KAME	2003-10-17 15:46:31 +00:00

1 2 3 4 5 ...

416 Commits