virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course
of the next few weeks.
Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch
unsynchronized. While races were extremely rare, we've now had a
couple of reports of panics in environments involving large numbers of
IPSEC tunnels being added very quickly on an active system.
- Add accessor functions ifnet_byindex(), ifaddr_byindex(),
ifdev_byindex() to replace existing accessor macros. These functions
now acquire the ifnet lock before derefencing the table.
- Add IFNET_WLOCK_ASSERT().
- Add static accessor functions ifnet_setbyindex(), ifdev_setbyindex(),
which set values in the table either asserting of acquiring the ifnet
lock.
- Use accessor functions throughout if.c to modify and read
ifindex_table.
- Rework ifnet attach/detach to lock around ifindex_table modification.
Note that these changes simply close races around use of ifindex_table,
and make no attempt to solve the probem of disappearing ifnets. Further
refinement of this work, including with respect to ifindex_table
resizing, is still required.
In a future change, the ifnet lock should be converted from a mutex to an
rwlock in order to reduce contention.
Reviewed and tested by: brooks
- verified that the ifp->if_snd.ifq_mtx was initalized for
all attached interfaces. This was pointless because it was
initalized for all interfaces in if_attach() so I've removed it.
- Checked that ifp->if_snd.ifq_maxlen is initalized and set it to
ifqmaxlen if unset. This makes more sense in if_attach() so
I moved it there.
- The first call of if_slowtimo(). Delete if_check() and call
if_slowtimo() directly from the SYSINIT().
This particular implementation is designed to be fully backwards compatible
and to be MFC-able to 7.x (and 6.x)
Currently the only protocol that can make use of the multiple tables is IPv4
Similar functionality exists in OpenBSD and Linux.
From my notes:
-----
One thing where FreeBSD has been falling behind, and which by chance I
have some time to work on is "policy based routing", which allows
different
packet streams to be routed by more than just the destination address.
Constraints:
------------
I want to make some form of this available in the 6.x tree
(and by extension 7.x) , but FreeBSD in general needs it so I might as
well do it in -current and back port the portions I need.
One of the ways that this can be done is to have the ability to
instantiate multiple kernel routing tables (which I will now
refer to as "Forwarding Information Bases" or "FIBs" for political
correctness reasons). Which FIB a particular packet uses to make
the next hop decision can be decided by a number of mechanisms.
The policies these mechanisms implement are the "Policies" referred
to in "Policy based routing".
One of the constraints I have if I try to back port this work to
6.x is that it must be implemented as a EXTENSION to the existing
ABIs in 6.x so that third party applications do not need to be
recompiled in timespan of the branch.
This first version will not have some of the bells and whistles that
will come with later versions. It will, for example, be limited to 16
tables in the first commit.
Implementation method, Compatible version. (part 1)
-------------------------------
For this reason I have implemented a "sufficient subset" of a
multiple routing table solution in Perforce, and back-ported it
to 6.x. (also in Perforce though not always caught up with what I
have done in -current/P4). The subset allows a number of FIBs
to be defined at compile time (8 is sufficient for my purposes in 6.x)
and implements the changes needed to allow IPV4 to use them. I have not
done the changes for ipv6 simply because I do not need it, and I do not
have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.
Other protocol families are left untouched and should there be
users with proprietary protocol families, they should continue to work
and be oblivious to the existence of the extra FIBs.
To understand how this is done, one must know that the current FIB
code starts everything off with a single dimensional array of
pointers to FIB head structures (One per protocol family), each of
which in turn points to the trie of routes available to that family.
The basic change in the ABI compatible version of the change is to
extent that array to be a 2 dimensional array, so that
instead of protocol family X looking at rt_tables[X] for the
table it needs, it looks at rt_tables[Y][X] when for all
protocol families except ipv4 Y is always 0.
Code that is unaware of the change always just sees the first row
of the table, which of course looks just like the one dimensional
array that existed before.
The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign()
are all maintained, but refer only to the first row of the array,
so that existing callers in proprietary protocols can continue to
do the "right thing".
Some new entry points are added, for the exclusive use of ipv4 code
called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(),
which have an extra argument which refers the code to the correct row.
In addition, there are some new entry points (currently called
rtalloc_fib() and friends) that check the Address family being
looked up and call either rtalloc() (and friends) if the protocol
is not IPv4 forcing the action to row 0 or to the appropriate row
if it IS IPv4 (and that info is available). These are for calling
from code that is not specific to any particular protocol. The way
these are implemented would change in the non ABI preserving code
to be added later.
One feature of the first version of the code is that for ipv4,
the interface routes show up automatically on all the FIBs, so
that no matter what FIB you select you always have the basic
direct attached hosts available to you. (rtinit() does this
automatically).
You CAN delete an interface route from one FIB should you want
to but by default it's there. ARP information is also available
in each FIB. It's assumed that the same machine would have the
same MAC address, regardless of which FIB you are using to get
to it.
This brings us as to how the correct FIB is selected for an outgoing
IPV4 packet.
Firstly, all packets have a FIB associated with them. if nothing
has been done to change it, it will be FIB 0. The FIB is changed
in the following ways.
Packets fall into one of a number of classes.
1/ locally generated packets, coming from a socket/PCB.
Such packets select a FIB from a number associated with the
socket/PCB. This in turn is inherited from the process,
but can be changed by a socket option. The process in turn
inherits it on fork. I have written a utility call setfib
that acts a bit like nice..
setfib -3 ping target.example.com # will use fib 3 for ping.
It is an obvious extension to make it a property of a jail
but I have not done so. It can be achieved by combining the setfib and
jail commands.
2/ packets received on an interface for forwarding.
By default these packets would use table 0,
(or possibly a number settable in a sysctl(not yet)).
but prior to routing the firewall can inspect them (see below).
(possibly in the future you may be able to associate a FIB
with packets received on an interface.. An ifconfig arg, but not yet.)
3/ packets inspected by a packet classifier, which can arbitrarily
associate a fib with it on a packet by packet basis.
A fib assigned to a packet by a packet classifier
(such as ipfw) would over-ride a fib associated by
a more default source. (such as cases 1 or 2).
4/ a tcp listen socket associated with a fib will generate
accept sockets that are associated with that same fib.
5/ Packets generated in response to some other packet (e.g. reset
or icmp packets). These should use the FIB associated with the
packet being reponded to.
6/ Packets generated during encapsulation.
gif, tun and other tunnel interfaces will encapsulate using the FIB
that was in effect withthe proces that set up the tunnel.
thus setfib 1 ifconfig gif0 [tunnel instructions]
will set the fib for the tunnel to use to be fib 1.
Routing messages would be associated with their
process, and thus select one FIB or another.
messages from the kernel would be associated with the fib they
refer to and would only be received by a routing socket associated
with that fib. (not yet implemented)
In addition Netstat has been edited to be able to cope with the
fact that the array is now 2 dimensional. (It looks in system
memory using libkvm (!)). Old versions of netstat see only the first FIB.
In addition two sysctls are added to give:
a) the number of FIBs compiled in (active)
b) the default FIB of the calling process.
Early testing experience:
-------------------------
Basically our (IronPort's) appliance does this functionality already
using ipfw fwd but that method has some drawbacks.
For example,
It can't fully simulate a routing table because it can't influence the
socket's choice of local address when a connect() is done.
Testing during the generating of these changes has been
remarkably smooth so far. Multiple tables have co-existed
with no notable side effects, and packets have been routes
accordingly.
ipfw has grown 2 new keywords:
setfib N ip from anay to any
count ip from any to any fib N
In pf there seems to be a requirement to be able to give symbolic names to the
fibs but I do not have that capacity. I am not sure if it is required.
SCTP has interestingly enough built in support for this, called VRFs
in Cisco parlance. it will be interesting to see how that handles it
when it suddenly actually does something.
Where to next:
--------------------
After committing the ABI compatible version and MFCing it, I'd
like to proceed in a forward direction in -current. this will
result in some roto-tilling in the routing code.
Firstly: the current code's idea of having a separate tree per
protocol family, all of the same format, and pointed to by the
1 dimensional array is a bit silly. Especially when one considers that
there is code that makes assumptions about every protocol having the
same internal structures there. Some protocols don't WANT that
sort of structure. (for example the whole idea of a netmask is foreign
to appletalk). This needs to be made opaque to the external code.
My suggested first change is to add routing method pointers to the
'domain' structure, along with information pointing the data.
instead of having an array of pointers to uniform structures,
there would be an array pointing to the 'domain' structures
for each protocol address domain (protocol family),
and the methods this reached would be called. The methods would have
an argument that gives FIB number, but the protocol would be free
to ignore it.
When the ABI can be changed it raises the possibilty of the
addition of a fib entry into the "struct route". Currently,
the structure contains the sockaddr of the desination, and the resulting
fib entry. To make this work fully, one could add a fib number
so that given an address and a fib, one can find the third element, the
fib entry.
Interaction with the ARP layer/ LL layer would need to be
revisited as well. Qing Li has been working on this already.
This work was sponsored by Ironport Systems/Cisco
Reviewed by: several including rwatson, bz and mlair (parts each)
Obtained from: Ironport systems/Cisco
we're certain the allocation will entierly succeed. This fixes a leak in a
fairly unlikely case.
Reported by: vijay singh <vijjus at rocketmail dot com>
MFC after: 1 week
after each SYSINIT() macro invocation. This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.
MFC after: 1 month
Discussed with: imp, rink
for all network interfaces, not just ethernet-like ones.
Upgrade it to a louder WARNING and be explicit that the flag is obsolete.
Support for IFF_NEEDSGIANT will be removed in a few months (see arch@ for
details) and will not appear in 8.0.
Upgrade if_watchdog to a WARNING.
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:
mac_<object>_<method/action>
mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.
All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
framework for non-MPSAFE network protocols:
- Remove debug_mpsafenet variable, sysctl, and tunable.
- Remove NET_NEEDS_GIANT() and associate SYSINITSs used by it to force
debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel.
- Remove logic to automatically flag interrupt handlers as non-MPSAFE if
debug.mpsafenet is set for an INTR_TYPE_NET handler.
- Remove logic to automatically flag netisr handlers as non-MPSAFE if
debug.mpsafenet is set.
- Remove references in a few subsystems, including NFS and Cronyx drivers,
which keyed off debug_mpsafenet to determine various aspects of their own
locking behavior.
- Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into
no-op's, as their entire behavior was determined by the value in
debug_mpsafenet.
- Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE.
Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still
present in subsystems, and will be removed in followup commits.
Reviewed by: bz, jhb
Approved by: re (kensmith)
The name trunk is misused as the networking term trunk means carrying multiple
VLANs over a single connection. The IEEE standard for link aggregation (802.3
section 3) does not talk about 'trunk' at all while it is used throughout IEEE
802.1Q in describing vlans.
The lagg(4) driver provides link aggregation, failover and fault tolerance.
Discussed on: current@
tolerance. This driver allows aggregation of multiple network interfaces as
one virtual interface using a number of different protocols/algorithms.
failover - Sends traffic through the secondary port if the master becomes
inactive.
fec - Supports Cisco Fast EtherChannel.
lacp - Supports the IEEE 802.3ad Link Aggregation Control Protocol
(LACP) and the Marker Protocol.
loadbalance - Static loadbalancing using an outgoing hash.
roundrobin - Distributes outgoing traffic using a round-robin scheduler
through all active ports.
This code was obtained from OpenBSD and this also includes 802.3ad LACP support
from agr(4) in NetBSD.
structures. Detect when ifnet instances are detached from the network
stack and perform appropriate cleanup to prevent memory leaks.
This has been implemented in such a way as to be backwards ABI compatible.
Kernel consumers are changed to use if_delmulti_ifma(); in_delmulti()
is unable to detect interface removal by design, as it performs searches
on structures which are removed with the interface.
With this architectural change, the panics FreeBSD users have experienced
with carp and pfsync should be resolved.
Obtained from: p4 branch bms_netdev
Reviewed by: andre
Sponsored by: Garance A Drosehn
Idea from: NetBSD
MFC after: 1 month
a link-layer multicast group membership.
Such memberships are needed in order to support protocols such as
IS-IS without putting the interface into PROMISC or ALLMULTI modes.
sa_equal() is not OK for comparing sockaddr_dl as it has deeper structure
than a simple byte array, so add sa_dl_equal() and use that instead.
Reviewed by: rwatson
Verified with: /usr/sbin/mtest
Bug found by: Jouke Witteveen
MFC after: 2 weeks
if_watchdog/if_timer interface doesn't fit modern SMP network
stack design.
Device drivers that need watchdog to monitor their hardware should
implement it theirselves.
Eventually the if_watchdog/if_timer API will be removed. For now,
warn that driver uses it.
Reviewed by: scottl
specific privilege names to a broad range of privileges. These may
require some future tweaking.
Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA
and skip over the normal IP processing.
Add a supporting function ifa_ifwithbroadaddr() to verify and validate the
supplied subnet broadcast address.
PR: kern/99558
Tested by: Andrey V. Elsukov <bu7cher-at-yandex.ru>
Sponsored by: TCP/IP Optimization Fundraise 2005
MFC after: 3 days
parameter that can specify configuration parameters:
o rev cloner api's to add optional parameter block
o add SIOCCREATE2 that accepts parameter data
o rev vlan support to use new api (maintain old code)
Reviewed by: arch@
except in places dealing with ifaddr creation or destruction; and
in such special places incomplete ifaddrs should never be linked
to system-wide data structures. Therefore we can eliminate all the
superfluous checks for "ifa->ifa_addr != NULL" and get ready
to the system crashing honestly instead of masking possible bugs.
Suggested by: glebius, jhb, ru
list.
- First remove from global list, then start destroying.
PR: kern/97679
Submitted by: Alex Lyashkov <shadow itt.net.ru>
Reviewed by: rwatson, brooks
order to - for example - apply firewall rules to a whole group of
interfaces. This is required for importing pf from OpenBSD 3.9
Obtained from: OpenBSD (with changes)
Discussed on: -net (back in April)
notification so all interfaces including pseudo are reported. When netif
creates the clones at startup devctl_disable has not been turned off yet so the
interfaces will not be initialised twice, enforce this by adding an explicit
order between rc.d/netif and rc.d/devd.
This change allows actions to taken in userland when an interface is cloned
and the pseudo interface will be automatically configured if a ifconfig_<int>=""
line exists in rc.conf.
Reviewed by: brooks
No objections on: net
work by yar, thompsa and myself. The checksum offloading part also involves
work done by Mihail Balikov.
The most important changes:
o Instead of global linked list of all vlan softc use a per-trunk
hash. The size of hash is dynamically adjusted, depending on
number of entries. This changes struct ifnet, replacing counter
of vlans with a pointer to trunk structure. This change is an
improvement for setups with big number of VLANs, several interfaces
and several CPUs. It is a small regression for a setup with a single
VLAN interface.
An alternative to dynamic hash is a per-trunk static array with
4096 entries, which is a compile time option - VLAN_ARRAY. In my
experiments the array is not an improvement, probably because such
a big trunk structure doesn't fit into CPU cache.
o Introduce an UMA zone for VLAN tags. Since drivers depend on it,
the zone is declared in kern_mbuf.c, not in optional vlan(4) driver.
This change is a big improvement for any setup utilizing vlan(4).
o Use rwlock(9) instead of mutex(9) for locking. We are the first
ones to do this! :)
o Some drivers can do hardware VLAN tagging + hardware checksum
offloading. Add an infrastructure for this. Whenever vlan(4) is
attached to a parent or parent configuration is changed, the flags
on vlan(4) interface are updated.
In collaboration with: yar, thompsa
In collaboration with: Mihail Balikov <mihail.balikov interbgc.com>
rather than in ifindex_table[]; all (except one) accesses are
through ifp anyway. IF_LLADDR() works faster, and all (except
one) ifaddr_byindex() users were converted to use ifp->if_addr.
- Stop storing a (pointer to) Ethernet address in "struct arpcom",
and drop the IFP2ENADDR() macro; all users have been converted
to use IF_LLADDR() instead.
copy of Ethernet address.
- Change iso88025_ifattach() and fddi_ifattach() to accept MAC
address as an argument, similar to ether_ifattach(), to make
this work.
as it is done for usual promiscuous mode already. This info is important
because promiscuous mode in the hands of a malicious party can jeopardize
the whole network.
panics, which occur when stale ifnet pointers are left in struct
moptions hung off of inpcbs:
- Add in_ifdetach(), which matches in6_ifdetach(), and allows the
protocol to perform early tear-down on the interface early in
if_detach().
- Annotate that if_detach() needs careful consideration.
- Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR --
this is not the place to detect interface removal! This also
removes what is basically a nasty (and now unnecessary) hack.
- Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP
IPv4 sockets.
It is now possible to run the msocket_ifnet_remove regression test
using HEAD without panicking.
MFC after: 3 days
struct ifnet most of if_findindex() become a complex no-op. Remove it
and replace it with a corrected version of the four line for loop it
devolved to plus some error handling. This should probably be replaced
with subr_unit at some point.
Switch from checking ifaddr_byindex to ifnet_byindex when looking for
empty indexes. Since we're doing this from if_alloc/if_free, we can
only be sure that ifnet_byindex will be correct. This fixes panics when
loading the ef(4) module. The panics were caused by the fact that
if_alloc was called four time before if_attach was called and thus
ifaddr_byindex was not set and the same unit was allocated again. This
in turn caused the first if_attach to fail because the ifp was not the
one in ifnet_byindex(ifp->if_index).
Reported by: "Wojciech A. Koszek" <dunstan at freebsd dot czest dot pl>
PR: kern/84987
MFC After: 1 day
- Add a note that additions should be made to if_free_type and not
if_free to help avoid this in the future.
This apparently fixes a use after free in if_bridge and may fix bugs
in other direct if_free_type consumers.
Reported by: thompsa
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack. The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.
Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names. Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.
When exposing the interface flags to user space, via interface ioctls
or routing sockets, or the two fields together. Since the driver flags
cannot currently be set for user space, no new logic is currently
required to handle this case.
Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.
With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking. Most were likely
never triggered.
Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.
Reviewed by: pjd, bz
MFC after: 7 days
if_attach(). This allows ethernet drivers to use it in their routines
to program their MAC filters before ether_ifattach() is called (de(4) is
one such driver). Also, the if_addr mutex is destroyed in if_free()
rather than if_detach(), so there was another potential bug in that a
driver that failed during attach and called if_free() without having
called ether_ifattach() would have tried to destroy an uninitialized mutex.
Reported by: Holm Tiffe holm at freibergnet dot de
Discussed with: rwatson
using ifp->if_addr_mtx:
- Initialize if_addr_mtx when ifnet is initialized.
- Destroy if_addr_mtx when ifnet is torn down.
- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
Staticize.
- Extract ifmultiaddr allocation and initialization into if_allocmulti();
accept a 'mflags' argument to indicate whether or not sleeping is
permitted. This centralizes error handling and address duplication.
- Extract ifmultiaddr tear-down and deallocation in if_freemulti().
- Re-structure if_addmulti() to hold if_addr_mtx around manipulation of
the ifnet multicast address list and reference count manipulation.
Make use of non-sleeping allocations. Annotate the fact that we only
generate routing socket events for explicit address addition, not
implicit link layer address addition.
- Re-structure if_delmulti() to hold if_addr_mtx around manipulation of
the ifnet multicast address list and reference count manipulation.
Annotate the lack of a routing socket event for implicit link layer
address removal.
- De-spl all and sundry.
Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after: 1 week
Compare pointers with NULL rather than treating them as booleans.
Compare pointers with NULL rather than 0 to make it more clear
they are pointers.
Assign pointers value of NULL rather than 0 to make it more clear
they are pointers.
MFC after: 3 days
Some of the (IPv6) cleanup functions send packets to inform peers of the
departure. These packets confused users of ifnet_departure_event (pf at the
moment).
PR: kern/80627
Tested by: Divacky Roman
MFC after: 1 week
- Introduce a helper function if_setflag() containing the code common
to ifpromisc() and if_allmulti() instead of duplicating the code poorly,
with different bugs.
- Call ifp->if_ioctl() in a consistent way: always use more compatible C
syntax and check whether ifp->if_ioctl is not NULL prior to the call.
MFC after: 1 month
- Introducing the possibility of using locks different than mutexes
for the knlist locking. In order to do this, we add three arguments to
knlist_init() to specify the functions to use to lock, unlock and
check if the lock is owned. If these arguments are NULL, we assume
mtx_lock, mtx_unlock and mtx_owned, respectively.
- Using the vnode lock for the knlist locking, when doing kqueue operations
on a vnode. This way, we don't have to lock the vnode while holding a
mutex, in filt_vfsread.
Reviewed by: jmg
Approved by: re (scottl), scottl (mentor override)
Pointyhat to: ssouhlal
Will be happy: everyone
fails.
Move detaching the ifnet from the ifindex_table into if_free so we can
both keep the sanity checks and actually delete the ifnets. [0]
Reported by: gallatin [0]
Approved by: re (blanket)
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.
This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.
Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.
Reviewed by: sobomax, sam
so if_tap doesn't need to rely on locally-rolled code to do same.
The observable symptom of if_tap's bzero'ing the address details
was a crash in "ifconfig tap0" after an if_tap device was closed.
Reported By: Matti Saarinen (mjsaarin at cc dot helsinki dot fi)
a taskqueue(9) task. This fixes LORs and adds possibility
to serve such events pseudorecursively, when link state
change of interface causes subsequent change on other
interfaces.
Sponsored by: Rambler
Reviewed by: sam, brooks, mux
clock time to uptime because wall clock time may go backwards.
This is a change in the API which will impact SNMP agents who are using
ifi_epoch to set RFC2233's ifCounterDiscontinuityTime. None are know to
exist today. This will not impact applications that are using the
<index, epoch> tuple to verify interface uniqueness except that it
eliminates a race which could lead to a false assumption of uniqueness.
Because this is a behavior change, bump __FreeBSD_version.
Discussed with: re (jhb, scottl)
MFC after: 3 days
Pointed out by: pkh (way back at EuroBSDCon)
Pointy hat: brooks
hosts to share an IP address, providing high availability and load
balancing.
Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.
FreeBSD port done solely by Max Laier.
Patch by: mlaier
Obtained from: OpenBSD (mickey, mcbride)
which will finally lead to kernel panic.
Security: This prevents a local (root-launched) DoS
Submitted by: Wojciech A. Koszek [dunstan at freebsd czest pl]
PR: 77421
MFC After: 1 week
- Introduce another ng_ether(4) callback ng_ether_link_state_p, which
is called from if_link_state_change(), every time link is changed.
- In ng_ether_link_state() send netgraph control message notifying
of link state change to a node connected to "lower" hook.
Reviewed by: sam
MFC after: 2 weeks
Introduce domain_init_status to keep track of the init status of the domains
list (surprise). 0 = uninitialized, 1 = initialized/unpopulated, 2 =
initialized/done. Higher values can be used to support late addition of
domains which right now "works", but is potential dangerous. I choose to
only give a warning when doing so.
Use domain_init_status with if_attachdomain[1]() to ensure that we have a
complete domains list when we init the if_afdata array. Store the current
value of domain_init_status in if_afdata_initialized. This way we can update
if_afdata after a new protocol has been added (once that is allowed).
Submitted by: se (with changes)
Reviewed by: julian, glebius, se
PR: kern/73321 (partly)
Printf() a warning if if_attachdomain() is called more than once on an
interface to generate some noise on mailing lists when this occurs.
Fix up style in if_start(), where spaces crept in instead of tabs at
some point.
MFC after: 1 week
MFC note: Not the printf().
in orden to harden the ABI for 5.x; this will permit us to modify
the locking in the ifnet packet dispatch without requiring drivers
to be recompiled.
MFC after: 3 days
Discussed at: EuroBSDCon Developer's Summit
acquire Giant if the passed interface has IFF_NEEDSGIANT set on it.
Modify calls into (ifp)->if_ioctl() in if.c to use these macros in order
to ensure that Giant is held.
MFC after: 3 days
Bumped into by: jmg
was seen when configuring addresses on interfaces using ifconfig. This
patch has been verified to work with over eight thousand addresses
assigned to an interface.
LOR id: 031
to avoid ABI changes. It is set to the last time the interface
counters were zeroed, currently the time if_attach() was called. It is
intentended to be a valid value for RFC2233's ifCounterDiscontinuityTime
and to make it easier for applications to verify that the interface they
find at a given index is the one that was there last time they looked.
Due to space constraints ifi_epoch is a time_t rather then a struct
timeval. SNMP would prefer higher precision, but this unlikely to be
useful in practice.
happens when a proc exits, but needs to inform the user that this has
happened.. This also means we can remove the check for detached from
proc and sig f_detach functions as this is doing in kqueue now...
MFC after: 5 days
have been in place all the time the mtx_assert in the ALTQ code just
discovered the shortcoming.
PR: i386/71195
Tested by: Bettan (PR originator), myself
MFC after: 5 days
increasing it. Add code to ifconfig to use this size to find the
sockaddr_dl after the struct if_data in the routing message. This
allows struct if_data to grow (up to 255 bytes) without breaking
ifconfig.
Submitted by: peter
time the interface counters were zeroed, currently the time if_attach()
was called. It is indentended to be a valid value for RFC2233's
ifCounterDiscontinuityTime and to make it easier for applications to
verify that the interface they find at a given index is the one that was
there last time they looked.
An if_epoch "compatability" macro has not been created as ifi_epoch has
never been a member of struct ifnet.
Approved by: andre, bms, wollman
a more complete subsystem, and removes the knowlege of how things are
implemented from the drivers. Include locking around filter ops, so a
module like aio will know when not to be unloaded if there are outstanding
knotes using it's filter ops.
Currently, it uses the MTX_DUPOK even though it is not always safe to
aquire duplicate locks. Witness currently doesn't support the ability
to discover if a dup lock is ok (in some cases).
Reviewed by: green, rwatson (both earlier versions)
device drivers to declare that the ifp->if_start() method implemented
by the driver requires Giant in order to operate correctly.
Add a 'struct task' to 'struct ifnet' that can be used to execute a
deferred ifp->if_start() in the event that if_start needs to be called
in a Giant-free environment. To do this, introduce if_start(), a
wrapper function for ifp->if_start(). If the interface can run MPSAFE,
it directly dispatches into the interface start routine. If it can't
run MPSAFE, we're running with debug.mpsafenet != 0, and Giant isn't
currently held, the task is queued to execute in a swi holding Giant
via if_start_deferred().
Modify if_handoff() to use if_start() instead of direct dispatch.
Modify 802.11 to use if_start() instead of direct dispatch.
This is intended to provide increased compatibility for non-MPSAFE
network device drivers in the presence of Giant-free operation via
asynchronous dispatch. However, this commit does not mark any network
interfaces as IFF_NEEDSGIANT.
- Split the code out into if_clone.[ch].
- Locked struct if_clone. [1]
- Add a per-cloner match function rather then simply matching names of
the form <name><unit> and <name>.
- Use the match function to allow creation of <interface>.<tag>
vlan interfaces. The old way is preserved unchanged!
- Also the match function to allow creation of stf(4) interfaces named
stf0, stf, or 6to4. This is the only major user visible change in
that "ifconfig stf" creates the interface stf rather then stf0 and
does not print "stf0" to stdout.
- Allow destroy functions to fail so they can refuse to delete
interfaces. Currently, we forbid the deletion of interfaces which
were created in the init function, particularly lo0, pflog0, and
pfsync0. In the case of lo0 this was a panic implementation so it
does not count as a user visiable change. :-)
- Since most interfaces do not need the new functionality, an family of
wrapper functions, ifc_simple_*(), were created to wrap old style
cloner functions.
- The IF_CLONE_INITIALIZER macro is replaced with a new incompatible
IFC_CLONE_INITIALIZER and ifc_simple consumers use IFC_SIMPLE_DECLARE
instead.
Submitted by: Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1]
Reviewed by: andre, mlaier
Discussed on: net
ALTQ enabled versions of IFQ_* macros by default, as requested by serveral
others. This is a follow-up to the quick fix I committed yesterday which
turned off the ALTQ checks for non-ALTQ kernels.
your (network) modules as well as any userland that might make sense of
sizeof(struct ifnet).
This does not change the queueing yet. These changes will follow in a
seperate commit. Same with the driver changes, which need case by case
evaluation.
__FreeBSD_version bump will follow.
Tested-by: (i386)LINT
consistently with the rest of the code, use IFP2AC(ifp) to access
the arpcom structure given the ifp.
In this case also fix a difference in assumptions WRT the rest of
the net/ sources: it is not the 'struct *softc' that starts with a
'struct arpcom', but a 'struct arpcom' that starts with a
'struct ifnet'
the TAILQ_FOREACH() form.
Comment the need to store the same info (mac address for ethernet-type
devices) in two different places.
No functional changes. Even the compiler output should be unmodified
by this change.
of an interface. No functional change.
On passing, comment a likely bug in net/rtsock.c:sysctl_ifmalist()
which, if confirmed, would deserve to be fixed and MFC'ed
This enables pf to track dynamic address changes on interfaces (dailup) with
the "on (<ifname>)"-syntax. This also brings hooks in anticipation of
tracking cloned interfaces, which will be in future versions of pf.
Approved by: bms(mentor)
Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.
Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
- allow for ifp->if_ioctl being NULL, as the rest of ifioctl() does;
- give the interface driver a chance to report a error to the caller;
- don't forget to update ifp->if_lastchange upon successful modification
of interface operation parameters.
The basic process is to send a routing socket announcement that the
interface has departed, change if_xname, update the sockaddr_dl
associated with the interface, and announce the arrival of the interface
on the routing socket.
As part of this change, ifunit() is greatly simplified by testing
if_xname directly. if_clone_destroy() now uses if_dname to look up the
cloner for the interface and if_dunit to identify the unit number.
Reviewed by: ru, sam (concept)
Vincent Jardin <vjardin AT free.fr>
Max Laier <max AT love2party.net>
- malloc() returns a void* and does not need a cast
- when called with M_WAITOK, malloc() can not return NULL so don't
check for that case. The result of the check was bogus anyway since
it would leave the interface broken.
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
that covers updates to the contents. Note this is separate from holding
a reference and/or locking the routing table itself.
Other/related changes:
o rtredirect loses the final parameter by which an rtentry reference
may be returned; this was never used and added unwarranted complexity
for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
we assume the parent will remain as long as the clone; doing this avoids
a circularity in locking during delete
o convert some timeouts to MPSAFE callouts
Notes:
1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
applications cannot/do-no know about mutex's. Doing this requires
that the mutex be the last element in the structure. A better solution
is to introduce an externalized version of struct rtentry but this is
a major task because of the intertwining of rtentry and other data
structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
work to eliminate many held references. If not these will be resolved
prior to release.
3. ATM changes are untested.
Sponsored by: FreeBSD Foundation
Obtained from: BSD/OS (partly)
even could call VOP_REVOKE() on vnodes associated with its dev_t's
has originated, but it stops right here.
If there are things people belive destroy_dev() needs to learn how to
do, please tell me about it, preferably with a reproducible test case.
Include <sys/uio.h> in bluetooth code rather than rely on <sys/vnode.h>
to do so.
The fact that some of the USB code needs to include <sys/vnode.h>
still disturbs me greatly, but I do not have time to chase that.
is more robust and prevents the hijacking of /dev/console for the typical
mistake.
Remove unneeded MAJOR_AUTO uses, it is only needed explicitly now if the
driver source has cross-branch compatibility to old releases.
branches:
Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.
This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.
Approved by: re(scottl)
IP fast forwarding, SIOCGIFADDR, setting hardware address (not currently
enabled in cm driver), multicasts (experimental)
- add ARC_MAX_DATA, use IF_HANDOFF, remove arc_sprintf() and some unused
variables
- if_simloop logic is made more similar to ethernet
- drop not ours packets early (if we are not in promiscous mode)
Submitted by: mark tinguely (partially)
function takes a struct ifnet pointer followed by the usual printf
arguments and prints "<interfacename>: " before the results of printf.
Since this is the primary form of printf calls in network device drivers
and accounts for most uses of the ifnet menber if_unit, this
significantly simplifies many printf()s.
Also, for all interfaces in this mode pass all ethernet frames to upper layer,
even those not addressed to our own MAC, which allows packets encapsulated
in those frames be processed with packet filters (ipfw(8) et al).
Emphatically requested by: Anton Turygin <pa3op@ukr-link.net>
Valuable suggestions by: fenner
kernel access control.
Introduce two ioctls, SIOCGIFMAC, SIOCSIFMAC, which permit user
processes to manage the MAC labels on network interfaces. Note
that this is part of the user process API/ABI that will be revised
prior to 5.0-RELEASE.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
kernel access control.
Instrument the interface management code so that MAC labels are
properly maintained on network interfaces (struct ifnet). In
particular, invoke entry points when interfaces are created and
removed. MAC policies may initialized the label interface based
on a variety of factors, including the interface name.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
code. The reverts the API change which made the <if>_clone_destory()
functions return an int instead of void bringing us into closer
alignment with NetBSD.
Reviewed by: net (a long time ago)
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
unit allocation with a bitmap in the generic layer. This
allows us to get rid of the duplicated rman code in every
clonable interface.
Reviewed by: brooks
Approved by: phk
pseudo-devices when an interface goes away. Otherwise, an open /dev/net/foo0
when the interface is removed can cause a crash.
Not objected to by: jlemon
to notify other nodes about the address change. Otherwise, they
might try and keep using the old address until their arp table
entry times out and the address is refreshed.
Maybe this ought to be done for INET6 addresses as well but i have
no idea how to do it. It should be pretty straightforward though.
MFC-after: 10 days
socket so that routing daemons and other interested parties
know when an interface is attached/detached.
PR: kern/33747
Obtained from: NetBSD
MFC after: 2 weeks
Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo *''
as the argument. Pass rt_addrinfo all the way down to rtrequest1
and ifa->ifa_rtrequest. 3rd argument of ifa->ifa_rtrequest is now
``rt_addrinfo *'' instead of ``sockaddr *'' (almost noone is
using it anyways).
Benefit: the following command now works. Previously we needed
two route(8) invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0
Remove unsafe typecast in rtrequest(), from ``rtentry *'' to
``sockaddr *''. It was introduced by 4.3BSD-Reno and never
corrected.
Obtained from: BSD/OS, NetBSD
MFC after: 1 month
PR: kern/28360
existing devices (e.g.: tunX). This may need a little more thought.
Create a /dev/netX alias for devices. net0 is reserved.
Allow wiring of net aliases in /boot/device.hints of the form:
hint.net.1.dev="lo0"
hint.net.12.ether="00:a0:c9:c9:9d:63"
appear in /dev. Interface hardware ioctls (not protocol or routing) can
be performed on the descriptor. The SIOCGIFCONF ioctl may be performed
on the special /dev/network node.
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.
TODO:
- The definitions of SADB_* in sys/net/pfkeyv2.h are still different
from RFC2407/IANA assignment because of binary compatibility
issue. It should be fixed under 5-CURRENT.
- ip6po_m member of struct ip6_pktopts is no longer used. But, it
is still there because of binary compatibility issue. It should
be removed under 5-CURRENT.
Reviewed by: itojun
Obtained from: KAME
MFC after: 3 weeks
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
pr_free(), invoked by the similarly named credential reference
management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
mutex use.
Notes:
o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
required to protect the reference count plus some fields in the
structure.
Reviewed by: freebsd-arch
Obtained from: TrustedBSD Project
inline functions non-inlined. Hide parts of the mutex implementation that
should not be exposed.
Make sure that WITNESS code is not executed during boot until the mutexes
are fully initialized by SI_SUB_MUTEX (the original motivation for this
commit).
Submitted by: peter
before adding/removing packets from the queue. Also, the if_obytes and
if_omcasts fields should only be manipulated under protection of the mutex.
IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on
the queue. An IF_LOCK macro is provided, as well as the old (mutex-less)
versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which
needs them, but their use is discouraged.
Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF,
which takes care of locking/enqueue, and also statistics updating/start
if necessary.
This means 'options NETGRAPH' is no longer necessary in order to get
netgraph-enabled Ethernet interfaces. This supports loading/unloading
the ng_ether.ko and attaching/detaching the Ethernet interface in any
order.
Add two new hooks 'upper' and 'lower' to allow access to the protocol
demux engine and the raw device, respectively. This enables bridging
to be defined as a netgraph node, if so desired.
Reviewed by: freebsd-net@freebsd.org
address on an interface. This basically allows you to do what my
little setmac module/utility does via ifconfig. This involves the
following changes:
socket.h: define SIOCSIFLLADDR
if.c: add support for SIOCSIFLLADDR, which resets the values in
the arpcom struct and sockaddr_dl for the specified interface.
Note that if the interface is already up, we need to down/up
it in order to program the underlying hardware's receive filter.
ifconfig.c: add lladdr command
ifconfig.8: document lladdr command
You can now force the MAC address on any ethernet interface to be
whatever you want. (The change is not sticky across reboots of course:
we don't actually reprogram the EEPROM or anything.) Actually, you
can reprogram the MAC address on other kinds of interfaces too; this
shouldn't be ethernet-specific (though at the moment it's limited to
6 bytes of address data).
Nobody ran up to me and said "this is the politically correct way to
do this!" so I don't want to hear any complaints from people who think
I could have done it more elegantly. Consider yourselves lucky I didn't
do it by having ifconfig tread all over /dev/kmem.
would be returned with a wrong value.
While we're here, get rid of unnecessary panic call.
PR: 17311, 12996, 14457
Submitted by: Patrick Bihan-Faou <patrick@mindstep.com>,
Kris Kennaway <kris@FreeBSD.org>
is triggered when qmail is used with INET6 enabled. The bug
manifests itself in that the space variable can become negative
and that in the comparison in the guards of the 2 loops, this was
not noticed because sizeof() returns an unsigned and thus the signed
variable gets promoted to unsigned. I decided not to make space
unsigned because I think we should guard against this from happening.
Thus panic() in case space becomes negative.
Approved by: jkh
Some LAN card chip for fxp is known to cause problem at
interface initialization with IPv6 enabled. It happens at
some delicate timing.
And also, just adding some DELAY before IPv6 address
autoconfiguration is known to avoid the problem.
Complete fix is changing the driver not to use interrupt at
multicast filter initialization, but trying such change in
this stage will be dangerous.
So I add some DELAY() only inside #ifdef INET6 part,
as temporal workaround only for 4.0.
Approbed by: jkh
Noticed by: Mattias Pantzare <pantzer@ludd.luth.se>
Obtained from: openbsd-tech mailing list
do not pollute the interface further.
o Run if_detach at splnet().
o Creatively swipe the relevant parts of the netatm atm_nif_detach
which will delete the relevant references to the interface going
away.
o be more careful about clearing addresses (this isn't a kludge)
o For AF_INET interfaces, call SIOCDIFFADDR to remove last(?) bit
of cruft.
Special cases for AF_INET shouldn't be here, but I didn't see a good
generic way of doing this. If I missed something, please let me know.
This gross hack makes pccard ejection stable for ethernet cards.
Submitted by: Atushi Onoe-san
packet divert at kernel for IPv6/IPv4 translater daemon
This includes queue related patch submitted by jburkhol@home.com.
Submitted by: queue related patch from jburkhol@home.com
Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
for IPv6 yet)
With this patch, you can assigne IPv6 addr automatically, and can reply to
IPv6 ping.
Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
This is inteded for to allow ifconfig to print various unstructured
information from an interface.
The data is returned from the kernel in ASCII form, see the comment in
if.h for some technicalities.
Canonical cut&paste example to be found in if_tun.c
Initial use:
Now tun* interfaces tell the PID of the process which opened them.
Future uses could be (volounteers welcome!):
Have ppp/slip interfaces tell which tty they use.
Make sync interfaces return their media state: red/yellow/blue
alarm, timeslot assignment and so on.
Make ethernets warn about missing heartbeats and/or cables
This means that the driver will add/delete routes when it knows it is
up/down, rather than have the generic code belive it is up if configured.
This is probably most useful for serial lines, although many PHY chips
could probably tell us if we're connected to the cable/hub as well.
This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.
For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".
Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.
Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.
It generally does what one would expect, but setting up a jail
still takes a little knowledge.
A few notes:
I have no scripts for setting up a jail, don't ask me for them.
The IP number should be an alias on one of the interfaces.
mount a /proc in each jail, it will make ps more useable.
/proc/<pid>/status tells the hostname of the prison for
jailed processes.
Quotas are only sensible if you have a mountpoint per prison.
There are no privisions for stopping resource-hogging.
Some "#ifdef INET" and similar may be missing (send patches!)
If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!
Tools, comments, patches & documentation most welcome.
Have fun...
Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/
1:
s/suser/suser_xxx/
2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with
later.
There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
it used to be that way. I'm not sure that it's needed, but it does
walk the ifp list..
Incidently, there's nothing to sanity check the ifq_maxlen on loaded
interfaces..
i386 platform boots, it is no longer ISA-centric, and is fully dynamic.
Most old drivers compile and run without modification via 'compatability
shims' to enable a smoother transition. eisa, isapnp and pccard* are
not yet using the new resource manager. Once fully converted, all drivers
will be loadable, including PCI and ISA.
(Some other changes appear to have snuck in, including a port of Soren's
ATA driver to the Alpha. Soren, back this out if you need to.)
This is a checkpoint of work-in-progress, but is quite functional.
The bulk of the work was done over the last few years by Doug Rabson and
Garrett Wollman.
Approved by: core
before they got changed. This can help eliminate much of the
gymnastics drivers do in their ioctl routines to figure this out.
Remove commented out IFF_NOTRAILERS
Drivers should be updated if they get flagged by this message.
(The reason this is important is because we do not have a way
to catch this mistake for interfaces added after ifinit() runs.)
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.
These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.
Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Mike Spengler <mks@networkcs.com>
ioctl() routine at the end of if_delmulti() so that interfaces with
hardware multicast filtering can update their filters in a timely
manner.
If the interface doesn't support hardware multicast filtering, then
reception of multicast frames is done using 'promiscious mode' or
'capture all multicast frames' mode and software filtering in the
kernel. In this case, it doesn't matter if if_delmulti() ever does
an SCIODELMULTI on the interface or not: if MULTICAST support is
enabled, then we join the 'all hosts' group when the interface is
configured, and remain in it until the interface is brought down.
Without hardware filtering, joining one group means joining all
groups, so it makes no difference if we call the SIOCDELMULTI
routine.
If the interface does support hardware multicast filtering, then
by not reprogramming the hardware filter in if_delmulti(), we have
to wait until somebody calls if_setmulti(), during which time the
interface is receiving frames for multicast groups in which we are
no longer interested.
FreeBSD/alpha. The most significant item is to change the command
argument to ioctl functions from int to u_long. This change brings us
inline with various other BSD versions. Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.
The prototype FreeBSD/alpha machdep will follow in a couple of days
time.
Distribute all but the most fundamental malloc types. This time I also
remembered the trick to making things static: Put "static" in front of
them.
A couple of finer points by: bde
Introduce the SIOC[SG]IFGENERIC hooks that can be used to pass an
arbritrary ioctl subcommand into an interface driver. Surprisingly
enough, there was no provision for this already present (except of the
option of abusing SIOC[SG]IFMEDIA for this).
The idea is that an interface driver can establish ioctl subcommands
of its own that can't be meaningfully interpreted by the upper layer
interface ioctl function. Something like this is required to
implement a clean solution of passing down things like CHAP secrets or
PPP options to the /sys/net/if_sppp* files. (Yes, my CHAP is now
finally working with it, but i gotta update my kernel to the new
callout interface before being able to commit _that_.)
Reviewed by: peter [long ago, actually]
by a protocol, to detirmine if an address matches the net this address
is part of. This is needed by protocols for which netmasks
"just don't work", for example appletalk.
Also add the code in appletalk to make use of this new feature.
Thsi fixes one of the longest standing bugs in appletalk.
The inability to talk to machines to which the path is via a router
which is on a different net, but the same netrange, as your interface.
Protocols that do not supply this function (e.g. IP) should not be affected.
This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.
2) Certain protocol entry points are modified to take a process structure,
so they they can easily tell whether or not it is possible to sleep, and
also to access credentials.
3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call. Protocols should use the process pointer they are now passed.
4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.
5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.
As a result, LINT is now broken. I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
previous hackery involving struct in_ifaddr and arpcom. Get rid of the
abominable multi_kludge. Update all network interfaces to use the
new machanism. Distressingly few Ethernet drivers program the multicast
filter properly (assuming the hardware has one, which it usually does).
multicast group memberships. This is not actually operative
at the moment (a lot of other code still needs to be changed), but
this seemed like a useful reference point to check in so that
others (i.e. Bill Fenner) have fair warning of where we are going.
to TAILQs. Fix places which referenced these for no good reason
that I can see (the references remain, but were fixed to compile
again; they are still questionable).
This is a patch to sys/net/if.c. What it does is patch the algorithm
for finding an IP address on an interface which most closely matches
a given IP address. The problem with it is when no address matches,
and you have to just pick one at random. Then the code ends up picking
the last IP address in the list. This patch changes things so it
picks up the first address instead.
Usually the first address is more useful as the later ones are aliases.
interfaces. This creates two new tables in the net.link.generic branch
of the MIB; one contains (essentially) `ifdata' structures, and the other
contains a blob provided by the interface (and presumably used to
implement link-layer-specific MIB variables). A number of things
have been moved around in the `ifnet' and `ifdata' structures, so
NEW VERSIONS OF ifconfig(8) AND routed(8) ARE REQUIRED. (A simple
recompile is all that's necessary.)
I have a sample program which uses this interface for those interested
in making use of it.
when attepmting to add certain types of routes. This problem
only manifested itself in the presence of unconfigured point-to-point
interfaces.
Noticed by: Chuck Cranor <chuck@maria.wustl.edu>