o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a
socket buffer. The mutex in the receive buffer also protects the data
in struct socket.
o Determine the lock strategy for each members in struct socket.
o Lock down the following members:
- so_count
- so_options
- so_linger
- so_state
o Remove *_locked() socket APIs. Make the following socket APIs
touching the members above now require a locked socket:
- sodisconnect()
- soisconnected()
- soisconnecting()
- soisdisconnected()
- soisdisconnecting()
- sofree()
- soref()
- sorele()
- sorwakeup()
- sotryfree()
- sowakeup()
- sowwakeup()
Reviewed by: alfred
Ipfw processing of frames at layer 2 can be enabled by the sysctl variable
net.link.ether.ipfw=1
Consider this feature experimental, because right now, the firewall
is invoked in the places indicated below, and controlled by the
sysctl variables listed on the right. As a consequence, a packet
can be filtered from 1 to 4 times depending on the path it follows,
which might make a ruleset a bit hard to follow.
I will add an ipfw option to tell if we want a given rule to apply
to ether_demux() and ether_output_frame(), but we have run out of
flags in the struct ip_fw so i need to think a bit on how to implement
this.
to upper layers
| |
+----------->-----------+
^ V
[ip_input] [ip_output] net.inet.ip.fw.enable=1
| |
^ V
[ether_demux] [ether_output_frame] net.link.ether.ipfw=1
| |
+->- [bdg_forward]-->---+ net.link.ether.bridge_ipfw=1
^ V
| |
to devices
it into an "#ifdef INET6" block. This caused a (harmless but annoying)
EINVAL return value to be sent even though the operation completed
successfully.
PR: kern/37786
Submitted by: Ari Suutari <ari.suutari@syncrontech.com>,David Malone <dwmalone@maths.tcd.ie>
MFC after: 1 day
were totally useless and have been removed.
ip_input.c, ip_output.c:
Properly initialize the "ip" pointer in case the firewall does an
m_pullup() on the packet.
Remove some debugging code forgotten long ago.
ip_fw.[ch], bridge.c:
Prepare the grounds for matching MAC header fields in bridged packets,
so we can have 'etherfw' functionality without a lot of kernel and
userland bloat.
field. This returns the sdl_data field to a variable-length field. More
importantly, this prevents a easily-reproduceable data-corruption bug when
the interface name plus the hardware address exceed the sdl_data field's
original 12 byte limit. However, token-ring interfaces may still overflow
the new sdl_data field's 46 byte limit if the interface name exceeds 6
characters (since 6 characters for interface name plus 6 for hardware
address plus 34 for source routing = the size of sdl_data). Further
refinements could overcome this limitation but would break binary
compatibility; this commit only addresses fixing the bug for
commonly-occuring cases without breaking binary compatibility with the
intention that the functionality can be MFC'ed to -stable.
See message ID's (both send to -arch):
20020421013332.F87395-100000@gateway.posi.net20020430181359.G11009-300000@gateway.posi.net
for a more thorough description of the bug addressed and how to
reproduce it.
Approved by: silence on -arch and -net
Sponsored by: NTT Multimedia Communications Labs
MFC after: 1 week
ibss is the modern ad-hoc mode. ibss-master is the same, except that
it creates the ibss network. This distinction is necessary because
some supported cards (symbol) support the former without supporting
the latter.
A seprate commit will introduce a demo-adhoc mode so that we can
disentwingle the multiple, mutually exclusive meandings of adhoc in
the present state of affairs.
Submitted by: jhay
be done internally.
Ensure that no one can fsetown() to a dying process/pgrp. We need
to check the process for P_WEXIT to see if it's exiting. Process
groups are already safe because there is no such thing as a pgrp
zombie, therefore the proctree lock completely protects the pgrp
from having sigio structures associated with it after it runs
funsetownlst.
Add sigio lock to witness list under proctree and allproc, but over
proc and pgrp.
Seigo Tanimura helped with this.
Turn the sigio sx into a mutex.
Sigio lock is really only needed to protect interrupts from dereferencing
the sigio pointer in an object when the sigio itself is being destroyed.
In order to do this in the most unintrusive manner change pgsigio's
sigio * argument into a **, that way we can lock internally to the
function.
Requested by: bde
Since locking sigio_lock is usually followed by calling pgsigio(),
move the declaration of sigio_lock and the definitions of SIGIO_*() to
sys/signalvar.h.
While I am here, sort include files alphabetically, where possible.
of a socket. This avoids lock order reversal caused by locking a
process in pgsigio().
sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now
require sigio_lock to be locked. Provide sowwakeup_locked(),
soisconnected_locked(), and so on in case where we have to modify a
socket and wake up a process atomically.
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
- Make sure the interface is UP and RUNNING in fddi_input().
- Reorder and comment packet tests in fddi_input().
- Call if_attach() in fddi_ifattach().
- Test for a valid return from ifaddr_byindex().
- Use struct fddi_header where appropriate.
- Use bcopy() rather than memcpy().
- Use FDDI_ADDR_LEN macro instead of ETHER_ADDR_LEN macro.
- Add loadable module support.
- Use FDDI_ADDR_LEN rather than a magic number or a sizeof().
- Hide distracting sizeof() behind FDDI_HDR_LEN macro.
- Don't use sizeof(struct llc) in areas where we mean LLC_SNAPFRAMELEN.
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses. Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
that causes a machine to panic when the kernel PPP / DEFLATE code is used.
1.11 moved a ZFREE to a point after the structural members were clobbered
by stores into a union'd structure.
This commit fixes the bug and adds a big whopping comment to make sure
the code isn't 'cleaned up' again :-)
Ian Dowse came up with the same patch independantly 68 seconds before I
did, talk about Karma!
I would also like to thank Eugene Grosbein for marathon work in tracking the
problem down by udpating his -stable based on date over and over again
to close in on the commit that caused his crashes.
PR: kern/35969
Reviewed by: Ian Dowse <iedowse@maths.tcd.ie>
X-MFC after: immediately
# sysctl net.link.ether.bdg_ipf=1
To enable. Just like ipfw(8) bridging, only input packets are filtered
in the bridge. Filtering works just like in the IP layer, ipf(8)
first, then ipfw(8). And just like in the IP layer, both are
independent, one need not be run to use the other. (Note: This will
not work in, but doesn't break, the bridge.ko module. The ipl.ko
module would need to be fixed before that is worth worrying about.)
Reviewed by: luigi
unit allocation with a bitmap in the generic layer. This
allows us to get rid of the duplicated rman code in every
clonable interface.
Reviewed by: brooks
Approved by: phk
pseudo-devices when an interface goes away. Otherwise, an open /dev/net/foo0
when the interface is removed can cause a crash.
Not objected to by: jlemon
active-filter in pppd(8).
PR: kern/12281
Submitted by: Tim Moore <moore@bricoworks.com>
Not objected by: peter
Reviewed by: ru
Approved by: ru
MFC after: 1 week
header and push it up any attached bpf devices on the parent interface.
This makes hardware vlan decoding more like the normal software path.
Tested by: cjtt@employees.org
MFC after: 2 weeks
via sysctl's. The old #defines, MAX_GIF_NEST and XBONEHACK are
currently supported for backwards compatability, but will probably be
removed at some point in the future.
New locks are:
- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.
Please refer to sys/proc.h for the coverage of these locks.
Changes on the pgrp/session interface:
- pgfind() needs the pgrpsess_lock held.
- The caller of enterpgrp() is responsible to allocate a new pgrp and
session.
- Call enterthispgrp() in order to enter an existing pgrp.
- pgsignal() requires a pgrp lock held.
Reviewed by: jhb, alfred
Tested on: cvsup.jp.FreeBSD.org
(which is a quad-CPU machine running -current)
to notify other nodes about the address change. Otherwise, they
might try and keep using the old address until their arp table
entry times out and the address is refreshed.
Maybe this ought to be done for INET6 addresses as well but i have
no idea how to do it. It should be pretty straightforward though.
MFC-after: 10 days
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
complexities associated with having MD (asm and inline) versions, and
having to prevent exposure of these functions in other headers that
happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.
Tested on: alpha, i386
Reviewed by: bde, jake, tmm
In order of importance:
+ each cluster now uses private data structures (filtering and
local address tables) so you can treat them as fully independent
switches. This part of the work was supported by:
Cisco Systems, Inc. - NSITE lab, RTP, NC.
+ cleaned up the handling of configuration, so the system will behave
much better when real or pseudo devices are dynamically attached
or detached. It should also not panic anymore on systems with large
number of devices, closing a few existings PRs on the topic.
+ while at it, add support for VLAN. This means that a FreeBSD box
can now work as a real VLAN switch, with trunk interfaces etc.
As an example:
ifconfig vlan0 vlan 3 vlandev dc0
ifconfig vlan1 vlan 4 vlandev dc0
net.link.ether.bridge_cfg="vlan0:3,dc1:3,vlan1:4,dc1:4"
uses dc0 as a trunk interface, and dc1 and dc3 as ports on vlans 3 and 4
You get the idea...
NOTA BENE: by default bridge_cfg is initialised to "" so even if
you enable bridging, no packets will be bridged until you set the
list of interfaces on which you want this to happen.
+ large restructuring of the code, moving private vars and types from
bridge.h to bridge.c.
+ added a lot of comments to the code to explain how to use it.
64-bit platforms. The unaligned access is caused by struct ifa_msghdr
not being a multiple of 8-bytes in size. If an interface has an odd
number of addresses, this causes the next interface to generate an
unaligned access in the user-level app walking the interfaces (ifconfig).
Submitted by: Bernd Walter <ticso@cicely8.cicely.de>
socket so that routing daemons and other interested parties
know when an interface is attached/detached.
PR: kern/33747
Obtained from: NetBSD
MFC after: 2 weeks
are checked on the way in even if they were not calculated on the
way out.
This fixes rwhod
PR: 31954
Submitted by: fenner
Approved by: fenner
MFC after: 1 week
IPv6 on an sppp interface. In an IPv6-enabled kernel, every IPv6
interface automatically gets an IPv6 address assigned (and IPv6
multicast packets sent at initialization time). For sppp links where
we know our remote peer wouldn't support IPv6 at all, there's no point
in attempting to negotiate IPV6CP (or to even dial out for an IPv6
packet at all for dial-on-demand interfaces).
I wish there were a more generic way to administratively disable IPv6
on an interface instead. ume told me there isn't.
While i was at it, converted both, enable_vj and enable_ipv6 into flag
bits in struct sppp (enable_vj used to be an int of its own).
MFC after: 1 month
it again when going from INITIAL to STARTING. This has been done for
passive or auto-conecting interfaces always, but not for permanent
ones.
Obtained from: NetBSD (rev 1.32)
& and && has been botched. This was likely the cause for some havoc
with various negotiation cases of sppp in the past.
Obtained from: NetBSD (rev 1.13)
MFC after: 1 week
makes the implied assumption there were another 128 bytes of space in
front of the packet handed off to it... which is not the case for
sppp. This could easily end up in corrupting random memory.
This fix is about the same as revs 1.6, 1.8, and 1.9 from our
i4b_ispppsubr.c.
Also fixed IPCP option negotiation to zero out the options when
starting IPCP. Otherwise, if negotiation parameters change between
various IPCP startups, it could happen that old options would still be
requested (this happened if VJ was turned off, and ended up in half
off the link still negotiating for VJ compression).
IMHO, the base system's sppp is now feature-wise up to date with the
one in the i4b part of the tree, so the latter can be disabled.
MFC after: 1 month
inclusion of VJ compression into sppp.
Now, instead of the need to include this and that and everything plus
the kitchensink in each of those drivers, struct sppp uses struct
slcompress as an opaque structure only referenced by a pointer. The
actual structure is then malloced at initialization time.
While i was at it, also fixed a bug where received VJ packets would only
be recognized if INET6 was defined.
time from the PPP packets sent. This effectively merges rev 1.2 of
the old i4b_ispppsubr.c, with the exception that i eventually ended up
in debugging and fixing it so the idle time is now really
detected. ;-) (The version in i4b simply doesn't work right since it
still accounts for incoming LCP echo packets which it is supposed to
ignore for idle time considerations...)
Obtained from: i4b
MFC after: 1 month
sppp_parms that are needed for the SPPPIO[GS]DEFS ioctl commands.
This allows it to keep struct sppp inside #ifdef _KERNEL (where it
belongs), and prevents userland programs that wish to include
<net/if_sppp.h> from including the earth, the hell, and the universe
before the are able to resolve all the kernel-internal stuff that's in
struct sppp.
Discussed with: hm
MFC after: 1 month
This (effectively) merges rev 1.36 of i4b's old if_spppsubr.c, albeit
in a slightly different manner (we export the timer in millisecond
values as exposed to tick values from/to userland).
Obtained from: i4b
MFC after: 1 month
This is the logical merge of rev 1.32 of i4b's old if_spppsubr.c (which
was based on PR misc/11767), plus (i4b) rev 1.6 of i4b's if_ispppsubr.c,
albeit with numerous stylistic and cosmetic changes.
PR: misc/11767
Submitted by: i4b, Joachim Kuebart
MFC after: 1 month
Character-Map. RFC 1662 demands it for the sake of async to sync
PPP protocol converters (like Win9* :).
This merges rev 1.26/1.27 of the old i4b sppp changes.
route to the destination twice. Now that brian has fixed route.c to no
longer accept this second route, this long-standing nuisance became a
showstopper bug for sppp users.
In retrospect, this is the same fix as the one in rev 1.78 of if_sl.c;
most likely the original version of sppp has been cloned from SLIP. ;-)
if we've been given an RTA_IFP or changed RTA_IFA sockaddr.
This fixes the following bug:
>/dev/tun100
>/dev/tun101
ifconfig tun100 1.2.3.4 5.6.7.8
ifconfig tun101 1.2.3.4 6.7.8.9
route change 6.7.8.9 -ifa 1.2.3.4 -iface -mtu 500
which erroneously changed tun101's host route to have an ifp of tun100
(rt_getifa() sets the ifp after calling ifa_ifwithnet(1.2.3.4))
This incarnation submitted by: ru
select/poll, and therefore with pthreads. I doubt there is any way
to make this 100% semantically identical to the way it behaves in
unthreaded programs with blocking reads, but the solution here
should do the right thing for all reasonable usage patterns.
The basic idea is to schedule a callout for the read timeout when a
select/poll is done. When the callout fires, it ends the select if
it is still in progress, or marks the state as "timed out" if the
select has already ended for some other reason. Additional logic in
bpfread then does the right thing in the case where the timeout has
fired.
Note, I co-opted the bd_state member of the bpf_d structure. It has
been present in the structure since the initial import of 4.4-lite,
but as far as I can tell it has never been used.
PR: kern/22063 and bin/31649
MFC after: 3 days
Non-SMP, i386-only, no polling in the idle loop at the moment.
To use this code you must compile a kernel with
options DEVICE_POLLING
and at runtime enable polling with
sysctl kern.polling.enable=1
The percentage of CPU reserved to userland can be set with
sysctl kern.polling.user_frac=NN (default is 50)
while the remainder is used by polling device drivers and netisr's.
These are the only two variables that you should need to touch. There
are a few more parameters in kern.polling but the default values
are adequate for all purposes. See the code in kern_poll.c for
more details on them.
Polling in the idle loop will be implemented shortly by introducing
a kernel thread which does the job. Until then, the amount of CPU
dedicated to polling will never exceed (100-user_frac).
The equivalent (actually, better) code for -stable is at
http://info.iet.unipi.it/~luigi/polling/
and also supports polling in the idle loop.
NOTE to Alpha developers:
There is really nothing in this code that is i386-specific.
If you move the 2 lines supporting the new option from
sys/conf/{files,options}.i386 to sys/conf/{files,options} I am
pretty sure that this should work on the Alpha as well, just that
I do not have a suitable test box to try it. If someone feels like
trying it, I would appreciate it.
NOTE to other developers:
sure some things could be done better, and as always I am open to
constructive criticism, which a few of you have already given and
I greatly appreciated.
However, before proposing radical architectural changes, please
take some time to possibly try out this code, or at the very least
read the comments in kern_poll.c, especially re. the reason why I
am using a soft netisr and cannot (I believe) replace it with a
simple timeout.
Quick description of files touched by this commit:
sys/conf/files.i386
new file kern/kern_poll.c
sys/conf/options.i386
new option
sys/i386/i386/trap.c
poll in trap (disabled by default)
sys/kern/kern_clock.c
initialization and hardclock hooks.
sys/kern/kern_intr.c
minor swi_net changes
sys/kern/kern_poll.c
the bulk of the code.
sys/net/if.h
new flag
sys/net/if_var.h
declaration for functions used in device drivers.
sys/net/netisr.h
NETISR_POLL
sys/dev/fxp/if_fxp.c
sys/dev/fxp/if_fxpvar.h
sys/pci/if_dc.c
sys/pci/if_dcreg.h
sys/pci/if_sis.c
sys/pci/if_sisreg.h
device driver modifications
vnodes. This will hopefully serve as a base from which we can
expand the MP code. We currently do not attempt to obtain any
mutex or SX locks, but the door is open to add them when we nail
down exactly how that part of it is going to work.
"[...] and removes the hostcache code from standard kernels---the
code that depends on it is not going to happen any time soon,
I'm afraid."
Time to clean up.
Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo *''
as the argument. Pass rt_addrinfo all the way down to rtrequest1
and ifa->ifa_rtrequest. 3rd argument of ifa->ifa_rtrequest is now
``rt_addrinfo *'' instead of ``sockaddr *'' (almost noone is
using it anyways).
Benefit: the following command now works. Previously we needed
two route(8) invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0
Remove unsafe typecast in rtrequest(), from ``rtentry *'' to
``sockaddr *''. It was introduced by 4.3BSD-Reno and never
corrected.
Obtained from: BSD/OS, NetBSD
MFC after: 1 month
PR: kern/28360
- Report destination address of a P2P link when servicing
routing socket messages.
- Report interface name, address, and destination address
of a P2P link when servicing NET_RT_{DUMP,FLAGS} sysctls.
Part of CSRG revision 8.6 coresponds to revision 1.12.
CSRG revision 8.7 corresponds to revision 1.15.
existing devices (e.g.: tunX). This may need a little more thought.
Create a /dev/netX alias for devices. net0 is reserved.
Allow wiring of net aliases in /boot/device.hints of the form:
hint.net.1.dev="lo0"
hint.net.12.ether="00:a0:c9:c9:9d:63"
This fixes the panic when receiving a packet with an unknown tag, and
also allows reception of packets with known tags.
- Allow overlapping tag number spaces when using multiple hardware-assisted
VLAN parent devices (by comparing the parent interface in
vlan_input_tag() just as in vlan_input() ).
- fix typo in comment
MFC after: 1 week
appear in /dev. Interface hardware ioctls (not protocol or routing) can
be performed on the descriptor. The SIOCGIFCONF ioctl may be performed
on the special /dev/network node.
+ implement "limit" rules, which permit to limit the number of sessions
between certain host pairs (according to masks). These are a special
type of stateful rules, which might be of interest in some cases.
See the ipfw manpage for details.
+ merge the list pointers and ipfw rule descriptors in the kernel, so
the code is smaller, faster and more readable. This patch basically
consists in replacing "foo->rule->bar" with "rule->bar" all over
the place.
I have been willing to do this for ages!
MFC after: 1 week
Yes this really is rather silly and the implementation is overkill given
that you are only allowed one of them, but NetBSD implements cloning on
this device and it's a less cluttered example of cloning then most.
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
Allow non-superuser to open, listen to, and send safe commands on the
routing socket. Superuser priviledge is required for all commands
but RTM_GET.
Lose `setuid root' bit of route(8).
Reviewed by: wollman, dd
into sadb_x_sa2_sequence from sadb_x_sa2_reserved3 in the sadb_x_sa2
structure. Also the output of setkey is changed. sequence number
of the sadb is replaced to the end of the output.
Obtained from: KAME
particularly nice that IPSEC inserts a zero-length mbuf into the
chain, and that bug should be fixed too, but interfaces should be
robust to bad input.
Print the interface name when TUNDEBUG()ing about dropping an mbuf.
This is to be friendly with non-IPv6 peer (If the peer complains due to
lack of IPv6CP, drop IPv6CP). This basically implements "RXJ+" state
transition in the RFC.
Obtained from: NetBSD
effect, which would cause unnecessary route deletion:
* Unfortunately, this has the obnoxious
* property of also triggering for insertion /above/ a pre-existing network
* route and clones. Sigh. This may be fixed some day.
The effect has been even worse, because recent versions of route.c set
the parent rtentry for cloned routes from an interface-direct route.
For example, suppose that we have an interface "ne0" that has an IPv4
subnet "10.0.0.0/24". Then we may have a cloned route like 10.0.0.1
on the interface, whose parent route is 10.0.0.0/24 (to the interface
ne0). Now, when we add the default route (i.e. 0.0.0.0/0),
rt_fixchange() will remove the cloned route 10.0.0.1. The (bad) effect
also prevents rt_setgate from configuring rt_gwroute, which would not
be an intended behavior.
As suggested in the comments to rt_fixchange(), we need stricter check
in the function, to prevent unintentional route deletion.
This fix also solve the "IPV6 panic?" problem in nd6_timer().
Submitted by: JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp>
MFC after: 4 days
vlan_unconfig()-ing an interface on which multicast groups have been
joined. Instead, keep the list of groups around (and, in fact, allow
changing of the membership list) and re-join them when the vlan interface
is reassociated with a lower level interface.
of tunclose() rather than the end, and tunopen() grabbed that unit
before tunclose() finished (one process is allocating it while another
is freeing it!).
It may be worth hanging some sort of rw mutex around all specinfo
calls where d_close and the detach handler get a write lock and all
other functions get a read lock. This would guarantee certain levels
of ``atomicity'' (is that a word?) that people may expect (I believe
Solaris does something like this).
requirements(RFC1573, interface MIB). This change for 4.4BSD was
first introduced in if_ethersubr.c:1.17->1.18.
BTW, iflastchange on all of IFs are inconsistent. e.g.
ether, tun: update
fddi, tokenring, ppp: not update
I'll make patch later.
Obtained from: KAME
MFC after: 2 weeks