rather than pointers, requiring callers to properly dispose of those
references. The following routines now return references:
ifaddr_byindex
ifa_ifwithaddr
ifa_ifwithbroadaddr
ifa_ifwithdstaddr
ifa_ifwithnet
ifaof_ifpforaddr
ifa_ifwithroute
ifa_ifwithroute_fib
rt_getifa
rt_getifa_fib
IFP_TO_IA
ip_rtaddr
in6_ifawithifp
in6ifa_ifpforlinklocal
in6ifa_ifpwithaddr
in6_ifadd
carp_iamatch6
ip6_getdstifaddr
Remove unused macro which didn't have required referencing:
IFP_TO_IA6
This closes many small races in which changes to interface
or address lists while an ifaddr was in use could lead to use of freed
memory (etc). In a few cases, add missing if_addr_list locking
required to safely acquire references.
Because of a lack of deep copying support, we accept a race in which
an in6_ifaddr pointed to by mbuf tags and extracted with
ip6_getdstifaddr() doesn't hold a reference while in transmit. Once
we have mbuf tag deep copy support, this can be fixed.
Reviewed by: bz
Obtained from: Apple, Inc. (portions)
MFC after: 6 weeks (portions)
a pointer to an ifaddr matching the passed socket address, returns a
boolean indicating whether one was present. In the (near) future,
ifa_ifwithaddr() will return a referenced ifaddr rather than a raw
ifaddr pointer, and the new wrapper will allow callers that care only
about the boolean condition to avoid having to free that reference.
MFC after: 3 weeks
a new rwlock, ipx_ifaddr_rw, wrapped with macros. This locking is
necessary but not sufficient, in isolation, to satisfy the stability
requirements of a fully parallel IPX input path during interface
reconfiguration.
MFC after: 3 weeks
protocol entry points using functions named proto_getsockaddr and
proto_getpeeraddr rather than proto_setsockaddr and proto_setpeeraddr.
While it's true that sockaddrs are allocated and set, the net effect is
to retrieve (get) the socket address or peer address from a socket, not
set it, so align names to that intent.
specific privilege names to a broad range of privileges. These may
require some future tweaking.
Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>
- Introduce invariant that all IPX/SPX sockets will have valid so_pcb
pointers to ipxpcb structures, and that for SPX, the control block
pointer will always be valid. Don't attempt to free the socket or
pcb at various odd points, such as disconnect.
- Add a new ipxpcb flag, IPXP_DROPPED, which will be set in place of
freeing PCB's so that this invariant can be maintained. This flag
is now checked instead of a NULL check in various socket protocol
calls.
- Introduce many assertions that this invariant holds.
- Various pieces of code, such as the SPX timer code, no longer needs
to jump through hoops in case it frees a PCB while running.
- Break out ipx_pcbfree() from ipx_pcbdetach(). Likewise
spx_pcbdetach().
- Comment on some SMP-related limitations to the SPX code.
- Update copyrights.
MFC after: 1 month
being committed:
- Wrap comments more evenly on right border.
- Clean up braces.
Also, along similar lines:
- Assert some pointers are non-NULL before dereferencing them.
- Remove one assertion that looks, on face value, poor.
MFC after: 1 month
the IPX-related PCB routines. In general, the list lock is required
to iterate the PCB list, either for read or write; the PCB lock is
required to access or modify a PCB. To change the binding of a PCB,
both locks must be held.
MFC after: 3 weeks
IPX PCB lists. Add macros to initialize, destroy, lock, unlock,
and assert the mutex. Initialize the mutex when IPX is started.
Add per-IPX PCB mutexes, ipxp_mtx in struct ipxpcb, to protect
per-PCB IPX/SPX state. Add macros to initialize, destroy, lock,
unlock, and assert the mutex. Initialize the mutex when a new
PCB is allocated; destroy it when the PCB is free'd.
MFC after: 2 weeks
the peer address by using M_WAITOK in ipx_setpeeraddr() to prevent
allocation failure. The socket reference used to reach these calls
will prevent the ipxpcb from being released prematurely.
expects a locked route reference. This removes a panic that occurs
when connected ipxpcb is closed and its route free'd, and may have been
present since the route locking took place.
MFC after: 2 weeks
- ipx_pcbnotify(), which is never called.
- ipx_rtchange(), which is never called, is incomplete inplemented, and
also #ifdef notdef.
- spx_fixmtu(), which is never called, is incompletely implemented, and
also #ifdef notdef.
(sorele()/sotryfree()):
- This permits the caller to acquire the accept mutex before the socket
mutex, avoiding sofree() having to drop the socket mutex and re-order,
which could lead to races permitting more than one thread to enter
sofree() after a socket is ready to be free'd.
- This also covers clearing of the so_pcb weak socket reference from
the protocol to the socket, preventing races in clearing and
evaluation of the reference such that sofree() might be called more
than once on the same socket.
This appears to close a race I was able to easily trigger by repeatedly
opening and resetting TCP connections to a host, in which the
tcp_close() code called as a result of the RST raced with the close()
of the accepted socket in the user process resulting in simultaneous
attempts to de-allocate the same socket. The new locking increases
the overhead for operations that may potentially free the socket, so we
will want to revise the synchronization strategy here as we normalize
the reference counting model for sockets. The use of the accept mutex
in freeing of sockets that are not listen sockets is primarily
motivated by the potential need to remove the socket from the
incomplete connection queue on its parent (listen) socket, so cleaning
up the reference model here may allow us to substantially weaken the
synchronization requirements.
RELENG_5_3 candidate.
MFC after: 3 days
Reviewed by: dwhite
Discussed with: gnn, dwhite, green
Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de>
Reported by: Vlad <marchenko at gmail dot com>
reference count:
- Assert SOCK_LOCK(so) macros that directly manipulate so_count:
soref(), sorele().
- Assert SOCK_LOCK(so) in macros/functions that rely on the state of
so_count: sofree(), sotryfree().
- Acquire SOCK_LOCK(so) before calling these functions or macros in
various contexts in the stack, both at the socket and protocol
layers.
- In some cases, perform soisdisconnected() before sotryfree(), as
this could result in frobbing of a non-present socket if
sotryfree() actually frees the socket.
- Note that sofree()/sotryfree() will release the socket lock even if
they don't free the socket.
Submitted by: sam
Sponsored by: FreeBSD Foundation
Obtained from: BSD/OS
functions in kern_socket.c.
Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT
in from the caller context rather than "1" or "0".
Correct mflags pass into mac_init_socket() from previous commit to not
include M_ZERO.
Submitted by: sam
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a
socket buffer. The mutex in the receive buffer also protects the data
in struct socket.
o Determine the lock strategy for each members in struct socket.
o Lock down the following members:
- so_count
- so_options
- so_linger
- so_state
o Remove *_locked() socket APIs. Make the following socket APIs
touching the members above now require a locked socket:
- sodisconnect()
- soisconnected()
- soisconnecting()
- soisdisconnected()
- soisdisconnecting()
- sofree()
- soref()
- sorele()
- sorwakeup()
- sotryfree()
- sowakeup()
- sowwakeup()
Reviewed by: alfred
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
vnodes. This will hopefully serve as a base from which we can
expand the MP code. We currently do not attempt to obtain any
mutex or SX locks, but the door is open to add them when we nail
down exactly how that part of it is going to work.
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
<sys/proc.h> to <sys/systm.h>.
Correctly document the #includes needed in the manpage.
Add one now needed #include of <sys/systm.h>.
Remove the consequent 48 unused #includes of <sys/proc.h>.
1:
s/suser/suser_xxx/
2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with
later.
There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
socket addresses in mbufs. (Socket buffers are the one exception.) A number
of kernel APIs needed to get fixed in order to make this happen. Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead. Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
Use the MAC address of an interface for the host part of an IPX address
and not the MAC address of the first interface for every IPX address.
This is more inline with the way others like Novell do it.
Mostly Submitted by: "Serge A. Babkin" <babkin@hq.icb.chel.su>
Take out the error messages (the ip icmp equivalent) with #ifdef IPXERRORMSGS.
This is bogus and as far as I could figure out IPX don't have anything like
it. This is a leftover from its XNS heritage. If nobody complains, I will
take it out completely in a few weeks.
Add some more ipxstat statistics counters.
Make ipxprintfs a sysctl variable and off by default.
Add IPX Netbios "routing" support. This is off by default and can be
switched on with a sysctl knob.
General code cleanup to at least use the same style throughout the IPX
code, but also be more style(9) conformant. Also make a lot of functions
static.
If I don't get any complaints I'll bring all of this over to the 2.2 tree
in a few weeks.
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.