freebsd-dev

Author	SHA1	Message	Date
Robert Watson	81158452be	Push acquisition of the accept mutex out of sofree() into the caller (sorele()/sotryfree()): - This permits the caller to acquire the accept mutex before the socket mutex, avoiding sofree() having to drop the socket mutex and re-order, which could lead to races permitting more than one thread to enter sofree() after a socket is ready to be free'd. - This also covers clearing of the so_pcb weak socket reference from the protocol to the socket, preventing races in clearing and evaluation of the reference such that sofree() might be called more than once on the same socket. This appears to close a race I was able to easily trigger by repeatedly opening and resetting TCP connections to a host, in which the tcp_close() code called as a result of the RST raced with the close() of the accepted socket in the user process resulting in simultaneous attempts to de-allocate the same socket. The new locking increases the overhead for operations that may potentially free the socket, so we will want to revise the synchronization strategy here as we normalize the reference counting model for sockets. The use of the accept mutex in freeing of sockets that are not listen sockets is primarily motivated by the potential need to remove the socket from the incomplete connection queue on its parent (listen) socket, so cleaning up the reference model here may allow us to substantially weaken the synchronization requirements. RELENG_5_3 candidate. MFC after: 3 days Reviewed by: dwhite Discussed with: gnn, dwhite, green Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de> Reported by: Vlad <marchenko at gmail dot com>	2004-10-18 22:19:43 +00:00
Robert Watson	5a0192650e	When notifying protocol components of an event on an in6pcb, use the result of the notify() function to decide if we need to unlock the in6pcb or not, rather than always unlocking. Otherwise, we may unlock and already unlocked in6pcb. Reported by: kuriyama, Gordon Bergling <gbergling at 0xfce3.net> Tested by: kuriyama, Gordon Bergling <gbergling at 0xfce3.net> Discussed with: mdodd	2004-08-21 17:38:48 +00:00
David Malone	1f44b0a1b5	Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD have already done this, so I have styled the patch on their work: 1) introduce a ip_newid() static inline function that checks the sysctl and then decides if it should return a sequential or random IP ID. 2) named the sysctl net.inet.ip.random_id 3) IPv6 flow IDs and fragment IDs are now always random. Flow IDs and frag IDs are significantly less common in the IPv6 world (ie. rarely generated per-packet), so there should be smaller performance concerns. The sysctl defaults to 0 (sequential IP IDs). Reviewed by: andre, silby, mlaier, ume Based on: NetBSD MFC after: 2 months	2004-08-14 15:32:40 +00:00
Robert Watson	f31f65a708	Pass pcbinfo structures to in6_pcbnotify() rather than pcbhead structures, allowing in6_pcbnotify() to lock the pcbinfo and each inpcb that it notifies of ICMPv6 events. This prevents inpcb assertions from firing when IPv6 generates and delievers event notifications for inpcbs. Reported by: kuriyama Tested by: kuriyama	2004-08-06 03:45:45 +00:00
Yaroslav Tykhiy	a4eb4405e3	Disallow a particular kind of port theft described by the following scenario: Alice is too lazy to write a server application in PF-independent manner. Therefore she knocks up the server using PF_INET6 only and allows the IPv6 socket to accept mapped IPv4 as well. An evil hacker known on IRC as cheshire_cat has an account in the same system. He starts a process listening on the same port as used by Alice's server, but in PF_INET. As a consequence, cheshire_cat will distract all IPv4 traffic supposed to go to Alice's server. Such sort of port theft was initially enabled by copying the code that implemented the RFC 2553 semantics on IPv4/6 sockets (see inet6(4)) for the implied case of the same owner for both connections. After this change, the above scenario will be impossible. In the same setting, the user who attempts to start his server last will get EADDRINUSE. Of course, using IPv4 mapped to IPv6 leads to security complications in the first place, but there is no reason to make it even more unsafe. This change doesn't apply to KAME since it affects a FreeBSD-specific part of the code. It doesn't modify the out-of-box behaviour of the TCP/IP stack either as long as mapping IPv4 to IPv6 is off by default. MFC after: 1 month	2004-07-28 13:03:07 +00:00
Robert Watson	07385abd73	Commit a first pass at in6pcb and pcbinfo locking for IPv6, synchronizing IPv6 protocol control blocks and lists. These changes are modeled on the inpcb locking for IPv4, submitted by Jennifer Yang, and committed by Jeffrey Hsu. With these locking changes, IPv6 use of inpcbs is now substantially more MPSAFE, and permits IPv4 inpcb locking assertions to be run in the presence of IPv6 compiled into the kernel.	2004-07-27 23:44:03 +00:00
Yaroslav Tykhiy	f66145c6bd	Don't consider TCP connections beyond LISTEN state (i.e. with the foreign address being not wildcard) when checking for possible port theft since such connections cannot be stolen. The port theft check is FreeBSD-specific and isn't in the KAME tree. PR: bin/65928 (in the audit trail) Reviewed by: -net, -hackers (silence) Tested by: Nick Leuta <skynick at mail.sc.ru> MFC after: 1 month	2004-07-27 16:35:09 +00:00
Colin Percival	56f21b9d74	Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb	2004-07-26 07:24:04 +00:00
Robert Watson	359fdba7a7	Missed directory in previous commit; need to hold SOCK_LOCK(so) before calling sotryfree(). -- Body of earlier bulk commit this belonged with -- Log: Extend coverage of SOCK_LOCK(so) to include so_count, the socket reference count: - Assert SOCK_LOCK(so) macros that directly manipulate so_count: soref(), sorele(). - Assert SOCK_LOCK(so) in macros/functions that rely on the state of so_count: sofree(), sotryfree(). - Acquire SOCK_LOCK(so) before calling these functions or macros in various contexts in the stack, both at the socket and protocol layers. - In some cases, perform soisdisconnected() before sotryfree(), as this could result in frobbing of a non-present socket if sotryfree() actually frees the socket. - Note that sofree()/sotryfree() will release the socket lock even if they don't free the socket. Submitted by: sam Sponsored by: FreeBSD Foundation Obtained from: BSD/OS	2004-06-12 20:59:48 +00:00
Warner Losh	f36cfd49ad	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 20:46:16 +00:00
Pawel Jakub Dawidek	b0330ed929	Reduce 'td' argument to 'cred' (struct ucred) argument in those functions: - in_pcbbind(), - in_pcbbind_setup(), - in_pcbconnect(), - in_pcbconnect_setup(), - in6_pcbbind(), - in6_pcbconnect(), - in6_pcbsetport(). "It should simplify/clarify things a great deal." --rwatson Requested by: rwatson Reviewed by: rwatson, ume	2004-03-27 21:05:46 +00:00
Hajimu UMEMOTO	efddf5c64d	supported IPV6_RECVPATHMTU socket option. Obtained from: KAME	2004-02-13 14:50:01 +00:00
Hajimu UMEMOTO	0f36f0e84a	fix build with FAST_IPSEC. Reported by: cjc	2004-02-09 16:02:16 +00:00
Hajimu UMEMOTO	a18310584f	call ipsec_pcbconn()/ipsec_pcbdisconn() from in6_pcbconnect(). Obtained from: KAME	2004-01-13 10:45:02 +00:00
Hajimu UMEMOTO	5fac41781b	in set{peer, sock}addr, do not convert the unspecified address (::) to the mapped address form. PR: kern/22868 Obtained from: KAME MFC after: 3 days	2004-01-10 08:11:51 +00:00
Andre Oppermann	97d8d152c2	Introduce tcp_hostcache and remove the tcp specific metrics from the routing table. Move all usage and references in the tcp stack from the routing table metrics to the tcp hostcache. It caches measured parameters of past tcp sessions to provide better initial start values for following connections from or to the same source or destination. Depending on the network parameters to/from the remote host this can lead to significant speedups for new tcp connections after the first one because they inherit and shortcut the learning curve. tcp_hostcache is designed for multiple concurrent access in SMP environments with high contention and is hash indexed by remote ip address. It removes significant locking requirements from the tcp stack with regard to the routing table. Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl)	2003-11-20 20:07:39 +00:00
Sam Leffler	9c63e9dbd7	Overhaul routing table entry cleanup by introducing a new rtexpunge routine that takes a locked routing table reference and removes all references to the entry in the various data structures. This eliminates instances of recursive locking and also closes races where the lock on the entry had to be dropped prior to calling rtrequest(RTM_DELETE). This also cleans up confusion where the caller held a reference to an entry that might have been reclaimed (and in some cases used that reference). Supported by: FreeBSD Foundation	2003-10-30 23:02:51 +00:00
Hajimu UMEMOTO	f95d46333d	Switch Advanced Sockets API for IPv6 from RFC2292 to RFC3542 (aka RFC2292bis). Though I believe this commit doesn't break backward compatibility againt existing binaries, it breaks backward compatibility of API. Now, the applications which use Advanced Sockets API such as telnet, ping6, mld6query and traceroute6 use RFC3542 API. Obtained from: KAME	2003-10-24 18:26:30 +00:00
Hajimu UMEMOTO	7efe5d92ab	- fix typo in comments. - style. - NULL is not 0. - some variables were renamed. - nuke unused logic. (there is no functional change.) Obtained from: KAME	2003-10-08 18:26:08 +00:00
Hajimu UMEMOTO	40e39bbb67	return(code) -> return (code) (reduce diffs against KAME)	2003-10-06 14:02:09 +00:00
Sam Leffler	d1dd20be6e	Locking for updates to routing table entries. Each rtentry gets a mutex that covers updates to the contents. Note this is separate from holding a reference and/or locking the routing table itself. Other/related changes: o rtredirect loses the final parameter by which an rtentry reference may be returned; this was never used and added unwarranted complexity for locking. o minor style cleanups to routing code (e.g. ansi-fy function decls) o remove the logic to bump the refcnt on the parent of cloned routes, we assume the parent will remain as long as the clone; doing this avoids a circularity in locking during delete o convert some timeouts to MPSAFE callouts Notes: 1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level applications cannot/do-no know about mutex's. Doing this requires that the mutex be the last element in the structure. A better solution is to introduce an externalized version of struct rtentry but this is a major task because of the intertwining of rtentry and other data structures that are visible to user applications. 2. There are known LOR's that are expected to go away with forthcoming work to eliminate many held references. If not these will be resolved prior to release. 3. ATM changes are untested. Sponsored by: FreeBSD Foundation Obtained from: BSD/OS (partly)	2003-10-04 03:44:50 +00:00
Hajimu UMEMOTO	b79274ba41	randomize IPv6 flowlabel when RANDOM_IP_ID is defined. Obtained from: KAME	2003-10-01 21:24:28 +00:00
Ruslan Ermilov	78f94aa951	Fix a bunch of off-by-one errors in the range checking code.	2003-09-11 21:40:21 +00:00
Olivier Houchard	56e6821e56	Do not attempt to access to inp_socket fields if the socket is in the TIME_WAIT state, as inp_socket will then be NULL. This fixes a panic that occurs when one tries to bind a port that was previously binded with remaining TIME_WAIT sockets.	2003-06-17 00:31:30 +00:00
Jonathan Lemon	340c35de6a	Add a TCP TIMEWAIT state which uses less space than a fullblown TCP control block. Allow the socket and tcpcb structures to be freed earlier than inpcb. Update code to understand an inp w/o a socket. Reviewed by: hsu, silby, jayanth Sponsored by: DARPA, NAI Labs	2003-02-19 22:32:43 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Sam Leffler	b9234fafa0	Tie new "Fast IPsec" code into the build. This involves the usual configuration stuff as well as conditional code in the IPv4 and IPv6 areas. Everything is conditional on FAST_IPSEC which is mutually exclusive with IPSEC (KAME IPsec implmentation). As noted previously, don't use FAST_IPSEC with INET6 at the moment. Reviewed by: KAME, rwatson Approved by: silence Supported by: Vernier Networks	2002-10-16 02:25:05 +00:00
Don Lewis	26ef6ac4df	Create new functions in_sockaddr(), in6_sockaddr(), and in6_v4mapsin6_sockaddr() which allocate the appropriate sockaddr_in* structure and initialize it with the address and port information passed as arguments. Use calls to these new functions to replace code that is replicated multiple times in in_setsockaddr(), in_setpeeraddr(), in6_setsockaddr(), in6_setpeeraddr(), in6_mapped_sockaddr(), and in6_mapped_peeraddr(). Inline COMMON_END in tcp_usr_accept() so that we can call in_sockaddr() with temporary copies of the address and port after the PCB is unlocked. Fix the lock violation in tcp6_usr_accept() (caused by calling MALLOC() inside in6_mapped_peeraddr() while the PCB is locked) by changing the implementation of tcp6_usr_accept() to match tcp_usr_accept(). Reviewed by: suz	2002-08-21 11:57:12 +00:00
Jeffrey Hsu	3ce144ea88	Notify functions can destroy the pcb, so they have to return an indication of whether this happenned so the calling function knows whether or not to unlock the pcb. Submitted by: Jennifer Yang (yangjihui@yahoo.com) Bug reported by: Sid Carter (sidcarter@symonds.net)	2002-06-14 08:35:21 +00:00
Jeffrey Hsu	bdc2055256	As a stop-gap measure, add one INP_LOCK_DESTROY() to in6_pcbdetach() to get kernel compiled with INET6 to boot.	2002-06-12 06:01:22 +00:00
Jeffrey Hsu	f76fcf6d4c	Lock up inpcb. Submitted by: Jennifer Yang <yangjihui@yahoo.com>	2002-06-10 20:05:46 +00:00
Seigo Tanimura	4cc20ab1f0	Back out my lats commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
Seigo Tanimura	243917fe3b	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each members in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
SUZUKI Shinsuke	88ff5695c1	just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD. (based on freebsd4-snap-20020128) Reviewed by: ume MFC after: 1 week	2002-04-19 04:46:24 +00:00
John Baldwin	44731cab3b	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
Jeff Roberson	9454002711	Remove references to vm_zone.h and switch over to the new uma API.	2002-03-20 08:03:54 +00:00
Alfred Perlstein	c4c0c592f0	Fix warnings caused by discarding const. Hairy Eyeball At: peter	2002-02-27 02:44:45 +00:00
Matthew Dillon	b1e4abd246	Give struct socket structures a ref counting interface similar to vnodes. This will hopefully serve as a base from which we can expand the MP code. We currently do not attempt to obtain any mutex or SX locks, but the door is open to add them when we nail down exactly how that part of it is going to work.	2001-11-17 03:07:11 +00:00
Ruslan Ermilov	8071913df2	Pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2. Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo '' as the argument. Pass rt_addrinfo all the way down to rtrequest1 and ifa->ifa_rtrequest. 3rd argument of ifa->ifa_rtrequest is now ``rt_addrinfo '' instead of ``sockaddr '' (almost noone is using it anyways). Benefit: the following command now works. Previously we needed two route(8) invocations, "add" then "change". # route add -inet6 default ::1 -ifp gif0 Remove unsafe typecast in rtrequest(), from ``rtentry '' to ``sockaddr *''. It was introduced by 4.3BSD-Reno and never corrected. Obtained from: BSD/OS, NetBSD MFC after: 1 month PR: kern/28360	2001-10-17 18:07:05 +00:00
Ruslan Ermilov	65ad7c437d	Pull fix for memory leak in in6_losing() from netinet/in_pcb.c,v 1.85. MFC after: 1 week	2001-10-17 17:42:23 +00:00
Brooks Davis	9494d5968f	Make faith loadable, unloadable, and clonable.	2001-09-25 18:40:52 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Jonathan Lemon	f9132cebdc	Wrap array accesses in macros, which also happen to be lvalues: ifnet_addrs[i - 1] -> ifaddr_byindex(i) ifindex2ifnet[i] -> ifnet_byindex(i) This is intended to ease the conversion to SMPng.	2001-09-06 02:40:43 +00:00
Hajimu UMEMOTO	e43cc4ae36	When running aplication joined multicast address, removing network card, and kill aplication. imo_membership[].inm_ifp refer interface pointer after removing interface. When kill aplication, release socket,and imo_membership. imo_membership use already not exist interface pointer. Then, kernel panic. PR: 29345 Submitted by: Inoue Yuichi <inoue@nd.net.fujitsu.co.jp> Obtained from: KAME MFC after: 3 days	2001-08-04 17:10:14 +00:00
Hajimu UMEMOTO	3384154590	Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz and some critical problem after the snap was out were fixed. There are many many changes since last KAME merge. TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT. Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks	2001-06-11 12:39:29 +00:00
Dag-Erling Smørgrav	a3ea6d41b9	First step towards an MP-safe zone allocator: - have zalloc() and zfree() always lock the vm_zone. - remove zalloci() and zfreei(), which are now redundant. Reviewed by: bmilekic, jasone	2001-01-21 22:23:11 +00:00
Jun-ichiro itojun Hagino	686cdd19b1	sync with kame tree as of july00. tons of bug fixes/improvements. API changes: - additional IPv6 ioctls - IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8). (also syntax change)	2000-07-04 16:35:15 +00:00
Yoshinobu Inoue	fdaf052eb3	Support per socket based IPv4 mapped IPv6 addr enable/disable control. Submitted by: ume	2000-04-01 22:35:47 +00:00
Brian Feldman	333aa64d05	in6_pcb.c: Remove a bogus (redundant, just weird, etc.) key_freeso(so). There are no consumers of it now, nor does it seem there ever will be. in6?_pcb.c: Add an if (inp->in6?p_sp != NULL) before the call to ipsec[46]_delete_pcbpolicy(inp). In low-memory conditions this can cause a crash because in6?_sp can be NULL...	2000-03-22 02:27:30 +00:00

1 2

60 Commits