Commit Graph

55 Commits

Author SHA1 Message Date
glebius
36031fbf77 Backout revision 1.54, because it exposes a worse problem, than
it fixes. I believe the problem lives somewhere outside ng_ksocket,
but until it is found, let the node be working.

PR:		kern/84952
PR:		kern/82413
MFC after:	3 days
2005-08-25 07:21:15 +00:00
glebius
d4c770bc93 Catch up with new ng_send_fn1() interface. 2005-05-16 17:07:39 +00:00
glebius
23a1a3e3a9 When used as divert socket we need to decouple stack when node is entered
from socket side. Use ng_queue_fn() instead of ng_send_fn().
2005-05-13 11:40:08 +00:00
glebius
dcecb5d15d Fix panics with misconfigured routing:
- Backout previous revision, the check is useless.
- Turn node to queue mode, since it is edge node.

Reported by:	sem
2005-04-18 11:32:17 +00:00
glebius
5af7592fca Reimplement recursion protection, checking whether current thread holds
sockbuf mutex.

Reviewed by:	rwatson
2005-02-19 14:41:49 +00:00
glebius
471fd11ce3 Remove a recursion protection, which we inherited from splnet() netgraph times.
Now several threads may write data to ng_ksocket. Locking of socket is done in
sosend().

Reviewed by:	archie, julian, rwatson
MFC after:	2 weeks
2005-02-16 16:00:35 +00:00
glebius
91f542fb50 Allocate enough space for new tag.
Pointy hat to:	glebius
2005-02-12 16:26:36 +00:00
glebius
f36bba96f5 When netgraph(4) was converted to use mbuf_tags(9) instead of meta-data
a definite setup was broken: two ng_ksockets are connected to each other,
connect()ed to different remote hosts, and bind()ed to different local
interfaces. In this case one ng_ksocket is fooled with tag from the other
one.

Put node id into tag. In rcvdata method utilize tag only if it has our
own id inside or id equals zero. The latter case is added to support
packets send by some third, not ng_ksocket node.

MFC after:	1 week
2005-02-12 14:54:19 +00:00
imp
a50ffc2912 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
rwatson
2a581a76ba In FreeBSD 5.x, curthread is always defined, so we don't need to to test
and optionally use &thread0 if it's NULL.

Spotted by:	julian
2004-09-02 19:53:13 +00:00
julian
dfb6d51195 Convert Netgraph to use mbuf tags to pass its meta information around.
Thanks to Sam for importing tags in a way that allowed this to be done.

Submitted by:	Gleb Smirnoff <glebius@cell.sick.ru>
Also allow the sr and ar drivers to create netgraph versions of their modules.
Document the change to the ksocket node.
2004-06-25 19:22:05 +00:00
rwatson
855c4bb01f Merge additional socket buffer locking from rwatson_netperf:
- Lock down low hanging fruit use of sb_flags with socket buffer
  lock.

- Lock down low hanging fruit use of so_state with socket lock.

- Lock down low hanging fruit use of so_options.

- Lock down low-hanging fruit use of sb_lowwat and sb_hiwat with
  socket buffer lock.

- Annotate situations in which we unlock the socket lock and then
  grab the receive socket buffer lock, which are currently actually
  the same lock.  Depending on how we want to play our cards, we
  may want to coallesce these lock uses to reduce overhead.

- Convert a if()->panic() into a KASSERT relating to so_state in
  soaccept().

- Remove a number of splnet()/splx() references.

More complex merging of socket and socket buffer locking to
follow.
2004-06-17 22:48:11 +00:00
rwatson
f2c0db1521 The socket field so_state is used to hold a variety of socket related
flags relating to several aspects of socket functionality.  This change
breaks out several bits relating to send and receive operation into a
new per-socket buffer field, sb_state, in order to facilitate locking.
This is required because, in order to provide more granular locking of
sockets, different state fields have different locking properties.  The
following fields are moved to sb_state:

  SS_CANTRCVMORE            (so_state)
  SS_CANTSENDMORE           (so_state)
  SS_RCVATMARK              (so_state)

Rename respectively to:

  SBS_CANTRCVMORE           (so_rcv.sb_state)
  SBS_CANTSENDMORE          (so_snd.sb_state)
  SBS_RCVATMARK             (so_rcv.sb_state)

This facilitates locking by isolating fields to be located with other
identically locked fields, and permits greater granularity in socket
locking by avoiding storing fields with different locking semantics in
the same short (avoiding locking conflicts).  In the future, we may
wish to coallesce sb_state and sb_flags; for the time being I leave
them separate and there is no additional memory overhead due to the
packing/alignment of shorts in the socket buffer structure.
2004-06-14 18:16:22 +00:00
rwatson
82295697cd Extend coverage of SOCK_LOCK(so) to include so_count, the socket
reference count:

- Assert SOCK_LOCK(so) macros that directly manipulate so_count:
  soref(), sorele().

- Assert SOCK_LOCK(so) in macros/functions that rely on the state of
  so_count: sofree(), sotryfree().

- Acquire SOCK_LOCK(so) before calling these functions or macros in
  various contexts in the stack, both at the socket and protocol
  layers.

- In some cases, perform soisdisconnected() before sotryfree(), as
  this could result in frobbing of a non-present socket if
  sotryfree() actually frees the socket.

- Note that sofree()/sotryfree() will release the socket lock even if
  they don't free the socket.

Submitted by:	sam
Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS
2004-06-12 20:47:32 +00:00
rwatson
576b26bafd Integrate accept locking from rwatson_netperf, introducing a new
global mutex, accept_mtx, which serializes access to the following
fields across all sockets:

          so_qlen          so_incqlen         so_qstate
          so_comp          so_incomp          so_list
          so_head

While providing only coarse granularity, this approach avoids lock
order issues between sockets by avoiding ownership of the fields
by a specific socket and its per-socket mutexes.

While here, rewrite soclose(), sofree(), soaccept(), and
sonewconn() to add assertions, close additional races and  address
lock order concerns.  In particular:

- Reorganize the optimistic concurrency behavior in accept1() to
  always allocate a file descriptor with falloc() so that if we do
  find a socket, we don't have to encounter the "Oh, there wasn't
  a socket" race that can occur if falloc() sleeps in the current
  code, which broke inbound accept() ordering, not to mention
  requiring backing out socket state changes in a way that raced
  with the protocol level.  We may want to add a lockless read of
  the queue state if polling of empty queues proves to be important
  to optimize.

- In accept1(), soref() the socket while holding the accept lock
  so that the socket cannot be free'd in a race with the protocol
  layer.  Likewise in netgraph equivilents of the accept1() code.

- In sonewconn(), loop waiting for the queue to be small enough to
  insert our new socket once we've committed to inserting it, or
  races can occur that cause the incomplete socket queue to
  overfill.  In the previously implementation, it was sufficient
  to simply tested once since calling soabort() didn't release
  synchronization permitting another thread to insert a socket as
  we discard a previous one.

- In soclose()/sofree()/et al, it is the responsibility of the
  caller to remove a socket from the incomplete connection queue
  before calling soabort(), which prevents soabort() from having
  to walk into the accept socket to release the socket from its
  queue, and avoids races when releasing the accept mutex to enter
  soabort(), permitting soabort() to avoid lock ordering issues
  with the caller.

- Generally cluster accept queue related operations together
  throughout these functions in order to facilitate locking.

Annotate new locking in socketvar.h.
2004-06-02 04:15:39 +00:00
rwatson
bddadcf71a The SS_COMP and SS_INCOMP flags in the so_state field indicate whether
the socket is on an accept queue of a listen socket.  This change
renames the flags to SQ_COMP and SQ_INCOMP, and moves them to a new
state field on the socket, so_qstate, as the locking for these flags
is substantially different for the locking on the remainder of the
flags in so_state.
2004-06-01 02:42:56 +00:00
julian
c85e63d425 Switch to using C99 sparse initialisers for the type methods array.
Should make no binary difference.

Submitted by:	Gleb Smirnoff <glebius@cell.sick.ru>
Reviewed by:	Harti Brandt <harti@freebsd.org>
MFC after:	1 week
2004-05-29 00:51:19 +00:00
harti
14cb10664f Get rid of the deprecated *LEN constants in favour of the new
*SIZ constants that include the trailing \0 byte.
2004-01-26 14:05:31 +00:00
ru
6c49c65d27 Replaced two bzero() calls with the M_ZERO flag to malloc().
Reviewed by:	julian
2003-12-17 11:48:18 +00:00
hsu
4904c77f42 Add Protocol Independent Multicast protocol.
Submitted by:	Pavlin Radoslavov <pavlin@icir.org>
2003-08-20 22:11:58 +00:00
archie
389061c125 Add missing braces.
Submitted by:	Andrew Lankford <arlankfo@141.com>
2003-04-28 20:38:05 +00:00
benno
bb76739de0 Reference the socket we're accepting. 2002-09-14 08:56:10 +00:00
benno
f69ecac521 Remember who asked for a connect or accept operation so we can actually tell
them when it's done.

Reviewed by:	archie
2002-09-11 00:52:50 +00:00
archie
d8ca8557f2 Don't use "NULL" when "0" is really meant. 2002-08-22 00:30:03 +00:00
archie
b10887dc2b Fix GCC warnings caused by initializing a zero length array. In the process,
simply things a bit by getting rid of 'struct ng_parse_struct_info' which
was useless because it only contained one field.

MFC after:	2 weeks
2002-05-31 23:48:03 +00:00
tanimura
e6fa9b9e92 Back out my lats commit of locking down a socket, it conflicts with hsu's work.
Requested by:	hsu
2002-05-31 11:52:35 +00:00
tanimura
92d8381dd5 Lock down a socket, milestone 1.
o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a
  socket buffer. The mutex in the receive buffer also protects the data
  in struct socket.

o Determine the lock strategy for each members in struct socket.

o Lock down the following members:

  - so_count
  - so_options
  - so_linger
  - so_state

o Remove *_locked() socket APIs.  Make the following socket APIs
  touching the members above now require a locked socket:

 - sodisconnect()
 - soisconnected()
 - soisconnecting()
 - soisdisconnected()
 - soisdisconnecting()
 - sofree()
 - soref()
 - sorele()
 - sorwakeup()
 - sotryfree()
 - sowakeup()
 - sowwakeup()

Reviewed by:	alfred
2002-05-20 05:41:09 +00:00
jhb
3706cd3509 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
julian
b5eb64d6f0 Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
2002-02-07 20:58:47 +00:00
archie
8d266cece1 Avoid reentrantly sending on the same socket, which causes a kernel panic. 2002-01-06 01:08:30 +00:00
rwatson
5eea21ccca o Make the credential used by socreate() an explicit argument to
socreate(), rather than getting it implicitly from the thread
  argument.

o Make NFS cache the credential provided at mount-time, and use
  the cached credential (nfsmount->nm_cred) when making calls to
  socreate() on initially connecting, or reconnecting the socket.

This fixes bugs involving NFS over TCP and ipfw uid/gid rules, as well
as bugs involving NFS and mandatory access control implementations.

Reviewed by:	freebsd-arch
2001-12-31 17:45:16 +00:00
obrien
7fd9a6a23a Update to C99, s/__FUNCTION__/__func__/,
also don't use ANSI string concatenation.
2001-12-10 08:09:49 +00:00
archie
5ede7a3e25 When a socket is not connected, allow the peer "struct sockaddr"
to be included in the meta information that is associated with
incoming and outgoing packets.

Reviewed by:	julian
MFC after:	1 week
2001-11-28 19:39:58 +00:00
archie
11a08756a6 Let "raw" mean IPPROTO_RAW instead of IPPROTO_IP.
Noticed by:	jdp
MFC after:	3 days
2001-10-10 19:51:13 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
julian
f307b418db First pass at porting John's "accept" changes to
allow an in-kernel webserver (or similar) to accept
and handle incoming connections using netgraph without ever leaving the
kernel. (allows incoming tunnel requests to be
handled totally within the kernel for example)

Needs work, but shouldn't break existing functionality.

Submitted by:	John Polstra <jdp@polstra.com>
MFC after:	2 weeks
2001-09-07 07:12:51 +00:00
archie
58771d9471 Fix an erroneous comment and two style(9) bugs. 2001-02-16 17:37:31 +00:00
julian
bb6dbcf3b2 Fix some memory leaks
Add memory leak detection assitance.
2001-01-10 07:13:58 +00:00
julian
ff86256bf7 Part 2 of the netgraph rewrite.
This is mostly cosmetic changes, (though I caught a bug or two while
makeing them)
Reviewed by:	archie@freebsd.org
2001-01-08 05:34:06 +00:00
julian
f0c46a9d00 Rewrite of netgraph to start getting ready for SMP.
This version is functional and is aproaching solid..
notice I said APROACHING. There are many node types I cannot test
I have tested: echo hole ppp socket vjc iface tee bpf async tty
The rest compile and "Look" right.  More changes to follow.
DEBUGGING is enabled in this code to help if people have problems.
2001-01-06 00:46:47 +00:00
julian
0c949560a1 Divorce the kernel binary ABI version number from the message
format version number. (userland programs should not need to be
recompiled when the netgraph kernel internal ABI is changed.

Also fix modules that don;t handle the fact that a caller may not supply
a return message pointer. (benign at the moment because the calling code
checks, but that will change)
2000-12-18 20:03:32 +00:00
julian
2d1192e612 Reviewed by: Archie@freebsd.org
This clears out my outstanding netgraph changes.
There is a netgraph change of design in the offing and this is to some
extent a superset of soem of the new functionality and some of the old
functionality that may be removed.

This code works as before, but allows some new features that I want to
work with and evaluate. It is the basis for a version of netgraph
with integral locking for SMP use.

This is running on my test machine with no new problems :-)
2000-12-12 18:52:14 +00:00
dwmalone
918549eb31 Add the use of M_ZERO to netgraph.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
Submitted by:	archie
Approved by:	archie
2000-11-18 15:17:43 +00:00
brian
5d83d171df Go back to using data_len in struct ngpppoe_init_data after discussions
with Julian and Archie.

Implement a new ``sizedstring'' parse type for dealing with field pairs
consisting of a uint16_t followed by a data field of that size, and use
this to deal with the data_len and data fields.

Written by:		Archie with some input by me
Agreed in principle by:	julian
2000-11-16 23:14:53 +00:00
julian
bf2394b8d4 Since neither archie nor I work at Whistle any more, change our email
addresses to be the more usefu @freebsd.org ones
so we can keep getting bug-reports.
- man pages to follow..
2000-10-24 17:32:45 +00:00
archie
b8962bd43f Fix memory leak.
Submitted by:	Christopher N. Harrell <cnh@ivmg.net>
2000-10-11 19:04:34 +00:00
archie
1023c32e1a Allocate all memory (including within node constructors) with M_NOWAIT
instead of M_WAITOK, to allow for maximum flexibility.
2000-09-21 18:01:23 +00:00
archie
91912098c3 Take advantage of the new unsigned and hex integer types. 2000-08-10 22:45:54 +00:00
archie
7c1fac0cb6 In a struct sockaddr, sa->sa_len can be zero if uninitialized.
Make sure that this doesn't cause a problem when parsing.
2000-08-09 23:57:44 +00:00
julian
49604b4259 Two simple changes to the kernel internal API for netgraph modules,
to support future work in flow-control and 'packet reject/replace'
processing modes.

reviewed by: phk, archie
2000-04-28 17:09:00 +00:00