Commit Graph

74 Commits

Author SHA1 Message Date
Michael Tuexen
d089f9b915 Add FIB support for SCTP.
This fixes https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200379

MFC after: 3 days
2015-06-17 15:20:14 +00:00
Michael Tuexen
59b6d5be4e Add a SCTP socket option to limit the cwnd for each path.
MFC after: 1 month
2015-03-10 19:49:25 +00:00
Michael Tuexen
4e88d37a2a Do the renaming of sb_cc to sb_ccc in a way with less code changes by
using a macro.
This is an alternate approach to
https://svnweb.freebsd.org/changeset/base/275326
which is easier to handle upstream.

Discussed with: rrs, glebius
2014-12-02 20:29:29 +00:00
Gleb Smirnoff
0f9d0a73a4 Merge from projects/sendfile:
o Introduce a notion of "not ready" mbufs in socket buffers.  These
mbufs are now being populated by some I/O in background and are
referenced outside.  This forces following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- An mbuf that is behind a "not ready" in the queue neither.
- If sockbet buffer is flushed, then "not ready" mbufs shouln't be
  freed.

o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
  The sb_ccc stands for ""claimed character count", or "committed
  character count".  And the sb_acc is "available character count".
  Consumers of socket buffer API shouldn't already access them directly,
  but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
  with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
  search.
o New function sbready() is provided to activate certain amount of mbufs
  in a socket buffer.

A special note on SCTP:
  SCTP has its own sockbufs.  Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs.  Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them.  The new
notion of "not ready" data isn't supported by SCTP.  Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
  A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does.  This was discussed with rrs@.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 12:52:33 +00:00
Michael Tuexen
c79bec9c75 Add support for the SCTP_AUTH_SUPPORTED and SCTP_ASCONF_SUPPORTED
socket options. Add also a sysctl to control the support of ASCONF.

MFC after: 1 week
2014-08-12 11:30:16 +00:00
Michael Tuexen
317e00ef86 Add support for the SCTP_RECONFIG_SUPPORTED and the corresponding
sysctl controlling the negotiation of the RE-CONFIG extension.

MFC after: 3 days
2014-08-04 20:07:35 +00:00
Michael Tuexen
cb9b8e6f7d Add support for the SCTP_PKTDROP_SUPPORTED socket option and
the corresponding sysctl variable.
The default is off, since the specification is not an RFC yet.

MFC after: 1 week
2014-08-03 18:12:55 +00:00
Michael Tuexen
caea98793f Add SCTP socket option SCTP_NRSACK_SUPPORTED to control the
NRSACK extension. The default will still be off, since it
it not an RFC (yet).
Changing the sysctl name will be in a separate commit.

MFC after: 1 week
2014-08-03 14:10:10 +00:00
Michael Tuexen
dd973b0e15 Add support for the SCTP_PR_SUPPORTED socket option as specified in
http://tools.ietf.org/html/draft-ietf-tsvwg-sctp-prpolicies
Add also a sysctl controlling the default of the end-points.

MFC after: 1 week
2014-08-02 21:36:40 +00:00
Michael Tuexen
f342355a0e Cleanup the ECN configuration handling and provide an SCTP socket
option for controlling ECN on future associations and get the
status on current associations.
A simialar pattern will be used for controlling SCTP extensions in
upcoming commits.
2014-08-02 17:35:13 +00:00
Michael Tuexen
2c9c61defa Make the features a 64-bit value instead of 32-bit.
This will allow an easier integration of the support
for NDATA.
While there, do also some minor cleanups.
Obtained from:	rrs@
MFC after: 2 weeks
2013-08-12 13:52:15 +00:00
Michael Tuexen
2416af26a0 Send the adaptation layer indication only if set by the user.
MFC after: 3 days
Discussed with: rrs
2013-02-11 21:02:49 +00:00
Michael Tuexen
3a51a2647a Add support for SCTP/UDP/IPV6.
This completes the support of
http://tools.ietf.org/html/draft-ietf-tsvwg-sctp-udp-encaps

MFC after: 1 week
2012-11-17 20:04:04 +00:00
Michael Tuexen
b1754ad17b Pass the src and dst address of a received packet explicitly around.
MFC after: 3 days
2012-06-28 16:01:08 +00:00
Michael Tuexen
807aad636f Use consistent text at the begining of the files.
MFC after: 3 days
2012-05-23 11:26:28 +00:00
Randall Stewart
c4e848b770 Make stream our stream reset implementation
compliant to RFC6525.

MFC after:	1 month
2012-03-29 13:36:53 +00:00
Michael Tuexen
7215cc1b74 Fix unused parameter warnings.
While there, fix some whitespace issues.

MFC after: 3 months.
2011-12-17 19:21:40 +00:00
Michael Tuexen
c9c5805975 Add support for the SCTP_REMOTE_UDP_ENCAPS_PORT socket option.
Retire the the now unused sctp_udp_tunneling_for_client_enable
sysctl variable.

MFC after: 3 months.
2011-11-20 15:00:45 +00:00
Michael Tuexen
58bdb69150 Fix the handling of the flowlabel and DSCP value in the SCTP_PEER_ADDR_PARAMS
socket option.
Honor the net.inet6.ip6.auto_flowlabel sysctl setting.

Approved by: re (bz)
MFC after: 1 month.
2011-09-14 08:15:21 +00:00
Michael Tuexen
ca85e9482a The result of a joint work between rrs@ and myself at the IETF:
* Decouple the path supervision using a separate HB timer per path.
* Add support for potentially failed state.
* Bring back RTO.min to 1 second.
* Accept packets on IP-addresses already announced via an ASCONF
* While there: do some cleanups.

Approved by: re@
MFC after: 2 months.
2011-08-03 20:21:00 +00:00
Randall Stewart
5d40cf5d23 1) Typo correction in comments and one spacing change.
2) Mass update to all copyrights.
MFC after:	3 Months
2011-02-05 12:12:51 +00:00
Michael Tuexen
c446091b1e Make sure that changing the ECN sysctl does not affect
exisiting associations and endpoints.

MFC after: 3 months.
2011-02-03 19:59:00 +00:00
Randall Stewart
ae26e0a472 Fix the per CPU stats so that:
1) They don't use the giant "MAX_CPU" define and instead
   are allocated dynamically based on mp_ncpus
2) Will zero with the netstat -z -s -p sctp
3) Will be properly handled by both the sctp_init and finish
   (the multi-net stuff was incorrectly bzero'ing in sctp_init
    the wrong size.. the bzero is now moved to the right places).
    And of course the free is put in at the very end.

MFC after:	3 Months
2011-02-03 11:52:22 +00:00
Randall Stewart
bfc46083b9 Adds an experimental option to create a pool of
threads. These serve as input threads and are queued
packets based on the V-tag number. This is similar to
what a modern card can do with queue's for TCP... but
alas modern cards know nothing about SCTP.

MFC after:	3 months (maybe)
2011-02-03 10:05:30 +00:00
Randall Stewart
899288ae4b 1) Allow a chunk to track the cwnd it was at when sent.
2) Add separate max-bursts for retransmit and hb. These
   are set to sysctlable values but not settable via the
   socket api. This makes sure we don't blast out HB's or
   fast-retransmits.
3) Determine on the first data transmission on a net if
   its local-lan (by being under or over a RTT). This
   can later be used to think about different algorithms
   based on locallan vs big-i (experimental)
4) The cwnd should NOT be allowed to grow when an ECNEcho
   is seen (TCP has this same bug). We fix this in SCTP
   so an ECNe being seen prevents an advance of cwnd.
5) CWR's should not be sent multiple times to the
   same network, instead just updating the TSN being
   transmitted if needed.

MFC after:	1 Month
2011-02-02 11:13:23 +00:00
Michael Tuexen
90fed1d88e Change infrastructure for SCTP_MAX_BURST to allow compliance
with the latest socket API ID. Especially it can be disabled.

Full compliance needs changing the structure used in the
socket option. Since this breaks the API, it will be a
seperate commit which will not be MFCed to stable/8.

MFC after: 3 months.
2011-01-26 19:49:03 +00:00
Michael Tuexen
f7a77f6fd3 Add stream scheduling support.
This work is based on a patch received from Robin Seggelmann.

MFC after: 3 months.
2011-01-23 19:36:28 +00:00
Michael Tuexen
20083c2eb1 Fix the switching on/off of CMT using sysctl and socket option.
Fix the switching on/off of PF and NR-SACKs using sysctl.
Add minor improvement in handling malloc failures.
Improve the address checks when sending.

MFC after: 4 weeks
2010-08-28 17:59:51 +00:00
Randall Stewart
28085b2e10 This does two changes:
1) Makes it so that the INVARIANT function validate nolocks is
   available anywhere.
2) Fixes a BUG where a close has been done on a collision socket
   and the cookie processing would return leaving a lock held.
MFC after:	1 week
2010-06-05 21:20:28 +00:00
Randall Stewart
f751743351 This adds back the Iterator to the sctp
code base. We now properly have ONE thread
that services all VNET's. Also we purge out
the old timer based iterator code which had
multiple LOR's and other issues.

MFC after:	3 days
2010-05-16 17:03:56 +00:00
Michael Tuexen
b5c164935e * Fix some race condition in SACK/NR-SACK processing.
* Fix handling of mapping arrays when draining mbufs or processing
  FORWARD-TSN chunks.
* Cleanup code (no duplicate code anymore for SACKs and NR-SACKs).
Part of this code was developed together with rrs.
MFC after: 2 weeks.
2010-04-03 15:40:14 +00:00
Randall Stewart
ff014514ee Adds the option of keeping per-cpu statistics in SCTP. This
may be useful since it gets rid of atomics but I want it to
remain an option until I can do further testing on if it really
speeds things up.
2010-03-24 20:02:40 +00:00
Randall Stewart
482444b4a5 Support for VNET in SCTP (hopefully) 2009-09-17 15:11:12 +00:00
Randall Stewart
a99b67833a - Cleanup checksum code.
- Prepare for CRC offloading, add MIB counters (RS/MT).
- Bugfix: Disable CRC computation for IPv6 addresses with local scope (MT).
- Bugfix: Handle close() with SO_LINGER correctly when notifications
          are generated during the close() call(MT).
- Bugfix: Generate DRY event when sender is dry during subscription.
          Only for 1-to-1 style sockets (RS/MT)
- Bugfix: Put vtags for the correct amount of time into time-wait (MT).
- Bugfix: Clear vtag entries correctly on expiration (MT).
- Bugfix: shutdown() indicates ENOTCONN when called for unconnected
          1-to-1 style sockets (MT).
- Bugfix: In sctp Auth code (PL).
- Add support for devices that support SCTP csum offload (igb).
- Add missing sctp_associd to mib sysctl xsctp_tcb structure (RS)
Obtained from:	With help from Peter Lei and Michael Tuexen
2009-02-03 11:04:03 +00:00
Randall Stewart
830d754d52 Code from the hack-session known as the IETF (and a
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less strigent requirements in non-uniqueness.
   o don't pre-hash them when you issue one in a cookie.
   o Allow duplicates and use addresses and ports to
     discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
  is still experimental and needs more extensive testing with the
  Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
  with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
  if the adv-peer-ack point progressed. However we still make sure
  a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
  unreliable but not abandoned yet.

With the help of:	Michael Teuxen and Peter Lei :-)
MFC after:	 4 weeks
2008-12-06 13:19:54 +00:00
Randall Stewart
a1e132720b -Improvement: Add '\n' on debug output in sctp_lower_sosend().
-Improvement: panic() on INVARIANTS kernels if memory allocation
 fails for a tagblock in sctp_add_vtag_to_timewait().
-Bugfix: Protect code in sctp_is_in_timewait() by
 SCTP_INP_INFO_WLOCK/SCTP_INP_INFO_WUNLOCK.
-Cleanup: Get rid of unused variable now in sctp_init_asoc().
-Bugfix: Reuse the correct vtag in sctp_add_vtag_to_timewait().
-Cleanup: Get rid of unused constant SCTP_TIME_WAIT_SHORT
 in sctp_constants.h.
-Improvement: Use all hash buckets of the vtag hash table.
-Cleanup: Get rid of then unused constant SCTP_STACK_VTAG_HASH_SIZE_A.
-Bugfix: Handle SHUTDOWN;SACK packet correctly.
-Bugfix: Last TSN in a gap ack block was not being "ack'd"
         in the internal scoreboard.
Obtained from:	(with help from Michael Tuexen)
2008-11-12 14:16:39 +00:00
Randall Stewart
6d9e8f2b3a Adds support for the SCTP_PORT_REUSE option
Fixes a refcount bug found in the process

Obtained from:	With the help of Michael Tuexen
2008-07-31 11:08:30 +00:00
Randall Stewart
fc14de76f4 1) Adds the rest of the VIMAGE change macros
2) Adds some __UserSpace__ on some of the common defines that
   the user space code needs
3) Fixes a bug when we send up data to a user that failed. We
   need to a) trim off the data chunk headers, if present, and
   b) make sure the frag bit is communicated properly for the
   msgs coming off the stream queues... i.e. we see if some
   of the msg has been taken.

Obtained from:	jeli contributed the VIMAGE changes on this pass Thanks Julain!
2008-07-09 16:45:30 +00:00
Randall Stewart
97a7b90ff3 More prep for Vimage:
- only one functino to destroy an SCTP stack sctp_finish()
 - Make it so this function also arranges for any threads
   created by the image to do a kthread_exit()
2008-06-15 12:31:23 +00:00
Randall Stewart
b3f1ea41fd - Macro-izes the packed declaration in all headers.
- Vimage prep - these are major restructures to move
  all global variables to be accessed via a macro or two.
  The variables all go into a single structure.
- Asconf address addition tweaks (add_or_del Interfaces)
- Fix rwnd calcualtion to be more conservative.
- Support SACK_IMMEDIATE flag to skip delayed sack
  by demand of peer.
- Comment updates in the sack mapping calculations
- Invarients panic added.
- Pre-support for UDP tunneling (we can do this on
  MAC but will need added support from UDP to
  get a "pipe" of UDP packets in.
- clear trace buffer sysctl added when local tracing on.

Note the majority of this huge patch is all the vimage prep stuff :-)
2008-06-14 07:58:05 +00:00
Randall Stewart
c54a18d26b - Adds support for the multi-asconf (From Kozuka-san)
- Adds some prepwork (Not all yet) for vimage in particular
  support the delete the sctppcbinfo.xx structs. There is
  still a leak in here if it were to be called plus we stil
  need the regrouping (From Me and Michael Tuexen)
- Adds support for UDP tunneling. For BSD there is no
  socket yet setup so its disabled, but major argument
  changes are in here to emcompass the passing of the port
  number (zero when you don't have a udp tunnel, the default
  for BSD). Will add some hooks in UDP here shortly (discussed
  with Robert) that will allow easy tunneling. (Mainly from
  Peter Lei and Michael Tuexen with some BSD work from me :-D)
- Some ease for windows, evidently leave is reserved by their
  compile move label leave: -> out:

MFC after:	1 week
2008-05-20 13:47:46 +00:00
Randall Stewart
5e2c2d872b Allow SCTP to compile without INET6.
PR:		116816
Obtained from	tuexen@fh-muenster.de:
MFC after:	2 weeks
2008-04-16 17:24:18 +00:00
Randall Stewart
fb8fb8f815 - Change the Time Wait of vtags value to match the cookie-life
- Select a tag gains ability to optionally save new tags
  off in the timewait system.
- When looking up associations do not give back a stcb that
  is in the about-to-be-freed state, and instead continue
  looking for other candiates.
- New function to query to see if value is in time-wait.
- Timewait had a time comparison error that caused very
  few vtags to actually stay in time-wait.
- When setting tags in time-wait, we now use the time
  requested NOT a fixed constant value.
- sstat now gets the proper associd when we do the query.
- When we process an association, we expect the tag chosen
  (if we have one from a cookie) to be in time-wait. Before
  we would NOT allow the assoc up by checking if its good.
  In theory this should have caused almost all assoc not
  to come up except for the time-comparison bug above (this
  bug was hidden by the time comparison bug :-D).
- Don't save tags for nonce values in the time-wait cache
  since these are used only during cookie collisions and do
  not matter if they are unique or not.
MFC after:	1 week
2007-10-30 14:09:24 +00:00
Randall Stewart
c99efcf633 - The address lock is changed to a rwlock. This
also involves macro changes to have a RLOCK and a WLOCK
  and placing the correct version within the code.
- The INP-INFO lock is changed to a rwlock.
- When sctp_shutdown() is called on Mac OS X, the socket lock is held.
  So call sctp_chunk_output with SCTP_SO_LOCKED and
  not SCTP_SO_NOT_LOCKED.
- Add SCTP_IPI_ADDR_[RW]LOCK and SCTP_IPI_ADDR_[RW]UNLOCK for Mac OS X.
- u_int64_t -> uint64_t
- add missing addr unlock for error return path
Approved by:	re@freebsd.org (K Smith)
2007-09-18 15:16:39 +00:00
Randall Stewart
851b7298b3 - send call has a reference to uio->uio_resid in
the recent send code, but uio may be NULL on sendfile
  calls. Change to use sndlen variable.
- EMSGSIZE is not being returned in non-blocking mode
  and needs a small tweak to look if the msg would
  ever fit when returning EWOULDBLOCK.
- FWD-TSN has a bug in stream processing which could
  cause a panic. This is a follow on to the codenomicon
  fix.
- PDAPI level 1 and 2 do not work unless the reader
  gets his returned buffer full. Fix so we can break
  out when at level 1 or 2.
- Fix fast-handoff features to copy across properly on
  accepted sockets
- Fix sctp_peeloff() system call when no true system call
  exists to screen arguments for errors. In cases where a
  real system call exists the system call itself does this.
- Fix raddr leak in recent add-ip code change for bundled
  asconfs (even when non-bundled asconfs are received)
- Make sure ipi_addr lock is held when walking global addr
  list. Need to change this lock type to a rwlock().
- Add don't wake flag on both input and output when the
  socket is closing.
- When deleting an address verify the interface is correct
  before allowing the delete to process. This protects panda
  and unnumbered.
- Clean up old sysctl stuff and get rid of the old Open/Net
  BSD structures.
- Add a function to watch the ranges in the sysctl sets.
- When appending in the reassembly queue, validate that
  the assoc has not gone to about to be freed. If so
  (in the middle) abort out. Note this especially effects
  MAC I think due to the lock/unlock they do (or with
  LOCK testing in place).
- Netstat patch to get rid of warnings.
- Make sure that no data gets queued to inactive/unconfirmed
  destinations. This especially effect CMT but also makes a
  impact on regular SCTP as well.
- During init collision when we detect seq number out
  of sync we need to treat it like Case C and discard
  the cookie (no invarient needed here).
- Atomic access to the random store.
- When we declare a vtag good, we need to shove it
  into the time wait hash to prevent further use. When
  the tag is put into the assoc hash, we need to remove it
  from the twait hash (where it will surely be). This prevents
  duplicate tag assignments.
- Move decr-ref count to better protect sysctl out of
  data.
- ltrace error corrections in sctp6_usrreq.c
- Add hook for interface up/down to be sent to us.
- Make sysctl() exported structures independent of processor
  architecture.
- Fix route and src addr cache clearing for delete address case.
- Make sure address marked SCTP_DEL_IP_ADDRESS is never selected
  as src addr.
- in icmp handling fixed so we actually look at the icmp codes
  to figure out what to do.
- Modified mobility code.
  Reception of DELETE IP ADDRESS for a primary destination and
  SET PRIMARY for a new primary destination is used for
  retransmission trigger to the new primary destination.
  Also, in this case, destination of chunks in send_queue are
  changed to the new primary destination.
- Fix so that we disallow sending by mbuf to ever have EEOR
  mode set upon it.

Approved by:	re@freebsd.org (B Mah)
2007-09-08 17:48:46 +00:00
Randall Stewart
2afb3e849f - During shutdown pending, when the last sack came in and
the last message on the send stream was "null" but still
  there, a state we allow, we could get hung and not clean
  it up and wait for the shutdown guard timer to clear the
  association without a graceful close. Fix this so that
  that we properly clean up.
- Added support for Multiple ASCONF per new RFC. We only
  (so far) accept input of these and cannot yet generate
  a multi-asconf.
- Sysctl'd support for experimental Fast Handover feature. Always
  disabled unless sysctl or socket option changes to enable.
- Error case in add-ip where the peer supports AUTH and ADD-IP
  but does NOT require AUTH of ASCONF/ASCONF-ACK. We need to
  ABORT in this case.
- According to the Kyoto summit of socket api developers
  (Solaris, Linux, BSD). We need to have:
   o non-eeor mode messages be atomic - Fixed
   o Allow implicit setup of an assoc in 1-2-1 model if
     using the sctp_**() send calls - Fixed
   o Get rid of HAVE_XXX declarations - Done
   o add a sctp_pr_policy in hole in sndrcvinfo structure - Done
   o add a PR_SCTP_POLICY_VALID type flag - yet to-do in a future patch!
- Optimize sctp6 calls to reuse code in sctp_usrreq. Also optimize
  when we close sending out the data and disabling Nagle.
- Change key concatenation order to match the auth RFC
- When sending OOTB shutdown_complete always do csum.
- Don't send PKT-DROP to a PKT-DROP
- For abort chunks just always checksums same for
  shutdown-complete.
- inpcb_free front state had a bug where in queue
  data could wedge an assoc. We need to just abandon
  ones in front states (free_assoc).
- If a peer sends us a 64k abort, we would try to
  assemble a response packet which may be larger than
  64k. This then would be dropped by IP. Instead make
  a "minimum" size for us 64k-2k (we want at least
  2k for our initack). If we receive such an init
  discard it early without all the processing.
- When we peel off we must increment the tcb ref count
  to keep it from being freed from underneath us.
- handling fwd-tsn had bugs that caused memory overwrites
  when given faulty data, fixed so can't happen and we
  also stop at the first bad stream no.
- Fixed so comm-up generates the adaption indication.
- peeloff did not get the hmac params copied.
- fix it so we lock the addr list when doing src-addr selection
  (in future we need to use a multi-reader/one writer lock here)
- During lowlevel output, we could end up with a _l_addr set
  to null if the iterator is calling the output routine. This
  means we would possibly crash when we gather the MTU info.
  Fix so we only do the gather where we have a src address
  cached.
- we need to be sure to set abort flag on conn state when
  we receive an abort.
- peeloff could leak a socket. Moved code so the close will
  find the socket if the peeloff fails (uipc_syscalls.c)

Approved by:	re@freebsd.org(Ken Smith)
2007-08-27 05:19:48 +00:00
Randall Stewart
c4739e2f47 - Fix address add handling to clear cached routes and source addresses
when peer acks the add in case the routing table changes.
- Fix sctp_lower_sosend to send shutdown chunk for mbuf send
  case when sndlen = 0 and sinfoflag = SCTP_EOF
- Fix sctp_lower_sosend for SCTP_ABORT mbuf send case with null data,
  So that it does not send the "null" data mbuf out and cause
  it to get freed twice.
- Fix so auto-asconf sysctl actually effect the socket's asconf state.
- Do not allow SCTP_AUTO_ASCONF option to be used on subset bound sockets.
- Memset bug in sctp_output.c (arguments were reversed) submitted
  found and reported by Dave Jones (davej@codemonkey.org.uk).
- PD-API point needs to be invoked >= not just > to conform to socket api
  draft this fixes sctp_indata.c in the two places need to be >=.
- move M_NOTIFICATION to use M_PROTO5.
- PEER_ADDR_PARAMS did not fail properly if you specify an address
  that is not in the association with a valid assoc_id. This meant
  you got or set the stcb level values instead of the destination
  you thought you were going to get/set. Now validate if the
  stcb is non-null and the net is NULL that the sa_family is
  set and the address is unspecified otherwise return an error.
- The thread based iterator could crash if associations were freed
  at the exact time it was running. rework the worker thread to
  use the increment/decrement to prevent this and no longer use
  the markers that the timer based iterator uses.
- Fix the memleak in sctp_add_addr_to_vrf() for the case when it is
  detected that ifa is already pointing to a ifn.
- Fix it so that if someone is so insane that they drop the
  send window below the minimal add mark, they still can send.
- Changed all state for associations to use mask safe macro.
- During front states in association freeing in sctp_inpcbfree, we
  had a locking problem where locks were not in place where they
  should have been.
- Free association calls were not testing the return value in
  sctp_inpcb_free() properly... others should be cast  void returns
  where we don't care about the return value.
- If a reference count is held on an assoc, even from the "force free"
  we should not do the actual free.. but instead let the timer
  free it.
- When we enter sctp_input(), if the SCTP_ASOC_ABOUT_TO_BE_FREED
  flag is set, we must NOT process the packet but handle it like
  ootb. This is because while freeing an assoc we release the
  locks to get all the higher order locks so we can purge all
  the hash tables. This leaves a hole if a packet comes in
  just at that point. Now sctp_common_input_processing() will
  call the ootb code in such a case.
- Change MBUF M_NOTIFICATION to use M_PROTO5 (per Sam L). This makes
  it so we don't have a conflict (I think this is a covertity change).
  We made this change AFTER some conversation and looking to make sure
  that M_PROTO5 does not have a problem between SCTP and the 802.11
  stuff (which is the only other place its used).
- Fixed lock order reversal and missing atomic protection around
  locked_tcb during association lookup and the 1-2-1 model.
- Added debug to source address selection.
- V6 output must always do checksum even for loopback.
- Remove more locks around inp that are not needed for an atomically
  added/subtracted ref count.
- slight optimization in the way we zero the array in sctp_sack_check()
- It was possible to respond to a ABORT() with bad checksum with
  a PKT-DROP. This lead to a PKT-DROP/ABORT war. Add code to NOT
  send a PKT-DROP to any ABORT().
- Add an option for local logging (useful for macintosh or when
  you need better performing during debugging). Note no commands
  are here to get the log info, you must just use kgdb.
- The timer code needs to be aware of if it needs to call
  sctp_sack_check() to slide the maps and adjust the cum-ack.
  This is because it may be out of sync cum-ack wise.
- Added threshold managment logging.
- If the user picked just the right size, that just filled the send
  window minus one mtu, we would enter a forever loop not copying and
  at the same time not blocking. Change from < to <= solves this.
- Sysctl added to control the fragment interleave level which defaults
  to 1.
- My rwnd control was not being used to control the rwnd properly (we
  did not add and subtract to it :-() this is now fixed so we handle
  small messages (1 byte etc) better to bring our rwnd down more
  slowly.

Approved by:	re@freebsd.org (Bruce Mah)
2007-08-24 00:53:53 +00:00
Randall Stewart
1b649582bb - take out a needless panic under invariants for sctp_output.c
- Fix addrs's error checking of sctp_sendx(3) when addrcnt is less than
   SCTP_SMALL_IOVEC_SIZE
 - re-add back inpcb_bind local address check bypass capability
 - Fix it so sctp_opt_info is independant of assoc_id postion.
 - Fix cookie life set to use MSEC_TO_TICKS() macro.
 - asconf changes
   o More comment changes/clarifications related to the old local address
    "not" list which is now an explicit restricted list.

   o Rename some functions for clarity:
     - sctp_add/del_local_addr_assoc to xxx_local_addr_restricted()
     - asconf related iterator functions to sctp_asconf_iterator_xxx()

   o Fix bug when the same address is deleted and added (and removed from
     the asconf queue) where the ifa is "freed" twice refcount wise,
     possibly freeing it completely.

   o Fix bug in output where the first ASCONF would not go out after the
     last address is changed (e.g. only goes out when retransmitted).

   o Fix bug where multiple ASCONFs can be bundled in the same packet with
     the and with the same serial numbers.

   o Fix asconf stcb iterator to not send ASCONF until after all work
     queue entries have been processed.

   o Change behavior so that when the last address is deleted (auto asconf
     on a bound all endpoint) no action is taken until an address is
     added; at that time, an ASCONF add+delete is sent (if the assoc
     is still up).

   o Fix local address counting so that address scoping is taken into
     account.

   o #ifdef SCTP_TIMER_BASED_ASCONF the old timer triggered sending
     of ASCONF (after an RTO).  The default now is to send
     ASCONF immediately (except for the case of changing/deleting the
     last usable address).
Approved by:	re(ken smith)@freebsd.org
2007-07-24 20:06:02 +00:00
Randall Stewart
52be287ebb - remove duplicate code from sctp_asconf.c
- remove duplicate #include <sys/priv.h> that is not under
   #ifdef FreeBSD version to allow compile on 6.1
- static analysis changes per the cisco SA tool including:
    o some SA_IGNORE comments
    o some checks for NULL before unlock.
    o type corrections int -> size_t
- Fix it so sctp_alloc_asoc takes a thread/proc argument. Without this
   we pass a NULL in to bind on implicit assoc setup and crash  :-(
Approved by:	re@freebsd.org(Ken Smith)
2007-07-21 21:41:32 +00:00
Randall Stewart
18e198d3a3 - added pre-checks to the bindx call.
- use proper tick gathering macro instead of ticks directly.
- Placed reasonable boundaries on sets that a user can do
  that are converted to ticks from ms.
- Fix CMT_PF to always check to be sure CMT is on.
- Fix ticks use of CMT_PF.
- put back code to allow asconfs to be queued while INITs are in flight
  and before the assoc is established.
- During window probes, an ack'd packet might be left with the window
  probe mark on it causing it to be retransmitted. Change so that
  the flight decrease macro clears the window_probe mark.
- Additional logging flight size/reading and ASOC LOG. This
  is only enabled if you manually insert things into opt_sctp.h
  since its a set of debug code only.
- Found an interesting SMP race in the way data was appended which
  could cause a reader to lose a part of a message, had to
  reorder when we marked the message was complete to after
  the data was appended.
- bug in ADD-IP for the subset bound socket case when the peer has only
  one address
- fix ASCONF implicit success/error handling case
- proper support of jails in Freebsd 6>
- copy out the timeval for the 64 bit sparc world on cookie-echo
  alignment error crashes without this).
Approved by:	re(Ken Smith)
2007-07-17 20:58:26 +00:00