Commit Graph

69 Commits

Author SHA1 Message Date
Randall Stewart
482444b4a5 Support for VNET in SCTP (hopefully) 2009-09-17 15:11:12 +00:00
Michael Tuexen
d830c305ea Fix a bug reported by Daniel Mentz:
When authenticating DATA chunks some DATA chunks
might get stuck when the MTU gets decreased via
an ICMP message.

Approved by: rrs (mentor)
MFC after: immediately
2009-09-16 14:23:31 +00:00
Randall Stewart
8933fa13b6 Many bug fixes (from the IETF hack-fest):
- PR-SCTP had major issues when skipping through a multi-part message.
  o Did not look at socket buffer.
  o Did not properly handle the reassmebly queue.
  o The MARKED segments could interfere and un-skip a chunk causing
    a problem with the proper FWD-TSN.
  o No FR of FWD-TSN's was being done.
- NR-Sack code was basically disabled. It needed fixes that
  never got into the real code.
- CMT code had issues when the two paths were NOT the same b/w. We
  found a few small bugs, but also the critcal one here was not
  dividing the rwnd amongst the paths.

Obtained from:	Michael Tuexen and myself at the IETF hack-fest ;-)
2009-04-04 11:43:32 +00:00
Randall Stewart
8aae94933f Fix the add stream feature of strm-reset to really work:
- Fix the copy, we can't do a blind copy but must transfer
   the data from the old to the new.
 - Fix the ACK processing so we properly stop retransmitting
   the thing.
 - Fix it so if we get a retran we will properly reply with
   the saved response without doing anything.

MFC after:	1 month
2009-02-27 20:54:45 +00:00
Randall Stewart
ea44232b3a Add the add-stream capability. Still needs more
testing..

MFC after:	1 month
2009-02-20 15:03:54 +00:00
Randall Stewart
c3b8c73cf1 Have the jail code use the error returned to pass not constant
errors.
Obtained from:	jamie@freebsd.org
2009-02-13 18:44:30 +00:00
Randall Stewart
a99b67833a - Cleanup checksum code.
- Prepare for CRC offloading, add MIB counters (RS/MT).
- Bugfix: Disable CRC computation for IPv6 addresses with local scope (MT).
- Bugfix: Handle close() with SO_LINGER correctly when notifications
          are generated during the close() call(MT).
- Bugfix: Generate DRY event when sender is dry during subscription.
          Only for 1-to-1 style sockets (RS/MT)
- Bugfix: Put vtags for the correct amount of time into time-wait (MT).
- Bugfix: Clear vtag entries correctly on expiration (MT).
- Bugfix: shutdown() indicates ENOTCONN when called for unconnected
          1-to-1 style sockets (MT).
- Bugfix: In sctp Auth code (PL).
- Add support for devices that support SCTP csum offload (igb).
- Add missing sctp_associd to mib sysctl xsctp_tcb structure (RS)
Obtained from:	With help from Peter Lei and Michael Tuexen
2009-02-03 11:04:03 +00:00
Randall Stewart
830d754d52 Code from the hack-session known as the IETF (and a
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less strigent requirements in non-uniqueness.
   o don't pre-hash them when you issue one in a cookie.
   o Allow duplicates and use addresses and ports to
     discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
  is still experimental and needs more extensive testing with the
  Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
  with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
  if the adv-peer-ack point progressed. However we still make sure
  a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
  unreliable but not abandoned yet.

With the help of:	Michael Teuxen and Peter Lei :-)
MFC after:	 4 weeks
2008-12-06 13:19:54 +00:00
Bjoern A. Zeeb
413628a7e3 MFp4:
Bring in updated jail support from bz_jail branch.

This enhances the current jail implementation to permit multiple
addresses per jail. In addtion to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..

SCTP support was updated and supports IPv6 in jails as well.

Cpuset support permits jails to be bound to specific processor
sets after creation.

Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.

DDB 'show jails' command was added to aid debugging.

Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.

Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.

Bump __FreeBSD_version for the afore mentioned and in kernel changes.

Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
  and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
  help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
  suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
  on cluster machines as well as all the testers and people
  who provided feedback the last months on freebsd-jail and
  other channels.
- My employer, CK Software GmbH, for the support so I could work on this.

Reviewed by:	(see above)
MFC after:	3 months (this is just so that I get the mail)
X-MFC Before:   7.2-RELEASE if possible
2008-11-29 14:32:14 +00:00
Randall Stewart
ac29704161 New sockets (accepted) were not inheriting the proper snd/rcv buffer value.
Obtained from:	 Michael Tuexen
2008-10-18 15:56:12 +00:00
Randall Stewart
6d9e8f2b3a Adds support for the SCTP_PORT_REUSE option
Fixes a refcount bug found in the process

Obtained from:	With the help of Michael Tuexen
2008-07-31 11:08:30 +00:00
Randall Stewart
fc14de76f4 1) Adds the rest of the VIMAGE change macros
2) Adds some __UserSpace__ on some of the common defines that
   the user space code needs
3) Fixes a bug when we send up data to a user that failed. We
   need to a) trim off the data chunk headers, if present, and
   b) make sure the frag bit is communicated properly for the
   msgs coming off the stream queues... i.e. we see if some
   of the msg has been taken.

Obtained from:	jeli contributed the VIMAGE changes on this pass Thanks Julain!
2008-07-09 16:45:30 +00:00
Randall Stewart
97a7b90ff3 More prep for Vimage:
- only one functino to destroy an SCTP stack sctp_finish()
 - Make it so this function also arranges for any threads
   created by the image to do a kthread_exit()
2008-06-15 12:31:23 +00:00
Randall Stewart
b3f1ea41fd - Macro-izes the packed declaration in all headers.
- Vimage prep - these are major restructures to move
  all global variables to be accessed via a macro or two.
  The variables all go into a single structure.
- Asconf address addition tweaks (add_or_del Interfaces)
- Fix rwnd calcualtion to be more conservative.
- Support SACK_IMMEDIATE flag to skip delayed sack
  by demand of peer.
- Comment updates in the sack mapping calculations
- Invarients panic added.
- Pre-support for UDP tunneling (we can do this on
  MAC but will need added support from UDP to
  get a "pipe" of UDP packets in.
- clear trace buffer sysctl added when local tracing on.

Note the majority of this huge patch is all the vimage prep stuff :-)
2008-06-14 07:58:05 +00:00
Randall Stewart
c54a18d26b - Adds support for the multi-asconf (From Kozuka-san)
- Adds some prepwork (Not all yet) for vimage in particular
  support the delete the sctppcbinfo.xx structs. There is
  still a leak in here if it were to be called plus we stil
  need the regrouping (From Me and Michael Tuexen)
- Adds support for UDP tunneling. For BSD there is no
  socket yet setup so its disabled, but major argument
  changes are in here to emcompass the passing of the port
  number (zero when you don't have a udp tunnel, the default
  for BSD). Will add some hooks in UDP here shortly (discussed
  with Robert) that will allow easy tunneling. (Mainly from
  Peter Lei and Michael Tuexen with some BSD work from me :-D)
- Some ease for windows, evidently leave is reserved by their
  compile move label leave: -> out:

MFC after:	1 week
2008-05-20 13:47:46 +00:00
Randall Stewart
5e2c2d872b Allow SCTP to compile without INET6.
PR:		116816
Obtained from	tuexen@fh-muenster.de:
MFC after:	2 weeks
2008-04-16 17:24:18 +00:00
Randall Stewart
eadccaccf0 Use the pru_flush infrastructure to avoid a panic
PR:		122710
MFC after:	1 week
2008-04-14 18:13:33 +00:00
Randall Stewart
41eee5558c - More fixes for lock misses on the transfer of data to
the sent_queue. Sometimes I wonder why any code
  ever works :-)
- Fix the pad of the last mbuf routine, It was working improperly
  on non-4 byte aligned chunks which could cause memory overruns.

MFC after:	1 week
2007-12-07 01:32:14 +00:00
Randall Stewart
2aedc03dad - More fixes for the non-blocking msg send, had the skip of the pre-block
test incorrect.
- Fix the initial buf calculation to be more friendly, calc is the same
  but we use different variable to make it easier amongst the different
  code versions.

MFC after:	1 week
2007-12-04 20:20:42 +00:00
Randall Stewart
fb8fb8f815 - Change the Time Wait of vtags value to match the cookie-life
- Select a tag gains ability to optionally save new tags
  off in the timewait system.
- When looking up associations do not give back a stcb that
  is in the about-to-be-freed state, and instead continue
  looking for other candiates.
- New function to query to see if value is in time-wait.
- Timewait had a time comparison error that caused very
  few vtags to actually stay in time-wait.
- When setting tags in time-wait, we now use the time
  requested NOT a fixed constant value.
- sstat now gets the proper associd when we do the query.
- When we process an association, we expect the tag chosen
  (if we have one from a cookie) to be in time-wait. Before
  we would NOT allow the assoc up by checking if its good.
  In theory this should have caused almost all assoc not
  to come up except for the time-comparison bug above (this
  bug was hidden by the time comparison bug :-D).
- Don't save tags for nonce values in the time-wait cache
  since these are used only during cookie collisions and do
  not matter if they are unique or not.
MFC after:	1 week
2007-10-30 14:09:24 +00:00
Randall Stewart
b201f5360c - fix sctp_ifn initial refcount issue (prevents deletion)
- fix a bug during cookie collision that prevented an
  association from coming up in a specific restart case.
- Fix it so the shutdown-pending flag gets removed (this is
  more for correctness then needed) when we enter shutdown-sent
  or shutdown-ack-sent states.
- Fix a bug that caused the receiver to sometimes NOT send
  a SACK when a duplicate TSN arrived. Without this fix
  it was possible for the association to fall down if the
- Deleted primary destination is also stored when SCTP_MOBILITY_BASE.
  (Previously, it is stored when only SCTP_MOBILITY_FASTHANDOFF)
- Fix a locking issue where we might call send_initiate_ack() and
  incorrectly state the lock held/not held. Also fix it so that
  when we release the lock the inp cannot be deleted on us.
- Add the debug option that can cause the stack to panic instead
  of aborting an assoc. This does not and should never show up
  in options but is useful for debugging unexpected aborts.
- Add cumack_log sent to track sending cumack information for
  the debug case where we are running a special log per assoc.
- Added extra () aroudn sctp_sbspace macro to avoid compile warnings.
MFC after:	1 week
2007-10-16 14:05:51 +00:00
Randall Stewart
d55b0b1b09 - Bug fix managing congestion parameter on immediate
retransmittion by handover event (fast mobility code)
- Fixed problem of mobility code which is caused by remaining
  parameters in the deleted primary destination.
- Add a missing lock. When a peer sends an INIT, and while we
  are processing it to send an INIT-ACK the socket is closed,
  we did not hold a lock to keep the socket from going away.
  Add protection for this case.
- Fix so that arwnd is alway uses the minimal rwnd if the user
  has set the socket buffer smaller. Found this when the test
  org decided to see what happens when you set in a rwnd of 10
  bytes (which is not allowed per RFC .. 4k is minimum).
- Fixes so a cookie-echo ootb will NOT cause an abort to
  be sent. This was happening in a MPI collision case.
- Examined all panics and unless there was no recovery, moved
  any that were not already to INVARANTS.

Approved by:	re@freebsd.org (gnn)
2007-10-01 03:22:29 +00:00
Randall Stewart
c99efcf633 - The address lock is changed to a rwlock. This
also involves macro changes to have a RLOCK and a WLOCK
  and placing the correct version within the code.
- The INP-INFO lock is changed to a rwlock.
- When sctp_shutdown() is called on Mac OS X, the socket lock is held.
  So call sctp_chunk_output with SCTP_SO_LOCKED and
  not SCTP_SO_NOT_LOCKED.
- Add SCTP_IPI_ADDR_[RW]LOCK and SCTP_IPI_ADDR_[RW]UNLOCK for Mac OS X.
- u_int64_t -> uint64_t
- add missing addr unlock for error return path
Approved by:	re@freebsd.org (K Smith)
2007-09-18 15:16:39 +00:00
Randall Stewart
851b7298b3 - send call has a reference to uio->uio_resid in
the recent send code, but uio may be NULL on sendfile
  calls. Change to use sndlen variable.
- EMSGSIZE is not being returned in non-blocking mode
  and needs a small tweak to look if the msg would
  ever fit when returning EWOULDBLOCK.
- FWD-TSN has a bug in stream processing which could
  cause a panic. This is a follow on to the codenomicon
  fix.
- PDAPI level 1 and 2 do not work unless the reader
  gets his returned buffer full. Fix so we can break
  out when at level 1 or 2.
- Fix fast-handoff features to copy across properly on
  accepted sockets
- Fix sctp_peeloff() system call when no true system call
  exists to screen arguments for errors. In cases where a
  real system call exists the system call itself does this.
- Fix raddr leak in recent add-ip code change for bundled
  asconfs (even when non-bundled asconfs are received)
- Make sure ipi_addr lock is held when walking global addr
  list. Need to change this lock type to a rwlock().
- Add don't wake flag on both input and output when the
  socket is closing.
- When deleting an address verify the interface is correct
  before allowing the delete to process. This protects panda
  and unnumbered.
- Clean up old sysctl stuff and get rid of the old Open/Net
  BSD structures.
- Add a function to watch the ranges in the sysctl sets.
- When appending in the reassembly queue, validate that
  the assoc has not gone to about to be freed. If so
  (in the middle) abort out. Note this especially effects
  MAC I think due to the lock/unlock they do (or with
  LOCK testing in place).
- Netstat patch to get rid of warnings.
- Make sure that no data gets queued to inactive/unconfirmed
  destinations. This especially effect CMT but also makes a
  impact on regular SCTP as well.
- During init collision when we detect seq number out
  of sync we need to treat it like Case C and discard
  the cookie (no invarient needed here).
- Atomic access to the random store.
- When we declare a vtag good, we need to shove it
  into the time wait hash to prevent further use. When
  the tag is put into the assoc hash, we need to remove it
  from the twait hash (where it will surely be). This prevents
  duplicate tag assignments.
- Move decr-ref count to better protect sysctl out of
  data.
- ltrace error corrections in sctp6_usrreq.c
- Add hook for interface up/down to be sent to us.
- Make sysctl() exported structures independent of processor
  architecture.
- Fix route and src addr cache clearing for delete address case.
- Make sure address marked SCTP_DEL_IP_ADDRESS is never selected
  as src addr.
- in icmp handling fixed so we actually look at the icmp codes
  to figure out what to do.
- Modified mobility code.
  Reception of DELETE IP ADDRESS for a primary destination and
  SET PRIMARY for a new primary destination is used for
  retransmission trigger to the new primary destination.
  Also, in this case, destination of chunks in send_queue are
  changed to the new primary destination.
- Fix so that we disallow sending by mbuf to ever have EEOR
  mode set upon it.

Approved by:	re@freebsd.org (B Mah)
2007-09-08 17:48:46 +00:00
Randall Stewart
ceaad40ae7 - Locking compatiability changes. This involves adding
additional flags to many function calls. The flags only
  get used in BSD when we compile with lock testing. These
  flags allow apple to escape the "giant" lock it holds on
  the socket and have more fine-grained locking in the NKE.
  It also allows us to test (with witness) the locking used
  by apple via a compile switch (manually applied).

Approved by:	re@freebsd.org(B Mah)
2007-09-08 11:35:11 +00:00
Randall Stewart
2afb3e849f - During shutdown pending, when the last sack came in and
the last message on the send stream was "null" but still
  there, a state we allow, we could get hung and not clean
  it up and wait for the shutdown guard timer to clear the
  association without a graceful close. Fix this so that
  that we properly clean up.
- Added support for Multiple ASCONF per new RFC. We only
  (so far) accept input of these and cannot yet generate
  a multi-asconf.
- Sysctl'd support for experimental Fast Handover feature. Always
  disabled unless sysctl or socket option changes to enable.
- Error case in add-ip where the peer supports AUTH and ADD-IP
  but does NOT require AUTH of ASCONF/ASCONF-ACK. We need to
  ABORT in this case.
- According to the Kyoto summit of socket api developers
  (Solaris, Linux, BSD). We need to have:
   o non-eeor mode messages be atomic - Fixed
   o Allow implicit setup of an assoc in 1-2-1 model if
     using the sctp_**() send calls - Fixed
   o Get rid of HAVE_XXX declarations - Done
   o add a sctp_pr_policy in hole in sndrcvinfo structure - Done
   o add a PR_SCTP_POLICY_VALID type flag - yet to-do in a future patch!
- Optimize sctp6 calls to reuse code in sctp_usrreq. Also optimize
  when we close sending out the data and disabling Nagle.
- Change key concatenation order to match the auth RFC
- When sending OOTB shutdown_complete always do csum.
- Don't send PKT-DROP to a PKT-DROP
- For abort chunks just always checksums same for
  shutdown-complete.
- inpcb_free front state had a bug where in queue
  data could wedge an assoc. We need to just abandon
  ones in front states (free_assoc).
- If a peer sends us a 64k abort, we would try to
  assemble a response packet which may be larger than
  64k. This then would be dropped by IP. Instead make
  a "minimum" size for us 64k-2k (we want at least
  2k for our initack). If we receive such an init
  discard it early without all the processing.
- When we peel off we must increment the tcb ref count
  to keep it from being freed from underneath us.
- handling fwd-tsn had bugs that caused memory overwrites
  when given faulty data, fixed so can't happen and we
  also stop at the first bad stream no.
- Fixed so comm-up generates the adaption indication.
- peeloff did not get the hmac params copied.
- fix it so we lock the addr list when doing src-addr selection
  (in future we need to use a multi-reader/one writer lock here)
- During lowlevel output, we could end up with a _l_addr set
  to null if the iterator is calling the output routine. This
  means we would possibly crash when we gather the MTU info.
  Fix so we only do the gather where we have a src address
  cached.
- we need to be sure to set abort flag on conn state when
  we receive an abort.
- peeloff could leak a socket. Moved code so the close will
  find the socket if the peeloff fails (uipc_syscalls.c)

Approved by:	re@freebsd.org(Ken Smith)
2007-08-27 05:19:48 +00:00
Randall Stewart
c4739e2f47 - Fix address add handling to clear cached routes and source addresses
when peer acks the add in case the routing table changes.
- Fix sctp_lower_sosend to send shutdown chunk for mbuf send
  case when sndlen = 0 and sinfoflag = SCTP_EOF
- Fix sctp_lower_sosend for SCTP_ABORT mbuf send case with null data,
  So that it does not send the "null" data mbuf out and cause
  it to get freed twice.
- Fix so auto-asconf sysctl actually effect the socket's asconf state.
- Do not allow SCTP_AUTO_ASCONF option to be used on subset bound sockets.
- Memset bug in sctp_output.c (arguments were reversed) submitted
  found and reported by Dave Jones (davej@codemonkey.org.uk).
- PD-API point needs to be invoked >= not just > to conform to socket api
  draft this fixes sctp_indata.c in the two places need to be >=.
- move M_NOTIFICATION to use M_PROTO5.
- PEER_ADDR_PARAMS did not fail properly if you specify an address
  that is not in the association with a valid assoc_id. This meant
  you got or set the stcb level values instead of the destination
  you thought you were going to get/set. Now validate if the
  stcb is non-null and the net is NULL that the sa_family is
  set and the address is unspecified otherwise return an error.
- The thread based iterator could crash if associations were freed
  at the exact time it was running. rework the worker thread to
  use the increment/decrement to prevent this and no longer use
  the markers that the timer based iterator uses.
- Fix the memleak in sctp_add_addr_to_vrf() for the case when it is
  detected that ifa is already pointing to a ifn.
- Fix it so that if someone is so insane that they drop the
  send window below the minimal add mark, they still can send.
- Changed all state for associations to use mask safe macro.
- During front states in association freeing in sctp_inpcbfree, we
  had a locking problem where locks were not in place where they
  should have been.
- Free association calls were not testing the return value in
  sctp_inpcb_free() properly... others should be cast  void returns
  where we don't care about the return value.
- If a reference count is held on an assoc, even from the "force free"
  we should not do the actual free.. but instead let the timer
  free it.
- When we enter sctp_input(), if the SCTP_ASOC_ABOUT_TO_BE_FREED
  flag is set, we must NOT process the packet but handle it like
  ootb. This is because while freeing an assoc we release the
  locks to get all the higher order locks so we can purge all
  the hash tables. This leaves a hole if a packet comes in
  just at that point. Now sctp_common_input_processing() will
  call the ootb code in such a case.
- Change MBUF M_NOTIFICATION to use M_PROTO5 (per Sam L). This makes
  it so we don't have a conflict (I think this is a covertity change).
  We made this change AFTER some conversation and looking to make sure
  that M_PROTO5 does not have a problem between SCTP and the 802.11
  stuff (which is the only other place its used).
- Fixed lock order reversal and missing atomic protection around
  locked_tcb during association lookup and the 1-2-1 model.
- Added debug to source address selection.
- V6 output must always do checksum even for loopback.
- Remove more locks around inp that are not needed for an atomically
  added/subtracted ref count.
- slight optimization in the way we zero the array in sctp_sack_check()
- It was possible to respond to a ABORT() with bad checksum with
  a PKT-DROP. This lead to a PKT-DROP/ABORT war. Add code to NOT
  send a PKT-DROP to any ABORT().
- Add an option for local logging (useful for macintosh or when
  you need better performing during debugging). Note no commands
  are here to get the log info, you must just use kgdb.
- The timer code needs to be aware of if it needs to call
  sctp_sack_check() to slide the maps and adjust the cum-ack.
  This is because it may be out of sync cum-ack wise.
- Added threshold managment logging.
- If the user picked just the right size, that just filled the send
  window minus one mtu, we would enter a forever loop not copying and
  at the same time not blocking. Change from < to <= solves this.
- Sysctl added to control the fragment interleave level which defaults
  to 1.
- My rwnd control was not being used to control the rwnd properly (we
  did not add and subtract to it :-() this is now fixed so we handle
  small messages (1 byte etc) better to bring our rwnd down more
  slowly.

Approved by:	re@freebsd.org (Bruce Mah)
2007-08-24 00:53:53 +00:00
Randall Stewart
2dad8a55be - Remove extra comment for 7.0 (no GIANT here).
- Remove unneeded WLOCK/UNLOCK of inp for getting TCB lock.
- Fix panic that may occur when freeing an assoc that has partial
  delivery in progress (may dereference null socket pointer when
  queuing partial delivery aborted notification)
- Some spacing and comment fixes.
- Fix address add handling to clear cached routes and source addresses
  when peer acks the add in case the routing table changes.
Approved by:	re@freebsd.org (Bruce Mah)
2007-08-16 01:51:22 +00:00
Randall Stewart
63981c2b40 - change number assignments for SHA225-512 (match artisync
for bakeoff.. using the next sequential ones)
- In cookie processing 1-2-1, we did not increment the stcb
  refcnt before releasing the tcb lock. We need to do this
  to keep the tcb from being freed by a abort or ?? unlikely
  but worth doing. Also get rid of unneed INP_WLOCK.
- extra receive info included the rcvinfo which killed the
  padding/alignment. We now redefine all the fields properly
  so they both align properly both to 128 bytes.
- A peeled off socket would not close without an error due to
  its misguided idea that sctp_disconnect() was not supported
  on it. This fixes it so it goes through the proper path.
- When an assoc was being deleted after abort (via a timer) a
  small race condition exists where we might take a packet for
  the old assoc (since we are waiting for a cleanup timer). This
  state especially happens in mac. We now add a state in the asoc
  so these can properly handle the packet as OOTB.
Approved by:	re@freebsd.org(Ken Smith)
2007-08-06 15:46:46 +00:00
Randall Stewart
1b649582bb - take out a needless panic under invariants for sctp_output.c
- Fix addrs's error checking of sctp_sendx(3) when addrcnt is less than
   SCTP_SMALL_IOVEC_SIZE
 - re-add back inpcb_bind local address check bypass capability
 - Fix it so sctp_opt_info is independant of assoc_id postion.
 - Fix cookie life set to use MSEC_TO_TICKS() macro.
 - asconf changes
   o More comment changes/clarifications related to the old local address
    "not" list which is now an explicit restricted list.

   o Rename some functions for clarity:
     - sctp_add/del_local_addr_assoc to xxx_local_addr_restricted()
     - asconf related iterator functions to sctp_asconf_iterator_xxx()

   o Fix bug when the same address is deleted and added (and removed from
     the asconf queue) where the ifa is "freed" twice refcount wise,
     possibly freeing it completely.

   o Fix bug in output where the first ASCONF would not go out after the
     last address is changed (e.g. only goes out when retransmitted).

   o Fix bug where multiple ASCONFs can be bundled in the same packet with
     the and with the same serial numbers.

   o Fix asconf stcb iterator to not send ASCONF until after all work
     queue entries have been processed.

   o Change behavior so that when the last address is deleted (auto asconf
     on a bound all endpoint) no action is taken until an address is
     added; at that time, an ASCONF add+delete is sent (if the assoc
     is still up).

   o Fix local address counting so that address scoping is taken into
     account.

   o #ifdef SCTP_TIMER_BASED_ASCONF the old timer triggered sending
     of ASCONF (after an RTO).  The default now is to send
     ASCONF immediately (except for the case of changing/deleting the
     last usable address).
Approved by:	re(ken smith)@freebsd.org
2007-07-24 20:06:02 +00:00
Randall Stewart
52be287ebb - remove duplicate code from sctp_asconf.c
- remove duplicate #include <sys/priv.h> that is not under
   #ifdef FreeBSD version to allow compile on 6.1
- static analysis changes per the cisco SA tool including:
    o some SA_IGNORE comments
    o some checks for NULL before unlock.
    o type corrections int -> size_t
- Fix it so sctp_alloc_asoc takes a thread/proc argument. Without this
   we pass a NULL in to bind on implicit assoc setup and crash  :-(
Approved by:	re@freebsd.org(Ken Smith)
2007-07-21 21:41:32 +00:00
Randall Stewart
18e198d3a3 - added pre-checks to the bindx call.
- use proper tick gathering macro instead of ticks directly.
- Placed reasonable boundaries on sets that a user can do
  that are converted to ticks from ms.
- Fix CMT_PF to always check to be sure CMT is on.
- Fix ticks use of CMT_PF.
- put back code to allow asconfs to be queued while INITs are in flight
  and before the assoc is established.
- During window probes, an ack'd packet might be left with the window
  probe mark on it causing it to be retransmitted. Change so that
  the flight decrease macro clears the window_probe mark.
- Additional logging flight size/reading and ASOC LOG. This
  is only enabled if you manually insert things into opt_sctp.h
  since its a set of debug code only.
- Found an interesting SMP race in the way data was appended which
  could cause a reader to lose a part of a message, had to
  reorder when we marked the message was complete to after
  the data was appended.
- bug in ADD-IP for the subset bound socket case when the peer has only
  one address
- fix ASCONF implicit success/error handling case
- proper support of jails in Freebsd 6>
- copy out the timeval for the 64 bit sparc world on cookie-echo
  alignment error crashes without this).
Approved by:	re(Ken Smith)
2007-07-17 20:58:26 +00:00
Randall Stewart
b54d3a6c48 - Modular congestion control, with RFC2581 being the default.
- CMT_PF states added (w/sysctl to turn the PF version on)
- sctp_input.c had a missing incr of cookie case when the
  auth was bad. This meant a free was called without an
  increment to refcnt, added increment like rest of code.
- There was a case, unlikely, when the scope of the destination
  changed (this is a TSNH case). In that case, it would not free
  the alloc'ed asoc (in sctp_input.c).
- When listed addresses found a colliding cookie/Init, then
  the collided upon tcb was not unlocked in sctp_pcb.c
- Add error checking on arguments of sctp_sendx(3) to prevent it from
  referencing a NULL pointer.
- Fix an error return of sctp_sendx(3), it was returing
  ENOMEM not -1.
- Get assoc id was changed to use the sanctified socket api
  method for getting a assoc id (PEER_ADDR_INFO instead of
  PEER_ADDR_PARAMS).
- Fix it so a peeled off socket will get a proper error return
  if it trys to send to a different address then it is connected to.
- Fix so that select_a_stream can avoid an endless loop that
  could hang a caller.
- time_entered (state set time) was not being set in all cases
  to the time we went established.
Approved by:	re(ken smith)
2007-07-14 09:36:28 +00:00
George V. Neville-Neil
b2630c2934 Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC
option is now deprecated, as well as the KAME IPsec code.
What was FAST_IPSEC is now IPSEC.

Approved by: re
Sponsored by: Secure Computing
2007-07-03 12:13:45 +00:00
George V. Neville-Neil
2cb64cb272 Commit IPv6 support for FAST_IPSEC to the tree.
This commit includes only the kernel files, the rest of the files
will follow in a second commit.

Reviewed by:    bz
Approved by:    re
Supported by:   Secure Computing
2007-07-01 11:41:27 +00:00
Randall Stewart
eacc51c5b6 - Fixes cstatic issues found by cisco sa tool (missing frees and such
on error legs)
- align sctp_sockstore to 64 bit boundary ..
2007-06-18 21:59:15 +00:00
Randall Stewart
80fefe0a08 - Fix so ifn's are properly deleted when the ref count goes to 0.
- Fix so VRF's will clean themselves up when no references are around.
- Allow sctp_ifa to be passed into inpcb_bind, addr_mgmt_ep_sa to bypass
  normal validation checks.
- turn auto-asconf off for subset bound sockets
- Moves all logging to use KTR. This gets rid of most
  of the logging #ifdef's with a few exceptions reducing
  the number of config options for SCTP.
2007-06-14 22:59:04 +00:00
Randall Stewart
35918f8571 - Restructure so bindx functions are not done inline to socket option
but are a seperate call that can be re-used if needed.
- 64 bit issues
  o re-arrange cookie so it is better 64 bit aligned
  o For wire level things we need the packed attribute.
2007-06-12 11:21:00 +00:00
Robert Watson
32f9753cfb Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
Randall Stewart
f4c93d2405 - fix initial pcb vrf setting when the initial vrf is not the
default_vrf_id
- Missing lock/unlock of inp added as well in the v6 side.
- IFN hash table moves to sctppcbinfo since indexes are
  unique across systems (including different VRFs) this makes it easier
  to do ifn lookups.
2007-06-02 11:05:08 +00:00
Randall Stewart
207304d4b7 - Fixes so we won't try to start a timer when we
hold a wq lock for the iterator. Panda uses a
  silly recursive lock they hold through the timer.
- Add poor mans wireshark compile option..
- Allocate and start using SCTP_M_XXX for all SCTP_MALLOC() calls.
- sysctl now will get back the refcnt for viewing by onlookers.

Reviewed by:	gnn
2007-05-29 09:29:03 +00:00
Randall Stewart
d61a0ae066 - fixed autclose to not allow setting on 1-2-1 model.
- bounded cookie-life to 1 second minimum in socket option set.
- Delayed_ack_time becomes delayed_ack per new socket api document.
- Improve port number selection, we now use low/high bounds and
  no chance of a endless loop. Only one call to random per bind
  as well.
- fixes so set_peer_primary pre-screens addresses to be
  valid to this host.
- maxseg did not allow setting on an assoc basis. We needed
  to thus track and use an association value instead of a inp value.
- Fixed ep get of HB status to report back properly.
- use settings flag to tell if assoc level hb is on off not
  the timer.. since the timer may still run if unconf address
  are present.
- check for crazy ENABLE/DISABLE conditions.
- set and get of pmtud (fixed path mtu) not always taking into account ovh.
- Getting PMTU info on stcb only needs to return PMTUD_ENABLED if
  any net is doing PMTU discovery.
- Panic or warning fixed to not do so when a valid ip frag is
  taking place.
- sndrcvinfo appearing in both inp and stcb was full size, instead
  of the non-pad version. This saves about 92 bytes from each struct
  by carefully converting to use the smaller version.
- one-2-one model get(maxseg) would always get ep value, never the
  tcb's value.
- The delayed ack time could be under a tick, this fixes so
  it bounds it to at least 1 tick for platforms whos tick
  is more than a ms.
- Fragment interleave level set to wrong default value.
- Fragment interleave could not set level 0.
- Defered stream reset was broken due to a guard check and ntohl issue.
- Found two lock order reversals and fixed.
- Tighten up address checking, if the user gives an address the sa_len
  had better be set properly.
- Get asoc by assoc-id would return a locked tcb when it was asked
  not to if the tcb was in the restart hash.
- sysctl to dig down and get more association details

Reviewed by:	gnn
2007-05-28 11:17:24 +00:00
Randall Stewart
3c503c28da - Fixed 1-2-1 model to not worry about associd in sockopts
- Fixed RTOinfo for bounding.
- Fixed connect() to return ECONNREFUSED when an ABORT is received.
- Added comments to direct Static Analysis not to look at some things
  it does not understand (comments are /* sa_ignore XXXXX */)
- Bind when colliding was broken, missing not_found = 1 before
  checking to see if the port was in use caused endless bind loop.
- Cookie life needs to be in milliseconds to conform to socket api.
- Cookie life is not supposed to change if its 0, On the assoc
  level set we changed it to 0 opps.
- Two more static analysis issues identified by the cisco
  tool. Null checks needed.
- An issue for sendfile(). Need to validate the correct
  input argument.
- When sending failed due to a no route to host, we leaked
  the mbuf chain failing to call m_freem().
- Fix #ifdef issue for getting hash block len when HAVE_SHA2 is NOT defined
Reviewed by:	gnn
2007-05-17 12:16:24 +00:00
Randall Stewart
ad81507eed Two major items here:
- All printf that was surrounded by #ifdef SCTP_DEBUG moves to
  a macro that does all of this. This removes all printfs from
  the code and makes the code more portable and easier to
  read.
- Static Analysis (cisco) - found a few bugs, but mostly we
  add checks for NULL pointers and such to make the tool
  happy. We now pass the Cisco SA tools checks except for
  where it does not understand tailq/lists. We still need
  to look at the coverity tools output too (this is like
  the cisco SA tool) and see if it wants us to fix any other
  items. Hopefully this will be the last major churn in the
  code other than bug fixes.
2007-05-09 13:30:06 +00:00
Randall Stewart
b100636770 - Copyright change, cisco's silly tool wants it to say:
"Copyright (c) 2001-2007, by Cisco Systems,"
   instead of
       *Copyright (c) 2001-2007, Cisco Systems,"

-  Also fix a few straglers that were still in 2006.
2007-05-08 17:01:12 +00:00
Randall Stewart
b0552ae214 - Get rid of the sctp_inpcb_free() "magic numbers", now they
are sensible defines that tell what you are directing
   the function to do.
2007-05-08 15:53:03 +00:00
Randall Stewart
6e55db5445 - Static analyisis fixes for cisco's commit (this is equivilant
to the coverity tool.. may even be the same one.. not sure).
-  A bug in the way sctp_abort() and friends were
   setting the IP_CLOSE flag.. and NOT passing the
   last argument as a (,1)... so that things would
   get freed..
2007-05-08 14:32:53 +00:00
Randall Stewart
17205ecc85 - More macros for OS compatabilty
-  PR-SCTP would ignore FWD-TSN's above a rwnd's worth
   of TSN's (1 byte msgs).. this left the peer hopelessly
   out of sync.. or an attacker. So now we abort the assoc.
-  New IFN hash, also rename hashes to match addr/ifn now
   that the vrf has multiple.
-  Do not enable SCTP_PCB_FLAGS_RECVDATAIOEVNT per default
   as defined in the Socket API ID.
-  Export MTU information via sysctl.
-  Vrf's need table id's. This is default for
   BSD, but may be other things later when BSD
   fully supports VRFs.
-  Additional stream reset bug (caught by cisco dev-test).
-  Additional validations for the address in sending a message (socket api).
-------- and -----
-  Fix association notifications not to give the active open
   side false notifications.
-  Fix so sendfile and SENDALL will work properly (missing
   flag to say socket sender is done).
-  Fix Bug that prevented COOKIES from being retransmitted.
-  Break out connectx into helper sub-models so that iox routines can
   reuse the helpers.
-  When an address is added during system init (non-dynamic mode) make
   sure that the "defer use" flag is not set.
** its compiling on XR now :-D **

Reviewed by:	gnn
2007-05-08 00:21:05 +00:00
Randall Stewart
1bb552e88d Fixes a missing unlock in the one-2-one hash table, if
it was full and a collision occured, then we would leave
a inp locked. Also fixes a missing inp unlock if IPSEC was
on and it failed during the attach. Bug found by Weongyo Jeong.
2007-05-04 15:19:10 +00:00
Randall Stewart
d06c82f169 - Somehow the disable fragment option got lost. We could
set/clear it but would not do it. Now we will.
-  Moved to latest socket api for extended sndrcv info struct.
-  Moved to support all new levels of fragment interleave (0-2).
-  Codenomicon security test updates - length checks and such.
-  Bug in stream reset (2 actually).
-  setpeerprimary could unlock a null pointer, fixed.
-  Added a flag in the pcb so netstat can see if we are listening easier.

Obtained from:	(some of the Listen changes from Weongyo Jeong)
2007-05-02 12:50:13 +00:00