chunk, enable UDP encapsulation for all those addresses.
This helps clients using a userland stack to support multihoming if
they are not behind a NAT.
MFC after: 1 week
This is currently only a code change without any functional
change. But this allows to set the remote encapsulation port
in a more detailed way, which will be provided in a follow-up
commit.
MFC after: 1 week
adds the new I-Data (Interleaved Data) message. This allows a user
to be able to have complete freedom from Head Of Line blocking that
was previously there due to the in-ability to send multiple large
messages without the TSN's being in sequence. The code as been
tested with Michaels various packet drill scripts as well as
inter-networking between the IETF's location in Argentina and Germany.
This routine checks that there are no locks held for an inp,
without having any lock on the inp. This breaks if the inp
goes away when it is called. This happens on stress tests
on a RPi B+.
MFC after: 3 days
* When processing a cookie, use the number of
streams announced in the INIT-ACK.
* When sending an INIT-ACK for an existing
association, use the value from the association,
not from the end-point.
MFC after: 1 week
o Introduce a notion of "not ready" mbufs in socket buffers. These
mbufs are now being populated by some I/O in background and are
referenced outside. This forces following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- An mbuf that is behind a "not ready" in the queue neither.
- If sockbet buffer is flushed, then "not ready" mbufs shouln't be
freed.
o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
The sb_ccc stands for ""claimed character count", or "committed
character count". And the sb_acc is "available character count".
Consumers of socket buffer API shouldn't already access them directly,
but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
search.
o New function sbready() is provided to activate certain amount of mbufs
in a socket buffer.
A special note on SCTP:
SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs. Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them. The new
notion of "not ready" data isn't supported by SCTP. Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does. This was discussed with rrs@.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
NRSACK extension. The default will still be off, since it
it not an RFC (yet).
Changing the sysctl name will be in a separate commit.
MFC after: 1 week
option for controlling ECN on future associations and get the
status on current associations.
A simialar pattern will be used for controlling SCTP extensions in
upcoming commits.
* Decouple the path supervision using a separate HB timer per path.
* Add support for potentially failed state.
* Bring back RTO.min to 1 second.
* Accept packets on IP-addresses already announced via an ASCONF
* While there: do some cleanups.
Approved by: re@
MFC after: 2 months.
1) They don't use the giant "MAX_CPU" define and instead
are allocated dynamically based on mp_ncpus
2) Will zero with the netstat -z -s -p sctp
3) Will be properly handled by both the sctp_init and finish
(the multi-net stuff was incorrectly bzero'ing in sctp_init
the wrong size.. the bzero is now moved to the right places).
And of course the free is put in at the very end.
MFC after: 3 Months
threads. These serve as input threads and are queued
packets based on the V-tag number. This is similar to
what a modern card can do with queue's for TCP... but
alas modern cards know nothing about SCTP.
MFC after: 3 months (maybe)
2) Add separate max-bursts for retransmit and hb. These
are set to sysctlable values but not settable via the
socket api. This makes sure we don't blast out HB's or
fast-retransmits.
3) Determine on the first data transmission on a net if
its local-lan (by being under or over a RTT). This
can later be used to think about different algorithms
based on locallan vs big-i (experimental)
4) The cwnd should NOT be allowed to grow when an ECNEcho
is seen (TCP has this same bug). We fix this in SCTP
so an ECNe being seen prevents an advance of cwnd.
5) CWR's should not be sent multiple times to the
same network, instead just updating the TSN being
transmitted if needed.
MFC after: 1 Month
with the latest socket API ID. Especially it can be disabled.
Full compliance needs changing the structure used in the
socket option. Since this breaks the API, it will be a
seperate commit which will not be MFCed to stable/8.
MFC after: 3 months.
Fix the switching on/off of PF and NR-SACKs using sysctl.
Add minor improvement in handling malloc failures.
Improve the address checks when sending.
MFC after: 4 weeks
1) Makes it so that the INVARIANT function validate nolocks is
available anywhere.
2) Fixes a BUG where a close has been done on a collision socket
and the cookie processing would return leaving a lock held.
MFC after: 1 week
code base. We now properly have ONE thread
that services all VNET's. Also we purge out
the old timer based iterator code which had
multiple LOR's and other issues.
MFC after: 3 days
* Fix handling of mapping arrays when draining mbufs or processing
FORWARD-TSN chunks.
* Cleanup code (no duplicate code anymore for SACKs and NR-SACKs).
Part of this code was developed together with rrs.
MFC after: 2 weeks.
- Prepare for CRC offloading, add MIB counters (RS/MT).
- Bugfix: Disable CRC computation for IPv6 addresses with local scope (MT).
- Bugfix: Handle close() with SO_LINGER correctly when notifications
are generated during the close() call(MT).
- Bugfix: Generate DRY event when sender is dry during subscription.
Only for 1-to-1 style sockets (RS/MT)
- Bugfix: Put vtags for the correct amount of time into time-wait (MT).
- Bugfix: Clear vtag entries correctly on expiration (MT).
- Bugfix: shutdown() indicates ENOTCONN when called for unconnected
1-to-1 style sockets (MT).
- Bugfix: In sctp Auth code (PL).
- Add support for devices that support SCTP csum offload (igb).
- Add missing sctp_associd to mib sysctl xsctp_tcb structure (RS)
Obtained from: With help from Peter Lei and Michael Tuexen
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less strigent requirements in non-uniqueness.
o don't pre-hash them when you issue one in a cookie.
o Allow duplicates and use addresses and ports to
discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
is still experimental and needs more extensive testing with the
Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
if the adv-peer-ack point progressed. However we still make sure
a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
unreliable but not abandoned yet.
With the help of: Michael Teuxen and Peter Lei :-)
MFC after: 4 weeks
-Improvement: panic() on INVARIANTS kernels if memory allocation
fails for a tagblock in sctp_add_vtag_to_timewait().
-Bugfix: Protect code in sctp_is_in_timewait() by
SCTP_INP_INFO_WLOCK/SCTP_INP_INFO_WUNLOCK.
-Cleanup: Get rid of unused variable now in sctp_init_asoc().
-Bugfix: Reuse the correct vtag in sctp_add_vtag_to_timewait().
-Cleanup: Get rid of unused constant SCTP_TIME_WAIT_SHORT
in sctp_constants.h.
-Improvement: Use all hash buckets of the vtag hash table.
-Cleanup: Get rid of then unused constant SCTP_STACK_VTAG_HASH_SIZE_A.
-Bugfix: Handle SHUTDOWN;SACK packet correctly.
-Bugfix: Last TSN in a gap ack block was not being "ack'd"
in the internal scoreboard.
Obtained from: (with help from Michael Tuexen)
2) Adds some __UserSpace__ on some of the common defines that
the user space code needs
3) Fixes a bug when we send up data to a user that failed. We
need to a) trim off the data chunk headers, if present, and
b) make sure the frag bit is communicated properly for the
msgs coming off the stream queues... i.e. we see if some
of the msg has been taken.
Obtained from: jeli contributed the VIMAGE changes on this pass Thanks Julain!
- only one functino to destroy an SCTP stack sctp_finish()
- Make it so this function also arranges for any threads
created by the image to do a kthread_exit()
- Vimage prep - these are major restructures to move
all global variables to be accessed via a macro or two.
The variables all go into a single structure.
- Asconf address addition tweaks (add_or_del Interfaces)
- Fix rwnd calcualtion to be more conservative.
- Support SACK_IMMEDIATE flag to skip delayed sack
by demand of peer.
- Comment updates in the sack mapping calculations
- Invarients panic added.
- Pre-support for UDP tunneling (we can do this on
MAC but will need added support from UDP to
get a "pipe" of UDP packets in.
- clear trace buffer sysctl added when local tracing on.
Note the majority of this huge patch is all the vimage prep stuff :-)
- Adds some prepwork (Not all yet) for vimage in particular
support the delete the sctppcbinfo.xx structs. There is
still a leak in here if it were to be called plus we stil
need the regrouping (From Me and Michael Tuexen)
- Adds support for UDP tunneling. For BSD there is no
socket yet setup so its disabled, but major argument
changes are in here to emcompass the passing of the port
number (zero when you don't have a udp tunnel, the default
for BSD). Will add some hooks in UDP here shortly (discussed
with Robert) that will allow easy tunneling. (Mainly from
Peter Lei and Michael Tuexen with some BSD work from me :-D)
- Some ease for windows, evidently leave is reserved by their
compile move label leave: -> out:
MFC after: 1 week
- Select a tag gains ability to optionally save new tags
off in the timewait system.
- When looking up associations do not give back a stcb that
is in the about-to-be-freed state, and instead continue
looking for other candiates.
- New function to query to see if value is in time-wait.
- Timewait had a time comparison error that caused very
few vtags to actually stay in time-wait.
- When setting tags in time-wait, we now use the time
requested NOT a fixed constant value.
- sstat now gets the proper associd when we do the query.
- When we process an association, we expect the tag chosen
(if we have one from a cookie) to be in time-wait. Before
we would NOT allow the assoc up by checking if its good.
In theory this should have caused almost all assoc not
to come up except for the time-comparison bug above (this
bug was hidden by the time comparison bug :-D).
- Don't save tags for nonce values in the time-wait cache
since these are used only during cookie collisions and do
not matter if they are unique or not.
MFC after: 1 week