RTF_BLACKHOLE as well.
To quote the submitter:
The uRPF loose-check implementation by the industry vendors, at least on Cisco
and possibly Juniper, will fail the check if the route of the source address
is pointed to Null0 (on Juniper, discard or reject route). What this means is,
even if uRPF Loose-check finds the route, if the route is pointed to blackhole,
uRPF loose-check must fail. This allows people to utilize uRPF loose-check mode
as a pseudo-packet-firewall without using any manual filtering configuration --
one can simply inject a IGP or BGP prefix with next-hop set to a static route
that directs to null/discard facility. This results in uRPF Loose-check failing
on all packets with source addresses that are within the range of the nullroute.
Submitted by: James Jun <james@towardex.com>
1) data to be sent to the right of snd_recover.
2) send more data then whats in the send buffer.
The fix is to postpone sack retransmit to a subsequent recovery episode
if the current retransmit pointer is beyond snd_recover.
Thanks to Mohan Srinivasan for helping fix the bug.
Submitted by:Daniel Lang
for the SYN|ACK packet and then letting in6_pcbconnect set the
flowlabel later. Arange for the syncache/syncookie code to set and
recall the flow label so that the flowlabel used for the SYN|ACK
is consistent. This is done by using some of the cookie (when tcp
cookies are enabeled) and by stashing the flowlabel in syncache.
Tested and Discovered by: Orla McGann <orly@cnri.dit.ie>
Approved by: ume, silby
MFC after: 1 month
icmp_error() packets. While here retire PACKET_TAG_PF_GENERATED (which
served the same purpose) and use M_SKIP_FIREWALL in pf as well. This should
speed up things a bit as we get rid of the tag allocations.
Discussed with: juli
using M_PROTO6 and possibly shooting someone's foot, as well as allowing the
firewall to be used in multiple passes, or with a packet classifier frontend,
that may need to explicitly allow a certain packet. Presently this is handled
in the ipfw_chk code as before, though I have run with it moved to upper
layers, and possibly it should apply to ipfilter and pf as well, though this
has not been investigated.
Discussed with: luigi, rwatson
for unknown events.
A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
bootp -> BOOTP
bootp.nfsroot -> BOOTP_NFSROOT
bootp.nfsv3 -> BOOTP_NFSV3
bootp.compat -> BOOTP_COMPAT
bootp.wired_to -> BOOTP_WIRED_TO
- i.e. back out the previous commit. It's already possible to
pxeboot(8) with a GENERIC kernel.
Pointed out by: dwmalone
BOOTP -> bootp
BOOTP_NFSROOT -> bootp.nfsroot
BOOTP_NFSV3 -> bootp.nfsv3
BOOTP_COMPAT -> bootp.compat
BOOTP_WIRED_TO -> bootp.wired_to
This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:
bootp="YES"
bootp.nfsroot="YES"
bootp.nfsv3="YES"
bootp.wired_to="bge1"
or even setting the variables manually from the OK prompt.
{ip,udp,tcp} header and return a void * pointing to the payload (i.e. the
first byte past the end of the header and any required padding). Use them
consistently throughout libalias to a) reduce code duplication, b) improve
code legibility, c) get rid of a bunch of alignment warnings.
a short pointer. The previous implementation seems to be in a gray zone
of the C standard, and GCC generates incorrect code for it at -O2 or
higher on some platforms.
named link, foo_link or link_foo to lnk, foo_lnk or lnk_foo, fixing
signed / unsigned comparisons, and shoving unused function arguments
under the carpet.
I was hoping WARNS?=6 might reveal more serious problems, and perhaps
the source of the -O2 breakage, but found no smoking gun.
Fix this problem by separating out the SACK and the newreno cases. Also, check
if we are in FASTRECOVERY for the sack case and if so, turn off dupacks.
Fix an issue where the congestion window was not being incremented by ssthresh.
Thanks to Mohan Srinivasan for finding this problem.
associated with performing a wakeup on the socket buffer:
- When performing an sbappend*() followed by a so[rw]wakeup(), explicitly
acquire the socket buffer lock and use the _locked() variants of both
calls. Note that the _locked() sowakeup() versions unlock the mutex on
return. This is done in uipc_send(), divert_packet(), mroute
socket_send(), raw_append(), tcp_reass(), tcp_input(), and udp_append().
- When the socket buffer lock is dropped before a sowakeup(), remove the
explicit unlock and use the _locked() sowakeup() variant. This is done
in soisdisconnecting(), soisdisconnected() when setting the can't send/
receive flags and dropping data, and in uipc_rcvd() which adjusting
back-pressure on the sockets.
For UNIX domain sockets running mpsafe with a contention-intensive SMP
mysql benchmark, this results in a 1.6% query rate improvement due to
reduce mutex costs.
locking in tcp_input() for TCP packets with urgent data pointers to
hold the socket buffer lock across testing and updating oobmark
from just protecting sb_state.
Update socket locking annotations
Giant if debug.mpsafenet=0, as any points that require synchronization
in the SMPng world also required it in the Giant-world:
- inpcb locks (including IPv6)
- inpcbinfo locks (including IPv6)
- dummynet subsystem lock
- ipfw2 subsystem lock
the socket buffer having its limits adjusted. sbreserve() now acquires
the lock before calling sbreserve_locked(). In soreserve(), acquire
socket buffer locks across read-modify-writes of socket buffer fields,
and calls into sbreserve/sbrelease; make sure to acquire in keeping
with the socket buffer lock order. In tcp_mss(), acquire the socket
buffer lock in the calling context so that we have atomic read-modify
-write on buffer sizes.
originated on RELENG_4 and was ported to -CURRENT.
The scoreboarding code was obtained from OpenBSD, and many
of the remaining changes were inspired by OpenBSD, but not
taken directly from there.
You can enable/disable sack using net.inet.tcp.do_sack. You can
also limit the number of sack holes that all senders can have in
the scoreboard with net.inet.tcp.sackhole_limit.
Reviewed by: gnn
Obtained from: Yahoo! (Mohan Srinivasan, Jayanth Vijayaraghavan)
ip_ctloutput(), as it may need to perform blocking memory allocations.
This also improves consistency with locking relative to other points
that call into ip_ctloutput().
Bumped into by: Grover Lines <grover@ceribus.net>
was received on a broadcast address on the input path. Under certain
circumstances this could result in a panic, notably for locally-generated
packets which do not have m_pkthdr.rcvif set.
This is a similar situation to that which is solved by
src/sys/netinet/ip_icmp.c rev 1.66.
PR: kern/52935
encapsulated within an IPv6 datagram, do not abuse the 'ipov' pointer
when registering trace records. 'ipov' is specific to IPv4, and
will therefore be uninitialized.
[This fandango is only necessary in the first place because of our
host-byte-order IP field pessimization.]
PR: kern/60856
Submitted by: Galois Zheng
unless the segment really contains the last of the data for the stream.
PR: kern/34619
Obtained from: OpenBSD (tcp_output.c rev 1.47)
Noticed by: Joseph Ishac
Reviewed by: George Neville-Neil