ARP requests only on the network where this IP address belong, to.
Before this change we did replied on all interfaces. This could
lead to an IP address conflict with host we are doing ARP proxy
for.
PR: kern/75634
Reviewed by: andre
between sack and a bug in the "bad retransmit recovery" logic. This is
a workaround, the underlying bug will be fixed later.
Submitted by: Mohan Srinivasan, Noritoshi Demizu
TTL a packet must have when received on a socket. All packets with a
lower TTL are silently dropped. Works on already connected/connecting
and listening sockets for RAW/UDP/TCP.
This option is only really useful when set to 255 preventing packets
from outside the directly connected networks reaching local listeners
on sockets.
Allows userland implementation of 'The Generalized TTL Security Mechanism
(GTSM)' according to RFC3682. Examples of such use include the Cisco IOS
BGP implementation command "neighbor ttl-security".
MFC after: 2 weeks
Sponsored by: TCP/IP Optimization Fundraise 2005
cluster if needed.
Fixes the TCP issues raised in I-D draft-gont-icmp-payload-00.txt.
This aids in-the-wild debugging a lot and allows the receiver to do
more elaborate checks on the validity of the response.
MFC after: 2 weeks
Sponsored by: TCP/IP Optimization Fundraise 2005
packet in an ICMP reply. The minimum of 8 bytes is internally
enforced. The maximum quotation is the remaining space in the
reply mbuf.
This option is added in response to the issues raised in I-D
draft-gont-icmp-payload-00.txt.
MFC after: 2 weeks
Spnsored by: TCP/IP Optimizations Fundraise 2005
the IP address the packet came through in. This is useful for routers
to show in traceroutes the actual path a packet has taken instead of
the possibly different return path.
The new sysctl is named net.inet.icmp.reply_from_interface and defaults
to off.
MFC after: 2 weeks
than one interface in one subnet. However, some userland apps rely on
the believe that this configuration is impossible.
Add a sysctl switch net.inet.ip.same_prefix_carp_only. If the switch
is on, then kernel will refuse to add an additional interface to
already connected subnet unless the interface is CARP. Default
value is off.
PR: bin/82306
In collaboration with: mlaier
* Correct handling of IPv6 Extension Headers.
* Add unreach6 code.
* Add logging for IPv6.
Submitted by: sysctl handling derived from patch from ume needed for ip6fw
Obtained from: is_icmp6_query and send_reject6 derived from similar
functions of netinet6,ip6fw
Reviewed by: ume, gnn; silence on ipfw@
Test setup provided by: CK Software GmbH
MFC after: 6 days
incoming ARP packet and route request adding/removing
ARP entries. The root of the problem is that
struct llinfo_arp was accessed without any locks.
To close race we will use locking provided by
rtentry, that references this llinfo_arp:
- Make arplookup() return a locked rtentry.
- In arpresolve() hold the lock provided by
rt_check()/arplookup() until the end of function,
covering all accesses to the rtentry itself and
llinfo_arp it refers to.
- In in_arpinput() do not drop lock provided by
arplookup() during first part of the function.
- Simplify logic in the first part of in_arpinput(),
removing one level of indentation.
- In the second part of in_arpinput() hold rtentry
lock while copying address.
o Fix a condition when route entry is destroyed, while
another thread is contested on its lock:
- When storing a pointer to rtentry in llinfo_arp list,
always add a reference to this rtentry, to prevent
rtentry being destroyed via RTM_DELETE request.
- Remove this reference when removing entry from
llinfo_arp list.
o Further cleanup of arptimer():
- Inline arptfree() into arptimer().
- Use official queue(3) way to pass LIST.
- Hold rtentry lock while reading its structure.
- Do not check that sdl_family is AF_LINK, but
assert this.
Reviewed by: sam
Stress test: http://www.holm.cc/stress/log/cons141.html
Stress test: http://people.freebsd.org/~pho/stress/log/cons144.html
to atomically return either an existing set of IP multicast options for the
PCB, or a newlly allocated set with default values. The inpcb is returned
locked. This function may sleep.
Call ip_moptions() to acquire a reference to a PCB's socket options, and
perform the update of the options while holding the PCB lock. Release the
lock before returning.
Remove garbage collection of multicast options when values return to the
default, as this complicates locking substantially. Most applications
allocate a socket either to be multicast, or not, and don't tend to keep
around sockets that have previously been used for multicast, then used for
unicast.
This closes a number of race conditions involving multiple threads or
processes modifying the IP multicast state of a socket simultaenously.
MFC after: 7 days
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.
Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.
Reviewed by: pjd, bz
MFC after: 7 days
lists, as well as accessor macros. For now, this is a recursive mutex
due code sequences where IPv4 multicast calls into IGMP calls into
ip_output(), which then tests for a multicast forwarding case.
For support macros in in_var.h to check multicast address lists, assert
that in_multi_mtx is held.
Acquire in_multi_mtx around iteration over the IPv4 multicast address
lists, such as in ip_input() and ip_output().
Acquire in_multi_mtx when manipulating the IPv4 layer multicast addresses,
as well as over the manipulation of ifnet multicast address lists in order
to keep the two layers in sync.
Lock down accesses to IPv4 multicast addresses in IGMP, or assert the
lock when performing IGMP join/leave events.
Eliminate spl's associated with IPv4 multicast addresses, portions of
IGMP that weren't previously expunged by IGMP locking.
Add in_multi_mtx, igmp_mtx, and if_addr_mtx lock order to hard-coded
lock order in WITNESS, in that order.
Problem reported by: Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after: 10 days
FreeBSD specific ip_newid() changes NetBSD does not have.
Correct handling of non AF_INET packets passed to bpf [2].
PR: kern/80340[1], NetBSD PRs 29150[1], 30844[2]
Obtained from: NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56
Submitted by: Gert Doering <gert at greenie.muc.de>[2]
MFC after: 4 days
- most of the kernel code will not care about the actual encoding of
scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
scoped addresses as a special case.
- scope boundary check will be stricter. For example, the current
*BSD code allows a packet with src=::1 and dst=(some global IPv6
address) to be sent outside of the node, if the application do:
s = socket(AF_INET6);
bind(s, "::1");
sendto(s, some_global_IPv6_addr);
This is clearly wrong, since ::1 is only meaningful within a single
node, but the current implementation of the *BSD kernel cannot
reject this attempt.
Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from: KAME
redundant with respect to existing mbuf copy label routines. Expose
a new mac_copy_mbuf() routine at the top end of the Framework and
use that; use the existing mpo_copy_mbuf_label() routine on the
bottom end.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, SPAWAR
Approved by: re (scottl)
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64 (tier1) and ia64/ppc (tier2).
This adds two new macros that check the alignment, these are compile time
dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where
alignment isn't need so the cost is avoided.
IP_HDR_ALIGNED_P()
IP6_HDR_ALIGNED_P()
Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment
is checked for ipfw and dummynet too.
PR: ia64/81284
Obtained from: NetBSD
Approved by: re (dwhite), mlaier (mentor)
after PAWS checks. The symptom of this is an inconsistency in the cached
sack state, caused by the fact that the sack scoreboard was not being
updated for an ACK handled in the header prediction path.
Found by: Andrey Chernov.
Submitted by: Noritoshi Demizu, Raja Mukerji.
Approved by: re
does not clear tlen and frees the mbuf (leaving th pointing at
freed memory), if the data segment is a complete duplicate.
This change works around that bug. A fix for the tcp_reass() bug
will appear later (that bug is benign for now, as neither th nor
tlen is referenced in tcp_input() after the call to tcp_reass()).
Found by: Pawel Jakub Dawidek.
Submitted by: Raja Mukerji, Noritoshi Demizu.
Approved by: re
so residue of division for all hosts on net is the same, and thus only
one VHID answers. Change source IP in host byte order.
Reviewed by: mlaier
Approved by: re (scottl)
The ipfw tables lookup code caches the result of the last query. The
kernel may process multiple packets concurrently, performing several
concurrent table lookups. Due to an insufficient locking, a cached
result can become corrupted that could cause some addresses to be
incorrectly matched against a lookup table.
Submitted by: ru
Reviewed by: csjp, mlaier
Security: CAN-2005-2019
Security: FreeBSD-SA-05:13.ipfw
Correct bzip2 permission race condition vulnerability.
Obtained from: Steve Grubb via RedHat
Security: CAN-2005-0953
Security: FreeBSD-SA-05:14.bzip2
Approved by: obrien
Correct TCP connection stall denial of service vulnerability.
A TCP packets with the SYN flag set is accepted for established
connections, allowing an attacker to overwrite certain TCP options.
Submitted by: Noritoshi Demizu
Reviewed by: andre, Mohan Srinivasan
Security: CAN-2005-2068
Security: FreeBSD-SA-05:15.tcp
Approved by: re (security blanket), cperciva
processing is now done in the ACK processing case.
- Merge tcp_sack_option() and tcp_del_sackholes() into a new function
called tcp_sack_doack().
- Test (SEG.ACK < SND.MAX) before processing the ACK.
Submitted by: Noritoshi Demizu
Reveiewed by: Mohan Srinivasan, Raja Mukerji
Approved by: re
kernel module. LibAlias is not aware about checksum offloading,
so the caller should provide checksum calculation. (The only
current consumer is ng_nat(4)). When TCP packet internals has
been changed and it requires checksum recalculation, a cookie
is set in th_x2 field of TCP packet, to inform caller that it
needs to recalculate checksum. This ugly hack would be removed
when LibAlias is made more kernel friendly.
Incremental checksum updates are left as is, since they don't
conflict with offloading.
Approved by: re (scottl)
a DLT_NULL interface. In particular:
1) Consistently use type u_int32_t for the header of a
DLT_NULL device - it continues to represent the address
family as always.
2) In the DLT_NULL case get bpf_movein to store the u_int32_t
in a sockaddr rather than in the mbuf, to be consistent
with all the DLT types.
3) Consequently fix a bug in bpf_movein/bpfwrite which
only permitted packets up to 4 bytes less than the MTU
to be written.
4) Fix all DLT_NULL devices to have the code required to
allow writing to their bpf devices.
5) Move the code to allow writing to if_lo from if_simloop
to looutput, because it only applies to DLT_NULL devices
but was being applied to other devices that use if_simloop
possibly incorrectly.
PR: 82157
Submitted by: Matthew Luckie <mjl@luckie.org.nz>
Approved by: re (scottl)