Commit Graph

6136 Commits

Author SHA1 Message Date
Mateusz Guzik
cc426dd319 Remove unused argument to priv_check_cred.
Patch mostly generated with cocinnelle:

@@
expression E1,E2;
@@

- priv_check_cred(E1,E2,0)
+ priv_check_cred(E1,E2)

Sponsored by:	The FreeBSD Foundation
2018-12-11 19:32:16 +00:00
Mark Johnston
9d2877fc3d Clamp the INPCB port hash tables to IPPORT_MAX + 1 chains.
Memory beyond that limit was previously unused, wasting roughly 1MB per
8GB of RAM.  Also retire INP_PCBLBGROUP_PORTHASH, which was identical to
INP_PCBPORTHASH.

Reviewed by:	glebius
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17803
2018-12-05 17:06:00 +00:00
Andrey V. Elsukov
d66f9c86fa Add ability to request listing and deleting only for dynamic states.
This can be useful, when net.inet.ip.fw.dyn_keep_states is enabled, but
after rules reloading some state must be deleted. Added new flag '-D'
for such purpose.

Retire '-e' flag, since there can not be expired states in the meaning
that this flag historically had.

Also add "verbose" mode for listing of dynamic states, it can be enabled
with '-v' flag and adds additional information to states list. This can
be useful for debugging.

Obtained from:	Yandex LLC
MFC after:	2 months
Sponsored by:	Yandex LLC
2018-12-04 16:12:43 +00:00
Michael Tuexen
c8b53ced95 Limit option_len for the TCP_CCALGOOPT.
Limiting the length to 2048 bytes seems to be acceptable, since
the values used right now are using 8 bytes.

Reviewed by:		glebius, bz, rrs
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D18366
2018-11-30 10:50:07 +00:00
Mark Johnston
79db6fe7aa Plug some networking sysctl leaks.
Various network protocol sysctl handlers were not zero-filling their
output buffers and thus would export uninitialized stack memory to
userland.  Fix a number of such handlers.

Reported by:	Thomas Barabosch, Fraunhofer FKIE
Reviewed by:	tuexen
MFC after:	3 days
Security:	kernel memory disclosure
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18301
2018-11-22 20:49:41 +00:00
Michael Tuexen
ad2be38941 A TCP stack is required to check SEG.ACK first, when processing a
segment in the SYN-SENT state as stated in Section 3.9 of RFC 793,
page 66. Ensure this is also done by the TCP RACK stack.

Reviewed by:		rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D18034
2018-11-22 20:05:57 +00:00
Michael Tuexen
fef56019e9 Ensure that the TCP RACK stack honours the setting of the
net.inet.tcp.drop_synfin sysctl-variable.

Reviewed by:		rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D18033
2018-11-22 20:02:39 +00:00
Michael Tuexen
7e729f0787 Ensure that the default RTT stack can make an RTT measurement if
the TCP connection was initiated using the RACK stack, but the
peer does not support the TCP RACK extension.

This ensures that the TCP behaviour on the wire is the same if
the TCP connection is initated using the RACK stack or the default
stack.

Reviewed by:		rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D18032
2018-11-22 19:56:52 +00:00
Michael Tuexen
794107181a Ensure that TCP RST-segments announce consistently a receiver window of
zero. This was already done when sending them via tcp_respond().

Reviewed by:		rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D17949
2018-11-22 19:49:52 +00:00
Michael Tuexen
3bea9a2664 Improve two KASSERTs in the TCP RACK stack.
There are two locations where an always true comparison was made in
a KASSERT. Replace this by an appropriate check and use a consistent
panic message. Also use this code when checking a similar condition.

PR:			229664
Reviewed by:		rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D18021
2018-11-21 18:19:15 +00:00
Andrey V. Elsukov
5786c6b9f9 Make multiline APPLY_MASK() macro to be function-like.
Reported by:	cem
MFC after:	1 week
2018-11-20 18:38:28 +00:00
Bjoern A. Zeeb
945aad9c62 Improve the comment for arpresolve_full() in if_ether.c.
No functional changes.

MFC after:	6 weeks
2018-11-17 16:13:09 +00:00
Bjoern A. Zeeb
90d99b6587 Retire arpresolve_addr(), which is not used anywhere, from if_ether.c. 2018-11-17 16:08:36 +00:00
Jonathan T. Looney
2157f3c36a Add some additional length checks to the IPv4 fragmentation code.
Specifically, block 0-length fragments, even when the MF bit is clear.
Also, ensure that every fragment with the MF bit clear ends at the same
offset and that no subsequently-received fragments exceed that offset.

Reviewed by:	glebius, markj
MFC after:	3 days
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D17922
2018-11-16 18:32:48 +00:00
Mark Johnston
86af1d0241 Ensure that IP fragments do not extend beyond IP_MAXPACKET.
Such fragments are obviously invalid, and when processed may end up
violating the sort order (by offset) of fragments of a given packet.
This doesn't appear to be exploitable, however.

Reviewed by:	emaste
Discussed with:	jtl
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17914
2018-11-10 03:00:36 +00:00
Ed Maste
2bfaf585ca Avoid buffer underwrite in icmp_error
icmp_error allocates either an mbuf (with pkthdr) or a cluster depending
on the size of data to be quoted in the ICMP reply, but the calculation
failed to account for the additional padding that m_align may apply.

Include the ip header in the size passed to m_align.  On 64-bit archs
this will have the net effect of moving everything 4 bytes later in the
mbuf or cluster.  This will result in slightly pessimal alignment for
the ICMP data copy.

Also add an assertion that we do not move m_data before the beginning of
the mbuf or cluster.

Reported by:	A reddit user
Reviewed by:	bz, jtl
MFC after:	3 days
Security:	CVE-2018-17156
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17909
2018-11-08 20:17:36 +00:00
Michael Tuexen
8553b984a5 Don't use a function when neither INET nor INET6 are defined.
This is a valid case for the userland stack, where this fixes
two set-but-not-used warnings in this case.

Thanks to Christian Wright for reporting the issue.
2018-11-06 12:55:03 +00:00
Jonathan T. Looney
54e675342b m_pulldown() may reallocate n. Update the oip pointer after the
m_pulldown() call.

MFC after:	2 weeks
Sponsored by:	Netflix
2018-11-02 19:14:15 +00:00
Bjoern A. Zeeb
e2c532f156 carpstats are the last virtualised variable in the file and end up at the
end of the vnet_set.  The generated code uses an absolute relocation at
one byte beyond the end of the carpstats array.  This means the relocation
for the vnet does not happen for carpstats initialisation and as a result
the kernel panics on module load.

This problem has only been observed with carp and only on i386.
We considered various possible solutions including using linker scripts
to add padding to all kernel modules for pcpu and vnet sections.

While the symbols (by chance) stay in the order of appearance in the file
adding an unused non-file-local variable at the end of the file will extend
the size of set_vnet and hence make the absolute relocation for carpstats
work (think of this as a single-module set_vnet padding).

This is a (tmporary) hack.  It is the least intrusive one as we need a
timely solution for the upcoming release.  We will revisit the problem in
HEAD.  For a lot more information and the possible alternate solutions
please see the PR and the references therein.

PR:			230857
MFC after:		3 days
2018-11-01 17:26:18 +00:00
Mark Johnston
d9ff5789be Remove redundant checks for a NULL lbgroup table.
No functional change intended.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17108
2018-11-01 15:52:49 +00:00
Mark Johnston
79ee680b65 Improve style in in_pcbinslbgrouphash() and related subroutines.
No functional change intended.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17107
2018-11-01 15:51:49 +00:00
Michael Tuexen
6999f6975c Remove debug code which slipped in accidently.
MFC after:		4 weeks
X-MFC with:		r339989
Sponsored by:		Netflix, Inc.
2018-11-01 11:41:40 +00:00
Michael Tuexen
099ab39f44 Improve a comment to refer to the actual sections in the TCP
specification for the comparisons made.
Thanks to lstewart@ for the suggestion.

MFC after:		4 weeks
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D17595
2018-11-01 11:35:28 +00:00
Bjoern A. Zeeb
201100c58b Initial implementation of draft-ietf-6man-ipv6only-flag.
This change defines the RA "6" (IPv6-Only) flag which routers
may advertise, kernel logic to check if all routers on a link
have the flag set and accordingly update a per-interface flag.

If all routers agree that it is an IPv6-only link, ether_output_frame(),
based on the interface flag, will filter out all ETHERTYPE_IP/ARP
frames, drop them, and return EAFNOSUPPORT to upper layers.

The change also updates ndp to show the "6" flag, ifconfig to
display the IPV6_ONLY nd6 flag if set, and rtadvd to allow
announcing the flag.

Further changes to tcpdump (contrib code) are availble and will
be upstreamed.

Tested the code (slightly earlier version) with 2 FreeBSD
IPv6 routers, a FreeBSD laptop on ethernet as well as wifi,
and with Win10 and OSX clients (which did not fall over with
the "6" flag set but not understood).

We may also want to (a) implement and RX filter, and (b) over
time enahnce user space to, say, stop dhclient from running
when the interface flag is set.  Also we might want to start
IPv6 before IPv4 in the future.

All the code is hidden under the EXPERIMENTAL option and not
compiled by default as the draft is a work-in-progress and
we cannot rely on the fact that IANA will assign the bits
as requested by the draft and hence they may change.

Dear 6man, you have running code.

Discussed with:	Bob Hinden, Brian E Carpenter
2018-10-30 20:08:48 +00:00
Mark Johnston
da7d7778b0 Expose some netdump configuration parameters through sysctl.
Reviewed by:	cem
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D17755
2018-10-29 21:16:26 +00:00
Eugene Grosbein
1a5995cc88 Prevent ip_input() from panicing due to unprotected access to INADDR_HASH.
PR:			220078
MFC after:		1 month
Differential Revision:	https://reviews.freebsd.org/D12457
Tested-by:		Cassiano Peixoto and others
2018-10-27 04:59:35 +00:00
Eugene Grosbein
4f1e3122ac Prevent multicast code from panicing due to unprotected access to INADDR_HASH.
PR:			220078
MFC after:		1 month
Differential Revision:	https://reviews.freebsd.org/D12457
Tested-by:		Cassiano Peixoto and others
2018-10-27 04:53:25 +00:00
Michael Tuexen
de00ad05e6 Add initial descriptions for SCTP related MIB variable.
This work was mostly done by Marie-Helene Kvello-Aune.

MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D3583
2018-10-26 21:04:17 +00:00
Andrey V. Elsukov
8796e291f8 Add the check that current VNET is ready and access to srchash is allowed.
This change is similar to r339646. The callback that checks for appearing
and disappearing of tunnel ingress address can be called during VNET
teardown. To prevent access to already freed memory, add check to the
callback and epoch_wait() call to be sure that callback has finished its
work.

MFC after:	20 days
2018-10-23 13:11:45 +00:00
John Baldwin
74e10fb613 A couple of style fixes in recent TCP changes.
- Add a blank line before a block comment to match other block comments
  in the same function.
- Sort the prototype for sbsndptr_adv and fix whitespace between return
  type and function name.

Reviewed by:	gallatin, bz
Differential Revision:	https://reviews.freebsd.org/D17474
2018-10-22 21:17:36 +00:00
Eugene Grosbein
410634efd1 New sysctl: net.inet.icmp.error_keeptags
Currently, icmp_error() function copies FIB number from original packet
into generated ICMP response but not mbuf_tags(9) chain.
This prevents us from easily matching ICMP responses corresponding
to tagged original packets by means of packet filter such as ipfw(8).
For example, ICMP "time-exceeded in-transit" packets usually generated
in response to traceroute probes lose tags attached to original packets.

This change adds new sysctl net.inet.icmp.error_keeptags
that defaults to 0 to avoid extra overhead when this feature not needed.

Set net.inet.icmp.error_keeptags=1 to make icmp_error() copy mbuf_tags
from original packet to generated ICMP response.

PR:		215874
MFC after:	1 month
2018-10-21 21:29:19 +00:00
Andrey V. Elsukov
f252e3f2f2 Include <sys/eventhandler.h> to fix the build.
MFC after:	1 month
2018-10-21 18:39:34 +00:00
Andrey V. Elsukov
19873f4780 Add handling for appearing/disappearing of ingress addresses to if_gre(4).
* register handler for ingress address appearing/disappearing;
* add new srcaddr hash table for fast softc lookup by srcaddr;
* when srcaddr disappears, clear IFF_DRV_RUNNING flag from interface,
  and set it otherwise;

MFC after:	1 month
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D17214
2018-10-21 18:13:45 +00:00
Andrey V. Elsukov
009d82ee0f Add handling for appearing/disappearing of ingress addresses to if_gif(4).
* register handler for ingress address appearing/disappearing;
* add new srcaddr hash table for fast softc lookup by srcaddr;
* when srcaddr disappears, clear IFF_DRV_RUNNING flag from interface,
  and set it otherwise;
* remove the note about ingress address from BUGS section.

MFC after:	1 month
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D17134
2018-10-21 18:06:15 +00:00
Andrey V. Elsukov
8251c68d5c Add KPI that can be used by tunneling interfaces to handle IP addresses
appearing and disappearing on the host system.

Such handling is need, because tunneling interfaces must use addresses,
that are configured on the host as ingress addresses for tunnels.
Otherwise the system can send spoofed packets with source address, that
belongs to foreign host.

The KPI uses ifaddr_event_ext event to implement addresses tracking.
Tunneling interfaces register event handlers and then they are
notified by the kernel, when an address disappears or appears.

ifaddr_event_compat() handler from if.c replaced by srcaddr_change_event()
in the ip_encap.c

MFC after:	1 month
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D17134
2018-10-21 17:55:26 +00:00
Andrey V. Elsukov
094d6f8d75 Add IPFW_RULE_JUSTOPTS flag, that is used by ipfw(8) to mark rule,
that was added using "new rule format". And then, when the kernel
returns rule with this flag, ipfw(8) can correctly show it.

Reported by:	lev
MFC after:	3 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D17373
2018-10-21 15:10:59 +00:00
Andrey V. Elsukov
64d63b1e03 Add ifaddr_event_ext event. It is similar to ifaddr_event, but the
handler receives the type of event IFADDR_EVENT_ADD/IFADDR_EVENT_DEL,
and the pointer to ifaddr. Also ifaddr_event now is implemented using
ifaddr_event_ext handler.

MFC after:	3 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D17100
2018-10-21 15:02:06 +00:00
Michael Tuexen
93899d10b4 The handling of RST segments in the SYN-RCVD state exists in the
code paths. Both are not consistent and the one on the syn cache code
does not conform to the relevant specifications (Page 69 of RFC 793
and Section 4.2 of RFC 5961).

This patch fixes this:
* The sequence numbers checks are fixed as specified on
  page Page 69 RFC 793.
* The sysctl variable net.inet.tcp.insecure_rst is now honoured
  and the behaviour as specified in Section 4.2 of RFC 5961.

Approved by:		re (gjb@)
Reviewed by:		bz@, glebius@, rrs@,
Differential Revision:	https://reviews.freebsd.org/D17595
Sponsored by:		Netflix, Inc.
2018-10-18 19:21:18 +00:00
Jonathan T. Looney
ac75e35d85 In r338102, the TCP reassembly code was substantially restructured. Prior
to this change, the code sometimes used a temporary stack variable to hold
details of a TCP segment. r338102 stopped using the variable to hold
segments, but did not actually remove the variable.

Because the variable is no longer used, we can safely remove it.

Approved by:	re (gjb)
2018-10-16 14:41:09 +00:00
Bjoern A. Zeeb
4ba16a92c7 In udp_input() when walking the pcblist we can come across
an inp marked FREED after the epoch(9) changes.
Check once we hold the lock and skip the inp if it is the case.

Contrary to IPv6 the locking of the inp is outside the multicast
section and hence a single check seems to suffice.

PR:		232192
Reviewed by:	mmacy, markj
Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17540
2018-10-12 22:51:45 +00:00
Bjoern A. Zeeb
3afdfcaf33 r217592 moved the check for imo in udp_input() into the conditional block
but leaving the variable assignment outside the block, where it is no longer
used. Move both the variable and the assignment one block further in.

This should result in no functional changes. It will however make upcoming
changes slightly easier to apply.

Reviewed by:		markj, jtl, tuexen
Approved by:		re (kib)
Differential Revision:	https://reviews.freebsd.org/D17525
2018-10-12 11:30:46 +00:00
Jonathan T. Looney
13c6ba6d94 There are three places where we return from a function which entered an
epoch section without exiting that epoch section. This is bad for two
reasons: the epoch section won't exit, and we will leave the epoch tracker
from the stack on the epoch list.

Fix the epoch leak by making sure we exit epoch sections before returning.

Reviewed by:	ae, gallatin, mmacy
Approved by:	re (gjb, kib)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D17450
2018-10-09 13:26:06 +00:00
Michael Tuexen
3535cdc43e Avoid truncating unrecognised parameters when reporting them.
This resulted in sending malformed packets.

Approved by:		re (kib@)
MFC after:		1 week
2018-10-07 15:13:47 +00:00
Michael Tuexen
3924dfa721 Ensure that the ips_localout counter is incremented for
locally generated SCTP packets sent over IPv4. This make
the behaviour consistent with IPv6.

Reviewed by:		ae@, bz@, jtl@
Approved by:		re (kib@)
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D17406
2018-10-07 11:26:15 +00:00
Tom Jones
b6e870116f Convert UDP length to host byte order
When getting the number of bytes to checksum make sure to convert the UDP
length to host byte order when the entire header is not in the first mbuf.

Reviewed by: jtl, tuexen, ae
Approved by: re (gjb), jtl (mentor)
Differential Revision:  https://reviews.freebsd.org/D17357
2018-10-05 12:51:30 +00:00
Ryan Stone
083a010c62 Hold a write lock across udp_notify()
With the new route cache feature udp_notify() will modify the inp when it
needs to invalidate the route cache.  Ensure that we hold a write lock on
the inp before calling the function to ensure that multiple threads don't
race while trying to invalidate the cache (which previously lead to a page
fault).

Differential Revision: https://reviews.freebsd.org/D17246
Reviewed by: sbruno, bz, karels
Sponsored by: Dell EMC Isilon
Approved by:	re (gjb)
2018-10-04 22:03:58 +00:00
Michael Tuexen
15a087e551 Mitigate providing a timing signal if the COOKIE or AUTH
validation fails.
Thanks to jmg@ for reporting the issue, which was discussed in
https://admbugs.freebsd.org/show_bug.cgi?id=878

Approved by:            re (TBD@)
MFC after:              1 week
2018-10-01 14:05:31 +00:00
Michael Tuexen
9d2e3f14c4 After allocating chunks set the fields in a consistent way.
This removes two assignments for the flags field being done
twice and adds one, which was missing.
Thanks to Felix Weinrank for reporting the issue he found
by using fuzz testing of the userland stack.

Approved by:            re (kib@)
MFC after:              1 week
2018-10-01 13:09:18 +00:00
Andrey V. Elsukov
384a5c3c28 Add INP_INFO_WUNLOCK_ASSERT() macro and use it instead of
INP_INFO_UNLOCK_ASSERT() in TCP-related code. For encapsulated traffic
it is possible, that the code is running in net_epoch_preempt section,
and INP_INFO_UNLOCK_ASSERT() is very strict assertion for such case.

PR:		231428
Reviewed by:	mmacy, tuexen
Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17335
2018-10-01 10:46:00 +00:00
Michael Tuexen
1b084a5e5e Plug mbuf leak in the SCTP input path in an error case.
Approved by:            re (kib@)
MFC after:              1 week
CID:			749312
2018-09-30 21:54:02 +00:00