Commit Graph

149 Commits

Author SHA1 Message Date
Gleb Smirnoff
b8a6e03fac Widen NET_EPOCH coverage.
When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.

However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.

Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.

On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().

This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.

Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.

This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.

Reviewed by:	gallatin, hselasky, cy, adrian, kristof
Differential Revision:	https://reviews.freebsd.org/D19111
2019-10-07 22:40:05 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Gleb Smirnoff
a68cc38879 Mechanical cleanup of epoch(9) usage in network stack.
- Remove macros that covertly create epoch_tracker on thread stack. Such
  macros a quite unsafe, e.g. will produce a buggy code if same macro is
  used in embedded scopes. Explicitly declare epoch_tracker always.

- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
  IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
  locking macros to what they actually are - the net_epoch.
  Keeping them as is is very misleading. They all are named FOO_RLOCK(),
  while they no longer have lock semantics. Now they allow recursion and
  what's more important they now no longer guarantee protection against
  their companion WLOCK macros.
  Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.

This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.

Discussed with:	jtl, gallatin
2019-01-09 01:11:19 +00:00
Andrew Turner
5f901c92a8 Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by:	bz
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16147
2018-07-24 16:35:52 +00:00
Brooks Davis
3a4fc8a8a1 Remove support for the Arcnet protocol.
While Arcnet has some continued deployment in industrial controls, the
lack of drivers for any of the PCI, USB, or PCIe NICs on the market
suggests such users aren't running FreeBSD.

Evidence in the PR database suggests that the cm(4) driver (our sole
Arcnet NIC) was broken in 5.0 and has not worked since.

PR:		182297
Reviewed by:	jhibbits, vangyzen
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15057
2018-04-13 21:18:04 +00:00
Brooks Davis
0437c8e3b1 Remove support for FDDI networks.
Defines in net/if_media.h remain in case code copied from ifconfig is in
use elsewere (supporting non-existant media type is harmless).

Reviewed by:	kib, jhb
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15017
2018-04-11 17:28:24 +00:00
Brooks Davis
69f0fecbd6 Remove infrastructure for token-ring networks.
Reviewed by:	cem, imp, jhb, jmallett
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14875
2018-03-28 23:33:26 +00:00
Jonathan T. Looney
20cb3e2557 This change adds a flag to the DAD entry to indicate whether it is
currently on the queue. This prevents accidentally doubly-removing a DAD
entry from the queue, while also simplifying some of the logic in
nd6_dad_stop().

Reviewed by:	ae, hrs, vangyzen
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D10943
2018-03-24 13:18:09 +00:00
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
Alexander Kabaev
bf51c9665d Silence clang analyzer false positive.
clang does not know that two lookup calls will return the same
pointer, so it assumes correctly that using the old pointer
after dropping the reference to it is a bit risky.
2017-12-23 16:45:26 +00:00
Andrey V. Elsukov
a406128960 Follow the RFC6980 and silently ignore following IPv6 NDP messages
that had the IPv6 fragmentation header:
 o  Neighbor Solicitation
 o  Neighbor Advertisement
 o  Router Solicitation
 o  Router Advertisement
 o  Redirect

Introduce M_FRAGMENTED mbuf flag, and set it after IPv6 fragment reassembly
is completed. Then check the presence of this flag in correspondig ND6
handling routines.

PR:		224247
MFC after:	2 weeks
2017-12-15 12:37:32 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Alan Somers
559b42968c Constrain IPv6 routes to single FIBs when net.add_addr_allfibs=0
sys/netinet6/icmp6.c
	Use the interface's FIB for source address selection in ICMPv6 error
	responses.

sys/netinet6/in6.c
	In in6_newaddrmsg, announce arrival of local addresses on the
	interface's FIB only.  In in6_lltable_rtcheck, use a per-fib ND6
	cache instead of a single cache.

sys/netinet6/in6_src.c
	In in6_selectsrc, use the caller's fib instead of the default fib.
	In in6_selectsrc_socket, remove a superfluous check.

sys/netinet6/nd6.c
	In nd6_lle_event, use the interface's fib for routing socket
	messages.  In nd6_is_new_addr_neighbor, check all FIBs when trying
	to determine whether an address is a neighbor.  Also, simplify the
	code for point to point interfaces.

sys/netinet6/nd6.h
sys/netinet6/nd6.c
sys/netinet6/nd6_rtr.c
	Make defrouter_select fib-aware, and make all of its callers pass in
	the interface fib.

sys/netinet6/nd6_nbr.c
	When inputting a Neighbor Solicitation packet, consider the
	interface fib instead of the default fib for DAD.  Output NS and
	Neighbor Advertisement packets on the correct fib.

sys/netinet6/nd6_rtr.c
	Allow installing the same host route on different interfaces in
	different FIBs.  If rt_add_addr_allfibs=0, only install or delete
	the prefix route on the interface fib.

tests/sys/netinet/fibs_test.sh
	Clear some expected failures, but add a skip for the newly revealed
	BUG217871.

PR:		196361
Submitted by:	Erick Turnquist <jhujhiti@adjectivism.org>
Reported by:	Jason Healy <jhealy@logn.net>
Reviewed by:	asomers
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D9451
2017-03-17 16:50:37 +00:00
Andriy Voskoboinyk
2bbd06fc33 Garbage collect IFT_IEEE80211 (but leave the define for possible reuse)
This interface type ("a parent interface of wlanX") is not used since
r287197

Reviewed by:	adrian, glebius
Differential Revision:	https://reviews.freebsd.org/D9308
2017-01-28 17:08:40 +00:00
Mark Johnston
0ed7d74424 nd6_dad_timer(): don't assert that the address is tentative.
It appears that this assertion can be tripped in some cases when
multiple interfaces are on the same link. Until this is resolved, revert a
part of r306305 and simply log a message if the DAD timer fires on a
non-tentative address.

Reported by:	jhb
X-MFC With:	r306305
2016-10-01 01:30:34 +00:00
Mark Johnston
970fe0938e Convert checks in nd6_dad_start() and nd6_dad_timer() to assertions.
In particular, these functions can assume they are operating on tentative
addresses.

MFC after:	2 weeks
2016-09-24 21:40:24 +00:00
Mark Johnston
0973ca723c Always start IPv6 DAD asynchronously.
Otherwise we transmit the first neighbour solicitation in the context of the
caller of nd6_dad_start(), which can easily result in lock recursion. When
DAD is to be started after some delay, we send the first NS from the DAD
callout handler, so just change the implementation to do this in the
non-delayed case as well.

Reviewed by:	ae, hrs
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D6639
2016-06-02 17:17:15 +00:00
Mark Johnston
ff63037da1 Modify defrouter_remove() to perform the router lookup before removal.
This allows some simplification of its callers. No functional change
intended.

Tested by:	Larry Rosenman (as part of a larger change)
MFC after:	1 month
2016-03-17 19:01:44 +00:00
Mark Johnston
4de485fe5f Lock the NDP default router list and count defrouter references.
This addresses a number of race conditions that can cause crashes as a
result of unsynchronized access to the list.

PR:		206904
Tested by:	Larry Rosenman <ler@lerctr.org>,
		Kevin Bowling <kevin.bowling@kev009.com>
MFC after:	2 months
Differential Revision: https://reviews.freebsd.org/D5315
2016-02-25 20:12:05 +00:00
Mark Johnston
c15064c27a Release the ref acquired in nd6_dad_find() if DAD is already in progress.
MFC after:	1 week
2016-02-18 00:00:51 +00:00
Mark Johnston
f60d595f34 Remove a bogus comment from nd6_na_input().
The splnet() call that it refers to has been removed, and a lock for the
default router list is in fact needed.

MFC after:	1 week
2016-02-12 21:01:53 +00:00
Mark Johnston
baebd3e54f Remove superfluous return statements from the neighbour discovery code.
MFC after:	1 week
2016-02-12 20:55:22 +00:00
Alexander V. Chernikov
601c0b8bcc Split in6_selectsrc() into in6_selectsrc_addr() and in6_selectsrc_socket().
in6_selectsrc() has 2 class of users: socket-based one (raw/udp/pcb/etc) and
  socket-less (ND code). The main reason for that change is inability to
  specify non-default FIB for callers w/o socket since (internally) inpcb
  is used to determine fib.

As as result, add 2 wrappers for in6_selectsrc() (making in6_selectsrc()
  static):
1) in6_selectsrc_socket() for the former class. Embed scope_ambiguous check
  along with returning hop limit when needed.
2) in6_selectsrc_addr() for the latter case. Add 'fibnum' argument and
  pass IPv6 address  w/ explicitly specified scope as separate argument.

Reviewed by:	ae (previous version)
2016-01-10 13:40:29 +00:00
Alexander V. Chernikov
9a1b64d5a0 Add rib_lookup_info() to provide API for retrieving individual route
entries data in unified format.

There are control plane functions that require information other than
  just next-hop data (e.g. individual rtentry fields like flags or
  prefix/mask). Given that the goal is to avoid rte reference/refcounting,
  re-use rt_addrinfo structure to store most rte fields. If caller wants
  to retrieve key/mask or gateway (which are sockaddrs and are allocated
  separately), it needs to provide sufficient-sized sockaddrs structures
  w/ ther pointers saved in passed rt_addrinfo.

Convert:
  * lltable new records checks (in_lltable_rtcheck(),
    nd6_is_new_addr_neighbor().
  * rtsock pre-add/change route check.
  * IPv6 NS ND-proxy check (RADIX_MPATH code was eliminated because
     1) we don't support RTF_ANNOUNCE ND-proxy for networks and there should
       not be multiple host routes for such hosts 2) if we have multiple
       routes we should inspect them (which is not done). 3) the entire idea
       of abusing KRT as storage for ND proxy seems odd. Userland programs
       should be used for that purpose).
2016-01-04 15:03:20 +00:00
Alexander V. Chernikov
357ce739b9 Remove 'struct route_int6' argument from in6_selectsrc() and
in6_selectif().

The main task of in6_selectsrc() is to return IPv6 SAS (along with
  output interface used for scope checks). No data-path code uses
  route argument for caching. The only users are icmp6 (reflect code),
  ND6 ns/na generation code. All this fucntions are control-plane, so
  there is no reason to try to 'optimize' something by passing cached
  route into to ip6_output(). Given that, simplify code by eliminating
  in6_selectsrc() 'struct route_in6' argument. Since in6_selectif() is
  used only by in6_selectsrc(), eliminate its 'struct route_in6' argument,
  too. While here, reshape rte-related code inside in6_selectif() to
  free lookup result immediately after saving all the needed fields.
2016-01-03 10:43:23 +00:00
Alexander V. Chernikov
9a7ee988b5 Use lltable_get_ifp() instead of direct access to lltable fields. 2016-01-01 12:35:33 +00:00
Alexander V. Chernikov
4fb3a8208c Implement interface link header precomputation API.
Add if_requestencap() interface method which is capable of calculating
  various link headers for given interface. Right now there is support
  for INET/INET6/ARP llheader calculation (IFENCAP_LL type request).
  Other types are planned to support more complex calculation
  (L2 multipath lagg nexthops, tunnel encap nexthops, etc..).

Reshape 'struct route' to be able to pass additional data (with is length)
  to prepend to mbuf.

These two changes permits routing code to pass pre-calculated nexthop data
  (like L2 header for route w/gateway) down to the stack eliminating the
  need for other lookups. It also brings us closer to more complex scenarios
  like transparently handling MPLS nexthops and tunnel interfaces.
  Last, but not least, it removes layering violation introduced by flowtable
  code (ro_lle) and simplifies handling of existing if_output consumers.

ARP/ND changes:
Make arp/ndp stack pre-calculate link header upon installing/updating lle
  record. Interface link address change are handled by re-calculating
  headers for all lles based on if_lladdr event. After these changes,
  arpresolve()/nd6_resolve() returns full pre-calculated header for
  supported interfaces thus simplifying if_output().
Move these lookups to separate ether_resolve_addr() function which ether
  returs error or fully-prepared link header. Add <arp|nd6_>resolve_addr()
  compat versions to return link addresses instead of pre-calculated data.

BPF changes:
Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT.
Despite the naming, both of there have ther header "complete". The only
  difference is that interface source mac has to be filled by OS for
  AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside
  BPF and not pollute if_output() routines. Convert BPF to pass prepend data
  via new 'struct route' mechanism. Note that it does not change
  non-optimized if_output(): ro_prepend handling is purely optional.
Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI.
  It is not needed for ethernet anymore. The only remaining FDDI user is
  dev/pdq mostly untouched since 2007. FDDI support was eliminated from
  OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65).

Flowtable changes:
  Flowtable violates layering by saving (and not correctly managing)
  rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated
  header data from that lle.

Differential Revision:	https://reviews.freebsd.org/D4102
2015-12-31 05:03:27 +00:00
Steven Hartland
d6e82913c1 Revert r292275 & r292379
glebius has concerns about these changes so reverting those can be discussed
and addressed.

Sponsored by:	Multiplay
2015-12-17 14:41:30 +00:00
Steven Hartland
3a909afe8e Fix issues introduced by r292275
* Fix panic for etherswitches which don't have a LLADDR.
* Disabled DELAY in unsolicited NDA, which needs further work.
* Fixed missing DELAY in carp_send_na.
* style(9) fix.

Reported by:	kp & melifaro
X-MFC-With:	r292275
MFC after:	1 month
Sponsored by:	Multiplay
2015-12-16 22:26:28 +00:00
Steven Hartland
52e53e2de0 Fix lagg failover due to missing notifications
When using lagg failover mode neither Gratuitous ARP (IPv4) or Unsolicited
Neighbour Advertisements (IPv6) are sent to notify other nodes that the
address may have moved.

This results is slow failover, dropped packets and network outages for the
lagg interface when the primary link goes down.

We now use the new if_link_state_change_cond with the force param set to
allow lagg to force through link state changes and hence fire a
ifnet_link_event which are now monitored by rip and nd6.

Upon receiving these events each protocol trigger the relevant
notifications:
* inet4 => Gratuitous ARP
* inet6 => Unsolicited Neighbour Announce

This also fixes the carp IPv6 NA's that stopped working after r251584 which
added the ipv6_route__llma route.

The new behavour can be controlled using the sysctls:
* net.link.ether.inet.arp_on_link
* net.inet6.icmp6.nd6_on_link

Also removed unused param from lagg_port_state and added descriptions for the
sysctls while here.

PR:		156226
MFC after:	1 month
Sponsored by:	Multiplay
Differential Revision:	https://reviews.freebsd.org/D4111
2015-12-15 16:02:11 +00:00
Alexander V. Chernikov
12cb7521c2 Remove LLE read lock from IPv6 fast path.
LLE structure is mostly unchanged during its lifecycle: there are only 2
things relevant for fast path lookup code:
1) link-level address change. Since r286722, these updates are performed
  under AFDATA WLOCK.
2) Some sort of feedback indicating that this particular entry is used so
  we send NS to perform reachability verification instead of expiring entry.
  The only signal that is needed from fast path is something like binary
  yes/no.
The latter is solved by the following changes:

Special r_skip_req (introduced in D3688) value is used for fast path feedback.
  It is read lockless by fast path, but updated under req_mutex mutex. If this
  field is non-zero, then fast path will acquire lock and set it back to 0.

After transitioning to STALE state, callout timer is armed to run each
  V_nd6_delay seconds to make sure that if packet was transmitted at the start
  of given interval, we would be able to switch to PROBE state in V_nd6_delay
  seconds as user expects.
(in STALE state) timer is rescheduled until original V_nd6_gctimer expires
  keeping lle in STALE state (remaining timer value stored in lle_remtime).
(in STALE state) timer is rescheduled if packet was transmitted less that
  V_nd6_delay seconds ago to make sure we transition to PROBE state exactly
  after V_n6_delay seconds.

As a result, all packets towards lle in REACHABLE/STALE/PROBE states are handled
  by fast path without acquiring lle read lock.

Differential Revision:		https://reviews.freebsd.org/D3780
2015-12-13 07:39:49 +00:00
Alexander V. Chernikov
ddd208f7ad Unify setting lladdr for AF_INET[6]. 2015-11-07 11:12:00 +00:00
Hiroki Sato
6401c828ce - Schedule DAD for IN6_IFF_TENTATIVE addresses in nd6_timer(). This
catches cases that DAD probes cannot be sent because of
  IFF_UP && !IFF_DRV_RUNNING.

- nd6_dad_starttimer() now calls nd6_dad_ns_output(), instead of
  calling it before nd6_dad_starttimer().

- Do not release an entry in dadq when a duplicate entry is being
  added.
2015-10-03 12:09:12 +00:00
Alexander V. Chernikov
1558cb2448 Eliminate nd6_nud_hint() and its TCP bindings.
Initially function was introduced in r53541 (KAME initial commit) to
  "provide hints from upper layer protocols that indicate a connection
  is making "forward progress"" (quote from RFC 2461 7.3.1 Reachability
  Confirmation).
However, it was converted to do nothing (e.g. just return) in r122922
  (tcp_hostcache implementation) back in 2003. Some defines were moved
  to tcp_var.h in r169541. Then, it was broken (for non-corner cases)
  by r186119 (L2<>L3 split) in 2008 (NULL ifp in nd6_lookup). So,
  right now this code is broken and has no "real" base users.

Differential Revision:	https://reviews.freebsd.org/D3699
2015-09-27 05:29:34 +00:00
Alexander V. Chernikov
aa5f023eaf Unify nd6 state switching by using newly-created nd6_llinfo_setstate()
function. The change is mostly mechanical with the following exception:
Last piece of nd6_resolve_slow() was refactored: ND6_LLINFO_PERMANENT
  condition was removed as always-true, explicit ND6_LLINFO_NOSTATE ->
  ND6_LLINFO_INCOMPLETE state transition was removed as duplicate.

Reviewed by:	ae
Sponsored by:	Yandex LLC
2015-09-21 11:19:53 +00:00
Alexander V. Chernikov
26deb8826c Do not pass lle to nd6_ns_output(). Use newly-added
nd6_llinfo_get_holdsrc() to extract desired IPv6 source
  from holdchain and pass it to the nd6_ns_output().
2015-09-05 14:14:03 +00:00
Hiroki Sato
fb583bd228 - Deprecate IN6_IFF_NODAD. It was used to prevent DAD on a loopback
interface but in6if_do_dad() already had a check for IFF_LOOPBACK.

- Remove in6if_do_dad() check in in6_broadcast_ifa().  An address
  which needs DAD always has IN6_IFF_TENTATIVE there.

- in6if_do_dad() now returns EAGAIN when the interface is not ready
  since DAD callout handler ignores such an interface.

- In DAD callout handler, mark an address as IN6_IFF_TENTATIVE
  when the interface has ND6_IFF_IFDISABLED.  And Do IFF_UP and
  IFF_DRV_RUNNING check consistently when DAD is required.

- draft-ietf-6man-enhanced-dad is now published as RFC 7527.

- Fix some typos.
2015-08-24 05:21:49 +00:00
Alexander V. Chernikov
314294de5c Store addresses instead of sockaddrs inside llentry.
This permits us having all (not fully true yet) all the info
needed in lookup process in first 64 bytes of 'struct llentry'.

struct llentry layout:
BEFORE:
[rwlock .. state .. state .. MAC ] (lle+1) [sockaddr_in[6]]
AFTER
[ in[6]_addr MAC .. state .. rwlock ]

Currently, address part of struct llentry has only 16 bytes for the key.
However, lltable does not restrict any custom lltable consumers with long
keys use the previous approach (store key at (lle+1)).

Sponsored by:	Yandex LLC
2015-08-11 09:26:11 +00:00
Hiroki Sato
59333867ff - Remove ND6_IFF_IGNORELOOP. This functionality was useless in practice
because a link where looped back NS messages are permanently observed
  does not work with either NDP or ARP for IPv4.

- draft-ietf-6man-enhanced-dad is now RFC 7527.

Discussed with:	hiren
MFC after:	3 days
2015-05-12 03:31:57 +00:00
Gleb Smirnoff
0fa5aacd8b Remove #ifdef IFT_FOO.
Submitted by:	Guy Yur <guyyur gmail.com>
2015-05-02 20:31:27 +00:00
Andrey V. Elsukov
efb19cf6db Fix the check for maximum mbuf's size needed to send ND6 NA and NS.
It is acceptable that the size can be equal to MCLBYTES. In the later
KAME's code this check has been moved under DIAGNOSTIC ifdef, because
the size of NA and NS is much smaller than MCLBYTES. So, it is safe to
replace the check with KASSERT.

PR:		199304
Discussed with:	glebius
MFC after:	1 week
2015-04-09 12:57:58 +00:00
Andrey V. Elsukov
ff9f2a36de To avoid a possible race, release the reference to ifa after return
from nd6_dad_na_input().

Submitted by:	Alexandre Martins
MFC after:	1 week
2015-03-19 00:04:25 +00:00
Hiroki Sato
23e9ffb0e1 - Implement loopback probing state in enhanced DAD algorithm.
- Add no_dad and ignoreloop per-IF knob.  no_dad disables DAD completely,
  and ignoreloop is to prevent infinite loop in loopback probing state when
  loopback is permanently expected.
2015-03-05 21:27:49 +00:00
Andrey V. Elsukov
1eef8a6c08 Create nd6_ns_output_fib() function with extra argument fibnum. Use it
to initialize mbuf's fibnum. Uninitialized fibnum value can lead to
panic in the routing code. Currently we use only RT_DEFAULT_FIB value
for initialization.

Differential Revision:	https://reviews.freebsd.org/D1998
Reviewed by:	hrs (previous version)
Sponsored by:	Yandex LLC
2015-03-03 10:50:03 +00:00
Hiroki Sato
8d56075939 Nonce has to be non-NULL for DAD even if net.inet6.ip6.dad_enhanced=0. 2015-03-03 04:28:19 +00:00
Hiroki Sato
11d8451df3 Implement Enhanced DAD algorithm for IPv6 described in
draft-ietf-6man-enhanced-dad-13.

This basically adds a random nonce option (RFC 3971) to NS messages
for DAD probe to detect a looped back packet.  This looped back packet
prevented DAD on some pseudo-interfaces which aggregates multiple L2 links
such as lagg(4).

The length of the nonce is set to 6 bytes.  This algorithm can be disabled by
setting net.inet6.ip6.dad_enhanced sysctl to 0 in a per-vnet basis.

Reported by:		hiren
Reviewed by:		ae
Differential Revision:	https://reviews.freebsd.org/D1835
2015-03-02 17:30:26 +00:00
Alexander V. Chernikov
d7968c29ec * Use newly-created nd6_grab_holdchain() function to retrieve lle
hold mbuf chain instead of calling full-blown nd6_output_lle()
  for each packet. This simplifies both callers and nd6_output_lle()
  implementation.
* Make nd6_output_lle() static and remove now-unused lle and chain
  arguments.
* Rename nd6_output_flush() -> nd6_flush_holdchain() to be consistent.
* Move all pre-send transmit hooks to newly-created nd6_output_ifp().
  Now nd6_output(), nd6_output_lle() and nd6_flush_holdchain() are using
  it to send mbufs to if_output.
* Remove SeND hook from nd6_na_input() because it was implemented
  incorrectly since the beginning (r211501):
  - it tagged initial input mbuf (m) instead of m_hold
  - tagging _all_ mbufs in holdchain seems to be wrong anyway.
2015-01-08 18:02:05 +00:00
Robert Watson
ed6a66ca6c To ease changes to underlying mbuf structure and the mbuf allocator, reduce
the knowledge of mbuf layout, and in particular constants such as M_EXT,
MLEN, MHLEN, and so on, in mbuf consumers by unifying various alignment
utility functions (M_ALIGN(), MH_ALIGN(), MEXT_ALIGN() in a single
M_ALIGN() macro, implemented by a now-inlined m_align() function:

- Move m_align() from uipc_mbuf.c to mbuf.h; mark as __inline.
- Reimplement M_ALIGN(), MH_ALIGN(), and MEXT_ALIGN() using m_align().
- Update consumers around the tree to simply use M_ALIGN().

This change eliminates a number of cases where mbuf consumers must be aware
of whether or not mbufs returned by the allocator use external storage, but
also assumptions about the size of the returned mbuf. This will make it
easier to introduce changes in how we use external storage, as well as
features such as variable-size mbufs.

Differential Revision:	https://reviews.freebsd.org/D1436
Reviewed by:	glebius, trasz, gnn, bz
Sponsored by:	EMC / Isilon Storage Division
2015-01-05 09:58:32 +00:00
Mark Johnston
a37271c3b8 Revert r275695: nd6_dad_find() was already correct.
Reported by:	ae, kib
Pointy hat to:	markj
2014-12-11 09:16:45 +00:00
Mark Johnston
97712e3efc Fix a bug in r266857: nd6_dad_find() must return NULL if it doesn't find
a matching element in the DAD queue.

Reported by:	Holger Hans Peter Freyther <holger@freyther.de>
MFC after:	3 days
2014-12-11 00:41:54 +00:00