Commit Graph

78 Commits

Author SHA1 Message Date
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Gleb Smirnoff
a68cc38879 Mechanical cleanup of epoch(9) usage in network stack.
- Remove macros that covertly create epoch_tracker on thread stack. Such
  macros a quite unsafe, e.g. will produce a buggy code if same macro is
  used in embedded scopes. Explicitly declare epoch_tracker always.

- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
  IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
  locking macros to what they actually are - the net_epoch.
  Keeping them as is is very misleading. They all are named FOO_RLOCK(),
  while they no longer have lock semantics. Now they allow recursion and
  what's more important they now no longer guarantee protection against
  their companion WLOCK macros.
  Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.

This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.

Discussed with:	jtl, gallatin
2019-01-09 01:11:19 +00:00
Andrew Turner
5f901c92a8 Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by:	bz
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16147
2018-07-24 16:35:52 +00:00
Andrey V. Elsukov
acf673edf0 Move invoking of callout_stop(&lle->lle_timer) into llentry_free().
This deduplicates the code a bit, and also implicitly adds missing
callout_stop() to in[6]_lltable_delete_entry() functions.

PR:		209682, 225927
Submitted by:	hselasky (previous version)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D4605
2018-07-17 11:33:23 +00:00
Matt Macy
0f8d79d977 CK: update consumers to use CK macros across the board
r334189 changed the fields to have names distinct from those in queue.h
in order to expose the oversights as compile time errors
2018-05-24 23:21:23 +00:00
Matt Macy
4f6c66cc9c UDP: further performance improvements on tx
Cumulative throughput while running 64
  netperf -H $DUT -t UDP_STREAM -- -m 1
on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps

Single stream throughput increases from 910kpps to 1.18Mpps

Baseline:
https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg

- Protect read access to global ifnet list with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg

- Protect short lived ifaddr references with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg

- Convert if_afdata read lock path to epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg

A fix for the inpcbhash contention is pending sufficient time
on a canary at LLNW.

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15409
2018-05-23 21:02:14 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
Ed Maste
3e85b721d6 Remove register keyword from sys/ and ANSIfy prototypes
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.

ANSIfy related prototypes while here.

Reviewed by:	cem, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10193
2017-05-17 00:34:34 +00:00
Andrey V. Elsukov
199511bcdd Make LLTABLE list lock private for if_llatbl.c
Rename lock and macros to reflect that it protects V_lltables list.
2016-10-11 17:41:13 +00:00
Pedro F. Giffuni
a4641f4eaa sys/net*: minor spelling fixes.
No functional change.
2016-05-03 18:05:43 +00:00
Alexander V. Chernikov
4fb3a8208c Implement interface link header precomputation API.
Add if_requestencap() interface method which is capable of calculating
  various link headers for given interface. Right now there is support
  for INET/INET6/ARP llheader calculation (IFENCAP_LL type request).
  Other types are planned to support more complex calculation
  (L2 multipath lagg nexthops, tunnel encap nexthops, etc..).

Reshape 'struct route' to be able to pass additional data (with is length)
  to prepend to mbuf.

These two changes permits routing code to pass pre-calculated nexthop data
  (like L2 header for route w/gateway) down to the stack eliminating the
  need for other lookups. It also brings us closer to more complex scenarios
  like transparently handling MPLS nexthops and tunnel interfaces.
  Last, but not least, it removes layering violation introduced by flowtable
  code (ro_lle) and simplifies handling of existing if_output consumers.

ARP/ND changes:
Make arp/ndp stack pre-calculate link header upon installing/updating lle
  record. Interface link address change are handled by re-calculating
  headers for all lles based on if_lladdr event. After these changes,
  arpresolve()/nd6_resolve() returns full pre-calculated header for
  supported interfaces thus simplifying if_output().
Move these lookups to separate ether_resolve_addr() function which ether
  returs error or fully-prepared link header. Add <arp|nd6_>resolve_addr()
  compat versions to return link addresses instead of pre-calculated data.

BPF changes:
Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT.
Despite the naming, both of there have ther header "complete". The only
  difference is that interface source mac has to be filled by OS for
  AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside
  BPF and not pollute if_output() routines. Convert BPF to pass prepend data
  via new 'struct route' mechanism. Note that it does not change
  non-optimized if_output(): ro_prepend handling is purely optional.
Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI.
  It is not needed for ethernet anymore. The only remaining FDDI user is
  dev/pdq mostly untouched since 2007. FDDI support was eliminated from
  OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65).

Flowtable changes:
  Flowtable violates layering by saving (and not correctly managing)
  rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated
  header data from that lle.

Differential Revision:	https://reviews.freebsd.org/D4102
2015-12-31 05:03:27 +00:00
Bjoern A. Zeeb
76d68eccbd Simplify bringup order by removing a SYSINIT making it a static list
initialization.

Mfp4 @180384,180385:

  There is no need for a dedicated SYSINIT here.  The
  list can be initialized statically.

  Sponsored by:		CK Software GmbH
Sponsored by:		The FreeBSD Foundation
MFC after:		2 weeks
Reviewed by:		gnn
Differential Revision:	https://reviews.freebsd.org/D4528
2015-12-22 14:57:04 +00:00
Alexander V. Chernikov
12cb7521c2 Remove LLE read lock from IPv6 fast path.
LLE structure is mostly unchanged during its lifecycle: there are only 2
things relevant for fast path lookup code:
1) link-level address change. Since r286722, these updates are performed
  under AFDATA WLOCK.
2) Some sort of feedback indicating that this particular entry is used so
  we send NS to perform reachability verification instead of expiring entry.
  The only signal that is needed from fast path is something like binary
  yes/no.
The latter is solved by the following changes:

Special r_skip_req (introduced in D3688) value is used for fast path feedback.
  It is read lockless by fast path, but updated under req_mutex mutex. If this
  field is non-zero, then fast path will acquire lock and set it back to 0.

After transitioning to STALE state, callout timer is armed to run each
  V_nd6_delay seconds to make sure that if packet was transmitted at the start
  of given interval, we would be able to switch to PROBE state in V_nd6_delay
  seconds as user expects.
(in STALE state) timer is rescheduled until original V_nd6_gctimer expires
  keeping lle in STALE state (remaining timer value stored in lle_remtime).
(in STALE state) timer is rescheduled if packet was transmitted less that
  V_nd6_delay seconds ago to make sure we transition to PROBE state exactly
  after V_n6_delay seconds.

As a result, all packets towards lle in REACHABLE/STALE/PROBE states are handled
  by fast path without acquiring lle read lock.

Differential Revision:		https://reviews.freebsd.org/D3780
2015-12-13 07:39:49 +00:00
Alexander V. Chernikov
f8aee88f0b Remove LLE read lock from IPv4 fast path.
LLE structure is mostly unchanged during its lifecycle.
To be more specific, there are 2 things relevant for fast path
  lookup code:
1) link-level address change. Since r286722, these updates are performed
  under AFDATA WLOCK.
2) Some sort of feedback indicating that this particular entry is used so
  we re-send arp request to perform reachability verification instead of
  expiring entry. The only signal that is needed from fast path is something
  like binary yes/no.

The latter is solved by the following changes:
1) introduce special r_skip_req field which is read lockless by fast path,
  but updated under (new) req_mutex mutex. If this field is non-zero, then
  fast path will acquire lock and set it back to 0.
2) introduce simple state machine: incomplete->reachable<->verify->deleted.
  Before that we implicitely had incomplete->reachable->deleted state machine,
  with V_arpt_keep between "reachable" and "deleted". Verification was performed
  in runtime 5 seconds before V_arpt_keep expire.
  This is changed to "change state to verify 5 seconds before V_arpt_keep,
  set r_skip_req to non-zero value and check it every second". If the value
  is zero - then send arp verification probe.
These changes do not introduce any signifficant control plane overhead:
  typically lle callout timer would fire 1 time more each V_arpt_keep (1200s)
  for used lles and up to arp_maxtries (5) for dead lles.

As a result, all packets towards "reachable" lle are handled by fast path without
acquiring lle read lock.

Additional "req_mutex" is needed because callout / arpresolve_slow() or eventhandler
  might keep LLE lock for signifficant amount of time, which might not be feasible
  for fast path locking (e.g. having rmlock as ether AFDATA or lltable own lock).

Differential Revision:	https://reviews.freebsd.org/D3688
2015-12-05 09:50:37 +00:00
Randall Stewart
7c4676ddee This fixes several places where callout_stops return is examined. The
new return codes of -1 were mistakenly being considered "true". Callout_stop
now returns -1 to indicate the callout had either already completed or
was not running and 0 to indicate it could not be stopped.  Also update
the manual page to make it more consistent no non-zero in the callout_stop
or callout_reset descriptions.

MFC after:	1 Month with associated callout change.
2015-11-13 22:51:35 +00:00
Alexander V. Chernikov
ddd208f7ad Unify setting lladdr for AF_INET[6]. 2015-11-07 11:12:00 +00:00
Alexander V. Chernikov
1558cb2448 Eliminate nd6_nud_hint() and its TCP bindings.
Initially function was introduced in r53541 (KAME initial commit) to
  "provide hints from upper layer protocols that indicate a connection
  is making "forward progress"" (quote from RFC 2461 7.3.1 Reachability
  Confirmation).
However, it was converted to do nothing (e.g. just return) in r122922
  (tcp_hostcache implementation) back in 2003. Some defines were moved
  to tcp_var.h in r169541. Then, it was broken (for non-corner cases)
  by r186119 (L2<>L3 split) in 2008 (NULL ifp in nd6_lookup). So,
  right now this code is broken and has no "real" base users.

Differential Revision:	https://reviews.freebsd.org/D3699
2015-09-27 05:29:34 +00:00
Alexander V. Chernikov
d3cdb71655 * Require explicitl lle unlink prior to calling llentry_delete().
This one slightly decreases time of holding afdata wlock.
* While here, make nd6_free() return void. No one has used its return value
  since r186119.
2015-09-15 06:48:19 +00:00
Alexander V. Chernikov
3e7a2321e3 * Do more fine-grained locking: call eventhandlers/free_entry
without holding afdata wlock
* convert per-af delete_address callback to global lltable_delete_entry() and
  more low-level "delete this lle" per-af callback
* fix some bugs/inconsistencies in IPv4/IPv6 ifscrub procedures

Sponsored by:		Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D3573
2015-09-14 16:48:19 +00:00
Alexander V. Chernikov
3b0fd911fa Simplify lla_rt_output()/nd6_add_ifa_lle() by setting lle state in
alloc handler, based on flags.
2015-08-31 05:03:36 +00:00
Alexander V. Chernikov
5a2555160f * Split allocation and table linking for lle's.
Before that, the logic besides lle_create() was the following:
  return existing if found, create if not. This behaviour was error-prone
  since we had to deal with 'sudden' static<>dynamic lle changes.
  This commit fixes bunch of different issues like:
  - refcount leak when lle is converted to static.
    Simple check case:
    console 1:
    while true;
      do for i in `arp -an|awk '$4~/incomp/{print$2}'|tr -d '()'`;
        do arp -s $i 00:22:44:66:88:00 ; arp -d $i;
      done;
    done
   console 2:
    ping -f any-dead-host-in-L2
   console 3:
    # watch for memory consumption:
    vmstat -m | awk '$1~/lltable/{print$2}'
  - possible problems in arptimer() / nd6_timer() when dropping/reacquiring
   lock.
  New logic explicitly handles use-or-create cases in every lla_create
  user. Basically, most of the changes are purely mechanical. However,
  we explicitly avoid using existing lle's for interface/static LLE records.
* While here, call lle_event handlers on all real table lle change.
* Create lltable_free_entry() calling existing per-lltable
  lle_free_t callback for entry deletion
2015-08-20 12:05:17 +00:00
Alexander V. Chernikov
0447c1367a Use single 'lle_timer' callout in lltable instead of
two different names of the same timer.
2015-08-11 12:38:54 +00:00
Alexander V. Chernikov
41cb42a633 MFP r276712.
* Split lltable_init() into lltable_allocate_htbl() (alloc
  hash table with default callbacks) and lltable_link() (
  links any lltable to the list).
* Switch from LLTBL_HASHTBL_SIZE to per-lltable hash size field.
* Move lltable setup to separate functions in in[6]_domifattach.
2015-08-11 05:51:00 +00:00
Alexander V. Chernikov
11cdad9873 Partially merge r274887,r275334,r275577,r275578,r275586 to minimize
differences between projects/routing and HEAD.

This commit tries to keep code logic the same while changing underlying
code to use unified callbacks.

* Add llt_foreach_entry method to traverse all entries in given llt
* Add llt_dump_entry method to export particular lle entry in sysctl/rtsock
  format (code is not indented properly to minimize diff). Will be fixed
  in the next commits.
* Add llt_link_entry/llt_unlink_entry methods to link/unlink particular lle.
* Add llt_fill_sa_entry method to export address in the lle to sockaddr
  format.
* Add llt_hash method to use in generic hash table support code.
* Add llt_free_entry method which is used in llt_prefix_free code.

* Prepare for fine-grained locking by separating lle unlink and deletion in
  lltable_free() and lltable_prefix_free().

* Provide lltable_get<ifp|af>() functions to reduce direct 'struct lltable'
 access by external callers.

* Remove @llt agrument from lle_free() lle callback since it was unused.
* Temporarily add L3_CADDR() macro for 'const' sockaddr typecasting.
* Switch to per-af hashing code.
* Rename LLE_FREE_LOCKED() callback from in[6]_lltable_free() to
  in_[6]lltable_destroy() to avoid clashing with llt_free_entry() method.
  Update description from these functions.
* Use unified lltable_free_entry() function instead of per-af one.

Reviewed by:	ae
2015-08-10 12:03:59 +00:00
Alexander V. Chernikov
3a7498636a * Allocate hash tables separately
* Make llt_hash() callback more flexible
* Default hash size and hashing method is now per-af
* Move lltable allocation to separate function
2015-01-05 17:23:02 +00:00
Alexander V. Chernikov
b44a7d5d87 * Use unified code for deleting entry by sockaddr instead of per-af one.
* Remove now unused llt_delete_addr callback.
2015-01-03 19:09:06 +00:00
Alexander V. Chernikov
20dd899505 * Hide lltable implementation details in if_llatbl_var.h
* Make most of lltable_* methods 'normal' functions instead of inline
* Add lltable_get_<af|ifp>() functions to access given lltable fields
* Temporarily resurrect nd6_lookup() function
2015-01-03 16:04:28 +00:00
Alexander V. Chernikov
d82ed5051c Simplify lle lookup/create api by using addresses instead of sockaddrs. 2014-12-08 23:23:53 +00:00
Alexander V. Chernikov
73b52ad896 Use llt_prepare_static_entry method to prepare valid per-af static entry. 2014-12-07 23:59:44 +00:00
Alexander V. Chernikov
0368226e65 * Retire abstract llentry_free() in favor of lltable_drop_entry_queue()
and explicit calls to RTENTRY_FREE_LOCKED()
* Use lltable_prefix_free() in arp_ifscrub to be consistent with nd6.
* Rename <lltable_|llt>_delete function to _delete_addr() to note that
   this function is used to external callers. Make this function maintain
   its own locking.
* Use lookup/unlink/clear call chain from internal callers instead of
    delete_addr.
* Fix LLE_DELETED flag handling
2014-12-07 23:08:07 +00:00
Alexander V. Chernikov
721cd2e032 Do not enforce particular lle storage scheme:
* move lltable allocation to per-domain callbacks.
* make llentry_link/unlink functions overridable llt methods.
* make hash table traversal another overridable llt method.
2014-12-07 17:32:06 +00:00
Alexander V. Chernikov
a743ccd468 * Add llt_clear_entry() callback which is able to do all lle
cleanup including unlinking/freeing
* Relax locking in lltable_prefix_free_af/lltable_free
* Do not pass @llt to lle free callback: it is always NULL now.
* Unify arptimer/nd6_llinfo_timer: explicitly unlock lle avoiding
   unlock/lock sequinces
* Do not pass unlocked lle to nd6_ns_output(): add nd6_llinfo_get_holdsrc()
   to retrieve preferred source address from lle hold queue and pass it
   instead of lle.
* Finally, make nd6_create() create and return unlocked lle
* Separate defrtr handling code from nd6_free():
   use nd6_check_del_defrtr() to check if we need to keep entry instead of
    performing GC,
   use nd6_check_recalc_defrtr() to perform actual recalc on lle removal.
* Move isRouter handling from nd6_cache_lladdr() to separate
   nd6_check_router()
* Add initial code to maintain lle runtime flags in sync.
2014-12-07 15:42:46 +00:00
Alexander V. Chernikov
ce313fdd71 * Unify lle table dump/prefix removal code.
* Rename lla_XXX -> lltable_XXX_lle to reduce number of name prefixes
  used by lltable code.
2014-11-30 14:35:01 +00:00
Alexander V. Chernikov
73d770287d Do more fine-grained lltable locking: use table runtime lock as rare
as we can.
2014-11-23 15:38:06 +00:00
Alexander V. Chernikov
9479029b1f * Add lltable llt_hash callback
* Move lltable items insertions/deletions to generic llt code.
2014-11-23 12:15:28 +00:00
Alexander V. Chernikov
27688dfe1d Temporarily revert r274774. 2014-11-22 17:57:54 +00:00
Alexander V. Chernikov
2e47d2f953 Mark ifaddr/rtsock static entries RLLE_VALID. 2014-11-21 23:37:59 +00:00
Alexander V. Chernikov
9883e41b4b Switch IF_AFDATA lock to rmlock 2014-11-21 02:28:56 +00:00
Alexander V. Chernikov
df629abf3e Rework LLE code locking:
* struct llentry is now basically split into 2 pieces:
  all fields within 64 bytes (amd64) are now protected by both
  ifdata lock AND lle lock, e.g. you require both locks to be held
  exclusively for modification. All data necessary for fast path
  operations is kept here. Some fields were added:
  - r_l3addr - makes lookup key liev within first 64 bytes.
  - r_flags - flags, containing pre-compiled decision whether given
    lle contains usable data or not. Current the only flag is RLLE_VALID.
  - r_len - prepend data len, currently unused
  - r_kick - used to provide feedback to control plane (see below).
  All other fields are protected by lle lock.
* Add simple state machine for ARP to handle "about to expire" case:
  Current model (for the fast path) is the following:
  - rlock afdata
  - find / rlock rte
  - runlock afdata
  - see if "expire time" is approaching
    (time_uptime + la->la_preempt > la->la_expire)
  - if true, call arprequest() and decrease la_preempt
  - store MAC and runlock rte
  New model (data plane):
  - rlock afdata
  - find rte
  - check if it can be used using r_* fields only
  - if true, store MAC
  - if r_kick field != 0 set it to 0.
  - runlock afdata
  New mode (control plane):
  - schedule arptimer to be called in (V_arpt_keep - V_arp_maxtries)
    seconds instead of V_arpt_keep.
  - on first timer invocation change state from ARP_LLINFO_REACHABLE
    to ARP_LLINFO_VERIFY, sets r_kick to 1 and shedules next call in
    V_arpt_rexmit (default to 1 sec).
  - on subsequent timer invocations in ARP_LLINFO_VERIFY state, checks
    for r_kick value: reschedule if not changed, and send arprequest()
    if set to zero (e.g. entry was used).
* Convert IPv4 path to use new single-lock approach. IPv6 bits to follow.
* Slow down in_arpinput(): now valid reply will (in most cases) require
  acquiring afdata WLOCK twice. This is requirement for storing changed
  lle data. This change will be slightly optimized in future.
* Provide explicit hash link/unlink functions for both ipv4/ipv6 code.
  This will probably be moved to generic lle code once we have per-AF
  hashing callback inside lltable.
* Perform lle unlink on deletion immediately instead of delaying it to
  the timer routine.
* Make r244183 more explicit: use new LLE_CALLOUTREF flag to indicate the
  presence of lle reference used for safe callout calls.
2014-11-16 20:12:49 +00:00
Alexander V. Chernikov
b4b1367ae4 * Move lle creation/deletion from lla_lookup to separate functions:
lla_lookup(LLE_CREATE) -> lla_create
  lla_lookup(LLE_DELETE) -> lla_delete
  Assume lla_create to return LLE_EXCLUSIVE lock for lle.
* Rework lla_rt_output to perform all lle changes under afdata WLOCK.
* change arp_ifscrub() ackquire afdata WLOCK, the same as arp_ifinit().
2014-11-15 18:54:07 +00:00
Andre Oppermann
f89d4c3acf Back out r249318, r249320 and r249327 due to a heisenbug most
likely related to a race condition in the ipi_hash_lock with
the exact cause currently unknown but under investigation.
2013-05-06 16:42:18 +00:00
Andre Oppermann
e8b3186b6a Change certain heavily used network related mutexes and rwlocks to
reside on their own cache line to prevent false sharing with other
nearby structures, especially for those in the .bss segment.

NB: Those mutexes and rwlocks with variables next to them that get
changed on every invocation do not benefit from their own cache line.
Actually it may be net negative because two cache misses would be
incurred in those cases.
2013-04-09 21:02:20 +00:00
Gleb Smirnoff
9711a168b9 Retire struct sockaddr_inarp.
Since ARP and routing are separated, "proxy only" entries
don't have any meaning, thus we don't need additional field
in sockaddr to pass SIN_PROXY flag.

New kernel is binary compatible with old tools, since sizes
of sockaddr_inarp and sockaddr_in match, and sa_family are
filled with same value.

The structure declaration is left for compatibility with
third party software, but in tree code no longer use it.

Reviewed by:	ru, andre, net@
2013-01-31 08:55:21 +00:00
Gleb Smirnoff
1910bfcba2 route_output() always supplies info with RTAX_GATEWAY member that
points to a sockaddr of AF_LINK family. Assert this instead of
checking.
2013-01-29 21:44:22 +00:00
Gleb Smirnoff
b1ec2940af Fix problem in r238990. The LLE_LINKED flag should be tested prior to
entering llentry_free(), and in case if we lose the race, we should simply
perform LLE_FREE_LOCKED(). Otherwise, if the race is lost by the thread
performing arptimer(), it will remove two references from the lle instead
of one.

Reported by:	Ian FREISLICH <ianf clue.co.za>
2012-12-13 11:11:15 +00:00
Gleb Smirnoff
ea53792942 Fix races between in_lltable_prefix_free(), lla_lookup(),
llentry_free() and arptimer():

o Use callout_init_rw() for lle timeout, this allows us safely
  disestablish them.
  - This allows us to simplify the arptimer() and make it
    race safe.
o Consistently use ifp->if_afdata_lock to lock access to
  linked lists in the lle hashes.
o Introduce new lle flag LLE_LINKED, which marks an entry that
  is attached to the hash.
  - Use LLE_LINKED to avoid double unlinking via consequent
    calls to llentry_free().
  - Mark lle with LLE_DELETED via |= operation istead of =,
    so that other flags won't be lost.
o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more
  consistent and provide more informative KASSERTs.

The patch is a collaborative work of all submitters and myself.

PR:		kern/165863
Submitted by:	Andrey Zonov <andrey zonov.org>
Submitted by:	Ryan Stone <rysto32 gmail.com>
Submitted by:	Eric van Gyzen <eric_van_gyzen dell.com>
2012-08-02 13:57:49 +00:00
Gleb Smirnoff
b1d86af706 The llentry_update() is used only by flowtable and the latter
always passes NULL pointer to it. Thus, code can be simplified
and function renamed to llentry_alloc() to match rtalloc().
2012-08-02 13:20:44 +00:00
Gleb Smirnoff
b9aee262e5 Some more whitespace cleanup. 2012-08-01 09:00:26 +00:00
Gleb Smirnoff
ea50c13ebe Some style(9) and whitespace changes.
Together with:	Andrey Zonov <andrey zonov.org>
2012-07-31 11:31:12 +00:00
Kip Macy
0fe48d670f A flowtable entry can continue referencing an llentry indefinitely if the entry is repeatedly
referenced within its timeout window. This change clears the LLE_VALID flag when an llentry
is removed from an interface's hash table and adds an extra check to the flowtable code
for the LLE_VALID flag in llentry to avoid retaining and using a stale reference.

Reviewed by:	qingli@
MFC after:	2 weeks
2012-01-26 20:02:40 +00:00