Commit Graph

26 Commits

Author SHA1 Message Date
Navdeep Parhar
32d2623ae2 Add the ability to look up the 3b PCP of a VLAN interface. Use it in
toe_l2_resolve to fill up the complete vtag and not just the vid.

Reviewed by:	kib@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D16752
2018-08-16 23:46:38 +00:00
Navdeep Parhar
a64564109a Add a hook to allow the toedev handling an offloaded connection to
provide accurate TCP_INFO.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14816
2018-04-03 01:08:54 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
George V. Neville-Neil
6d76822688 This change re-adds L2 caching for TCP and UDP, as originally added in D4306
but removed due to other changes in the system. Restore the llentry pointer
to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as
appropriate.

Submitted by:	Mike Karels
Differential Revision:	https://reviews.freebsd.org/D6262
2016-06-02 17:51:29 +00:00
Gleb Smirnoff
8ec07310fa These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
Alexander V. Chernikov
f2b2e77a41 (Temporarily) remove route_redirect_event eventhandler.
Such handler should pass different set of variables, instead
  of directly providing 2 locked route entries.
Given that it hasn't been really used since at least 2012, remove
  current code.
Will re-add it after finishing most major routing-related changes.

Discussed with:	np
2016-01-09 06:26:40 +00:00
Alexander V. Chernikov
4fb3a8208c Implement interface link header precomputation API.
Add if_requestencap() interface method which is capable of calculating
  various link headers for given interface. Right now there is support
  for INET/INET6/ARP llheader calculation (IFENCAP_LL type request).
  Other types are planned to support more complex calculation
  (L2 multipath lagg nexthops, tunnel encap nexthops, etc..).

Reshape 'struct route' to be able to pass additional data (with is length)
  to prepend to mbuf.

These two changes permits routing code to pass pre-calculated nexthop data
  (like L2 header for route w/gateway) down to the stack eliminating the
  need for other lookups. It also brings us closer to more complex scenarios
  like transparently handling MPLS nexthops and tunnel interfaces.
  Last, but not least, it removes layering violation introduced by flowtable
  code (ro_lle) and simplifies handling of existing if_output consumers.

ARP/ND changes:
Make arp/ndp stack pre-calculate link header upon installing/updating lle
  record. Interface link address change are handled by re-calculating
  headers for all lles based on if_lladdr event. After these changes,
  arpresolve()/nd6_resolve() returns full pre-calculated header for
  supported interfaces thus simplifying if_output().
Move these lookups to separate ether_resolve_addr() function which ether
  returs error or fully-prepared link header. Add <arp|nd6_>resolve_addr()
  compat versions to return link addresses instead of pre-calculated data.

BPF changes:
Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT.
Despite the naming, both of there have ther header "complete". The only
  difference is that interface source mac has to be filled by OS for
  AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside
  BPF and not pollute if_output() routines. Convert BPF to pass prepend data
  via new 'struct route' mechanism. Note that it does not change
  non-optimized if_output(): ro_prepend handling is purely optional.
Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI.
  It is not needed for ethernet anymore. The only remaining FDDI user is
  dev/pdq mostly untouched since 2007. FDDI support was eliminated from
  OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65).

Flowtable changes:
  Flowtable violates layering by saving (and not correctly managing)
  rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated
  header data from that lle.

Differential Revision:	https://reviews.freebsd.org/D4102
2015-12-31 05:03:27 +00:00
Randall Stewart
55bceb1e2b First cut of the modularization of our TCP stack. Still
to do is to clean up the timer handling using the async-drain.
Other optimizations may be coming to go with this. Whats here
will allow differnet tcp implementations (one included).
Reviewed by:	jtl, hiren, transports
Sponsored by:	Netflix Inc.
Differential Revision:	D4055
2015-12-16 00:56:45 +00:00
Alexander V. Chernikov
8e5aadb617 Replace toe_nd6_resolve() with nd6_resolve().
Reviewed by:	np
2015-09-22 19:05:44 +00:00
Alexander V. Chernikov
aa5f023eaf Unify nd6 state switching by using newly-created nd6_llinfo_setstate()
function. The change is mostly mechanical with the following exception:
Last piece of nd6_resolve_slow() was refactored: ND6_LLINFO_PERMANENT
  condition was removed as always-true, explicit ND6_LLINFO_NOSTATE ->
  ND6_LLINFO_INCOMPLETE state transition was removed as duplicate.

Reviewed by:	ae
Sponsored by:	Yandex LLC
2015-09-21 11:19:53 +00:00
Alexander V. Chernikov
26deb8826c Do not pass lle to nd6_ns_output(). Use newly-added
nd6_llinfo_get_holdsrc() to extract desired IPv6 source
  from holdchain and pass it to the nd6_ns_output().
2015-09-05 14:14:03 +00:00
Alexander V. Chernikov
5a2555160f * Split allocation and table linking for lle's.
Before that, the logic besides lle_create() was the following:
  return existing if found, create if not. This behaviour was error-prone
  since we had to deal with 'sudden' static<>dynamic lle changes.
  This commit fixes bunch of different issues like:
  - refcount leak when lle is converted to static.
    Simple check case:
    console 1:
    while true;
      do for i in `arp -an|awk '$4~/incomp/{print$2}'|tr -d '()'`;
        do arp -s $i 00:22:44:66:88:00 ; arp -d $i;
      done;
    done
   console 2:
    ping -f any-dead-host-in-L2
   console 3:
    # watch for memory consumption:
    vmstat -m | awk '$1~/lltable/{print$2}'
  - possible problems in arptimer() / nd6_timer() when dropping/reacquiring
   lock.
  New logic explicitly handles use-or-create cases in every lla_create
  user. Basically, most of the changes are purely mechanical. However,
  we explicitly avoid using existing lle's for interface/static LLE records.
* While here, call lle_event handlers on all real table lle change.
* Create lltable_free_entry() calling existing per-lltable
  lle_free_t callback for entry deletion
2015-08-20 12:05:17 +00:00
Alexander V. Chernikov
314294de5c Store addresses instead of sockaddrs inside llentry.
This permits us having all (not fully true yet) all the info
needed in lookup process in first 64 bytes of 'struct llentry'.

struct llentry layout:
BEFORE:
[rwlock .. state .. state .. MAC ] (lle+1) [sockaddr_in[6]]
AFTER
[ in[6]_addr MAC .. state .. rwlock ]

Currently, address part of struct llentry has only 16 bytes for the key.
However, lltable does not restrict any custom lltable consumers with long
keys use the previous approach (store key at (lle+1)).

Sponsored by:	Yandex LLC
2015-08-11 09:26:11 +00:00
Alexander V. Chernikov
e362cf0e9f MFP r274553:
* Move lle creation/deletion from lla_lookup to separate functions:
  lla_lookup(LLE_CREATE) -> lla_create
  lla_lookup(LLE_DELETE) -> lla_delete
lla_create now returns with LLE_EXCLUSIVE lock for lle.
* Provide typedefs for new/existing lltable callbacks.

Reviewed by:	ae
2015-08-08 17:48:54 +00:00
Julien Charbon
ff9b006d61 Decompose TCP INP_INFO lock to increase short-lived TCP connections scalability:
- The existing TCP INP_INFO lock continues to protect the global inpcb list
  stability during full list traversal (e.g. tcp_pcblist()).

- A new INP_LIST lock protects inpcb list actual modifications (inp allocation
  and free) and inpcb global counters.

It allows to use TCP INP_INFO_RLOCK lock in critical paths (e.g. tcp_input())
and INP_INFO_WLOCK only in occasional operations that walk all connections.

PR:			183659
Differential Revision:	https://reviews.freebsd.org/D2599
Reviewed by:		jhb, adrian
Tested by:		adrian, nitroboost-gmail.com
Sponsored by:		Verisign, Inc.
2015-08-03 12:13:54 +00:00
Alexander V. Chernikov
74860d4f7c Do not return unlocked/unreferenced lle in arpresolve/nd6_storelladdr -
return lle flags IFF needed.
Do not pass rte to arpresolve - pass is_gateway flag instead.
2014-11-27 23:06:25 +00:00
Alexander V. Chernikov
b4b1367ae4 * Move lle creation/deletion from lla_lookup to separate functions:
lla_lookup(LLE_CREATE) -> lla_create
  lla_lookup(LLE_DELETE) -> lla_delete
  Assume lla_create to return LLE_EXCLUSIVE lock for lle.
* Rework lla_rt_output to perform all lle changes under afdata WLOCK.
* change arp_ifscrub() ackquire afdata WLOCK, the same as arp_ifinit().
2014-11-15 18:54:07 +00:00
Alexander V. Chernikov
7b42f6fae2 * Convert TOE framework to use new routing api.
* Add fib6_lookup_nh_ext().
* Rename union structures:
  nhop64_basic -> nhopu_basic,
  nhop64_extended -> nhopu_extended
2014-10-25 18:25:00 +00:00
Navdeep Parhar
5acf7269da Catch up with r271119. 2014-09-24 20:12:40 +00:00
Gleb Smirnoff
c3322cb91c Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-28 07:29:16 +00:00
Navdeep Parhar
20be068c8a Teach toe_l2_resolve to resolve IPv6 destinations too.
Reviewed by:	bz@
2013-01-26 00:57:29 +00:00
Navdeep Parhar
b218348bc3 Teach toe_4tuple_check() to deal with IPv6 4-tuples too.
Reviewed by:	bz@
2013-01-25 20:45:24 +00:00
Kevin Lo
9823d52705 Revert previous commit...
Pointyhat to:	kevlo (myself)
2012-10-10 08:36:38 +00:00
Kevin Lo
a10cee30c9 Prefer NULL over 0 for pointers 2012-10-09 08:27:40 +00:00
Navdeep Parhar
06fd9875aa Correctly handle the case where an inp has already been dropped by the time
the TOE driver reports that an active open failed.  toe_connect_failed is
supposed to handle this but it should be provided the inpcb instead of the
tcpcb which may no longer be around.
2012-08-21 18:09:33 +00:00
Navdeep Parhar
09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00