Commit Graph

382 Commits

Author SHA1 Message Date
delphij
67daa950c9 Fix !INET build. 2019-08-02 22:43:09 +00:00
rrs
8a34b17735 This adds the third step in getting BBR into the tree. BBR and
an updated rack depend on having access to the new
ratelimit api in this commit.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D20953
2019-08-01 14:17:31 +00:00
hselasky
1a5fd513af Convert all IPv4 and IPv6 multicast memberships into using a STAILQ
instead of a linear array.

The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.

To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi

All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.

This patch has been tested by pho@ .

Differential Revision: https://reviews.freebsd.org/D20080
Reviewed by:	markj @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-06-25 11:54:41 +00:00
jhb
5518ae8169 Restructure mbuf send tags to provide stronger guarantees.
- Perform ifp mismatch checks (to determine if a send tag is allocated
  for a different ifp than the one the packet is being output on), in
  ip_output() and ip6_output().  This avoids sending packets with send
  tags to ifnet drivers that don't support send tags.

  Since we are now checking for ifp mismatches before invoking
  if_output, we can now try to allocate a new tag before invoking
  if_output sending the original packet on the new tag if allocation
  succeeds.

  To avoid code duplication for the fragment and unfragmented cases,
  add ip_output_send() and ip6_output_send() as wrappers around
  if_output and nd6_output_ifp, respectively.  All of the logic for
  setting send tags and dealing with send tag-related errors is done
  in these wrapper functions.

  For pseudo interfaces that wrap other network interfaces (vlan and
  lagg), wrapper send tags are now allocated so that ip*_output see
  the wrapper ifp as the ifp in the send tag.  The if_transmit
  routines rewrite the send tags after performing an ifp mismatch
  check.  If an ifp mismatch is detected, the transmit routines fail
  with EAGAIN.

- To provide clearer life cycle management of send tags, especially
  in the presence of vlan and lagg wrapper tags, add a reference count
  to send tags managed via m_snd_tag_ref() and m_snd_tag_rele().
  Provide a helper function (m_snd_tag_init()) for use by drivers
  supporting send tags.  m_snd_tag_init() takes care of the if_ref
  on the ifp meaning that code alloating send tags via if_snd_tag_alloc
  no longer has to manage that manually.  Similarly, m_snd_tag_rele
  drops the refcount on the ifp after invoking if_snd_tag_free when
  the last reference to a send tag is dropped.

  This also closes use after free races if there are pending packets in
  driver tx rings after the socket is closed (e.g. from tcpdrop).

  In order for m_free to work reliably, add a new CSUM_SND_TAG flag in
  csum_flags to indicate 'snd_tag' is set (rather than 'rcvif').
  Drivers now also check this flag instead of checking snd_tag against
  NULL.  This avoids false positive matches when a forwarded packet
  has a non-NULL rcvif that was treated as a send tag.

- cxgbe was relying on snd_tag_free being called when the inp was
  detached so that it could kick the firmware to flush any pending
  work on the flow.  This is because the driver doesn't require ACK
  messages from the firmware for every request, but instead does a
  kind of manual interrupt coalescing by only setting a flag to
  request a completion on a subset of requests.  If all of the
  in-flight requests don't have the flag when the tag is detached from
  the inp, the flow might never return the credits.  The current
  snd_tag_free command issues a flush command to force the credits to
  return.  However, the credit return is what also frees the mbufs,
  and since those mbufs now hold references on the tag, this meant
  that snd_tag_free would never be called.

  To fix, explicitly drop the mbuf's reference on the snd tag when the
  mbuf is queued in the firmware work queue.  This means that once the
  inp's reference on the tag goes away and all in-flight mbufs have
  been queued to the firmware, tag's refcount will drop to zero and
  snd_tag_free will kick in and send the flush request.  Note that we
  need to avoid doing this in the middle of ethofld_tx(), so the
  driver grabs a temporary reference on the tag around that loop to
  defer the free to the end of the function in case it sends the last
  mbuf to the queue after the inp has dropped its reference on the
  tag.

- mlx5 preallocates send tags and was using the ifp pointer even when
  the send tag wasn't in use.  Explicitly use the ifp from other data
  structures instead.

- Sprinkle some assertions in various places to assert that received
  packets don't have a send tag, and that other places that overwrite
  rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.

Reviewed by:	gallatin, hselasky, rgrimes, ae
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20117
2019-05-24 22:30:40 +00:00
gallatin
63aec3850f Track TCP connection's NUMA domain in the inpcb
Drivers can now pass up numa domain information via the
mbuf numa domain field.  This information is then used
by TCP syncache_socket() to associate that information
with the inpcb. The domain information is then fed back
into transmitted mbufs in ip{6}_output(). This mechanism
is nearly identical to what is done to track RSS hash values
in the inp_flowid.

Follow on changes will use this information for lacp egress
port selection, binding TCP pacers to the appropriate NUMA
domain, etc.

Reviewed by:	markj, kib, slavash, bz, scottl, jtl, tuexen
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20028
2019-04-25 15:37:28 +00:00
jhb
24a54e79c9 Don't check the inp socket pointer in in_pcboutput_eagain.
Reviewed by:	hps (by saying it was ok to be removed)
MFC after:	1 month
Sponsored by:	Netflix
2019-03-29 19:47:42 +00:00
ae
50a8601b2e In r335015 PCB destroing was made deferred using epoch_call().
But ipsec_delete_pcbpolicy() uses some VNET-virtualized variables,
and thus it needs VNET context, that is missing during gtaskqueue
executing. Use inp_vnet context to set curvnet in in_pcbfree_deferred().

PR:		235684
MFC after:	1 week
2019-02-13 15:46:05 +00:00
glebius
6d8cc191f9 Mechanical cleanup of epoch(9) usage in network stack.
- Remove macros that covertly create epoch_tracker on thread stack. Such
  macros a quite unsafe, e.g. will produce a buggy code if same macro is
  used in embedded scopes. Explicitly declare epoch_tracker always.

- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
  IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
  locking macros to what they actually are - the net_epoch.
  Keeping them as is is very misleading. They all are named FOO_RLOCK(),
  while they no longer have lock semantics. Now they allow recursion and
  what's more important they now no longer guarantee protection against
  their companion WLOCK macros.
  Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.

This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.

Discussed with:	jtl, gallatin
2019-01-09 01:11:19 +00:00
mjg
7e31d1de7e Remove unused argument to priv_check_cred.
Patch mostly generated with cocinnelle:

@@
expression E1,E2;
@@

- priv_check_cred(E1,E2,0)
+ priv_check_cred(E1,E2)

Sponsored by:	The FreeBSD Foundation
2018-12-11 19:32:16 +00:00
markj
a74ba1d431 Clamp the INPCB port hash tables to IPPORT_MAX + 1 chains.
Memory beyond that limit was previously unused, wasting roughly 1MB per
8GB of RAM.  Also retire INP_PCBLBGROUP_PORTHASH, which was identical to
INP_PCBPORTHASH.

Reviewed by:	glebius
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17803
2018-12-05 17:06:00 +00:00
markj
5c563658ea Plug some networking sysctl leaks.
Various network protocol sysctl handlers were not zero-filling their
output buffers and thus would export uninitialized stack memory to
userland.  Fix a number of such handlers.

Reported by:	Thomas Barabosch, Fraunhofer FKIE
Reviewed by:	tuexen
MFC after:	3 days
Security:	kernel memory disclosure
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18301
2018-11-22 20:49:41 +00:00
markj
96786af957 Remove redundant checks for a NULL lbgroup table.
No functional change intended.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17108
2018-11-01 15:52:49 +00:00
markj
fbdfd87a7a Improve style in in_pcbinslbgrouphash() and related subroutines.
No functional change intended.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17107
2018-11-01 15:51:49 +00:00
markj
6131d60f6a Fix synchronization of LB group access.
Lookups are protected by an epoch section, so the LB group linkage must
be a CK_LIST rather than a plain LIST.  Furthermore, we were not
deferring LB group frees, so in_pcbremlbgrouphash() could race with
readers and cause a use-after-free.

Reviewed by:	sbruno, Johannes Lundberg <johalun0@gmail.com>
Tested by:	gallatin
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17031
2018-09-10 19:00:29 +00:00
markj
eb3d0d016c Use ratecheck(9) in in_pcbinslbgrouphash().
Reviewed by:	bz, Johannes Lundberg <johalun0@gmail.com>
Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D17065
2018-09-07 21:11:41 +00:00
markj
4b466cd9a2 Fix style bugs in in_pcblookup_lbgroup().
No functional change intended.

Reviewed by:	bz, Johannes Lundberg <johalun0@gmail.com>
Approved by:	re (rgrimes)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17030
2018-09-05 15:04:11 +00:00
markj
b60ed05720 Use the correct malloc type in in_pcblbgroup_free().
Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
2018-09-03 17:39:09 +00:00
mmacy
99cec0a00c Fix in6_multi double free
This is actually several different bugs:
- The code is not designed to handle inpcb deletion after interface deletion
  - add reference for inpcb membership
- The multicast address has to be removed from interface lists when the refcount
  goes to zero OR when the interface goes away
  - decouple list disconnect from refcount (v6 only for now)
- ifmultiaddr can exist past being on interface lists
  - add flag for tracking whether or not it's enqueued
- deferring freeing moptions makes the incpb cleanup code simpler but opens the
  door wider still to races
  - call inp_gcmoptions synchronously after dropping the the inpcb lock

Fundamentally multicast needs a rewrite - but keep applying band-aids for now.

Tested by: kp
Reported by: novel, kp, lwhsu
2018-08-15 20:23:08 +00:00
andrew
a6605d2938 Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by:	bz
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16147
2018-07-24 16:35:52 +00:00
brooks
39f527e7ee Use uintptr_t alone when assigning to kvaddr_t variables.
Suggested by:	jhb
2018-07-10 13:03:06 +00:00
brooks
8baf738e84 Correct breakage on 32-bit platforms from r335979. 2018-07-06 10:03:33 +00:00
brooks
6615ed4c61 Make struct xinpcb and friends word-size independent.
Replace size_t members with ksize_t (uint64_t) and pointer members
(never used as pointers in userspace, but instead as unique
idenitifiers) with kvaddr_t (uint64_t). This makes the structs
identical between 32-bit and 64-bit ABIs.

On 64-bit bit systems, the ABI is maintained. On 32-bit systems,
this is an ABI breaking change. The ABI of most of these structs
was previously broken in r315662.  This also imposes a small API
change on userspace consumers who must handle kernel pointers
becoming virtual addresses.

PR:		228301 (exp-run by antoine)
Reviewed by:	jtl, kib, rwatson (various versions)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15386
2018-07-05 13:13:48 +00:00
mmacy
14de8a2820 epoch(9): allow preemptible epochs to compose
- Add tracker argument to preemptible epochs
- Inline epoch read path in kernel and tied modules
- Change in_epoch to take an epoch as argument
- Simplify tfb_tcp_do_segment to not take a ti_locked argument,
  there's no longer any benefit to dropping the pcbinfo lock
  and trying to do so just adds an error prone branchfest to
  these functions
- Remove cases of same function recursion on the epoch as
  recursing is no longer free.
- Remove the the TAILQ_ENTRY and epoch_section from struct
  thread as the tracker field is now stack or heap allocated
  as appropriate.

Tested by: pho and Limelight Networks
Reviewed by: kbowling at llnw dot com
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16066
2018-07-04 02:47:16 +00:00
sbruno
1dc47ad154 Reap unused variable and assignment that had no effect. Noted by cross
compiling with gcc on mips.

Reviewed by:	mmacy
2018-06-24 21:36:37 +00:00
mmacy
41c8895b78 in_pcblookup_hash: validate inp before return
Post r335356 it is possible to have an inpcb on the hash lists that is
partially torn down. Validate before using. Also as a side effect of this
change the lock ordering issue between hash lock and inpcb no longer exists
allowing some simplification.

Reported by:	pho@
2018-06-21 18:40:15 +00:00
mmacy
255aa2f16f Fix PCBGROUPS build post CK conversion of pcbinfo 2018-06-13 23:19:54 +00:00
mmacy
5fa208f76c Handle INP_FREED when looking up an inpcb
When hash table lookups are not serialized with in_pcbfree it will be
possible for callers to find an inpcb that has been marked free. We
need to check for this and return NULL.
2018-06-13 04:23:49 +00:00
mmacy
e34884b056 Defer inpcbport free in in_pcbremlists as well 2018-06-12 23:26:25 +00:00
mmacy
6e4e86f96e Defer inpcbport free until after a grace period has elapsed
This is a dependency for inpcbinfo rlock conversion to epoch
2018-06-12 22:18:27 +00:00
mmacy
1cbc14be82 mechanical CK macro conversion of inpcbinfo lists
This is a dependency for converting the inpcbinfo hash and info rlocks
to epoch.
2018-06-12 22:18:20 +00:00
mmacy
f2fc01c6c7 Defer inpcb deletion until after a grace period has elapsed
Deferring the actual free of the inpcb until after a grace
period has elapsed will allow us to convert the inpcbinfo
info and hash read locks to epoch.

Reviewed by: gallatin, jtl
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15510
2018-06-12 22:18:15 +00:00
sbruno
d0aeaa5af7 Load balance sockets with new SO_REUSEPORT_LB option.
This patch adds a new socket option, SO_REUSEPORT_LB, which allow multiple
programs or threads to bind to the same port and incoming connections will be
load balanced using a hash function.

Most of the code was copied from a similar patch for DragonflyBSD.

However, in DragonflyBSD, load balancing is a global on/off setting and can not
be set per socket. This patch allows for simultaneous use of both the current
SO_REUSEPORT and the new SO_REUSEPORT_LB options on the same system.

Required changes to structures:
Globally change so_options from 16 to 32 bit value to allow for more options.
Add hashtable in pcbinfo to hold all SO_REUSEPORT_LB sockets.

Limitations:
As DragonflyBSD, a load balance group is limited to 256 pcbs (256 programs or
threads sharing the same socket).

This is a substantially different contribution as compared to its original
incarnation at svn r332894 and reverted at svn r332967.  Thanks to rwatson@
for the substantive feedback that is included in this commit.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Obtained from:	DragonflyBSD
Relnotes:	Yes
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D11003
2018-06-06 15:45:57 +00:00
mmacy
816ba56980 in_pcbladdr: remove debug code that snuck in with ifa epoch conversion r334118 2018-05-27 06:47:09 +00:00
mmacy
ecd6e9d307 UDP: further performance improvements on tx
Cumulative throughput while running 64
  netperf -H $DUT -t UDP_STREAM -- -m 1
on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps

Single stream throughput increases from 910kpps to 1.18Mpps

Baseline:
https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg

- Protect read access to global ifnet list with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg

- Protect short lived ifaddr references with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg

- Convert if_afdata read lock path to epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg

A fix for the inpcbhash contention is pending sufficient time
on a canary at LLNW.

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15409
2018-05-23 21:02:14 +00:00
mmacy
482d0f575f inpcb: revert deferred inpcb free pending further review 2018-05-21 16:13:43 +00:00
mmacy
47771c5690 in(s)_moptions: free before tearing down inpcb 2018-05-20 20:08:21 +00:00
mmacy
7110d4d796 inpcb: defer destruction of inpcb until after a grace period has elapsed
in_pcbfree will remove the incpb from the list and release the rtentry
while the vnet is set, but the actual destruction will be deferred
until any threads in a (not yet used) epoch section, no longer potentially
have references.
2018-05-20 04:38:04 +00:00
mmacy
99ec59840b in_pcb: add helper for deferring inpcb rele calls from list functions 2018-05-20 02:17:30 +00:00
mmacy
ce91e745ec ip(6)_freemoptions: defer imo destruction to epoch callback task
Avoid the ugly unlock / lock of the inpcbinfo where we need to
figure out what kind of lock we hold by simply deferring the
operation to another context. (Also a small dependency for
converting the pcbinfo read lock to epoch)
2018-05-20 00:22:28 +00:00
mmacy
7aeac9ef18 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
mmacy
d4a327c789 Currently in_pcbfree will unconditionally wunlock the pcbinfo lock
to avoid a LOR on the multicast list lock in the freemoptions routines.
As it turns out, tcp_usr_detach can acquire the tcbinfo lock readonly.
Trying to wunlock the pcbinfo lock in that context has caused a number
of reported crashes.

This change unclutters in_pcbfree and moves the handling of wunlock vs
runlock of pcbinfo to the freemoptions routine.

Reported by:	mjg@, bde@, o.hartmann at walstatt.org
Approved by:	sbruno
2018-05-05 22:40:40 +00:00
shurd
7d4b8facc7 Separate list manipulation locking from state change in multicast
Multicast incorrectly calls in to drivers with a mutex held causing drivers
to have to go through all manner of contortions to use a non sleepable lock.
Serialize multicast updates instead.

Submitted by:	mmacy <mmacy@mattmacy.io>
Reviewed by:	shurd, sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14969
2018-05-02 19:36:29 +00:00
sbruno
257e6e5563 Revert r332894 at the request of the submitter.
Submitted by:	Johannes Lundberg <johalun0_gmail.com>
Sponsored by:	Limelight Networks
2018-04-24 19:55:12 +00:00
sbruno
bbf7d4dd03 Load balance sockets with new SO_REUSEPORT_LB option
This patch adds a new socket option, SO_REUSEPORT_LB, which allow multiple
programs or threads to bind to the same port and incoming connections will be
load balanced using a hash function.

Most of the code was copied from a similar patch for DragonflyBSD.

However, in DragonflyBSD, load balancing is a global on/off setting and can not
be set per socket. This patch allows for simultaneous use of both the current
SO_REUSEPORT and the new SO_REUSEPORT_LB options on the same system.

Required changes to structures
Globally change so_options from 16 to 32 bit value to allow for more options.
Add hashtable in pcbinfo to hold all SO_REUSEPORT_LB sockets.

Limitations
As DragonflyBSD, a load balance group is limited to 256 pcbs
(256 programs or threads sharing the same socket).

Submitted by:	Johannes Lundberg <johanlun0@gmail.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D11003
2018-04-23 19:51:00 +00:00
rrs
863f90dbfa This commit brings in the TCP high precision timer system (tcp_hpts).
It is the forerunner/foundational work of bringing in both Rack and BBR
which use hpts for pacing out packets. The feature is optional and requires
the TCPHPTS option to be enabled before the feature will be active. TCP
modules that use it must assure that the base component is compile in
the kernel in which they are loaded.

MFC after:	Never
Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D15020
2018-04-19 13:37:59 +00:00
jtl
15af8ae175 Check that in_pcbfree() is only called once for each PCB. If that
assumption is violated, "bad things" could follow.

I believe such an assert would have detected some of the problems jch@
was chasing in PR 203175 (see r307551).  We also use it in our internal
TCP development efforts.  And, in case a bug does slip through to
released code, this change silently ignores subsequent calls to
in_pcbfree().

Reviewed by:	rrs
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D14990
2018-04-06 16:48:11 +00:00
np
82a840f165 Fix RSS build (broken in r331309).
Sponsored by:	Chelsio Communications
2018-03-29 19:48:17 +00:00
jtl
ee029a5d0b If the INP lock is uncontested, avoid taking a reference and jumping
through the lock-switching hoops.

A few of the INP lookup operations that lock INPs after the lookup do
so using this mechanism (to maintain lock ordering):

1. Lock lookup structure.
2. Find INP.
3. Acquire reference on INP.
4. Drop lock on lookup structure.
5. Acquire INP lock.
6. Drop reference on INP.

This change provides a slightly shorter path for cases where the INP
lock is uncontested:

1. Lock lookup structure.
2. Find INP.
3. Try to acquire the INP lock.
4. If successful, drop lock on lookup structure.

Of course, if the INP lock is contested, the functions will need to
revert to the previous way of switching locks safely.

This saves a few atomic operations when the INP lock is uncontested.

Discussed with:	gallatin, rrs, rwatson
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D12911
2018-03-21 15:54:46 +00:00
rstone
c0c5474ab0 Reduce code duplication for inpcb route caching
Add a new macro to clear both the L3 and L2 route caches, to
hopefully prevent future instances where only the L3 cache was
cleared when both should have been.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13989
Reviewed by:	karels
2018-01-23 03:15:39 +00:00
pfg
4736ccfd9c sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00