Commit Graph

213945 Commits

Author SHA1 Message Date
cem
72e57e0f5f NTB: Abstract doorbell register access
The doorbell registers (and associated mask) are 16-bit on Xeon but
64-bit on SoC.  Abstract IO access to doorbell registers with
'db_ioread' and 'db_iowrite' (names and idea borrowed from the dual
BSD/GPL Linux driver).

Sponsored by:	EMC / Isilon Storage Division
2015-10-14 23:48:03 +00:00
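Note on 72e57e0f5f: the idea of hiding the doorbell register width behind a pair of accessors can be sketched as below. This is a minimal illustration, not the driver's code: the register space is modelled as a bytearray, and the device-type strings and the width table are assumptions made for the example. Only the names 'db_ioread' and 'db_iowrite' and the 16-bit/64-bit widths come from the commit message.

```python
import struct

# Doorbell width in bytes per device type (16-bit on Xeon, 64-bit on SoC,
# per the commit message). The dictionary keys are hypothetical labels.
DB_WIDTH_BYTES = {"xeon": 2, "soc": 8}
_FMT = {2: "<H", 8: "<Q"}  # little-endian unsigned 16-bit / 64-bit


def db_ioread(regs, dev_type, offset):
    """Read a doorbell register using the width that suits the device."""
    width = DB_WIDTH_BYTES[dev_type]
    return struct.unpack_from(_FMT[width], regs, offset)[0]


def db_iowrite(regs, dev_type, offset, value):
    """Write a doorbell register, truncating the value to the device width."""
    width = DB_WIDTH_BYTES[dev_type]
    mask = (1 << (8 * width)) - 1
    struct.pack_into(_FMT[width], regs, offset, value & mask)
```

Callers pass the same value type on both platforms and never need to know the register width.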
cem
05fa2677c4 if_ntb: MFV 3cc5ba19: Add alignment check to meet hardware requirement
Original Linux commit log:

The value of the NTB translate register must be BAR size aligned.
This alignment check makes sure that the DMA memory allocated has the
proper alignment. Another requirement for NTB to function properly with
memory window BAR size greater or equal to 4M is to use the CMA feature
in 3.16 kernel with the appropriate CONFIG_CMA_ALIGNMENT and
CONFIG_CMA_SIZE_MBYTES set.

Authored by:	Dave Jiang
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 23:47:52 +00:00
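Note on 05fa2677c4: a BAR-size alignment check of the kind described can be sketched as follows. The function name is hypothetical and the sketch assumes the BAR size is a power of two (as PCI BAR sizes are); it does not reproduce the driver's actual check.

```python
def is_bar_size_aligned(addr, bar_size):
    """Return True if addr is a multiple of bar_size (a power of two)."""
    if bar_size <= 0 or bar_size & (bar_size - 1):
        raise ValueError("bar_size must be a power of two")
    # For a power-of-two size, the low bits of an aligned address are zero.
    return addr & (bar_size - 1) == 0
```

For a 4M window, is_bar_size_aligned(0x800000, 0x400000) holds while is_bar_size_aligned(0x500000, 0x400000) does not.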
cem
1a9e30b488 NTB: MFV a1413cfb: correct the spread of queues over mw's
The detection of an uneven number of queues on the given memory windows
was not correct.  The mw_num is zero based and the mod should be
division to spread them evenly over the mw's.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 23:47:35 +00:00
cem
1a710218be NTB: Remap MSI-X messages over available slots
Remap MSI-X messages over available slots rather than falling back to
legacy INTx when fewer MSI-X slots are available than were requested.

N.B. the Linux driver does *not* do this.

To aid in testing, a tunable 'hw.ntb.force_remap_mode' has been added.
It defaults to off (0).  When the tunable is enabled and sufficient
slots were available, the driver restricts the number of slots by one
and remaps the MSI-X messages over the remaining slots.

In case this is actually not okay (as I don't yet have access to this
hardware to test), a tunable 'hw.ntb.prefer_intx_to_remap' has been
added.  It defaults to off (0).  When the tunable is enabled and fewer
slots are available than requested, fall back to legacy INTx mode rather
than attempting to remap MSI-X messages.

Suggested by:	jhb
Reviewed by:	jhb (earlier version)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 23:47:23 +00:00
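Note on 1a710218be: the decision described above (plain MSI-X, remapped MSI-X, or legacy INTx) can be sketched as below. The function name, the return values, and the round-robin assignment are assumptions made for illustration. The reading of "restricts the number of slots by one" as "one fewer than requested" is also an assumption. Message numbers are 1-based here.

```python
def plan_interrupts(desired, avail, force_remap_mode=False,
                    prefer_intx_to_remap=False):
    """Pick an interrupt mode and a slot-to-message assignment.

    desired: number of MSI-X slots the driver asked for
    avail:   number of MSI-X slots actually granted
    """
    # Testing aid: pretend one slot fewer was granted (assumed semantics).
    if force_remap_mode and avail >= desired:
        avail = desired - 1

    if avail >= desired:
        # Enough slots: one message per slot, no sharing.
        return ("msix", list(range(1, desired + 1)))

    if prefer_intx_to_remap or avail < 1:
        # Fall back to a single legacy interrupt.
        return ("intx", [])

    # Spread the desired slots round-robin over the available messages.
    return ("msix-remapped", [(i % avail) + 1 for i in range(desired)])
```

With four slots requested and three granted, plan_interrupts(4, 3) yields ("msix-remapped", [1, 2, 3, 1]), so the fourth slot shares the first message.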
cem
ec70a91173 NTB: Reserve link event doorbell callback on Xeon
Consumers that registered on this bit would never see a callback and it
is likely a mistake.

This does not affect if_ntb, which limits itself to a single doorbell
callback.
2015-10-14 23:47:08 +00:00
cem
fa6dd4da61 NTB: MFV 53a788a7: Split ntb_setup_interrupts() into SOC, Xeon, and legacy routines
The names don't line up 100% with Linux.  Our routines are named
ntb_setup_interrupts, ntb_setup_xeon_msix, ntb_setup_soc_msix, and
ntb_setup_legacy_interrupt.  Linux SNB = FreeBSD Xeon; Linux BWD =
FreeBSD SOC.  Original Linux commit log:

This is a cleanup effort to make ntb_setup_msix() more readable - use
ntb_setup_bwd_msix() to init MSI-Xs on BWD hardware and
ntb_setup_snb_msix() - on SNB hardware.

Function ntb_setup_snb_msix() also initializes MSI-Xs the way it should
have been done - looping pci_enable_msix() until success or failure.

Authored by:	Alexander Gordeev
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 23:46:15 +00:00
cem
0ea8fc84ed if_ntb: Cleanup style 2015-10-14 23:45:35 +00:00
cem
cccff74e5f NTB: MFV 403c63cb: client event cleanup
Provide a better event interface between the client and transport.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 23:44:42 +00:00
np
192e942c78 iw_cxgbe: use correct RFC number. 2015-10-14 23:29:19 +00:00
gjb
13af28171f Deprecate MD5 checksum generation in favor of SHA512.
This was discussed during the 10.2-RELEASE cycle, however
since we were nearing the end of the cycle, we decided to
defer this change until after 10.2-RELEASE.

Reminded by:	so (delphij), jmg
MFC after:	5 days
Sponsored by:	The FreeBSD Foundation
2015-10-14 22:33:11 +00:00
emaste
2ef6d5560f Add Ubiquiti EdgeRouter Lite (ERL) kernel config file
The ERL is a fairly cheap (~$100 USD) and readily available dual core
MIPS64 device so it makes a useful MIPS reference platform.

This is based in part on the kernel config generated by the mkerlimage
script from http://rtfm.net/FreeBSD/ERL/.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3884
2015-10-14 21:10:05 +00:00
bdrewery
cca4c22082 Add missing targets to PHONY_NOTMAIN.
- buildconfig, installconfig (missed in r289085)
- files (missed in r241298)

Sponsored by:	EMC / Isilon Storage Division
2015-10-14 20:38:51 +00:00
bdrewery
174c029e4d Recurse on 'buildconfig' and 'installconfig'. Remove the 'config' pseudo target.
The 'config' target isn't really needed right now so just remove it to avoid
any clashes with config(8) building.  It's also likely misspelled and should
be 'configs' if we decide to add it back.  This was just a convenience
target recently added.

Sponsored by:	EMC / Isilon Storage Division
2015-10-14 20:30:32 +00:00
bdrewery
132e71dfe9 Re-indent the ALL_SUBDIR_TARGETS list 2015-10-14 20:28:15 +00:00
ngie
7983774b18 Fix test-fenv:test_dfl_env when run on some amd64 CPUs
Compare the fields that the AMD [1] and Intel [2] specs say will be
set once fnstenv returns.

Not all amd64 capable processors zero out the env.__x87.__other field
(example: AMD Opteron 6308). The AMD64/x64 specs aren't explicit on what the
env.__x87.__other field will contain after fnstenv is executed, so the values
in env.__x87.__other could be filled with arbitrary data depending on the
CPU-specific implementation of fnstenv.

1. http://support.amd.com/TechDocs/26569_APM_v5.pdf
2. http://www.intel.com/Assets/en_US/PDF/manual/253666.pdf

Discussed with: kib, Anton Rang <anton.rang@isilon.com>
Reviewed by: Daniel O'Connor <darius@dons.net.au> (earlier patch; pre-generalization)
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
Reported by: Bill Morchin <wmorchin@isilon.com>
2015-10-14 20:22:12 +00:00
bdrewery
2606e8ae5c Revert r289282 for now as the interaction with a directory containing
bsd.files.mk and bsd.subdir.mk is recursing too many times.
2015-10-14 19:30:04 +00:00
emaste
d70c586bb8 /libexec subdirs are part of the base system (for *.debug files)
Sponsored by:	The FreeBSD Foundation
2015-10-14 19:19:44 +00:00
emaste
19ad2d2446 Add mtree entry for casper .debug files
This was missed in r258838.

Sponsored by:	The FreeBSD Foundation
2015-10-14 19:14:05 +00:00
bz
a1c8ec0037 Revert r289319 as it seems some ARM kernels include HWPMC but no FDT.
To me that seems broken as certain interrupts will never be handled
properly.  I'll re-open D3877 and we can seek a better solution and
try again.  For now go back to that state and avoid compile time errors.
2015-10-14 18:53:34 +00:00
bz
e2d907351f Fix the dependencies to be similar to TCP; otherwise this will fail
without TCP, e.g., in NOIP kernels.
2015-10-14 18:32:06 +00:00
bz
857b8f754e Properly define functions without argument and wrap for { for style purposes
as followed in the rest of the file.  This will hopefully make gcc more happy.
2015-10-14 18:30:04 +00:00
kib
93aac33beb Allow PT_NOTES segments to be located anywhere in the executable
image.

The dynamic linker still requires that program headers of the
executable or dso are mapped by a PT_LOAD segment.

Reviewed by:	emaste, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D3871
2015-10-14 18:29:21 +00:00
kib
a6091af923 Allow PT_INTERP and PT_NOTES segments to be located anywhere in the
executable image.  Keep one page (arbitrary) limit on the max allowed
size of the PT_NOTES.

The ELF image activators still require that program headers of the
executable are fully contained in the first page of the image file.

Reviewed by:	emaste, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D3871
2015-10-14 18:27:35 +00:00
des
a320991678 Apply r3505 (s/SIGQUIT/SIGTERM/ in man page)
PR:		203580
2015-10-14 18:08:38 +00:00
bz
aa0582a388 Now that we can detect the Cortex-A8 properly, fix the event list
according to the Cortex-A8 TRM r3p2 section 3.2.49.
The A8 list differs from the "ARM-v7 common" list, given the A8
was an earlier model.

There is still more work to be done for other Cortex-Ax versions as
andrew points out, but I am just trying to fix A8 for now for teaching.

MFC after:		2 weeks
Sponsored by:		DARPA/AFRL
Obtained from:		Cambridge/L41
Reviewed by:		andrew
Differential Revision:	https://reviews.freebsd.org/D3876
2015-10-14 17:20:19 +00:00
bz
c3c1d7f7d2 HWPMC depends on pmu.c even if device pmu is not specified.
Would be great if we could just automatically enable "device pmu"
if we try to compile in HWPMC.

MFC after:		2 weeks
Sponsored by:		DARPA/AFRL
Reviewed by:		andrew
Differential Revision:	https://reviews.freebsd.org/D3877
2015-10-14 17:07:24 +00:00
bz
fec51f584e For the Cortex-A8 use the a8 and not the a9 events table.
MFC after:		2 weeks
Sponsored by:		DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D3882
2015-10-14 16:56:25 +00:00
kp
40bca2754d pf: Fix TSO issues
In certain configurations (mostly but not exclusively as a VM on Xen) pf
produced packets with an invalid TCP checksum.

The problem was that pf could only handle packets with a full checksum. The
FreeBSD IP stack produces TCP packets with a pseudo-header checksum (only
addresses, length and protocol).
Certain network interfaces expect to see the pseudo-header checksum, so they
end up producing packets with invalid checksums.

To fix this stop calculating the full checksum and teach pf to only update TCP
checksums if TSO is disabled or the change affects the pseudo-header checksum.

PR:		154428, 193579, 198868
Reviewed by:	sbruno
MFC after:	1 week
Relnotes:	yes
Sponsored by:	RootBSD
Differential Revision:	https://reviews.freebsd.org/D3779
2015-10-14 16:21:41 +00:00
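Note on 40bca2754d: the pseudo-header checksum mentioned above covers only the addresses, the protocol, and the TCP length. A minimal IPv4 sketch of that sum is shown below. The function names are hypothetical, and whether the stored value is complemented is left aside here. This is not pf's code.

```python
import socket
import struct


def fold16(total):
    """Fold carries back in until the sum fits in 16 bits (one's complement add)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_pseudo_header_sum(src, dst, tcp_len, proto=6):
    """One's complement sum over the IPv4 TCP pseudo-header only.

    src, dst: dotted-quad IPv4 addresses
    tcp_len:  length of TCP header plus payload, in bytes
    """
    pseudo = (socket.inet_aton(src) + socket.inet_aton(dst) +
              struct.pack("!BBH", 0, proto, tcp_len))
    words = struct.unpack("!6H", pseudo)  # 12 bytes = six 16-bit words
    return fold16(sum(words))
```

A full checksum would continue the same sum over the TCP header and payload. A rewrite that changes only ports leaves this pseudo-header part untouched.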
vangyzen
60458e70da resolver: automatically reload /etc/resolv.conf
On each resolver query, use stat(2) to see if the modification time
of /etc/resolv.conf has changed.  If so, reload the file and reinitialize
the resolver library.  However, only call stat(2) if at least two seconds
have passed since the last call to stat(2), since calling it on every
query could kill performance.

This new behavior is enabled by default.  Add a "reload-period" option
to disable it or change the period of the test.

Document this behavior and option in resolv.conf(5).

Polish the man page just enough to appease igor.

https://lists.freebsd.org/pipermail/freebsd-arch/2015-October/017342.html

Reviewed by:	kp, wblock
Discussed with:	jilles, imp, alfred
MFC after:	1 month
Relnotes:	yes
Sponsored by:	Dell Inc.
Differential Revision:	https://reviews.freebsd.org/D3867
2015-10-14 14:26:44 +00:00
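Note on 60458e70da: the throttled stat(2) check can be sketched as below. The class and attribute names are hypothetical, the clock is injectable purely so the behaviour can be exercised without sleeping, and treating a period of zero as "disabled" is an assumption of this sketch. It is not the resolver library's implementation.

```python
import os
import time


class ReloadingConfig:
    """Reload a file when its mtime changes, checking at most once per period."""

    def __init__(self, path, reload_period=2.0, clock=time.monotonic):
        self.path = path
        self.reload_period = reload_period  # seconds; 0 disables the check here
        self.clock = clock
        self.last_check = clock()
        self.mtime = os.stat(path).st_mtime_ns
        self.loads = 0
        self._load()

    def _load(self):
        with open(self.path) as f:
            self.lines = f.read().splitlines()
        self.loads += 1

    def query(self):
        """Called on every lookup; only stats the file once per period."""
        now = self.clock()
        if self.reload_period > 0 and now - self.last_check >= self.reload_period:
            self.last_check = now
            mtime = os.stat(self.path).st_mtime_ns
            if mtime != self.mtime:
                self.mtime = mtime
                self._load()
        return self.lines
```

Queries that arrive within the period reuse the cached contents and cost no system call.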
mav
b800f00410 MFV r289311: 5764 "zfs send -nv" directs output to stderr
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Basil Crow <basil.crow@delphix.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Manoj Joseph <manoj.joseph@delphix.com>

illumos/illumos-gate@dc5f28a3c3
2015-10-14 11:52:58 +00:00
mav
297adc9551 MFV r289308: 6267 dn_bonus evicted too early
Reviewed by: Richard Yao <ryao@gentoo.org>
Reviewed by: Xin LI <delphij@freebsd.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Justin T. Gibbs <gibbs@FreeBSD.org>

illumos/illumos-gate@d2058105c6
2015-10-14 10:38:05 +00:00
mav
984f7bcd4c MFV r289306: 6295 metaslab_condense's dbgmsg should include vdev id
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@freebsd.org>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Joe Stein <joe.stein@delphix.com>

illumos/illumos-gate@daec38ecb4
2015-10-14 10:31:50 +00:00
mav
89a4cb886e MFV r289304: 6293 ztest failure: error == 28 (0xc == 0x1c) in ztest_tx_assign()
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>

illumos/illumos-gate@8fe00bfb87
2015-10-14 10:28:29 +00:00
kib
176234430e Invalid pages should not appear on the inactive queue. Change the
check into an assertion.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2015-10-14 09:03:32 +00:00
ngie
ef127649af Integrate tools/regression/vfs into the FreeBSD test suite as tests/sys/vfs
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-10-14 08:16:15 +00:00
mav
f3d984a1de MFV r289298: 6286 ZFS internal error when set large block on bootfs
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

illumos/illumos-gate@6de9bb5603
2015-10-14 07:50:08 +00:00
mav
4e77c724b3 MFV r289296: 6288 dmu_buf_will_dirty could be faster
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

illumos/illumos-gate@0f2e7d03b8
2015-10-14 07:45:44 +00:00
mav
afbdb89044 MFV r289294: 5219 l2arc_write_buffers() may write beyond target_sz
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Saso Kiselkov <skiselkov@gmail.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Steven Hartland <steven.hartland@multiplay.co.uk>
Reviewed by: Justin Gibbs <gibbs@FreeBSD.org>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Andriy Gapon <avg@freebsd.org>

illumos/illumos-gate@d7d9a6d919
2015-10-14 07:37:02 +00:00
hiren
c9534bc93d Fix an unnecessarily aggressive behavior where mtu clamping begins on first
retransmission timeout (rto) when blackhole detection is enabled.  Make
sure it only happens when the second attempt to send the same segment also fails
with rto.

Also make sure that each mtu probing stage (usually 1448 -> 1188 -> 524) follows
the same pattern and gets 2 chances (rto) before further clamping down.

Note: RFC4821 doesn't specify implementation details on how this situation
should be handled.

Differential Revision:	https://reviews.freebsd.org/D3434
Reviewed by:	sbruno, gnn (previous version)
MFC after:	2 weeks
Sponsored by:	Limelight Networks
2015-10-14 06:57:28 +00:00
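Note on c9534bc93d: the "two chances per stage" pattern can be sketched as a simple mapping from consecutive retransmission timeouts to the size in use. The stage values come from the commit message; the function name is hypothetical, and the sketch deliberately ignores what the stack does once the smallest stage has also failed.

```python
# Probing stages quoted in the commit message (usual case).
MSS_STAGES = [1448, 1188, 524]
RTOS_PER_STAGE = 2  # each stage gets two timeouts before clamping further


def mss_after_rtos(consecutive_rtos):
    """Return the stage size in effect after the given number of RTOs."""
    stage = min(consecutive_rtos // RTOS_PER_STAGE, len(MSS_STAGES) - 1)
    return MSS_STAGES[stage]
```

A single timeout therefore leaves the size at 1448; only the second timeout on the same segment moves to 1188.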
bdrewery
c7b6fda55e Fix support for building a PROG_CXX, and PROG, directly.
For example in lib/atf/libatf-c++/tests/detail it is now possible to
run 'make application_test'.  This was intended to work for PROGS,
but lacked support for PROGS_CXX.

Also fix redefining the main PROG target to recurse.  This isn't needed
since the main process is setting PROG/PROG_CXX to handle it directly
via bsd.prog.mk.

MFC after:	3 weeks
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 05:50:16 +00:00
adrian
b79a4ad744 Fix date.
Noticed by:	bdrewery
2015-10-14 05:16:56 +00:00
bdrewery
3451a7cd5b Follow-up r288218 by ensuring common objects are built before recursing.
Some examples where this is a problem:
  lib/atf/libatf-c++/tests/Makefile:SRCS.${_T}=   ${_T}.cpp test_helpers.cpp
  lib/atf/libatf-c++/tests/detail/Makefile:SRCS.${_T}=    ${_T}.cpp test_helpers.cpp
  lib/atf/libatf-c/tests/Makefile:SRCS.${_T}=     ${_T}.c test_helpers.c
  lib/atf/libatf-c/tests/detail/Makefile:SRCS.${_T}=      ${_T}.c test_helpers.c
  lib/libpam/libpam/tests/Makefile:SRCS.${test} = ${test}.c ${COMMONSRC}

A similar change may be needed for FILES, SCRIPTS, or INCS, but for now stay
with just SRCS.

Reported by:	rodrigc
MFC after:	3 weeks
X-MFC-With:	r288218
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 04:42:05 +00:00
adrian
e146e19fb5 rsu(4) manpage updates: add me, add 802.11n support, update caveats.
* Add that I indeed added 802.11n support.
* Update caveats - we support 1x1, 1x2 and 2x2 operation now, but
  there's no transmit aggregation support.
2015-10-14 02:43:04 +00:00
bdrewery
e08f0663f7 Replace the out-of-place includes/files/config handling in bsd.subdir.mk with
more typical ALL_SUBDIR_TARGETS entries and target hooks in bsd.incs.mk,
bsd.files.mk and bsd.confs.mk.

This allows the targets to be NOPs if unneeded and still work with the
shortcut 'make includes' to build and then install in a parallel-safe manner.

Sort and re-indent the ALL_SUBDIR_TARGETS with the new entries.

Sponsored by:	EMC / Isilon Storage Division
2015-10-14 02:37:30 +00:00
cem
466958dfd3 NTB: MFV e8aeb60c: Disable interrupts and poll under high load
Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 02:14:45 +00:00
cem
ab64537884 NTB: MFV 78958433: Enable Snoop on Primary Side
Enable Snoop from Primary to Secondary side on BAR23 and BAR45 on all
TLPs.  Previously, Snoop was only enabled from Secondary to Primary
side.  This can have a performance improvement on some workloads.

Also, make the code more obvious about how the link is being enabled.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 02:14:15 +00:00
jeff
4402204d47 Parallelize the buffer cache and rewrite getnewbuf(). This results in an
8x performance improvement in a micro benchmark on a 4 socket machine.

 - Get buffer headers from a per-cpu uma cache that sits in front of the
   free queue.
 - Use a per-cpu quantum cache in vmem to eliminate contention for kva.
 - Use multiple clean queues according to buffer cache size to eliminate
   clean queue lock contention.
 - Introduce a bufspace daemon that attempts to prevent getnewbuf() callers
   from blocking or doing direct recycling.
 - Close some bufspace allocation races that could lead to endless
   recycling.
 - Further the transition to a more modern style of small functions grouped
   by prefix in order to manage growing complexity.

Sponsored by:	EMC / Isilon
Reviewed by:	kib
Tested by:	pho
2015-10-14 02:10:07 +00:00
bdrewery
c54a8cc39c Correct a comment in bsd.incs.mk forgotten in r274662 and copied into bsd.confs.mk.
The bsd.confs.mk may be wrong but for now fix it.

Sponsored by:	EMC / Isilon Storage Division
2015-10-14 00:43:29 +00:00
bdrewery
db1849f980 Add a note about the mysterious files/includes/config block.
This originated from r96668.
2015-10-14 00:36:33 +00:00
hiren
0d12306188 There are times when it would be really nice to have a record of the last few
packets and/or state transitions from each TCP socket. That would help with
narrowing down certain problems we see in the field that are hard to reproduce
without understanding the history of how we got into a certain state. This
change provides just that.

It saves copies of the last N packets in a list in the tcpcb. When the tcpcb is
destroyed, the list is freed. I thought this was likely to be more
performance-friendly than saving copies of the tcpcb. Plus, with the packets,
you should be able to reverse-engineer what happened to the tcpcb.

To enable the feature, you will need to compile a kernel with the TCPPCAP
option. Even then, the feature defaults to being deactivated. You can activate
it by setting a positive value for the number of captured packets. You can do
that on either a global basis or on a per-socket basis (via a setsockopt call).

There is no way to get the packets out of the kernel other than using kmem or
getting a coredump. I thought that would help some of the legal/privacy concerns
regarding such a feature. However, it should be possible, in a future effort,
to export them in PCAP format.

I tested this at low scale, and found that there were no mbuf leaks and the peak
mbuf usage appeared to be unchanged with and without the feature.

The main performance concern I can envision is the number of mbufs that would be
used on systems with a large number of sockets. If you save five packets per
direction per socket and have 3,000 sockets, that will consume at least 30,000
mbufs just to keep these packets. I tried to reduce the concerns associated with
this by limiting the number of clusters (not mbufs) that could be used for this
feature. Again, in my testing, that appears to work correctly.

Differential Revision:	D3100
Submitted by:		Jonathan Looney <jlooney at juniper dot net>
Reviewed by:		gnn, hiren
2015-10-14 00:35:37 +00:00
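Note on 0d12306188: the "last N packets per direction" bookkeeping and the mbuf estimate can be sketched as below. The class and function names are hypothetical and packets are stand-in values. The kernel keeps mbuf chains in the tcpcb, which this sketch does not model.

```python
from collections import deque


class PacketHistory:
    """Keep only the most recent `limit` packets for each direction."""

    def __init__(self, limit):
        # A limit of 0 keeps nothing, mirroring the feature being deactivated.
        self.inbound = deque(maxlen=limit)
        self.outbound = deque(maxlen=limit)

    def record(self, direction, packet):
        queue = self.inbound if direction == "in" else self.outbound
        queue.append(packet)  # the oldest entry is dropped automatically


def min_mbufs(sockets, packets_per_direction, directions=2):
    """Lower bound on mbufs held: at least one mbuf per saved packet."""
    return sockets * packets_per_direction * directions
```

The figure in the commit message follows directly: min_mbufs(3000, 5) is 3,000 x 5 x 2 = 30,000.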