106320 Commits

Author SHA1 Message Date
Ed Maste
42d17d369b Add Ubiquiti EdgeRouter Lite (ERL) kernel config file
The ERL is a fairly cheap (~$100 USD) and readily available dual core
MIPS64 device so it makes a useful MIPS reference platform.

This is based in part on the kernel config generated by the mkerlimage
script from http://rtfm.net/FreeBSD/ERL/.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3884
2015-10-14 21:10:05 +00:00
Bjoern A. Zeeb
184b91e538 Revert r289319 as it seems some ARM kernels include HWPMC but no FDT.
To me that seems broken as certain interrupts will never be handled
properly.  I'll re-open D3877 and we can seek a better solution and
try again.  For now go back to that state and avoid compile time errors.
2015-10-14 18:53:34 +00:00
Bjoern A. Zeeb
ff72f87b3f Fix the dependencies to be similar to TCP as without TCP, e.g., NOIP kernels
this will otherwise fail.
2015-10-14 18:32:06 +00:00
Bjoern A. Zeeb
f87ec781ef Properly define functions withut argument and wrap for { for style purposes
as followed in the rest of the file.  This will hopefully make gcc more happy.
2015-10-14 18:30:04 +00:00
Konstantin Belousov
6c775eb64e Allow PT_INTERP and PT_NOTES segments to be located anywhere in the
executable image.  Keep one page (arbitrary) limit on the max allowed
size of the PT_NOTES.

The ELF image activators still require that program headers of the
executable are fully contained in the first page of the image file.

Reviewed by:	emaste, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D3871
2015-10-14 18:27:35 +00:00
Bjoern A. Zeeb
71f7442233 Now that we can detect the Cortex-A8 properly, fix the event list
according to the Cortex-A8 TRM r3p2 section 3.2.49.
The A8 list differs from the "ARM-v7 common" list, given the A8
was an earlier model.

There is still more work to be done for other Cortex-Ax version as
andrew points out, but I am just trying to fix A8 for now for teaching.

MFC after:		2 weeks
Sponsored by:		DARPA/AFRL
Obtained from:		Cambridge/L41
Reviewed by:		andrew
Differential Revision:	https://reviews.freebsd.org/D3876
2015-10-14 17:20:19 +00:00
Bjoern A. Zeeb
48d1ff3a2d HWPMC depends on pmu.c even if device pmu is not specified.
Would be great if we could just automatically enabled "device pmu"
if we try to compile in HWPMC.

MFC after:		2 weeks
Sponsored by:		DARPA/AFRL
Reviewed by:		andrew
Differential Revision:	https://reviews.freebsd.org/D3877
2015-10-14 17:07:24 +00:00
Kristof Provost
c110fc49da pf: Fix TSO issues
In certain configurations (mostly but not exclusively as a VM on Xen) pf
produced packets with an invalid TCP checksum.

The problem was that pf could only handle packets with a full checksum. The
FreeBSD IP stack produces TCP packets with a pseudo-header checksum (only
addresses, length and protocol).
Certain network interfaces expect to see the pseudo-header checksum, so they
end up producing packets with invalid checksums.

To fix this stop calculating the full checksum and teach pf to only update TCP
checksums if TSO is disabled or the change affects the pseudo-header checksum.

PR:		154428, 193579, 198868
Reviewed by:	sbruno
MFC after:	1 week
Relnotes:	yes
Sponsored by:	RootBSD
Differential Revision:	https://reviews.freebsd.org/D3779
2015-10-14 16:21:41 +00:00
Alexander Motin
3dbe12b067 MFV r289308: 6267 dn_bonus evicted too early
Reviewed by: Richard Yao <ryao@gentoo.org>
Reviewed by: Xin LI <delphij@freebsd.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Justin T. Gibbs <gibbs@FreeBSD.org>

illumos/illumos-gate@d2058105c6
2015-10-14 10:38:05 +00:00
Alexander Motin
fc6f8dee4c MFV r289306: 6295 metaslab_condense's dbgmsg should include vdev id
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andriy Gapon <avg@freebsd.org>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Joe Stein <joe.stein@delphix.com>

illumos/illumos-gate@daec38ecb4
2015-10-14 10:31:50 +00:00
Alexander Motin
422891c28a MFV r289304: 6293 ztest failure: error == 28 (0xc == 0x1c) in ztest_tx_assign()
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>

illumos/illumos-gate@8fe00bfb87
2015-10-14 10:28:29 +00:00
Konstantin Belousov
12a73f207a Invalid pages should not appear on the inactive queue. Change the
check into an assertion.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2015-10-14 09:03:32 +00:00
Alexander Motin
95e20e65d3 MFV r289298: 6286 ZFS internal error when set large block on bootfs
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

illumos/illumos-gate@6de9bb5603
2015-10-14 07:50:08 +00:00
Alexander Motin
a4256278bf MFV r289296: 6288 dmu_buf_will_dirty could be faster
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

illumos/illumos-gate@0f2e7d03b8
2015-10-14 07:45:44 +00:00
Alexander Motin
0f45d37812 MFV r289294: 5219 l2arc_write_buffers() may write beyond target_sz
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Saso Kiselkov <skiselkov@gmail.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Steven Hartland <steven.hartland@multiplay.co.uk>
Reviewed by: Justin Gibbs <gibbs@FreeBSD.org>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Andriy Gapon <avg@freebsd.org>

illumos/illumos-gate@d7d9a6d919
2015-10-14 07:37:02 +00:00
Hiren Panchasara
adf43a9279 Fix an unnecessarily aggressive behavior where mtu clamping begins on first
retransmission timeout (rto) when blackhole detection is enabled.  Make
sure it only happens when the second attempt to send the same segment also fails
with rto.

Also make sure that each mtu probing stage (usually 1448 -> 1188 -> 524) follows
the same pattern and gets 2 chances (rto) before further clamping down.

Note: RFC4821 doesn't specify implementation details on how this situation
should be handled.

Differential Revision:	https://reviews.freebsd.org/D3434
Reviewed by:	sbruno, gnn (previous version)
MFC after:	2 weeks
Sponsored by:	Limelight Networks
2015-10-14 06:57:28 +00:00
Conrad Meyer
737bc5014c NTB: MFV e8aeb60c: Disable interrupts and poll under high load
Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 02:14:45 +00:00
Conrad Meyer
f3f87fe051 NTB: MFV 78958433: Enable Snoop on Primary Side
Enable Snoop from Primary to Secondary side on BAR23 and BAR45 on all
TLPs.  Previously, Snoop was only enabled from Secondary to Primary
side.  This can have a performance improvement on some workloads.

Also, make the code more obvious about how the link is being enabled.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-14 02:14:15 +00:00
Jeff Roberson
21fae96123 Parallelize the buffer cache and rewrite getnewbuf(). This results in a
8x performance improvement in a micro benchmark on a 4 socket machine.

 - Get buffer headers from a per-cpu uma cache that sits in from of the
   free queue.
 - Use a per-cpu quantum cache in vmem to eliminate contention for kva.
 - Use multiple clean queues according to buffer cache size to eliminate
   clean queue lock contention.
 - Introduce a bufspace daemon that attempts to prevent getnewbuf() callers
   from blocking or doing direct recycling.
 - Close some bufspace allocation races that could lead to endless
   recycling.
 - Further the transition to a more modern style of small functions grouped
   by prefix in order to improve growing complexity.

Sponsored by:	EMC / Isilon
Reviewed by:	kib
Tested by:	pho
2015-10-14 02:10:07 +00:00
Hiren Panchasara
86a996e6bd There are times when it would be really nice to have a record of the last few
packets and/or state transitions from each TCP socket. That would help with
narrowing down certain problems we see in the field that are hard to reproduce
without understanding the history of how we got into a certain state. This
change provides just that.

It saves copies of the last N packets in a list in the tcpcb. When the tcpcb is
destroyed, the list is freed. I thought this was likely to be more
performance-friendly than saving copies of the tcpcb. Plus, with the packets,
you should be able to reverse-engineer what happened to the tcpcb.

To enable the feature, you will need to compile a kernel with the TCPPCAP
option. Even then, the feature defaults to being deactivated. You can activate
it by setting a positive value for the number of captured packets. You can do
that on either a global basis or on a per-socket basis (via a setsockopt call).

There is no way to get the packets out of the kernel other than using kmem or
getting a coredump. I thought that would help some of the legal/privacy concerns
regarding such a feature. However, it should be possible to add a future effort
to export them in PCAP format.

I tested this at low scale, and found that there were no mbuf leaks and the peak
mbuf usage appeared to be unchanged with and without the feature.

The main performance concern I can envision is the number of mbufs that would be
used on systems with a large number of sockets. If you save five packets per
direction per socket and have 3,000 sockets, that will consume at least 30,000
mbufs just to keep these packets. I tried to reduce the concerns associated with
this by limiting the number of clusters (not mbufs) that could be used for this
feature. Again, in my testing, that appears to work correctly.

Differential Revision:	D3100
Submitted by:		Jonathan Looney <jlooney at juniper dot net>
Reviewed by:		gnn, hiren
2015-10-14 00:35:37 +00:00
Conrad Meyer
2a84460ce8 NTB: MFV 58b88920: Document HW errata
Add a comment describing the necessary ordering of modifications to the
NTB Limit and Base registers.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 23:43:06 +00:00
Conrad Meyer
4d07d562c3 NTB: MFV fca4d518: Fix ntb_transport link down race
A WARN_ON is being hit in ntb_qp_link_work due to the NTB transport link
being down while the ntb qp link is still active.  This is caused by the
transport link being brought down prior to the qp link worker thread
being terminated.  To correct this, shutdown the qp's prior to bringing
the transport link down.  Also, only call the qp worker thread if it is
in interrupt context, otherwise call the function directly.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 23:42:13 +00:00
Conrad Meyer
2cd48421c9 NTB: MFV 9fec60c4: Fix NTB-RP Link Up
The Xeon NTB-RP setup, the transparent side does not get a link up/down
interrupt.  Since the presence of a NTB device on the transparent side
means that we have a NTB link up, we can work around the lack of an
interrupt by simply calling the link up function to notify the upper
layers.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 23:41:40 +00:00
Conrad Meyer
6d960015e2 NTB: MFV c529aa30: Xeon Doorbell errata workaround
Modifications to the 14th bit of the B2BDOORBELL register will not be
mirrored to the remote system due to a hardware issue.  To get around
the issue, shrink the number of available doorbell bits by 1.  The max
number of doorbells was being used as a way to referencing the Link
Doorbell bit.  Since this would no longer work, the driver must now
explicitly reference that bit.

This does not affect the xeon_errata_workaround case, as it is not using
the b2bdoorbell register.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 23:41:06 +00:00
Conrad Meyer
2c6fb1de80 NTB: MFV f9a2cf89: Comment Fix
Add "data" ntb_register_db_callback parameter description comment and
correct poor speling.

Authored by:	Jon Mason
Obtained from:	Linux (Dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 20:55:21 +00:00
Conrad Meyer
2501258bd9 NTB: MFV b1ef0043: Remove References of non-B2B BWD HW
NTB-RP is not a supported configuration on BWD hardware.  Remove the
code attempting to set it up.

Authored by:	Jon Mason
Obtained from:	Linux (dual BSD/GPL driver)
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 20:54:38 +00:00
Conrad Meyer
e26b6f00f9 if_ntb: Fix build on i386
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 19:46:54 +00:00
Conrad Meyer
f3e30f9721 ioat: Use correct macro, fix build on i386
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 19:46:12 +00:00
Conrad Meyer
04494661d5 NTB: (partial) MFV ed6c24ed: NTB-RP support
This commit does not actually add NTB-RP support.  Mostly it serves to
shuffle code around to match the Linux driver.  Original Linux commit
log follows:

Add support for Non-Transparent Bridge connected to a PCI-E Root Port on
the remote system (also known as NTB-RP mode).  This allows for a NTB
enabled system to be connected to a non-NTB enabled system/slot.

Modifications to the registers and BARs/MWs on the Secondary side by the
remote system are reflected into registers on the Primary side for the
local system.  Similarly, modifications of registers and BARs/MWs on
Primary side by the local system are reflected into registers on the
Secondary side for the Remote System.  This allows communication between
the 2 sides via these registers and BARs/MWs.

Note: there is not a fix for the Xeon Errata (that was already worked
around in NTB-B2B mode) for NTB-RP mode.  Due to this limitation, NTB-RP
will not work on the Secondary side with the Xeon Errata workaround
enabled.  To get around this, disable the workaround via the
xeon_errata_workaround=0 modparm.  However, this can cause the hang
described in the errata.

Authored by:	Jon Mason
Obtained from:	Linux
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 19:45:29 +00:00
Conrad Meyer
1d9352af47 NTB: MFV 49793889: Rename Variables for NTB-RP
Many variable names in the NTB driver refer to the primary or secondary
side.  However, these variables will be used to access the reverse case
when in NTB-RP mode.  Make these names more generic in anticipation of
NTB-RP support.

Authored by:	Jon Mason
Obtained from:	Linux
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 19:44:25 +00:00
Michael Tuexen
9372530827 Fix the timeout for INIT retransmissions in the case where RTO_MIN is
smaller than RTO_INITIAL.

MFC after:	1 week
2015-10-13 18:27:55 +00:00
Sean Bruno
b0c041f887 Add support for sysctl knobs to live tune the per interrupt rx/tx packet
processing limits in ixgbe(4)

Differential Revision:	https://reviews.freebsd.org/D3719
Submitted by:	jason wolfe (j-nitrology.com)
MFC after:	2 weeks
2015-10-13 17:34:18 +00:00
Conrad Meyer
150be74358 NTB: Enable 32-bit support
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 17:22:23 +00:00
Conrad Meyer
08f35652eb NTB: Update pci ids
Add JSF, HSX, BDX ids; add two additional Xeon errata flags while we're
here.

Obtained from:	Linux
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 17:21:38 +00:00
Conrad Meyer
5fa87f166d NTB: MFV 113bf1c9: BWD Link Recovery
The BWD NTB device will drop the link if an error is encountered on the
point-to-point PCI bridge.  The link will stay down until all errors are
cleared and the link is re-established.  On link down, check to see if
the error is detected, if so do the necessary housekeeping to try and
recover from the error and reestablish the link.

There is a potential race between the 2 NTB devices recovering at the
same time.  If the times are synchronized, the link will not recover and
the driver will be stuck in this loop forever.  Add a random interval to
the recovery time to prevent this race.

Authored by:	Jon Mason
Obtained from:	Linux
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 17:20:47 +00:00
Sean Bruno
2b70dea602 ixl(4): Remove compile warning for unused function.
sys/dev/ixl/if_ixl.c:4377:1: warning: unused function 'ixl_debug_info' [-Wunused-function]
ixl_debug_info(SYSCTL_HANDLER_ARGS)

Differential Revision:	https://reviews.freebsd.org/D3718
Submitted by:	bz
2015-10-13 17:20:05 +00:00
Alexander Motin
e596ff7a1f Export bunch of state variables as sysctls. 2015-10-13 11:02:56 +00:00
Conrad Meyer
fd95aefd77 NTB: Style(9) cleanups 2015-10-13 03:12:55 +00:00
Conrad Meyer
59cf83b813 NTB: MFV 948d3a65: Xeon Errata Workaround
There is a Xeon hardware errata related to writes to SDOORBELL or B2BDOORBELL
in conjunction with inbound access to NTB MMIO Space, which may hang the
system.  To workaround this issue, use one of the memory windows to access the
interrupt and scratch pad registers on the remote system.  This bypasses the
issue, but removes one of the memory windows from use by the transport.  This
reduction of MWs necessitates adding some logic to determine the number of
available MWs.

Since some NTB usage methodologies may have unidirectional traffic, the ability
to disable the workaround via modparm has been added.

See BF113 in
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-c5500-c3500-spec-update.pdf
See BT119 in
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-family-spec-update.pdf

Authored by:	Jon Mason
Obtained from:	Linux
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 03:12:11 +00:00
Conrad Meyer
902362c988 NTB: Add hw.ntb sysctl node 2015-10-13 03:11:21 +00:00
Conrad Meyer
f74e1ad354 NTB: MFV b6750cfe: Correct USD/DSD Identification
Due to ambiguous documentation, the USD/DSD identification is backward
when compared to the setting in BIOS.  Correct the bits to match the
BIOS setting.

Authored by:	Jon Mason
Obtained from:	Linux
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 03:10:36 +00:00
Conrad Meyer
7a97964b79 NTB: MFV 87034511: Correct Number of Scratch Pad Registers
The NTB Xeon hardware has 16 scratch pad registers and 16 back-to-back
scratch pad registers.  Correct the #define to represent this and update
the variable names to reflect their usage.

Authored by:	Jon Mason
Obtained from:	Linux
Sponsored by:	EMC / Isilon Storage Division
2015-10-13 03:10:04 +00:00
Jason A. Harmening
dcaa560af0 Ensure the client regions for unmapped bounce buffers created through bus_dmamap_load_phys() do not span multiple pages.
This is already done for mapped buffers.
While here, stop casting bus_addr_t to vm_offset_t.
2015-10-13 02:17:56 +00:00
Navdeep Parhar
ca7c0f2e2d iw_cxgbe: MPA v2 is always available.
Submitted by:	Krishnamraju Eraparaju at chelsio dot com
Reviewed by:	Steve Wise at opengridcomputing dot com
2015-10-13 01:04:38 +00:00
David C Somayajulu
56c55e6513 Add support for reading device temperature
MFC after:5 days
2015-10-12 20:21:17 +00:00
Alexander Motin
2269f420b2 FreeBSD-specific addition to r289191. 2015-10-12 18:15:25 +00:00
Alexander Motin
fea4f2108f MFV r289188: 6281 prefetching should apply to 1MB reads
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Alexander Motin <mav@freebsd.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: George Wilson <george.wilson@delphix.com>

illumos/illumos-gate@632802744e
2015-10-12 15:48:45 +00:00
Alexander Motin
558dcd4e42 MFV r289187: 6251 add tunable to disable free_bpobj processing
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Albert Lee <trisk@omniti.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: George Wilson <george.wilson@delphix.com>

illumos/illumos-gate@139510fb6e
2015-10-12 15:44:44 +00:00
Alexander Motin
72b6ad9bb5 MFV r289185: 6250 zvol_dump_init() can hold txg open
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Albert Lee <trisk@omniti.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: George Wilson <george.wilson@delphix.com>

illumos/illumos-gate@b10bba7246
2015-10-12 15:39:03 +00:00
Kevin Lo
dc82802918 Accept any correct frames from any source when MONITOR mode is used.
Submitted by:	Andriy Voskoboinyk <s3erios at gmail.com>
Differential Revision:	https://reviews.freebsd.org/D3812
2015-10-12 08:17:21 +00:00