Commit Graph

239534 Commits

Author SHA1 Message Date
Gleb Smirnoff
647b604144 Unbreak call to ipf_check(): it expects the out parameter to be 0 or 1.
Pointy hat to:	glebius
Reported by:	cy
2019-02-01 07:48:37 +00:00
Marcelo Araujo
4edc7f418a Revert r343634:
Mostly a cosmetic change to replace strlen with strnlen.

Requested by:	kib and imp
2019-02-01 03:09:11 +00:00
Gleb Smirnoff
2e15db7bcd Hopefully fix compilation by other compilers. 2019-02-01 00:34:18 +00:00
Gleb Smirnoff
2790ca97d9 Fix build without INET6. 2019-02-01 00:33:17 +00:00
Marcelo Araujo
7722142ba8 Mostly a cosmetic change to replace strlen with strnlen.
Obtained from:	Project ACRN
MFC after:	2 weeks
2019-01-31 23:32:19 +00:00
Bryan Drewery
ab3cf2b476 Shar files may be seen as binary by grep.
Suggest using -a to egrep to properly see executed commands.

This is a minor improvement to the manpage.  A better improvement
would be removal or gigantic warnings.

Sponsored by:	Dell EMC
MFC after:	1 week
2019-01-31 23:21:18 +00:00
Brooks Davis
12aec82c09 Remove iBCS2: also remove xenix syscall function support.
Missed in r342243.
2019-01-31 23:01:12 +00:00
Gleb Smirnoff
b252313f0b New pfil(9) KPI together with newborn pfil API and control utility.
The KPI have been reviewed and cleansed of features that were planned
back 20 years ago and never implemented.  The pfil(9) internals have
been made opaque to protocols with only returned types and function
declarations exposed. The KPI is made more strict, but at the same time
more extensible, as kernel uses same command structures that userland
ioctl uses.

In nutshell [KA]PI is about declaring filtering points, declaring
filters and linking and unlinking them together.

New [KA]PI makes it possible to reconfigure pfil(9) configuration:
change order of hooks, rehook filter from one filtering point to a
different one, disconnect a hook on output leaving it on input only,
prepend/append a filter to existing list of filters.

Now it possible for a single packet filter to provide multiple rulesets
that may be linked to different points. Think of per-interface ACLs in
Cisco or Juniper. None of existing packet filters yet support that,
however limited usage is already possible, e.g. default ruleset can
be moved to single interface, as soon as interface would pride their
filtering points.

Another future feature is possiblity to create pfil heads, that provide
not an mbuf pointer but just a memory pointer with length. That would
allow filtering at very early stages of a packet lifecycle, e.g. when
packet has just been received by a NIC and no mbuf was yet allocated.

Differential Revision:	https://reviews.freebsd.org/D18951
2019-01-31 23:01:03 +00:00
Brooks Davis
90f2d5012a Regen after r342190.
Differential Revision:	https://reviews.freebsd.org/D18444
2019-01-31 22:58:17 +00:00
Konstantin Belousov
7674dce0a4 nvdimm: only enumerate present nvdimm devices
Not all child devices of the NVDIMM root device represent DIMM devices
which are present in the system. The spec says (ACPI 6.2, sec 9.20.2):

    For each NVDIMM present or intended to be supported by platform,
    platform firmware also exposes an NVDIMM device ... under the
    NVDIMM root device.

Present NVDIMM devices are found by walking all of the NFIT table's
SPA ranges, then walking the NVDIMM regions mentioned by those SPA
ranges.

A set of NFIT walking helper functions are introduced to avoid the
need to splat the enumeration logic across several disparate
callbacks.

Submitted by:	D Scott Phillips <d.scott.phillips@intel.com>
Sponsored by:	Intel Corporation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18439
2019-01-31 22:47:04 +00:00
Konstantin Belousov
7dcbca8d67 nvdimm: enumerate NVDIMM SPA ranges from the root device
Move the enumeration of NVDIMM SPA ranges from the spa GEOM class
initializer into the NVDIMM root device. This will be necessary for a
later change where NVDIMM namespaces require NVDIMM device enumeration
to be reliably ordered before SPA enumeration.

Submitted by:	D Scott Phillips <d.scott.phillips@intel.com>
Sponsored by:	Intel Corporation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18734
2019-01-31 22:43:20 +00:00
Gleb Smirnoff
eec189c70b Add new m_ext type for data for M_NOFREE mbufs, which doesn't actually do
anything except several assertions.  This type is going to be used for
temporary on stack mbufs, that point into data in receive ring of a NIC,
that shall not be freed.  Such mbuf can not be stored or reallocated, its
life time is current context.
2019-01-31 22:37:28 +00:00
Mark Johnston
919e7b5359 Prevent some kobj memory allocation failures from panicking the system.
Parts of the kobj(9) KPI assume a non-sleepable context for the purpose
of internal memory allocations, but currently have no way to signal an
allocation failure to the caller, so they just panic in this case.  This
can occur even when kobj_create() is called with M_WAITOK.  Fix some
instances of the problem by plumbing wait flags from kobj_create() through
internal subroutines.  Change kobj_class_compile() to assume a sleepable
context when called externally, since all existing callers use it in a
sleepable context.

To fix the problem fully the kobj_init() KPI must be changed.

Reported and tested by:	pho
Reviewed by:	kib (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19023
2019-01-31 22:27:39 +00:00
Eric Joyner
7aad1f4edc ix(4),ixv(4): Fix TSO offloads when TXCSUM is disabled
This patch and commit message are based on r340256 created by Jacob Keller:

The iflib stack does not disable TSO automatically when TXCSUM is
disabled, instead assuming that the driver will correctly handle TSOs
even when CSUM_IP is not set.

This results in iflib calling ixgbe_isc_txd_encap with packets which have
CSUM_IP_TSO, but do not have CSUM_IP or CSUM_IP_TCP set. Because of
this, ixgbe_tx_ctx_setup will not setup the IPv4 checksum offloading.

This results in bad TSO packets being sent if a user disables TXCSUM
without disabling TSO.

Fix this by updating the ixgbe_tx_ctx_setup function to check both
CSUM_IP and CSUM_IP_TSO when deciding whether to enable checksums.

Once this is corrected, another issue for TSO packets is revealed. The
driver sets IFLIB_NEED_ZERO_CSUM in order to enable a work around that
causes the ip->sum field to be zero'd. This is necessary for ix
hardware to correctly perform TSOs.

However, if TXCSUM is disabled, then the work around is not enabled, as
CSUM_IP will not be set when the iflib stack checks to see if it should
clear the sum field.

Fix this by adding IFLIB_TSO_INIT_IP to the iflib flags for the ix and
ixv interface files.

Once both of these changes are made, the ix and ixv drivers should
correctly offload TSO packets when TSO offload is enabled, regardless
of whether TXCSUM is enabled or disabled.

Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	IntelNetworking
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D18470
2019-01-31 21:53:03 +00:00
Eric Joyner
b2c1e8e620 ix(4): Run {mod,msf,mbx,fdir,phy}_task in if_update_admin_status
From Piotr:

This patch introduces adapter->task_requests register responsible for
recording requests for mod_task, msf_task, mbx_task, fdir_task and
phy_task calls. Instead of enqueueing these tasks with
GROUPTASK_ENQUEUE, handlers will be called directly from
ixgbe_if_update_admin_status() while holding ctx lock.

SIOCGIFXMEDIA ioctl() call reads adapter->media list. The list is
deleted and rewritten in ixgbe_handle_msf() task without holding ctx
lock. This change is needed to maintain data coherency when sharing
adapter info via ioctl() calls.

Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>.

PR:		221317
Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	sbruno@, IntelNetworking
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D18468
2019-01-31 21:44:33 +00:00
John Baldwin
829c56fc08 Don't set IFCAP_TXRTLMT during lagg_clone_create().
lagg_capabilities() will set the capability once interfaces supporting
the feature are added to the lagg.  Setting it on a lagg without any
interfaces is pointless as the if_snd_tag_alloc call will always fail
in that case.

Reviewed by:	hselasky, gallatin
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19040
2019-01-31 21:35:37 +00:00
Gleb Smirnoff
f712b16127 Revert r316461: Remove "IPFW static rules" rmlock, and use pfil's global lock.
The pfil(9) system is about to be converted to epoch(9) synchronization, so
we need [temporarily] go back with ipfw internal locking.

Discussed with:	ae
2019-01-31 21:04:50 +00:00
Konstantin Belousov
f8d49128a9 Make iflib a loadable module: add seemingly missed header.
Reported by:	CI (i.e. it is not reproducable in my local builds)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2019-01-31 20:04:18 +00:00
Konstantin Belousov
c75f49f7d8 Make iflib a loadable module.
iflib is already a module, but it is unconditionally compiled into the
kernel.  There are drivers which do not need iflib(4), and there are
situations where somebody might not want iflib in kernel because of
using the corresponding driver as module.

Reviewed by:	marius
Discussed with:	erj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19041
2019-01-31 19:05:56 +00:00
Gleb Smirnoff
37125720b9 In zone_alloc_bucket() max argument was calculated based on uz_count.
Then bucket_alloc() also selects bucket size based on uz_count. However,
since zone lock is dropped, uz_count may reduce. In this case max may
be greater than ub_entries and that would yield into writing beyond end
of the allocation.

Reported by:	pho
2019-01-31 17:52:48 +00:00
Ed Maste
675f752cc5 readelf: dump elf note data
Output format is compatible with GNU readelf's handling of unknown note
types (modulo a GNU char signedness bug); future changes will add type-
specific decoding.

Reviewed by:	kib
MFC after:	1 week
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
2019-01-31 17:04:55 +00:00
Ed Maste
97d368d62b elfdump: use designated array initialization for note types
This ensures the note type name is in the correct slot.

PR:		228290
Submitted by:	kib
MFC with:	343610
Sponsored by:	The FreeBSD Foundation
2019-01-31 16:49:06 +00:00
Ed Maste
8ae9aa2772 elfdump: fix build after r343610
One patch hunk did not survive the trip from git to svn.

PR:		228290
MFC with:	r343610
2019-01-31 16:21:09 +00:00
Ed Maste
2bc7b0242f elfdump: include note type names
Based on a patch submitted by Dan McGregor.

PR:		228290
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-01-31 16:19:04 +00:00
Ed Maste
0f663f7258 elfdump: whitespace fixup in advance of other changes 2019-01-31 16:11:15 +00:00
Ed Maste
1f7d148368 regen src.conf.5 after r343606 2019-01-31 15:50:11 +00:00
Konstantin Belousov
75fe717698 Reserve a bit in the FreeBSD feature control note for marking the
image as not compatible with ASLR.

Requested by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D5603
2019-01-31 15:44:49 +00:00
Ed Maste
ca5efb62ee Enable lld as the system linker by default on i386
The migration to LLVM's lld linker has been in progress for quite some
time - I opened an LLVM tracking bug (23214) in April 2015 to track
issues using lld as FreeBSD's linker, and requested the first exp-run
using lld as /usr/bin/ld in November 2016.

In 12.0 LLD is the system linker on amd64, arm64, and armv7.  i386 was
not switched initially as there were additional ports failures not found
on amd64.  Those have largely been addressed now, although there are a
small number of issues that are still being worked on.  In some of these
cases having lld as the system linker makes it easier for developers and
third parties to investigate failures.

Thanks to antoine@ for handling the exp-runs and to everyone in the
FreeBSD and LLVM communites who have fixed issues with lld to get us to
this point.

PR:		214864
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
2019-01-31 15:07:32 +00:00
Andriy Voskoboinyk
838b61c1f0 bwn(4): reuse ieee80211_tx_complete function.
MFC after:	1 week
2019-01-31 11:12:31 +00:00
Andriy Voskoboinyk
9b1a29716a ipw(4): reuse ieee80211_tx_complete function
This should partially fix 'netstat -b -I wlan0' output

MFC after:	1 week
2019-01-31 10:44:00 +00:00
Kyle Evans
6cbda6d943 install(1): Fix relative path calculation with partial common dest/src
For example, from the referenced PR [1]:

$ mkdir /tmp/lib/ /tmp/libexec
$ touch /tmp/lib/foo.so
$ install -lrs /tmp/lib/foo.so /tmp/libexec/

The common path identification bits terminate src at /tmp/lib/ and the
destination at /tmp/libe. The subsequent backtracking is then incorrect, as
it traverses the destination and backtraces exactly one level while eating
the 'libexec' because it was previously (falsely) identified as common with
'lib'.

The obvious fix would be to make sure we've actually terminated just after
directory separators and rewind a character if we haven't. In the above
example, we would end up rewinding to /tmp/ and subsequently doing the right
thing.

Test case added.

PR:		235330 [1]
MFC after:	1 week
2019-01-31 05:20:11 +00:00
Cy Schubert
43f5d5a277 Document the instance context pointer.
MFC after:	3 days
2019-01-31 04:16:52 +00:00
Kyle Evans
9303f81955 libc/tests: Add test case for jemalloc/libthr bug fixed in r343566
Submitted by:	Andrew Gierth (original reproducer; kevans massaged for atf)
Reviewed by:	kib
MFC after:	2 weeks
X-MFC-with:	r343566 (or after)
Differential Revision:	https://reviews.freebsd.org/D19027
2019-01-31 02:49:24 +00:00
David C Somayajulu
fa790ea99f Add RDMA (iWARP and RoCEv1) support
David Somayajulu (davidcs): Overall RDMA Driver infrastructure and iWARP
Anand Khoje (akhoje@marvell.com): RoCEv1 verbs implementation

MFC after:5 days
2019-01-31 00:09:38 +00:00
Ed Maste
fb4e718261 readelf: fix i386 build
Use %jx and (uintmax_t) cast.

PR:		232983
MFC with:	r343592
Sponsored by:	The FreeBSD Foundation
2019-01-30 21:46:12 +00:00
Ed Maste
87a8583b24 readelf: decode flag bits in DT_FLAGS/DT_FLAGS_1
Decode d_val when the tag is DT_FLAGS or DT_FLAGS_1 based on the
information at:

https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-42444.html

PR:		232983
Submitted by:	Bora Ozarslan borako.ozarslan@gmail.com
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18784
2019-01-30 20:44:51 +00:00
Cy Schubert
b403765e8c Do not obtain an already held read lock. This causes a witness panic when
ipfs is invoked. This is the second of two panics resolving PR 235110.

PR:		235110
Reported by:	David.Boyd49@twc.com
MFC after:	2 weeks
2019-01-30 20:23:16 +00:00
Cy Schubert
b63abbf63a When copying a NAT rule struct to userland for save by ipfs, use the
length of the struct in memmove() rather than an unintialized variable.
This fixes the first of two kernel page faults when ipfs is invoked.

PR:		235110
Reported by:	David.Boyd49@twc.com
MFC after:	2 weeks
2019-01-30 20:22:33 +00:00
Ed Maste
9c812c8d4e freebsd-update: regenerate man page database after update
These are currently not reproducible because they're built by the
makewhatis on the freebsd-update build host, not the one in the tree.
Regenerate after update, and later we can avoid including it in
freebsd-update data.

PR:		214545, 217389
Reviewed by:	delphij
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10482
2019-01-30 19:19:14 +00:00
Alexander Motin
441a6b699f Remove stale now comment, forgotten in r343582.
MFC after:	2 weeks
2019-01-30 18:56:45 +00:00
Brooks Davis
435a8c1560 Add a simple port filter to SIFTR.
SIFTR does not allow any kind of filtering, but captures every packet
processed by the TCP stack.
Often, only a specific session or service is of interest, and doing the
filtering in post-processing of the log adds to the overhead of SIFTR.

This adds a new sysctl net.inet.siftr.port_filter. When set to zero, all
packets get captured as previously. If set to any other value, only
packets where either the source or the destination ports match, are
captured in the log file.

Submitted by:	Richard Scheffenegger
Reviewed by:	Cheng Cui
Differential Revision:	https://reviews.freebsd.org/D18897
2019-01-30 17:44:30 +00:00
Alexander Motin
54cde30f92 Remove BIO_ORDERED flag from BIO_FLUSH sent by ZFS.
In all cases where ZFS sends BIO_FLUSH, it first waits for all related
writes to complete, so its BIO_FLUSH does not care about strict ordering.
Removal of one makes life much easier at least for NVMe driver, which
hardware has no concept of request ordering, relying completely on software.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-01-30 17:39:44 +00:00
Alexander Motin
6afd921090 Only sort requests of types that have concept of offset.
Other types, such as BIO_FLUSH or BIO_ZONE, or especially new/unknown ones,
may imply some degree of ordering even if strict ordering is not requested
explicitly.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-01-30 17:24:50 +00:00
Hans Petter Selasky
9de921ee59 Export vendor specific USB MIDI device list to PnP info.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-01-30 17:11:08 +00:00
Ravi Pokala
475a76e3ce Remove unecessary "All rights reserved" from files under my or Panasas's
copyright.

When all member nations of the Buenos Aires Convention adopted the Berne
Convention, the phrase "All rights reserved" became unnecessary to assert
copyright. Remove it from files under my or Panasas's copyright. The files
related to jedec_dimm(4) also bear avg@'s copyright; he has approved this
change.

Approved by:	avg
Sponsored by:	Panasas
2019-01-30 16:55:00 +00:00
Alexander Motin
a5fde7ef52 Relax BIO_FLUSH ordering in da(4), respecting BIO_ORDERED.
r212160 tightened this from always using MSG_SIMPLE_Q_TAG to always
MSG_ORDERED_Q_TAG.  Since it also marked all BIO_FLUSH requests with
BIO_ORDERED, this commit changes nothing immediately, but it returns
BIO_FLUSH callers ability to actually specify ordering they really
need, alike to other request types.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-01-30 16:50:53 +00:00
Konstantin Belousov
e259e5f4c0 Remove duplicate declarations.
Submitted by:	bde
MFC after:	2 months
2019-01-30 16:29:15 +00:00
Konstantin Belousov
d49ca25de6 Rename rtld-elf/malloc.c to rtld-elf/rtld_malloc.c.
Then malloc.c file name is too generic to use it for libthr.a.

Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2019-01-30 16:28:27 +00:00
Vincenzo Maffione
19c4ec08ad netmap: fix lock order reversal related to kqueue usage
When using poll(), select() or kevent() on netmap file descriptors,
netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands,
before collecting the events that are ready. In other words, the
poll/kevent callback has side effects. This is done to avoid the
overhead of two system call per iteration (e.g., poll() + ioctl(NIOC*XSYNC)).

When the kqueue subsystem invokes the kqueue(9) f_event callback
(netmap_knrw), it holds the lock of the struct knlist object associated
to the netmap port (the lock is provided at initialization, by calling
knlist_init_mtx).
However, netmap_knrw() may need to wake up another netmap port (or even
the same one), which means that it may need to call knote().
Since knote() needs the lock of the struct knlist object associated to
the to-be-wake-up netmap port, it is possible to have a lock order reversal
problem (AB/BA deadlock).

This change prevents the deadlock by executing the knote() call in a
per-selinfo taskqueue, where it is possible to hold a mutex.

Reviewed by:	aleksandr.fedorov_itglobal.com
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18956
2019-01-30 15:51:55 +00:00
Marius Strobl
b97de13ae0 - Stop iflib(4) from leaking MSI messages on detachment by calling
bus_teardown_intr(9) before pci_release_msi(9).
- Ensure that iflib(4) and associated drivers pass correct RIDs to
  bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9)
  on the corresponding resources instead of using the RIDs initially
  passed to bus_alloc_resource_any(9) as the latter function may
  change those RIDs. Solely em(4) for the ioport resource (but not
  others) and bnxt(4) were using the correct RIDs by caching the ones
  returned by bus_alloc_resource_any(9).
- Change the logic of iflib_msix_init() around to only map the MSI-X
  BAR if MSI-X is actually supported, i. e. pci_msix_count(9) returns
  > 0. Otherwise the "Unable to map MSIX table " message triggers for
  devices that simply don't support MSI-X and the user may think that
  something is wrong while in fact everything works as expected.
- Put some (mostly redundant) debug messages emitted by iflib(4)
  and em(4) during attachment under bootverbose. The non-verbose
  output of em(4) seen during attachment now is close to the one
  prior to the conversion to iflib(4).
- Replace various variants of spelling "MSI-X" (several in messages)
  with "MSI-X" as used in the PCI specifications.
- Remove some trailing whitespace from messages emitted by iflib(4)
  and change them to consistently start with uppercase.
- Remove some obsolete comments about releasing interrupts from
  drivers and correct a few others.

Reviewed by:	erj, Jacob Keller, shurd
Differential Revision:	https://reviews.freebsd.org/D18980
2019-01-30 13:21:26 +00:00