Commit Graph

1784 Commits

Author SHA1 Message Date
Jung-uk Kim
a009b7dcab Merge ACPICA 20191018. 2019-10-19 14:56:44 +00:00
Brandon Bergren
9f15304d32 Fix read past end of struct in ncsw glue code.
The logic in XX_IsPortalIntr() was reading past the end of XX_PInfo.
This was causing it to erroneously return 1 instead of 0 in some
circumstances, causing a panic on the AmigaOne X5000 due to mixing
exclusive and nonexclusive interrupts on the same interrupt line.

Since this code is only called a couple of times during startup, use
a simple double loop instead of the complex read-ahead single loop.

This also fixes a bug where it would never check cpu=0 on type=1.

Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D21988
2019-10-12 23:16:17 +00:00
Gleb Smirnoff
b8a6e03fac Widen NET_EPOCH coverage.
When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.

However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.

Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.

On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().

This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.

Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.

This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.

Reviewed by:	gallatin, hselasky, cy, adrian, kristof
Differential Revision:	https://reviews.freebsd.org/D19111
2019-10-07 22:40:05 +00:00
Cy Schubert
289850db1c Add missing definition in DEBUG code.
MFC after:	3 days
2019-10-04 22:10:38 +00:00
Hans Petter Selasky
e8f206cc60 Notify all sleeping threads of device removal in krping.
Implement d_purge for krping_cdevsw.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:03:48 +00:00
Justin Hibbits
4f0796edf8 dpaa(4): Fix memcpy size for threshold copy in NCSW contrib
On 64-bit platforms uintptr_t makes the copy twice as large as it should be.
This code isn't actually used in FreeBSD, since it's for guest mode only,
not hypervisor mode, but fixing it for completeness sake.

Reported by:	bdragon (clang9 build)
2019-09-28 02:49:46 +00:00
Cy Schubert
d096bd7911 ipf mistakenly regards UDP packets with a checksum of 0xffff as bad.
Obtained from:	NetBSD fil.c r1.30, NetBSD PR/54443
MFC after:	3 days
2019-09-26 03:09:42 +00:00
Kyle Evans
da0a7834ac octeon-sdk: suppress another set of warnings under clang
Clang sees this construct and warns that adding an int to a string like this
does not concatenate the two. Fortunately, this is not what octeon-sdk
actually intended to do, so we take the path towards remediation that clang
offers: use array indexing instead.
2019-09-22 18:32:05 +00:00
Mark Johnston
fee2a2fa39 Change synchonization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator.  In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well.  These references are protected by the page lock, which must
therefore be acquired for many per-page operations.  This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter.  A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held.  As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
requires that either the object lock or the busy lock is held.  The
latter no longer has a return value and may free the page if it releases
the last reference to that page.  vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold().  It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler.  vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state).  In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths.  In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock.  In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped.  The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by:	jeff (earlier version)
Tested by:	gallatin (earlier version), pho
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20486
2019-09-09 21:32:42 +00:00
Cy Schubert
d8deeff04d Document ipf_nat_hashtab_add() return codes.
MFC after:	3 days
2019-08-28 04:55:17 +00:00
Cy Schubert
d49ffa7664 Destroy the mutex in case of error.
Obtained from:	NetBSD ip_nat.c r1.7
MFC after:	3 days
2019-08-28 04:55:03 +00:00
Cy Schubert
6d7534d8ea Fixup typo in comment.
Obtained from:	NetBSD ip_nat.c r1.7
MFC after:	3 days
2019-08-28 04:54:26 +00:00
Xin LI
110f229794 MFV r351500: Fix CLEAR_HASH macro to be usable as a single statement.
MFC after:	2 weeks
2019-08-26 00:46:39 +00:00
Cy Schubert
4d93130daf Specifying array sizes for fully initialized tables at compile time is
redundant.

MFC after:	1 week
2019-08-22 03:33:10 +00:00
Bjoern A. Zeeb
e303ee524e athhal: disable unused function (big endian only)
Disable ar9300_swap_tx_desc() for the moment.  It is an unused
function only tried to compile on big endian systems.

Found by:	s390x buildkernel
MFC after:	3 months
2019-08-21 10:42:31 +00:00
Justin Hibbits
aef13f050c dpaa: Fix warnings in dtsec(4) found by clang
These are all trivial warnings that have no real functional change.
2019-08-21 02:26:22 +00:00
Jung-uk Kim
37d7a5bcc0 MFV: r351091
Fix the reported boot failures and revert r350510.

Note this commit is effectively merging ACPICA 20190703 again and applying
an upstream patch.

https://github.com/acpica/acpica/commit/73f6372

Tested by:	scottl
2019-08-15 17:43:36 +00:00
Jung-uk Kim
d632f88e20 Enable ACPICA mutex debugging in INVARIANTS case.
This lets us detect lock order reversal in ACPICA code to avoid deadlock.
2019-08-15 16:04:22 +00:00
Cy Schubert
263a6508a3 Initialize the frentry (the control block that defines a rule) checksum
to zero. Matching checksums save time and effort by mitigating the need
for full rule compare.

MFC after:	3 days
2019-08-12 02:42:47 +00:00
Cy Schubert
fa99b3234a Calculate the number interface array elements using the new FR_NUM macro
instead of the hard-coded value of 4. This is a precursor to increasing
the number of interfaces speficied in "on {interface, ..., interface}".
Note that though this feature is coded in ipf_y.y, it is partially
supported in the ipfilter kld, meaning it does not work yet (and is yet
to be documented in ipf.5 too).

MFC after:	2 weeks
2019-08-11 23:54:52 +00:00
Cy Schubert
fef510763d r272552 applied the patch from ipfilter upstream fil.c r1.129 to fix
broken ipfilter rule matches (upstream bug #554). The upstream patch
was incomplete, it resolved all but one rule compare issue. The issue
fixed here is when "{to, reply-to, dup-to} interface" are used in
conjuncion with "on interface". The match was only made if the on keyword
was specified in the same order in each case referencing the same rule.
This commit fixes this.

The reason for this is that interface name strings and comment keyword
comments are stored in a a variable length field starting at fr_names
in the frentry struct. These strings are placed into this variable length
in the order they are encountered by ipf_y.y and indexed through index
pointers in fr_ifnames, fr_comment or one of the frdest struct fd_name
fields. (Three frdest structs are within frentry.) Order matters and
this patch takes this into account.

While in here it was discovered that though ipfilter is designed to
support multiple interface specifiations per rule (up to four), this
undocumented (the man page makes no mention of it) feature does not work.
A todo is to fix the multiple interfaces feature at a later date. To
understand the design decision as to why only four were intended, it is
suspected that the decision was made because Sun workstations and PCs
rarely if ever exceeded four NICs at the time, this is not true in 2019.

PR:		238796
Reported by:	WHR <msl0000023508@gmail.com>
MFC after:	2 weeks
2019-08-11 23:54:49 +00:00
Warner Losh
5e34c4c505 Stopgap fix for gcc platforms.
Our in-tree gcc doesn't have a no-tree-vectorize optimization knob, so we get a
warning that it's unused. This causes the build to fail on all our gcc platforms.
Add a quick version check as a stop-gap measure to get CI building again.
2019-08-08 20:09:36 +00:00
Conrad Meyer
4d3f1eafc9 Update to Zstandard 1.4.2
The full release notes for 1.4.1 (skipped) and 1.4.2 can be found on Github:

  https://github.com/facebook/zstd/releases/tag/v1.4.1
  https://github.com/facebook/zstd/releases/tag/v1.4.2

These are mostly minor updates; 1.4.1 purportedly brings something like 7%
faster decompression speed.

Relnotes:	yes
2019-08-08 16:54:22 +00:00
Xin LI
a15cb219c6 Expose zlib's utility functions in Z_SOLO library when building kernel.
This allows kernel code to reuse zlib's implementation.

PR:		229763
Reviewed by:	Yoshihiro Ota <ota j email ne jp>
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D21156
2019-08-07 01:41:17 +00:00
Cy Schubert
a1601073bf Resolve ipfilter kld unload issues related to VNET jails.
When the ipfilter kld is loaded, used within VNET jail, and unloaded,
then subsequent loading, use, and unloading of another packet filters
will cause the subsequently loaded netpfil kld's to panic.

The scenario is as follows:

cd /usr/tests/sys/netpfil/common

kldunload ipl
kldunload pfsync
kldunload ipfw

kyua test pass_block

kldload ipl
kyua test pass_block
kldunload ipl

kldload pfsync
kyua test pass_block
kldunload pfsync
-- page fault panic occurs here --

Reported by:	"Ahsan Barkati" <ahsanbarkati@g.....com> via kp@
Discussed with:	kp@
Tested by:	kp@
MFC after:	3 days
2019-08-04 12:47:38 +00:00
Cy Schubert
ded28caa5e Returning an uninitialized error is a bad thing.
MFC after:	3 days
2019-08-04 12:47:35 +00:00
Cy Schubert
dfb39567a2 MFC after: 3 days 2019-08-02 22:58:45 +00:00
Bjoern A. Zeeb
0ecd976e80 IPv6 cleanup: kernel
Finish what was started a few years ago and harmonize IPv6 and IPv4
kernel names.  We are down to very few places now that it is feasible
to do the change for everything remaining with causing too much disturbance.

Remove "aliases" for IPv6 names which confusingly could indicate
that we are talking about a different data structure or field or
have two fields, one for each address family.
Try to follow common conventions used in FreeBSD.

* Rename sin6p to sin6 as that is how it is spelt in most places.
* Remove "aliases" (#defines) for:
  - in6pcb which really is an inpcb and nothing separate
  - sotoin6pcb which is sotoinpcb (as per above)
  - in6p_sp which is inp_sp
  - in6p_flowinfo which is inp_flow
* Try to use ia6 for in6_addr rather than in6p.
* With all these gone  also rename the in6p variables to inp as
  that is what we call it in most of the network stack including
  parts of netinet6.

The reasons behind this cleanup are that we try to further
unify netinet and netinet6 code where possible and that people
will less ignore one or the other protocol family when doing
code changes as they may not have spotted places due to different
names for the same thing.

No functional changes.

Discussed with:		tuexen (SCTP changes)
MFC after:		3 months
Sponsored by:		Netflix
2019-08-02 07:41:36 +00:00
Jung-uk Kim
2e57804413 Revert r349863 (ACPICA 20190703).
This commit caused boot failures on some systems.

Requested by:	scottl
2019-08-01 17:45:43 +00:00
Xin LI
0ed1d6fb00 Allow Kernel to link in both legacy libkern/zlib and new sys/contrib/zlib,
with an eventual goal to convert all legacl zlib callers to the new zlib
version:

 * Move generic zlib shims that are not specific to zlib 1.0.4 to
   sys/dev/zlib.
 * Connect new zlib (1.2.11) to the zlib kernel module, currently built
   with Z_SOLO.
 * Prefix the legacy zlib (1.0.4) with 'zlib104_' namespace.
 * Convert sys/opencrypto/cryptodeflate.c to use new zlib.
 * Remove bundled zlib 1.2.3 from ZFS and adapt it to new zlib and make
   it depend on the zlib module.
 * Fix Z_SOLO build of new zlib.

PR:		229763
Submitted by:	Yoshihiro Ota <ota j email ne jp>
Reviewed by:	markm (sys/dev/zlib/zlib_kmod.c)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D19706
2019-08-01 06:35:33 +00:00
Cy Schubert
caddc9e343 As of upstream fil.c CVS r1.53 (March 1, 2009), prior to the import of
ipfilter 5.1.2 into FreeBSD-10, the fix for, 2580062 from/to targets
should be able to use any interface name, moved frentry.fr_cksum to
prior to frentry.fr_func thereby making this code redundant. After
investigating whether this fix to move fr_cksum was correct and if it
broke anything, it has been determined that the fix is correct and this
code is redundant. We remove it here.

MFC after:	2 weeks
2019-07-16 19:00:42 +00:00
Cy Schubert
a422d59f7b Refactor, removing one compare.
This changes the return code however the caller only tests for 0 and != 0.
One might ask then, why multiple return codes when the caller only tests
for 0 and != 0? From what I can tell, Darren probably passed various
return codes for sake of debugging. The debugging code is long gone
however we can still use the different return codes using DTrace FBT
traces. We can still determine why the compare failed by examining the
differences between the fr1 and fr2 frentry structs, which is a simple
test in DTrace. This allows reducing the number of tests, improving the
code while not affecting our ability to capture information for
diagnostic purposes.

MFC after:	1 week
2019-07-16 19:00:38 +00:00
Cy Schubert
d096fc9ccd Calculate the offset of the interface name using FR_NAME rather than
calclulating it "by hand". This improves consistency with the rest of
the code and is in line with planned fixes and other work.

MFC after:	1 week
2019-07-14 02:46:34 +00:00
Cy Schubert
49a28fbdd2 Recycle the unused FR_CMPSIZ macro which became orphaned in ipfilter 5
prior to its import into FreeBSD. This macro calculates the size to be
compared within the frentry structure. The ipfilter 4 version of the
macro calculated the compare size based upon the static size of the
frentry struct. Today it uses the ipfilter 5 method of calculating the
size based upon the new to ipfilter 5 fr_size value found in the
frentry struct itself.

No effective change in code is intended.

MFC after:	1 week
2019-07-14 02:46:30 +00:00
Cy Schubert
d4af744b6a style(9)
MFC after:	3 days
2019-07-14 02:46:26 +00:00
Cy Schubert
75118b47fc Move the new ipf_pcksum6() function from ip_fil_freebsd.c to fil.c.
The reason for this is that ipftest(8), which still works on FreeBSD-11,
fails to link to it, breaking stable/11 builds.

ipftest(8) was broken (segfault) sometime during the FreeBSD-12 cycle.
glebius@ suggested we disable building it until I can get around to
fixing it. Hence this was not caught in -current.

The intention is to fix ipftest(8) as it is used by the netbsd-tests
(imported by ngie@ many moons ago) for regression testing.

MFC after:	immediately
2019-07-12 01:59:08 +00:00
Cy Schubert
c5dddb272d Remove a tautological test for adding a rule in the block that
adds rules.

MFC after:	1 week
2019-07-11 19:36:18 +00:00
Cy Schubert
3133f9c2a3 Correct r349898. The default is add a rule.
MFC after:	1 week
X-MFC with:	r349898
2019-07-11 19:36:14 +00:00
Cy Schubert
d37052fc86 ipfilter commands, in this case ipf(8), passes its operations and rules
via an ioctl interface. Rules can be added or removed and stats and
counters can be zeroed out. As the ipfilter interprets these
instructions or operations they are stored in an integer called
addrem (add/remove). 1 is add, 2 is remove, and 3 is clear stats and
counters. Much of this is not documented. This commit documents these
operations by replacing simple integers with a self documenting
enum along with a few basic comments.

MFC after:	1 week
2019-07-11 00:08:46 +00:00
Jung-uk Kim
56a6dee6f7 MFV: r349861
Import ACPICA 20190703.
2019-07-09 18:02:36 +00:00
Mark Johnston
eeacb3b02f Merge the vm_page hold and wire mechanisms.
The hold_count and wire_count fields of struct vm_page are separate
reference counters with similar semantics.  The remaining essential
differences are that holds are not counted as a reference with respect
to LRU, and holds have an implicit free-on-last unhold semantic whereas
vm_page_unwire() callers must explicitly determine whether to free the
page once the last reference to the page is released.

This change removes the KPIs which directly manipulate hold_count.
Functions such as vm_fault_quick_hold_pages() now return wired pages
instead.  Since r328977 the overhead of maintaining LRU for wired pages
is lower, and in many cases vm_fault_quick_hold_pages() callers would
swap holds for wirings on the returned pages anyway, so with this change
we remove a number of page lock acquisitions.

No functional change is intended.  __FreeBSD_version is bumped.

Reviewed by:	alc, kib
Discussed with:	jeff
Discussed with:	jhb, np (cxgbe)
Tested by:	pho (previous version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19247
2019-07-08 19:46:20 +00:00
Cy Schubert
67a1d0547c Update frtuc struct comments. It not only defines TCP things we are
interested in but also UDP.

While at it document the source and destination port variables.

MFC after:	3 days
2019-07-08 19:11:49 +00:00
Cy Schubert
b64b92b0d2 Correct the description for the low port in the port compare struct.
Adjust the high port description to match that of the low port
description.

MFC after:	3 days
2019-07-08 19:11:35 +00:00
Cy Schubert
23cfb1b256 The RFC 3128 test should be made after the offset mask has been applied.
Reported by:	christos@NetBSD.org
X-MFC with:	r349399
2019-06-30 22:32:33 +00:00
Cy Schubert
a9a131902d Revert r349400. It has uintended effects.
Reported by:	christos@NetBSD.org
X-MFC with:	r349400.
2019-06-30 22:27:58 +00:00
Cy Schubert
65f07d9976 While working on PR/238796 I discovered an unused variable in frdest,
the next hop structure. It is likely this contributes to PR/238796
though other factors remain to be investigated.

PR:		238796
MFC after:	1 week
2019-06-26 00:53:49 +00:00
Cy Schubert
2637412cbc Remove a tautological compare for offset != 0.
MFC after:	1 week
2019-06-26 00:53:46 +00:00
Cy Schubert
7f39a7e492 Prompted by r349366, ipfilter is also does not conform to RFC 3128
by dropping TCP fragments with offset = 1.

In addition to dropping these fragments, add a DTrace probe to allow
for more detailed monitoring and diagnosis if required.

MFC after:	1 week
2019-06-26 00:53:43 +00:00
Cy Schubert
c964c98793 The definition of icmptypes in ip_compt.h is dead code as it already
use the icmptypes in ip_icmp.h.

MFC after:	1 week
2019-06-25 07:04:47 +00:00
Cy Schubert
51a7230a18 Clean out duplicate definitions of TCP macros also found in netinet/tcp.h.
MFC after:	1 week
2019-06-24 02:58:02 +00:00