123531 Commits

Author SHA1 Message Date
Conrad Meyer
bc812246a0 cam_ccb.h: Remove redundant declarations of static inline functions
No functional change.

They're unnecessarily confusing for tools like grep or ctags.

Sponsored by:	Dell EMC Isilon
2018-08-09 21:20:07 +00:00
Navdeep Parhar
518bca2c21 cxgbe(4): Set fl_pktshift to 0 by default.
Sponsored by:	Chelsio Communications
2018-08-09 21:07:32 +00:00
Kyle Evans
240fcda1e8 subr_prf: style(9) the sizeof
Reported by:	jkim, ian
2018-08-09 19:09:06 +00:00
Mark Johnston
b50a4ea646 Account for the lowmem handlers in the inactive queue scan target.
Before r329882 the target would be computed after lowmem handlers run
and free pages.  On some systems a significant amount of page
reclamation happens this way.  However, with r329882 the target is
computed first, which can lead to unnecessary reclamation from the
page cache, and this in turn may result in excessive swapping.

Instead, adjust the target after running lowmem handlers.  Don't
invoke the lowmem handlers before the PID controller, though, since
that would hide the true rate of page allocation.

Reviewed by:	alc, kib (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16606
2018-08-09 18:25:49 +00:00
Kyle Evans
4c793b68da subr_prf: Use "sizeof current_boot_tag" instead 2018-08-09 17:53:18 +00:00
Kyle Evans
2a4650cc11 BOOT_TAG: Make a config(5) option, expose as sysctl and loader tunable
BOOT_TAG lived shortly in sys/msgbuf.h, but this wasn't necessarily great
for changing it or removing it. Move it into subr_prf.c and add options for
it to opt_printf.h.

One can specify both the BOOT_TAG and BOOT_TAG_SZ (really, size of the
buffer that holds the BOOT_TAG). We expose it as kern.boot_tag and also add
a loader tunable by the same name that we'll fetch upon initialization of
the msgbuf.

This allows for flexibility and also ensures that there's a consistent way
to figure out the boot tag of the running kernel, rather than relying on
headers to be in-sync.

Prodded super-super-lightly by:	imp
2018-08-09 17:47:47 +00:00
Kyle Evans
21aa6e8345 msgbuf: Light detailing (const'ify and bool'itize) 2018-08-09 17:42:27 +00:00
Navdeep Parhar
8a684e1fd1 cxgbe(4): Display pkt-size and burst-size in traffic class parameters. 2018-08-09 14:36:44 +00:00
Navdeep Parhar
5fc0f72f3b cxgbe(4): Add support for high priority filters on T6+. They have their
own region in the TCAM starting with T6, unlike previous chips where
they were in the same region as normal filters.

These filters "hit" before anything else in the LE's lookup.  The exact
order is:
a) High priority filters
b) TOE's active region (TCAM and/or hash)
c) Servers (TOE hw listeners)
d) Normal filters

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-09 14:19:47 +00:00
Leandro Lupori
c8e2123b6a [ppc] Fix kernel panic when using BOOTP_NFSROOT
On PowerPC (and possibly other architectures), that doesn't use
EARLY_AP_STARTUP, the config task queue may be used initialized.
This was observed while trying to mount the root fs from NFS, as
reported here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230168.

This patch has 2 main changes:
1- Perform a basic initialization of qgroup_config, similar to
what is done in taskqgroup_adjust, but simpler.
This makes qgroup_config ready to be used during NFS root mount.

2- When EARLY_AP_STARTUP is not used, call inm_init() and
in6m_init() right before SI_SUB_ROOT_CONF, because bootp needs
to send multicast packages to request an IP.

PR:		Bug 230168
Reported by:	sbruno
Reviewed by:	jhibbits, mmacy, sbruno
Approved by:	jhibbits
Differential Revision:	D16633
2018-08-09 14:04:51 +00:00
Olivier Houchard
d8f1ed8d94 Import CK as of commit 08813496570879fbcc2adcdd9ddc0a054361bfde, mostly
to avoid using lwsync on ppc32.
2018-08-09 12:11:49 +00:00
Hans Petter Selasky
25a1e0f636 Implement missing atomic_fcmpset_XXX() support for i386.
This also fixes i386 build after r337527.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-08-09 11:30:13 +00:00
Andriy Gapon
27cfcd9599 add an option for ddb ps command to print process arguments
We use ps to collect the information of all processes in textdump. But
it doesn't contain process arguments which however sometimes are very
useful for debugging.  The new 'a' modifier adds that capability.

While here, remove 'm' modifier from ddb.4.  It was in the manual page
from its very first revision, but I could not find any evidence of the
code ever supporting it.

Submitted by:	Terry Hu <thu@panzura.com>
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D16603
2018-08-09 11:21:31 +00:00
Hans Petter Selasky
6402bc3d1e Use atomic_fcmpset_XXX() instead of atomic_cmpset_XXX() when possible
in the LinuxKPI.

Suggested by:	mjg @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-09 09:39:32 +00:00
Matt Macy
9fec45d8e5 epoch_block_wait: don't check TD_RUNNING
struct epoch_thread is not type safe (stack allocated) and thus cannot be dereferenced from another CPU

Reported by: novel@
2018-08-09 05:18:27 +00:00
Kyle Evans
2834d61202 kern: Add a BOOT_TAG marker at the beginning of boot dmesg
From the "newly licensed to drive" PR department, add a BOOT_TAG marker (by
default, --<<BOOT>>--, to the beginning of each boot's dmesg. This makes it
easier to do textproc magic to locate the start of each boot and, of
particular interest to some, the dmesg of the current boot.

The PR has a dmesg(8) component as well that I've opted not to include for
the moment- it was the more contentious part of this PR.

bde@ also made the statement that this boot tag should be written with an
ordinary printf, which I've- for the moment- declined to change about this
patch to keep it more transparent to observer of the boot process.

PR:		43434
Submitted by:	dak <aurelien.nephtali@wanadoo.fr> (basically rewritten)
MFC after:	maybe never
2018-08-09 01:32:09 +00:00
Breno Leitao
78f4e2fea0 powerpc64/powernv: re-read RTC after polling
If OPAL_RTC_READ is busy and does not return the information on the first run,
as returning OPAL_BUSY_EVENT, the system will crash since ymd and hmsm variable
will contain junk values.

This is happening because we were not calling OPAL_RTC_READ again after
OPAL_POLL_EVENTS' return, which would finally replace the old/junk hmsm and ymd
values.

The code was also mixing OPAL_RTC_READ and OPAL_POLL_EVENTS return values.

This patch fix this logic and guarantee that we call OPAL_RTC_READ after
OPAL_POLL_EVENTS return, and guarantee the code will only proceed if
OPAL_RTC_READ returns OPAL_SUCCESS.

Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16617
2018-08-08 21:19:07 +00:00
Rick Macklem
41df1b5b47 Assorted fixes to handling of LayoutRecall callbacks, mostly error handling.
After a re-read of the appropriate section of RFC5661, I decided that a
few things should be changed related to LayoutRecall callback handling.
Here are the things fixed by this patch.
- For two of the three cases that LayoutRecall is done, I now think
  setting the clora_changed argument false is correct.
- All errors other than NFSERR_DELAY returned by LayoutRecall appear
  permanent, so don't retry for any of them. (NFSERR_DELAY is retried by
  newnfs_request(), so it is not affected by this patch.)
- Instead of waiting "forever" (actually until the process is SIGTERM'd)
  for Layouts to be returned during a mirror copy, fail and return
  ENXIO after about 1minute.
  Waiting for a <ctrl>C made sense when pnfsdscopymr() was done by itself,
  but did not make sense when done via find(1).
This patch only affects the pNFS server.
2018-08-08 20:21:45 +00:00
Andrey V. Elsukov
5c4aca8218 Use host byte order when comparing mss values.
This fixes tcp-setmss action on little endian machines.

PR:		225536
Submitted by:	John Zielinski
2018-08-08 17:32:02 +00:00
Alan Cox
2bf8cb3804 Add support for pmap_enter(..., psind=1) to the armv6 pmap. In other words,
add support for explicitly requesting that pmap_enter() create a 1 MB page
mapping.  (Essentially, this feature allows the machine-independent layer
to create superpage mappings preemptively, and not wait for automatic
promotion to occur.)

Export pmap_ps_enabled() to the machine-independent layer.

Add a flag to pmap_pv_insert_pte1() that specifies whether it should fail
or reclaim a PV entry when one is not available.

Refactor pmap_enter_pte1() into two functions, one by the same name, that
is a general-purpose function for creating pte1 mappings, and another,
pmap_enter_1mpage(), that is used to prefault 1 MB read- and/or execute-
only mappings for execve(2), mmap(2), and shmat(2).

In addition, as an optimization to pmap_enter(..., psind=0), eliminate the
use of pte2_is_managed() from pmap_enter().  Unlike the x86 pmap
implementations, armv6 does not have a managed bit defined within the PTE.
So, pte2_is_managed() is actually a call to PHYS_TO_VM_PAGE(), which is O(n)
in the number of vm_phys_segs[].  All but one call to PHYS_TO_VM_PAGE() in
pmap_enter() can be avoided.

Reviewed by:	kib, markj, mmel
Tested by:	mmel
MFC after:	6 weeks
Differential Revision:	https://reviews.freebsd.org/D16555
2018-08-08 16:55:01 +00:00
Ruslan Bukin
6371d0bd64 Implement uma_small_alloc(), uma_small_free().
Reviewed by:	markj
Obtained from:	arm64
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16628
2018-08-08 16:08:38 +00:00
Pedro F. Giffuni
c820acbf0a msdosfs: fixes for Undefined Behavior.
These were found by the Undefined Behaviour GsoC project at NetBSD:

Do not change signedness bit with left shift.
While there avoid signed integer overflow.
Address both issues with using unsigned type.

msdosfs_fat.c:512:42, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:521:44, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:744:14, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:744:24, signed integer overflow: -2147483648 - 1 cannot be
represented in type 'int [20]'
msdosfs_fat.c:840:13, left shift of 1 by 31 places cannot be represented
in type 'int'
msdosfs_fat.c:840:36, signed integer overflow: -2147483648 - 1 cannot be
represented in type 'int [20]'

Detected with micro-UBSan in the user mode.

Hinted from:	NetBSD (CVS 1.33)
MFC after:	2 weeks
Differenctial Revision:	https://reviews.freebsd.org/D16615
2018-08-08 15:08:22 +00:00
Randall Stewart
d18ea344e6 Fix a small bug in rack where it will
end up sending the FIN twice.
Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D16604
2018-08-08 13:36:49 +00:00
Fedor Uporov
53288b712d Split the dir_index and dir_nlink features.
Do not allow to create more that EXT4_LINK_MAX links to directory in case
if the dir_nlink is not set, like it is done in the fresh e2fsprogs updates.

MFC after:      3 months
2018-08-08 12:08:46 +00:00
Fedor Uporov
17c7b27f55 Fix directory blocks checksum updating logic.
The checksum updating functions were not called in case of dir index inode splitting
and in case of dir entry removing, when the entry was first in the block.
Fix and move the dir entry adding logic when i_count == 0 to new function.

MFC after:      3 months
2018-08-08 12:07:45 +00:00
Conrad Meyer
3dc1c7d6bc FUSE: Remove some set-but-not-used variables
No functional change.
2018-08-08 04:46:03 +00:00
Alan Cox
78f1deeffe Defer and aggregate swap_pager_meta_build frees.
Before swp_pager_meta_build replaces an old swapblk with an new one,
it frees the old one.  To allow such freeing of blocks to be
aggregated, have swp_pager_meta_build return the old swap block, and
make the caller responsible for freeing it.

Define a pair of short static functions, swp_pager_init_freerange and
swp_pager_update_freerange, to do the initialization and updating of
blk addresses and counters used in aggregating blocks to be freed.

Submitted by:	Doug Moore <dougm@rice.edu>
Reviewed by:	kib, markj (an earlier version)
Tested by:	pho
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D13707
2018-08-08 02:30:34 +00:00
Kevin Lo
a99020fbf3 - Fix hash calculation by MAC address
- Since rx_cmd_c is an uint16_t, use le16toh() instead of le32toh()

Reviewed by:	emaste
2018-08-08 01:20:02 +00:00
Navdeep Parhar
09a7189fb7 cxgbe(4): Allow the driver to specify a burst size when configuring a
traffic class for rate limiting.

Add experimental knobs that allow the user to specify a default pktsize
and burstsize for traffic classes associated with a port:

dev.<ifname>.<instance>.tc.pktsize
dev.<ifname>.<instance>.tc.burstsize

Sponsored by:	Chelsio Communications
2018-08-07 22:13:03 +00:00
Rick Macklem
93df87f208 Allow newnfs_request() to retry all callback RPCs with an NFSERR_DELAY reply.
The code in newnfs_request() retries RPCs that get a reply of NFSERR_DELAY,
but exempts certain NFSv4 operations. However, for callback RPCs, there
should not be any exemptions at this time. The code would have erroneously
exempted the CBRECALL callback, since it has the same operation number as
the CLOSE operation.
This patch fixes this by checking for a callback RPC (indicated by clp != NULL)
and not checking for exempt operations for callbacks.
This would have only affected the NFSv4 server when delegations are enabled
(they are not enabled by default) and the client replies to CBRECALL with
NFSERR_DELAY. This may never actually happen.
Spotted during code inspection.

MFC after:	2 weeks
2018-08-07 21:29:14 +00:00
Konstantin Belousov
8f94195022 Followup to r337430: only call elf_reloc_ifunc on x86.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-08-07 20:43:50 +00:00
Marius Strobl
e545f5a8a3 Update the list of architectures having atomic_fcmpset_{8,16,64}(9) and
atomic_swap_{64,int}(9) respectively as of r337433.
2018-08-07 18:59:02 +00:00
Marius Strobl
13a10f3414 Implement atomic_swap_{int,long,ptr}(9). 2018-08-07 18:56:51 +00:00
Marius Strobl
ac97c7e4c1 Implement atomic_swap_64(9). 2018-08-07 18:56:01 +00:00
Konstantin Belousov
cb0eecdf92 Futex support functions in linux.ko and linux32.ko on amd64 should be
aware of SMAP.

Reported and tested by:	Johannes Lundberg <johalun0@gmail.com>, wulf
Sponsored by:	The FreeBSD Foundation
2018-08-07 18:29:10 +00:00
Konstantin Belousov
289ead7cb0 Add missed handling of local relocs against ifunc target in the obj modules.
Reported and tested by:	wulf
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-08-07 18:26:46 +00:00
Mark Johnston
159f344b84 Recognize ICS1893C PHYs.
Submitted by:	Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after:	1 week
2018-08-07 17:13:42 +00:00
Mark Johnston
c7902fbeae Improve handling of control message truncation.
If a recvmsg(2) or recvmmsg(2) caller doesn't provide sufficient space
for all control messages, the kernel sets MSG_CTRUNC in the message
flags to indicate truncation of the control messages.  In the case
of SCM_RIGHTS messages, however, we were failing to dispose of the
rights that had already been externalized into the recipient's file
descriptor table.  Add a new function and mbuf type to handle this
cleanup task, and use it any time we fail to copy control messages
out to the recipient.  To simplify cleanup, control message truncation
is now only performed at control message boundaries.

The change also fixes a few related bugs:
- Rights could be leaked to the recipient process if an error occurred
  while copying out a message's contents.
- We failed to set MSG_CTRUNC if the truncation occurred on a control
  message boundary, e.g., if the caller received two control messages
  and provided only the exact amount of buffer space needed for the
  first.

PR:		131876
Reviewed by:	ed (previous version)
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16561
2018-08-07 16:36:48 +00:00
Colin Percival
0b4d5eb8fd Replace a pair of 8-bit writes to VGA memory with a single 16-bit write.
The VGA "text mode" buffer has a pair of bytes for each character: One
byte for the character symbol, and an "attribute" byte encoding the
foreground and background colours.  When updating the screen, we were
writing these two bytes separately.

On some virtualized systems, every write results in a glyph being redrawn
into a (graphical) virtual screen; writing these two bytes separately
results in twice as much work being done to draw characters, whereas if
we perform a single 16-bit write instead, the character only needs to be
redrawn once.

On an EC2 c5.4xlarge instance, this change cuts 1.30s from the kernel boot,
speeding it up from 8.90s to 7.60s.

MFC after:	1 week
2018-08-07 08:33:40 +00:00
Cy Schubert
95bdea60e0 Remove redundant and incorrect default definition of AF_INET6. AF_INET6
is defined in sys/socket.h where it's defined as 28.

A bit of trivia: On NetBSD AF_INET6 is defined as 24. On Solaris it is
defined as 26. This is probably why Darren defaulted to 26, because
ipfilter was originally written for SunOS 4 and Solaris many moons ago.

MFC after:	2 weeks
2018-08-07 07:12:59 +00:00
John Baldwin
d2aec9714a Make the system C11 atomics headers fully compatible with external GCC.
The <sys/cdefs.h> and <stdatomic.h> headers already included support for
C11 atomics via intrinsincs in modern versions of GCC, but these versions
tried to "hide" atomic variables inside a wrapper structure.  This wrapper
is not compatible with GCC's internal <stdatomic.h> header, so that if
GCC's <stdatomic.h> was used together with <sys/cdefs.h>, use of C11
atomics would fail to compile.  Fix this by not hiding atomic variables
in a structure for modern versions of GCC.  The headers already avoid
using a wrapper structure on clang.

Note that this wrapper was only used if C11 was not enabled (e.g.
via -std=c99), so this also fixes compile failures if a modern version
of GCC was used with -std=c11 but with FreeBSD's <stdatomic.h> instead
of GCC's <stdatomic.h> and this change fixes that case as well.

Reported by:	Mark Millard
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D16585
2018-08-06 23:51:08 +00:00
Navdeep Parhar
1979b51141 cxgbe(4): Allow user-configured and driver-configured traffic classes to
be used simultaneously.  Move sysctl_tc and sysctl_tc_params to
t4_sched.c while here.

MFC after:	3 weeks
Sponsored by:	Chelsio Communications
2018-08-06 23:21:13 +00:00
Navdeep Parhar
7b8f5a200a cxgbe(4): Break up sysctl_bitfield into 8 bit and 16 bit variants. Have
them display the current value of the bitfield rather than the fixed
value that was provided when the sysctl node was created.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-06 21:54:51 +00:00
Kirk McKusick
68c49bcc40 Put in place the framework for consolodating contiguous blocks into
a smaller number of larger TRIM requests. The hope had been to have
the full TRIM consolodation in place for 12.0, but the algorithms
are still under development and need further testing. With this
framework in place it will be possible to easily add TRIM consolodation
once the optimal strategy has been found.

The only functional change with this patch is the elimination of TRIM
requests for blocks that are freed before they have been likely to
have been written.

Reviewed by: kib
Discussed with: Warner Losh and Chuck Silvers
Sponsored by: Netflix
2018-08-06 21:09:11 +00:00
Navdeep Parhar
564ec04ea8 Fix typo in cxgbe/t4_tom. 2018-08-06 19:09:55 +00:00
Jonathan T. Looney
95a914f631 Address concerns about CPU usage while doing TCP reassembly.
Currently, the per-queue limit is a function of the receive buffer
size and the MSS.  In certain cases (such as connections with large
receive buffers), the per-queue segment limit can be quite large.
Because we process segments as a linked list, large queues may not
perform acceptably.

The better long-term solution is to make the queue more efficient.
But, in the short-term, we can provide a way for a system
administrator to set the maximum queue size.

We set the default queue limit to 100.  This is an effort to balance
performance with a sane resource limit.  Depending on their
environment, goals, etc., an administrator may choose to modify this
limit in either direction.

Reviewed by:	jhb
Approved by:	so
Security:	FreeBSD-SA-18:08.tcp
Security:	CVE-2018-6922
2018-08-06 17:36:57 +00:00
Andrew Turner
b17f3d298d Default to armv5te in LINT on arm. This should fix building LINT there. 2018-08-06 14:40:45 +00:00
Hans Petter Selasky
549dcdb34e Implement current_work() function in the LinuxKPI.
Tested by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-06 10:48:20 +00:00
Randall Stewart
936b2b64ae This fixes a bug in Rack where we were
not properly using the correct value for
Delayed Ack.

Sponsored by:	Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D16579
2018-08-06 09:22:07 +00:00
Hans Petter Selasky
db119089be Implement atomic_long_cmpxchg() function in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-06 08:40:02 +00:00