123547 Commits

Author SHA1 Message Date
Konstantin Belousov
cb0eecdf92 Futex support functions in linux.ko and linux32.ko on amd64 should be
aware of SMAP.

Reported and tested by:	Johannes Lundberg <johalun0@gmail.com>, wulf
Sponsored by:	The FreeBSD Foundation
2018-08-07 18:29:10 +00:00
Konstantin Belousov
289ead7cb0 Add missed handling of local relocs against ifunc target in the obj modules.
Reported and tested by:	wulf
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-08-07 18:26:46 +00:00
Mark Johnston
159f344b84 Recognize ICS1893C PHYs.
Submitted by:	Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after:	1 week
2018-08-07 17:13:42 +00:00
Mark Johnston
c7902fbeae Improve handling of control message truncation.
If a recvmsg(2) or recvmmsg(2) caller doesn't provide sufficient space
for all control messages, the kernel sets MSG_CTRUNC in the message
flags to indicate truncation of the control messages.  In the case
of SCM_RIGHTS messages, however, we were failing to dispose of the
rights that had already been externalized into the recipient's file
descriptor table.  Add a new function and mbuf type to handle this
cleanup task, and use it any time we fail to copy control messages
out to the recipient.  To simplify cleanup, control message truncation
is now only performed at control message boundaries.

The change also fixes a few related bugs:
- Rights could be leaked to the recipient process if an error occurred
  while copying out a message's contents.
- We failed to set MSG_CTRUNC if the truncation occurred on a control
  message boundary, e.g., if the caller received two control messages
  and provided only the exact amount of buffer space needed for the
  first.

PR:		131876
Reviewed by:	ed (previous version)
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16561
2018-08-07 16:36:48 +00:00
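A caller-side view of c7902fbeae, as a minimal sketch: the helper below (the name recv_check_ctrunc is hypothetical, not part of the change) receives one message and reports whether the kernel set MSG_CTRUNC. It shows only how a recvmsg(2) caller observes truncation; it does not reproduce the kernel-side cleanup that the commit adds.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#include <string.h>

/*
 * Receive one message on 'sock' and report control message truncation.
 * Returns -1 on error, 1 if MSG_CTRUNC was set, and 0 otherwise.
 * Hypothetical helper for illustration only.
 */
int
recv_check_ctrunc(int sock, void *data, size_t datalen, void *ctrl,
    size_t ctrllen)
{
	struct iovec iov;
	struct msghdr msg;

	iov.iov_base = data;
	iov.iov_len = datalen;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl;		/* may be NULL when ctrllen is 0 */
	msg.msg_controllen = ctrllen;

	if (recvmsg(sock, &msg, 0) == -1)
		return (-1);
	return ((msg.msg_flags & MSG_CTRUNC) != 0);
}
```

A caller that sees a return value of 1 should assume that some control messages were dropped and retry with a larger control buffer where the protocol allows it.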
Colin Percival
0b4d5eb8fd Replace a pair of 8-bit writes to VGA memory with a single 16-bit write.
The VGA "text mode" buffer has a pair of bytes for each character: One
byte for the character symbol, and an "attribute" byte encoding the
foreground and background colours.  When updating the screen, we were
writing these two bytes separately.

On some virtualized systems, every write results in a glyph being redrawn
into a (graphical) virtual screen; writing these two bytes separately
results in twice as much work being done to draw characters, whereas if
we perform a single 16-bit write instead, the character only needs to be
redrawn once.

On an EC2 c5.4xlarge instance, this change cuts 1.30s from the kernel boot,
speeding it up from 8.90s to 7.60s.

MFC after:	1 week
2018-08-07 08:33:40 +00:00
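The idea in 0b4d5eb8fd can be sketched as follows. This is an illustration only, assuming a little-endian text buffer where the character occupies the low byte and the attribute the high byte of each cell; the names vga_cell and vga_putc are hypothetical and are not taken from the console driver.

```c
#include <stdint.h>

/*
 * Combine a character and its attribute into one 16-bit text-mode cell.
 * Low byte: character symbol.  High byte: colour attribute.
 */
static inline uint16_t
vga_cell(uint8_t ch, uint8_t attr)
{
	return ((uint16_t)ch | ((uint16_t)attr << 8));
}

/*
 * Update one screen cell with a single 16-bit store instead of two
 * 8-bit stores, so a hypervisor that redraws on every write redraws once.
 */
void
vga_putc(volatile uint16_t *buf, unsigned int idx, uint8_t ch, uint8_t attr)
{
	buf[idx] = vga_cell(ch, attr);
}
```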
Cy Schubert
95bdea60e0 Remove redundant and incorrect default definition of AF_INET6. AF_INET6
is defined in sys/socket.h where it's defined as 28.

A bit of trivia: On NetBSD AF_INET6 is defined as 24. On Solaris it is
defined as 26. This is probably why Darren defaulted to 26, because
ipfilter was originally written for SunOS 4 and Solaris many moons ago.

MFC after:	2 weeks
2018-08-07 07:12:59 +00:00
John Baldwin
d2aec9714a Make the system C11 atomics headers fully compatible with external GCC.
The <sys/cdefs.h> and <stdatomic.h> headers already included support for
C11 atomics via intrinsics in modern versions of GCC, but these versions
tried to "hide" atomic variables inside a wrapper structure.  This wrapper
is not compatible with GCC's internal <stdatomic.h> header, so that if
GCC's <stdatomic.h> was used together with <sys/cdefs.h>, use of C11
atomics would fail to compile.  Fix this by not hiding atomic variables
in a structure for modern versions of GCC.  The headers already avoid
using a wrapper structure on clang.

Note that this wrapper was only used if C11 was not enabled (e.g.
via -std=c99), so this also fixes compile failures if a modern version
of GCC was used with -std=c11 but with FreeBSD's <stdatomic.h> instead
of GCC's <stdatomic.h>.

Reported by:	Mark Millard
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D16585
2018-08-06 23:51:08 +00:00
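For context on d2aec9714a, the kind of code that has to compile against either <stdatomic.h> is plain C11 atomics usage such as the sketch below. The function name record_hit is hypothetical; only standard C11 interfaces are used.

```c
#include <stdatomic.h>

/*
 * Atomically increment a shared counter and return the new value.
 * atomic_fetch_add() returns the value held before the addition.
 */
unsigned int
record_hit(atomic_uint *counter)
{
	return (atomic_fetch_add(counter, 1) + 1);
}
```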
Navdeep Parhar
1979b51141 cxgbe(4): Allow user-configured and driver-configured traffic classes to
be used simultaneously.  Move sysctl_tc and sysctl_tc_params to
t4_sched.c while here.

MFC after:	3 weeks
Sponsored by:	Chelsio Communications
2018-08-06 23:21:13 +00:00
Navdeep Parhar
7b8f5a200a cxgbe(4): Break up sysctl_bitfield into 8 bit and 16 bit variants. Have
them display the current value of the bitfield rather than the fixed
value that was provided when the sysctl node was created.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-06 21:54:51 +00:00
Kirk McKusick
68c49bcc40 Put in place the framework for consolidating contiguous blocks into
a smaller number of larger TRIM requests. The hope had been to have
the full TRIM consolidation in place for 12.0, but the algorithms
are still under development and need further testing. With this
framework in place it will be possible to easily add TRIM consolidation
once the optimal strategy has been found.

The only functional change with this patch is the elimination of TRIM
requests for blocks that are freed before they are likely to have
been written.

Reviewed by: kib
Discussed with: Warner Losh and Chuck Silvers
Sponsored by: Netflix
2018-08-06 21:09:11 +00:00
Navdeep Parhar
564ec04ea8 Fix typo in cxgbe/t4_tom. 2018-08-06 19:09:55 +00:00
Jonathan T. Looney
95a914f631 Address concerns about CPU usage while doing TCP reassembly.
Currently, the per-queue limit is a function of the receive buffer
size and the MSS.  In certain cases (such as connections with large
receive buffers), the per-queue segment limit can be quite large.
Because we process segments as a linked list, large queues may not
perform acceptably.

The better long-term solution is to make the queue more efficient.
But, in the short-term, we can provide a way for a system
administrator to set the maximum queue size.

We set the default queue limit to 100.  This is an effort to balance
performance with a sane resource limit.  Depending on their
environment, goals, etc., an administrator may choose to modify this
limit in either direction.

Reviewed by:	jhb
Approved by:	so
Security:	FreeBSD-SA-18:08.tcp
Security:	CVE-2018-6922
2018-08-06 17:36:57 +00:00
Andrew Turner
b17f3d298d Default to armv5te in LINT on arm. This should fix building LINT there. 2018-08-06 14:40:45 +00:00
Hans Petter Selasky
549dcdb34e Implement current_work() function in the LinuxKPI.
Tested by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-06 10:48:20 +00:00
Randall Stewart
936b2b64ae This fixes a bug in Rack where we were
not properly using the correct value for
Delayed Ack.

Sponsored by:	Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D16579
2018-08-06 09:22:07 +00:00
Hans Petter Selasky
db119089be Implement atomic_long_cmpxchg() function in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-06 08:40:02 +00:00
Hans Petter Selasky
f698bc4d76 Define __poll_t type in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-06 08:35:16 +00:00
Emmanuel Vadot
d19afc9abf aw_thermal: Add nvmem and H5 support
Now that aw_sid exposes an nvmem interface, use that to read the
calibration data.
Add support for H5 SoC.
Fix the bindings, we used to have non-upstreamed bindings. Switch to the
ones that have been sent upstream. They are not stable yet, so we switch
from custom, wrong, bindings to correct, proposed bindings.
2018-08-06 05:36:00 +00:00
Emmanuel Vadot
97eb836f8b aw_sid: Add nvmem interface
Rework aw_sid so it can work with the nvmem interface.
Each SoC exposes a set of fuses (for now rootkey/boardid and, if available,
the thermal calibration data). A fuse can be private or public; a private
fuse must be read via some registers instead of being read directly.
Each fuse is exposed as a sysctl.
For now leave the possibility for a driver to read any fuse without using
the nvmem interface, as the awg and emac drivers use this to generate a MAC
address.
2018-08-06 05:35:24 +00:00
Rick Macklem
25705dd5d0 Copy all bits of a file handle in case there is padding in the structure.
At least on x86, fhandle_t is a packed structure, so I believe an
assignment will copy all the bits. However, for some current/future
architectures, there might be padding in the structure that doesn't get
copied via an assignment.
Since NFS assumes a file handle is an opaque blob of bits that can be
compared via memcmp()/bcmp(), all the bits including any padding must be
copied.
This patch replaces the assignments with a call to a byte copy function.
Spotted during code inspection.
2018-08-05 19:21:50 +00:00
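The reasoning in 25705dd5d0 can be illustrated with a small sketch. The structure example_fh below is hypothetical (it is not fhandle_t) and is laid out so that most ABIs insert padding after the first member; the point is that a byte copy carries the padding along, so a later memcmp() over the whole object compares equal.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical handle with likely padding between 'type' and 'id'. */
struct example_fh {
	uint8_t		type;
	uint32_t	id;
};

/* Copy every byte of the handle, including any padding. */
void
fh_copy(struct example_fh *dst, const struct example_fh *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Compare two handles as opaque blobs, the way NFS treats them. */
int
fh_equal(const struct example_fh *a, const struct example_fh *b)
{
	return (memcmp(a, b, sizeof(*a)) == 0);
}
```

With a plain structure assignment, the C standard leaves the padding bytes of the destination unspecified, which is why the opaque comparison could fail on some architectures.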
Kristof Provost
91e0f2d200 pf: Increase default hash table size
Now that we (by default) limit the number of states to 100,000 it makes sense
to also adjust the default size of the hash table.

Based on the benchmarking results in
https://github.com/ocochard/netbenches/blob/master/Atom_C2758_8Cores-Chelsio_T540-CR/pf-states_hashsize/results/fbsd12-head.r332390/README.md
128K entries offers a good compromise between performance and memory use.

Users may still override this setting with the net.pf.states_hashsize and
net.pf.source_nodes_hashsize loader(8) tunables.
2018-08-05 13:54:37 +00:00
Vladimir Kondratyev
26f3e847c3 uep(4): add evdev support
To compile this driver with evdev support enabled, place the
following lines into the kernel configuration file:

options EVDEV_SUPPORT
device evdev

Note: Native and evdev modes are mutually exclusive.

Reviewed by:	gonzo, wblock (docs)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11156
2018-08-05 11:14:13 +00:00
Emmanuel Vadot
69acf61478 allwinner: a64: Add THS clock support
The clock for the thermal sensor controller was missing when this driver
was made.
2018-08-05 06:16:36 +00:00
Emmanuel Vadot
aed85e3011 extres: clkdiv: Fix div_with_table
We didn't allow a divider register value of 0, which can exist, and
we also wrote the divider instead of the register value, which resulted
in the wrong frequency being selected.
2018-08-05 06:15:35 +00:00
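The two bugs described in aed85e3011 can be sketched with a table lookup. The structure and function names here are hypothetical and only mirror the idea: the table maps a divider to the value that must be written to the register, a register value of 0 is a legal entry, and the caller must write the looked-up value rather than the divider itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: register value and the divider it selects. */
struct div_entry {
	uint32_t	value;
	uint32_t	divider;
};

/*
 * Translate a divider into the register value that selects it.
 * Returns 0 on success and -1 if the divider is not in the table.
 * Success is reported separately so that a register value of 0
 * is not mistaken for "not found".
 */
int
div_to_regval(const struct div_entry *tbl, size_t n, uint32_t divider,
    uint32_t *regval)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].divider == divider) {
			*regval = tbl[i].value;
			return (0);
		}
	}
	return (-1);
}
```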
Emmanuel Vadot
4573cd3914 arm: allwinner: Disconnect A10/A20 HDMI driver
It has not worked for 2 years, since we stopped patching the DTS.
The DTS now has the correct bindings, but they are quite different
from the hacked ones we used to have (and more representative of
reality).
2018-08-05 06:10:13 +00:00
Emmanuel Vadot
c204112317 arm: allwinner: Remove old unused clocks
Remove the old clocks for allwinner as now all the SoCs have been converted
to clkng.
The only old clock now is the gmac clock which still lives under the /clocks
dts node.
2018-08-05 06:08:23 +00:00
Kyle Evans
3395e43a04 efirt: Don't enter EFI context early, convert addrs to KVA instead
efi_enter here was needed because efi_runtime dereference causes a fault
outside of EFI context, due to runtime table living in runtime service
space. This may cause problems early in boot, though, so instead access it
by converting paddr to KVA for access.

While here, remove the other direct PHYS_TO_DMAP calls and the explicit DMAP
requirement from efidev.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D16591
2018-08-04 21:41:10 +00:00
Konstantin Belousov
a70e9a1388 Swap in WKILLED processes.
A swapped-out process that is WKILLED must be swapped in as soon as
possible.  The reason is that such a process can be killed by OOM and
its pages can only be freed if the process exits.  To exit, the kernel
stack of the process must be mapped.

When allocating pages for the stack of the WKILLED process on swap in,
use VM_ALLOC_SYSTEM requests to increase the chance of the allocation
succeeding.

Add a counter of swapped-out processes to avoid unneeded iteration
over the allprocs list when there is no work to do, reducing the
allproc_lock ownership.

Reviewed by:	alc, markj (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D16489
2018-08-04 20:45:43 +00:00
Mark Johnston
5b0480f2cc Don't check rcv sockbuf limits when sending on a unix stream socket.
sosend_generic() performs an initial comparison of the amount of data
(including control messages) to be transmitted with the send buffer
size. When transmitting on a unix socket, we then compare the amount
of data being sent with the amount of space in the receive buffer;
if insufficient space is available, sbappendcontrol() returns an error
and the data is lost.  This is easily triggered by sending control
messages together with an amount of data roughly equal to the send
buffer size, since the control message size may change in uipc_send()
as file descriptors are internalized.

Fix the problem by removing the space check in sbappendcontrol(),
whose only consumer is the unix sockets code.  The stream sockets code
uses the SB_STOP mechanism to ensure that senders will block if the
receive buffer fills up.

PR:		181741
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16515
2018-08-04 20:26:54 +00:00
Mark Johnston
e62ca80bde Style. 2018-08-04 20:16:36 +00:00
Dimitry Andric
aaf1312351 Fix build of hyperv with base gcc on i386
Summary:
Base gcc fails to compile `sys/dev/hyperv/pcib/vmbus_pcib.c` for i386,
with the following -Werror warnings:

cc1: warnings being treated as errors
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'new_pcichild_device':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:567: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'vmbus_pcib_on_channel_callback':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:940: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_pci_protocol_negotiation':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1012: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_pci_enter_d0':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1073: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_send_resources_allocated':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1125: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'vmbus_pcib_map_msi':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1730: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]

This is because on i386, several casts from `uint64_t` to a pointer
reduce the value from 64 bit to 32 bit.

For gcc, this can be fixed by an intermediate cast to uintptr_t. Note
that I am assuming the incoming values will always fit into 32 bit!

Differential Revision: https://reviews.freebsd.org/D15753
MFC after:	3 days
2018-08-04 14:57:23 +00:00
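The intermediate cast mentioned in aaf1312351 looks roughly like the sketch below. The helper names are hypothetical; as the commit message notes, the conversion is only safe if the 64-bit value actually fits in a pointer on the target.

```c
#include <stdint.h>

/*
 * Convert a 64-bit integer to a pointer via uintptr_t.  On i386 the
 * explicit narrowing to uintptr_t avoids -Wint-to-pointer-cast; the
 * upper 32 bits are discarded, so the caller must know they are zero.
 */
void *
u64_to_ptr(uint64_t v)
{
	return ((void *)(uintptr_t)v);
}

/* Convert a pointer to a 64-bit integer without a size-mismatch warning. */
uint64_t
ptr_to_u64(const void *p)
{
	return ((uint64_t)(uintptr_t)p);
}
```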
Konstantin Belousov
54c531cacd Add END()s for amd64 linux futex support routines.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-08-04 13:57:50 +00:00
Vladimir Kondratyev
3a3dc5b5b4 wmt(4): Use internal function to calculate input report size
Usbhid's hid_report_size() calculates integral size of all reports of given
kind found in the HID descriptor rather than exact size of report with given
ID as its userland counterpart does. As all input data processed by the
driver is located within the same report, calculate required driver's buffer
size with userland version, imported in one of the previous commits.
This allows us to skip zeroing of buffer on processing of each report.

While here do some minor refactoring.

MFC after:	2 weeks
2018-08-04 12:31:19 +00:00
Vladimir Kondratyev
8107f311f4 wmt(4): Read Microsoft's "Touch Hardware Quality Assurance" certificate blob
if present to enable some devices like WaveShare touchscreens. Unlike
Windows we discard content of the blob. We try to mimic Windows driver
behaviour from the USB device point of view.

Submitted by:	glebius (initial version)
2018-08-04 12:29:08 +00:00
Vladimir Kondratyev
36584a62c7 wmt(4): Read 'Contact count maximum' usage value from feature report
rather than from HID descriptor to match Microsoft documentation.
Fall back to HID descriptor provided value if 'Get Report' request failed.

MFC after:	2 weeks
2018-08-04 12:24:37 +00:00
Patrick Kelsey
8f410865b8 Mark the send queue ready so ALTQ is available. 2018-08-04 01:45:17 +00:00
Gleb Smirnoff
cc7963191d Now that after r335979 the kernel addresses in API structures are
fixed size, there is no reason left for the unions.

Discussed with:	brooks
2018-08-04 00:03:21 +00:00
Gleb Smirnoff
86b4ad7dd5 Use if_tunnel_check_nesting() for ng_iface(4). 2018-08-03 22:55:58 +00:00
Emmanuel Vadot
82533b026a arm: Remove ALLWINNER_UP kernel config
This was needed when GENERIC couldn't boot on UP systems.
2018-08-03 22:15:58 +00:00
Emmanuel Vadot
1d16c90d1a dtb: rpi: Only compile and copy the DTSO
The DTB is now loaded via the firmware, passed to u-boot then to loader.efi
Only compile and copy the dts overlays.
2018-08-03 22:06:15 +00:00
Emmanuel Vadot
35ab4bcc29 dtb: am335x: Remove links and add more dts
The links were to cope with the switch to upstream dts.
We don't need them anymore.
While here add the rest of the beaglebone family dts as u-boot is common
on all those boards and loads the dtb based on the product name.
Only the pocketbeagle variant is missing, as it's not yet in sys/gnu/dts
but will be with the Linux 4.18 dts import.
2018-08-03 22:04:00 +00:00
Justin Hibbits
2e0090af65 nvme(4): Add bus_dmamap_sync() at the end of the request path
Summary:
Some architectures, in this case powerpc64, need explicit synchronization
barriers vs device accesses.

Prior to this change, when running 'make buildworld -j72' on an 18-core
(72-thread) POWER9, I would see controller resets often.  With this change, I
don't see these reset messages, though another tester still does, for yet to be
determined reasons, so this may not be a complete fix.  Additionally, I see a
~5-10% speed up in buildworld times, likely due to not needing to reset the
controller.

Reviewed By: jimharris
Differential Revision: https://reviews.freebsd.org/D16570
2018-08-03 20:04:06 +00:00
Bryan Drewery
bc0d7285f9 Fix some filemon path logging issues.
- Properly handle snprintf return value for truncation and avoid
  overflowing the later write with the bogus length.
- Increase the msgbufr size to handle a rename of 2 full files.

The larger allocation causes a slight performance hit which will be mitigated
in the future.  A rewrite with sbufs will likely be done as well.

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	2 weeks
Approved by:	so (gtetlow)
Reviewed by:	kib
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D16098
2018-08-03 19:24:04 +00:00
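The first item in bc0d7285f9 relies on a well-known property of snprintf(3): it returns the length the output would have had, which can exceed the buffer size. A minimal sketch of clamping that value before using it as a write length follows; the function name and the log format are hypothetical and are not taken from filemon.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<op> <path>\n" into buf and return the number of bytes that
 * were actually stored, not counting the terminating NUL.  On truncation
 * snprintf() returns the untruncated length, so it is clamped here
 * before a caller uses it as a write length.
 */
size_t
format_clamped(char *buf, size_t bufsz, const char *op, const char *path)
{
	int len;

	len = snprintf(buf, bufsz, "%s %s\n", op, path);
	if (len < 0)
		return (0);		/* output error */
	if ((size_t)len >= bufsz)
		return (bufsz > 0 ? bufsz - 1 : 0);	/* truncated */
	return ((size_t)len);
}
```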
Konstantin Belousov
2e62782dac Require write access when mmapping BAR.
This actually makes the rights requirements for accessing PCI config
space and BARs using /dev/pci the same.  Since unchanged /dev/pci mode
only allows write open for root, default configuration de-facto limits
the BAR read to root only.  In particular, state-changing reads of the
registers are limited to root.

Discussed with:	se
Suggested and reviewed by:	jhb (kernel part)
Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
Differential revision:	https://reviews.freebsd.org/D16580
2018-08-03 18:35:20 +00:00
Ruslan Bukin
c50c8f642c Return ENAMETOOLONG if the last copied character
is not the null terminator.

Sponsored by:	DARPA, AFRL
2018-08-03 16:44:56 +00:00
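The rule stated in c50c8f642c can be sketched in portable C. This is a userland illustration with a hypothetical name (copy_str), not the machine-dependent routine the commit touches: copy at most maxlen bytes, and report ENAMETOOLONG unless the last byte copied was the terminator.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy a NUL-terminated string of at most 'maxlen' bytes, terminator
 * included.  '*done', if not NULL, receives the number of bytes copied.
 * Returns 0 if the terminator was copied and ENAMETOOLONG otherwise.
 */
int
copy_str(const char *src, char *dst, size_t maxlen, size_t *done)
{
	size_t i;
	int error;

	error = ENAMETOOLONG;
	for (i = 0; i < maxlen; i++) {
		dst[i] = src[i];
		if (src[i] == '\0') {
			i++;		/* count the terminator */
			error = 0;
			break;
		}
	}
	if (done != NULL)
		*done = i;
	return (error);
}
```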
Mark Johnston
c16bd872dc Add the required page accounting to kmem_bootstrap_free().
Reviewed by:	alc, kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16581
2018-08-03 16:35:37 +00:00
Konstantin Belousov
35efb3b1de Fix typo in copyinstr_smap, resulting in mis-handling of too long strings.
Reported and tested by:	pho
PR:	230286
Sponsored by:	The FreeBSD Foundation
2018-08-03 15:35:29 +00:00
Andriy Gapon
e0fa977ea5 safer wait-free iteration of shared interrupt handlers
The code that iterates a list of interrupt handlers for a (shared)
interrupt, whether in the ISR context or in the context of an interrupt
thread, does so in a lock-free fashion.   Thus, the routines that modify
the list need to take special steps to ensure that the iterating code
has a consistent view of the list.  Previously, those routines tried to
play nice only with the code running in the ithread context.  The
iteration in the ISR context was left to chance.

After commit r336635 atomic operations and memory fences are used to
ensure that ie_handlers list is always safe to navigate with respect to
inserting and removal of list elements.

There is still a question of when it is safe to actually free a removed
element.

The idea of this change is somewhat similar to the idea of the epoch
based reclamation.  There are some simplifications compared to the
general epoch based reclamation.  All writers are serialized using a
mutex, so we do not need to worry about concurrent modifications.  Also,
all read accesses from the open context are serialized too.

So, we can get away with just two epochs / phases.  When a thread removes an
element it switches the global phase from the current phase to the other
and then drains the previous phase.  Only after the draining the removed
element gets actually freed. The code that iterates the list in the ISR
context takes a snapshot of the global phase and then increments the use
count of that phase before iterating the list.  The use count (in the
same phase) is decremented after the iteration.  This should ensure that
there is no iteration over the removed element when it gets
freed.

This commit also simplifies the coordination with the interrupt thread
context.  Now we always schedule the interrupt thread when removing one
of handlers for its interrupt.  This makes the code both simpler and
safer as the interrupt thread masks the interrupt thus ensuring that
there is no interaction with the ISR context.

P.S.  This change matters only for shared interrupts and I realize that
those are becoming a thing of the past (and quickly).  I also understand
that the problem that I am trying to solve is extremely rare.

PR:		229106
Reviewed by:	cem
Discussed with:	Samy Al Bahra
MFC after:	5 weeks
Differential Revision: https://reviews.freebsd.org/D15905
2018-08-03 14:27:28 +00:00
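The two-phase scheme described in e0fa977ea5 can be sketched with C11 atomics. This is a simplified illustration under stated assumptions: all names are hypothetical, writers are assumed to be serialized by a mutex that is not shown, and the memory-fence placement of the real kernel code is not reproduced.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state: the global phase and a use count per phase. */
struct two_phase {
	atomic_int	phase;		/* 0 or 1 */
	atomic_int	active[2];	/* readers currently in each phase */
};

void
two_phase_init(struct two_phase *tp)
{
	atomic_init(&tp->phase, 0);
	atomic_init(&tp->active[0], 0);
	atomic_init(&tp->active[1], 0);
}

/* Reader: snapshot the phase and bump its use count before iterating. */
int
reader_enter(struct two_phase *tp)
{
	int p;

	p = atomic_load(&tp->phase);
	atomic_fetch_add(&tp->active[p], 1);
	return (p);
}

/* Reader: drop the use count of the phase that was snapshotted. */
void
reader_exit(struct two_phase *tp, int p)
{
	atomic_fetch_sub(&tp->active[p], 1);
}

/* Writer: after unlinking an element, switch phases; returns the old one. */
int
writer_flip(struct two_phase *tp)
{
	int old;

	old = atomic_load(&tp->phase);
	atomic_store(&tp->phase, !old);
	return (old);
}

/* Writer: the unlinked element may be freed once the old phase is empty. */
bool
phase_drained(struct two_phase *tp, int p)
{
	return (atomic_load(&tp->active[p]) == 0);
}
```

A writer would unlink the element, call writer_flip(), spin until phase_drained() returns true for the old phase, and only then free the element.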
Hans Petter Selasky
62baacef3f Implement ktime_add_ms() and ktime_before() in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-03 09:02:57 +00:00
Mark Johnston
fe585be529 Verify that each frame pointer lies within the thread's kstack.
Previously, this check was omitted for the first frame pointer.

Reported by:	pho
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D16572
2018-08-03 02:51:37 +00:00
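The check described in fe585be529 amounts to a bounds test that must be applied to every frame pointer, the first one included, before it is dereferenced. A minimal sketch follows; the frame layout and the names are hypothetical and do not correspond to any particular architecture's unwinder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout: saved frame pointer, then return address. */
struct example_frame {
	uintptr_t	saved_fp;
	uintptr_t	ret_addr;
};

/*
 * Return true if a whole frame starting at 'fp' lies inside the stack
 * region [stack_base, stack_base + stack_size).  The subtraction form
 * avoids overflow when 'fp' is near the top of the address space.
 */
bool
fp_in_kstack(uintptr_t fp, uintptr_t stack_base, size_t stack_size)
{
	if (stack_size < sizeof(struct example_frame))
		return (false);
	return (fp >= stack_base &&
	    fp - stack_base <= stack_size - sizeof(struct example_frame));
}
```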