Commit Graph

112490 Commits

Author SHA1 Message Date
Ed Schouten
13b4b4df98 Provide the CloudABI vDSO to its executables.
CloudABI executables already provide support for passing in vDSOs. This
functionality is used by the emulator for OS X to inject system call
handlers. On FreeBSD, we could use it to optimize calls to
gettimeofday(), etc.

Though I don't have any plans to optimize any system calls right now,
let's go ahead and already pass in a vDSO. This will allow us to
simplify the executables, as the traditional "syscall" shims can be
removed entirely. It also means that we gain more flexibility with
regards to adding and removing system calls.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D7438
2016-08-10 21:02:41 +00:00
Stephen J. Kiernan
bf38f89f99 Add kernel environment variables under smbios.system for the following
SMBIOS Type 1 fields:
smbios.system.sku      - SKU Number (SMBIOS 2.4 and above)
smbios.system.family   - Family (SMBIOS 2.4 and above)

Add kernel environment variables under smbios.planar for the following
SMBIOS Type 2 fields:
smbios.planar.tag      - Asset Tag
smbios.planar.location - Location in Chassis

Reviewed by:	jhb, grembo
Approved by:	sjg (mentor)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D7453
2016-08-10 18:23:23 +00:00
Michael Tuexen
be46a7c54d Improve a consistency check to not detect valid cases for
unordered user messages using DATA chunks as invalid ones.
While there, ensure that error causes are provided when
sending ABORT chunks in case of reassembly problems detected.
Thanks to Taylor Brandstetter for making me aware of this problem.
MFC after:	3 days
2016-08-10 17:19:33 +00:00
Edward Tomasz Napierala
411455a8fb Replace all remaining calls to vprint(9) with vn_printf(9), and remove
the old macro.

MFC after:	1 month
2016-08-10 16:12:31 +00:00
Ed Schouten
66bd4c2eeb Make cpu_set_user_tls() work when called on the running thread.
On all the other architectures, this function can also be called on the
currently running thread. In this case, we shouldn't fix up the address
in the PCB, but also patch up the register itself. Otherwise it will not
become active and will simply become overwritten by the next switch.

Reviewed by:	imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D7437
2016-08-10 15:45:25 +00:00
Mateusz Guzik
7c34b35b57 ktrace: do a lockless check on fork to see if tracing is enabled
This saves 2 lock acquisitions in the common case.
2016-08-10 15:25:44 +00:00
Mateusz Guzik
382172be68 sigio: do a lockless check in funsetownlist
There is no need to grab the lock first to see if sigio is used, and it
typically is not.
2016-08-10 15:24:15 +00:00
Konstantin Belousov
3a77833e87 Fix indentation.
Reported by:	hselasky
MFC after:	17 days
2016-08-10 14:41:53 +00:00
Konstantin Belousov
15ad3e51c5 Convert another tmpfs assert into runtime check.
The offset of the directory file, passed to getdirentries(2) syscall,
is user-controllable.  The value of the offset must not be asserted,
instead the invalid value should be checked and rejected if invalid.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-10 13:50:21 +00:00
Ruslan Bukin
3cd77f5416 Consider CROSS_BINUTILS_PREFIX environment variable so we use correct
objdump.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-08-10 13:49:17 +00:00
Konstantin Belousov
49c394a970 Re-schedule signals after kthread exits, since apparently there are
processes which combine kernel and non-kernel threads, e.g. nfsd.  For
such processes, termination of a kthread must recheck signal delivery
among other threads according to masks.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-10 13:47:12 +00:00
Konstantin Belousov
e42f8233fc Unconditionally perform checks that FPU region was entered, when #NM
exception is caught in kernel mode.  There are third-party modules
which trigger the issue, and since the problem causes usermode state
corruption at least, panic in production kernels as well.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-10 13:44:03 +00:00
Ruslan Bukin
2d700cb557 Remove extra -msoft-float flags settings.
This helps to build firmware modules.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-08-10 13:32:27 +00:00
Ruslan Bukin
5f8228b2f3 o Remove operation in machine mode.
Machine privilege level was specially designed to use in vendor's
  firmware or bootloader. We have implemented operation in machine
  mode in FreeBSD as part of understanding RISC-V ISA, but it is time
  to remove it.
  We now use BBL (Berkeley Boot Loader) -- standard RISC-V firmware,
  which provides operation in machine mode for us.
  We now use standard SBI calls to machine mode, instead of handmade
  'syscalls'.
o Remove HTIF bus.
  HTIF bus is now legacy and no longer exists in RISC-V specification.
  HTIF code still exists in Spike simulator, but BBL do not provide
  raw interface to it.
  Memory disk is only choice for now to have multiuser booted in Spike,
  until Spike has implemented more devices (e.g. Virtio, etc).

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-08-10 12:41:36 +00:00
Andrew Turner
4cf6e978b4 Uncomment the vm.kvm_size and vm.kvm_free sysctls. These work as expected so
there is no reason to leave them commented out.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-08-10 10:36:11 +00:00
Andrew Turner
4088f71f9b Implement pmap_align_superpage on arm64 based on the amd64 implementation.
This will be needed when superpage support is added.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-08-10 10:13:34 +00:00
Sepherosa Ziehau
127d6d1a64 hyperv/hn: Reorganize send done callback.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7450
2016-08-10 03:11:07 +00:00
Jean-Sébastien Pédron
650b96935c sys/pcpu.h: Revert change introduced in r303890
`device_t` is not defined outside the kernel but this header is used by
eg. libkvm or vmstat(8). Thus, r303890 broke the build.

So let's restore `struct device` here until a longer term solution is
found.

Reported by:	Michael Butler <imb@protected-networks.net>, Jenkins
MFC after:	3 days
MFC with:	r303890
2016-08-09 21:45:47 +00:00
Pedro F. Giffuni
a061aa46fe sys: replace comma with semicolon when pertinent.
Uses of commas instead of a semicolons can easily go undetected. The comma
can serve as a statement separator but this shouldn't be abused when
statements are meant to be standalone.

Detected with devel/coccinelle following a hint from DragonFlyBSD.

MFC after:	1 month
2016-08-09 19:42:20 +00:00
Pedro F. Giffuni
f0be707d74 sys/dev: replace comma with semicolon when pertinent.
Uses of commas instead of a semicolons can easily go undetected. The comma
can serve as a statement separator but this shouldn't be abused when
statements are meant to be standalone.

Detected with devel/coccinelle following a hint from DragonFlyBSD.

MFC after:	1 month
2016-08-09 19:41:46 +00:00
Jean-Sébastien Pédron
bd937497ea Consistently use device_t
Several files use the internal name of `struct device` instead of
`device_t` which is part of the public API. This patch changes all
`struct device *` to `device_t`.

The remaining occurrences of `struct device` are those referring to the
Linux or OpenBSD version of the structure, or the code is not built on
FreeBSD and it's unclear what to do.

Submitted by:	Matthew Macy <mmacy@nextbsd.org> (previous version)
Approved by:	emaste, jhibbits, sbruno
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7447
2016-08-09 19:32:06 +00:00
John Baldwin
264cd10809 Add additional constants.
- Add constants for the fields in the root-entry table address register,
  namely the root type type (RTT) and root table address (RTA) mask.
- Add macros for the bitmask of the domain ID field in the second word
  of context table entries as well as a helper macro (DMAR_CTX2_GET_DID)
  to extract the domain ID from a context table entry.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Chelsio Communications
2016-08-09 19:02:14 +00:00
John Baldwin
10012d5309 Reliably return PCI_GETCONF_LAST_DEVICE from PCIOCGETCONF.
Previously the loop in PCIIOCGETCONF would terminate as soon as it
found enough matches.  Now it will continue iterating through the
PCI device list and only terminate if it finds another matching device
for which it has no room to store a conf structure.  This means that
PCI_GETCONF_LAST_DEVICE is reliably returned when the number of
matching devices is equal to the number of slots in the matches
buffer.  For example, if a program requests the conf structure for a
single PCI function with a specified domain/bus/slot/function it will
now get PCI_GETCONF_LAST_DEVICE instead of PCI_GETCONF_MORE_DEVS.

While here, simplify the loop conditional a bit more by explicitly
breaking out of the loop if copyout() fails and removing a redundant
i < pci_numdevs check.

Reviewed by:	vangyzen, imp
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7445
2016-08-09 17:57:11 +00:00
John Baldwin
ec55567ce6 Track the base absolute ID of ingress and egress queues.
Use this to map an absolute queue ID to a logical queue ID in interrupt
handlers.  For the regular cxgbe/cxl drivers this should be a no-op as
the base absolute ID should be zero.  VF devices have a non-zero base
absolute ID and require this change.  While here, export the absolute ID
of egress queues via a sysctl.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7446
2016-08-09 17:49:42 +00:00
Edward Tomasz Napierala
7ee3fef85a Remove some NULL checks after M_WAITOK allocations from sys/arm/.
MFC after:	1 month
2016-08-09 16:02:35 +00:00
Edward Tomasz Napierala
75ae87ede1 Remove NULL checks after M_WAITOK allocations from nand(4).
MFC after:	1 month
2016-08-09 15:56:33 +00:00
Edward Tomasz Napierala
108c227330 Remove NULL checks after M_WAITOK allocations from sys/dev/ofw/.
MFC after:	1 month
2016-08-09 15:55:14 +00:00
Edward Tomasz Napierala
4466486fb6 Remove NULL check after M_WAITOK allocation from mpt(4).
MFC after:	1 month
2016-08-09 15:52:17 +00:00
Edward Tomasz Napierala
1691e8bfb1 Remove NULL checks after M_WAITOK allocations from msk(4).
MFC after:	1 month
2016-08-09 15:51:11 +00:00
Edward Tomasz Napierala
1680b6b1e0 Remove NULL checks after M_WAITOK allocations from tws(4).
MFC after:	1 month
2016-08-09 15:50:03 +00:00
Hans Petter Selasky
5f9e5b5e62 Fix for use after free.
Clear the device description to avoid use after free because the
bsddev is not destroyed when the mlx5en module is unloaded. Only when
the parent mlx5 module is unloaded the bsddev is destroyed. This fixes
a panic on listing sysctls which refer strings in the bsddev after the
mlx5en module has been unloaded.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-09 07:43:15 +00:00
Andriy Gapon
96762fe314 fix a zfs cross-device rename crash introduced in r303763
The problem was that 'zfsvfs' variable was not initialized if the error
was detected, but in the exit path the variable was dereferenced before
the error code was checked.

Reported by:	np
MFC after:	3 days
X-MFC with:	r303763
2016-08-09 06:11:24 +00:00
Sepherosa Ziehau
b5070c1829 hyperv/hn: Move gpa array out of netvsc_packet.
Prepare to deprecate the netvsc_packet.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7436
2016-08-09 04:50:20 +00:00
Stephen J. Kiernan
0ce1624d0e Move IPv4-specific jail functions to new file netinet/in_jail.c
_prison_check_ip4 renamed to prison_check_ip4_locked

Move IPv6-specific jail functions to new file netinet6/in6_jail.c
_prison_check_ip6 renamed to prison_check_ip6_locked

Add appropriate prototypes to sys/sys/jail.h

Adjust kern_jail.c to call prison_check_ip4_locked and
prison_check_ip6_locked accordingly.

Add netinet/in_jail.c and netinet6/in6_jail.c to the list of files that
need to be built when INET and INET6, respectively, are configured in the
kernel configuration file.

Reviewed by:	jtl
Approved by:	sjg (mentor)
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D6799
2016-08-09 02:16:21 +00:00
Adrian Chadd
603e05dbcb [ar9300] don't program a negative readytime. 2016-08-09 01:05:29 +00:00
John Baldwin
a56745092c Reserve an adapter flag IS_VF to mark VF devices vs PF devices.
Sponsored by:	Chelsio Communications
2016-08-08 21:45:39 +00:00
John Baldwin
8f6690d385 Fix a typo. 2016-08-08 21:28:02 +00:00
Mark Johnston
434ac8b6b7 Handle races with listening socket close when connecting a unix socket.
If the listening socket is closed while sonewconn() is executing, the
nascent child socket is aborted, which results in recursion on the
unp_link lock when the child's pru_detach method is invoked. Fix this
by using a flag to mark such sockets, and skip a part of the socket's
teardown during detach.

Reported by:	Raviprakash Darbha <rdarbha@juniper.net>
Tested by:	pho
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D7398
2016-08-08 20:25:04 +00:00
Sean Bruno
2f632dbb0b Avoid panic from ng_uncallout when unpluggin ethernet cable with active
PPTP VPN connection.

Submitted by:	Michael Zhilin <mizhka@gmail.com>
Reviewed by:	ngie
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7209
2016-08-08 19:31:01 +00:00
Sean Bruno
f7c7398485 Fixup ixl(4) options parsing to actually compile when using RSS/PCBGROUP
in GENERIC.

Fixup #ifdef RSS code blocks so that they build and add/delete variables
that were missesd during the creation of this code.

This code is untested and should have a big red warning on it.

Reported by:	npn@
MFC after:	2 days
2016-08-08 18:57:50 +00:00
Hans Petter Selasky
57d5dd7907 Switch to the new block based LRO input function for the mlx5en
driver. This change significantly increases the overall RX aggregation
ratio for heavily loaded networks handling 10-80 thousand simultaneous
connections.

Remove the turbo LRO code and all references to it which has now been
superceeded by the tcp_lro_queue_mbuf() function.

Tested by:	Netflix
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-08 16:22:16 +00:00
Ryan Stone
f57637a47b Don't enqueue NULL on a drbr
In one corner case in the bxe TX path, a NULL mbuf could be enqueued onto
a drbr queue.  This could case a KASSERT to fire with INVARIANTS enabled,
or the processing of packets from the queue to be prematurely ended later
on.

Submitted by:	Matt Joras (matt.joras AT isilon.com)
Reviewed by:	davidcs
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7041
2016-08-08 16:19:24 +00:00
Michael Tuexen
d6e73fa13d Fix the sending of FORWARD-TSN and I-FORWARD-TSN chunks. The
last SID/SSN pair wasn't filled in.
Thanks to Julian Cordes for providing a packetdrill script
triggering the issue and making me aware of the bug.

MFC after:	3 days
2016-08-08 13:52:18 +00:00
Ed Schouten
a0d3a7a158 Import vDSO-related source files from the CloudABI repository.
CloudABI executables that are emulated on Mac OS X do not invoke system
calls through "syscall". Instead, they make use of a vDSO that is
provided by the emulator that provides symbols for all of the system
call routines. The emulator can implement these any way it likes.

At some point in time we want to do this for native execution as well,
so that CloudABI executables are entirely oblivious of how system calls
need to be performed. They will simply call into functions and let that
deal with all of the details.

These source files can be used to generate a simple vDSO that does
nothing more than invoke "syscall". All we need to do now is map it into
the processes.

Obtained from:	https://github.com/NuxiNL/cloudabi
2016-08-08 13:15:37 +00:00
Michael Tuexen
9c5ca6f247 Fix a locking issue found by stress testing with tsctp.
The inp read lock neeeds to be held when considering control->do_not_ref_stcb.
MFC after:	3 days
2016-08-08 08:20:10 +00:00
Sepherosa Ziehau
f077729dbd hyperv/ic: Pass the channel callback to hv_util_attach()
The saved channel callback in util softc is actually never used.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7424
2016-08-08 06:18:54 +00:00
Sepherosa Ziehau
9b6306f287 hyperv/ic: Expose the receive buffer length for callers to use.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7423
2016-08-08 06:11:28 +00:00
Sepherosa Ziehau
f5207880ba hyperv/ic: Remove never used second parameter of hv_negotiate_version()
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7422
2016-08-08 05:57:38 +00:00
Sepherosa Ziehau
e6aeff0c0a etherswitch: Unbreak LINT build
Sponsored by:	Microsoft
2016-08-08 05:57:04 +00:00
Michael Tuexen
124d851acf Consistently check for unsent data on the stream queues.
MFC after:	3 days
2016-08-07 23:04:46 +00:00
Ed Schouten
df31593d00 Sync in the latest CloudABI constants and data types.
The only change is the addition of AT_SYSINFO_EHDR, which can be used
for providing a vDSO.

Obtained from:	https://github.com/NuxiNL/cloudabi
2016-08-07 21:23:55 +00:00
Justin Hibbits
080286e983 Set EN_MAS7_UPDATE HID0 bit for e500 core.
Without enabling this bit, tlbre and tlbsx don't update the MAS7 register,
resulting in garbage in the register after a read (rather, the previous setting
of it for a tlbwe).  This can result in mmu_booke_mapdev_attr() thinking
mappings that should match actually don't, because tlb1_read_entry() can't
determine the correct address of a given entry.

MFC after:	11-RELEASE
2016-08-07 19:09:56 +00:00
Sean Bruno
4294f337b0 ixl(4): Update to ixl-1.6.6-k.
Submitted by:	erj
Reviewed by:	jeffrey.e.pieper@intel.com
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D7391
2016-08-07 18:12:36 +00:00
Jason A. Harmening
b57417e56f powerpc busdma: Use pmap_quick_enter_page()/pmap_quick_remove_page() to handle
bouncing of unmapped buffers.  Also treat userspace buffers as unmapped, to
avoid borrowing the UVA for copies.  This allows sync'ing userspace buffers
outside the context of the owning process, and sync'ing bounced maps in
non-sleepable contexts.

This change is equivalent to r286787 for x86.

Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D3989
2016-08-07 15:50:08 +00:00
Michael Tuexen
4d58b0c3a9 Remove stream queue entry consistently from wheel.
While there, improve the handling of drain.

MFC after:	3 days
2016-08-07 12:51:13 +00:00
Brooks Davis
df28d191fa Use a more conventional spelling of "breakpoint". 2016-08-07 09:02:54 +00:00
Adrian Chadd
eb81dc79e9 Extract out the various local definitions of ETHER_IS_BROADCAST() and
turn them into a shared definition.

Set M_MCAST/M_BCAST appropriately upon packet reception in net80211, just
before they are delivered up to the ethernet stack.

Submitted by:	rstone
2016-08-07 03:48:33 +00:00
Adrian Chadd
1b334c8bc1 [arswitch] extend the debug support to be configurable at runtime.
* remove the DEBUG ifdef; defining it is too far reaching throughout
  the whole system;
* add a bitmask in the softc for controlling debugging;
* .. enable said debugging as a sysctl;
* add bitmaps for register access, reset and vlans.

TODO:

* Now that the debug statements are configurable, we definitely could
  do with more debugging
* Move the debugging into the top-level etherswitch driver and have
  sub-drivers obey.
2016-08-07 01:32:37 +00:00
Adrian Chadd
b812fe4d6b [mips] add support for using the MIPS user register for TLS data.
This work, originally from Stacey Son, uses the MIPS UserReg for
reading the TLS data, and will fall back to the normal syscall path
when it isn't supported.

This code dynamically patches cpu_switch() to bypass the UserReg
instruction so to avoid generating a machine exception.

Thanks to sson for the original work, and to Dan Nelson for
bringing it to date and testing it on MIPS32 with me.

Tested:

* mips64 (sson)
* mips74k (dnelson_1901@yahoo.com) - AR9344 SoC, UserReg support
* mips24k (adrian) - AR9331 SoC, no UserReg support

Obtained from:	sson, dnelson_1901@yahoo.com
2016-08-07 01:29:55 +00:00
Cy Schubert
e13c1ef1e7 Add Logitech Unifying receiver.
MFC after:	1 week
2016-08-06 20:27:12 +00:00
Stephen J. Kiernan
a183d81dc9 Add hw.fdt sysctl node.
Make FDT blob available via opaque hw.fdt.dtb sysctl, if a DTB has been
installed by the time sysctls are registered.

Reviewed by:	andrew
Approved by:	sjg (mentor)
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D7411
2016-08-06 18:48:47 +00:00
Pedro F. Giffuni
25efc7c822 ext2fs: Add defines for some missing ext4 feature flags.
These are currently unused in our implementation and some even appear to
have not been implemented yet on linux but it is good to keep them for
reference.

Obtained from:	NetBSD (CVS Rev. 1.41)
MFC after:	1 month
2016-08-06 17:24:35 +00:00
Pedro F. Giffuni
ef4736fea8 ext2fs: Add some more inode flags.
These are currently unused in out implementation but it is good to keep
them for reference.

Obtained from:	NetBSD (CVS Rev. 1.35)
MFC after:	1 month
2016-08-06 16:48:40 +00:00
Michael Tuexen
cf46cace5c Don't modify a structure without holding a reference count on it.
MFC after:	3 days
2016-08-06 15:29:46 +00:00
Justin Hibbits
161c415133 Two fixups for dtrace
* Use the right incantation to get the next stack pointer.  Since powerpc uses
  special frames for traps, dereferencing the stack pointer straight up won't
  get us the next stack pointer in every case.
* Clear EE using the correct instruction sequence.  The PowerISA states that
  'andi.' ANDs the register with 0||<imm>, instead of sign extending or filling
  out the unavailable bits with 1.  Even if it did sign extend, PSL_EE is
  0x8000, so ~PSL_EE is 0x7fff, and the upper bits would be cleared.  Use rlwinm
  in the 32-bit case, and a two-rotate sequence in the 64-bit case, the latter
  chosen to follow the output generated by gcc.

MFC after:	1 week
2016-08-06 15:06:19 +00:00
Michael Tuexen
bfe7e9328c Mark an unused parameter as such.
MFC after:	3 days
2016-08-06 12:51:07 +00:00
Michael Tuexen
d1ea5fa9c2 Fix various bugs in relation to the I-DATA chunk support
This is joint work with rrs.

MFC after:	3 days
2016-08-06 12:33:15 +00:00
Andriy Gapon
4fb51b52ef fix .zfs-related cases in zfs_lookup that were broken by r303763
The problem is that the special .zfs nodes are not represented by
znodes but by special gfs-based nodes.
r303763 changed interface of zfs_dirlook such that started operating on
znodes rather than on vnodes and, thus, the function became unsuitable
for handling .zfs entities.
The solution is to move the handling of the special cases to zfs_lookup,
the only consumer of zfs_dirlook.
I already had this solution applied in D7421, but for different reasons.

Reported by:	asomers
MFC after:	3 days
X-MFC with:	r303763
2016-08-06 11:02:07 +00:00
Eric van Gyzen
a1566487db Fix some logic in PCIe HotPlug; display EI status
The interpretation of the Electromechanical Interlock Status was
inverted, so we disengaged the EI if a card was inserted.
Fix it to engage the EI if a card is inserted.

When displaying the slot capabilites/status with pciconf:

- We inverted the sense of the Power Controller Control bit,
  saying the power was off when it was really on (according to
  this bit).  Fix that.

- Display the status of the Electromechanical Interlock:
        EI(engaged)
        EI(disengaged)

Reviewed by:	jhb
MFC after:	3 days
Sponsored by:	Dell Inc.
Differential Revision:	https://reviews.freebsd.org/D7426
2016-08-05 23:23:48 +00:00
Mark Johnston
f1c1f188f7 mthca: Add a wrapper for the firmware's DIAG_RPRT command.
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-08-05 21:34:09 +00:00
Adrian Chadd
921d7118e1 [ar934x] add tap/tun as modules, for people who wish to use VPNs. 2016-08-05 17:17:36 +00:00
Adrian Chadd
ab9df3af7f [arge] add some extra MDIO debugging support
* add an ANY debug level which will always echo the message if debugging
  is compiled in;
* log MDIO transaction timeouts if debugging is compiled in;
* the argemdio device is different to arge, so turning on MDIO debugging
  flags in arge->sc_debug doesn't help.  Add a debug sysctl to argemdio
  as well so that MDIO transactions can be debugged.

Tested:

* AR9331
2016-08-05 17:16:35 +00:00
Alan Cox
f0edf3f806 Correct a spelling error. 2016-08-05 16:44:11 +00:00
Roger Pau Monné
3c9d594089 xen-netfront: improve the logic when handling nic features from ioctl
Simplify the logic involved in changing the nic features on the fly, and
only reset the frontend when really needed (when changing RX features). Also
don't return from the ioctl until the interface has been properly
reconfigured.

While there, make sure XN_CSUM_FEATURES is used consistently.

Reported by:	julian
MFC after:	5 days
X-MFC-with:	r303488
Sponsored by:	Citrix Systems R&D
2016-08-05 15:48:56 +00:00
Edward Tomasz Napierala
37e56c6efe Remove lockmgr_waiters(9) and BUF_LOCKWAITERS(9); they were not used
for anything.

Reviewed by:	kib@
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D7420
2016-08-05 13:53:28 +00:00
Sepherosa Ziehau
b9ec6f0b02 tcp/lro: If timestamps mismatch or it's a FIN, force flush.
This keeps the segments/ACK/FIN delivery order.

Before this patch, it was observed: if A sent FIN immediately after
an ACK, B would deliver FIN first to the TCP stack, then the ACK.
This out-of-order delivery causes one unnecessary ACK sent from B.

Reviewed by:	gallatin, hps
Obtained from:  rrs, gallatin
Sponsored by:	Netflix (rrs, gallatin), Microsoft (sephe)
Differential Revision:	https://reviews.freebsd.org/D7415
2016-08-05 09:08:00 +00:00
Hans Petter Selasky
f87a304c8b Keep a reference count on USB keyboard polling to allow recursive
cngrab() during a panic for example, similar to what the AT-keyboard
driver is doing.

Found by:	Bruce Evans <brde@optusnet.com.au>
MFC after:	1 week
2016-08-05 08:58:00 +00:00
Sepherosa Ziehau
6fd14eb940 hyperv/vmbus: Only make sure the TX bufring will not be closed.
KVP can write data, whose size is > 1/2 TX bufring size.

MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D7414
2016-08-05 08:57:51 +00:00
Andriy Gapon
f79bc17233 zfs: honour and make use of vfs vnode locking protocol
ZFS POSIX Layer is originally written for Solaris VFS which is very
different from FreeBSD VFS.  Most importantly many things that FreeBSD VFS
manages on behalf of all filesystems are implemented in ZPL in a different
way.
Thus, ZPL contains code that is redundant on FreeBSD or duplicates VFS
functionality or, in the worst cases, badly interacts / interferes
with VFS.

The most prominent problem is a deadlock caused by the lock order reversal
of vnode locks that may happen with concurrent zfs_rename() and lookup().
The deadlock is a result of zfs_rename() not observing the vnode locking
contract expected by VFS.

This commit removes all ZPL internal locking that protects parent-child
relationships of filesystem nodes.  These relationships are protected
by vnode locks and the code is changed to take advantage of that fact
and to properly interact with VFS.

Removal of the internal locking allowed all ZPL dmu_tx_assign calls to
use TXG_WAIT mode.

Another victim, disputable perhaps, is ZFS support for filesystems with
mixed case sensitivity.  That support is not provided by the OS anyway,
so in ZFS it was a buch of dead code.

To do:
- replace ZFS_ENTER mechanism with VFS managed / visible mechanism
- replace zfs_zget with zfs_vget[f] as much as possible
- get rid of not really useful now zfs_freebsd_* adapters
- more cleanups of unneeded / unused code
- fix / replace .zfs support

PR:		209158
Reported by:	many
Tested by:	many (thank you all!)
MFC after:	5 days
Sponsored by:	HybridCluster / ClusterHQ
Differential Revision: https://reviews.freebsd.org/D6533
2016-08-05 06:23:06 +00:00
Conrad Meyer
520f6023de ioat(4): Log channel number in CTR events 2016-08-05 02:56:31 +00:00
Bryan Drewery
c1fa440409 Regenerate after r303755.
MFC after:	3 days
X-MFC-With:	r303755
Sponsored by:	EMC / Isilon Storage Division
2016-08-04 19:15:51 +00:00
Bryan Drewery
417d5dec39 Still provide freebsd10_* symbols from libc for COMPAT10.
r296773 was done to only remove libc symbols for <7.  We want to provide
the syscall symbols going forward for 7+.

Discussed with:	jhb
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-08-04 19:14:18 +00:00
John Baldwin
f454e7ebf5 Add __printflike() to bus_describe_intr() to enable -Wformat checks.
Fix a few places that were passing a raw string as the format to use
a "%s" format string instead.

MFC after:	2 months
2016-08-04 18:29:16 +00:00
John Baldwin
ad1d96ca7a Don't permit mappings of invalid physical addresses on amd64 via /dev/mem.
Discussed with:	kib
2016-08-04 17:55:23 +00:00
Adrian Chadd
c94dc8089f [etherswitch] add in an initial API for controlling per-port LED behaviour.
This is just implemented for the AR8327 for now.

Submitted by:	Dan Nelson <dnelson_1901@yahoo.com>
2016-08-04 17:45:35 +00:00
Navdeep Parhar
9217931fb4 cxgbe/t4_tom: The page pod arena allocates from pod address space and
not index space.  The minimum valid allocation out of this arena is the
size of a single page pod.

Sponsored by:	Chelsio Communications
2016-08-04 17:29:42 +00:00
Alan Cox
248fe642a7 Clean up the comments and code style in and around vm_pageout_cluster().
In particular, fix factual, grammatical, and spelling errors in various
comments, and remove comments that are out of place in this function.

Reviewed by:	kib, markj
MFC after:	3 weeks
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7410
2016-08-04 16:20:12 +00:00
Andrew Turner
b0316526b0 Remove the pvh_global_lock lock from the arm64 pmap. It is unneeded on arm64
as invalidation will have completed before the pmap_invalidate_* functions
have complete.

Discussed with:	alc, kib
Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-08-04 13:49:36 +00:00
Edward Tomasz Napierala
7b255097eb Remove unused - never actually implemented - vnode lock types
from vnode_if.src.

MFC after:	1 month
2016-08-04 13:45:18 +00:00
Andriy Gapon
eec75f198b report sector size and number of sectors in lsdev output for bios disks
MFC after:	3 weeks
2016-08-04 06:40:51 +00:00
Sepherosa Ziehau
8d17f17043 hyperv/storvsc: Claim SPC-3 conformance, thus enable UNMAP support
The Hyper-V on pre-win10 systems will only report SPC-2 conformance,
but it actually conforms to SPC-3.  The INQUIRY response is adjusted
to propagate the SPC-3 version information to CAM.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7405
2016-08-04 05:05:35 +00:00
Adrian Chadd
864ce9933d Add in tap/tun for openvpn-on-mips experiments.
(FWIW, it does work.)
2016-08-04 01:49:18 +00:00
Adrian Chadd
da21d39781 [ar934x] shuffle AR93XX_BASE -> std.AR934X 2016-08-03 19:23:22 +00:00
Adrian Chadd
aa4a430d2c [ar9330] ok, fine, I'll finally undo the 2011-era mistake of _BASE config files.
Repeated prodding by: imp
2016-08-03 19:18:53 +00:00
Adrian Chadd
cc98d789a8 [ar9330] add in module support for ipfw nat.
This actually does work, and works pretty well.
2016-08-03 19:13:09 +00:00
Bryan Drewery
78be18ae6e Correct some comments.
Sponsored by:	EMC / Isilon Storage Division
MFC after:	3 days
2016-08-03 18:48:56 +00:00
Emmanuel Vadot
80ab358ee4 We need aw_nmi to be attached which needs GIC so attach a bit later.
Also the GPIOC doesn't need to be attach early

Reviewed by:	andrew
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D7082
2016-08-03 18:45:56 +00:00
Oleksandr Tymoshenko
161f7406ad Fix EHCI driver by excluding first 512K from available memory
On Zynq 256K-512K memory region is not accessible by all bus masters.
EHCI driver fails when trying to use it for DMA transfers. Patching
memory node does not help because ubldr overrides values there with
the ones obtained from u-boot. So as a workaround we just mark first
512K as reserved.

PR:		211484
Submitted by:	Thomas Skibo <thoma555-bsd@yahoo.com>
MFC after:	3 days
2016-08-03 18:03:14 +00:00
Mark Johnston
15f6139cc2 Fix a few cosmetic issues in boot1.efi.
- Use ANSI function signatures.
- Remove unneeded checks for a NULL boot module.
- Use nitems().

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-08-03 17:17:01 +00:00
John Baldwin
847bfa8eb7 Use the port device name for the iov device for Chelsio T4/T5 cards.
Chelsio T4/T5 adapters are multifunction cards.  The main driver uses
physical function 4 (PF4).  However, VF devices for SR-IOV are only
supported on physical functions 0 through 3, where PF0 creates VFs tied
to port 0, etc.  The t4iov/t5iov driver was previously added to
create VF devices for ports that are present on each adapter.  This
change uses the recently added pci_iov_attach_name() function to
name the character device in /dev/iov after the associated port on
the card (e.g. /dev/iov/cxl0 is used to create VFs that share the
cxl0 port).  With this in place, mark the t4iov/t5iov devices quiet
to prevent them from cluttering dmesg.

Reviewed by:	rstone
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7402
2016-08-03 17:11:08 +00:00
John Baldwin
0aee83cc1d Permit the name of the /dev/iov entry to be set by the driver.
The PCI_IOV option creates character devices in /dev/iov for each PF
device driver that registers support for creating VFs.  By default the
character device is named after the PF device (e.g. /dev/iov/foo0).
This change adds a variant of pci_iov_attach() called pci_iov_attach_name()
that allows the name of the /dev/iov entry to be specified by the
driver.

Reviewed by:	rstone
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7400
2016-08-03 17:09:12 +00:00
John Baldwin
1f095f7051 Apply the fix from r232612 to fixed function counters.
Reviewed by:	emaste
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D7397
2016-08-03 16:52:00 +00:00
Konstantin Belousov
ad600ac8e3 Remove ncl_printf(), use printf(9) directly. After r303710 the
function duplicates printf().

Correct function names in the messages [*].

Noted by:	bde [*]
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-03 15:58:20 +00:00
John Baldwin
2de70600fa Correct assertion on vcpuid argument to vm_gpa_hold().
PR:		208168
Submitted by:	Dave Cameron <daverabbitz@ihug.co.nz>
Reviewed by:	grehan
MFC after:	1 month
2016-08-03 15:20:10 +00:00
Konstantin Belousov
fa03524a9f Merge i386 and amd64 variants of mp_watchdog.c into x86/, there is no
difference between files.
For pc98, put x86/mp_x86.c into the same place as used by i386 file list.
Fix typo in comment.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-03 13:51:53 +00:00
Konstantin Belousov
83d7cf21ea Remove unneeded (recursing) Giant acquisition around vprintf(9).
Reviewed by:	rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-03 11:49:17 +00:00
Mateusz Guzik
0453ade508 locks: fix sx compilation on mips after r303643
The kernel.h header is required for the SYSINIT macro, which apparently
was present on amd64 by accident.

Reported by:	kib
2016-08-03 09:15:10 +00:00
Konstantin Belousov
29ffb32ccd Remove Giant asserts. Update comment.
Owning Giant in the init/uninit is accidental due to the moment where
VFS modules initialization is performed, and is not enforced by the
VFS interface.  The Giant lock does not prevent a parallel execution
of the code, it is VFS which implements the proper protocol.

Approved by:	des (pseudofs maintainer)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-03 08:57:15 +00:00
Konstantin Belousov
6828ba639a Some style changes. Fix a typo in comment.
Approved by:	des (pseudofs maintainer)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-03 08:53:29 +00:00
Konstantin Belousov
0c657d22eb Explain why swapgeom_close_ev() is delegated.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-03 07:11:19 +00:00
Konstantin Belousov
e69ba32f88 Remove mention of the Giant from the fork_return() description.
Making emphasis on this lock in the core function comment is confusing
for the modern kernel.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2016-08-03 07:10:09 +00:00
Ed Schouten
e938ebbc0c Regenerate system call tables for r303699 and r303700. 2016-08-03 06:36:45 +00:00
Ed Schouten
1b42875d4b Re-add traling slash that was removed in r303699.
I must have accidentally pressed some random key in vim.
2016-08-03 06:35:58 +00:00
Ed Schouten
a813fdc6c3 mprotect(): Change prototype to comply to POSIX.
Our mprotect() function seems to take a "const void *" address to the
pages whose permissions need to be adjusted. POSIX uses "void *". Simply
stick to the POSIX one to prevent us from writing unportable code.

PR:		211423 (exp-run)
Tested by:	antoine@ (Thanks!)
2016-08-03 06:33:04 +00:00
Justin Hibbits
fc5b6d7242 Remove a duplicate PMC CPU number
CPU number for MPC85XX is a duplicate for E500.  Since we don't support MPC85XX
"uncore" registers right now, rather than renumbering it simply remove it, and
it will properly use the E500 CPU ID.
2016-08-03 01:46:55 +00:00
Justin Hibbits
6cedae09a2 Merge MPC85XX and QorIQ config options
Summary:
MPC85XX and QorIQ are very similar.  When the DPAA dTSEC driver was
added, QORIQ_DPAA was brought in as a config option to support the differences
in hardware register settings between QorIQ (e500mc-, e5500- based) SoCs and
QUICC (e500v1/e500v2-based) SoCs, particularly in the Local Access Window (LAW)
target settings.

Unify these settings using macros to hide details and ease porting, and use a
new function (mpc85xx_is_qoriq()) to distinguish between QorIQ and QUICC SoCs at
runtime.

An alternative to using the function could be to use a variable initialized at
platform attach time, which may incur less overhead at runtime.  Since it's not
in the critical path once booted, this optimization doesn't seem necessary at
first pass.

Reviewed by: nwhitehorn
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D7294
2016-08-03 01:22:11 +00:00
Navdeep Parhar
515b36c5b5 cxgbe/t4_tom: Read the chip's DDP page sizes and save them in a
per-adapter data structure.  This replaces a global array with hardcoded
page sizes.

Sponsored by:	Chelsio Communications
2016-08-02 23:54:21 +00:00
Ed Maste
dd2be8cb6e Move/add ARM ELF PHDR types to elf_common.h
Accidentally missed in r303674
2016-08-02 20:26:04 +00:00
Ed Maste
4df4e8ac47 Add ELFOSABI_ARM_AEABI ELF OSABI constant
Reported by:	andrew
Sponsored by:	The FreeBSD Foundation
2016-08-02 18:42:32 +00:00
Andrew Turner
27c9d42531 Remove trailing whitespace from the arm64 pmap
Obtained from:	ABT Systems Ltd
MFC after:	3 weeks
Sponsored by:	The FreeBSD Foundation
2016-08-02 15:26:46 +00:00
Ruslan Bukin
98f50c44e3 Update RISC-V port to Privileged Architecture Version 1.9.
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-08-02 14:50:14 +00:00
Andrey V. Elsukov
723758b7ce Fix NULL pointer dereference.
ro pointer can be NULL when IPSec consumes mbuf.

PR:		211486
MFC after:	3 days
2016-08-02 12:18:06 +00:00
Sepherosa Ziehau
05cde7efa6 tcp/lro: Implement hash table for LRO entries.
This significantly improves HTTP workload performance and reduces
HTTP workload latency.

Reviewed by:	rrs, gallatin, hps
Obtained from:	rrs, gallatin
Sponsored by:	Netflix (rrs, gallatin) , Microsoft (sephe)
Differential Revision:	https://reviews.freebsd.org/D6689
2016-08-02 06:36:47 +00:00
Mateusz Guzik
fa5000a4f3 locks: fix compilation for KDTRACE_HOOKS && !ADAPTIVE_* case
Reported by:	Michael Butler <imb protected-networks.net>
2016-08-02 03:05:59 +00:00
Mateusz Guzik
0412689595 locks: fix up ifdef guards introduced in r303643
Both sx and rwlocks had copy-pasted ADAPTIVE_MUTEXES instead of the correct
define.

MFC after:	1 week
2016-08-02 00:15:08 +00:00
Conrad Meyer
9809e9dc3a rtentry: Initialize rt_mtx with MTX_NEW
The "rtentry" zone does not use UMA_ZONE_ZINIT, so it is invalid to assume the
mutex's memory will be zero.  Without MTX_NEW, garbage backing memory may
trigger the "re-initializing a mutex" assertion.

PR:		200991
Submitted by:	Chang-Hsien Tsai <luke.tw AT gmail.com>
2016-08-01 23:07:31 +00:00
Conrad Meyer
bf4c239e47 opencrypto AES-ICM: Fix heap corruption typo
This error looks like it was a simple copy-paste typo in the original commit
for this code (r275732).

PR:		204009
Reported by:	Chang-Hsien Tsai <luke.tw AT gmail.com>
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 22:57:03 +00:00
Conrad Meyer
5f00f45775 Fix ddb "show proc" to show full arguments
PR:		200052
Submitted by:	Chang-Hsien Tsai <luke.tw AT gmail.com>
2016-08-01 22:41:50 +00:00
John Baldwin
315048f2ad Store the offset of the KDOORBELL and GTS registers in the softc.
VF devices use a different register layout than PF devices.  Storing
the offset in a value in the softc allows code to be shared between the
PF and VF drivers.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7389
2016-08-01 22:39:51 +00:00
Mark Johnston
49fe7b5029 ipoib: Bound the number of egress mbufs buffered during pathrec lookups.
In pathological situations where the master subnet manager becomes
unresponsive for an extended period, we may otherwise end up queuing all
of the system's mbufs while waiting for a response to a path record lookup.

This addresses the same issue as commit 1e85b806f9 in Linux.

Reviewed by:	cem, ngie
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 22:22:11 +00:00
John Baldwin
2611037c1c Disable PCI hotplug support for slots with power controllers.
After further review of the spec, I do not think the current HotPlug
code handles slots with power controllers correctly.  In particular,
the power state of the slot is to be inferred from other events, not
from examining the state of the power control bit in SLOT_CTL.  For now,
disable PCI hotplug support on such slots.

PR:		211081
Tested by:	Jeffrey E Pieper <jeffrey.e.pieper@intel.com>
MFC after:	3 days
2016-08-01 22:19:23 +00:00
Mateusz Guzik
1ada904147 Implement trivial backoff for locking primitives.
All current spinning loops retry an atomic op the first chance they get,
which leads to performance degradation under load.

One classic solution to the problem consists of delaying the test to an
extent. This implementation has a trivial linear increment and a random
factor for each attempt.

For simplicity, this first thouch implementation only modifies spinning
loops where the lock owner is running. spin mutexes and thread lock were
not modified.

Current parameters are autotuned on boot based on mp_cpus.

Autotune factors are very conservative and are subject to change later.

Reviewed by:	kib, jhb
Tested by:	pho
MFC after:	1 week
2016-08-01 21:48:37 +00:00
Sean Bruno
e72a746a6e r293331 mistakingly failed to add an assignment of paddr to the rxbuf
but only in the NETMAP code.  This lead to the NETMAP code paths
passing nothing up to userland.

Submitted by:	Ad Schellevis <ad@opnsense.org>
Reported by:	Franco Fichtner <franco@opnsense.org>
MFC after:	1 day
2016-08-01 21:19:51 +00:00
Andrey V. Elsukov
0428336393 Do not invoke resize event if initial disk size is zero. Some disks
report the size only after first opening.  And due to the events are
asynchronous, some consumers can receive this event too late and
this confuses them. This partially restores previous behaviour, and
at the same time this should fix the problem, when already opened
provider loses resize event.

PR:		211028
MFC after:	3 weeks
2016-08-01 20:54:54 +00:00
Mark Johnston
4e071758a7 MFV be9130cc9: "IB/cma: Check for GID on listening devices first"
This is an optimization that improves IB connection setup times.

Discussed with:	hselasky
Obtained from:	Linux
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 20:29:09 +00:00
Mark Johnston
82f1d3ea2f MFV 29f27e847: "IB/cma: Use cached gids"
This addresses a regression from an earlier upstream change which caused
cma_acquire_dev() to bypass the port GID cache and instead query the HCA
for each entry in its GID table. These queries can become extremely slow on
multiport devices, which has a negative impact on connection setup times.

Discussed with:	hselasky
Obtained from:	Linux
MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-08-01 20:27:11 +00:00
Allan Jude
4deb8929ea Make boot code and loader check for unsupported ZFS feature flags
OpenZFS uses feature flags instead of a zpool version number to track
features since the split from Oracle. In addition to avoiding confusion
on ZFS vs OpenZFS version numbers, this also allows features to be added
to different operating systems that use OpenZFS in different order.

The previous zfs boot code (gptzfsboot) and loader (zfsloader) blindly
tries to read the pool, and if failed provided only a vague error message.

With this change, both the boot code and loader check the MOS features
list in the ZFS label and compare it against the list of features that
the loader supports. If any unsupported feature is active, the pool is
not considered as a candidate for booting, and a helpful diagnostic
message is printed to the screen. Features that are merely enabled via
zpool upgrade, but not in use, do not block booting from the pool.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	delphij, mav
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6857
2016-08-01 19:37:43 +00:00
Alan Cox
87ff568c26 Restore the historical behavior of "sysctl vm.swap_idle_enabled=1". Prior
to r254304, we had separate functions for reclamation and laundering
(vm_pageout_scan) versus updating usage information, i.e., "reference
bits", on active pages (vm_pageout_page_stats), and we only performed
vm_req_vmdaemon(VM_SWAP_IDLE) if vm_pages_needed was true.  However, since
r254303, if vm_swap_idle_enabled was "1", we have performed
vm_req_vmdaemon(VM_SWAP_IDLE) regardless of whether we are short of free
pages.  This was unintended and too aggressive, so I suspect no one uses
this feature.  With this change, we restore the historical behavior and
only perform vm_req_vmdaemon(VM_SWAP_IDLE) when we are short of free
pages.

Reviewed by:	kib, markj
2016-08-01 17:25:07 +00:00
Andrew Gallatin
d4c22202e6 Rework IPV6 TCP path MTU discovery to match IPv4
- Re-write tcp_ctlinput6() to closely mimic the IPv4 tcp_ctlinput()

- Now that tcp_ctlinput6() updates t_maxseg, we can allow ip6_output()
  to send TCP packets without looking at the tcp host cache for every
  single transmit.

- Make the icmp6 code mimic the IPv4 code & avoid returning
  PRC_HOSTDEAD because it is so expensive.

Without these changes in place, every TCP6 pmtu discovery or host
unreachable ICMP resulted in a call to in6_pcbnotify() which walks the
tcbinfo table with the write lock held.  Because the tcbinfo table is
shared between IPv4 and IPv6, this causes huge scalabilty issues on
servers with lots of (~100K) TCP connections, to the point where even
a small percent of IPv6 traffic had a disproportionate impact on
overall throughput.

Reviewed by:	bz, rrs, ae (all earlier versions), lstewart (in Netflix's tree)
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D7272
2016-08-01 17:02:21 +00:00
Landon J. Fuller
6ed73911fe [mips/broadcom] Fetch UART console configuration from CFE.
Relying on the boot loader console configuration allows us to use a
common set of device hints for all SENTRY5 devices.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7376
2016-08-01 16:29:32 +00:00
Andrew Turner
727c18a84f Split out the FDT parts of the GICv2 interrupt controller driver. This will
allow us to add an ACPI attachment for arm64.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7307
2016-08-01 16:29:04 +00:00
Landon J. Fuller
3cbea0b15c Sync CFE interface with upstream cfe-1.4.2 release.
Approved by:	adrian (mentor)
Obtained from:	https://www.broadcom.com/support/communications-processors
Differential Revision:	https://reviews.freebsd.org/D7375
2016-08-01 16:26:08 +00:00
Andrew Turner
698c14e189 Add a kernel variable to let the user to select their preferred order
between ACPI and FDT. This will be needed on machines with both, e.g. the
SoftIron Overdrive 3000. The kernel will accept one or more comma separated
values of either 'acpi' or 'fdt'. Any other values are skipped.

To set it the user can either set it on the loader command line, or
in loader.conf e.g. in loader.conf:
kern.cfg.order=acpi,fdt

This will try using ACPI then FDT. If none of the selected options work the
kernel tries to use one to get the serial console, then panics.

Reviewed by:	emaste (earlier version)
Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7274
2016-08-01 12:17:44 +00:00
Julian Elischer
d7373c820e netgraph module for reconstructing checksums
PR:		206108
Submitted by:	Dmitry Vagin  daemon.hammer@ya.ru
MFC after:	1 month
2016-08-01 12:09:04 +00:00
Julian Elischer
bf909fc9a4 slite style changes. There is an incoming patch that rewrites a
lot of this module and I want to get the style and whitespace changes in
a separate commit (or maybe more).

PR: 206185
Submitted by:	Dmitry Vagin
MFC after:	1 month
2016-08-01 11:34:12 +00:00
Andrew Turner
49a92cd4a5 Add the fields for the PAR_EL1 register. This is used when performing an
address lookup with the AT instructions.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-08-01 10:36:58 +00:00
Sepherosa Ziehau
05f7a8a69c hyperv/storvsc: Stringent PRP list assertions
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7361
2016-08-01 05:09:11 +00:00
Sepherosa Ziehau
88cb8d7812 hyperv/storvsc: Set maxio to 128KB.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7360
2016-08-01 04:51:31 +00:00
Sepherosa Ziehau
9d6016a773 hyperv/vmbus: Remove the artificial entry limit of SG and PRP list.
Just make sure that the total channel packet size does not exceed 1/2
data size of the TX bufring.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7359
2016-08-01 04:26:24 +00:00
Adrian Chadd
52fe68b878 [ath] update comments. 2016-08-01 00:36:29 +00:00
Andrew Turner
63512a1276 Add the Data Fault Status Code values to the ESR_ELx registers for when the
fault code is a Data Abort.

Obtained from:	AT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 18:58:20 +00:00
Andrew Turner
ed24579135 Extract the common parts of pmap_kenter_device to a new function. This will
be used when superpage support is added.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 17:53:09 +00:00
Andrew Turner
7592321ad1 Fix the comment above pmap_invalidate_page. tlbi will invalidate the tlb
on all CPUs.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 14:59:44 +00:00
Andrew Turner
c5b3b20907 Relax the barriers around a TLB invalidation to only wait on
inner-shareable memory accesses. There is no need for full system barriers.

Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-07-31 12:59:10 +00:00
Mateusz Guzik
61852185ba locks: change sleep_cnt and spin_cnt types to u_int
Both variables are uint64_t, but they only count spins or sleeps.
All reasonable values which we can get here comfortably hit in 32-bit range.

Suggested by: kib
MFC after:	1 week
2016-07-31 12:11:55 +00:00
Mateusz Guzik
b7ff4d59de amd64: implement pagezero using rep stos
The current implementation uses non-temporal writes. This turns out to
be detrimental to performance if the page is used shortly after, which
is the typical case with page faults.

Switch to rep stos.

Reviewed by:	kib
MFC after:	1 week
2016-07-31 11:34:08 +00:00
Adrian Chadd
c2346bfa26 [wdr4300] invert the GPIO LED polarity.
This makes them behave correctly.

Submitted by:	Dan Nelson <dnelson_1901@yahoo.com>
2016-07-31 06:52:19 +00:00
Adrian Chadd
accdd12e71 [ar71xx_gpio] handle AR934x and QCA953x GPIO OE polarity.
For reasons I won't comment on, the AR934x and QCA953x GPIO_OE register
value is inverted - bit set == input, bit clear == output.

So, fix this in the output setting, in reading the initial state from
the boot loader, and also setting any gpiofunc pins that are necessary.
2016-07-31 06:51:34 +00:00
Enji Cooper
4fcae4df7e Conditionalize code which defines sysctls per _KERNEL #ifdef guard
This resolves several issues when compiling libzpool (userspace library), i.e.
-Wimplicit-function-declaration and -Wmissing-declarations issues.

MFC after:	2 weeks
Reported by:	clang
Tested with:	clang 3.8.1, gcc 4.2.1, gcc 5.3.0
Sponsored by:	EMC / Isilon Storage Division
2016-07-31 06:34:49 +00:00
Adrian Chadd
d75e8c5089 [gpioled] add support for inverting the LED polarity.
No, this isn't a star trek science joke - sometimes LEDs are wired
up to be active low, so this is needed.

Submitted by:	Dan Nelson <dnelson_1901@yahoo.com>
2016-07-31 06:24:26 +00:00
Mateusz Guzik
e0c45af904 sx: increment spin_cnt before cpu_spinwait in xlock
The change is a no-op only done for consistency with the rest of the file.
2016-07-30 22:23:31 +00:00
Mateusz Guzik
7a54be1870 rwlock: s/READER/WRITER/ in wlock lockstat annotation 2016-07-30 22:21:48 +00:00
Alexander Motin
7e58465356 Wrap previous MSIX workaround into #ifndef EARLY_AP_STARTUP.
With EARLY_AP_STARTUP we can successfully negotiate MSIX earlier.

Requested by:	jhb@
2016-07-30 21:06:59 +00:00
Bjoern A. Zeeb
6ca2d09437 Try to declare _hw_pci for all sysctl cases needed after r303497.
MFC after:	5 days
X-MFC with:	r303497
2016-07-30 20:31:12 +00:00
Imre Vadász
a3c0e7f2fb [iwm] Fix iwm_poll_bit() usage in iwm_stop_device(), fixup r303418.
* iwm_poll_bit() returns 1 on success and 0 on failure, whereas
  iwl_poll_bit() in Linux's iwlwifi returns >= 0 on success and < 0 on
  failure.

* Because of the wrong iwm_poll_bit return code check, no warning was
  printed if tx DMA stopping failed.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7371
2016-07-30 19:03:32 +00:00
Allan Jude
aafbb33897 Improve boot loader quote parsing
parse() is the boot loader's interp_parse.c is too naive about quotes

both single and double quotes were allowed to be mixed, and single
quotes did not follow the usual semantics (re variable expansion).

The old code did not check for terminating quotes

This update implements:
 * distinguishing single and double quote
 * variable expansion will not be done inside single quote protected area
 * will preserve inner quote for values like "value 'some list'"
 * ending quote check.

this diff does not implement ending quote order check, it shouldn't
be too hard, needs some improvements on parser state machine.

PR:		204602
Submitted by:	Toomas Soome <tsoome@me.com>
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6000
2016-07-30 17:53:37 +00:00
Allan Jude
6bf8d16009 bcache should support reads shorter than sector size
dosfs (fat file systems) can perform reads of partial sectors
bcache should support such reads.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D6475
2016-07-30 17:45:56 +00:00
Alexander Motin
a8ec75016f Block MSIX negotiation until SMP started and IRQ reshuffled. 2016-07-30 15:56:36 +00:00
Alexander Motin
8260941c76 Make MAC address generation more random.
'ticks' approach does not work at boot time.
2016-07-30 15:51:16 +00:00
Alexander Motin
cf2e151f65 Fix infinite loops introduced at r303429. 2016-07-30 10:32:28 +00:00
Konstantin Belousov
50c22263bb Cache getbintime(9) answer in timehands, similarly to getnanotime(9)
and getmicrotime(9).

Suggested and reviewed by:	bde (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2016-07-30 09:25:57 +00:00
Mark Johnston
57185c52de Restore an ifdef that should not have been removed in r303535.
X-MFC-With:	r303535
2016-07-30 07:05:32 +00:00
Mark Johnston
6d1ffb50fc Include fasttrap handling for DATAMODEL_ILP32 when compiling for amd64.
MFC after:	1 month
2016-07-30 03:11:53 +00:00
Baptiste Daroussin
999c1fd64b Remove usage of _WITH_DPRINTF 2016-07-30 01:16:06 +00:00
John Baldwin
8dd07eab0e Various fixes to the t4/5nex character device.
- Remove null open/close methods.
- Don't set d_flags to 0 explicitly.
- Remove t5_cdevsw as the .d_name member isn't really used and doesn't
  warrant a separate cdevsw just for the name.
- Use ENOTTY as the error value for an unknown ioctl request.
- Use make_dev_s() to close race with setting si_drv1.

Sponsored by:	Chelsio Communications
2016-07-29 22:11:29 +00:00
Mark Johnston
897d0c6617 Use vm_page_undirty() instead of manually setting a page field.
Reviewed by:	alc
MFC after:	3 days
2016-07-29 21:05:37 +00:00
Alexander Motin
718e8abba9 Fix NTBT_QP_LINKS negotiation.
I believe it never worked correctly for more the one queue even in Linux.
This fixes case when one of consumer drivers is not loaded on one side,
but its queues still announced as ready if something else brought link up.

While there, remove some pointless NULL checks.
2016-07-29 21:03:30 +00:00
Mark Johnston
adaf3c4906 sdp: Destroy the RDMA ID after destroying the connection's queue pair.
This is the ordering documented by rdma_destroy_qp(). Also add a useful
KASSERT to sdp_pcbfree().

Sponsored by:	EMC / Isilon Storage Division
2016-07-29 21:03:02 +00:00
Mark Johnston
d3461164e0 sdp: Use malloc(9) instead of the Linux compat layer.
SDP transmit and receive rings are always created in a sleepable context,
so we can use M_WAITOK and remove error checks.

Sponsored by:	EMC / Isilon Storage Division
2016-07-29 21:01:04 +00:00
Mark Johnston
f66ec15275 sdp: Use the correct socket buffer in sdp_post_recvs_needed().
Sponsored by:	EMC / Isilon Storage Division
2016-07-29 20:54:43 +00:00
Alexander Motin
75b94efd08 Clear scratchpad after MSIX negotiation to not leak garbage. 2016-07-29 20:52:18 +00:00
Mark Johnston
7e968dab6d sdp: Always free received control packets after they're handled.
Sponsored by:	EMC / Isilon Storage Division
2016-07-29 20:51:52 +00:00
Mark Johnston
3b3e6d882f Fix the KASSERT format string arguments after r303507. 2016-07-29 20:48:42 +00:00
Mark Johnston
a3c0b052b7 sdp: Use the PCB as the rx completion handler argument.
The generic socket may be detached from the PCB before the completion
queue is drained and destroyed, so this change closes a race condition
in connection teardown.

Sponsored by:	EMC / Isilon Storage Division
2016-07-29 20:39:32 +00:00
Mark Johnston
fa46ade837 sdp: Destroy the PCB lock before freeing to the zone.
Sponsored by:	EMC / Isilon Storage Division
2016-07-29 20:36:01 +00:00
Mark Johnston
2cefa87b0b sdp: Use an mbufq for received control packets.
This is simpler than the hand-rolled queue, and fixes a use-after-free.

Sponsored by:	EMC / Isilon Storage Division
2016-07-29 20:35:04 +00:00
Mark Johnston
30a71b3c30 sdp: Remove Linux build files.
They aren't useful here, and Linux seems to have largely abandoned SDP
anyway.

Sponsored by:	EMC / Isilon Storage Division
2016-07-29 20:33:43 +00:00
John Baldwin
e2325d8285 Don't treat NOCPU as a valid CPU to CPU_ISSET.
If a thread is created bound to a cpuset it might already be bound before
it's very first timeslice, and td_lastcpu will be NOCPU in that case.

MFC after:	1 week
2016-07-29 20:19:14 +00:00
John Baldwin
005ce8e4e6 Fix locking issues with aio_fsync().
- Use correct lock in aio_cancel_sync when dequeueing job.
- Add _locked variants of aio_set/clear_cancel_function and use those
  to avoid lock recursion when adding and removing fsync jobs to the
  per-process sync queue.
- While here, add a basic test for aio_fsync().

PR:		211390
Reported by:	Randy Westlund <rwestlun@gmail.com>
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7339
2016-07-29 18:26:15 +00:00
John Baldwin
25a57bd64b Add a loader tunable (hw.pci.enable_pcie_hp) to disable PCI-e HotPlug.
Some systems and/or devices (such as riser cards) do not include a
non-compliant implementation of PCI-e HotPlug that can result in devices
not being attached (e.g. the HotPlug code might assume that a card is
being unplugged and will power the slot off and detach it).  This
tunable can be set to 0 to disable support for PCI-e HotPlug ignoring
the incorrect HotPlug state on these slots.

PR:		211081
Reported by:	Sergey Renkas <serg_ic@mail.ru> (SuperMicro X7 riser card)
Reported by:	Jeffrey E Pieper <jeffrey.e.pieper@intel.com>
	 	(Intel X520 adapter)
MFC after:	1 week
Relnotes:	yes
2016-07-29 17:54:21 +00:00
Alexander Motin
6bd57d14ed Once more refactor KPI between ntb_transport(4) and if_ntb(4)..
New design allows to attach multiple consumers to ntb_transport(4) instance.
Previous design obtained from Linux theoretically allowed that, but was not
practically usable (Linux also has only one consumer driver now).
2016-07-29 17:15:41 +00:00
Alan Cox
793172ea88 Remove a probe declaration that has been unused since r292469, when
vm_pageout_grow_cache() was replaced.

MFC after:	3 days
2016-07-29 16:43:51 +00:00
Roger Pau Monné
23006680c7 Revert r291022: x86/intr: allow mutex recursion in intr_remove_handler
This was only needed for Xen, and a better way to deal with this issue has
been found, so this commit can be reverted.

Sponsored by:		Citrix Systems R&D
MFC after:		5 days
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D7363
2016-07-29 16:35:58 +00:00
Roger Pau Monné
35fdb32d86 xen-intr: fix removal of event channels during resume
Event channel handlers cannot be removed during resume because there might
be an interrupt thread running on a CPU currently blocked in the
cpususpend_handler, which prevents the call to intr_remove_handler from
finishing and completely freezes the system during resume. r291022 tried to
fix this by allowing recursion in intr_remove_handler, but that's clearly
not enough.

Instead don't remove the handlers at the interrupt resume phase, and let
each driver remove the handler by itself during resume. In order to do this,
change the opaque event channel handler cookie to use the global interrupt
vector instead of the event channel port. The event channel port cannot be
used because after resume all event channels are reset, and the port numbers
can change.

Sponsored by:		Citrix Systems R&D
MFC after:		5 days
2016-07-29 16:34:54 +00:00
Roger Pau Monné
339690b541 xen-netfront: fix trying to send packets with disconnected netfront
In certain circumstances xn_txq_mq_start might be called with num_queues ==
0 during the resume phase after a migration, which can trigger a KASSERT.
Fix this by making sure the carrier is on before trying to transmit, or else
return that the queues are full.

Just as a note, I haven't been able to reproduce this crash on my test
systems, but I still think it's possible and worth fixing.

Reported by:		Karl Pielorz <kpielorz_lst@tdx.co.uk>
Sponsored by:		Citrix Systems R&D
MFC after:		5 days
Reviewed by:		Wei Liu <wei.liu2@citrix.com>
Differential revision:	https://reviews.freebsd.org/D7349
2016-07-29 16:33:45 +00:00
Warner Losh
6c99513703 Fix typo. 2016-07-29 15:24:50 +00:00
Ruslan Bukin
92bf0e5e2a Include FBT to modules build on RISC-V. 2016-07-29 12:30:33 +00:00
Ruslan Bukin
573d53050e Remove unused variables. 2016-07-29 12:29:17 +00:00
Edward Tomasz Napierala
905807264d Remove write-only variable.
MFC after:	1 month
2016-07-29 12:15:55 +00:00
Edward Tomasz Napierala
6be7599235 Improve error message.
MFC after:	1 month
2016-07-29 11:33:23 +00:00
Edward Tomasz Napierala
5a92ac1d68 Fix MTP description in the comment.
MFC after:	1 month
2016-07-29 11:33:01 +00:00
Andrew Turner
eda295b9e5 Add a generic EHCI USB driver based on the Allwinner A10 driver. It is ACPI
only for now, but wouldn't be too difficult to add support for FDT.

Reviewed by:	hselasky
Obtained from:	ABT Systems Ltd
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7352
2016-07-29 08:50:36 +00:00
Sepherosa Ziehau
cf6890fbdc hyperv/storvsc: Use busdma(9) and enable PIM_UNMAPPED by default.
The UNMAPPED I/O greatly improves userland direct disk I/O performance
by 35% ~ 135%.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7195
2016-07-29 06:22:11 +00:00
Sepherosa Ziehau
c0c9089709 hyperv/vmbus: Revoke unnecessary exposure of vmbus softc
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7348
2016-07-29 06:10:27 +00:00
Sepherosa Ziehau
fd42dfc550 hyperv/vmbus: Move driver glue to the beginning of the files
Just as most of other drivers do.  And move sysinit function close
to its SYSINIT.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7347
2016-07-29 05:58:24 +00:00
Sepherosa Ziehau
5b444edabc hyperv/vmbus: Forward declare static functions
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7346
2016-07-29 05:49:12 +00:00
Sepherosa Ziehau
8018156f90 hyperv/vmbus: Reindent function declarations.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7344
2016-07-29 03:16:51 +00:00
Warner Losh
08f1387933 Move protocol specific stuff into a linker set object that's
per-protocol. This reduces the number scsi symbols references by
cam_xpt significantly, and eliminates all ata / nvme symbols. There's
still some NVME / ATA specific code for dealing with XPT_NVME_IO and
XPT_ATA_IO respectively, and a bunch of scsi-specific code, but this
is progress.

Differential Revision: https://reviews.freebsd.org/D7289
2016-07-28 22:55:21 +00:00
Warner Losh
ded2b70617 Switch to linker sets to find the xport callback object. This
eliminates the need to special case everything in cam_xpt for new
transports. It is now a failure to not have a transport object when
registering the bus as well. You can still, however, create a
transport that's unspecified (XPT_)

Differential Revision: https://reviews.freebsd.org/D7289
2016-07-28 22:55:14 +00:00
Warner Losh
34dc8f1bb4 Kill a few stray debug printfs. 2016-07-28 22:40:31 +00:00
Alan Cox
f095d1bbc7 Remove any mention of cache (PG_CACHE) pages from the comments in
vm_pageout_scan().  That function has not cached pages since r284376.

MFC after:	3 days
2016-07-28 22:30:48 +00:00
Brooks Davis
40018b91dd Don't create pointless backups of generated files in "make sysent".
Any sensible workflow will include a revision control system from which
to restore the old files if required.  In normal usage, developers just
have to clean up the mess.

Reviewed by:	jhb
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D7353
2016-07-28 21:29:04 +00:00
Brooks Davis
e9004cefe9 cxgbe's firmware module fails to build on mips64 as well as mips32 so
disable for all mips.

Sponsored by:	DARPA, AFRL
2016-07-28 21:27:47 +00:00
Andrew Gallatin
0e3b891988 Call tcp_notify() directly to shoot down routes, rather than
calling in_pcbnotifyall().

This avoids lock contention on tcbinfo due to in_pcbnotifyall()
holding the tcbinfo write lock while walking all connections.

Reviewed by:	rrs, karels
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D7251
2016-07-28 19:32:25 +00:00
Stephen J. Kiernan
f11ec79842 Remove BSD and USL copyright and update license block in in_prot.c, as the
code in this file was written by Robert N. M. Waston.

Move cr_can* prototypes from sys/systm.h to sys/proc.h

Reported by:	rwatson
Reviewed by:	rwatson
Approved by:	sjg (mentor)
Differential Revision:	https://reviews.freebsd.org/D7345
2016-07-28 18:39:30 +00:00
John Baldwin
29c229e9fc Mark spg_len and fl_pktshift static.
These variables are no longer exported to t4_netmap.c after r296478.
2016-07-28 17:37:12 +00:00
Ruslan Bukin
9346408d90 Normalise the CWARNFLAGS inter-word spacing: remove all leading
and trailing space, and convert multiple consecutive spaces to
single space.

This helps to keep build output looking good.
2016-07-28 17:18:02 +00:00
Konstantin Belousov
88ad2d7b47 Do not delegate a work to geom event thread which can be done inline.
In particular, swapongeom_ev() needed event thread context when swap
pager configuration was performed under Giant and geom asserted that
Giant is not owned.  Now both of the reason went away.

On the other hand, note that swpageom_release() is called from the
bio_done context, and possible close cannot be performed inline.

Also fix some minor issues.  The swapgeom() function does not use the
td argument, remove it.  Recheck that the vnode passed is still VCHR
and not reclaimed after the lock.

Reviewed by:	mav
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-07-28 15:57:01 +00:00
Konstantin Belousov
2174a0c607 Fix style and typo.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-07-28 15:49:51 +00:00
Ed Maste
45eff3df96 remove CONSTRUCTORS from kernel linker scripts
The linker script CONSTRUCTORS keyword is only meaningful "when linking
object file formats which do not support arbitrary sections, such as
ECOFF and XCOFF"[1] and is ignored for other object file formats.

LLVM's lld does not yet accept (and ignore) CONSTRUCTORS, so just remove
CONSTRUCTORS from the linker scripts as it has no effect.

[1] https://sourceware.org/binutils/docs/ld/Output-Section-Keywords.html

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D7343
2016-07-28 13:54:46 +00:00
Ruslan Bukin
8da8940319 Build ofw_bus_if.h for modules for RISC-V. 2016-07-28 13:21:45 +00:00
Ruslan Bukin
c760a23737 Build DTrace assym.o with -msoft-float flag for RISC-V so we have
correct flag in ELF file.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-07-28 13:18:10 +00:00
Ruslan Bukin
96c072fcb0 o Add warn flags required to build modules with GCC 6.1;
o Sort GCC 4.8 warn flags.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-07-28 13:15:23 +00:00
Alexander Motin
8709930cfc Fix r303429 build with invariants. 2016-07-28 12:25:58 +00:00
Ed Schouten
5590eb985e Regenerate system call table for r303435. 2016-07-28 12:22:34 +00:00
Ed Schouten
d9c4cd2fbc Change the return type of msgrcv() to ssize_t as required by POSIX.
It looks like the msgrcv() system call is already written in such a way
that the size is internally computed as a size_t and written into all of
td_retval[0]. This means that it is effectively already returning
ssize_t. It's just that the userspace prototype doesn't match up.
2016-07-28 12:22:01 +00:00
Alexander Motin
4490696b3e Once more refactor KPI between NTB hardware and consumers.
New design allows hardware resources to be split between several consumers.
For example, one BAR can be dedicated for remote memory access, while other
resources can be used for packet transport for virtual Ethernet interface.
And even without resource split, this code allows to specify which consumer
driver should attach the hardware.

From some points this makes the code even closer to Linux one, even though
Linux does not provide the described flexibility.
2016-07-28 10:48:20 +00:00
Konstantin Belousov
2d19b736ed Rewrite subr_sleepqueue.c use of callouts to not depend on the
specifics of callout KPI.  Esp., do not depend on the exact interface
of callout_stop(9) return values.

The main change is that instead of requiring precise callouts, code
maintains absolute time to wake up.  Callouts now should ensure that a
wake occurs at the requested moment, but we can tolerate both run-away
callout, and callout_stop(9) lying about running callout either way.

As consequence, it removes the constant source of the bugs where
sleepq_check_timeout() causes uninterruptible thread state where the
thread is detached from CPU, see e.g. r234952 and r296320.

Patch also removes dual meaning of the TDF_TIMEOUT flag, making code
(IMO much) simpler to reason about.

Tested by:	pho
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D7137
2016-07-28 09:09:55 +00:00
Konstantin Belousov
a9e182e895 Extract the calculation of the callout fire time into the new function
callout_when(9).  See the man page update for the description of the
intended use.

Tested by:	pho
Reviewed by:	jhb, bjk (man page updates)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7137
2016-07-28 08:57:01 +00:00
Konstantin Belousov
ad90086e85 Fix typo in comment.
MFC after:  3 days
2016-07-28 08:53:38 +00:00
Konstantin Belousov
b7a25e63b6 When a debugger attaches to the process, SIGSTOP is sent to the
target.  Due to a way issignal() selects the next signal to deliver
and report, if the simultaneous or already pending another signal
exists, that signal might be reported by the next waitpid(2) call.
This causes minor annoyance for debuggers, which must be prepared to
take any signal as the first event, then filter SIGSTOP later.

More importantly, for tools like gcore(1), which attach and then
detach without processing events, SIGSTOP might leak to be delivered
after PT_DETACH.  This results in the process being unintentionally
stopped after detach, which is fatal for automatic tools.

The solution is to force SIGSTOP to be the first signal reported after
the attach.  Attach code is modified to set P2_PTRACE_FSTP to indicate
that the attaching ritual was not yet finished, and issignal() prefers
SIGSTOP in that condition.  Also, the thread which handles
P2_PTRACE_FSTP is made to guarantee to own p_xthread during the first
waitpid(2).  All that ensures that SIGSTOP is consumed first.

Additionally, if P2_PTRACE_FSTP is still set on detach, which means
that waitpid(2) was not called at all, SIGSTOP is removed from the
queue, ensuring that the process is resumed on detach.

In issignal(), when acting on STOPing signals, remove the signal from
queue before suspending.  Otherwise parallel attach could result in
ptracestop() acting on that STOP as if it was the STOP signal from the
attach.  Then SIGSTOP from attach leaks again.

As a minor refactoring, some bits of the common attach code is moved
to new helper proc_set_traced().

Reported by:	markj
Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7256
2016-07-28 08:41:13 +00:00
Sepherosa Ziehau
7d8ee480c4 hyperv/vmbus: Inclusion cleanup
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7334
2016-07-28 06:46:10 +00:00
Sepherosa Ziehau
0bc135dffc hyperv/vmbus: Avoid unnecessary mb()
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7333
2016-07-28 06:30:29 +00:00
Luiz Otavio O Souza
92ee0c01d4 Enable the build of micphy as part of generic miibus build, but only for
FDT enabled systems.

Sponsored by:	Rubicon Communications (Netgate)
2016-07-28 05:59:56 +00:00
Imre Vadász
34938c3085 [iwm] When stopping TX DMA, wait for all channels at once.
* Makes the TX DMA stopping more similar to Linux code, and potentially
      a bit faster. Also, output an error message when TX DMA idling fails.

    Taken-From: Linux iwlwifi

Tested:

* AC3165, STA mode

Approved by:	adrian (mentor)
Obtained from:	DragonFlyBSD git 2ee486ddff973ac552ff787c17e8d83e8ae0f24c
Differential Revision:	https://reviews.freebsd.org/D7325
2016-07-27 20:51:31 +00:00
Bryan Drewery
a3f1ec8d91 opt_bdg.h was removed in r150636.
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-07-27 20:48:15 +00:00
Imre Vadász
2b8cf46a68 [iwm] Set different pm_timeout for action frames.
When building a Tx Command for management frames, we are lacking
    a check for action frames, for which we should set a different
    pm_timeout.  This cause the fw to stay awake for 100TU after each
    such frame is transmitted, resulting an excessive power consumption.

    Taken-From: Linux iwlwifi (git b084a35663c3f1f7)

Approved by:	adrian (mentor)
Obtained from:	Linux git b084a35663c3f1f7de1c45c4ae3006864c940fe7
Obtained from:	DragonFlyBSD git ba00f0e3ae873d6f0d5743e22c3ebc49c44dfdac
Differential Revision:	https://reviews.freebsd.org/D7324
2016-07-27 20:46:51 +00:00
Bryan Drewery
ce85964181 opt_apic.h is only used on i386.
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-07-27 20:45:00 +00:00
Bryan Drewery
1a27b3ad56 opt_random.h was removed in r287558 for opt_global.h
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-07-27 20:44:53 +00:00
Imre Vadász
74adf80539 [iwm] Fix inverted logic in iwm_tx().
The PROT_REQUIRE flag in should be set for data frames above a certain
    length, but we were setting it for !data frames above a certain length,
    which makes no sense at all.

    Taken-From: OpenBSD, Linux iwlwifi

Approved by:	adrian (mentor)
Obtained from:	DragonFlyBSD git 8cc03924a36c572c2908e659e624f44636dc2b33
Differential Revision:	https://reviews.freebsd.org/D7323
2016-07-27 20:43:08 +00:00
Stephen J. Kiernan
4ac21b4f09 Prepare for network stack as a module
- Move cr_canseeinpcb to sys/netinet/in_prot.c in order to separate the
   INET and INET6-specific code from the rest of the prot code (It is only
   used by the network stack, so it makes sense for it to live with the
   other network stack code.)
 - Move cr_canseeinpcb prototype from sys/systm.h to netinet/in_systm.h
 - Rename cr_seeotheruids to cr_canseeotheruids and cr_seeothergids to
   cr_canseeothergids, make them non-static, and add prototypes (so they
   can be seen/called by in_prot.c functions.)
 - Remove sw_csum variable from ip6_forward in ip6_forward.c, as it is an
   unused variable.

Reviewed by:	gnn, jtl
Approved by:	sjg (mentor)
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D2901
2016-07-27 20:34:09 +00:00
John Baldwin
b9a53e161b Adjust tests in fsync job scheduling loop to reduce indentation. 2016-07-27 19:31:25 +00:00
John Baldwin
07159830be Add support for zero-copy aio_write() on TOE sockets.
AIO write requests for a TOE socket on a Chelsio T4+ adapter can now
DMA directly from the user-supplied buffer.  This is implemented by
wiring the pages backing the user-supplied buffer and queueing special
mbufs backed by raw VM pages to the socket buffer.  The TOE code
recognizes these special mbufs and builds a sglist from the VM page
array associated with the mbuf when queueing a work request to the TOE.

Because these mbufs do not have an associated virtual address, m_data
is not valid.  Thus, the AIO handler does not invoke sosend() directly
for these mbufs but instead inlines portions of sosend_generic() and
tcp_usr_send().

An aiotx_buffer structure is used to describe the user buffer (e.g.
it holds the array of VM pages and a reference to the AIO job).  The
special mbufs reference this structure via m_ext.  Note that a single
job might be split across multiple mbufs (e.g. if it is larger than
the socket buffer size).  The 'ext_arg2' member of each mbuf gives an
offset relative to the backing aiotx_buffer.  The AIO job associated
with an aiotx_buffer structure is completed when the last reference to
the structure is released.

Zero-copy aio_write()'s for connections associated with a given
adapter can be enabled/disabled at runtime via the
'dev.t[45]nex.N.toe.tx_zcopy' sysctl.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Chelsio Communications
2016-07-27 18:29:35 +00:00
Mark Johnston
3ac8f842ea De-pluralize "queues" where appropriate in the pagedaemon code.
MFC after:	1 week
2016-07-27 17:11:03 +00:00
Ed Maste
18c23dffac ANSIfy kern_proc.c and delete register keyword
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D6478
2016-07-27 14:27:08 +00:00
Konstantin Belousov
1822421c0b Remove Giant from settime(), tc_setclock_mtx guards tc_windup() calls,
and there is no other issues with parallel settime().  Remove spl()
vestiges there as well.

Tested by:	pho (as part of the whole patch)
Reviewed by:	jhb (same)
Discussed wit:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:54:24 +00:00
Konstantin Belousov
5760b029ee Prevent parallel tc_windup() calls, both parallel top-level calls from
setclock() and from simultaneous top-level and interrupt.  For this,
tc_windup() is protected with a tc_setclock_mtx spinlock, in the try
mode when called from hardclock interrupt.  If spinlock cannot be
obtained without spinning from the interrupt context, this means that
top-level executes tc_windup() on other core and our try may be
avoided.

The boottimebin and boottime variables should be adjusted from
tc_windup().  To be correct, they must be part of the timehands and
read using lockless protocol.  Remove the globals and reimplement the
getboottime(9)/getboottimebin(9) KPI using the timehands read
protocol.

Tested by:	pho (as part of the whole patch)
Reviewed by:	jhb (same)
Discussed wit:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:49:41 +00:00
Konstantin Belousov
4493f659e5 Fix a bug in r302252.
Change ntpadj_lock to spinlock always, and rename stuff removing
ADJ/adj from the names. ntp_update_second() requires ntp_lock and is
called from the tc_windup(), so ntp_lock must be a spinlock.  Add
missed lock to ntp_update_second().

Tested by:	pho (as part of the whole patch)
Reviewed by:	jhb (same)
Noted by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:40:06 +00:00
Konstantin Belousov
21547fc7ca Reduce the resettodr_lock scope to only CLOCK_SETTIME() call.
Tested by:	pho (as part of the whole patch)
Reviewed by:	jhb (same)
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:34:25 +00:00
Konstantin Belousov
4d29106e55 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:33:33 +00:00