Commit Graph

108120 Commits

Author SHA1 Message Date
Alexander Motin
27a8d05bd7 MFV r294793:
6367 spa_config_tryenter incorrectly handles the multiple-lock case

Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Prashanth Sreenivasa <prashksp@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Steven Hartland <steven.hartland@multiplay.co.uk>
Approved by: Matthew Ahrens <mahrens@delphix.com>

illumos/illumos-gate@e495b6e673
2016-01-26 12:28:53 +00:00
Edward Tomasz Napierala
ed81020097 Fix the way RCTL handles rules' rrl_exceeded on credentials change.
Because of what this variable does, it was probably harmless - but
still incorrect.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-01-26 11:28:55 +00:00
Svatopluk Kraus
267e03a45d Don't do icache sync on kernel memory and keep in line with comment
in elf_cpu_load_file(). The only time when the sync is needed is after
kernel module is loaded and the relocation info is processed. And it's
done in elf_cpu_load_file().
2016-01-26 10:24:18 +00:00
Svatopluk Kraus
24152caa00 Make code more compact and more readable in pmap_extract()-like
functions. No functional change.

This is a follow up to r294722.

Suggested by:	kib
2016-01-26 09:50:23 +00:00
Sepherosa Ziehau
719d2f1ad5 hyperv/hn: Improve sending performance
- Avoid main lock contention by trylock for if_start, if that fails,
  schedule TX taskqueue for if_start
- Don't do direct sending if the packet to be sent is large, e.g.
  TSO packet.

This change gives me stable 9.1Gbps TCP sending performance w/ TSO
over a 10Gbe directly connected network (the performance fluctuated
between 4Gbps and 9Gbps before this commit). It also improves non-
TSO TCP sending performance a lot.

Reviewed by:		adrian, royger
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5074
2016-01-26 09:42:13 +00:00
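
The trylock-or-defer pattern described in 719d2f1ad5 can be sketched in userland C as follows. This is only an illustration, not the driver's code: struct txq, DIRECT_SEND_MAX and the counters are hypothetical names, and a C11 atomic_flag stands in for the kernel mutex and taskqueue(9) the real driver would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold: larger packets are never sent directly. */
#define DIRECT_SEND_MAX 1500

struct txq {
	atomic_flag lock;         /* stands in for the driver's main lock */
	atomic_uint direct_sends; /* packets sent from the caller's context */
	atomic_uint deferred;     /* times work was handed to the TX task */
};

void
txq_init(struct txq *q)
{
	assert(q != NULL);
	atomic_flag_clear(&q->lock);
	atomic_init(&q->direct_sends, 0);
	atomic_init(&q->deferred, 0);
}

/* Returns true if the packet was sent directly, false if it was deferred. */
bool
txq_start(struct txq *q, size_t pktlen)
{
	/* Large packets (e.g. TSO) always take the deferred path. */
	if (pktlen > DIRECT_SEND_MAX) {
		atomic_fetch_add(&q->deferred, 1);
		return (false);
	}
	/* Trylock: test_and_set returns the old value, so true means busy. */
	if (atomic_flag_test_and_set(&q->lock)) {
		atomic_fetch_add(&q->deferred, 1);
		return (false);
	}
	atomic_fetch_add(&q->direct_sends, 1);   /* "send" under the lock */
	atomic_flag_clear(&q->lock);
	return (true);
}
```

The caller never blocks on the lock: contention and oversized packets both turn into deferred work, which is the property the commit message credits for the steadier throughput.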
Konstantin Belousov
88d74d64d7 Restore flushing of output for revoke(2) again. Document revoke()'s
intended behaviour in its man page.  Simplify tty_drain() to match.
Don't call ttydevsw methods in tty_flush() if the device is gone
since we now sometimes call it then.

The flushing was supposed to be implemented by passing the FNONBLOCK
flag to VOP_CLOSE() for revoke().  The tty driver is one of the few
that can block in close and was one of the fewer that knew about this.

This almost worked in FreeBSD-1 and similarly in Net/2.  These
versions only almost worked because there was and is considerable
confusion between IO_NDELAY and FNONBLOCK (aka O_NONBLOCK).  IO_NDELAY
is only valid for VOP_READ() and VOP_WRITE().  For other VOPs it has
the same value as O_SHLOCK.  But since vfs_subr.c and tty.c
consistently used the wrong flag and the O_SHLOCK flag is rarely set,
this mostly worked.  It also gave the feature that applications could
get the non-blocking close by abusing O_SHLOCK.

This was first broken then fixed in 1995.  I changed only the tty
driver to use FNONBLOCK, as a hack to get non-blocking via the normal
flag FNONBLOCK for last closes.  I didn't know about revoke()'s use
of IO_NDELAY or change it to be consistent, so revoke() was broken.
Then I changed revoke() to match.

This was next broken in 1997 then fixed in 1998.  Importing Lite2 made
the flags inconsistent again by undoing the fix only in vfs_subr.c.

This was next broken in 2008 by replacing everything in tty.c and not
checking any flags in last close.  Other bugs in draining limited the
resulting unbounded waits to drain in some cases.

It is now possible to fix this better using the new FREVOKE flag.
Just restore flushing for revoke() for now.  Don't restore or undo any
hacks for ordinary last closes yet.  But remove dead code in the
1-second relative timeout (r272789).  This did extra work to extend
the buggy draining for revoke() for as long as possible.  The 1-second
timeout made this not very long by usually flushing after 1 second.

Submitted by:	bde
MFC after:	2 weeks
2016-01-26 07:57:44 +00:00
Warner Losh
b69d02ee50 Allow new lines as white space for arguments that are parsed to allow
boot1 to pass in files with newlines in them. Now that the EFI loader
groks foo=bar on the command line, this can allow a more general setup
than traditional boot loader args will allow.

Differential Revision: https://reviews.freebsd.org/D5038
2016-01-26 06:26:56 +00:00
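
A minimal sketch of the idea in b69d02ee50, treating newlines as argument separators alongside spaces and tabs. The function name split_args and its interface are assumptions made for illustration; this is not the loader's actual parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split `line` in place into at most `maxargs` arguments, treating space,
 * tab and newline as separators. Returns the number of arguments found.
 */
int
split_args(char *line, char **argv, int maxargs)
{
	static const char seps[] = " \t\n";
	char *p = line;
	int argc = 0;

	assert(line != NULL && argv != NULL);
	while (argc < maxargs) {
		p += strspn(p, seps);    /* skip leading separators */
		if (*p == '\0')
			break;
		argv[argc++] = p;
		p += strcspn(p, seps);   /* advance to the end of the token */
		if (*p == '\0')
			break;
		*p++ = '\0';             /* terminate the token in place */
	}
	return (argc);
}
```

With this, the contents of a file such as "-v\nfoo=bar\n" split into the same arguments as "-v foo=bar" typed on one line.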
Warner Losh
01e288734b Read in /boot/config and /boot.config, like all the other boot
loaders and pass it along to /boot/loader.efi.

Differential Revision: https://reviews.freebsd.org/D5038
2016-01-26 06:26:55 +00:00
Warner Losh
33fcb4aff8 Parse the command line arguments, and do it before we initialize the
console so it can be changed by the command line arguments.

Differential Revision: https://reviews.freebsd.org/D5038
2016-01-26 06:26:46 +00:00
Warner Losh
5063232c10 RBX_ defines are in rbx.h, move it there.
Differential Revision: https://reviews.freebsd.org/D5038
2016-01-26 06:26:44 +00:00
Warner Losh
5bbc34e185 Move all the separate copies of the same strings into paths.h. There's
nothing machine specific about these.

Differential Revision: https://reviews.freebsd.org/D5038
2016-01-26 06:26:19 +00:00
Luigi Rizzo
f51b072d4c Revert one chunk from commit 285362, which introduced an off-by-one error
in computing a shift index. The error was due to the use of mixed
fls() / __fls() functions in another implementation of qfq.
To keep the problem from recurring, properly document which
incarnation of the function we need.
Note that the bug only affects QFQ in FreeBSD head from last July, as
the patch was not merged to other versions.
2016-01-26 04:48:24 +00:00
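
The off-by-one hazard named in f51b072d4c comes from two indexing conventions for "find last set bit": the fls(3) family counts bits from 1 and returns 0 when no bit is set, while Linux's __fls() counts from 0. The portable sketch below shows both conventions side by side; the function names are made up for illustration and this is not the QFQ code.

```c
#include <assert.h>

/* 1-based index of the most significant set bit; 0 if no bit is set. */
int
fls_one_based(unsigned long x)
{
	int bit = 0;

	while (x != 0) {
		bit++;
		x >>= 1;
	}
	return (bit);
}

/* 0-based index of the most significant set bit; x must be nonzero. */
int
fls_zero_based(unsigned long x)
{
	assert(x != 0);
	return (fls_one_based(x) - 1);
}
```

For the same input the two results always differ by exactly one, so a shift index computed with one convention and consumed under the other is off by one.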
Justin Hibbits
49d36ba409 Older Book-E processors (e500v1/e500v2) don't support dcbzl.
The only difference between dcbzl and dcbz is dcbzl operates on native cache
line lengths regardless of L1CSR0[DCBZ32].  Since we don't change the cache line
size, the cacheline_size variable will reflect the used cache line length, and
dcbz will work as expected.
2016-01-26 04:41:18 +00:00
Justin Hibbits
30857edaea Fix a debug printf().
Somehow this printf() was missed in the conversion of vm_paddr_t to 64-bit, and
made it through until now.

Sponsored by:	Alex Perez/Inertial Computing
2016-01-26 03:52:14 +00:00
Mark Johnston
00b0fb8757 Remove a duplicate setting of the AH_DEBUG_ALQ option.
2016-01-26 01:16:45 +00:00
Mark Johnston
87b87bb853 Evaluate the sysctl_running fail point before taking the sysctl lock.
The fail point handler may sleep, but this is not permitted while holding a
rm read lock.

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-01-26 01:15:18 +00:00
Andrew Turner
1e7b9e9e68 Allow us to be told about memory past the first 4GB point, but ignore it.
This allows, for example, UEFI to pass a memory map with some RAM in this
region, but for us to ignore it. This is the case when running under the
qemu virt machine type.

Sponsored by:	ABT Systems Ltd
2016-01-25 23:04:40 +00:00
Marius Strobl
57169cea64 - Make the code consistent with itself style-wise and bring it closer
to style(9).
- Mark unused arguments as such.
- Make the ttystates table const.
2016-01-25 22:58:06 +00:00
Marko Zec
ca7ba6a8fd Prune a definition which is / was never used.
2016-01-25 20:35:15 +00:00
Zbigniew Bodek
595f8a5905 Introduce support for HW watchpoints and single stepping for ARMv6/v7
Allows for using hardware watchpoints for 1, 2, 4, 8 byte long addresses.
The default configuration of watchpoint is RW but code allows to select
RO or WO and X.
Since debugging registers are per-CPU (CP14) the watchpoint is set on
the CPU that was lucky (or not) to enter DDB.

HW breakpoints are used to perform single step in KDB.
When HW breakpoint is enabled all watchpoints are temporarily disabled
to avoid recursive abort on both watchpoint and breakpoint.
In case of branch, the breakpoint is set to both - next instruction
and possible branch address. This requires at least 2 breakpoints
supported in the CPU however this is a must for ARMv6/v7 CPUs.

Reviewed by:   imp
Submitted by:  Zbigniew Bodek <zbb@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Juniper Networks Inc.
Differential Revision: https://reviews.freebsd.org/D4037
2016-01-25 18:02:28 +00:00
Konstantin Belousov
2e77021e4c Don't allow opening the callout device when the callin device is already
open (in disguise as the console device).  The only allowed combination
was supposed to be the callin device with the console.

Fix the assertion in ttydev_close() that was meant to detect this (it
only detected all 3 devices being open).  Assert this in ttydev_open()
too.

Submitted by:	bde
MFC after:	2 weeks
2016-01-25 16:47:20 +00:00
Steven Hartland
94241c6011 Fix ixgbe compilation with DBG 1
Fixed ERROR_REPORTXX macros so that ixgbe compiles with #define DBG 1

MFC after:	1 week
Sponsored by:	Multiplay
Differential Revision:	https://reviews.freebsd.org/D5061
2016-01-25 16:18:53 +00:00
Konstantin Belousov
3593a18a91 Fix the %b flags string for ddb. All bits above the 5th
(TF_OPENED_CONS) were broken in r188147 by adding TF_OPENED_CONS
without updating the string.  It was especially confusing to display
OPENED_CONS as GONE and BYPASS as ZOMBIE.  2 flags at the end were
not updated in r188487.

Don't print an extra 0x prefix for %p in a ddb command.  In the rest
of the kernel there are more than 6000 lines with %p and only about
40 with this bug.

Print a non-extra 0x prefix for %b in a ddb command.  In the rest
of the kernel, there are approx. 180 lines with %b and 2/3 of them
have this bug.

Submitted by:	bde
MFC after:	2 weeks
2016-01-25 15:37:01 +00:00
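
For readers unfamiliar with the kernel's %b conversion mentioned in 3593a18a91: it takes an integer plus a description string whose first byte selects the output base, followed by groups of one byte holding a 1-based bit number and then that bit's name. The userland sketch below decodes such a string into a buffer. It is an approximation written for illustration, not the kernel's implementation; the flag names in the usage note are hypothetical, and the description string is assumed to be well formed.

```c
#include <assert.h>
#include <stdio.h>

/* Append one character, always leaving room for the terminating NUL. */
void
put_char(char *buf, size_t len, size_t *off, char c)
{
	if (*off + 1 < len)
		buf[(*off)++] = c;
}

/* Format `val` as "<number><NAME,NAME>" according to `desc`. */
void
format_bits(char *buf, size_t len, unsigned val, const char *desc)
{
	size_t off;
	int base, n, any = 0;

	assert(buf != NULL && len > 0 && desc != NULL && *desc != '\0');
	base = *desc++;                 /* first byte selects the base */
	if (base == 8)
		n = snprintf(buf, len, "%o", val);
	else                            /* this sketch prints all else as hex */
		n = snprintf(buf, len, "%x", val);
	off = (n < 0) ? 0 : ((size_t)n < len ? (size_t)n : len - 1);

	while (val != 0 && *desc != '\0') {
		int bit = *desc++;      /* 1-based bit number */
		int set = (val & (1u << (bit - 1))) != 0;

		if (set)
			put_char(buf, len, &off, any ? ',' : '<');
		for (; *desc > ' '; desc++)   /* name runs to the next byte <= 32 */
			if (set)
				put_char(buf, len, &off, *desc);
		any |= set;
	}
	if (any)
		put_char(buf, len, &off, '>');
	buf[off] = '\0';
}
```

With the description "\20\1OPEN\2GONE\3ZOMBIE", the value 0x5 formats as 5<OPEN,ZOMBIE>. The number carries no 0x prefix of its own, so the caller's format string has to supply one. Names are tied to bit numbers only through the string, so inserting a new flag in the middle of the flag definitions without updating the string makes every higher bit print under the wrong name.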
Zbigniew Bodek
004ae5cbbd Simplify GICv3 related drivers' naming
Rename gic_v3_ instances to simply use 'gic' and 'its'.
The information about the controller's revision is printed
in the device announcement during boot anyway.
The intention behind this change is to avoid somewhat misleading
GIC instances naming such as:
    gic_v30
    gic_v31
    ...
etc.

Submitted by:  Zbigniew Bodek <zbb@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Cavium
Differential Revision: https://reviews.freebsd.org/D5016
2016-01-25 15:18:32 +00:00
Zbigniew Bodek
7ea5004ba7 Create proper FDT attachment for GICv2m
Avoid probing GICv2m to any parent bus/driver. Instead, match
GICv2m driver with FDT compatible strings as not every GIC
has a MSI controller in the form of GICv2m extension.

Submitted by:  Zbigniew Bodek <zbb@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Cavium
Differential Revision: https://reviews.freebsd.org/D5015
2016-01-25 15:10:43 +00:00
Zbigniew Bodek
073fae869b Do not destroy input buffer of the OF_getencprop() function on error
Currently when the OF_getprop() function returns with error,
the caller (OF_getencprop()) still changes the buffer endianness.
This may destroy the default value passed in the input buffer if
used on a Little Endian platform.

Reviewed by:   mmel
Submitted by:  Zbigniew Bodek <zbb@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Cavium
2016-01-25 14:42:44 +00:00
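
The fix described in 073fae869b amounts to converting the buffer only after the lookup has succeeded. A minimal sketch of that ordering follows; fake_getprop() is a hypothetical stand-in for the raw property lookup, and none of these functions are the real Open Firmware interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical raw lookup: copies big-endian cells, returns length or -1. */
int
fake_getprop(const uint8_t *prop, int proplen, void *buf, size_t len)
{
	if (prop == NULL || proplen < 0 || (size_t)proplen > len)
		return (-1);
	memcpy(buf, prop, (size_t)proplen);
	return (proplen);
}

/* Decode one big-endian 32-bit cell regardless of host endianness. */
uint32_t
be32_decode(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	    (uint32_t)p[2] << 8 | (uint32_t)p[3]);
}

int
getencprop_sketch(const uint8_t *prop, int proplen, uint32_t *buf, size_t len)
{
	size_t i;
	int ret;

	assert(buf != NULL);
	ret = fake_getprop(prop, proplen, buf, len);
	if (ret < 0)
		return (ret);   /* leave the caller's default value untouched */
	for (i = 0; i < (size_t)ret / sizeof(uint32_t); i++)
		buf[i] = be32_decode((const uint8_t *)&buf[i]);
	return (ret);
}
```

A caller can therefore preload buf with a default and rely on it surviving a failed lookup, on both little-endian and big-endian hosts.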
Svatopluk Kraus
d36f48ddc2 Fix an occasional undefined instruction abort during module loading.
Even if data cache maintenance was done by IO code, the relocation
fixup process creates dirty cache entries that we must write back
before doing icache sync.

Reported by:	Thiagarajan Venkatasubramanian <tvenkata at juniper.net>
Reviewed by:	ian
2016-01-25 14:09:35 +00:00
Svatopluk Kraus
a9dc686c9a Do not use blk_write_cont() and remove it. There is no need to call
blk_flush() between two writes by physical address when these are
PAGE_SIZE aligned.

Fix some style nits.
2016-01-25 12:55:24 +00:00
Svatopluk Kraus
768f645256 Make minidump more like its i386 original again, since with the new
pmap dump interface all used physical addresses are PAGE_SIZE aligned.
Add missing copyright.

This is a follow up to r294722.
2016-01-25 12:49:08 +00:00
Svatopluk Kraus
971962e4d9 Create new pmap dump interface for minidump and use it for existing
pmap implementations on ARM. This way minidump code can be used without
any platform specific modification.

Also, this is the last piece missing for ARM_NEW_PMAP.

Differential Revision:	https://reviews.freebsd.org/D5023
2016-01-25 12:43:07 +00:00
Alexander V. Chernikov
0d6a516eb8 Convert TCP mtu checks to the new routing KPI.
2016-01-25 10:06:49 +00:00
Alexander V. Chernikov
94017572ab Fix flowtable part missed in r294706.
2016-01-25 09:31:32 +00:00
Andrew Turner
c9a608ac3f Add allwinner_machdep.h, it was missed in r294698.
2016-01-25 08:19:16 +00:00
Alexander V. Chernikov
61eee0e202 MFP r287070,r287073: split radix implementation and route table structure.
There are a number of radix consumers in kernel land (pf,ipfw,nfs,route)
  with different requirements. In fact, the first 3 don't have _any_
  requirements and the first 2 do not use radix locking. On the other hand,
  the routing structure does have these requirements (rnh_gen, multipath,
  custom to-be-added control plane functions, different locking).
Additionally, radix should not know anything about its consumers' internals.

So, radix code now uses tiny 'struct radix_head' structure along with
  internal 'struct radix_mask_head' instead of 'struct radix_node_head'.
  Existing consumers still use the same 'struct radix_node_head' with
  slight modifications: they need to pass pointer to (embedded)
  'struct radix_head' to all radix callbacks.

Routing code now uses new 'struct rib_head' with different locking macro:
  RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing
  information base).

New net/route_var.h header was added to hold routing subsystem internal
  data. 'struct rib_head' was placed there. 'struct rtentry' will also
  be moved there soon.
2016-01-25 06:33:15 +00:00
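
The embedding described in 61eee0e202 can be pictured with the simplified sketch below. The struct layouts, field names and the lookup function are hypothetical (hence the _sk suffix) and far smaller than the real kernel definitions; the only point is that the generic code sees the embedded head and nothing of the consumer's wrapper.

```c
#include <assert.h>
#include <stddef.h>

/* Generic part: all that the radix code itself needs to know about. */
struct radix_head_sk {
	int rh_lookups;          /* hypothetical bookkeeping field */
};

/* Consumer-facing wrapper with the generic head embedded in it. */
struct radix_node_head_sk {
	struct radix_head_sk rh; /* embedded generic head */
	int consumer_private;    /* state the radix code never touches */
};

/* A radix callback receives only the generic head. */
int
radix_lookup_sk(struct radix_head_sk *rh, int key)
{
	assert(rh != NULL);
	rh->rh_lookups++;
	return (key);            /* placeholder for a real tree lookup */
}
```

A consumer holding a struct radix_node_head_sk named rnh would call radix_lookup_sk(&rnh.rh, key), which is the "pass pointer to (embedded) struct radix_head" adjustment the message refers to.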
Sepherosa Ziehau
51f6f18c88 hyperv/vmbus: Avoid extra copy of page information.
The page information array could contain up to 32 elements (i.e. 512B).
And on network side w/ TSO, 11+ (176B+) elements, i.e. ~44K TSO packet,
in the page information array is quite common.

This saves us some cpu cycles.

Reviewed by:		adrian, delphij
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4992
2016-01-25 05:33:18 +00:00
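
The sizes quoted in 51f6f18c88 can be checked with a little arithmetic: 512 B for 32 elements implies 16 bytes per element, so 11 elements take 176 B. The ~44K figure follows if each element describes one 4 KiB page, which is an assumption here rather than something the message states. The constant and function names below are made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_INFO_MAX       32          /* maximum elements, per the message */
#define PAGE_INFO_ELEM_SIZE (512 / 32)  /* 16 bytes per element */
#define ASSUMED_PAGE_SIZE   4096        /* assumption: 4 KiB pages */

/* Bytes copied when the page information array is duplicated. */
size_t
page_info_bytes(size_t nelems)
{
	assert(nelems <= PAGE_INFO_MAX);
	return (nelems * PAGE_INFO_ELEM_SIZE);
}

/* Payload described by nelems elements, one assumed page per element. */
size_t
page_info_payload(size_t nelems)
{
	assert(nelems <= PAGE_INFO_MAX);
	return (nelems * ASSUMED_PAGE_SIZE);
}
```

Under these assumptions, 11 elements cost 176 B to copy and describe 45056 B (44 KiB) of payload, matching the numbers in the message.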
Alexander V. Chernikov
809da2a3e0 Remove unused radix_mpath definitions.
2016-01-25 05:28:19 +00:00
Sepherosa Ziehau
dc1418432b hyperv/hn: Trust host TCP segment checksum verification by default.
According to all available information, VMSWITCH always does the
TCP segment checksum verification before sending the segment to
guest.

Reviewed by:		adrian, delphij, Hongjiang Zhang <honzhan microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4991
2016-01-25 05:25:39 +00:00
Sepherosa Ziehau
7ea161b0ec hyperv/hn: Remove unnecessary zeroing out the netvsc_packet
All used fields are setup one by one, so there is no need to zero
out this large struct.

While I'm here, move the stack variable near its usage.

Reviewed by:		adrian, delphij, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4978
2016-01-25 05:18:57 +00:00
Sepherosa Ziehau
4d9e79a3be hyperv/hn: Use m_copydata for chimney sending.
While I'm here, move stack variables near their usage.

Reviewed by:		adrian, delphij, Jun Su <junsu microsoft com>
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4977
2016-01-25 05:12:00 +00:00
Sepherosa Ziehau
391ad73b70 hyperv/hn: Partly rework transmission path
- Avoid unnecessary malloc/free on transmission path.
- busdma(9)-fy transmission path.
- Properly handle IFF_DRV_OACTIVE.  This should fix the network
  stalls reported by many.
- Properly setup TSO parameters.
- Properly handle bpf(4) tapping.  This gives 5 times the performance
  during TCP sending test, when there is one bpf(4) attached.
- Allow size of chimney sending to be tuned on a running system.
  Default value still needs more testing to determine.

Reviewed by:		adrian, delphij
Approved by:		adrian (mentor)
Sponsored by:		Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D4972
2016-01-25 05:01:32 +00:00
Andrew Turner
d8b624dcab Update the Allwinner kernels:
* Use the ARM PLATFORM framework
* Use ARM_INTRNG on the A20 as it has a GICv2
* Add a method to find which Allwinner SoC we are running on

Differential Revision:	https://reviews.freebsd.org/D5059
2016-01-25 00:24:57 +00:00
Andriy Voskoboinyk
74bdb73190 net80211: reduce stack usage for ieee80211_ioctl*() methods.
Use malloc(9) for
 - struct ieee80211req_wpaie2 (518 bytes, used in
ieee80211_ioctl_getwpaie())
 - struct ieee80211_scan_req (128 bytes, used in setmlme_assoc_adhoc()
and ieee80211_ioctl_scanreq())

Also, drop __noinline workarounds; stack overflow is not reproducible
with recent compilers.

Tested with Clang 3.7.1, GCC 4.2.1 (from 9.3-RELEASE) and 4.9.4
(with -fstack-usage flag)

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D5041
2016-01-24 23:35:20 +00:00
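
The stack-usage change in 74bdb73190 follows a common pattern: allocate a large request structure on the heap for the duration of the call instead of declaring it as a local variable. The userland sketch below uses calloc(3)/free(3) where kernel code would use malloc(9)/free(9); struct big_req, its layout and handle_req() are hypothetical, and only the 518-byte size is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request; 518 bytes is the size quoted in the message. */
struct big_req {
	unsigned char data[518];
};

/* Returns the first byte of the copied request, or -1 on failure. */
int
handle_req(const unsigned char *src, size_t len)
{
	struct big_req *req;
	int first;

	assert(src != NULL);
	if (len > sizeof(req->data))
		return (-1);
	req = calloc(1, sizeof(*req));   /* heap allocation, not a local */
	if (req == NULL)
		return (-1);
	memcpy(req->data, src, len);
	first = req->data[0];
	free(req);                       /* released on every exit path */
	return (first);
}
```

The function's own stack frame now holds a pointer rather than 518 bytes, at the cost of an allocation that can fail and must be freed on every path.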
Andriy Voskoboinyk
536e056030 net80211: reduce code duplication
Do not duplicate code between IEEE80211_IOC_WPAIE and IEEE80211_IOC_WPAIE2
switch cases.

Approved by:	adrian (mentor)
Differential Revision:	D5041 (part)
2016-01-24 23:28:14 +00:00
Pedro F. Giffuni
afbad87898 ext2fs: passthrough any extra timestamps to the dinode struct.
In general we don't trust any of the extended timestamps unless the
EXT2F_ROCOMPAT_EXTRA_ISIZE feature is set. However, in the case where
we freshly allocated a new inode the information is valid and it is
better to pass it along instead of leaving the value undefined.

This should have no practical effect but should reduce the amount of
garbage if EXT2F_ROCOMPAT_EXTRA_ISIZE is set, like in cases where the
filesystem is converted from ext3 to ext4.

MFC after:	4 days
2016-01-24 23:24:47 +00:00
Andrew Turner
b4733230c9 Remove an extra newline that crept in.
2016-01-24 19:12:16 +00:00
Andrew Turner
aea7d91520 Add support for controlling the clocks for the audio codec and DMA engines.
Submitted by:	Jared McNeill <jmcneill@invisible.ca>
Differential Revision:	https://reviews.freebsd.org/D5052
2016-01-24 19:10:30 +00:00
Andrew Turner
a7ce3cb185 Fix the style of the reading of a node's xref to make it readable.
2016-01-24 17:09:11 +00:00
Konstantin Belousov
ba7c64d17b Typo in comment.
2016-01-24 13:38:41 +00:00
Michal Meloun
0a17d8c230 Add reset framework, a second part of new 'extended resources' family of
support frameworks (i.e. regulators/phy/tsensors/fuses...).

It provides simple unified consumers interface for manipulations with
on-chip resets.

Reviewed by: ian, imp (partially)
2016-01-24 11:03:35 +00:00
Michal Meloun
12a05f9a86 Add clock framework, a first part of new 'extended resources' family of
support frameworks (i.e. reset/regulators/phy/tsensors/fuses...).

The clock framework significantly simplifies handling of complex clock
structures found in modern SoCs. It provides the unified consumers
interface, holds and manages actual clock topology, frequency and gating.

It's tested on three different ARM boards (Nvidia Tegra TK1, Inforce 6410 and
Odroid XU2) and on one MIPS board (Creator Ci20) by kan@.

The framework is still far from perfect and probably doesn't have stable
interface yet, but we want to start testing it on more real boards and
different architectures.

Reviewed by: ian, kan (earlier version)
2016-01-24 11:00:38 +00:00