
101471 Commits

Author SHA1 Message Date
Hans Petter Selasky
cfb1395111 Add more USB quirks for Western Digital external USB HDD
enclosures. Rename an incorrectly named device. Increase
limit for maximum number of quirks.

PR:	    178771, 180617
MFC after:  2 weeks
2014-12-08 10:41:34 +00:00
Craig Rodrigues
7b345d3999 Use CURVNET macros inside inet_get_local_port_range() function.
Without this fix, a kernel with VIMAGE + Infiniband will panic on bootup.

Certain necessary #include statements require LIST_HEAD.
Add these includes to ofed/include/linux/list.h, because
LIST_HEAD is specifically overridden in this file.

PR: 191468
Differential Revision: D1279
Reviewed by: hselasky
2014-12-08 07:25:59 +00:00
Xin LI
c2161091ad MFV r275540:
When importing a pool, don't assume that the passed pool configuration
at vdev_load is always valid.  It's possible that a stale configuration
comes with extra vdevs, where metaslab_init() would fail because
a lower layer returns an error.

Change the code to make metaslab_init() handle and return errors from
lower layer and pass it back to upper layer and handle it there.

Illumos issue:
    5213 panic in metaslab_init due to space_map_open returning ENXIO

MFC after:	2 weeks
2014-12-08 06:04:42 +00:00
Mark Johnston
d6ad6a865a Add refcounting to IPv6 DAD objects and simplify the DAD code to fix a
number of races which could cause double frees or use-after-frees when
performing DAD on an address. In particular, an IPv6 address can now only be
marked as a duplicate from the DAD callout.

Differential Revision:	https://reviews.freebsd.org/D1258
Reviewed by:		ae, hrs
Reported by:		rstone
MFC after:		1 month
2014-12-08 04:44:40 +00:00
Zbigniew Bodek
8948956770 Fix buffer overflow in Marvell PCI/PCIe driver
Buffer overflow occurred when more than one MSI was allocated.

Submitted by:    Wojciech Macek <wma@semihalf.com>
Obtained from:   Semihalf
2014-12-07 21:02:45 +00:00
Andriy Gapon
036a8c5dac remove opensolaris cyclic code, replace with high-precision callouts
In the old days callout(9) had 1 tick precision and that was inadequate
for some uses, e.g. DTrace profile module, so we had to emulate cyclic
API and behavior.  Now we can directly use callout(9) in the very few
places where cyclic was used.

Differential Revision:	https://reviews.freebsd.org/D1161
Reviewed by:	gnn, jhb, markj
MFC after:	2 weeks
2014-12-07 11:21:41 +00:00
Andrey V. Elsukov
4268212124 key_getspacq() returns holding the spacq_lock. Unlock it in all cases.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-12-07 06:47:00 +00:00
Michael Tuexen
39cbb549cc Include the received chunk padding when reporting an unknown chunk.
MFC after: 1 week
2014-12-06 22:57:19 +00:00
Alexander Motin
bfbfc4a3cb Count consecutive read requests as blocking in CTL for files and ZVOLs.
Technically read requests can be executed in any order or simultaneously
since they are not changing any data.  But ZFS prefetcher goes crazy when
it receives consecutive requests from different threads.  Since prefetcher
works on level of separate blocks, instead of two consecutive 128K requests
it may receive 32 8K requests in mixed order.

This patch is more a workaround than a real fix, and it does not fix all of
prefetcher problems, but it improves sequential read speed by 3-4x
in some configurations.  On the other side it may hurt performance if
some backing store has no prefetch, that is why it is disabled by default
for raw devices.

MFC after:	2 weeks
2014-12-06 20:39:25 +00:00
Michael Tuexen
d59107f700 Fix the support of mapped IPv4 addresses.
Thanks to Mark Bonnekessel and Markus Boese for making me aware of the
problems.
MFC after: 1 week
2014-12-06 20:00:08 +00:00
Andrew Turner
f9307cced7 Apply the same fix in r274697 to the ARM case.
2014-12-06 12:03:09 +00:00
Andrew Turner
17fea8f6b7 Use the unified syntax when generating assembly for clang. The clang 3.5
integrated assembler only accepts it.

MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-06 11:59:35 +00:00
Xin LI
54f76dcb4a MFV r275535:
Unexpand ISP2() and MSEC2NSEC().

Illumos issue:
    5255 uts shouldn't open-code ISP2

MFC after:	2 weeks
2014-12-06 09:38:28 +00:00
Xin LI
6054c38913 MFV r275534:
Sync with Illumos.  This has no effect on FreeBSD.

Illumos issue:
    5285 pass in cpu_pause_func via pause_cpus

MFC after:	2 weeks
2014-12-06 09:14:46 +00:00
Xin LI
d4548c2e8e MFC r275533:
Sync with Illumos.  This has no effect on FreeBSD.

Illumos issue:
    5100 sparc build failed after 5004

MFC after:	2 weeks
2014-12-06 09:11:13 +00:00
Craig Rodrigues
a8da5dd658 MFp4: @181627
Allow UMA allocated memory to be freed when VNET jails are torn down.

Differential Revision: D1201
Submitted by: bz
Reviewed by: rwatson, gnn
2014-12-06 02:59:59 +00:00
Navdeep Parhar
b741402c40 cxgbe(4): allow the driver to use rx buffers that do not end on a pack
boundary.

MFC after:	2 weeks
2014-12-06 01:47:38 +00:00
Navdeep Parhar
e3207e1973 cxgbe(4): Allow for different pad and pack boundaries for different
adapters.  Set the pack boundary for T5 cards to be the same as the
PCIe max payload size.  The chip likes it this way.

In this revision the driver allocates rx buffers that align on both
boundaries.  This is not a strict requirement and a followup commit
will switch the driver to a more relaxed allocation strategy.

MFC after:	2 weeks
2014-12-06 00:13:56 +00:00
Xin LI
4603a0aeb2 Use %d instead of %u for error number. This way we see ERESTART as -1
not 4294967295 when doing DTrace.

MFC after:	2 weeks
2014-12-05 22:56:10 +00:00
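The formatting difference described in 4603a0aeb2 can be reproduced outside the kernel. The following is a minimal userland sketch, not the DTrace code itself: the function names and EXAMPLE_ERESTART are hypothetical, and it assumes a 32-bit int.

```c
#include <stdio.h>

/* ERESTART is -1 inside the FreeBSD kernel; redefined here for illustration. */
#define EXAMPLE_ERESTART (-1)

/* Old behaviour: the error number is printed as an unsigned value. */
int format_err_unsigned(char *buf, size_t len, int err)
{
    return snprintf(buf, len, "%u", (unsigned int)err);
}

/* New behaviour: the error number is printed as a signed value. */
int format_err_signed(char *buf, size_t len, int err)
{
    return snprintf(buf, len, "%d", err);
}
```

With a 32-bit int, the unsigned variant renders -1 as 4294967295, while the signed variant renders it as -1.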
Andrew Turner
0395da4366 Switch to a .cpu directive. These will work when clang 3.5 is imported
where the .arch directive is a nop.

MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-05 19:23:51 +00:00
Andrew Turner
eff4f0ceee Switch to an armv6k cpu, without this clang 3.5 complains "bx lr" is
unsupported as it needs a newer cpu.

MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-05 19:19:17 +00:00
Andrew Turner
ef477cd70b Place the literal pool after a RET otherwise clang 3.5 tries to put it too
far away from a ldr pseudo instruction. With this clang will place the
literal value here where it's close enough to be loaded.

MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-05 19:14:05 +00:00
Andrew Turner
89636bddf6 Set the alignment to 4-bytes after a string as clang 3.5 can switch to
thumb mode if this is incorrect.

MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-05 19:11:25 +00:00
Andrew Turner
524bca9008 Use the unified syntax in a few more assembly files
MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-05 19:08:36 +00:00
Andrew Turner
ff9dd44ead Add missing END macros to some of the xscale functions.
MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-05 19:04:08 +00:00
Xin LI
26f96d922b Fix a regression introduced in r274337 (large block support)
In dsl_dataset_hold_obj() we used zap_contains(.., DS_FIELD_LARGE_BLOCKS)
to determine whether the extensible (zapified) dataset has large blocks.
The code expects the result to be either 0 (found) or ENOENT (not found),
however it reused the variable 'err' which later code expects to be 0.

Fix this by adopting similar code construct that is used later for
DS_FIELD_BOOKMARK_NAMES, which uses a temporary variable zaperr to catch
errors from zap_* routines.

Reported by:	Peter J. Creath (on FreeNAS; FreeNAS bug #6848)
Illumos issue:	5393 spurious failures from dsl_dataset_hold_obj()
Reviewed by:	mahrens
Sponsored by:	iXsystems, Inc.
X-MFC with:	r274337
2014-12-05 18:29:01 +00:00
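The class of bug described in 26f96d922b can be sketched in isolation. This is only an illustration under stated assumptions: example_lookup() is a hypothetical stand-in for zap_contains(), the two hold functions are invented for the example, and how the real fix treats errors other than ENOENT is assumed here, not taken from the commit.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lookup: 0 if found, ENOENT if not. */
int example_lookup(bool present)
{
    return present ? 0 : ENOENT;
}

/* Buggy shape: the lookup result overwrites 'err', so a dataset
 * without the flag makes the whole call fail with ENOENT. */
int example_hold_buggy(bool has_flag, bool *large_blocks)
{
    int err = 0;

    err = example_lookup(has_flag);
    *large_blocks = (err == 0);
    return err;
}

/* Fixed shape: a temporary catches the lookup result and 'err'
 * stays 0 unless something other than ENOENT went wrong. */
int example_hold_fixed(bool has_flag, bool *large_blocks)
{
    int err = 0;
    int zaperr = example_lookup(has_flag);

    if (zaperr != 0 && zaperr != ENOENT)
        err = zaperr;
    *large_blocks = (zaperr == 0);
    return err;
}
```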
John Baldwin
01ca58b23c Always ignore the deprecated MAP_RENAME and MAP_NORESERVE flags to mmap().
Some old libraries may be used even with newer binaries (specifically the
Nvidia driver libraries).

Differential Revision:	https://reviews.freebsd.org/D1262
Reviewed by:	kib
2014-12-05 15:24:42 +00:00
Konstantin Belousov
30d57414a0 When the last reference on the vnode's vm object is dropped, read the
vp->v_vflag without taking vnode lock and without bypass.  We do know
that vp is the lowest level in the stack, since the pointer is
obtained from the object's handle.  Stale VV_TEXT flag read can only
happen if parallel execve() is performed and has not yet activated the
image, since process takes reference for text mapping.  In this case,
the execve() code manages the VV_TEXT flag on its own already.

It was observed that otherwise read-only sendfile(2) requires
exclusive vnode lock and contending on it on some loads for VV_TEXT
handling.

Reported by:	glebius, scottl
Tested by:	glebius, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-12-05 15:02:30 +00:00
Alexander Motin
85700d4d7d In addition to r275481 allow threshold notifications to work without UNMAP.
While without UNMAP support there is not much initiator can do about it,
the administrator still better be notified about the storage overflow.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-12-05 13:30:45 +00:00
Hans Petter Selasky
654ea8e767 Optimise bit searching loop by using the ffs() function.
Make some related bit shifts unsigned while at it.
2014-12-05 12:07:53 +00:00
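A rough sketch of the kind of loop 654ea8e767 describes, assuming the loop visits every set bit of a mask. The function name and output array are hypothetical; ffs() is the standard routine from <strings.h>, which returns the 1-based index of the lowest set bit, or 0 if no bit is set.

```c
#include <strings.h>

/* Store the index of every set bit of 'mask' in 'out' (room for 32
 * entries) and return how many were found.  ffs() jumps straight to
 * the next set bit instead of testing each position in turn. */
int example_collect_set_bits(unsigned int mask, int out[32])
{
    int n = 0;

    while (mask != 0) {
        out[n++] = ffs((int)mask) - 1;  /* ffs() is 1-based */
        mask &= mask - 1;               /* clear the lowest set bit */
    }
    return n;
}
```

The loop runs once per set bit, not once per bit position, and the mask arithmetic stays unsigned throughout.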
Hans Petter Selasky
535701f699 Define the ffs() function in the USB bootloader's global and
independent header file.
2014-12-05 12:04:47 +00:00
Alexander Motin
1e68fe9c33 Avoid unneeded malloc/memcpy/free if there is no metadata on disk.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	2 weeks
2014-12-05 10:23:18 +00:00
Michael Tuexen
457b4b8836 This is the SCTP specific companion of
https://svnweb.freebsd.org/changeset/base/275358
which was provided by Hans Petter Selasky.
2014-12-04 21:17:50 +00:00
Alexander Motin
53c146de18 Add to CTL support for threshold notifications for file-backed LUNs.
Previously it was supported only for ZVOL-backed LUNs, but now should work
for file-backed LUNs too.  Used value in this case is a space occupied by
the backing file, while available value is an available space on file
system.  Pool thresholds are still not implemented in this case.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-12-04 18:37:42 +00:00
Alexander Motin
5a770b5496 Swap resource count scopes for used/available space.
Used count should be reported as per-LUN, while available should not.

MFC after:	1 week
2014-12-04 17:36:29 +00:00
Alexander Motin
26f0f92fa2 Decode some binary fields of Intel metadata.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	2 weeks
2014-12-04 15:54:45 +00:00
Alexander Motin
ef8daf3fed Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-12-04 11:34:19 +00:00
Andrey V. Elsukov
3d6aff5615 Fix style(9) and remove m_freem(NULL).
Add XXX comment, it looks incorrect, because m_pkthdr.len is already
incremented by M_PREPEND().

Sponsored by:	Yandex LLC
2014-12-04 05:02:12 +00:00
Sean Bruno
040bcc9ac0 Switch is an 8316, so make the comments say that.
Delete extraneous comment line that manifested itself from cut-n-pasting.

Sponsored by:	Nicholas Esborn <nick@desert.net>
2014-12-03 23:37:23 +00:00
Warner Losh
d0b6da086f Const poison in a few places to ensure we don't modify things
through the module data pointer.
2014-12-03 22:14:13 +00:00
Hans Petter Selasky
157675bd2d Optimise the bit searching loops, by quickly skipping the 16 first set
bits if all the 16 first bits are set. This way the worst case
searching time is reduced from 32 to 16 cycles.
2014-12-03 21:55:44 +00:00
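One way to read the optimisation in 157675bd2d, assuming the loop looks for the first clear bit in a 32-bit mask. This is a sketch with a hypothetical function name, not the committed code.

```c
#include <stdint.h>

/* Return the index of the first clear bit in 'mask', or -1 if all 32
 * bits are set.  If the low 16 bits are all set, the search starts at
 * bit 16, so no call needs more than 16 loop iterations. */
int example_first_clear_bit(uint32_t mask)
{
    int x = 0;

    if ((mask & 0xFFFFu) == 0xFFFFu)
        x = 16;
    for (; x < 32; x++) {
        if ((mask & ((uint32_t)1 << x)) == 0)
            return x;
    }
    return -1;
}
```

When the low half is not fully set, the first clear bit must lie within the first 16 positions, which is why the bound holds in both cases.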
Hans Petter Selasky
e93086d0bf Workaround for possible bug in the SAF1761 chip. Wait 125us before
re-using a hardware proprietary transfer descriptor, PTD, in USB host
mode. If the PTD's are recycled too quickly, it has been observed that
the hardware simply fails to schedule the requested job or resets
completely disconnecting all devices.
2014-12-03 21:48:30 +00:00
Sean Bruno
258c6e3d7a There is only one argemdio device on this board.
Sponsored by:	Nicholas Esborn <nick@desert.net>
2014-12-03 19:41:49 +00:00
Sean Bruno
3bfb0fa7e9 Assign argemdio0 to the correct base address and assign argemdio1 to its
proper place *after* argemdio0

Correctly place arge0 and arge1 on their respective bus positions.

Sponsored by:	Nicholas Esborn <nick@desert.net>
2014-12-03 18:08:39 +00:00
Alexander Motin
ffe9621cc3 Increase CTL ports limit from 128 to 256 and LUNs limit from 256 to 1024.
After recent optimizations this change is no longer blocked by CTL memory
consumption.  Those limits are still not free, but much cheaper now.

MFC after:	1 week
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2014-12-03 16:04:01 +00:00
John Baldwin
b10c08a52b Revert device_getenv_int() for now as it duplicates resource_int_value().
We should perhaps implement a device_getenv_*() and device_setenv_*() API
as a convenience wrapper on top of resource_*_value() and resource_set_*().
2014-12-03 15:29:53 +00:00
Alexander Motin
f9477570ec Unify function names after r275458.
MFC after:	1 month
2014-12-03 15:19:38 +00:00
Alexander Motin
9e52565344 Do not pre-allocate UNIT ATTENTIONs storage for every possible initiator.
Abusing the ability of major UAs to cover minor ones, we may not account UAs for
inactive ports.  Allocate UAs storage for port and start accounting only
after some initiator from that port fetched its first POWER ON OCCURRED.

This reduces per-LUN CTL memory usage from >1MB to less than 100K.

MFC after:	1 month
2014-12-03 15:16:18 +00:00
Ed Maste
672aa7f472 Increase BERI loader section alignment to 16
The .text, .bss, and .data sections claimed 16-byte alignment, but were
only aligned to 8 by the linker script.

Discovered with strip(1) from elftoolchain, which performs validation
absent from the binutils strip(1).

Sponsored by:	DARPA, AFRL
2014-12-03 14:04:57 +00:00
Alexander Motin
c9fe195c24 Remove some unused code.
2014-12-03 10:39:47 +00:00
Alexander Motin
411598df7a Do not pre-allocate reservation keys memory for every possible initiator.
In configurations with many ports, like iSCSI, each LUN is typically
accessed only by limited subset of ports.  Allocating that memory on
demand allows reducing CTL memory usage from 5.3MB/LUN to 1.3MB/LUN.

MFC after:	1 month
2014-12-03 09:05:53 +00:00
Alexander Motin
2a72b5936d Plug memory leaks on UNMAP and XCOPY with invalid parameters.
MFC after:	1 week
2014-12-03 08:25:41 +00:00
Andrey V. Elsukov
18961126cb Remove __P() macro.
Suggested by:	kevlo
Sponsored by:	Yandex LLC
2014-12-03 04:08:41 +00:00
Andrey V. Elsukov
2e84e6eac9 ANSIfy function declarations.
Sponsored by:	Yandex LLC
2014-12-03 03:50:54 +00:00
Warner Losh
86fd88736f Remove unused PCMCIA_CARD* macros.
Always include the card human readable name. We support ~270 cards and
at ~20 bytes each, this bloats things by only ~5k. Retain the
PCMCIA_CARD vs PCMCIA_CARD_D distinction, though, in case this is
intolerable.
2014-12-03 00:47:05 +00:00
Jack F Vogel
4da1bbcda5 Revert r275136, it was not approved, it was sloppy, if a feature
like this is needed please resubmit for Intel's approval.
2014-12-02 23:02:57 +00:00
Michael Tuexen
4e88d37a2a Do the renaming of sb_cc to sb_ccc in a way with less code changes by
using a macro.
This is an alternate approach to
https://svnweb.freebsd.org/changeset/base/275326
which is easier to handle upstream.

Discussed with: rrs, glebius
2014-12-02 20:29:29 +00:00
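The macro approach mentioned in 4e88d37a2a can be pictured as follows. This is a sketch under assumptions: the commit only says that a macro is used, so the exact definition and the cut-down structure below are illustrative, not the SCTP sources.

```c
/* Hypothetical, cut-down socket buffer; the real struct sockbuf has
 * many more fields. */
struct example_sockbuf {
    unsigned int sb_ccc;    /* claimed character count */
};

/* Keep the old field name as an alias so existing code that says
 * sb_cc keeps compiling against the renamed field. */
#define sb_cc sb_ccc

unsigned int example_queued_bytes(const struct example_sockbuf *sb)
{
    return sb->sb_cc;       /* expands to sb->sb_ccc */
}
```

The appeal of this design is that the rename touches one definition instead of every use site, which keeps the diff against an upstream copy small.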
George V. Neville-Neil
bd19924f6b This configuration file removes several debugging options, including
WITNESS and INVARIANTS checking, which are known to have significant
performance impact on running systems.  When benchmarking new features
this kernel should be used instead of the standard GENERIC.
This kernel configuration should never appear outside of the HEAD
of the FreeBSD tree.
2014-12-02 19:55:43 +00:00
Andrew Turner
c258e1cc69 Switch to unified syntax so these can be built with clang 3.5.
MFC after:	1 week
Sponsored by:	ABT Systems Ltd
2014-12-02 18:37:04 +00:00
Andrew Turner
4373d2f370 Use the APSR_nzcv format of mrc. The clang integrated assembler doesn't
support the old usage of r15.

Sponsored by:	ABT Systems Ltd
2014-12-02 18:35:34 +00:00
Andrew Turner
d9e2150b36 Fix the name of the coprocessor to include the "p" prefix, the clang
integrated assembler expects this.

MFC after:	1 Week
Sponsored by:	ABT Systems Ltd
2014-12-02 18:20:53 +00:00
Yoshihiro Takahashi
3047cb6803 MFi386: r275305 (by rdivacky)
Unbreak the code for non-digits below '0' by casting the expression
  to unsigned int.
2014-12-02 14:48:21 +00:00
Alexander Motin
77a06f9db0 Convert persis_offset from global variable to softc field.
2014-12-02 12:38:22 +00:00
Alexander Motin
40103f1ec4 Reduce code duplication by creating ctl_set_res_ua() helper.
2014-12-02 12:31:28 +00:00
Alexander Motin
1e8607769f Remove unused variable and unify some names.
2014-12-02 12:05:44 +00:00
Andriy Gapon
782c06dfc8 zfs_putpages: actually update mtime and ctime
Reported by:	Paul Koch <paul.koch@akips.com>
Tested by:	Paul Koch <paul.koch@akips.com>
MFC after:	2 weeks
2014-12-02 11:44:56 +00:00
Andrey V. Elsukov
2dfcd0ae9d Remove unneeded check. No need to do m_pullup to the size that we prepended.
MFC after:	1 week
Sponsored by:	Yandex LLC
2014-12-02 05:41:03 +00:00
Andrey V. Elsukov
bd766f3425 Remove unneeded check. No need to do m_pullup to the size that we prepended.
Sponsored by:	Yandex LLC
2014-12-02 05:28:40 +00:00
Andrey V. Elsukov
2d957916ef Remove route caching support from ipsec code. It isn't used for some time.
* remove sa_route_union declaration and route_cache member from struct secashead;
* remove key_sa_routechange() call from ICMP and ICMPv6 code;
* simplify ip_ipsec_mtu();
* remove #include <net/route.h>;

Sponsored by:	Yandex LLC
2014-12-02 04:20:50 +00:00
Andrey V. Elsukov
1fea1b0889 Remove unused structure declarations.
Sponsored by:	Yandex LLC
2014-12-02 02:41:44 +00:00
Andrey V. Elsukov
0e23cc372d Remove unused declarations.
Sponsored by:	Yandex LLC
2014-12-02 02:32:28 +00:00
Andrew Turner
7f9b314ff2 Pull in the NetBSD global offset table handling code. Clang 3.5 creates
relocations the linker complains about.

Obtained from:	NetBSD
MFC after:	1 Week
2014-12-01 21:04:26 +00:00
Rui Paulo
afe2c75694 Allow multiple devices to mmap. It's impossible to prevent this with
checks on the open/close functions.

MFC after:	1 week
2014-12-01 19:48:23 +00:00
Konstantin Belousov
6afb32fc67 Disable recursion for the process spinlock.
Tested by:	pho
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2014-12-01 17:36:10 +00:00
Alexander Motin
fa91cabfbd When passing LUN IDs through treat ASCII values as fixed-length, not
interpreting NULLs as EOLs, but converting them to spaces.

SPC-4 does not tell that T10-based IDs should be NULL-terminated/padded.
And while it tells that it should include only ASCII chars (0x20-0x7F),
there are some USB sticks (SanDisk Ultra Fit), that have NULLs inside
the value.  Treating NULLs as EOLs there made those LUN IDs non-unique.

MFC after:	1 week
2014-12-01 15:21:54 +00:00
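The fixed-length treatment described in fa91cabfbd can be sketched as below. The function name and buffer layout are hypothetical; the only behaviour taken from the commit message is that embedded NUL bytes become spaces and no longer end the identifier.

```c
#include <stddef.h>

/* Copy a fixed-length identifier field of 'len' bytes from 'src' to
 * 'dst', replacing embedded NUL bytes with spaces.  'dst' must have
 * room for len + 1 bytes; it is NUL-terminated after the full field. */
void example_copy_fixed_id(char *dst, const char *src, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = (src[i] == '\0') ? ' ' : src[i];
    dst[len] = '\0';
}
```

Because the whole field is always copied, two identifiers that differ only after an embedded NUL stay distinct.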
Alexander Motin
7511bd04e4 Move ctlfe_onoffline() out of lock to let it sleep when needed.
Do some more other polishing while there.

MFC after:	2 weeks
2014-12-01 13:55:45 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
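The transmit-side check described in c25290420e can be modelled in userland. Everything below is a mock: the structure, the EX_ constants and the function names are hypothetical, and the numeric values are placeholders, not the real M_HASHTYPE_* values from sys/mbuf.h.

```c
#include <stdint.h>

/* Placeholder hash types standing in for M_HASHTYPE_NONE and
 * M_HASHTYPE_OPAQUE. */
enum example_hashtype {
    EX_HASHTYPE_NONE = 0,
    EX_HASHTYPE_OPAQUE = 1
};

/* Cut-down packet header with only the two fields of interest. */
struct example_pkthdr {
    uint32_t flowid;
    uint8_t rsstype;
};

/* Driver side: the flowid is meaningful only when a hash type is set. */
int example_flowid_valid(const struct example_pkthdr *h)
{
    return h->rsstype != EX_HASHTYPE_NONE;
}

/* Sender side: plain assignments, with the opaque type marking the
 * flowid as valid but of no particular kind. */
void example_set_flowid(struct example_pkthdr *h, uint32_t flowid)
{
    h->flowid = flowid;
    h->rsstype = EX_HASHTYPE_OPAQUE;
}
```

In this model the validity of the flowid travels with the hash type field, so no separate flag bit is needed.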
Konstantin Belousov
95c4bf756a Provide mutual exclusion between zone allocation/destruction and
uma_reclaim().  Reclamation code must not see half-constructed or
destructed zones.  Do this by bracing uma_zcreate() and uma_zdestroy()
into a shared-locked sx, and take the sx exclusively in uma_reclaim().

Usually zones are not created/destroyed during the system operation,
but tmpfs mounts do cause zone operations and exposed the bug.

Another solution could be to only expose a new keg on uma_kegs list
after the corresponding zone is fully constructed, and similar
treatment for the destruction.  But it probably requires more risky
code rearrangement as well.

Reported and tested by:	pho
Discussed with:	avg
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-11-30 20:20:55 +00:00
Justin T. Gibbs
2c6bf3d90b Remove trailing whitespace.
2014-11-30 19:32:00 +00:00
Bryan Venteicher
abec64bc76 Cleanup and performance improvement of the virtio_blk driver
  - Add support for GEOM direct completion. Depending on the benchmark,
    this tends to give a ~30% improvement w.r.t IOPs and BW.
  - Remove an invariants check in the strategy routine. This assertion
    is caught later on by an existing panic.
  - Rename and resort various related functions to make more sense.

MFC after:	1 month
2014-11-30 16:36:26 +00:00
Gleb Smirnoff
2cbcd3c198 Merge from projects/sendfile:
- Provide pru_ready function for TCP.
- Don't call tcp_output() from tcp_usr_send() if no ready data was put
  into the socket buffer.
- In case of dropped connection don't try to m_freem() not ready data.

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2014-11-30 13:43:52 +00:00
Gleb Smirnoff
c80ea19b38 Merge from projects/sendfile:
Provide pru_ready for AF_LOCAL sockets.  Local sockets send data directly
to the receive buffer of the peer, thus pru_ready also works on the peer
socket.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 13:40:58 +00:00
Gleb Smirnoff
651e4e6a30 Merge from projects/sendfile: extend protocols API to support
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2014-11-30 13:24:21 +00:00
Gleb Smirnoff
0f9d0a73a4 Merge from projects/sendfile:
o Introduce a notion of "not ready" mbufs in socket buffers.  These
mbufs are now being populated by some I/O in background and are
referenced outside.  This has the following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- Neither can an mbuf that is behind a "not ready" one in the queue.
- If the socket buffer is flushed, then "not ready" mbufs shouldn't be
  freed.

o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
  The sb_ccc stands for "claimed character count", or "committed
  character count".  And the sb_acc is "available character count".
  Consumers of socket buffer API should no longer access them directly,
  but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
  with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
  search.
o New function sbready() is provided to activate certain amount of mbufs
  in a socket buffer.

A special note on SCTP:
  SCTP has its own sockbufs.  Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs.  Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them.  The new
notion of "not ready" data isn't supported by SCTP.  Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
  A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does.  This was discussed with rrs@.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 12:52:33 +00:00
Andrew Turner
4c1720fd9a Correct a few incorrect uses of ENTRY/EENTRY and END/EEND
Sponsored by:	ABT Systems Ltd
2014-11-30 12:25:04 +00:00
Andrew Turner
13fb42aabe Remove extra labels, ENTRY_NP already provides them.
Sponsored by:	ABT Systems Ltd
2014-11-30 12:20:24 +00:00
Gleb Smirnoff
300fa232ee Missed in r274421: use sbavail() instead of bare access to sb_cc.
2014-11-30 12:11:01 +00:00
Gleb Smirnoff
57f43a45a3 - Move sbcheck() declaration under SOCKBUF_DEBUG.
- Improve SOCKBUF_DEBUG macros.
- Improve sbcheck().

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 11:22:39 +00:00
Gleb Smirnoff
8967b220a3 Make sballoc() and sbfree() functions. Ideally, they could be marked
as static, but unfortunately Infiniband (ab)uses them.

Sponsored by:	Nginx, Inc.
2014-11-30 11:02:07 +00:00
Roman Divacky
6e52f863ee Unbreak the code for non-digits below '0' by casting the expression
to unsigned int.

Pointed out by: bde
2014-11-30 08:43:55 +00:00
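The unsigned-cast idiom that 6e52f863ee refers to can be shown on its own. This is a generic sketch with a hypothetical function name, not the boot2 source.

```c
/* Return 1 if 'c' is an ASCII digit.  For characters below '0' the
 * subtraction is negative; the cast turns it into a very large
 * unsigned value, so the single comparison rejects it. */
int example_isdigit(int c)
{
    return (unsigned int)(c - '0') < 10u;
}
```

One unsigned comparison covers both the lower and the upper bound, which suits size-constrained code.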
Justin Hibbits
a8920f67f3 Add support for dtrace:fbt on modules for PowerPC
Summary:
Revert the initial FBT-with-KDB changes for trap_subr*.S, and instead use the
db_trap filter function to handle dtrace trap filtering.  With this, the MMU is
enabled by the support code, simplifying the codepath altogether.

Test Plan: Tested on my G4 PowerBook

Reviewers: #powerpc, nwhitehorn

Reviewed By: nwhitehorn

Differential Revision: https://reviews.freebsd.org/D1207

MFC after:	3 weeks
2014-11-29 20:54:33 +00:00
Andrew Turner
b643b9341c Update _ENTRY to use _EENTRY to reduce the common code.
2014-11-29 19:31:23 +00:00
Warner Losh
fac92ae126 The current limit of 100k for the linker hints file is getting a bit
crowded as we now are at about 70k. Bump the limit to 1MB instead
which is still quite a reasonable limit and allows for future growth
of this file and possible future expansion to additional data.

MFC After: 2 weeks
2014-11-29 17:29:30 +00:00
Konstantin Belousov
6762091ea4 Remove lock recursion for the pipe pair mutex, and disable the
recursion on mutex initialization.

The only places where the recursive acquire is performed are read and
write filters, since knlist, which uses the pipe pair mutex as lock,
is locked when filter is called.

The recursion was added in r93296, and consistent locking for
kn_fop->f_event() introduced in r133741.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2014-11-29 17:18:20 +00:00
Baptiste Daroussin
575bd6d8aa Ignore more warnings with external gcc
2014-11-29 14:30:39 +00:00
Yoshihiro Takahashi
ed7959517b MFi386: r275059, r275061, r275062 and r275191 (by rdivacky)
Shrink boot2 by a couple more bytes.
2014-11-29 12:22:31 +00:00
Yoshihiro Takahashi
6d6911c44b MFi386: r275237 (by rdivacky)
Shrink boot2 a bit more by factoring out common pattern
  of printf();return(-1);
2014-11-29 09:27:18 +00:00
Roman Divacky
0fa5393fa8 Shrink boot2 a bit more by factoring out common pattern
of printf();return(-1);

This shrinks it by 8bytes using clang35 and by 12bytes using clang34.
2014-11-29 08:59:26 +00:00
Bjoern A. Zeeb
2c3774c183 After r275196 unbreak NOIP and NOINET kernels by hiding an otherwise
unused variable under the proper #ifdef.
2014-11-28 14:51:49 +00:00
Eygene Ryabinkin
317d2b1e5c DRM2: fix off-by-one overflow in ioctl processing
Call to the driver-specific ioctl used to process ioctl number
that will lead to the out-of-bounds access to the ioctl handler
array.

PR:		193367
Approved by:	kib
MFC after:	1 week
2014-11-28 12:14:59 +00:00
Andrew Turner
15eb3a7427 Some device tree configurations place the generic timer under the root
of the tree and not under simplebus. Update the driver to handle this.

Submitted by:	Julien Grall <julien.grall AT linaro.org>
MFC after:	1 week
2014-11-28 11:49:26 +00:00
Andrew Turner
56f0c37e9f We don't use the hypervisor interrupt, make it optional in the device tree.
Submitted by:	Julien Grall <julien.grall AT linaro.org>
MFC after:	1 week
2014-11-28 11:45:53 +00:00
Konstantin Belousov
70778bba03 Assert the state of the process lock and sigact mutex in
kern_sigprocmask() and reschedule_signals().

Discussed with:	rea
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-11-28 10:20:00 +00:00
Hans Petter Selasky
50ae6690fc Style changes:
- Move two IOCTL related defines to the top of the C-file
- Add more comments describing the recently added IOCTL small size and
small align macros
2014-11-28 09:32:07 +00:00
Cy Schubert
006e24e909 Correctly define constants.
MFC after:	1 week
2014-11-28 04:07:06 +00:00
Alexander V. Chernikov
1a3a2b6798 Fix build broken by r275195.
2014-11-27 23:10:03 +00:00
Alexander V. Chernikov
74860d4f7c Do not return unlocked/unreferenced lle in arpresolve/nd6_storelladdr -
return lle flags IFF needed.
Do not pass rte to arpresolve - pass is_gateway flag instead.
2014-11-27 23:06:25 +00:00
Alexander V. Chernikov
c69aeaad14 Do not try to copy header to @dst and then back to ethernet in case of
pseudo_AF_HDRCMPLT:

we copy media header from mbuf to 'struct sockaddr' @dst in bpf_movein, so
mbuf already contains valid info.
2014-11-27 21:29:19 +00:00
Roman Divacky
2dd4dcd2ab Revert part of r275059. Comparing unsigned 8 bit value
against -'0' is always false so the conditional block is
optimized away.
2014-11-27 18:43:44 +00:00
Justin Hibbits
409062f166 Fix hwpmc sampling for ppc970 (G5-class) processors.
With this, hwpmc sampling now works on these processors.

MFC after:	3 weeks
Relnotes:	yes
2014-11-27 18:41:14 +00:00
Justin Hibbits
bd52e21d55 Fix hwpmc sampling for MPC74xxx (G4) processors.
With this, hwpmc sampling now works correctly on these processors.

MFC after:	3 weeks
Relnotes:	yes
2014-11-27 06:42:34 +00:00
Andrey V. Elsukov
ffbf9cdeb6 Remove ip4_input() declaration. It was removed in r275133.
MFC after:	1 month
2014-11-27 00:27:39 +00:00
Ed Maste
e6f46a8880 Increase default and maximum callchain depths
Bump the default from 16 to 32, to accommodate kernel flamegraphs.
Bump the maximum from 32 to 128, to accommodate deep user stacks.

Reviewed by:	gnn
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1203
2014-11-26 20:56:08 +00:00
Adrian Chadd
c1a4be0fc0 Add PCI ID for Intel Lynx Point LP controller.
PR:		kern/195398
Submitted by:	grembo
Obtained from:	DragonflyBSD
MFC after:	1 week
2014-11-26 20:34:05 +00:00
Alfred Perlstein
56c14bca7e Make igb and ixgbe check tunables at probe time.
This allows one to make a kernel module to tune the
number of queues before the driver loads.

This is needed so that a module at SI_SUB_CPU can set
tunables for these drivers to take.  Otherwise getenv
is called too early by the TUNABLE macros.

Reviewed by: smh
Phabric: https://reviews.freebsd.org/D1149
2014-11-26 20:19:36 +00:00
Andrey V. Elsukov
b05765d75f Do not use xform_ipip as decapsulation fallback.
xform_ipip was used as fallback with low priority for IPIP
encapsulated packets that were decrypted. In some cases
it can decapsulate packets, that it shouldn't. This leads to situations,
when wrong configurations are magically working. Also it can propagate
wrong ingress interface and this can break security.

Now we redesigned the IPSEC code and IPIP encapsulation is called directly
from ipsec_output, and decapsulation is done in the ipsec_input with m_striphdr.

Differential Revision:	https://reviews.freebsd.org/D1220
MFC after:	1 month
Sponsored by:	Yandex LLC
2014-11-26 17:44:49 +00:00
Alexander Motin
2731e062b5 Fix WWNN/WWPN generation for virtual channels.
MFC after:	1 week
2014-11-26 16:05:01 +00:00
Alexander Motin
3e92f72cfa Fix incorrect check, blocking MULTIID functionality.
MFC after:	1 week
2014-11-26 15:03:21 +00:00
Konstantin Belousov
5c7bebf961 The process spin lock currently has the following distinct uses:
- Thread lifetime cycle, in particular, counting of the threads in
  the process, and interlocking with process mutex and thread lock.
  The main reason for this is that turnstile locks come after thread
  locks, so you e.g. cannot unlock a blockable mutex (think process
  mutex) while owning a thread lock.

- Virtual and profiling itimers, since timer activation is done
  from the clock interrupt context.  Replace the p_slock by p_itimmtx
  and PROC_ITIMLOCK().

- Profiling code (profil(2)), for a similar reason.  Replace the p_slock
  by p_profmtx and PROC_PROFLOCK().

- Resource usage accounting.  The need for the spinlock there is subtle;
  my understanding is that the spinlock blocks context switching for the
  current thread, which prevents td_runtime and similar fields from
  changing (updates are done in mi_switch()).  Replace the p_slock
  by p_statmtx and PROC_STATLOCK().

The split is done mostly for code clarity, and should not affect
scalability.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-11-26 14:10:00 +00:00
Konstantin Belousov
e442f29f08 Fix SA_SIGINFO | SA_RESETHAND handling. The sysent's sv_sendsig()
method needs the pre-reset state of ps_siginfo to correctly construct
the signal frame.

Move the sigdflt() call after the sv_sendsig() invocation in postsig().
Simultaneously extract common code from trapsignal() and postsig()
into a new helper, postsig_done().

Submitted by:	rea
MFC after:	1 week
2014-11-26 14:09:04 +00:00
Alexander Motin
315a4d6fb4 Some microoptimizations.
MFC after:	1 month
2014-11-26 13:56:54 +00:00
Alexander Motin
8592f07464 Make isp_find_pdb_by_*() search for targets in portdb in reverse order.
Records with target_mode == 1 are allocated from the end of portdb, so it
seems logical to start the search from the end rather than traverse the
whole array.

MFC after:	1 month
2014-11-26 12:25:00 +00:00
Hans Petter Selasky
b2d05a1b26 Add new USB quirk.
MFC after:	1 week
PR:		195372
2014-11-26 10:58:08 +00:00
Alexander Motin
e67f3bec39 Add a bunch of PCI IDs of Intel Wildcat Point (9 Series) chipsets.
MFC after:	1 week
2014-11-26 04:23:21 +00:00
Xin LI
d10e52d627 Revert r273060 per discussion with avg@ as we need to make L2ARC
aware of 4K devices and this one is not the right fix anyway.
2014-11-26 02:20:25 +00:00
Roman Divacky
6e1f6899f7 Fix style(9).
Suggested by: jkim
2014-11-25 18:58:40 +00:00
Roman Divacky
029bae9837 Fix style(9).
Suggested by: jkim
2014-11-25 18:53:17 +00:00
Roman Divacky
7eb52354d1 Shrink boot2 by a couple more bytes.
Reviewed by:    jhb
Tested by:      me, dim
2014-11-25 18:35:47 +00:00
Alexander Motin
f7241cceb0 Coalesce last data move and command status for read commands.
Make CTL core and block backend set success status before initiating the last
data move for read commands.  Make CAM target and iSCSI frontends detect
this condition and send command status together with data.  A new I/O flag
allows skipping duplicate status sending on the later fe_done() call.

For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS.  For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-11-25 17:53:35 +00:00
Sean Bruno
82e843b93b Add support for Buffalo WZR-HP-AG300H Atheros MIPS router.
Special thanks to Nicholas Esborn for the loaner router to get this
target bootstrapped.

Review:  D777
Reviewed by:    adrian
Sponsored by:   Nicholas Esborn <nick@desert.net>
2014-11-25 17:33:22 +00:00
Ruslan Bukin
c97038fa5a o Add Virtio MMIO bus driver to config
o Move Virtio-related to common config file
2014-11-25 16:53:22 +00:00
Ruslan Bukin
b4db959ac5 Add new devices to the config.
2014-11-25 16:24:31 +00:00
Ruslan Bukin
e8cf387c51 o Add PIO and vtblk mmio device info to the tree
o Add FPGA memory window to static dev mappings
o Fix whitespace
2014-11-25 16:06:19 +00:00
Ruslan Bukin
13e19fb323 Add BERI-specific virtio block backend device driver.
This part is intended to operate on the ARM side of a heterogeneous
(ARM/BERI) system on chip.
2014-11-25 15:58:59 +00:00
Andriy Gapon
e021bcbbc6 whitespace and cosmetic changes in callout_reset family of macros
- add parentheses around macro parameters for consistent style
- remove redundant parentheses around an expression
- use tab before a line continuation symbol

Differential Revision:	https://reviews.freebsd.org/D1161 (partial)
Reviewed by:	markj
MFC after:	1 week
2014-11-25 15:24:05 +00:00
Andriy Gapon
088b124ba7 callout(9): add sbt flavors of callout_schedule
Differential Revision:	https://reviews.freebsd.org/D1161 (partial)
Reviewed by:	jhb, markj
MFC after:	1 week
2014-11-25 15:21:21 +00:00
John Baldwin
fbdb0b778a MFamd64: Check for invalid flags in the machine context in sigreturn()
and setcontext().
2014-11-25 12:52:00 +00:00
Alexander Motin
993a751eb3 Decouple datamove/done logic from CTL status set.
2014-11-25 12:22:29 +00:00
Justin Hibbits
4dc3495501 Add Apple Intrepid USB controller ID.
MFC after:	2 weeks
2014-11-25 06:15:00 +00:00
Alexander Motin
4a2863452f Use ctl_set_success() instead of direct inlining.
MFC after:	1 week
2014-11-25 06:11:05 +00:00
Ed Maste
294246bb7d Revert r274772: it is not valid on MIPS
Reported by:	sbruno
2014-11-25 03:50:31 +00:00
Kevin Lo
f386f04f11 Add missing headers needed by write().
2014-11-25 02:58:38 +00:00
Andrey V. Elsukov
af6209a133 Skip L2 addresses lookups for p2p interfaces.
Discussed with:	melifaro
Sponsored by:	Yandex LLC
2014-11-24 21:51:43 +00:00
John Baldwin
a2d751936b Add a bus_get_domain() wrapper around BUS_GET_DOMAIN(). Use this to add
a new per-device '%domain' sysctl node that returns the NUMA domain a
device is associated with if it is associated with one.

Note that this API is still a WIP and might change before 11.0 actually
ships.

Differential Revision:	https://reviews.freebsd.org/D930
Reviewed by:	kib, adrian
2014-11-24 19:55:45 +00:00
John Baldwin
20abb66ede Properly initialize the capability rights for vnodes exported to procstat
that aren't for file descriptors (cwd, jdir, tracevp, etc.).

Submitted by:	Mikhail <mp@lenta.ru>
2014-11-24 18:34:11 +00:00
Ian Lepore
e56e554106 Add busdma sync ops before reading and after modifying the descriptor rings.
This was previously working by accident because BUSDMA_COHERENT_MEMORY has
always been set to strongly-ordered on arm.  Now we're moving towards
normal-uncacheable (what might be called write-combining on other platforms)
and using the proper sync ops will be more important.  Of course, that
opens the question of just what is the "proper" sync op for shared
concurrent dma access as opposed to accesses where the handoff of control
of the memory has well-defined sequence points that match the available
busdma sync operations.
2014-11-24 16:12:11 +00:00
Philip Paeps
894d1973f1 Add a sysctl `net.link.tap.deladdrs_on_close' to configure whether tap
should delete configured addresses and routes when the interface is
closed.  Default is enabled (preserve current behaviour).

MFC after:	1 week
2014-11-24 14:00:27 +00:00
Alexander Motin
1251a76b12 Replace home-grown CTL IO allocator with UMA.
The old allocator created significant lock congestion protecting its lists
of preallocated I/Os, while UMA provides much better SMP scalability.
The downside of UMA is the lack of reliable preallocation that could guarantee
successful allocation in non-sleepable environments.  But careful code
review has shown that only the CAM target frontend really has that requirement.
Fix that by making that frontend preallocate and statically bind a CTL I/O for
every ATIO/INOT it preallocates anyway.  That avoids allocations
in the hot I/O path.  Other frontends either may sleep in allocation context
or can properly handle allocation errors.

On a 40-core server with 6 ZVOL-backed LUNs and 7 iSCSI client connections
this change increases peak performance from ~700K to >1M IOPS!  Yay! :)

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-11-24 11:37:27 +00:00
Gleb Smirnoff
1bb5ad634e We already have "int i" in this scope.
Submitted by:	alc
2014-11-24 07:57:20 +00:00
Ian Lepore
3787815761 The arm PJ4B cpu is armv7 architecture, not v6.
If this feels like deja vu... the last time this was fixed in this file
only ARM_MMU_V6 was fixed, this time it's ARM_ARCH_V6 (and this time I
searched for other occurrences of pj4b in here).
2014-11-24 01:13:58 +00:00