105286 Commits

Author SHA1 Message Date
bryanv
6e8ba9083a Cleanup and performance improvement of the virtio_blk driver
- Add support for GEOM direct completion. Depending on the benchmark,
    this tends to give a ~30% improvement w.r.t IOPs and BW.
  - Remove an invariants check in the strategy routine. This assertion
    is caught later on by an existing panic.
  - Rename and resort various related functions to make more sense.

MFC after:	1 month
2014-11-30 16:36:26 +00:00
glebius
0aee39de73 Merge from projects/sendfile:
- Provide pru_ready function for TCP.
- Don't call tcp_output() from tcp_usr_send() if no ready data was put
  into the socket buffer.
- In case of dropped connection don't try to m_freem() not ready data.

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2014-11-30 13:43:52 +00:00
glebius
3dd3d6d9ff Merge from projects/sendfile:
Provide pru_ready for AF_LOCAL sockets.  Local sockets sendsdata directly
to the receive buffer of the peer, thus pru_ready also works on the peer
socket.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 13:40:58 +00:00
glebius
9cadf1b974 Merge from projects/sendfile: extend protocols API to support
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2014-11-30 13:24:21 +00:00
glebius
25da94eb3e Merge from projects/sendfile:
o Introduce a notion of "not ready" mbufs in socket buffers.  These
mbufs are now being populated by some I/O in background and are
referenced outside.  This forces following implications:
- An mbuf which is "not ready" can't be taken out of the buffer.
- An mbuf that is behind a "not ready" in the queue neither.
- If sockbet buffer is flushed, then "not ready" mbufs shouln't be
  freed.

o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc.
  The sb_ccc stands for ""claimed character count", or "committed
  character count".  And the sb_acc is "available character count".
  Consumers of socket buffer API shouldn't already access them directly,
  but use sbused() and sbavail() respectively.
o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones
  with M_BLOCKED.
o New field sb_fnrdy points to the first not ready mbuf, to avoid linear
  search.
o New function sbready() is provided to activate certain amount of mbufs
  in a socket buffer.

A special note on SCTP:
  SCTP has its own sockbufs.  Unfortunately, FreeBSD stack doesn't yet
allow protocol specific sockbufs.  Thus, SCTP does some hacks to make
itself compatible with FreeBSD: it manages sockbufs on its own, but keeps
sb_cc updated to inform the stack of amount of data in them.  The new
notion of "not ready" data isn't supported by SCTP.  Instead, only a
mechanical substitute is done: s/sb_cc/sb_ccc/.
  A proper solution would be to take away struct sockbuf from struct
socket and allow protocols to implement their own socket buffers, like
SCTP already does.  This was discussed with rrs@.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 12:52:33 +00:00
andrew
45f9f39749 Correctly a few incorrect uses of ENTRY/EENTRY and END/EEND
Sponsored by:	ABT Systems Ltd
2014-11-30 12:25:04 +00:00
andrew
1409e27fc7 Remove extra labels, ENTRY_NP already provides them.
Sponsored by:	ABT Systems Ltd
2014-11-30 12:20:24 +00:00
glebius
996bd7abd3 Missed in r274421: use sbavail() instead of bare access to sb_cc. 2014-11-30 12:11:01 +00:00
glebius
176ae2299c - Move sbcheck() declaration under SOCKBUF_DEBUG.
- Improve SOCKBUF_DEBUG macros.
- Improve sbcheck().

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-30 11:22:39 +00:00
glebius
843bdaf93c Make sballoc() and sbfree() functions. Ideally, they could be marked
as static, but unfortunately Infiniband (ab)uses them.

Sponsored by:	Nginx, Inc.
2014-11-30 11:02:07 +00:00
rdivacky
844e919815 Unbreak the code for non-digits below '0' by casting the expression
to unsigned int.

Pointed out by: bde
2014-11-30 08:43:55 +00:00
jhibbits
56d5238825 Add support for dtrace:fbt on modules for PowerPC
Summary:
Revert the initial FBT-with-KDB changes for trap_subr*.S, and instead use the
db_trap filter function to handle dtrace trap filtering.  With this, the MMU is
enabled by the support code, simplifying the codepath altogether.

Test Plan: Tested on my G4 PowerBook

Reviewers: #powerpc, nwhitehorn

Reviewed By: nwhitehorn

Differential Revision: https://reviews.freebsd.org/D1207

MFC after:	3 weeks
2014-11-29 20:54:33 +00:00
andrew
443d4e8f9b Update _ENTRY to use _EENTRY to reduce the common code. 2014-11-29 19:31:23 +00:00
imp
d737995628 The current limit of 100k for the linker hints file is getting a bit
crowded as we now are at about 70k. Bump the limit to 1MB instead
which is still quite a reasonable limit and allows for future growth
of this file and possible future expansion to additional data.

MFC After: 2 weeks
2014-11-29 17:29:30 +00:00
kib
9127a1f190 Remove lock recursion for the pipe pair mutex, and disable the
recursion on mutex initialization.

The only places where the recursive acquire is performed are read and
write filters, since knlist, which uses the pipe pair mutex as lock,
is locked when filter is called.

The recursion was added in r93296, and consistent locking for
kn_fop->f_event() introduced in r133741.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2014-11-29 17:18:20 +00:00
bapt
b658e1f14b Ignore more warnings with external gcc 2014-11-29 14:30:39 +00:00
nyan
e9f219c6ad MFi386: r275059, r275061, r275062 and r275191 (by rdivacky)
Shrink boot2 by a couple more bytes.
2014-11-29 12:22:31 +00:00
nyan
42d4bd50d9 MFi386: r275237 (by rdivacky)
Shrink boot2 a bit more by factoring out common pattern
  of printf();return(-1);
2014-11-29 09:27:18 +00:00
rdivacky
0dd1357d84 Shrink boot2 a bit more by factoring out common pattern
of printf();return(-1);

This shrinks it by 8bytes using clang35 and by 12bytes using clang34.
2014-11-29 08:59:26 +00:00
bz
e5417bbe54 After r275196 unbreak NOIP and NOINET kernels by hiding an otherwise
unused varibale under the proper #ifdef.
2014-11-28 14:51:49 +00:00
rea
96e8d9b85a DRM2: fix off-by-one overflow in ioctl processing
Call to the driver-specific ioctl used to process ioctl number
that will lead to the out-of-bounds access to the ioctl handler
array.

PR:		193367
Approved by:	kib
MFC after:	1 week
2014-11-28 12:14:59 +00:00
andrew
6c403c8a16 Some device tree configurations place the generic timer under the root
of the tree and not under simplebus. Update the driver to handle this.

Submitted by:	Julien Grall <julien.grall AT linaro.org>
MFC after:	1 week
2014-11-28 11:49:26 +00:00
andrew
c4cd8cf0ef We don't use the hypervisor interrupt, make it optional in the device tree.
Submitted by:	Julien Grall <julien.grall AT linaro.org>
MFC after:	1 week
2014-11-28 11:45:53 +00:00
kib
1642e3809f Assert the state of the process lock and sigact mutex in
kern_sigprocmask() and reschedule_signals().

Discussed with:	rea
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-11-28 10:20:00 +00:00
hselasky
f4e178e921 Style changes:
- Move two IOCTL related defines to the top of the C-file
- Add more comments describing the recently added IOCTL small size and
small align macros
2014-11-28 09:32:07 +00:00
cy
70f022105b Correctly define constants.
MFC after:	1 week
2014-11-28 04:07:06 +00:00
melifaro
510cff60c1 Fix build broken by r275195. 2014-11-27 23:10:03 +00:00
melifaro
95c680b9a3 Do not return unlocked/unreferenced lle in arpresolve/nd6_storelladdr -
return lle flags IFF needed.
Do not pass rte to arpresolve - pass is_gateway flag instead.
2014-11-27 23:06:25 +00:00
melifaro
09e8890761 Do not try to copy header to @dst and than back to ethernet in case of
pseudo_AF_HDRCMPLT:

we copy media header from mbuf to 'struct sockaddr' @dst in bpf_movein, so
mbuf already contains valid info.
2014-11-27 21:29:19 +00:00
rdivacky
50fd85b21c Revert part of r275059. Comparing unsigned 8 bit value
against -'0' is always false so the conditional block is
optimized away.
2014-11-27 18:43:44 +00:00
jhibbits
2d64995d17 Fix hwpmc sampling for ppc970 (G5-class) processors.
With this, hwpmc sampling now works on these processors.

MFC after:	3 weeks
Relnotes:	yes
2014-11-27 18:41:14 +00:00
jhibbits
da596c6bb1 Fix hwpmc sampling for MPC74xxx (G4) processors.
With this, hwpmc sampling now works correctly on these processors.

MFC after:	3 weeks
Relnotes:	yes
2014-11-27 06:42:34 +00:00
ae
6ee47c6705 Remove ip4_input() declaration. It was removed in r275133.
MFC after:	1 month
2014-11-27 00:27:39 +00:00
emaste
20daa3d157 Increase default and maximum callchain depths
Bump the default from 16 to 32, to accommodate kernel flamegraphs.
Bump the maximum from 32 to 128, to accommodate deep user stacks.

Reviewed by:	gnn
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1203
2014-11-26 20:56:08 +00:00
adrian
18385c7e83 Add PCI ID for Intel Lynx Point LP controller.
PR:		kern/195398
Submitted by:	grembo
Obtained from:	DragonflyBSD
MFC after:	1 week
2014-11-26 20:34:05 +00:00
alfred
1413b742e9 Make igb and ixgbe check tunables at probe time.
This allows one to make a kernel module to tune the
number of queues before the driver loads.

This is needed so that a module at SI_SUB_CPU can set
tunables for these drivers to take.  Otherwise getenv
is called too early by the TUNABLE macros.

Reviewed by: smh
Phabric: https://reviews.freebsd.org/D1149
2014-11-26 20:19:36 +00:00
ae
f9ef15aae9 Do not use xform_ipip as decapsulation fallback.
xform_ipip was used as fallback with low priority for IPIP
encapsulated packets that were decrypted. In some cases
it can decapsulate packets, that it shouldn't. This leads to situations,
when wrong configurations are magically working. Also it can propagate
wrong ingress interface and this can break security.

Now we redesigned the IPSEC code and IPIP encapsulation is called directly
from ipsec_output, and decapsulation is done in the ipsec_input with m_striphdr.

Differential Revision:	https://reviews.freebsd.org/D1220
MFC after:	1 month
Sponsored by:	Yandex LLC
2014-11-26 17:44:49 +00:00
mav
579087806c Fix WWNN/WWPN generation for virtual channels.
MFC after:	1 week
2014-11-26 16:05:01 +00:00
mav
469e2dc867 Fix incorrect check, blocking MULTIID functionality.
MFC after:	1 week
2014-11-26 15:03:21 +00:00
kib
11cee2ecf7 The process spin lock currently has the following distinct uses:
- Threads lifetime cycle, in particular, counting of the threads in
  the process, and interlocking with process mutex and thread lock.
  The main reason of this is that turnstile locks are after thread
  locks, so you e.g. cannot unlock blockable mutex (think process
  mutex) while owning thread lock.

- Virtual and profiling itimers, since the timers activation is done
  from the clock interrupt context.  Replace the p_slock by p_itimmtx
  and PROC_ITIMLOCK().

- Profiling code (profil(2)), for similar reason.  Replace the p_slock
  by p_profmtx and PROC_PROFLOCK().

- Resource usage accounting.  Need for the spinlock there is subtle,
  my understanding is that spinlock blocks context switching for the
  current thread, which prevents td_runtime and similar fields from
  changing (updates are done at the mi_switch()).  Replace the p_slock
  by p_statmtx and PROC_STATLOCK().

The split is done mostly for code clarity, and should not affect
scalability.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-11-26 14:10:00 +00:00
kib
4501dadd00 Fix SA_SIGINFO | SA_RESETHAND handling. The sysent' sv_sendsig()
method needs pre-reset state of the ps_siginfo to correctly construct
signal frame.

Move sigdflt() call after the sv_sendsig() invocation in postsig().
Simultaneously extract common code from trapsignal() and postsig()
into new helper postsig_done().

Submitted by:	rea
MFC after:	1 week
2014-11-26 14:09:04 +00:00
mav
7af8a12e6f Some microoptimizations.
MFC after:	1 month
2014-11-26 13:56:54 +00:00
mav
7b9c16e8b5 Make isp_find_pdb_by_*() search for targets in portdb in reverse order.
Records with target_mode == 1 are allocated from the end of portdb, so it
seems logical to start search from the end not traverse whole array.

MFC after:	1 month
2014-11-26 12:25:00 +00:00
hselasky
6cbccb2aed Add new USB quirk.
MFC after:	1 week
PR:		195372
2014-11-26 10:58:08 +00:00
mav
1bac48ea8f Add bunch of PCI IDs of Intel Wildcat Point (9 Series) chipsets.
MFC after:	1 week
2014-11-26 04:23:21 +00:00
delphij
cbde4886c5 Revert r273060 per discussion with avg@ as we need to make L2ARC
aware of 4K devices and this one is not the right fix anyway.
2014-11-26 02:20:25 +00:00
rdivacky
a6f749a70e Fix style(9).
Suggested by: jkim
2014-11-25 18:58:40 +00:00
rdivacky
df9fb4f0b5 Fix style(9).
Suggested by: jkim
2014-11-25 18:53:17 +00:00
rdivacky
776a1c3a81 Shrink boot2 by a couple more bytes.
Reviewed by:    jhb
Tested by:      me, dim
2014-11-25 18:35:47 +00:00
mav
115d2f3c0b Coalesce last data move and command status for read commands.
Make CTL core and block backend set success status before initiating last
data move for read commands.  Make CAM target and iSCSI frontends detect
such condition and send command status together with data.  New I/O flag
allows to skip duplicate status sending on later fe_done() call.

For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS.  For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-11-25 17:53:35 +00:00