170154 Commits

Author SHA1 Message Date
jhb
5cec5b65a5 Use a dedicated taskqueue with a thread that runs at a software-interrupt
priority for the periodic polling of the machine check registers.
2011-02-03 13:09:22 +00:00
rrs
4d632ac7a7 Fix the per CPU stats so that:
1) They don't use the giant "MAX_CPU" define and instead
   are allocated dynamically based on mp_ncpus
2) Will zero with the netstat -z -s -p sctp
3) Will be properly handled by both the sctp_init and finish
   (the multi-net stuff was incorrectly bzero'ing in sctp_init
    the wrong size.. the bzero is now moved to the right places).
    And of course the free is put in at the very end.

MFC after:	3 Months
2011-02-03 11:52:22 +00:00
pjd
d2daebca5a Setup another socketpair between parent and child, so that primary sandboxed
worker can ask the main privileged process to connect in worker's behalf
and then we can migrate descriptor using this socketpair to worker.
This is not really needed now, but will be needed once we start to use
capsicum for sandboxing.

MFC after:	1 week
2011-02-03 11:39:49 +00:00
pjd
0c525303bd Add missing locking after moving keepalive_send() to remote send thread
in r214692.

MFC after:	1 week
2011-02-03 11:33:32 +00:00
pjd
f5164be44b Drop privileges after connecting to hastd, but before sending or receiving
anything.

MFC after:	1 week
2011-02-03 10:44:40 +00:00
pjd
c7493a8a85 Let the caller log info about successful privilege drop.
We don't want to log this in hastctl.

MFC after:	1 week
2011-02-03 10:37:44 +00:00
rrs
11926ca84d Adds an experimental option to create a pool of
threads. These serve as input threads and are queued
packets based on the V-tag number. This is similar to
what a modern card can do with queue's for TCP... but
alas modern cards know nothing about SCTP.

MFC after:	3 months (maybe)
2011-02-03 10:05:30 +00:00
emaste
aef7d099b3 Include driver name in panic string, to make it easier to find these should
the panic ever occur.
2011-02-03 03:07:11 +00:00
emaste
dc94d2f814 Revert part of r173264. Both aac_ioctl_sendfib and aac_ioctl_send_raw_srb
make use of the aac_ioctl_event callback, if aac_alloc_command fails.  This
can end up in an infinite loop in the while loop in aac_release_command.

Further investigation into the issue mentioned by Scott Long [1] will be
necessary.

[1] http://lists.freebsd.org/pipermail/freebsd-current/2007-October/078740.html
2011-02-03 02:14:53 +00:00
imp
c70bfd860a Setting TARGET and TARGET_ARCH needs to be done in _MAKE, not in the
TGTS rule as _MAKE is used elsewhere.  This should fix make world.
2011-02-02 23:59:24 +00:00
jilles
0bf07f466d sh: Add test for shell script without '#!'. 2011-02-02 22:03:18 +00:00
jilles
e252925aeb sh: Remove comment mentioning herefd, which is gone. 2011-02-02 21:48:53 +00:00
uqs
479c789dcd Add some obsolete manpages. 2011-02-02 21:09:30 +00:00
bz
4f26140aec Add missing argument after r218192. 2011-02-02 20:00:35 +00:00
uqs
d53cb8f4e9 libkvm: fix logic inversion introduced with last commit
Reported by:	Brandon Gooch <jamesbrandongooch@gmail.com>
Pointy hat to:	uqs
2011-02-02 17:01:26 +00:00
mdf
b291e9a365 Put the general logic for being a CPU hog into a new function
should_yield().  Use this in various places.  Encapsulate the common
case of check-and-yield into a new function maybe_yield().

Change several checks for a magic number of iterations to use
should_yield() instead.

MFC after:	1 week
2011-02-02 16:35:10 +00:00
pjd
95c40e7f09 - Rename proto_descriptor_{send,recv}() functions to
proto_connection_{send,recv} and change them to return proto_conn
  structure. We don't operate directly on descriptors, but on
  proto_conns.
- Add wrap method to wrap descriptor with proto_conn.
- Remove methods to send and receive descriptors and implement this
  functionality as additional argument to send and receive methods.

MFC after:	1 week
2011-02-02 15:53:09 +00:00
pjd
1267c20f91 Add proto_connect_wait() to wait for connection to finish.
If timeout argument to proto_connect() is -1, then the caller needs to use
this new function to wait for connection.

This change is in preparation for capsicum, where sandboxed worker wants
to ask main process to connect in worker's behalf and pass descriptor
to the worker. Because we don't want the main process to wait for the
connection, it will start async connection and pass descriptor to the
worker who will be responsible for waiting for the connection to finish.

MFC after:	1 week
2011-02-02 15:46:28 +00:00
pjd
3acb629cd2 Allow to specify connection timeout by the caller.
MFC after:	1 week
2011-02-02 15:42:00 +00:00
pjd
fe04ca4197 Move protocol allocation and deallocation to separate functions.
MFC after:	1 week
2011-02-02 15:23:07 +00:00
jhb
1c8ea10c3d Fix build with DIAGNOSTIC enabled.
Pointy hat to:	jhb
2011-02-02 14:59:05 +00:00
pluknet
1d70892dc0 Remove OpenSolaris include path referring to a non-existing directory
never committed from p4 dtrace branch.
[The correct include path is referenced from every opensolaris compat
consumer's module Makefile, so it doesn't serve any purpose anyway.]

Reported by:	arundel on freebsd-hackers@ via clang
Approved by:	kib (mentor)
MFC after:	1 week
2011-02-02 14:41:32 +00:00
rrs
21c4d27f12 1) Allow a chunk to track the cwnd it was at when sent.
2) Add separate max-bursts for retransmit and hb. These
   are set to sysctlable values but not settable via the
   socket api. This makes sure we don't blast out HB's or
   fast-retransmits.
3) Determine on the first data transmission on a net if
   its local-lan (by being under or over a RTT). This
   can later be used to think about different algorithms
   based on locallan vs big-i (experimental)
4) The cwnd should NOT be allowed to grow when an ECNEcho
   is seen (TCP has this same bug). We fix this in SCTP
   so an ECNe being seen prevents an advance of cwnd.
5) CWR's should not be sent multiple times to the
   same network, instead just updating the TSN being
   transmitted if needed.

MFC after:	1 Month
2011-02-02 11:13:23 +00:00
pjd
3d021845fb Be prepared that hp_client or hp_server might be NULL now.
MFC after:	1 week
2011-02-02 08:24:26 +00:00
marcel
696e30ffcc Rename INTR_VEC to MAP_IRQ. From the OFW or FDT we obtain a
PIC handle with interrupt pin. This we map to the resource
called SYS_RES_IRQ.
2011-02-02 05:58:51 +00:00
adrian
7ca6cd23bc Call the correct ANI Attach routine. 2011-02-02 03:55:34 +00:00
imp
b79d9b46d0 Make the generated files depend on the Makefile so new platforms are easier
to add than mipsn32 was when I was working on it...
2011-02-02 03:27:31 +00:00
imp
8045a94eb8 Revert last change now that the reason for it is no more...
MACHINE_ARCH is now always mipsel when building mips/mips.
2011-02-02 03:24:52 +00:00
mm
a9d332367d Recommit r218169, enclosing with #ifdef _KERNEL
This change is sufficient for the ZFS kernel module.

Discussed with:	pjd
MFC after:	1 week
2011-02-01 23:12:13 +00:00
imp
705cc0b145 Whitespace nit 2011-02-01 22:50:23 +00:00
n_hibma
e2cbb786e8 New ID for the Novatel MC547
PR:		154127
Submitted by:	Mike Tancsa
MFC after:	1 day
2011-02-01 22:26:06 +00:00
kan
e2a36c3715 Revert r218169 until it can be tested and fixed properly. 2011-02-01 21:15:35 +00:00
jhb
6121863a62 Some cosmetic fixes and remove a duplicate constant.
Submitted by:	Pedro F. Giffuni  giffunip at yahoo
2011-02-01 18:30:52 +00:00
jhb
3ce37d1bc3 - Set the next_alloc fields for an i-node after allocating a new block
so that future allocations start with most recently allocated block
  rather than the beginning of the filesystem.
- Fix ext2_alloccg() to properly scan for 8 block chunks that are not
  aligned on 8-bit boundaries.  Previously this was causing new blocks
  to be allocated in a highly fragmented fashion (block 0 of a file at
  lbn N, block 1 at lbn N + 8, block 2 at lbn N + 16, etc.).
- Cosmetic tweaks to the currently-disabled fancy realloc sysctls.

PR:		kern/153584
Discussed with:	bde
Tested by:	Pedro F. Giffuni  giffunip at yahoo, Zheng Liu (lz)
2011-02-01 18:21:45 +00:00
jhb
eb4f6eac7b Output an appropriate amount of padding to line up per-CPU state columns
rather than using a terminal sequence to move the cursor when drawing the
initial screen.

Requested by:	arundel
MFC after:	3 days
2011-02-01 15:48:27 +00:00
adrian
e42de5418c Just to be sure, make sure the MCS rates are allowed for TX.
Approved by:	rpaulo@
2011-02-01 15:26:30 +00:00
mm
cab1b4d893 For ZFS, change the type of clock_t to int64_t.
The clock_t type in OpenSolaris is long (int64_t on amd64).
On FreeBSD clock_t is int32_t. The clock_t type is used in several places
in the ZFS code to store system uptime in milliseconds ("seconds * hz").

With hz=1000 we have a 32-bit integer overflow in 24 days, 20 hours,
31 minutes and 23.648 seconds. This has a user reported negative impact
on l2arc_feed_thread() and may cause unexpected results from other functions
using clock_t.

Reported by:	Artem Belevich <fbsdlist@src.cx> on freebsd-fs@
MFC after:	1 week
2011-02-01 14:28:50 +00:00
kib
d690434e20 The unp_gc() function drops and reaquires lock between scan and
collect phases.  The unp_discard() function executes
unp_externalize_fp(), which might make the socket eligible for gc-ing,
and then, later, taskqueue will close the socket.  Since unp_gc()
dropped the list lock to do the malloc, close might happen after the
mark step but before the collection step, causing collection to not
find the socket and miss one array element.

I believe that the race was there before r216158, but the stated
revision made the window much wider by postponing the close to
taskqueue sometimes.

Only process as much array elements as we find the sockets during
second phase of gc [1].  Take linkage lock and recheck the eligibility
of the socket for gc, as well as call fhold() under the linkage lock.

Reported and tested by:	jmallett
Submitted by:   jmallett [1]
Reviewed by:	rwatson, jeff (possibly)
MFC after:	1 week
2011-02-01 13:33:49 +00:00
lstewart
8d21b8a169 Algorithm modules can define their own private congestion signal types in the
top 8 bits of the 32 bit signal bit field space for internal use. These private
signals should not be leaked outside of a module.

Given that many algorithm modules use the NewReno hook functions to simplify
their implementation, the obvious place such a leak would show up is in the
NewReno cong_signal hook function.

- Show the full number of significant bits in the signal type definitions in
  <netinet/cc.h>.

- Add a bitmask to simplify figuring out if a given signal is in the private or
  public bit range.

- Add a sanity check in newreno_cong_signal() to ensure private signals are not
  being leaked into the hook function.

Sponsored by:	FreeBSD Foundation
Discussed with:	David Hayes <dahayes at swin edu au>
MFC after:	1 week
X-MFC with:	r215166
2011-02-01 13:32:27 +00:00
mm
ae0c590545 Reintroduce bugfix from r210103 and fix xz on strong-aligned architectures.
This fix was accidentially reverted with the 5.0.0 update in r215187.

PR:		bin/154310
Submitted by:	Michael Moll <kvedulv@kvedulv.de>
MFC after:	3 days
2011-02-01 10:28:05 +00:00
hselasky
2f2f91bb7d Use correct kernel types for all fields in USB PF code and headers.
Approved by:	thompsa (mentor)
2011-02-01 10:25:48 +00:00
adrian
728432240e Add a new method to the rate control modules which extract out the
three other rates and try counts.

The 11n rate scenario path wants to take a list of rate and tries,
rather than calling setupxtxdesc().
2011-02-01 08:10:18 +00:00
adrian
d415e28b03 Include some preliminary TX HT rate scenario setup code.
The AR5416 and later TX descriptors have new fields for supporting
11n bits (eg 20/40mhz mode, short/long GI) and enabling/disabling
RTS/CTS protection per rate.

These functions will be responsible for initialising the TX descriptors
for the AR5416 and later chips for both HT and legacy frames.

Beacon frames will remain using the non-11n TX descriptor setup for now;
Linux ath9k does much the same.

Note that these functions aren't yet used anywhere; a few more framework
changes are needed before all of the right rate information is available
for TX.
2011-02-01 08:03:01 +00:00
pjd
7cbf58f4c0 Do not set socket send and receive buffer. It will be auto-tuned.
Confirmed by:	rwatson
MFC after:	1 week
2011-02-01 07:58:43 +00:00
adrian
ef074accce Refator the common code which calculates the 802.11g protection duration. 2011-02-01 07:50:26 +00:00
lstewart
108062a407 Fix typo in comment: "course" -> "coarse"
Sponsored by:	FreeBSD Foundation
Submitted by:	jmallett
MFC after:	3 months
X-MFC with:	r218152
2011-02-01 07:10:13 +00:00
lstewart
6d2fcf0be2 Import an implementation of the CAIA-Hamilton-Delay (CHD) congestion control
algorithm described in the paper "Improved coexistence and loss tolerance for
delay based TCP congestion control" by Hayes and Armitage. It is implemented as
a kernel module compatible with the recently committed modular congestion
control framework.

CHD enhances the approach taken by the Hamilton-Delay (HD) algorithm to provide
tolerance to non-congestion related packet loss and improvements to coexistence
with loss-based congestion control algorithms. A key idea in improving
coexistence with loss-based congestion control algorithms is the use of a shadow
window, which attempts to track how NewReno's congestion window (cwnd) would
evolve. At the next packet loss congestion event, CHD uses the shadow window to
correct cwnd in a way that reduces the amount of unfairness CHD experiences when
competing with loss-based algorithms.

In collaboration with:	David Hayes <dahayes at swin edu au> and
				Grenville Armitage <garmitage at swin edu au>
Sponsored by:	FreeBSD Foundation
Reviewed by:	bz and others along the way
MFC after:	3 months
2011-02-01 07:05:14 +00:00
adrian
ac497e209c * Add a rather hacky "does this speak the 11n TX descriptor format"
function; which will be later used by the TX path to determine
  whether to use the extended features or not.

* Break out the descriptor chaining logic into a separate function;
  again so it can be switched out later on for the 11n version when
  needed.

* Refactor out the encryption-swizzling code that's common in the
  raw and normal TX path.
2011-02-01 06:59:44 +00:00
lstewart
1138f32d73 Import a clean-room implementation of the Hamilton-Delay (HD) congestion control
algorithm based on the paper "A strategy for fair coexistence of loss and
delay-based congestion control algorithms" by Budzisz, Stanojevic, Shorten and
Baker. It is implemented as a kernel module compatible with the recently
committed modular congestion control framework.

HD uses a probabilistic approach to reacting to delay-based congestion. The
probability of reducing cwnd is zero when the queuing delay is very small,
increasing to a maximum at a set threshold, then back down to zero again when
the queuing delay is high. Normal operation keeps the queuing delay below the
set threshold. However, since loss-based congestion control algorithms push the
queuing delay high when probing for bandwidth, having the probability of
reducing cwnd drop back to zero for high delays allows HD to compete with
loss-based algorithms.

In collaboration with:	David Hayes <dahayes at swin edu au> and
				Grenville Armitage <garmitage at swin edu au>
Sponsored by:	FreeBSD Foundation
Reviewed by:	bz and others along the way
MFC after:	3 months
2011-02-01 06:42:46 +00:00
lstewart
fec0e548bc Import a clean-room implementation of the VEGAS congestion control algorithm
based on the paper "TCP Vegas: end to end congestion avoidance on a global
internet" by Brakmo and Peterson. It is implemented as a kernel module
compatible with the recently committed modular congestion control framework.

VEGAS uses network delay as a congestion indicator and unlike regular loss-based
algorithms, attempts to keep the network operating with stable queuing delays
and no congestion losses. By keeping network buffers used along the path within
a set range, queuing delays are kept low while maintaining high throughput.

In collaboration with:	David Hayes <dahayes at swin edu au> and
				Grenville Armitage <garmitage at swin edu au>
Sponsored by:	FreeBSD Foundation
Reviewed by:	bz and others along the way
MFC after:	3 months
2011-02-01 06:17:00 +00:00