175204 Commits

Author SHA1 Message Date
jhb
acc0442708 Some tweaks to the CPUID support:
- Don't always pass the cpuid request to the current CPU as some nodes
  we will emulate purely in software.
- Pass in the APIC ID of the virtual CPU so we can return the proper APIC
  ID.
- Always report a completely flat topology with no SMT or multicore.
- Report the CPUID2_HV feature and implement support for the 0x40000000
  CPUID level.
- Use existing constants from <machine/specialreg.h> when possible and
  use cpu_feature2 when checking for VMX support.
2011-06-02 14:04:07 +00:00
jhb
bcc4a489b1 Missed this in the previous commit to add 'show vmcs': add opt_ddb.h as
a source file.
2011-06-02 13:57:10 +00:00
jhb
63af589b2c Add a 'show vmcs' DDB command to dump state about the current CPU's
current VMCS.
2011-06-02 13:49:19 +00:00
grehan
31bc5dbbff IFC @ r222256 2011-05-24 15:39:34 +00:00
zec
b3769a4355 Provide fake link status information in an attempt to let ng_eiface(4)
virtual ifnets more realistically mimic physical ethernet interfaces.
The main motivation behind this change is to allow for ng_eiface(4)
interfaces to participate in STP if_bridge(4) configurations.

When announcing link status changes, switch to the vnet to which the
ifnet belongs, since it is possible for ng_eiface ifnets to be assigned
to a vnet different from the one in which its netgraph node resides.

MFC after:	3 days
2011-05-24 14:10:33 +00:00
jhb
7028e129fd Fix an issue with critical sections and SMP rendezvous handlers.
Specifically, a critical_exit() call that drops the nesting level to zero
has a brief window where the pending preemption flag is set and the
nesting level is set to zero.  This is done purposefully to avoid races
where a preemption scheduled by an interrupt could be lost otherwise (see
revision 144777).  However, this does mean that if an interrupt fires
during this window and enters and exits a critical section, it may preempt
from the interrupt context.  This is generally fine as the interrupt code
is careful to arrange critical sections so that they are not exited until
it is safe to preempt (e.g. interrupts EOI'd and masked if necessary).

However, the SMP rendezvous IPI handler does not quite follow this rule,
and in general a rendezvous can never be preempted.  Rendezvous handlers
are also not permitted to schedule threads to execute, so they will not
typically trigger preemptions.  SMP rendezvous handlers may use
spinlocks (carefully) such as the rm_cleanIPI() handler used in rmlocks,
but using a spinlock also enters and exits a critical section.  If the
interrupted top-half code is in the brief window of critical_exit() where
the nesting level is zero but a preemption is pending, then releasing the
spinlock can trigger a preemption.  Because we know that SMP rendezvous
handlers can never schedule a thread, we know that a critical_exit() in
an SMP rendezvous handler will only preempt in this edge case.  We also
know that the top-half thread will happily handle the deferred preemption
once the SMP rendezvous has completed, so the preemption will not be lost.

This makes it safe to employ a workaround where we use a nested critical
section in the SMP rendezvous code itself around rendezvous action
routines to prevent any preemptions during an SMP rendezvous.  The
workaround intentionally avoids checking for a deferred preemption
when leaving the critical section on the assumption that if there is a
pending preemption it will be handled by the interrupted top-half code.

Submitted by:	mlaier (variation specific to rm_cleanIPI())
Obtained from:	Isilon
MFC after:	1 week
2011-05-24 13:36:41 +00:00
jhb
4d0fe668f7 Update comments for DEVICE_PROBE() to reflect that BUS_PROBE_DEFAULT is
now the preferred typical return value from a probe routine.  Discourage
the use of 0 (BUS_PROBE_SPECIFIC) as it should be used very rarely.
Point the reader to the DEVICE_PROBE(9) manpage for more detailed notes
on possible probe return values.

Submitted by:	Philip Soeberg  philip-dev of soeberg net
2011-05-24 13:22:40 +00:00
jhb
d73862793b Simplify a stale assertion. We have not called mi_switch() from a nested
critical section during a preemption for several years.

MFC after:	1 week
2011-05-24 13:17:08 +00:00
rwatson
08101b4867 An inpcb lock is no longer required in in_pcbref() since the move to
refcount(9).

MFC after:      3 weeks
Sponsored by:   Juniper Networks, Inc.
2011-05-24 13:08:59 +00:00
rwatson
528150c490 Teach netstat(1) about the new global netisr policy sysctl,
net.isr.dispatch, and about per-protocol dispatch policies.

MFC after:	3 weeks
Reviewed by:	bz
Sponsored by:	Juniper Networks, Inc.
2011-05-24 12:38:00 +00:00
rwatson
d81ad1a304 Rework netisr policy mechanism so that per-protocol dispatch policies can
be represented:

- A single policy namespace is defined, consisting of four possible
  policies: "default" to use the global default, "deferred" to force
  deferred dispatch, "direct" to employ direct dispatch where possible, and
  "hybrid" which makes a dynamic decision based on CPU affinity, ordering,
  etc.  Routines are implemented to convert between strings and an integer
  namespace.

- A new global variable, netisr_dispatch_policy, subsumes existing global
  variables for direct dispatch, forced direct dispatch, etc, and is used
  for explicit policy interpretation and composition.  Old variables remain
  so that they can be exported by legacy sysctls for use by old netstat(1)
  binaries.  A new sysctl and tunable, netisr.dispatch.policy, accepts the
  above strings for specifying a global policy default.

- The protocol registration structure, netisr_handler, grows an nh_dispatch
  field, which accepts a per-policy policy override.  The default value is
  '0', which corresponds to "default", meaning that protocols will accept
  the global default policy unless otherwise specified.

- Policies are now interpreted and composed explicitly at various points in
  packet dispatch; protocol policies override global policies.

- Protocols grow the ability to express a non-opinion about affinity even
  when implenting m2cpuid by returning NETISR_CPUID_NONE.  In that case, the
  framework falls back on source ordering, rather than simply using the
  current CPU.

These changes are in support of allowing link layer re-dispatch based on
RSS or similar hashes provided by NICs, especially in the case where the
number of hardware receive queues matches hardware core count, rather than
hardware thread count, requiring further software redistributeon.  (i.e.,
on RMI XLR).

MFC after:      3 weeks
Reviewed by:    bz
Sponsored by:   Juniper Networks, Inc.
2011-05-24 12:34:19 +00:00
brucec
c9f24d69ee Remove an outdated comment as requested by Bruce Evans in a private email to
Alexander Best (arundel@).

For clang, -fdiagnostics-show-option is enabled by default, but for gcc it
isn't. This option will report which -W* flag was responsible for triggering
a certain warning. This will bring gcc warnings closer to the ones clang emits
and might also help developers track down tinderbox failures a bit quicker.

Submitted by:	arundel
2011-05-24 09:01:56 +00:00
zec
0e7fcf0b2c Allow for vlan(4) interfaces with MTU of 1500 bytes to be configured
on top of epair(4) virtual interfaces, since there's no physical
hardware associated with epair interfaces which would imply any
constraints on MTU sizes.

MFC after:	3 days
2011-05-24 08:02:55 +00:00
zec
3ab76c6800 Let epair(4) virtual interfaces report fake link / media status,
by borrowing the skeleton of if_media manipulation and reporting
code from if_lagg(4).  The main motivation behind this change is
to allow for epair(4) interfaces to participate in STP if_bridge(4)
configurations.

Reviewed by:	bz
MFC after:	3 days
2011-05-24 07:57:28 +00:00
ru
9e9f9bc898 Ensure there is a whitespace after a mount point.
PR:		157286
Submitted by:	Marcus Reid
MFC after:	3 days
2011-05-24 06:56:40 +00:00
ae
65130c345c Remove unused variable.
MFC after:	1 week
2011-05-24 06:46:07 +00:00
ae
791fcf05bc Remove unused variable.
MFC after:	1 week
2011-05-24 06:44:16 +00:00
adrian
593db502aa Use the new per-series antenna and TPC definitions when setting ctl8->11.
This should hopefully make it clearer to developers what is going on
and when TPC is being hacked on, make it obvious why it isn't working for
series 1, 2, 3.

I won't flip on setting TX power for TX series 1, 2, 3 until I've done
some further testing with Kite to ensure it doesn't break anything.
(Before people ask - yes, TPC is only needed for 5ghz regdomains and
yes, Kite is a 2.4ghz only chip, but there are potential use cases
for 2ghz TPC. I just need to sit down and ensure it's supported and
functional.)
2011-05-24 05:49:02 +00:00
adrian
4184989e51 Add in descriptions for TX descriptor fields ctl8-11 - these fields
control the antenna control bits for the four TX series and the
TPC settings for TX series 1, 2, 3.

The specifics:

* The TPC setting for TX series 0 is handled in ctl0.

* TPC is currently disabled, so the per-packet TX power is
  set via the global per-rate TX power register, not per packet.

* The antenna control bits don't matter for AR5416 and later
  so they should stay 0 (which they currently do); they may
  be set for Kite but as there's no TX diversity supported
  at the moment (it requires the NIC to be built with an
  external antenna switch, matching how antenna diversity
  is done on legacy NICs), so again keep them 0.

This is in preparation for supporting per-rate TPC on the
AR5416 and later. The Kite (and soon to come Kiwi) code
sets ctl8-11 to 0x0, which doesn't have any effect at
the moment. When TPC is enabled it would result in the
second, third and fourth TX series attmpts to be done with
a TX power of 0. This commit doesn't change that; it'll
be followed up with some commits to properly set the TPC
registers appropriately.
2011-05-24 05:34:45 +00:00
nwhitehorn
2382136ff6 Add RTC support for the LV1 clock on the PS3. The hypervisor won't let us
set it, but it's better than nothing.
2011-05-24 02:19:45 +00:00
grehan
949e126edf Catch up with CURRENTs different timer usage compared to 8.1. A counter
value of 0 in rategen mode is equivalent to a max initial value.
The TSC is now correctly calibrated on a 9.0 guest.

Obtained from:	NetApp
2011-05-24 01:08:53 +00:00
attilio
679e6860f1 Merge r221846 from largeSMP project branch:
Fix arguments passing to _long() version of atomic function for mips.

The native implementation is bogus in that regard and offers the same
problem solved for powerpc as r222198, but mips' guys just wanted a
small and self-contained patch for mips rather than rewriting the
whole support.

Reviewed by:	art, imp
Tested by:	gonzo
MFC after:	2 weeks
2011-05-23 23:35:50 +00:00
rmacklem
e63c03c6e6 Set the MNT_NFS4ACLS flag for an NFSv4 client mount
if the NFSv4 server supports it. Requested by trasz.

MFC after:	2 weeks
2011-05-23 22:31:42 +00:00
yongari
364291c0f2 Add 88E8075 Yukon Supreme to the list of supported hardware list. 2011-05-23 22:02:15 +00:00
yongari
b19607cc53 When MTU is changed, check whether driver should be reinitialized or
not.  If reinitialized is required, clear driver running flag.
2011-05-23 21:56:04 +00:00
yongari
11e70b7227 Add initial support for Marvell 88E8055/88E8075 Yukon Supreme. 2011-05-23 21:51:47 +00:00
imp
ae32fa8440 Test against "no" rather than "yes" for MK_KERNEL_SYMBOLS
Also, change DEBUG back to DEBUG_FLAGS in kmod.mk.  The latter accidentally
snuck in with my backwards compat fix.

Submitted by:	ru,gcooper
2011-05-23 21:32:45 +00:00
pjd
42a14e17b5 Keep statistics on number of BIO_READ, BIO_WRITE, BIO_DELETE and BIO_FLUSH
requests as well as number of activemap updates.

Number of BIO_WRITEs and activemap updates are especially interesting, because
if those two are too close to each other, it means that your workload needs
bigger number of dirty extents. Activemap should be updated as rarely as
possible.

MFC after:	1 week
2011-05-23 21:15:19 +00:00
yongari
bef2f8f5d1 Do not touch ASF related register for controllers that do not have
these registers. Also disable Watchdog of ASF microcontroller.
2011-05-23 21:11:46 +00:00
yongari
bb9ab13b76 Make sure to enable all clocks before accessing registers.
Releasing PHY from power down/COMA is done after enabling all
clocks. While I'm here remove unnecessary controller reset.
2011-05-23 21:00:56 +00:00
pjd
0b27b5e691 Recognize BIO_FLUSH requests and pass them to userland.
MFC after:	1 week
2011-05-23 21:00:37 +00:00
pjd
93ce9f3fcb To handle BIO_FLUSH and BIO_DELETE requests in secondary worker we need
to use ioctl(2). This is why we can't use capsicum for now to sandbox
secondary. Capsicum is still used to sandbox hastctl.

MFC after:	1 week
2011-05-23 20:59:50 +00:00
yongari
32741e735e Do not configure RAM registers for controllers that do not have
them.  These registers are defined only for Yukon XL, Yukon EC and
Yukon FE.
2011-05-23 20:18:09 +00:00
jkim
db83c1dd70 Decrease ACPI-fast timecounter quality to 900 and increase HPET timecounter
quality to 950.  HPET on modern platforms usually have better resolution and
lower latency than ACPI timer.  Effectively this changes default timecounter
hardware from ACPI-fast to HPET by default when both are available.

Discussed with:	avg
2011-05-23 20:12:36 +00:00
yongari
4df8eb7953 Rework store and forward configuration of TX MAC FIFO. Basically it
enables store and forward mode except for jumbo frame on Yukon
Ultra.
2011-05-23 20:09:32 +00:00
ru
5a5a985b61 BKVASIZE was bumped to 16k more than a decade ago. 2011-05-23 19:59:01 +00:00
yongari
5c47ee0af2 Do not blindly clear entire GPHY control register. It seems some
bits of the register is used for other purposes such that clearing
these bits resulted in unexpected results such as corrupted RX
frames or missing LE status updates.  For old controllers like
Yukon EC it had no effect but it caused all kind of troubles on
Yukon Supreme.
This change shall improve stability of controllers like Yukon
Ultra, Ultra2, Extreme, Optima and Supreme.
2011-05-23 19:58:08 +00:00
ru
d031d28b6b expr -> sh arithmetic expansion 2011-05-23 19:57:12 +00:00
rwatson
95a805600b Continue to refine inpcb reference counting and locking, in preparation for
reworking of inpcbinfo locking:

(1) Convert inpcb reference counting from manually manipulated integers to
    the refcount(9) KPI.  This allows the refcount to be managed atomically
    with an inpcb read lock rather than write lock, or even with no inpcb
    lock at all.  As a result, in_pcbref() also no longer requires an inpcb
    lock, so can be performed solely using the lock used to look up an
    inpcb.

(2) Shift more inpcb freeing activity from the in_pcbrele() context (via
    in_pcbfree_internal) to the explicit in_pcbfree() context.  This means
    that the inpcb refcount is increasingly used only to maintain memory
    stability, not actually defer the clean up of inpcb protocol parts.
    This is desirable as many of those protocol parts required the pcbinfo
    lock, which we'd like not to acquire in in_pcbrele() contexts.  Document
    this in comments better.

(3) Introduce new read-locked and write-locked in_pcbrele() variations,
    in_pcbrele_rlocked() and in_pcbrele_wlocked(), which allow the inpcb to
    be properly unlocked as needed.  in_pcbrele() is a wrapper around the
    latter, and should probably go away at some point.  This makes it
    easier to use this weak reference model when holding only a read lock,
    as will happen in the future.

This may well be safe to MFC, but some more KBI analysis is required.

Reviewed by:    bz
MFC after:      3 weeks
Sponsored by:   Juniper Networks, Inc.
2011-05-23 19:32:02 +00:00
jh
fbe30c6e5c In init_dynamic_kenv(), ignore environment strings exceeding the
KENV_MNAMELEN + 1 + KENV_MVALLEN + 1 length limit to avoid buffer
overflow in getenv(). Currenly loader(8) doesn't limit the length of
environment strings.

PR:		kern/132104
MFC after:	1 month
2011-05-23 16:40:44 +00:00
rwatson
79b3da72c2 Move from passing a wildcard boolean to a general set up lookup flags into
in_pcb_lport(), in_pcblookup_local(), and in_pcblookup_hash(), and similarly
for IPv6 functions.  In the future, we would like to support other flags
relating to locking strategy.

This change doesn't appear to modify the KBI in practice, as callers already
passed in INPLOOKUP_WILDCARD rather than a simple boolean.

MFC after:      3 weeks
Reviewed by:    bz
Sponsored by:   Juniper Networks, Inc.
2011-05-23 15:23:18 +00:00
rwatson
56d91395ad A number of quite incremental refinements to struct inpcbinfo's definition:
(1) Add a locking guide for inpcbinfo.
(2) Annotate inpcbinfo fields with synchronisation information; not all
    annotations are 100% satisfactory.
(3) Reorder inpcbinfo fields so that the lock is at the head of the
    structure, and close to fields it protects.
(4) Sort fields that will eventually be hashlock/pcbgroup-related together
    even though they remain locked by ipi_lock for now.

Reviewed by:	bz
Sponsored by:	Juniper Networks
X-MFC after:	KBI analysis required
2011-05-23 13:51:57 +00:00
delphij
c7822ff2b1 Match symbolic link handling behavior with GNU gzip, bzip2 and xz:
When we are operating on a symbolic link pointing to an existing
file, bail out by default, but go ahead if -f is specified.

Submitted by:	arundel
MFC after:	2 weeks
2011-05-23 09:40:21 +00:00
delphij
f71ef98f21 Diff reduction against NetBSD. The most notable change is to zdiff(1) to
handle more file formats including bzip2 and xz.

MFC after:	2 weeks
2011-05-23 09:02:44 +00:00
benl
383b306f0a Fix clang warnings.
Approved by:	philip (mentor)
2011-05-22 22:17:06 +00:00
benl
9dc57d1405 Fix clang warnings.
Approved by:	philip (mentor)
2011-05-22 22:16:19 +00:00
benl
3bcf417808 Fix clang warnings.
Approved by:	philip (mentor)
2011-05-22 22:15:42 +00:00
benl
fe2c179872 Fix clang warnings.
Approved by:	philip (mentor)
2011-05-22 22:15:16 +00:00
benl
0748b6b611 Fix clang compile warnings.
Approved by:	philip (mentor)
2011-05-22 22:14:15 +00:00
attilio
a8b367d89d Merge r221912 from largeSMP project branch:
Fix a long-standing bug in cpuset_thread0() where only the first part
of cs_mask is set full.

Submitted by:	anonymous
MFC after:	1 week
2011-05-22 21:35:03 +00:00