82893 Commits

Author SHA1 Message Date
Alexander Motin
0eac2d6be3 Intel NM10 chipset's SATA controller has same PCI ID and revision as ICH7's,
but has only 2 SATA ports instead of 4. The worst part is that SStatus and
SError registers for missing ports are not implemented and return wrong
values (0xffffffff), that caused infinite reset loop.

Just ignore that SError value while I found no better way to identify them.
2011-06-09 16:30:13 +00:00
Jung-uk Kim
bc8e4ad2ef Tidy up r222866.
- Re-add accidentally removed atomic op. for sysctl(9) handler.
- Remove a period(`.') at the end of a debugging message.
- Consistently spell "low" for "TSC-low" timecounter throughout.

Pointed out by:	bde
2011-06-08 23:44:59 +00:00
David Christensen
76dbe6498b - Major reorganization of mbuf handling throughout the driver to
increase robustness (no more calls to panic(9)) and simplify
  code.
- Allocate RX/TX data structures as a single buffer rather than
  an array of 4KB pages to simplify code.
- Fixed LRO (aka TPA) code.  Removed kernel module parameter and
  support enabling disabling LRO through ifconfig(8) command line.
  LRO is still disabled by default but should be enabled for best
  performance on an endpoint device.
- Fixed statistcs code and removed kernel module parameter (stats
  should just work).
- Added many software counters to help identify the cause of some
  performance issues.
- Streamlined adapter internal init/stop code paths.
- Fiddled with debug code (adding some here, removing some there).
- Continued style(9) adjustments.
2011-06-08 21:18:14 +00:00
Jung-uk Kim
26e6537a73 Increase quality of TSC (or TSC-low) timecounter to 1000 if it is P-state
invariant.  For SMP case (TSC-low), it also has to pass SMP synchronization
test and the CPU vendor/model has to be white-listed explicitly.  Currently,
all Intel CPUs and single-socket AMD Family 15h processors are listed here.

Discussed with:	hackers
2011-06-08 20:08:06 +00:00
Jung-uk Kim
95f2f0985b Introduce low-resolution TSC timecounter "TSC-low". It replaces the normal
TSC timecounter if TSC frequency is higher than ~4.29 MHz (or 2^32-1 Hz) or
multiple CPUs are present.  The "TSC-low" frequency is always lower than a
preset maximum value and derived from TSC frequency (by being halved until
it becomes lower than the maximum).  Note the maximum value for SMP case is
significantly lower than UP case because we want to reduce (rare but known)
"temporal anomalies" caused by non-serialized RDTSC instruction.  Normally,
it is still higher than "ACPI-fast" timecounter frequency (which was default
timecounter hardware for long time until r222222) to be useful.
2011-06-08 19:38:31 +00:00
Attilio Rao
e3adb68519 In the current code, a double panic condition may lead to dumps
interleaving.
Signal dumping to happen only for the first panic which should be the
most important.

Sponsored by:	Sandvine Incorporated
Submitted by:	Nima Misaghian (nmisaghian AT sandvine DOT com)
MFC after:	2 weeks
2011-06-08 19:28:59 +00:00
Jung-uk Kim
75aa1914d5 Remove a redundant assignment since r221703. 2011-06-08 18:52:42 +00:00
Andreas Tobler
d91c258074 - Improve error handling.
- Add retry loops in the i2c read/write functions.
- Combied the ADC channel selection and readout of the value into
  one iicbus_transfer to avoid possible races.

Reviewed by: nwhitehorn
2011-06-08 16:00:30 +00:00
Bjoern A. Zeeb
869052041d Add the missing call to ip6_ipsec_filtertunnel() to be able to control
whether decapsulated IPsec packets will be passed to pfil again depending
on the setting of the net.ip6.ipsec6.filtertunnel sysctl.

PR:		kern/157670
Submitted by:	Manuel Kasper (mk neon1.net)
MFC after:	2 weeks
2011-06-08 10:59:36 +00:00
Andriy Gapon
234dab4a82 remove code for dynamic offlining/onlining of CPUs on x86
The code has definitely been broken for SCHED_ULE, which is a default
scheduler.  It may have been broken for SCHED_4BSD in more subtle ways,
e.g. with manually configured CPU affinities and for interrupt devilery
purposes.
We still provide a way to disable individual CPUs or all hyperthreading
"twin" CPUs before SMP startup.  See the UPDATING entry for details.

Interaction between building CPU topology and disabling CPUs still
remains fuzzy: topology is first built using all availble CPUs and then
the disabled CPUs should be "subtracted" from it.  That doesn't work
well if the resulting topology becomes non-uniform.

This work is done in cooperation with Attilio Rao who in addition to
reviewing also provided parts of code.

PR:		kern/145385
Discussed with:	gcooper, ambrisko, mdf, sbruno
Reviewed by:	attilio
Tested by:	pho, pluknet
X-MFC after:	never
2011-06-08 08:12:15 +00:00
Bjoern A. Zeeb
ffe8cd7b10 Correct comments and debug logging in ipsec to better match reality.
MFC after:	3 days
2011-06-08 03:02:11 +00:00
Marius Strobl
ab267f9dbf - For the case when tl1_align(_trap) is used to call rsf_fatal via
RSF_FATAL we need to switch to alternate globals for KSTACK_CHECK just
  like tl1_data_excptn(_trap) does. This is more or less cosmetic because
  in case RSF_FATAL is called we're already heading south.
- Correct an END().
- Read the window state from the correct register for a CATR().
2011-06-07 23:15:21 +00:00
Martin Matuska
baa256da8c Silence notice on pool creation, import and access.
Suggested by:	Jeremy Chadwick (freebsd-stable@)
Discussed with:	pjd
MFC after:	1 week
2011-06-07 20:46:31 +00:00
Marko Zec
2fe7ca2ca6 Set curvnet context in a callout-trigerred code path.
MFC after:	3 days
2011-06-07 20:46:03 +00:00
John Baldwin
c721b93449 Log the socket address passed as the destination to sendto() and sendmsg()
via ktrace.

MFC after:	1 week
2011-06-07 17:40:33 +00:00
Marius Strobl
c40847145b Adapt CATR() to r222813. This is somewhat tricky as we can't afford using
more than three temporary register in several places CATR() is used so
this code trades instructions in for registers. Actually, this still isn't
sufficient and CATR() has the side-effect of clobbering %y. Luckily, with
the current uses of CATR() this either doesn't matter or we are able to
(save and) restore it.
Now that there's only one use of AND() and TEST() left inline these.
2011-06-07 17:33:39 +00:00
Marius Strobl
3bd5692b1f Fix a problem with r222813; given that we may only operate on interrupt
globals here but clobber %y save and restore the latter.
2011-06-07 17:19:14 +00:00
Alexander Motin
cbebc90de0 Make automatic hw.snd.default_unit choice a bit more intelligent. Instead
of just setting it to the first registered device, reevaluate it for each
device registered, trying to choose best candidate, unless one was forced.
For now use such preference order: play&rec, play, rec.

As side effect, this should workaround the situation when HDMI audio output
of the video card, usually not connected to anything, becomes default, that
requires manual user intervention to make sound working. If at some point
this won't be enough, we can try to fetch some additional priority flags
from the device driver.
2011-06-07 17:01:52 +00:00
Adrian Chadd
e484e92770 Since HAL_PHYERR_* is used in the radar code, always include ah_desc.h. 2011-06-07 14:00:47 +00:00
Adrian Chadd
3d423111f6 Flesh out a new HAL method to fetch the radar PHY error frame information.
For the AR5211/AR5212, this is apparently a one byte pulse duration
counter value. It is only coded up here for the AR5212 as I don't have
any AR5211-series hardware to test it on.

This information was extracted from the Madwifi DFS branch along with
some local additions.

Please note - all this does is extract out the radar event duration,
it in no way reflects the presence of a radar. Further code is needed
to take a set of radar events and filter them to extract out correct
radar pulse trains (and ignore other events.)

For further information, please see:

http://wiki.freebsd.org/dev/ath_hal%284%29/RadarDetection

This includes references to the relevant patents which describe what
is going on.

Obtained from:	Madwifi
2011-06-07 09:03:28 +00:00
Attilio Rao
5e9857e76b MFC 2011-06-07 08:24:29 +00:00
Attilio Rao
74e4245e3f Bring back the number of CPU to 32. 2011-06-07 08:05:23 +00:00
Andrey V. Elsukov
56e38090a4 Fix indentation. 2011-06-07 06:57:22 +00:00
Andrey V. Elsukov
c57e67d04e Sync ng_nat with recent (r222806) ipfw_nat changes:
Make a behaviour of the libalias based in-kernel NAT a bit closer to
  how natd(8) does work. natd(8) drops packets only when libalias returns
  PKT_ALIAS_IGNORED and "deny_incoming" option is set, but ipfw_nat
  always did drop packets that were not aliased, even if they should
  not be aliased and just are going through.

Also add SCTP support: mark response packets to skip firewall processing.

MFC after:	1 month
2011-06-07 06:48:42 +00:00
Andrey V. Elsukov
bd853db48c Make a behaviour of the libalias based in-kernel NAT a bit closer to
how natd(8) does work. natd(8) drops packets only when libalias returns
PKT_ALIAS_IGNORED and "deny_incoming" option is set, but ipfw_nat
always did drop packets that were not aliased, even if they should
not be aliased and just are going through.

PR:		kern/122109, kern/129093, kern/157379
Submitted by:	Alexander V. Chernikov (previous version)
MFC after:	1 month
2011-06-07 06:42:29 +00:00
Andriy Gapon
d1817e7db7 amdsbwd: update to support SB8xx southbridges
Many thanks to Tino <tinotom@gmail.com> for drawing my attention to
this, for doing a lot of testing and providing great feedback.
Many thanks to AMD for continuing to release public specifications for
their chipsets.

PR:		kern/157568
Tested by:	Tino <tinotom@gmail.com>
MFC after:	1 week
2011-06-07 06:18:02 +00:00
Kenneth D. Merry
5e319c480c Set pca.p_bufr to NULL when we haven't allocated a buffer.
Otherwise, p_bufr is set to garbage on the stack, and if that garbage
happens to be non-NULL, and the TOLOG or TOCONS flag is set, putbuf()
will get called and attempt to fill the non-existent buffer.

This is really only relevant for tprintf() (and only when the priority is
not -1), but set it in uprintf() and ttyprintf() for completeness.

The next step, to avoid log buffer scrambling, would be to add the
PRINTF_BUFR_SIZE code to tprintf(), but this should prevent panics.

Submitted by:	rmacklem
Found by:	pho
2011-06-07 05:04:37 +00:00
David Xu
a231144921 Use p4prio_to_tsprio to calculate TS priority instead of using
p4prio_to_rtpprio which is for RT priority.

PR:	kern/157657
Submitted by:	krivenok.dmitry at gmail dot com
MFC after:	3 days
2011-06-07 02:50:14 +00:00
Marcel Moolenaar
299cceef03 Fix making kernel dumps from the debugger by creating a command
for it. Do not not expect a developer to call doadump(). Calling
doadump does not necessarily work when it's declared static. Nor
does it necessarily do what was intended in the context of text
dumps. The dump command always creates a core dump.

Move printing of error messages from doadump to the dump command,
now that we don't have to worry about being called from DDB.
2011-06-07 01:28:12 +00:00
Marcel Moolenaar
28fb80aa8c Call set_cputicker() to have the time counter use the ITC register.
Note that the ITC frequency is fixed.
2011-06-07 01:06:49 +00:00
Marcel Moolenaar
9f11397eb5 o Bump the EFI loader version to 3.1.
o   Add the about, pbvm and reboot commands.
o   Trim the banner (suppress maker and date).
2011-06-07 00:59:31 +00:00
Marcel Moolenaar
6d48fab9c5 Add ia64_sync_icache() and use it to make the I-cache coherent
after loading the kernel's text segment. The kernel will do the
same for loaded modules, so don't worry about that.
2011-06-07 00:39:15 +00:00
Jung-uk Kim
393ec7ad27 Validate INT 15h and 16h vectors more strictly. Traditionally these entry
points are fixed addresses and (U)EFI CSM specification also mandated that.
Unfortunately, (U)EFI CSM specification does not specifically mention this
is to call service routine via interrupt vector table or to jump directly
to the entry point.  As a result, some CSM seems to install two routines
and acts differently, depending on how it was executed, unfortunately.
When INT 15h is used, it calls a function pointer (which is probably a UEFI
service function).  When it jumps directly to the entry point, it executes
a simple and traditional INT 15h service routine.  Therefore, actually there
are two possible fixes, i. e., this fix or jumping directly to the fixed
entry point.  However, we chose this fix because a) keyboard typematic
support via BIOS is becoming extremely rarer and b) we cannot support random
service routine installed by a firmware or a boot loader.  This should fix
Lenovo X220 laptop, specifically.

Reviewed by:	delphij
MFC after:	3 days
2011-06-06 23:03:37 +00:00
Jung-uk Kim
7d09e4ab23 Revert r222152. The root cause was analysed and better fix is upcoming.
Discussed with:	delphij
2011-06-06 22:18:40 +00:00
Attilio Rao
9c68ff4742 MFC 2011-06-06 22:06:42 +00:00
Hans Petter Selasky
2906af23b8 Reset clear-stall error counter before setting up the USB control transfers.
MFC after:	14 days
2011-06-06 22:03:09 +00:00
Bjoern A. Zeeb
1417604e70 Unbreak kernels with non-default PCBGROUP included but no WITNESS.
Rather than including lock.h in in_pcbgroup.c in right order, fix it
for all consumers of in_pcb.h by further header file pollution under
#ifdef KERNEL.

Reported by:	Pan Tsu (inyaoo gmail.com)
2011-06-06 21:45:32 +00:00
Hans Petter Selasky
9eb0d7025d Improve enumeration of Low- and Full-speed devices connected through a
High-speed USB HUB by resetting the transaction translator (TT)
before trying re-enumeration. Also when clear-stall fails multiple times
try a re-enumeration.

Suggested by:	Trevor Blackwell
MFC after:	14 days
2011-06-06 21:45:09 +00:00
Attilio Rao
81c02539f1 MFC 2011-06-06 21:38:39 +00:00
Marcel Moolenaar
e726a6b70c Improve cpu_idle():
o   cpu_idle_hook is expected to be called with interrupts
    disabled and re-enables interrupts on return.
o   sync with x86: don't idle when the CPU has runnable tasks
o   have callers of ia64_call_pal_static() disable interrupts
    and re-enable interrupts.
o   add, but compile-out, support for idle mode. This will be
    enabled at some later time, after proper testing.
2011-06-06 19:06:15 +00:00
Warner Losh
54e397e566 Make a couple of debug printfs DEVPRINTF. 2011-06-06 16:27:38 +00:00
John Baldwin
a59f78daa9 Some style fixes.
Submitted by:	bde
2011-06-06 15:33:15 +00:00
Martin Matuska
298a6c3de6 Remove empty #ifndef
MFC after:	3 days
2011-06-06 14:46:43 +00:00
Andriy Gapon
ecee337a8c don't use cpuid level 4 in x86 cpu topology detection if it's not supported
This regression was introduced in r213323.
There are probably no Intel cpus that support amd64 mode, but do not
support cpuid level 4, but it's better to keep i386 and amd64 versions
of this code in sync.

Discovered by:	pho
Tested by:	pho
MFC after:	2 weeks
2011-06-06 14:23:13 +00:00
John Baldwin
0d439b5f93 More properly handle Cardbus cards that that store their CIS in a BAR after
the recent changes to track BAR state explicitly.  The code would now
attempt to add the same BAR twice in this case.  Instead, change this so
that it recognizes this case and only adds it once and do not delete the
BAR outright after parsing the CIS.

Tested by:	bschmidt
2011-06-06 13:21:11 +00:00
John Baldwin
69b63a9dc7 Clear the device_t pointer in 'struct resource' when releasing a device
as otherwise the sysctl to export rman info can dereference a stale
pointer.

PR:		kern/115371
Submitted by:	Arthur Hartwig
MFC after:	1 week
2011-06-06 13:12:56 +00:00
Robert Watson
52cd27cb58 Implement a CPU-affine TCP and UDP connection lookup data structure,
struct inpcbgroup.  pcbgroups, or "connection groups", supplement the
existing inpcbinfo connection hash table, which when pcbgroups are
enabled, might now be thought of more usefully as a per-protocol
4-tuple reservation table.

Connections are assigned to connection groups base on a hash of their
4-tuple; wildcard sockets require special handling, and are members
of all connection groups.  During a connection lookup, a
per-connection group lock is employed rather than the global pcbinfo
lock.  By aligning connection groups with input path processing,
connection groups take on an effective CPU affinity, especially when
aligned with RSS work placement (see a forthcoming commit for
details).  This eliminates cache line migration associated with
global, protocol-layer data structures in steady state TCP and UDP
processing (with the exception of protocol-layer statistics; further
commit to follow).

Elements of this approach were inspired by Willman, Rixner, and Cox's
2006 USENIX paper, "An Evaluation of Network Stack Parallelization
Strategies in Modern Operating Systems".  However, there are also
significant differences: we maintain the inpcb lock, rather than using
the connection group lock for per-connection state.

Likewise, the focus of this implementation is alignment with NIC
packet distribution strategies such as RSS, rather than pure software
strategies.  Despite that focus, software distribution is supported
through the parallel netisr implementation, and works well in
configurations where the number of hardware threads is greater than
the number of NIC input queues, such as in the RMI XLR threaded MIPS
architecture.

Another important difference is the continued maintenance of existing
hash tables as "reservation tables" -- these are useful both to
distinguish the resource allocation aspect of protocol name management
and the more common-case lookup aspect.  In configurations where
connection tables are aligned with hardware hashes, it is desirable to
use the traditional lookup tables for loopback or encapsulated traffic
rather than take the expense of hardware hashes that are hard to
implement efficiently in software (such as RSS Toeplitz).

Connection group support is enabled by compiling "options PCBGROUP"
into your kernel configuration; for the time being, this is an
experimental feature, and hence is not enabled by default.

Subject to the limited MFCability of change dependencies in inpcb,
and its change to the inpcbinfo init function signature, this change
in principle could be merged to FreeBSD 8.x.

Reviewed by:    bz
Sponsored by:   Juniper Networks, Inc.
2011-06-06 12:55:02 +00:00
Andrey V. Elsukov
1e587bfa32 Do not return EINVAL when user does ipfw set N flush on an empty set.
MFC after:	2 weeks
2011-06-06 10:39:38 +00:00
Hiroki Sato
23be782526 Do not activate automatic LL addr configuration when 0/1->1 transition of
ND6_IFF_IFDISABLED flag.
2011-06-06 04:12:57 +00:00
Hiroki Sato
db82af41db - Implement RDNSS and DNSSL options (RFC 6106, IPv6 Router Advertisement
Options for DNS Configuration) into rtadvd(8) and rtsold(8).  DNS
  information received by rtsold(8) will go to resolv.conf(5) by
  resolvconf(8) script.  This is based on work by J.R. Oldroyd (kern/156259)
  but revised extensively[1].

- rtadvd(8) now supports "noifprefix" to disable gathering on-link prefixes
  from interfaces when no "addr" is specified[2].  An entry in rtadvd.conf
  with "noifprefix" + no "addr" generates an RA message with no prefix
  information option.

- rtadvd(8) now supports RTM_IFANNOUNCE message to fix crashes when an
  interface is added or removed.

- Correct bogus ND_OPT_ROUTE_INFO value to one in RFC 4191.

Reviewed by:	bz[1]
PR:		kern/156259 [1]
PR:		bin/152458 [2]
2011-06-06 03:06:43 +00:00