Commit Graph

67877 Commits

Author SHA1 Message Date
jeff
7ff6e9903f Fix a race which could result in some timeout buckets being skipped.
- When a tick occurs on a cpu, iterate from cs_softticks until ticks.
   The per-cpu tick processing happens asynchronously with the actual
   adjustment of the 'ticks' variable.  Sometimes the results may
   be visible before the local call and sometimes after.  Previously this
   could cause a one tick window where we didn't evaluate the bucket.
 - In softclock fetch curticks before incrementing cc_softticks so we
   don't skip insertions which were made for the current time.

Sponsored by:	Nokia
2008-07-19 05:18:29 +00:00
jeff
b2f69d1b1e - Check whether we've recorded this tick in ts_ticks on another cpu in
sched_tick() to prevent multiple increments for one tick.  This pushes
   the value out of range and breaks priority calculation.

Reviewed by:	kib
Found by:	pho/nokia
Sponsored by:	Nokia
MFC after:	3 days
2008-07-19 05:13:47 +00:00
alc
493038bc03 Correct an error in pmap_change_attr()'s initial loop that verifies that the
given range of addresses are mapped.  Previously, the loop was testing the
same address every time.

Submitted by:	Magesh Dhasayyan
2008-07-18 22:05:51 +00:00
alc
07ead18715 Simplify pmap_extract()'s control flow, making it more like the related
functions pmap_extract_and_hold() and pmap_kextract().
2008-07-18 20:07:50 +00:00
alc
dba03ac26a Eliminate stale comments from kmem_malloc(). 2008-07-18 17:41:31 +00:00
dwmalone
f7cc3b4928 Add an accept filter for TCP based DNS requests. It waits until the
whole first request is present before returning from accept.
2008-07-18 14:44:51 +00:00
rwatson
4efb6d2a37 Eliminate use of the global ripsrc which was being used to pass address
information from rip_input() to rip_append().  Instead, pass the source
address for an IP datagram to rip_append() using a stack-allocated
sockaddr_in, similar to udp_input() and udp_append().

Prior to the move to rwlocks for inpcbinfo, this was not a problem, as
use of the global was synchronized using the ripcbinfo mutex, but with
read-locking there is the potential for a race during concurrent
receive.

This problem is not present in the IPv6 raw IP socket code, which
already used a stack variable for the address.

Spotted by:	mav
MFC after:	1 week (before inpcbinfo rwlock changes)
2008-07-18 10:47:07 +00:00
kmacy
6dfc39c2b6 revert local change 2008-07-18 07:10:33 +00:00
kmacy
eacfaa0e61 revert change from local tree 2008-07-18 07:07:57 +00:00
kmacy
96f4c28cd5 new vendor PHY support 2008-07-18 07:01:51 +00:00
kmacy
daba4ab7d8 revert changes accidentally included in last commit 2008-07-18 06:22:57 +00:00
alc
69e52692b2 Eliminate unused global variables. (These global variables became fields of
struct kva_md_info many years ago.)
2008-07-18 06:14:36 +00:00
kmacy
c01ed5ad9b import vendor fixes to cxgb 2008-07-18 06:12:31 +00:00
yongari
25c768ed0d Correct 1000Mbps link handling logic for JMC250. This should make
jme(4) run on 1000Mbps link.
2008-07-18 04:20:48 +00:00
yongari
065c59620f Use DELAY() instead of pause if waiting time is less than 1ms.
This will fix driver hang if hz < 1000.

Pointed out by:	thompsa
2008-07-18 01:00:54 +00:00
luoqi
dc64dfc792 Fix a benign typo that would give out an incorrect warning message.
Change a get-or-set sequence on OHCI_COMMAND_STATUS register which
is "write to set" to a simple set.
2008-07-17 22:40:23 +00:00
kib
eff9ee09b4 Pair the VOP_OPEN call from do_execve() with the reciprocal VOP_CLOSE.
This was unnoticed because local filesystems usually do nothing
non-trivial in the close vop.

Reported and tested by:	Rick Macklem
MFC after:	2 weeks
2008-07-17 16:44:07 +00:00
gallatin
57b9f1fb86 Clean up mxge's use of callouts as pointed out by jhb,
and handle NIC hardware watchdog resets.

- remove buggy code at the top of mxge_tick() which tried
  to detect a race which is already detected in the kernel's
  callout code.

- move callout_stop() and callout_reset() into mxge_close()
  mxge_open() rather than doing the callout manipulation
  all over the place.

- use callout_drain(), rather than callout_stop() to prevent
  a potential race between mxge_tick() and mxge_detach()
  which could lead to softclock using a destroyed mutex

- restructure the mxge_tick() and mxge_watchdog_reset()
  routines to avoid resetting a callout, and then
  immediately stopping it if the watchdog reset routine
  is called, and fails.

- enable the driver to handle NIC hardware watchdog
  resets by restoring the NIC's PCI config space, which is
  lost when the NIC hardware watchdog triggers.

Reviewed by: jhb (previus version)
2008-07-17 15:46:35 +00:00
ed
12460029b8 Move the TCSA* definitions out of _KERNEL. They are processed in libc.
The tcsetattr() routine already converts the TCSA* arguments to their
respective TIOCSETA* ioctl's in the C library. There is no need to have
these definitions inside the kernel.

Approved by:	philip (mentor, implicit)
2008-07-16 12:36:39 +00:00
ed
edbfc4f6f4 Sort the ioctl's in <sys/ttycom.h> by number.
I think one of the reasons why we have so many conflicts in the TTY
ioctl category, is because the ioctl's aren't ordered logically. This
commit only sorts them by number. The comments may still be inaccurate.

Approved by:	philip (mentor)
2008-07-16 11:23:15 +00:00
ed
27f054a7ef Remove OTTYDISC, NETLDISC and NTTYDISC definitions.
When I ported most applications away from <sgtty.h>, I noticed none of
them were actually using these definitions. I kept them in place,
because I didn't want to touch tools like pstat(8) and stty(1).

In preparation for the MPSAFE TTY layer, remove these definitions. This
doesn't have any impact with respect to binary compatibility (see
tty_conf.c).

We couldn now add an #error to <sys/ioctl_compat.h> when included
outside the kernel. Unfortunately, kdump's mkioctls includes this file
unconditionally.

Approved by:	philip (mentor)
2008-07-16 11:20:04 +00:00
rwatson
aaeba0f3d2 Fix error in comment.
MFC after:	3 weeks
2008-07-16 10:55:50 +00:00
yongari
8949e679ff Fix a multicast handling regression on VT6105M introduced in
vr(4) overhauling(r177050).

It seems that filtering multicast addresses with multicast CAM
entries require accessing 'CAM enable bit' for each CAM entry.
Subsequent accessing multicast CAM control register without
toggling the 'CAM enable bit' seem to no effects.
In order to fix that separate CAM setup from CAM mask configuration
and CAM entry modification. While I'm here add VLAN CAM filtering
feature which will be enabled in future(FreeBSD now can receive
VLAN id insertion/removal event from vlan(4) on the fly).

For VT6105M hardware, explicitly disable VLAN hardware tag
insertion/stripping and enable VLAN CAM filtering for VLAN id 0.
This shall make non-VLAN frames set VR_RXSTAT_VIDHIT bit in Rx
status word.

Added multicast/VLAN CAM address definition to header file.

PR:	kern/125010, kern/125024
MFC after:	1 week
2008-07-16 08:35:29 +00:00
yongari
dc88e0e3e5 Fix VR_RXSTAT_RX_OK bit definition which lasted for more than 9
years. All datasheet I have indicates the bit 15 is the
VR_RXSTAT_RX_OK. The bit 14 is reserved for all Rhine family
except VT6105M. VT6105M uses that bit to indicate a VLAN frame
with matching CAM VLAN id.
Use the VR_RXSTAT_RX_OK instead of VR_RXSTAT_RXERR when vr(4)
checks the validity of received frame.
This should fix occasional dropping frames on VT6105M.

Tested by:	Goran Lowkrantz ( goran.lowkrantz at ismobile dot com )
MFC after:	1 week
2008-07-16 08:02:23 +00:00
rwatson
6d9661b224 Merge last of a series of rwlock conversion changes to UDP, which
completes the move to a fully parallel UDP transmit path by using
global read, rather than write, locking of inpcbinfo in further
semi-connected cases:

- Add macros to allow try-locking of inpcb and inpcbinfo.
- Always acquire an incpcb read lock in udp_output(), which stablizes the
  local inpcb address and port bindings in order to determine what further
  locking is required:
  - If the inpcb is currently not bound (at all) and are implicitly
    connecting, we require inpcbinfo and inpcb write locks, so drop the
    read lock and re-acquire.
  - If the inpcb is bound for at least one of the port or address, but an
    explicit source or destination is requested, trylock the inpcbinfo
    lock, and if that fails, drop the inpcb lock, lock the global lock,
    and relock the inpcb lock.
  - Otherwise, no further locking is required (common case).
- Update comments.

In practice, this means that the vast majority of consumers of UDP sockets
will not acquire any exclusive locks at the socket or UDP levels of the
network stack.  This leads to a marked performance improvement in several
important workloads, including BIND, nsd, and memcached over UDP, as well
as significant improvements in pps microbenchmarks.

The plan is to MFC all of the rwlock changes to RELENG_7 once they have
settled for a weeks in the tree.

Tested by:	ps, kris (older revision), bde
MFC after:	3 weeks
2008-07-15 15:38:47 +00:00
rpaulo
2d49f66781 Fix commment in typo.
M    tcp_output.c
2008-07-15 10:32:35 +00:00
alc
d4de04e9b1 Update bus_dmamem_alloc()'s first call to malloc() such that M_WAITOK is
specified when appropriate.

Reviewed by:	scottl
2008-07-15 03:34:49 +00:00
delphij
eae79cfa0f Add quirk for Dell D630 laptops.
Tested by:	Quake Lee <quakelee geekcn org>,
		Robert Noland <rnoland 2hip net>
MFC after:	1 week
Approved by:	ariff
2008-07-15 02:34:44 +00:00
jkim
5faf505c39 Allow injecting big packets via bpf(4) up to min(MTU, 16K-byte).
MFC after:	1 week
2008-07-14 22:41:48 +00:00
obrien
c131962593 Match the implementation of the inline function from libkern.h. 2008-07-14 21:36:02 +00:00
eri
253913181a Fix carp(4) panics that can occur during carp interface configuration.
Approved by:	mlaier (mentor)
Reported by:	Scott Ullrich
MFC after:	1 week
2008-07-14 20:11:51 +00:00
jfv
188dc0a4d4 Add event notification at attach/detach so the NIC
is able to detect it and do hardware filtering.
2008-07-14 18:40:21 +00:00
jfv
fbb0502c9b Add an event handler to the vlan driver so the NIC driver
becomes aware of it, and gets the VLAN ID. This will allow
the easy use of VLAN hardware filtering by adapters that
support it.
2008-07-14 18:38:52 +00:00
trhodes
b82c051e92 Fill in the string portion of the bluetooth stack version sysctl.
Approved by:	emax
2008-07-14 13:45:05 +00:00
dougb
67437d52b5 Change the character prefixed to the svn version to "r" since that seems
to be how they are commonly referred to.
2008-07-13 20:08:38 +00:00
alc
181ba3c627 Handle a race between pmap_kextract() and pmap_promote_pde(). This race
caused ZFS to crash when restoring a snapshot with superpage promotion
enabled.

Reported by:	kris
2008-07-13 18:19:53 +00:00
antoine
89ca3c5933 Staticize M_STACK.
Approved by:	rwatson (mentor)
MFC after:	1 month
2008-07-13 17:15:05 +00:00
ed
a8f4e95b68 Make uart(4) the default serial port driver on i386 and amd64.
The uart(4) driver has the advantage of supporting a wider variety of
hardware on a greater amount of platforms. This driver has already been
the standard on platforms such as ia64, powerpc and sparc64.

I've decided not to change anything on pc98. I'd rather let people from
the pc98 team look at this.

Approved by:	philip (mentor), marcel
2008-07-13 07:20:14 +00:00
ticso
c72cf0954d fix multicast hash register definition 2008-07-12 23:40:07 +00:00
alc
a761fe4f10 Refine the changes made in SVN rev 180430. Specifically, instantiate a new
page table page only if the 2MB page mapping has been used.  Also, refactor
some assertions.
2008-07-12 21:24:42 +00:00
rodrigc
f280e5ed8f In nmount(), if we see "update" in the mount options,
set MNT_UPDATE in fsflags, and delete the
"update" option from the global mount options.

MNT_UPDATE is a command, and not a property of a mount
that should persist after the command is executed.

We need to do similar things for MNT_FORCE and MNT_RELOAD.

All mount flags are prefixed by MNT_..... it would
be nice if flags which were commands were named differently
from flags which are persistent properties of a mount.
This was not such a big deal in the pre-nmount() days,
but with nmount() it is more important.

Requested by:	yar
MFC after:	2 weeks
2008-07-12 20:12:40 +00:00
alc
6dd633c3c8 In order to apply pmap_demote_pde() to a page directory entry (PDE) from the
direct map, the PDE must have PG_M and PG_A preset.

Noticed by: Magesh Dhasayyan
2008-07-12 18:43:57 +00:00
scottl
c326e0792a A number of significant enhancements to the ciss driver:
1.  The FreeBSD driver was setting an interrupt coalesce delay of 1000us
for reasons that I can only speculate on.  This was hurting everything
from lame sequential I/O "benchmarks" to legitimate filesystem metadata
operations that relied on serialized barrier writes.  One of my
filesystem tests went from 35s to complete down to 6s.

2.  Implemented the Performant transport method.  Without the fix in
(1), I saw almost no difference.  With it, my filesystem tests showed
another 5-10% improvement in speed.  It was hard to measure CPU
utilization in any meaningful way, so it's not clear if there was a
benefit there, though there should have been since the interrupt handler
was reduced from 2 or more PCI reads down to 1.

3.  Implemented MSI-X.  Without any docs on this, I was just taking a
guess, and it appears to only work with the Performant method.  This
could be a programming or understanding mistake on my part.  While this
by itself made almost no difference to performance since the Performant
method already eliminated most of the synchronous reads over the PCI
bus, it did allow the CISS hardware to stop sharing its interrupt with
the USB hardware, which in turn allowed the driver to become decoupled
from the Giant-locked USB driver stack.  This increased performance by
almost 20%.  The MSI-X setup was done with 4 vectors allocated, but only
1 vector used since the performant method was told to only use 1 of 4
queues.  Fiddling with this might make it work with the simpleq method,
not sure.  I did not implement MSI since I have no MSI-specific hardware
in my test lab.

4.  Improved the locking in the driver, trimmed some data structures.
This didn't improve test times in any measurable way, but it does look
like it gave a minor improvement to CPU usage when many
processes/threads were doing I/O in parallel.  Again, this was hard to
accurately test.
2008-07-11 21:20:51 +00:00
delphij
3b0f89fd90 Don't leak DMA map if not freed.
Submitted by:	kevlo
2008-07-11 18:26:12 +00:00
emax
797b3cb554 Dust off old code for support of USB isochronous transfers.
USB isochronous transfer support is required for Bluetooth SCO.
While i'm here change u_int to uint and update TODO.
This should produce no visible changes unless the device is
broken (or really old).

MFC after:	3 months
2008-07-11 17:13:43 +00:00
lulf
5cc7bcb02c - Fix a logic error when updating plex configuration.
Approved by:	pjd (mentor)
2008-07-11 16:46:29 +00:00
obrien
fa9172e3f7 Improve readability and cscope searches a little bit by not using the
same variable name in closely related (but not conflicting) contexts.
2008-07-11 14:48:28 +00:00
kib
b341cd2da2 Use the VM_ALLOC_INTERRUPT for the page requests when allocating memory
for the bio for swapout write. It allows the page allocator to drain
free page list deeper. As result, a deadlock where pageout deamon sleeps
waiting for bio to be allocated for swapout is no more reproducable in
practice.

Alan said that M_USE_RESERVE shall be ressurrected and used there, but
until this is implemented, M_NOWAIT does exactly what is needed.

Tested by:	pho, kris
Reviewed by:	alc
No objections from:	phk
MFC after:	2 weeks (RELENG_7 only)
2008-07-11 11:27:42 +00:00
kib
da671c0533 Make it atomic for the devfs_populate_loop() to see the setting of
SI_ALIAS flag and initialization of the si_parent when alias is created.
Assert that supplied parent device is not NULL.

Both situations could cause NULL dereference in the
devfs_populate_loop() when creating a symlink for SI_ALIAS'ed device.
Namely, cdp->cdp_c.si_parent may be NULL.

Reported by:	mav
MFC after:	2 weeks
2008-07-11 11:22:19 +00:00
obrien
3b9db50b75 Revert r180431.
r180431 broke the AMD64 build (the only arch using kern/link_elf_obj.c)
2008-07-11 01:10:40 +00:00