Commit Graph

72268 Commits

Author SHA1 Message Date
Marko Zec
21ca7b57bd Change the curvnet variable from a global const struct vnet *,
previously always pointing to the default vnet context, to a
dynamically changing thread-local one.  The currvnet context
should be set on entry to networking code via CURVNET_SET() macros,
and reverted to previous state via CURVNET_RESTORE().  Recursions
on curvnet are permitted, though strongly discuouraged.

This change should have no functional impact on nooptions VIMAGE
kernel builds, where CURVNET_* macros expand to whitespace.

The curthread->td_vnet (aka curvnet) variable's purpose is to be an
indicator of the vnet context in which the current network-related
operation takes place, in case we cannot deduce the current vnet
context from any other source, such as by looking at mbuf's
m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc.  Moreover, so
far curvnet has turned out to be an invaluable consistency checking
aid: it helps to catch cases when sockets, ifnets or any other
vnet-aware structures may have leaked from one vnet to another.

The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros
was a result of an empirical iterative process, whith an aim to
reduce recursions on CURVNET_SET() to a minimum, while still reducing
the scope of CURVNET_SET() to networking only operations - the
alternative would be calling CURVNET_SET() on each system call entry.
In general, curvnet has to be set in three typicall cases: when
processing socket-related requests from userspace or from within the
kernel; when processing inbound traffic flowing from device drivers
to upper layers of the networking stack, and when executing
timer-driven networking functions.

This change also introduces a DDB subcommand to show the list of all
vnet instances.

Approved by:	julian (mentor)
2009-05-05 10:56:12 +00:00
Jamie Gritton
49939083a0 Add a constant PR_MAXMETHOD to better define the jail/OSD interface.
Reviewed by:	dchagin, kib
Approved by:	bz (mentor)
2009-05-05 05:49:08 +00:00
Alexander Motin
614dd4f83c Do not try to initialize LAPIC timer if we are not going to use it.
It solves assertion, when kernel built with INVARIANTS configured
to use i8254 timer.
2009-05-05 01:13:20 +00:00
John Baldwin
8859442e76 Always compute the root of the kernel source tree and explicitly pass it
to module builds.  This avoids having to have the module builds walk up
the tree to find the kernel sources.  It also allows a kernel + module
build to succeed when a new level of module subdirectories is added without
requiring that the /usr/share/mk/bsd.kmod.mk file on the machine be patched.

MFC after:	1 week
2009-05-04 20:25:56 +00:00
Jamie Gritton
84a8cad0f6 Mark Linux MIB sysctls MPSAFE.
Reviewed by:	dchagin, kib
Approved by:	bz (mentor)
2009-05-04 19:06:05 +00:00
Jung-uk Kim
4ef853cc7f Unlock the largest standard CPUID on Intel CPUs for both amd64 and i386 and
fix SMP topology detection.  On i386, we extend it to cover Core, Core 2,
and Core i7 processors, not just Pentium 4 family, and move it to better
place.  On amd64, all supported Intel CPUs should have this MSR.
2009-05-04 18:05:27 +00:00
Ulf Lilleengen
ad75dd77e0 - Make the gvinum softc invisible to userland, as it is not needed. 2009-05-04 17:30:20 +00:00
Rick Macklem
9ec7b004d0 Add the experimental nfs subtree to the kernel, that includes
support for NFSv4 as well as NFSv2 and 3.
	It lives in 3 subdirs under sys/fs:
	nfs - functions that are common to the client and server
	nfsclient - a mutation of sys/nfsclient that call generic functions
	to do RPCs and handle state. As such, it retains the
	buffer cache handling characteristics and vnode semantics that
	are found in sys/nfsclient, for the most part.
	nfsserver - the server. It includes a DRC designed specifically for
	NFSv4, that is used instead of the generic DRC in sys/rpc.
	The build glue will be checked in later, so at this point, it
	consists of 3 new subdirs that should not affect kernel building.

Approved by:	kib (mentor)
2009-05-04 15:23:58 +00:00
Ed Schouten
3382ac3233 Remove unneeded check for SESS_LEADER().
We perform the same check ~10 lines above.
2009-05-04 11:11:10 +00:00
Alexander Motin
64886da216 Oops, sorry. Fix for fix. 2009-05-04 08:41:54 +00:00
Alexander Motin
b5f9da0f8d There is no atrtc driver in pc98, so hide atrtcclock_disable variable usage
in APM driver for this platform. This should fix pc98 build.
2009-05-04 08:36:47 +00:00
Alan Cox
3a2cdcb0e3 Eliminate vnode_pager_input_smlfs()'s pointless call to pmap_clear_modify().
The page can't possibly have any modified page table entries because it
isn't even mapped.
2009-05-04 06:30:00 +00:00
Robert Watson
6fb82ecbde Remove redundant NFSMNT_NFSV3 check in DTrace hooks for NFS RPC.
MFC after:	1 month
2009-05-04 02:19:52 +00:00
Robert Watson
9e4fda10a0 Fix typo in comment.
MFC after:	1 month
2009-05-04 02:06:39 +00:00
Andrew Thompson
8ee6f90a0c Relax the condition for printing the lost state transition message. The new
state will be set before the EXT_STATEWAIT flag is cleared and its ok to
transition again at that point.
2009-05-03 18:29:04 +00:00
Alexander Motin
1703f2b424 Rename statclock_disable variable to atrtcclock_disable that it actually is,
and hide it inside of atrtc driver. Add new tunable hint.atrtc.0.clock
controlling it. Setting it to 0 disables using RTC clock as stat-/
profclock sources.

Teach i386 and amd64 SMP platforms to emulate stat-/profclocks using i8254
hardclock, when LAPIC and RTC clocks are disabled.

This allows to reduce global interrupt rate of idle system down to about
100 interrupts per core, permitting C3 and deeper C-states provide maximum
CPU power efficiency.
2009-05-03 17:47:21 +00:00
Alexander Motin
2500b6d96d Make dev.cpu.X.cx_usage sysctl also report current average of sleep time. 2009-05-03 06:25:37 +00:00
Alexander Motin
bb1d6ad5b3 Remove unused variable and fix spelling in comment. 2009-05-03 04:58:44 +00:00
Warner Losh
12e36acb09 Bring in Andrew Thompson's port of Sepherosa Ziehau's bwi driver for
Broadcom BCM43xx chipsets.  This driver uses the v3 firmware that
needs to be fetched separately.  A port will be committed to create
the bwi firmware module.

The driver matches the following chips: Broadcom BCM4301, BCM4307,
BCM4306, BCM4309, BCM4311, BCM4312, BCM4318, BCM4319

The driver works for 802.11b and 802.11g.

Limitations:
	This doesn't support the 802.11a or 802.11n portion of radios.
	Some BCM4306 and BCM4309 cards don't work with Channel 1, 2 or 3.
	Documenation for this firmware is reverse engineered from
		 http://bcm.sipsolutions.net/
	V4 of the firmware is needed for 11a or 11n support
		 http://bcm-v4.sipsolutions.net/
	Firmware needs to be fetched from a third party, port to be committed

# I've tested this with a BCM4319 mini-pci and a BCM4318 CardBus card, and
# not connected it to the build until the firmware port is committed.

Obtained from:	DragonFlyBSD, //depot/projects/vap
Reviewed by:	sam@, thompsa@
2009-05-03 04:01:43 +00:00
Yoshihiro Takahashi
80c516a808 MFi386: revision 191745
Add support for using i8254 and rtc timers as event sources for i386 SMP
  system. Redistribute hard-/stat-/profclock events to other CPUs using IPI.
2009-05-03 02:37:13 +00:00
Alexander Motin
b0baaaaecf Avoid comparing negative signed to positive unsignad values. It was
leading to a bug, when C-state does not decrease on sleep shorter then
declared transition latency. Fixing this deprecates workaround for broken
C-states on some hardware.

By the way, change state selecting logic a bit. Instead of last sleep
time use short-time average of it. Global interrupts rate in system is a
quite random value, to corellate subsequent sleeps so directly.
2009-05-02 22:30:33 +00:00
Kip Macy
51a70df2db fix XEN compilation 2009-05-02 22:22:00 +00:00
Sam Leffler
86629111ea don't say "ac WME_AC_BE"; remove "ac" 2009-05-02 20:28:55 +00:00
Sam Leffler
dca68715ab promote ieee80211_seq typedef 2009-05-02 20:25:22 +00:00
Sam Leffler
43049c4828 o dump tx/rx seq#'s for qos tid's
o improve check for when to dump rx ampdu state
2009-05-02 20:21:21 +00:00
Sam Leffler
24a599dd50 whitespace 2009-05-02 20:18:18 +00:00
Sam Leffler
04f19fd699 make superg/fast-frames state dynamically-allocated (and indirect off
the com structure instead of embedded); this reduces the overhead when
not configured and reduces visibility of the contents
2009-05-02 20:16:55 +00:00
Andrew Thompson
5efea30f03 Create a taskqueue for each wireless interface which provides a serialised
sleepable context for net80211 driver callbacks. This removes the need for USB
and firmware based drivers to roll their own code to defer the chip programming
for state changes, scan requests, channel changes and mcast/promisc updates.
When a driver callback completes the hardware state is now guaranteed to have
been updated and is in sync with net80211 layer.

This nukes around 1300 lines of code from the wireless device drivers making
them more readable and less race prone.

The net80211 layer has been updated as follows
 - all state/channel changes are serialised on the taskqueue.
 - ieee80211_new_state() always queues and can now be called from any context
 - scanning runs from a single taskq function and executes to completion. driver
   callbacks are synchronous so the channel, phy mode and rx filters are
   guaranteed to be set in hardware before probe request frames are
   transmitted.

Help and contributions from Sam Leffler.

Reviewed by:	sam
2009-05-02 15:14:18 +00:00
Alexander Motin
a40d9024df Add support for using i8254 and rtc timers as event sources for i386 SMP
system. Redistribute hard-/stat-/profclock events to other CPUs using IPI.
2009-05-02 12:59:47 +00:00
Alexander Motin
6a3a164d6e Add support for using i8254 and rtc timers as event sources for amd64 SMP
system. Redistribute hard-/stat-/profclock events to other CPUs using IPIs.
2009-05-02 12:20:43 +00:00
Dmitry Chagin
40092d93b4 Linux socketpair() call expects explicit specified protocol for
AF_LOCAL domain unlike FreeBSD which expects 0 in this case.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-02 10:51:40 +00:00
Dmitry Chagin
d789bfd562 Move extern variable definitions to the header file.
Approved by:	kib (mentor)
MFC after:	1 month
2009-05-02 10:06:49 +00:00
Marko Zec
5f416f8e84 Make indentation more uniform accross vnet container structs.
This is a purely cosmetic / NOP change.

Reviewed by:	bz
Approved by:	julian (mentor)
Verified by:	svn diff -x -w producing no output
2009-05-02 08:16:26 +00:00
Alan Cox
7b89d46e0f A variety of changes:
Reimplement "kernel_pmap" in the standard way.

Eliminate unused variables.  (These are mostly variables that were
discarded by the machine-independent layer after FreeBSD 4.x.)

Properly handle a vm_page_alloc() failure in pmap_init().

Eliminate dead or legacy (FreeBSD 4.x) code.

Eliminate unnecessary page queues locking.

Eliminate some excess white space.

Correct the synchronization of pmap_page_exists_quick().

Tested by: gonzo
2009-05-02 06:12:38 +00:00
Marko Zec
d7fcc52895 Unbreak options VIMAGE + nooptions INVARIANTS kernel builds.
Submitted by:	julian
Approved by:	julian (mentor)
2009-05-02 05:02:28 +00:00
Alexander Motin
58a2bb4996 Add resume methods to i8254 and atrtc devices. 2009-05-01 21:43:04 +00:00
Sam Leffler
ce7fa847ac revert wip 2009-05-01 21:31:39 +00:00
Robert Watson
fa76567150 Rename MAC Framework-internal macros used to invoke policy entry points:
MAC_BOOLEAN           -> MAC_POLICY_BOOLEAN
  MAC_BOOLEAN_NOSLEEP   -> MAC_POLICY_BOOLEANN_NOSLEEP
  MAC_CHECK             -> MAC_POLICY_CHECK
  MAC_CHECK_NOSLEEP     -> MAC_POLICY_CHECK_NOSLEEP
  MAC_EXTERNALIZE       -> MAC_POLICY_EXTERNALIZE
  MAC_GRANT             -> MAC_POLICY_GRANT
  MAC_GRANT_NOSLEEP     -> MAC_POLICY_GRANT_NOSLEEP
  MAC_INTERNALIZE       -> MAC_POLICY_INTERNALIZE
  MAC_PERFORM           -> MAC_POLICY_PERFORM_CHECK
  MAC_PERFORM_NOSLEEP   -> MAC_POLICY_PERFORM_NOSLEEP

This frees up those macro names for use in wrapping calls into the MAC
Framework from the remainder of the kernel.

Obtained from:	TrustedBSD Project
2009-05-01 21:05:40 +00:00
Alexander Motin
2f369c9496 Small addition to r191720.
Restore previous behaviour for the case of unknown interrupt. Invocation
of IRQ -1 crashes my system on resume. Returning 0, as it was, is not
perfect also, but at least not so dangerous.
2009-05-01 20:53:37 +00:00
Andrew Thompson
3f11aba75f Reorder the bridge add and delete routines to avoid calling ifpromisc() with
the bridge lock held.
2009-05-01 19:46:42 +00:00
Sam Leffler
dcad868984 o add uath
o sort usb wireless drivers
2009-05-01 17:20:16 +00:00
Sam Leffler
9e0ebab5d4 add more tdma fixed rate defaults 2009-05-01 17:18:45 +00:00
Sam Leffler
71aa1d3234 add uath; sort usb wireless drivers 2009-05-01 17:17:06 +00:00
Sam Leffler
cebc52549c add uath 2009-05-01 17:16:33 +00:00
Sam Leffler
cae597f39f add ralfw 2009-05-01 17:15:29 +00:00
Alexander Motin
1ecff35a6b Use value -1 instead of 0 for marking unused APIC vectors. This fixes
IRQ0 routing on LAPIC-enabled systems.

Add hint.apic.0.clock tunable. Setting it 0 disables using LAPIC timers
as hard-/stat-/profclock sources falling back to using i8254 and rtc timers.

On modern CPUs LAPIC is a part of CPU core which is shutting down when CPU
enters C3 or deeper power state. It makes no problems for interrupt
processing, as chipset wakes up CPU on interrupt triggering. But entering
C3 state kills LAPIC timer and freezes system time, making C3 and deeper
states practically unusable. Using i8254 timer allows to avoid this
problem.

By using i8254 timer my T7700 C2D CPU with UP kernel successfully enters
C3 state, saving more then a Watt of total idle power (>10%) in addition to
all other power-saving techniques.

This technique is not working for SMP yet, as only one CPU receives
timer interrupts. But I think that problem could be fixed by forwarding
interrupts to other CPUs with IPI.
2009-05-01 17:05:49 +00:00
Dmitry Chagin
79262bf1f0 Reimplement futexes.
Old implemention used Giant to protect the kernel data structures,
but at the same time called malloc(M_WAITOK), that could cause the
calling thread to sleep and lost Giant protection. User-visible
result was the missed wakeup.

New implementation uses one sx lock per futex. The sx protects
the futex structures and allows to sleep while copyin or copyout
are performed.

Unlike linux, we return EINVAL when FUTEX_CMP_REQUEUE operation
is requested and either caller specified futexes are equial or
second futex already exists. This is acceptable since the situation
can only occur from the application error, and glibc falls back to
old FUTEX_WAKE operation when FUTEX_CMP_REQUEUE returns an error.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-01 15:36:02 +00:00
Bruce M Simpson
a3d3b633a9 Limit scope of acquisition of INP_RLOCK for multicast input filter
to the scope of its use, even though this may thrash the lock if
the INP is referenced for other purposes.

Tested by:	David Wolfskill
2009-05-01 11:05:24 +00:00
Alexander Motin
b1d1fff76c Improve kernel dumping reliability for busy ATA channels:
- Generate fake channel interrupts even if channel busy with previous
request to let it finish. Without this, dumping requests were just queued
and never processed.
 - Drop pre-dump requests queue on dumping. ATA code, working in dumping
(interruptless) mode, unable to handle long request queue. Actually, to get
coherent dump we anyway should do as few unrelated actions as possible.
2009-05-01 08:03:46 +00:00
Pyun YongHyeon
59dffaa088 Separate multicast filtering of SysKonnect GENESIS and Marvell
Yukon from common multicast handling code. Yukon uses hash-based
multicast filtering(big endian form) but GENESIS uses perfect
multicast filtering as well as hash-based one(little endian form).
Due to the differences of multicast filtering there is no much
sense to have a common code.
 o Remove sk_setmulti() and introduce sk_rxfilter_yukon(),
   sk_rxfilter_yukon() that handles multicast filtering setup.
 o Have sk_rxfilter_{yukon, genesis} handle promiscuous mode and
   nuke sk_setpromisc(). This simplifies ioctl handler as well as
   giving a chance to check validity of Rx control register of
   Yukon.
 o Don't reinitialize controller when IFF_ALLMULTI flags is changed.
 o Nuke sk_gmchash(), it's not needed anymore.
 o Always reconfigure Rx control register whenever a new multicast
   filtering condition is changed. This fixes multicast filtering
   setup on Yukon.

PR:	kern/134051
2009-05-01 03:24:03 +00:00