Commit Graph

612 Commits

Author SHA1 Message Date
Alan Cox
b8e7fc24fe Add configuration knobs for the superpage reservation system. Initially,
the reservation will only be enabled on amd64.
2007-12-27 16:45:39 +00:00
Jeff Roberson
eea4f254fe - Re-implement lock profiling in such a way that it no longer breaks
the ABI when enabled.  There is no longer an embedded lock_profile_object
   in each lock.  Instead a list of lock_profile_objects is kept per-thread
   for each lock it may own.  The cnt_hold statistic is now always 0 to
   facilitate this.
 - Support shared locking by tracking individual lock instances and
   statistics in the per-thread per-instance lock_profile_object.
 - Make the lock profiling hash table a per-cpu singly linked list with a
   per-cpu static lock_prof allocator.  This removes the need for an array
   of spinlocks and reduces cache contention between cores.
 - Use a seperate hash for spinlocks and other locks so that only a
   critical_enter() is required and not a spinlock_enter() to modify the
   per-cpu tables.
 - Count time spent spinning in the lock statistics.
 - Remove the LOCK_PROFILE_SHARED option as it is always supported now.
 - Specifically drop and release the scheduler locks in both schedulers
   since we track owners now.

In collaboration with:	Kip Macy
Sponsored by:	Nokia
2007-12-15 23:13:31 +00:00
Kip Macy
26a4d66f05 add compile option to remove extra branch introduced by tcp offload support code 2007-12-15 19:53:35 +00:00
Marcel Moolenaar
5aaa8fefdf Add a BSD disklabel backend to g_part:
o  Disklabels can have between 8 and 20 partitions (inclusive).
o  No device special file is created for the raw partition.
o  Switch ia64 to use this backend.
o  No support for boot code yet.
2007-12-06 02:32:42 +00:00
Robert Watson
3c90d1ea74 Break out stack(9) from ddb(4):
- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
  definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it is
  defined, or also if "options DDB" is defined to provide compatibility
  with existing users of stack(9).

Add new stack_save_td(9) function, which allows the capture of a stacktrace
of another thread rather than the current thread, which the existing
stack_save(9) was limited to.  It requires that the thread be neither
swapped out nor running, which is the responsibility of the consumer to
enforce.

Update stack(9) man page.

Build tested:	amd64, arm, i386, ia64, powerpc, sparc64, sun4v
Runtime tested:	amd64 (rwatson), arm (cognet), i386 (rwatson)
2007-12-02 20:40:35 +00:00
Attilio Rao
573c6b82df Make ADAPTIVE_GIANT as the default in the kernel and remove the option.
Currently, Giant is not too much contented so that it is ok to treact it
like any other mutexes.

Please don't forget to update your own custom config kernel files.

Approved by:	cognet, marcel (maintainers of arches where option is
		not enabled at the moment)
2007-11-28 05:50:45 +00:00
Pawel Jakub Dawidek
f854db0bf5 Bring in the GEOM Virtualisation class, which allows to create huge GEOM
providers with limited physical storage and add physical storage as
needed.

Submitted by:	Ivan Voras
Sponsored by:	Google Summer of Code 2006
Approved by:	re (kensmith)
2007-09-23 07:34:23 +00:00
Max Laier
47c96e9530 Remove PF_MPSAFE_UGID leftover.
Spotted by:	bz
Approved by:	re (gnn)
2007-09-22 18:22:31 +00:00
Ariff Abdullah
b28624fde6 Update snd_emu10kx driver with recent perforce changes (and few
other changes too).

(without any real order)

1. Use device_get_nameunit for mutex naming
2. Add timer for low-latency playback
3. Move most mixer controls from sysctls to mixer(8) controls.
   This is a largest part of this patch.
4. Add analog/digital switch (as a temporary sysctl)
5. Get back support for low-bitrate playback (with help of (2))
6. Change locking for exclusive I/O. Writing to non-PTR register
   is almost safe and does not need to be ordered with PTR operations.
7. Disable MIDI until we get it to detach properly and fix memory
   managment problems.
8. Enable multichannel playback by default. It is as stable as
   single-channel mode. Multichannel recording is still an
   experimental feature.
9. Multichannel options can be changed by loader tunables.
10. Add a way to disable card from a loader tunable.
11. Add new PCI IDs.
12. Debugger settings are loader tunables now.
14. Remove some unused variables.
15. Mark pcm sub-devices MPSAFE.
16. Partially revert (bus_setup_intr -> snd_setup_intr) since it need
    to be done independently

Submitted by:	Yuriy Tsibizov (driver maintainer)
Approved by:	re (bmah)
2007-09-12 07:43:43 +00:00
Robert Watson
0bf686c125 Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet.  As that
has now been removed, they are no longer required.  Removing them
significantly simplifies error-handling in the socket layer, eliminated
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option.  Clean up some related gotos for
consistency.

Reviewed by:	bz, csjp
Tested by:	kris
Approved by:	re (kensmith)
2007-08-06 14:26:03 +00:00
Bjoern A. Zeeb
cc977adc71 Rename option IPSEC_FILTERGIF to IPSEC_FILTERTUNNEL.
Also rename the related functions in a similar way.
There are no functional changes.

For a packet coming in with IPsec tunnel mode, the default is
to only call into the firewall with the "outer" IP header and
payload.

With this option turned on, in addition to the "outer" parts,
the "inner" IP header and payload are passed to the
firewall too when going through ip_input() the second time.

The option was never only related to a gif(4) tunnel within
an IPsec tunnel and thus the name was very misleading.

Discussed at:			BSDCan 2007
Best new name suggested by:	rwatson
Reviewed by:			rwatson
Approved by:			re (bmah)
2007-08-05 16:16:15 +00:00
Scott Long
c5933b2086 Introduce Danny Braniss' iSCSI initiator, version 2.0.99. Please read the
included man pages on how to use it.  This code is still somewhat experimental
but has been successfully tested on a number of targets.  Many thanks to
Danny for contributing this.

Approved by: re
2007-07-24 15:35:02 +00:00
Robert Watson
2b851aeb63 Disconnect netatm from the build as it is not MPSAFE and relies on
NET_NEEDS_GIANT, which will shortly be removed.  This is done in a
away that it may be easily reattached to the build before 7.1 if
appropriate locking is added.  Specifics:

- Don't install netatm include files
- Disconnect netatm command line management tools
- Don't build libatm
- Don't include ATM parts in rescue or sysinstall
- Don't install sample configuration files and documents
- Don't build kernel support as a module or in NOTES
- Don't build netgraph wrapper nodes for netatm

This removes the last remaining consumer of NET_NEEDS_GIANT.

Reviewed by:	harti
Discussed with:	bz, bms
Approved by:	re (kensmith)
2007-07-14 21:49:24 +00:00
Randall Stewart
b54d3a6c48 - Modular congestion control, with RFC2581 being the default.
- CMT_PF states added (w/sysctl to turn the PF version on)
- sctp_input.c had a missing incr of cookie case when the
  auth was bad. This meant a free was called without an
  increment to refcnt, added increment like rest of code.
- There was a case, unlikely, when the scope of the destination
  changed (this is a TSNH case). In that case, it would not free
  the alloc'ed asoc (in sctp_input.c).
- When listed addresses found a colliding cookie/Init, then
  the collided upon tcb was not unlocked in sctp_pcb.c
- Add error checking on arguments of sctp_sendx(3) to prevent it from
  referencing a NULL pointer.
- Fix an error return of sctp_sendx(3), it was returing
  ENOMEM not -1.
- Get assoc id was changed to use the sanctified socket api
  method for getting a assoc id (PEER_ADDR_INFO instead of
  PEER_ADDR_PARAMS).
- Fix it so a peeled off socket will get a proper error return
  if it trys to send to a different address then it is connected to.
- Fix so that select_a_stream can avoid an endless loop that
  could hang a caller.
- time_entered (state set time) was not being set in all cases
  to the time we went established.
Approved by:	re(ken smith)
2007-07-14 09:36:28 +00:00
Attilio Rao
c1a6d9fa42 Fix some problems with lock_profiling in sx locks:
- Adjust lock_profiling stubs semantic in the hard functions in order to be
  more accurate and trustable
- Disable shared paths for lock_profiling.  Actually, lock_profiling has a
  subtle race which makes results caming from shared paths not completely
  trustable. A macro stub (LOCK_PROFILING_SHARED) can be actually used for
  re-enabling this paths, but is currently intended for developing use only.
- Use homogeneous names for automatic variables in hard functions regarding
  lock_profiling
- Style fixes
- Add a CTASSERT for some flags building

Discussed with: kmacy, kris
Approved by: jeff (mentor)
Approved by: re
2007-07-06 13:20:44 +00:00
Bjoern A. Zeeb
118043c6b1 Temporary disconnect i4bing, i4bisppp and i4bipr from the build for
the 7.0 timeframe.

This is needed because I4B is not locked and NET_NEEDS_GIANT goes away.

The plan is to lock I4B and bring everything back for 7.1.

Approved by:	re (kensmith)
2007-07-04 00:18:39 +00:00
George V. Neville-Neil
b2630c2934 Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC
option is now deprecated, as well as the KAME IPsec code.
What was FAST_IPSEC is now IPSEC.

Approved by: re
Sponsored by: Secure Computing
2007-07-03 12:13:45 +00:00
Rong-En Fan
534046e301 - Remove UMAP filesystem. It was disconnected from build three years ago,
and it is seriously broken.

Discussed on:   freebsd-arch@
Approved by:	re (mux)
2007-06-25 05:06:57 +00:00
Alan Cox
2446e4f02c Enable the new physical memory allocator.
This allocator uses a binary buddy system with a twist.  First and
foremost, this allocator is required to support the implementation of
superpages.  As a side effect, it enables a more robust implementation
of contigmalloc(9).  Moreover, this reimplementation of
contigmalloc(9) eliminates the acquisition of Giant by
contigmalloc(..., M_NOWAIT, ...).

The twist is that this allocator tries to reduce the number of TLB
misses incurred by accesses through a direct map to small, UMA-managed
objects and page table pages.  Roughly speaking, the physical pages
that are allocated for such purposes are clustered together in the
physical address space.  The performance benefits vary.  In the most
extreme case, a uniprocessor kernel running on an Opteron, I measured
an 18% reduction in system time during a buildworld.

This allocator does not implement page coloring.  The reason is that
superpages have much the same effect.  The contiguous physical memory
allocation necessary for a superpage is inherently colored.

Finally, the one caveat is that this allocator does not effectively
support prezeroed pages.  I hope this is temporary.  On i386, this is
a slight pessimization.  However, on amd64, the beneficial effects of
the direct-map optimization outweigh the ill effects.  I speculate
that this is true in general of machines with a direct map.

Approved by:	re
2007-06-16 04:57:06 +00:00
Xin LI
d1fa59e9e1 MFp4: Add tmpfs, an efficient memory file system.
Please note that, this is currently considered as an
experimental feature so there could be some rough
edges.  Consult http://wiki.freebsd.org/TMPFS for
more information.

For now, connect tmpfs to build on i386 and amd64
architectures only.  Please let us know if you have
success with other platforms.

This work was developed by Julio M. Merino Vidal
for NetBSD as a SoC project; Rohit Jalan ported it
from NetBSD to FreeBSD.  Howard Su and Glen Leeder
are worked on it to continue this effort.

Obtained from:	NetBSD via p4
Submitted by:	Howard Su (with some minor changes)
Approved by:	re (kensmith)
2007-06-16 01:56:05 +00:00
Randall Stewart
80fefe0a08 - Fix so ifn's are properly deleted when the ref count goes to 0.
- Fix so VRF's will clean themselves up when no references are around.
- Allow sctp_ifa to be passed into inpcb_bind, addr_mgmt_ep_sa to bypass
  normal validation checks.
- turn auto-asconf off for subset bound sockets
- Moves all logging to use KTR. This gets rid of most
  of the logging #ifdef's with a few exceptions reducing
  the number of config options for SCTP.
2007-06-14 22:59:04 +00:00
Robert Watson
2281b8f054 Remove IPX over IP tunneling support, which allows IPX routing over IP
tunnels, and was not MPSAFE.  The code can be easily restored in the
event that someone with an IPX over IP tunnel configuration can work
with me to test patches.

This removes one of five remaining consumers of NET_NEEDS_GIANT.

Approved by:	re (kensmith)
2007-06-13 14:01:43 +00:00
Marcel Moolenaar
6bc5044561 Add the MBR partitioning scheme to g_part. This does not yet
support the ability to install boot code.
2007-06-13 04:27:36 +00:00
Attilio Rao
e682569165 Remove the MUTEX_WAKE_ALL option and make it the default behaviour for our
mutexes.
Currently we alredy force MUTEX_WAKE_ALL beacause of some problems with the
!MUTEX_WAKE_ALL case (unavioidable priority inversion).
2007-06-08 21:36:52 +00:00
Jeff Roberson
8e0185f604 - Remove sched_core.c. The maintainer has lost interest in pursuing this
and it has been neglected in the recent ksegrp removal as well as
   the thread_lock() changes.

Discussed with:	davidxu
2007-06-05 00:12:37 +00:00
Randall Stewart
0696e1203e - Fix a memory overwrite when the mapping array
is expanded, size of expansion was not taken int consideration.
-  Fix so vtag hash is 1 bigger so that it modulo's out
   correctly, avoids a panic when restart with right modulo happens.
-  do not dereference stcb when control->do_not_ref_stcb is set
-  Fix up packet logging to not often use a lock and also to
   add to options.
-  Fix some logging option duplication in the sctputil.h
2007-05-30 17:39:45 +00:00
Alexander Motin
7d3b4a0846 A node that implements various traffic shaping and rate limiting algorithms (ng_car).
Approved by:	glebius (mentor)
2007-05-15 16:43:01 +00:00
Paolo Pisati
566bd15634 Make interrupt filtering support compilable.
The entire code is wrapperd in #ifdef ... #endif so it won't harm
the actual implementation, but developers are encouraged to test it.
For arm, ia64, ppc, sparc64 and sun4v some work is still
needed, thus arch maintainers are encouraged to bring their arch on par
with respect to i386 and amd64.

Approved by: re (implicit?)
2007-05-06 17:04:34 +00:00
Kip Macy
f8bbd17f06 Add option for disabling allocation from the packet zone 2007-04-14 20:16:03 +00:00
Andre Oppermann
5dd9dfefd6 Retire unused TCP_SACK_DEBUG. 2007-04-04 14:44:15 +00:00
John Baldwin
4e7f640dfb Optimize sx locks to use simple atomic operations for the common cases of
obtaining and releasing shared and exclusive locks.  The algorithms for
manipulating the lock cookie are very similar to that rwlocks.  This patch
also adds support for exclusive locks using the same algorithm as mutexes.

A new sx_init_flags() function has been added so that optional flags can be
specified to alter a given locks behavior.  The flags include SX_DUPOK,
SX_NOWITNESS, SX_NOPROFILE, and SX_QUITE which are all identical in nature
to the similar flags for mutexes.

Adaptive spinning on select locks may be enabled by enabling the
ADAPTIVE_SX kernel option.  Only locks initialized with the SX_ADAPTIVESPIN
flag via sx_init_flags() will adaptively spin.

The common cases for sx_slock(), sx_sunlock(), sx_xlock(), and sx_xunlock()
are now performed inline in non-debug kernels.  As a result, <sys/sx.h> now
requires <sys/lock.h> to be included prior to <sys/sx.h>.

The new kernel option SX_NOINLINE can be used to disable the aforementioned
inlining in non-debug kernels.

The size of struct sx has changed, so the kernel ABI is probably greatly
disturbed.

MFC after:	1 month
Submitted by:	attilio
Tested by:	kris, pjd
2007-03-31 23:23:42 +00:00
John Baldwin
02e4a32084 Sort. 2007-03-27 19:32:40 +00:00
Jung-uk Kim
2be4e4713a Catch up with ACPI-CA 20070320 import. 2007-03-22 18:16:43 +00:00
John Baldwin
cd6e6e4e11 - Simplify the #ifdef's for adaptive mutexes and rwlocks by conditionally
defining a macro earlier in the file.
- Add NO_ADAPTIVE_RWLOCKS option to disable adaptive spinning for rwlocks.
2007-03-22 16:09:23 +00:00
Andre Oppermann
85c497918c Make TCP_DROP_SYNFIN a standard part of TCP. Disabled by default it
doesn't impede normal operation negatively and is only a few lines of
code.  It's close relatives blackhole and log_in_vain aren't options
either.
2007-03-21 18:25:28 +00:00
Randall Stewart
d2e5427a0d Adds missing flight size logging option for SCTP. 2007-03-20 10:19:09 +00:00
Dag-Erling Smørgrav
1db3766d66 Add GEOM_MULTIPATH so LINT will build.
Pointy hat to:	mjacob
2007-02-27 12:05:25 +00:00
Kip Macy
f183910b97 Further improvements to LOCK_PROFILING:
- Fix missing initialization in kern_rwlock.c causing bogus times to be collected
 - Move updates to the lock hash to after the lock is released for spin mutexes,
   sleep mutexes, and sx locks
 - Add new kernel build option LOCK_PROFILE_FAST - only update lock profiling
   statistics when an acquisition is contended. This reduces the overhead of
   LOCK_PROFILING to increasing system time by 20%-25% which on
   "make -j8 kernel-toolchain" on a dual woodcrest is unmeasurable in terms
   of wall-clock time. Contrast this to enabling lock profiling without
   LOCK_PROFILE_FAST and I see a 5x-6x slowdown in wall-clock time.
2007-02-27 06:42:05 +00:00
Bruce M Simpson
0948f0a28f Build PIM by default as part of the IPv4 multicast forwarding path.
Make PIM dynamically loadable by using encap_attach_func().
PIM may now be loaded into a GENERIC kernel.

Tested with:	ports/net/pimdd && tcpreplay && wireshark
Reviewed by:	Pavlin Radoslavov
2007-02-10 13:59:13 +00:00
Marcel Moolenaar
1d3aed33e8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
Craig Rodrigues
379064396d Remove MSDOSFS_LARGE compile time option. It has been converted
to a run time "-o large" mount option.

PR:		105964
MFC after:	2 weeks
2007-01-30 05:01:06 +00:00
Marius Strobl
420a38dd4b Wrap the EISA-specific parts of the dpt(4) and si(4) back-ends in
the newly added DEV_EISA. This is done so that these back-ends can
be compiled on platforms not providing in{b,w,l}()/out{b,w,l}() and
friends (but may wish to use them together with bus front-ends other
than the EISA one).
2007-01-18 13:33:36 +00:00
Marius Strobl
6e62b06982 Add missing SC_NO_MODE_CHANGE option. Disable it in the powerpc
NOTES though, as ofw_syscons(4) doesn't properly interface with
syscons(4) regarding loading the font specified with SC_DFLT_FONT,
causing a kernel with both options SC_OFWFB and SC_NO_MODE_CHANGE
to not link.
2007-01-10 18:45:18 +00:00
Paolo Pisati
61c0e134f5 Wrap ipfw nat support in a new kernel config option named
"IPFIREWALL_NAT": this way nat is turned off by default and
POLA is preserved.

Reviewed by: rwatson
2007-01-03 11:12:54 +00:00
Max Laier
240589a9fe Work around a long standing LOR with user/group rules by doing the socket
lookup early.  This has some performance implications and should not be
enabled by default, but might help greatly in certain setups.  After some
more testing this could be turned into a sysctl.

Tested by:	avatar
LOR ids:	17, 24, 32, 46, 191 (conceptual)
MFC after:	6 weeks
2006-12-29 13:59:03 +00:00
Gleb Smirnoff
9e6f1d3be4 Build bits for ng_deflate(4) and ng_pred1(4). 2006-12-29 13:16:43 +00:00
Matt Jacob
786da2bbd0 spelling nit 2006-12-18 05:42:33 +00:00
Matt Jacob
f9fbd1a4bc Make MAXPHYS and DFLTPHYS options (finally). 2006-12-10 04:23:23 +00:00
John Birrell
e0b651251d Turn console printf buffering into a kernel option and only on
by default for sun4v where it is absolutely required.

This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.

Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.
2006-11-30 04:17:05 +00:00
John Birrell
d5768a7a5b Remove the KDTRACE option because I can't implement it the
way I intended due to licensing restrictions. I had intended
that it would be defaulted on, with opt-out possible for
companies that don't accept the CDDL. The FreeBSD GENERIC
kernel has to be entirely BSD licensed, so the only alternative
would have been to make KDTRACE an opt-in option. That isn't
a design I favour.
2006-11-21 08:23:20 +00:00