Commit Graph

703 Commits

Author SHA1 Message Date
Robert Watson
0bf686c125 Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet.  As that
has now been removed, they are no longer required.  Removing them
significantly simplifies error-handling in the socket layer, eliminated
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option.  Clean up some related gotos for
consistency.

Reviewed by:	bz, csjp
Tested by:	kris
Approved by:	re (kensmith)
2007-08-06 14:26:03 +00:00
Bjoern A. Zeeb
cc977adc71 Rename option IPSEC_FILTERGIF to IPSEC_FILTERTUNNEL.
Also rename the related functions in a similar way.
There are no functional changes.

For a packet coming in with IPsec tunnel mode, the default is
to only call into the firewall with the "outer" IP header and
payload.

With this option turned on, in addition to the "outer" parts,
the "inner" IP header and payload are passed to the
firewall too when going through ip_input() the second time.

The option was never only related to a gif(4) tunnel within
an IPsec tunnel and thus the name was very misleading.

Discussed at:			BSDCan 2007
Best new name suggested by:	rwatson
Reviewed by:			rwatson
Approved by:			re (bmah)
2007-08-05 16:16:15 +00:00
Scott Long
c5933b2086 Introduce Danny Braniss' iSCSI initiator, version 2.0.99. Please read the
included man pages on how to use it.  This code is still somewhat experimental
but has been successfully tested on a number of targets.  Many thanks to
Danny for contributing this.

Approved by: re
2007-07-24 15:35:02 +00:00
Robert Watson
2b851aeb63 Disconnect netatm from the build as it is not MPSAFE and relies on
NET_NEEDS_GIANT, which will shortly be removed.  This is done in a
away that it may be easily reattached to the build before 7.1 if
appropriate locking is added.  Specifics:

- Don't install netatm include files
- Disconnect netatm command line management tools
- Don't build libatm
- Don't include ATM parts in rescue or sysinstall
- Don't install sample configuration files and documents
- Don't build kernel support as a module or in NOTES
- Don't build netgraph wrapper nodes for netatm

This removes the last remaining consumer of NET_NEEDS_GIANT.

Reviewed by:	harti
Discussed with:	bz, bms
Approved by:	re (kensmith)
2007-07-14 21:49:24 +00:00
Randall Stewart
b54d3a6c48 - Modular congestion control, with RFC2581 being the default.
- CMT_PF states added (w/sysctl to turn the PF version on)
- sctp_input.c had a missing incr of cookie case when the
  auth was bad. This meant a free was called without an
  increment to refcnt, added increment like rest of code.
- There was a case, unlikely, when the scope of the destination
  changed (this is a TSNH case). In that case, it would not free
  the alloc'ed asoc (in sctp_input.c).
- When listed addresses found a colliding cookie/Init, then
  the collided upon tcb was not unlocked in sctp_pcb.c
- Add error checking on arguments of sctp_sendx(3) to prevent it from
  referencing a NULL pointer.
- Fix an error return of sctp_sendx(3), it was returing
  ENOMEM not -1.
- Get assoc id was changed to use the sanctified socket api
  method for getting a assoc id (PEER_ADDR_INFO instead of
  PEER_ADDR_PARAMS).
- Fix it so a peeled off socket will get a proper error return
  if it trys to send to a different address then it is connected to.
- Fix so that select_a_stream can avoid an endless loop that
  could hang a caller.
- time_entered (state set time) was not being set in all cases
  to the time we went established.
Approved by:	re(ken smith)
2007-07-14 09:36:28 +00:00
Attilio Rao
c1a6d9fa42 Fix some problems with lock_profiling in sx locks:
- Adjust lock_profiling stubs semantic in the hard functions in order to be
  more accurate and trustable
- Disable shared paths for lock_profiling.  Actually, lock_profiling has a
  subtle race which makes results caming from shared paths not completely
  trustable. A macro stub (LOCK_PROFILING_SHARED) can be actually used for
  re-enabling this paths, but is currently intended for developing use only.
- Use homogeneous names for automatic variables in hard functions regarding
  lock_profiling
- Style fixes
- Add a CTASSERT for some flags building

Discussed with: kmacy, kris
Approved by: jeff (mentor)
Approved by: re
2007-07-06 13:20:44 +00:00
Bjoern A. Zeeb
118043c6b1 Temporary disconnect i4bing, i4bisppp and i4bipr from the build for
the 7.0 timeframe.

This is needed because I4B is not locked and NET_NEEDS_GIANT goes away.

The plan is to lock I4B and bring everything back for 7.1.

Approved by:	re (kensmith)
2007-07-04 00:18:39 +00:00
George V. Neville-Neil
b2630c2934 Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC
option is now deprecated, as well as the KAME IPsec code.
What was FAST_IPSEC is now IPSEC.

Approved by: re
Sponsored by: Secure Computing
2007-07-03 12:13:45 +00:00
Rong-En Fan
534046e301 - Remove UMAP filesystem. It was disconnected from build three years ago,
and it is seriously broken.

Discussed on:   freebsd-arch@
Approved by:	re (mux)
2007-06-25 05:06:57 +00:00
Alan Cox
2446e4f02c Enable the new physical memory allocator.
This allocator uses a binary buddy system with a twist.  First and
foremost, this allocator is required to support the implementation of
superpages.  As a side effect, it enables a more robust implementation
of contigmalloc(9).  Moreover, this reimplementation of
contigmalloc(9) eliminates the acquisition of Giant by
contigmalloc(..., M_NOWAIT, ...).

The twist is that this allocator tries to reduce the number of TLB
misses incurred by accesses through a direct map to small, UMA-managed
objects and page table pages.  Roughly speaking, the physical pages
that are allocated for such purposes are clustered together in the
physical address space.  The performance benefits vary.  In the most
extreme case, a uniprocessor kernel running on an Opteron, I measured
an 18% reduction in system time during a buildworld.

This allocator does not implement page coloring.  The reason is that
superpages have much the same effect.  The contiguous physical memory
allocation necessary for a superpage is inherently colored.

Finally, the one caveat is that this allocator does not effectively
support prezeroed pages.  I hope this is temporary.  On i386, this is
a slight pessimization.  However, on amd64, the beneficial effects of
the direct-map optimization outweigh the ill effects.  I speculate
that this is true in general of machines with a direct map.

Approved by:	re
2007-06-16 04:57:06 +00:00
Xin LI
d1fa59e9e1 MFp4: Add tmpfs, an efficient memory file system.
Please note that, this is currently considered as an
experimental feature so there could be some rough
edges.  Consult http://wiki.freebsd.org/TMPFS for
more information.

For now, connect tmpfs to build on i386 and amd64
architectures only.  Please let us know if you have
success with other platforms.

This work was developed by Julio M. Merino Vidal
for NetBSD as a SoC project; Rohit Jalan ported it
from NetBSD to FreeBSD.  Howard Su and Glen Leeder
are worked on it to continue this effort.

Obtained from:	NetBSD via p4
Submitted by:	Howard Su (with some minor changes)
Approved by:	re (kensmith)
2007-06-16 01:56:05 +00:00
Randall Stewart
80fefe0a08 - Fix so ifn's are properly deleted when the ref count goes to 0.
- Fix so VRF's will clean themselves up when no references are around.
- Allow sctp_ifa to be passed into inpcb_bind, addr_mgmt_ep_sa to bypass
  normal validation checks.
- turn auto-asconf off for subset bound sockets
- Moves all logging to use KTR. This gets rid of most
  of the logging #ifdef's with a few exceptions reducing
  the number of config options for SCTP.
2007-06-14 22:59:04 +00:00
Robert Watson
2281b8f054 Remove IPX over IP tunneling support, which allows IPX routing over IP
tunnels, and was not MPSAFE.  The code can be easily restored in the
event that someone with an IPX over IP tunnel configuration can work
with me to test patches.

This removes one of five remaining consumers of NET_NEEDS_GIANT.

Approved by:	re (kensmith)
2007-06-13 14:01:43 +00:00
Marcel Moolenaar
6bc5044561 Add the MBR partitioning scheme to g_part. This does not yet
support the ability to install boot code.
2007-06-13 04:27:36 +00:00
Attilio Rao
e682569165 Remove the MUTEX_WAKE_ALL option and make it the default behaviour for our
mutexes.
Currently we alredy force MUTEX_WAKE_ALL beacause of some problems with the
!MUTEX_WAKE_ALL case (unavioidable priority inversion).
2007-06-08 21:36:52 +00:00
Jeff Roberson
8e0185f604 - Remove sched_core.c. The maintainer has lost interest in pursuing this
and it has been neglected in the recent ksegrp removal as well as
   the thread_lock() changes.

Discussed with:	davidxu
2007-06-05 00:12:37 +00:00
Randall Stewart
0696e1203e - Fix a memory overwrite when the mapping array
is expanded, size of expansion was not taken int consideration.
-  Fix so vtag hash is 1 bigger so that it modulo's out
   correctly, avoids a panic when restart with right modulo happens.
-  do not dereference stcb when control->do_not_ref_stcb is set
-  Fix up packet logging to not often use a lock and also to
   add to options.
-  Fix some logging option duplication in the sctputil.h
2007-05-30 17:39:45 +00:00
Alexander Motin
7d3b4a0846 A node that implements various traffic shaping and rate limiting algorithms (ng_car).
Approved by:	glebius (mentor)
2007-05-15 16:43:01 +00:00
Paolo Pisati
566bd15634 Make interrupt filtering support compilable.
The entire code is wrapperd in #ifdef ... #endif so it won't harm
the actual implementation, but developers are encouraged to test it.
For arm, ia64, ppc, sparc64 and sun4v some work is still
needed, thus arch maintainers are encouraged to bring their arch on par
with respect to i386 and amd64.

Approved by: re (implicit?)
2007-05-06 17:04:34 +00:00
Kip Macy
f8bbd17f06 Add option for disabling allocation from the packet zone 2007-04-14 20:16:03 +00:00
Andre Oppermann
5dd9dfefd6 Retire unused TCP_SACK_DEBUG. 2007-04-04 14:44:15 +00:00
John Baldwin
4e7f640dfb Optimize sx locks to use simple atomic operations for the common cases of
obtaining and releasing shared and exclusive locks.  The algorithms for
manipulating the lock cookie are very similar to that rwlocks.  This patch
also adds support for exclusive locks using the same algorithm as mutexes.

A new sx_init_flags() function has been added so that optional flags can be
specified to alter a given locks behavior.  The flags include SX_DUPOK,
SX_NOWITNESS, SX_NOPROFILE, and SX_QUITE which are all identical in nature
to the similar flags for mutexes.

Adaptive spinning on select locks may be enabled by enabling the
ADAPTIVE_SX kernel option.  Only locks initialized with the SX_ADAPTIVESPIN
flag via sx_init_flags() will adaptively spin.

The common cases for sx_slock(), sx_sunlock(), sx_xlock(), and sx_xunlock()
are now performed inline in non-debug kernels.  As a result, <sys/sx.h> now
requires <sys/lock.h> to be included prior to <sys/sx.h>.

The new kernel option SX_NOINLINE can be used to disable the aforementioned
inlining in non-debug kernels.

The size of struct sx has changed, so the kernel ABI is probably greatly
disturbed.

MFC after:	1 month
Submitted by:	attilio
Tested by:	kris, pjd
2007-03-31 23:23:42 +00:00
John Baldwin
02e4a32084 Sort. 2007-03-27 19:32:40 +00:00
Jung-uk Kim
2be4e4713a Catch up with ACPI-CA 20070320 import. 2007-03-22 18:16:43 +00:00
John Baldwin
cd6e6e4e11 - Simplify the #ifdef's for adaptive mutexes and rwlocks by conditionally
defining a macro earlier in the file.
- Add NO_ADAPTIVE_RWLOCKS option to disable adaptive spinning for rwlocks.
2007-03-22 16:09:23 +00:00
Andre Oppermann
85c497918c Make TCP_DROP_SYNFIN a standard part of TCP. Disabled by default it
doesn't impede normal operation negatively and is only a few lines of
code.  It's close relatives blackhole and log_in_vain aren't options
either.
2007-03-21 18:25:28 +00:00
Randall Stewart
d2e5427a0d Adds missing flight size logging option for SCTP. 2007-03-20 10:19:09 +00:00
Dag-Erling Smørgrav
1db3766d66 Add GEOM_MULTIPATH so LINT will build.
Pointy hat to:	mjacob
2007-02-27 12:05:25 +00:00
Kip Macy
f183910b97 Further improvements to LOCK_PROFILING:
- Fix missing initialization in kern_rwlock.c causing bogus times to be collected
 - Move updates to the lock hash to after the lock is released for spin mutexes,
   sleep mutexes, and sx locks
 - Add new kernel build option LOCK_PROFILE_FAST - only update lock profiling
   statistics when an acquisition is contended. This reduces the overhead of
   LOCK_PROFILING to increasing system time by 20%-25% which on
   "make -j8 kernel-toolchain" on a dual woodcrest is unmeasurable in terms
   of wall-clock time. Contrast this to enabling lock profiling without
   LOCK_PROFILE_FAST and I see a 5x-6x slowdown in wall-clock time.
2007-02-27 06:42:05 +00:00
Bruce M Simpson
0948f0a28f Build PIM by default as part of the IPv4 multicast forwarding path.
Make PIM dynamically loadable by using encap_attach_func().
PIM may now be loaded into a GENERIC kernel.

Tested with:	ports/net/pimdd && tcpreplay && wireshark
Reviewed by:	Pavlin Radoslavov
2007-02-10 13:59:13 +00:00
Marcel Moolenaar
1d3aed33e8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
Craig Rodrigues
379064396d Remove MSDOSFS_LARGE compile time option. It has been converted
to a run time "-o large" mount option.

PR:		105964
MFC after:	2 weeks
2007-01-30 05:01:06 +00:00
Marius Strobl
420a38dd4b Wrap the EISA-specific parts of the dpt(4) and si(4) back-ends in
the newly added DEV_EISA. This is done so that these back-ends can
be compiled on platforms not providing in{b,w,l}()/out{b,w,l}() and
friends (but may wish to use them together with bus front-ends other
than the EISA one).
2007-01-18 13:33:36 +00:00
Marius Strobl
6e62b06982 Add missing SC_NO_MODE_CHANGE option. Disable it in the powerpc
NOTES though, as ofw_syscons(4) doesn't properly interface with
syscons(4) regarding loading the font specified with SC_DFLT_FONT,
causing a kernel with both options SC_OFWFB and SC_NO_MODE_CHANGE
to not link.
2007-01-10 18:45:18 +00:00
Paolo Pisati
61c0e134f5 Wrap ipfw nat support in a new kernel config option named
"IPFIREWALL_NAT": this way nat is turned off by default and
POLA is preserved.

Reviewed by: rwatson
2007-01-03 11:12:54 +00:00
Max Laier
240589a9fe Work around a long standing LOR with user/group rules by doing the socket
lookup early.  This has some performance implications and should not be
enabled by default, but might help greatly in certain setups.  After some
more testing this could be turned into a sysctl.

Tested by:	avatar
LOR ids:	17, 24, 32, 46, 191 (conceptual)
MFC after:	6 weeks
2006-12-29 13:59:03 +00:00
Gleb Smirnoff
9e6f1d3be4 Build bits for ng_deflate(4) and ng_pred1(4). 2006-12-29 13:16:43 +00:00
Matt Jacob
786da2bbd0 spelling nit 2006-12-18 05:42:33 +00:00
Matt Jacob
f9fbd1a4bc Make MAXPHYS and DFLTPHYS options (finally). 2006-12-10 04:23:23 +00:00
John Birrell
e0b651251d Turn console printf buffering into a kernel option and only on
by default for sun4v where it is absolutely required.

This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.

Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.
2006-11-30 04:17:05 +00:00
John Birrell
d5768a7a5b Remove the KDTRACE option because I can't implement it the
way I intended due to licensing restrictions. I had intended
that it would be defaulted on, with opt-out possible for
companies that don't accept the CDDL. The FreeBSD GENERIC
kernel has to be entirely BSD licensed, so the only alternative
would have been to make KDTRACE an opt-in option. That isn't
a design I favour.
2006-11-21 08:23:20 +00:00
Kip Macy
7c0435b933 MUTEX_PROFILING has been generalized to LOCK_PROFILING. We now profile
wait (time waited to acquire) and hold times for *all* kernel locks. If
the architecture has a system synchronized TSC, the profiling code will
use that - thereby minimizing profiling overhead. Large chunks of profiling
code have been moved out of line, the overhead measured on the T1 for when
it is compiled in but not enabled is < 1%.

Approved by: scottl (standing in for mentor rwatson)
Reviewed by: des and jhb
2006-11-11 03:18:07 +00:00
Randall Stewart
f8829a4a40 Ok, here it is, we finally add SCTP to current. Note that this
work is not just mine, but it is also the works of Peter Lei
and Michael Tuexen. They both are my two key other developers
working on the project.. and they need ata-boy's too:
****
peterlei@cisco.com
tuexen@fh-muenster.de
****
I did do a make sysent which updated the
syscall's and sysproto.. I hope that is correct... without
it you don't build since we have new syscalls for SCTP :-0

So go out and look at the NOTES, add
option SCTP (make sure inet and inet6 are present too)
and play with SCTP.

I will see about comitting some test tools I have after I
figure out where I should place them. I also have a
lib (libsctp.a) that adds some of the missing socketapi
functions that I need to put into lib's.. I will talk
to George about this :-)

There may still be some 64 bit issues in here, none of
us have a 64 bit processor to test with yet.. Michael
may have a MAC but thats another beast too..

If you have a mac and want to use SCTP contact Michael
he maintains a web site with a loadable module with
this code :-)

Reviewed by:	gnn
Approved by:	gnn
2006-11-03 15:23:16 +00:00
Matt Jacob
bd3fd815a7 2nd and final commit that moves us to CAM_NEW_TRAN_CODE
as the default.

Reviewed by multitudes.
2006-11-02 00:54:38 +00:00
Pawel Jakub Dawidek
f348204c94 Hook up gjournal bits to the build.
Sponsored by:	home.pl
2006-10-31 22:22:30 +00:00
Ruslan Ermilov
5d9f25dce2 Added the GEOM_CACHE option.
Reminded by:	pjd
2006-10-06 10:43:42 +00:00
Ruslan Ermilov
6c9fdda750 Added COMPAT_FREEBSD6 option. 2006-09-26 12:36:34 +00:00
Robert Watson
738f14d4b1 Remove MAC_DEBUG label counters, which were used to debug leaks and
other problems while labels were first being added to various kernel
objects.  They have outlived their usefulness.

MFC after:	1 month
Suggested by:	Christopher dot Vance at SPARTA dot com
Obtained from:	TrustedBSD Project
2006-09-20 13:33:41 +00:00
Julian Elischer
b7522c27d2 Remove the IPFIREWALL_FORWARD_EXTENDED option and make it on by default as it always was
in older versions of FreeBSD. This option is pointless as it is needed in just
about every interesting usage of forward that I have ever seen. It doesn't make
the system any safer and just wastes huge amounts of develper time
when the system doesn't behave as expected when code is moved from
4.x to 6.x It doesn't make
the system any safer and just wastes huge amounts of develper time
when the system doesn't behave as expected when code is moved from
4.x to 6.x  or 7.x
Reviewed by:	glebius
MFC after:	1 week
2006-08-17 00:37:03 +00:00
John Birrell
b0777fc474 Add an option to enable KSE support.
Add an option to build in kernel DTrace hooks. Without this option, the
DTrace modules acn't be loaded.
2006-08-03 05:19:33 +00:00
Marcel Moolenaar
302981e72a Remove sio(4) and related options from MI files to amd64, i386
and pc98 MD files. Remove nodevice and nooption lines specific
to sio(4) from ia64, powerpc and sparc64 NOTES. There were no
such lines for arm yet.
sio(4) is usable on less than half the platforms, not counting
a future mips platform. Its presence in MI files is therefore
increasingly becoming a burden.
2006-07-29 18:38:54 +00:00
Warner Losh
16c84e5e51 Add new kernel config option. NO_SYSCTL_DESCR to omit the descriptions for
the sysctls.  This saves a lot of space in the resulting kernel which is
important for embedded systems.  This change was done in a ABI compatible
way.  The pointer is still there, it just points to an empty string instead
of the description.

MFC After: 3 days
2006-07-18 17:00:51 +00:00
Poul-Henning Kamp
9c499ad92f Remove the NDEVFSINO and NDEVFSOVERFLOW options which no longer exists in
DEVFS.

Remove the opt_devfs.h file now that it is empty.
2006-07-17 09:07:02 +00:00
Poul-Henning Kamp
cabdf61c75 Remove config(8)'s knowledge about NMBCLUSTERS, no code in /sys
knows about it any more.
2006-07-17 08:14:46 +00:00
Alexander Leidinger
0fa7ab6a31 - Connect the snd_emu10kx driver to the build. [1]
- Bump __FreeBSD_version, no need to build the port now.

Submitted by:	Yuriy Tsibizov <Yuriy.Tsibizov@gfk.ru> [1]
2006-07-15 20:22:40 +00:00
Gleb Smirnoff
d473c9d543 A netgraph node that can do different manipulations with
mbuf_tags(9) on packets.

Submitted by:		Vadim Goncharov <vadimnuclight tpu.ru>
mdoc(7) reviewed by:	ru
2006-06-27 12:45:28 +00:00
Andrew Thompson
bdea400f3b Add a pseudo interface for packet filtering IPSec connections before or after
encryption. There are two functions, a bpf tap which has a basic header with
the SPI number which our current tcpdump knows how to display, and handoff to
pfil(9) for packet filtering.

Obtained from:	OpenBSD
Based on:	kern/94829
No objections:	arch, net
MFC after:	1 month
2006-06-26 22:30:08 +00:00
Sergey Babkin
d81175c738 Backed out the change by request from rwatson.
PR:		kern/14584
2006-06-26 22:03:22 +00:00
Sergey Babkin
7a799f1ef0 The common UID/GID space implementation. It has been discussed on -arch
in 1999, and there are changes to the sysctl names compared to PR,
according to that discussion. The description is in sys/conf/NOTES.
Lines in the GENERIC files are added in commented-out form.
I'll attach the test script I've used to PR.

PR:		kern/14584
Submitted by:	babkin
2006-06-25 18:37:44 +00:00
David Xu
b41f1452d9 Add scheduler CORE, the work I have done half a year ago, recent,
I picked it up again. The scheduler is forked from ULE, but the
algorithm to detect an interactive process is almost completely
different with ULE, it comes from Linux paper "Understanding the
Linux 2.6.8.1 CPU Scheduler", although I still use same word
"score" as a priority boost in ULE scheduler.

Briefly, the scheduler has following characteristic:
1. Timesharing process's nice value is seriously respected,
   timeslice and interaction detecting algorithm are based
   on nice value.
2. per-cpu scheduling queue and load balancing.
3. O(1) scheduling.
4. Some cpu affinity code in wakeup path.
5. Support POSIX SCHED_FIFO and SCHED_RR.
Unlike scheduler 4BSD and ULE which using fuzzy RQ_PPQ, the scheduler
uses 256 priority queues. Unlike ULE which using pull and push, the
scheduelr uses pull method, the main reason is to let relative idle
cpu do the work, but current the whole scheduler is protected by the
big sched_lock, so the benefit is not visible, it really can be worse
than nothing because all other cpu are locked out when we are doing
balancing work, which the 4BSD scheduelr does not have this problem.
The scheduler does not support hyperthreading very well, in fact,
the scheduler does not make the difference between physical CPU and
logical CPU, this should be improved in feature. The scheduler has
priority inversion problem on MP machine, it is not good for
realtime scheduling, it can cause realtime process starving.
As a result, it seems the MySQL super-smack runs better on my
Pentium-D machine when using libthr, despite on UP or SMP kernel.
2006-06-13 13:12:56 +00:00
Marius Strobl
acb8c14985 Make the ISAPNP code optional and only enable it on i386 and pc98 (used
for CBUS-PNP cards there) by default, as there are no amd64 and sparc64
machines with ISA slots and which therefore could make use of this code
known to exist. For sparc64 this additionally allows to get rid of the
compat shims for in{b,w,l}()/out{b,w,l}() etc and the associated hacks.

OK'ed by:	imp, peter
2006-06-12 21:07:13 +00:00
Sam Leffler
3735f6ff08 remove ath hal options; having them here causes opt_ah.h to be clobbered
by config and that breaks builds unless one duplicates the options in the
config file

MFC after:	1 month
2006-06-07 17:53:15 +00:00
Doug Ambrisko
741367d5a5 Add in a bunch of things to the mfi driver:
- Linux ioctl support, with the other Linux changes MegaCli
	will run if you mount linprocfs & linsysfs then set
	sysctl compat.linux.osrelease=2.6.12 or similar.  This works
	on i386.  It should work on amd64 but not well tested yet.
	StoreLib may or may not work.  Remember to kldload mfi_linux.
      - Add in AEN (Async Event Notification) support so we can
	get messages from the firmware when something happens.
	Not all messages are in defined in event detail.  Use
	event_log to try to figure out what happened.
      - Try to implement something like SIGIO for StoreLib.  Since
	mrmonitor doesn't work right I can't fully test it.  StoreLib
	works best with the rh9 base.  In theory mrmonitor isn't
	needed due to native driver support of AEN :-)
Now we can configure and monitor the RAID better.

Submitted by:	IronPort Systems.
2006-05-18 23:30:48 +00:00
Max Laier
656faadcb8 Remove ip6fw. Since ipfw has full functional IPv6 support now and - in
contrast to ip6fw - is properly lockes, it is time to retire ip6fw.
2006-05-12 20:39:23 +00:00
Benno Rice
26ab616fdc Add a new kernel config option, VERBOSE_SYSINIT.
When porting FreeBSD to a new platform, one of the more useful things to do is
get mi_startup() to let you know which SYSINIT it's up to.  Most people tend to
whack a printf in the SYSINIT loop to print the address of the function it's
about to call.  Going one better, jhb made a version that uses DDB to look up
the name of the function and print that instead.  This version is essentially
his with the addition of some ifdeffery to make it optional and to allow it to
work (although using only the function address, not the symbol) if you forgot
to enable DDB.

All the cool bits by:	jhb
Approved by:		scottl, rink, cognet, imp
2006-05-12 02:01:38 +00:00
Alexander Leidinger
f4eb471709 - change the example of compiling only specific modules to not contain
the linux module, since it is not cross-platform
- move linprocfs from "files" and "options" to architecture specific files,
  since it only makes sense to build this for those architectures, where we
  also have a linuxolator
- disable the build of the linuxolator on our tier-2 architecture "Alpha":
  * we don't have a linux_base port which supports Alpha and at the
    same time is not outdated/obsoleted upstream/in a good condition/
    currently working
  * the upcomming new default linux base port is based upon Fedora
    Core 3 (security support via http://www.fedoralegacy.org), which
    isn't available for Alpha (like the current default linux base
    port which is based upon Red Hat 8)
  * nobody answered my request for testing it ~1 month ago on
    current@ and alpha@ (it doesn't surprises me, see above)
  * a SoC student wouldn't have to waste time on something which
    nobody is willing to test

This does not remove the alpha specific MD files of the linuxolator yet.

Discussed on:		arch (mostly silence)
Spiritual support by:	scottl
2006-05-07 18:12:18 +00:00
Sam Leffler
33f02c8e06 AH_REGOPS_FUNC is needed for sparc
MFC after:	2 weeks
2006-05-05 04:19:36 +00:00
Marcel Moolenaar
64220a7e28 Rewrite of puc(4). Significant changes are:
o  Properly use rman(9) to manage resources. This eliminates the
   need to puc-specific hacks to rman. It also allows devinfo(8)
   to be used to find out the specific assignment of resources to
   serial/parallel ports.
o  Compress the PCI device "database" by optimizing for the common
   case and to use a procedural interface to handle the exceptions.
   The procedural interface also generalizes the need to setup the
   hardware (program chipsets, program clock frequencies).
o  Eliminate the need for PUC_FASTINTR. Serdev devices are fast by
   default and non-serdev devices are handled by the bus.
o  Use the serdev I/F to collect interrupt status and to handle
   interrupts across ports in priority order.
o  Sync the PCI device configuration to include devices found in
   NetBSD and not yet merged to FreeBSD.
o  Add support for Quatech 2, 4 and 8 port UARTs.
o  Add support for a couple dozen Timedia serial cards as found
   in Linux.
2006-04-28 21:21:53 +00:00
Michael Reifenberger
c4529f4161 make BGE_FAKE_AUTONEG a tunable.
This allows one to change the behavior of the driver pre-boot.

NOTE: This patch was made for DragonFly BSD by Sepherosa Ziehau.

PR:		kern/94833
Submitted by:	Devon H. O'Dell
Obtained from:	DragonFly
MFC after:	1 month
2006-04-25 15:56:52 +00:00
Marcel Moolenaar
cea4d8752f o Move ISA specific code from ppc.c to ppc_isa.c -- a bus front-
end for isa(4).
o  Add a seperate bus frontend for acpi(4) and allow ISA DMA for
   it when ISA is configured in the kernel. This allows acpi(4)
   attachments in non-ISA configurations, as is possible for ia64.
o  Add a seperate bus frontend for pci(4) and detect known single
   port parallel cards.
o  Merge PC98 specific changes under pc98/cbus into the MI driver.
   The changes are minor enough for conditional compilation and
   in this form invites better abstraction.
o  Have ppc(4) usabled on all platforms, now that ISA specifics
   are untangled enough.
2006-04-24 23:31:51 +00:00
Matt Jacob
af60634800 Add ISP_DEFAULT_ROLES as a config option. 2006-04-18 22:24:55 +00:00
Paul Saab
d8636a9ab7 Hook bce up to the build 2006-04-10 20:04:22 +00:00
Sam Leffler
a585a9a1bc o add opt_ath.h enable tweaking various config parameters for the driver
without modifying the source code
o default debug msgs and diag support to off

MFC after:	3 days
2006-04-03 18:14:02 +00:00
Scott Long
7f631a410c Hook the MFI driver up to the build. 2006-03-29 09:57:22 +00:00
Yaroslav Tykhiy
8d96e45531 Retire NETSMBCRYPTO as a kernel option and make its functionality
enabled by default in NETSMB and smbfs.ko.

With the most of modern SMB providers requiring encryption by
default, there is little sense left in keeping the crypto part
of NETSMB optional at the build time.

This will also return smbfs.ko to its former properties users
are rather accustomed to.

Discussed with:		freebsd-stable, re (scottl)
Not objected by:	bp, tjr (silence)
MFC after:		5 days
2006-03-05 22:52:17 +00:00
Yaroslav Tykhiy
375ce6798f Take the functionality contained in the former "options TDFX_LINUX"
into a separate module.  Accordingly, convert the option into a device
named similarly.

Note for MFC: Perhaps the option should stay in RELENG_6 for POLA reasons.

Suggested by:	scottl
Reviewed by:	cokane
MFC after:	5 days
2006-03-03 21:37:38 +00:00
Warner Losh
8dfbd03d30 Move XBOX option to options. While it is only valid on i386,
syscons_isa is shared with other machines.
2006-03-03 18:09:37 +00:00
Robert Watson
07881ef960 Add 'options AUDIT' and associate various .c files with the AUDIT
option.  We always build audit_syscalls.c so that the system call
stubs can return ENOSYS rather than the system call code
generating SIGSYS for the system calls.  We are not yet ready to
add AUDIT to LINT, as the prototypes for system call arguments
won't be there until after the system calls for audit are added.

Much work from:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-01 21:00:16 +00:00
Pawel Jakub Dawidek
847a2a1716 Add buffer corruption protection (RedZone) for kernel's malloc(9).
It detects both: buffer underflows and buffer overflows bugs at runtime
(on free(9) and realloc(9)) and prints backtraces from where memory was
allocated and from where it was freed.

Tested by:	kris
2006-01-31 11:09:21 +00:00
Gleb Smirnoff
75ee267c22 Merge the //depot/user/yar/vlan branch into CVS. It contains some collective
work by yar, thompsa and myself. The checksum offloading part also involves
work done by Mihail Balikov.

The most important changes:

o   Instead of global linked list of all vlan softc use a per-trunk
  hash. The size of hash is dynamically adjusted, depending on
  number of entries. This changes struct ifnet, replacing counter
  of vlans with a pointer to trunk structure. This change is an
  improvement for setups with big number of VLANs, several interfaces
  and several CPUs. It is a small regression for a setup with a single
  VLAN interface.
    An alternative to dynamic hash is a per-trunk static array with
  4096 entries, which is a compile time option - VLAN_ARRAY. In my
  experiments the array is not an improvement, probably because such
  a big trunk structure doesn't fit into CPU cache.
o   Introduce an UMA zone for VLAN tags. Since drivers depend on it,
  the zone is declared in kern_mbuf.c, not in optional vlan(4) driver.
  This change is a big improvement for any setup utilizing vlan(4).
o   Use rwlock(9) instead of mutex(9) for locking. We are the first
  ones to do this! :)
o   Some drivers can do hardware VLAN tagging + hardware checksum
  offloading. Add an infrastructure for this. Whenever vlan(4) is
  attached to a parent or parent configuration is changed, the flags
  on vlan(4) interface are updated.

In collaboration with:	yar, thompsa
In collaboration with:	Mihail Balikov <mihail.balikov interbgc.com>
2006-01-30 13:45:15 +00:00
John Baldwin
3f08bd8bce Add a basic reader/writer lock implementation to the kernel. This
implementation is by no means perfect as far as some of the algorithms
that it uses and the fact that it is missing some functionality (try
locks and upgrades/downgrades are not there yet), however it does seem
to work in my local testing.  There is more detail in the comments in the
code, but the short version follows.

A reader/writer lock is very much like a regular mutex: it cannot be held
across a voluntary sleep; it can be acquired in an interrupt thread; if
the lock is held by a writer then the priority of any threads that block
on the lock will be lent to the owner; the simple case lock operations all
are done in a single atomic op.  It also shares some similiarities
with sx locks: it supports reader/writer semantics (multiple readers,
but single writers); readers are allowed to recurse, but writers are not.

We can extend this implementation further by either improving algorithms
or adding new functionality, but this should at least give us a base to
work with now.

Reviewed by:	arch (in theory)
Tested on:	i386 (4 cpu box with a kernel module that used 4 threads
		that randomly chose between read locks and write locks
		that ran w/o panicing for over a day solid.  It usually
		panic'd within a few seconds when there were bugs during
		testing. :)  The kernel module source is available on
		request.)
2006-01-27 23:13:26 +00:00
Poul-Henning Kamp
d3e64681d6 Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43)
to COMPAT_43TTY.

Add COMPAT_43TTY to NOTES and */conf/GENERIC

Compile tty_compat.c only under the new option.

Spit out
	#warning "Old BSD tty API used, please upgrade."
if ioctl_compat.h gets #included from userland.
2006-01-10 09:19:10 +00:00
Warner Losh
5c65ae3a88 New option: NO_FFS_SNAPSHOT. I did this in p4 about the same time
that NetBSD implemented it independently of them (don't know which one
was actually first).  This saves about 24k for those times you don't
need snapshot support (like when running off a ram disk, or in an
embedded environment where size matters).
2006-01-06 04:44:09 +00:00
Alexander Leidinger
ef39c05baa MI changes:
- provide an interface (macros) to the page coloring part of the VM system,
   this allows to try different coloring algorithms without the need to
   touch every file [1]
 - make the page queue tuning values readable: sysctl vm.stats.pagequeue
 - autotuning of the page coloring values based upon the cache size instead
   of options in the kernel config (disabling of the page coloring as a
   kernel option is still possible)

MD changes:
 - detection of the cache size: only IA32 and AMD64 (untested) contains
   cache size detection code, every other arch just comes with a dummy
   function (this results in the use of default values like it was the
   case without the autotuning of the page coloring)
 - print some more info on Intel CPU's (like we do on AMD and Transmeta
   CPU's)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by:	Chad David <davidc@acns.ab.ca> [1]
Reviewed by:		alc, arch (in 2004)
Discussed with:		alc, Chad David, arch (in 2004)
2005-12-31 14:39:20 +00:00
Ruslan Ermilov
e9484e32a1 Remove all redundant option file names that don't hurt readability. 2005-12-12 10:15:11 +00:00
Craig Rodrigues
e1fd210e51 Hook XFS into kernel build. 2005-12-12 01:14:59 +00:00
David Xu
0e263b06af Add option P1003_1B_MQUEUE. 2005-12-03 01:40:38 +00:00
John Baldwin
0c43612a35 Sort. 2005-11-23 18:11:24 +00:00
John Baldwin
021eda1d85 Remove the sx(4) driver at the request of the author. The author
originally wrote it for 4.x and hasn't really had the time to fully update
it to 5.x and later.  Also, the author doesn't use the hardware anymore as
well.  If someone does need this driver they can always resurrect it from
the Attic.

Requested by:	Frank Mayhar frank at exit dot com
2005-10-14 18:24:58 +00:00
Gleb Smirnoff
f0796cd26c - Don't pollute opt_global.h with DEVICE_POLLING and introduce
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
  can be compiled as loadable modules.

Reviewed by:	bde
2005-10-05 10:09:17 +00:00
Max Laier
b6de9e91bd Remove bridge(4) from the tree. if_bridge(4) is a full functional
replacement and has additional features which make it superior.

Discussed on:	-arch
Reviewed by:	thompsa
X-MFC-after:	never (RELENG_6 as transition period)
2005-09-27 18:10:43 +00:00
Warner Losh
d1d8e770db No ED_NO_MIIBUS no more. Not one more or the same number of non positive options 2005-09-18 20:53:53 +00:00
Pawel Jakub Dawidek
5ca1fcfe06 Connect GEOM_ELI class to the build.
MFC after:	1 week
2005-07-27 21:47:55 +00:00
Pawel Jakub Dawidek
869de95743 Connect GZERO to the build.
MFC after:	3 days
2005-07-25 10:49:05 +00:00
Takanori Watanabe
f35f3346a2 Add options for sl811.
Pointed out by: nyan
2005-07-15 05:12:49 +00:00
Peter Wemm
b107fcf877 Add COMPAT_FREEBSD5
Approved by:	re
2005-06-30 00:09:18 +00:00
Doug White
ff50922b16 Backout the change I made before 5.4-R since I wasn't aware that it was only
a problem with one particular switch module.  Create a kernel option
BGE_FAKE_AUTONEG that restores the 5.4 behavior, which should make the DNLK
switch module work. IBM/Intel blades with Intel or AD switch modules should
work without patching or kernel options with this commit.

Hardware for testing provided by several folks, including
Danny Braniss <danny@cs.huji.ac.il>, Achim Patzner <ap@bnc.net>,
and OffMyServer.

Approved by: re
2005-06-24 21:43:47 +00:00
Peter Wemm
4da0d332f4 Move HWPMC_HOOKS into its own opt_hwpmc_hooks.h file. It doesn't merit
being in opt_global.h and forcing a global recompile when only a few files
reference it.

Approved by:  re
2005-06-24 00:16:57 +00:00
Jean-Sébastien Pédron
fe98fb3210 Connect reiserfs build to every platforms, not only i386 and pc98.
Reviewed by:	mux (mentor)
Approved by:	re (dougb)
2005-06-21 10:17:55 +00:00
Gleb Smirnoff
e9110049aa Attach ng_tcpmss to the build. 2005-06-10 08:05:13 +00:00
Stephan Uphoff
6097174e4d Add IPI support for preempting a thread on another CPU.
MFC after:	3 weeks
2005-06-09 18:23:54 +00:00
Gleb Smirnoff
73e872668d Make NETGRAPH_DEBUG a kernel option, so that it can't be turned off
without hacking source.

In collaboration with:	ru, julian
2005-05-16 08:25:55 +00:00
Yoshihiro Takahashi
164e09ddb4 - Move the NPX_DEBUG option to options.{i386,pc98} and use opt_npx.h.
- Move npx related defines to {i386,pc98}/include/npx.h to remove #include
  {isa,cbus}.h.
2005-05-12 12:47:41 +00:00
Gleb Smirnoff
c4c9b52b87 ng_nat - a netgraph(4) node, which does NAT 2005-05-05 23:41:21 +00:00
Gleb Smirnoff
376ea9726b libalias is now buildable as kernel module 2005-05-05 22:43:04 +00:00
Darren Reed
0c3757df94 Patches from Ruslan Ermilov to address problems compiling LINT 2005-04-28 16:33:15 +00:00
Joseph Koshy
ebccf1e3a6 Bring a working snapshot of hwpmc(4), its associated libraries, userland utilities
and documentation into -CURRENT.

Bump FreeBSD_version.

Reviewed by:	alc, jhb (kernel changes)
2005-04-19 04:01:25 +00:00
Jeff Roberson
dfe614392f - Move LOOKUP_SHARED from opt_global.h to opt_vfs.h so we don't have
to recompile the whole kernel if we change it.
2005-04-03 23:49:13 +00:00
Søren Schmidt
8ca4df3299 This is the much rumoured ATA mkIII update that I've been working on.
o       ATA is now fully newbus'd and split into modules.
        This means that on a modern system you just load "atapci and ata"
        to get the base support, and then one or more of the device
        subdrivers "atadisk atapicd atapifd atapist ataraid".
        All can be loaded/unloaded anytime, but for obvious reasons you
        dont want to unload atadisk when you have mounted filesystems.

o       The device identify part of the probe has been rewritten to fix
        the problems with odd devices the old had, and to try to remove
        so of the long delays some HW could provoke. Also probing is done
	without the need for interrupts, making earlier probing possible.

o       SATA devices can be hot inserted/removed and devices will be created/
        removed in /dev accordingly.
	NOTE: only supported on controllers that has this feature:
	Promise and Silicon Image for now.
	On other controllers the usual atacontrol detach/attach dance is
	still needed.

o	Support for "atomic" composite ATA requests used for RAID.

o       ATA RAID support has been rewritten and and now supports these
        metadata formats:
                 "Adaptec HostRAID"
                 "Highpoint V2 RocketRAID"
                 "Highpoint V3 RocketRAID"
                 "Intel MatrixRAID"
                 "Integrated Technology Express"
                 "LSILogic V2 MegaRAID"
                 "LSILogic V3 MegaRAID"
                 "Promise FastTrak"
                 "Silicon Image Medley"
		 "FreeBSD PseudoRAID"

o       Update the ioctl API to match new RAID levels etc.

o       Update atacontrol to know about the new RAID levels etc
        NOTE: you need to recompile atacontrol with the new sys/ata.h,
        make world will take care of that.
	NOTE2: that rebuild is done differently from the old system as
	the rebuild is now done piggybacked on read requests to the
	array, so atacontrol simply starts a background "dd" to rebuild
	the array.

o       The reinit code has been worked over to be much more robust.

o       The timeout code has been overhauled for races.

o	Support of new chipsets.

o       Lots of fixes for bugs found while doing the modulerization and
        reviewing the old code.

Missing or changed features from current ATA:

o       atapi-cd no longer has support for ATAPI changers. Todays its
        much cheaper and alot faster to copy those CD images to disk
        and serve them from there. Besides they dont seem to be made
        anymore, maybe for that exact reason.

o       ATA RAID can only read metadata from all the above metadata formats,
	not write all of them (Promise and Highpoint V2 so far). This means
	that arrays can be picked up from the BIOS, but they cannot be
	created from FreeBSD. There is more to it than just the missing
	write metadata support, those formats are not unique to a given
	controller like Promise and Highpoint formats, instead they exist
	for several types, and even worse, some controllers can have
	different formats and its impossible to tell which one.
	The outcome is that we cannot reliably create the metadata of those
	formats and be sure the controller BIOS will understand it.
	However write support is needed to update/fail/rebuild the arrays
	properly so it sits fairly high on the TODO list.

o       So far atapicam is not supported with these changes. When/if this
	will change is up to the maintainer of atapi-cam so go there for
	questions.

HW donated by:  Webveveriet AS
HW donated by:  Frode Nordahl
HW donated by:  Yahoo!
HW donated by:  Sentex
Patience by:	Vife and my boys (and even the cats)
2005-03-30 12:03:40 +00:00
Dag-Erling Smørgrav
bcc1205c89 Add PSEUDOFS_TRACE option. 2005-03-14 16:04:27 +00:00
Maxim Konovalov
8cbe387216 o s/opt_ifpw.h/opt_ipfw.h/ in the previous commit.
Submitted by:	YONETANI Tomokazu
2005-03-06 11:22:49 +00:00
Andre Oppermann
099dd0430b Bring back the full packet destination manipulation for 'ipfw fwd'
with the kernel compile time option:

 options IPFIREWALL_FORWARD_EXTENDED

This option has to be specified in addition to IPFIRWALL_FORWARD.

With this option even packets targeted for an IP address local
to the host can be redirected.  All restrictions to ensure proper
behaviour for locally generated packets are turned off.  Firewall
rules have to be carefully crafted to make sure that things like
PMTU discovery do not break.

Document the two kernel options.

PR:		kern/71910
PR:		kern/73129
MFC after:	1 week
2005-02-22 17:40:40 +00:00
Gleb Smirnoff
a97719482d Add CARP (Common Address Redundancy Protocol), which allows multiple
hosts to share an IP address, providing high availability and load
balancing.

Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.

FreeBSD port done solely by Max Laier.

Patch by:	mlaier
Obtained from:	OpenBSD (mickey, mcbride)
2005-02-22 13:04:05 +00:00
Warner Losh
969eaf2179 Break out obscure ISA cards into their own files, as well as ne2000
and wd80x3 support.  Make the obscure ISA cards optional, and add
those options to NOTES on i386 (note: the ifdef around the whole code
is for module building).  Tweak pc98 ed support to include wd80x3 too.
Add goo for alpha too.

The affected cards are the 3Com 3C503, HP LAN+ and SIC (whatever that
is).  I couldn't find any of these for sale on ebay, so they are
untested.  If you have one of these cards, and send it to me, I'll
ensure that you have no future problems with it...

Minor cleanups as well by using functions rather than cut and paste
code for some probing operations (where the function call overhead is
lost in the noise).

Remove use of kvtop, since they aren't required anymore.  This driver
needs to get its memory mapped act together, however, and use bus
space.  It doesn't right now.

This reduces the size of if_ed.ko from about 51k to 33k on my laptop.
2005-02-09 20:03:40 +00:00
Gleb Smirnoff
f2a7ef4e00 Hook up ng_ipfw to kernel build. 2005-02-05 12:15:56 +00:00
Bosko Milekic
e4eb384b47 Bring in MemGuard, a very simple and small replacement allocator
designed to help detect tamper-after-free scenarios, a problem more
and more common and likely with multithreaded kernels where race
conditions are more prevalent.

Currently MemGuard can only take over malloc()/realloc()/free() for
particular (a) malloc type(s) and the code brought in with this
change manually instruments it to take over M_SUBPROC allocations
as an example.  If you are planning to use it, for now you must:

	1) Put "options DEBUG_MEMGUARD" in your kernel config.
	2) Edit src/sys/kern/kern_malloc.c manually, look for
	   "XXX CHANGEME" and replace the M_SUBPROC comparison with
	   the appropriate malloc type (this might require additional
	   but small/simple code modification if, say, the malloc type
	   is declared out of scope).
	3) Build and install your kernel.  Tune vm.memguard_divisor
	   boot-time tunable which is used to scale how much of kmem_map
	   you want to allott for MemGuard's use.  The default is 10,
	   so kmem_size/10.

ToDo:
	1) Bring in a memguard(9) man page.
	2) Better instrumentation (e.g., boot-time) of MemGuard taking
	   over malloc types.
	3) Teach UMA about MemGuard to allow MemGuard to override zone
	   allocations too.
	4) Improve MemGuard if necessary.

This work is partly based on some old patches from Ian Dowse.
2005-01-21 18:09:17 +00:00
Pawel Jakub Dawidek
560cb85703 Connect SHSEC GEOM class to the build. 2005-01-11 18:18:40 +00:00
Sam Leffler
d487b9ed20 update for new ath hal 2004-12-08 18:20:53 +00:00
Peter Wemm
b7c3f3a9d5 Catch a few more autofs references.
Submitted by:  obrien
2004-11-12 19:44:30 +00:00
Robert Watson
df970488b3 Move the 'debug' sysctl tree under options SYSCTL_DEBUG. It generates
an inordinate amount of synchronous console output that is fairly
undesirable on slower serial console.  It's easily hit by accident
when frobbing other sysctls late at night.
2004-10-27 19:26:01 +00:00
Poul-Henning Kamp
08d0c00b91 Per recent HEADSUP: Disconnect (old)vinum from the kernel build.
Users should move to the new geom_vinum implementation instead.

The refcount logic which is being added to devices to enable safe module
unloading and the buf/vm work also in progress would require a major rework
of the (old)-vinum code to comply with the new semantics.

The actual source files will not be removed until I have coordinated with
the geomvinum people if they need any bits repo-copied etc.
2004-09-23 08:34:50 +00:00
Gleb Smirnoff
cec50dea12 Attach ng_netflow to kernel build.
Approved by:	julian (mentor)
2004-09-16 20:35:28 +00:00
Alfred Perlstein
0793d4d1e4 Hook autofs to the build. 2004-09-02 20:44:56 +00:00
Scott Long
9923b511ed Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
Brooks Davis
b443062227 General modernization of coda:
- Ditch NVCODA
 - Don't use a static major
 - Don't declare functions extern

Reviewed by:	peter
2004-09-01 01:19:52 +00:00
Peter Wemm
f37a929ca1 Kill count device support from config. I've changed the last few
remaining consumers to have the count passed as an option.  This is
i4b, pc98/wdc, and coda.

Bump configvers.h from 500013 to 600000.

Remove heuristics that tried to parse "device ed5" as 5 units of the ed
device.  This broke things like the snd_emu10k1 device, which required
quotes to make it parse right.  The no-longer-needed quotes have been
removed from NOTES, GENERIC etc.  eg, I've removed the quotes from:
   device  snd_maestro
   device  "snd_maestro3"
   device  snd_mss

I believe everything will still compile and work after this.
2004-08-30 23:03:58 +00:00
Dag-Erling Smørgrav
0eac4495db Remove the HW_WDOG option; it serves no purpose.
MFC after:	3 days
2004-08-29 11:10:09 +00:00
Robert Watson
1d8cd39e71 Change the default disposition of debug.mpsafenet from 0 to 1, which
will cause the network stack to operate without the Giant lock by
default.  This change has the potential to improve performance by
increasing parallelism and decreasing latency in network processing.

Due to the potential exposure of existing or new bugs, the following
compatibility functionality is maintained:

- It is still possible to disable Giant-free operation by setting
  debug.mpsafenet to 0 in loader.conf.

- Add "options NET_WITH_GIANT", which will restore the default value of
  debug.mpsafenet to 0, and is intended for use on systems compiled with
  known unsafe components, or where a more conservative configuration is
  desired.

- Add a new declaration, NET_NEEDS_GIANT("componentname"), which permits
  kernel components to declare dependence on Giant over the network
  stack.  If the declaration is made by a preloaded module or a compiled
  in component, the disposition of debug.mpsafenet will be set to 0 and
  a warning concerning performance degraded operation printed to the
  console.  If it is declared by a loadable kernel module after boot, a
  warning is displayed but the disposition cannot be changed.  This is
  implemented by defining a new SYSINIT() value, SI_SUB_SETTINGS, which
  is intended for the processing of configuration choices after tunables
  are read in and the console is available to generate errors, but
  before much else gets going.

This compatibility behavior will go away when we've finished the last
of the locking work and are confident that operation is correct.
2004-08-28 15:11:13 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
John-Mark Gurney
000968010a add options MPROF_BUFFERS and MPROF_HASH_SIZE that adjust the sizes of
the mutex profiling buffers.  Document them in the man page and in NOTES.
Ensure _HASH_SIZE is larger than _BUFFERS with a cpp error.
2004-08-19 06:38:26 +00:00
Andre Oppermann
9b932e9e04 Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland
and preserves the ipfw ABI.  The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

 In general ipfw is now called through the PFIL_HOOKS and most associated
 magic, that was in ip_input() or ip_output() previously, is now done in
 ipfw_check_[in|out]() in the ipfw PFIL handler.

 IPDIVERT is entirely handled within the ipfw PFIL handlers.  A packet to
 be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
 reassembly.  If not, or all fragments arrived and the packet is complete,
 divert_packet is called directly.  For 'tee' no reassembly attempt is made
 and a copy of the packet is sent to the divert socket unmodified.  The
 original packet continues its way through ip_input/output().

 ipfw 'forward' is done via m_tag's.  The ipfw PFIL handlers tag the packet
 with the new destination sockaddr_in.  A check if the new destination is a
 local IP address is made and the m_flags are set appropriately.  ip_input()
 and ip_output() have some more work to do here.  For ip_input() the m_flags
 are checked and a packet for us is directly sent to the 'ours' section for
 further processing.  Destination changes on the input path are only tagged
 and the 'srcrt' flag to ip_forward() is set to disable destination checks
 and ICMP replies at this stage.  The tag is going to be handled on output.
 ip_output() again checks for m_flags and the 'ours' tag.  If found, the
 packet will be dropped back to the IP netisr where it is going to be picked
 up by ip_input() again and the directly sent to the 'ours' section.  When
 only the destination changes, the route's 'dst' is overwritten with the
 new destination from the forward m_tag.  Then it jumps back at the route
 lookup again and skips the firewall check because it has been marked with
 M_SKIP_FIREWALL.  ipfw 'forward' has to be compiled into the kernel with
 'option IPFIREWALL_FORWARD' to enable it.

 DUMMYNET is entirely handled within the ipfw PFIL handlers.  A packet for
 a dummynet pipe or queue is directly sent to dummynet_io().  Dummynet will
 then inject it back into ip_input/ip_output() after it has served its time.
 Dummynet packets are tagged and will continue from the next rule when they
 hit the ipfw PFIL handlers again after re-injection.

 BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
 they did before.  Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

 conf/files
	Add netinet/ip_fw_pfil.c.

 conf/options
	Add IPFIREWALL_FORWARD option.

 modules/ipfw/Makefile
	Add ip_fw_pfil.c.

 net/bridge.c
	Disable PFIL_HOOKS if ipfw for bridging is active.  Bridging ipfw
	is still directly invoked to handle layer2 headers and packets would
	get a double ipfw when run through PFIL_HOOKS as well.

 netinet/ip_divert.c
	Removed divert_clone() function.  It is no longer used.

 netinet/ip_dummynet.[ch]
	Neither the route 'ro' nor the destination 'dst' need to be stored
	while in dummynet transit.  Structure members and associated macros
	are removed.

 netinet/ip_fastfwd.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_fw.h
	Removed 'ro' and 'dst' from struct ip_fw_args.

 netinet/ip_fw2.c
	(Re)moved some global variables and the module handling.

 netinet/ip_fw_pfil.c
	New file containing the ipfw PFIL handlers and module initialization.

 netinet/ip_input.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.  ip_forward() does not longer require
	the 'next_hop' struct sockaddr_in argument.  Disable early checks
	if 'srcrt' is set.

 netinet/ip_output.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_var.h
	Add ip_reass() as general function.  (Used from ipfw PFIL handlers
	for IPDIVERT.)

 netinet/raw_ip.c
	Directly check if ipfw and dummynet control pointers are active.

 netinet/tcp_input.c
	Rework the 'ipfw forward' to local code to work with the new way of
	forward tags.

 netinet/tcp_sack.c
	Remove include 'opt_ipfw.h' which is not needed here.

 sys/mbuf.h
	Remove m_claim_next() macro which was exclusively for ipfw 'forward'
	and is no longer needed.

Approved by:	re (scottl)
2004-08-17 22:05:54 +00:00
Pawel Jakub Dawidek
e81856c34c Connect RAID3 GEOM class to the build. 2004-08-16 06:36:21 +00:00
David Malone
1f44b0a1b5 Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD
have already done this, so I have styled the patch on their work:

        1) introduce a ip_newid() static inline function that checks
        the sysctl and then decides if it should return a sequential
        or random IP ID.

        2) named the sysctl net.inet.ip.random_id

        3) IPv6 flow IDs and fragment IDs are now always random.
        Flow IDs and frag IDs are significantly less common in the
        IPv6 world (ie. rarely generated per-packet), so there should
        be smaller performance concerns.

The sysctl defaults to 0 (sequential IP IDs).

Reviewed by:	andre, silby, mlaier, ume
Based on:	NetBSD
MFC after:	2 months
2004-08-14 15:32:40 +00:00
Max Khon
75261008d7 Add geom_uzip -- geom class that implements read-only compressed disks.
Currently supports cloop V2.0 disk compression format.
May support more formats in future.
2004-08-13 09:40:58 +00:00
Hartmut Brandt
a7e2239469 Allow the ATM call control module to be built into the kernel. 2004-08-12 15:01:59 +00:00
Pawel Jakub Dawidek
8a8fbaca32 Connect GEOM_MIRROR class to the build. 2004-07-30 23:18:53 +00:00
Pawel Jakub Dawidek
bd58d1d222 Remove the old geom_mirror class.
Approved by:	phk
2004-07-30 22:50:21 +00:00
Robert Watson
a9abdce44a Add "options ADAPTIVE_GIANT" which causes Giant to also be treated in
an adaptive fashion when adaptive mutexes are enabled.  The theory
behind non-adaptive Giant is that Giant will be held for long periods
of time, and therefore spinning waiting on it is wasteful.  However,
in MySQL benchmarks which are relatively Giant-free, running Giant
adaptive makes an observable difference on SMP (5% transaction rate
improvement).  As such, make adaptive behavior on Giant an option so
it can be more widely benchmarked.
2004-07-27 16:34:48 +00:00
Gleb Smirnoff
31578ac8ae Add ng_device(4) to LINT.
Reviewed by:	marks
Approved by:	julian (mentor)
2004-07-20 12:42:54 +00:00
Alexander Kabaev
8024a9da08 Unbreak kernel compiles by preserving an old opt_adaptive_mutexes.h file
name.
2004-07-18 18:21:39 +00:00
Scott Long
701f140800 Enable ADAPTIVE_MUTEXES by default by changing the sense of the option to
NO_ADAPTIVE_MUTEXES.  This option has been enabled by default on amd64 for
quite some time, and has been extensively tested on i386 and sparc64.  It
shows measurable performance gains in many circumstances, and few negative
effects.  It would be nice in t he future if adaptive mutexes actually went
to sleep after a certain amount of spinning, but that will require quite a
bit more testing.
2004-07-18 15:59:03 +00:00
Marcel Moolenaar
e2ee21738f Update for the KDB framework:
o  Rename WITNESS_DDB to WITNESS_KDB. In the new world order KDB is the
   acronym to use for debugging related code. The DDB option is used
   to enable the DDB debugger backend only.
o  Likewise, rename DDB_TRACE to KDB_TRACE, rename DDB_UNATTENDED to
   KDB_UNATTENDED and rename SC_HISTORY_DDBKEY to SC_HISTORY_KDBKEY.
o  Remove DDB_NOKLDSYM. The new DDB backend supports pre-linker symbol
   lookups as well as KLD symbol lookups at the same time.
o  Remove GDB_REMOTE_CHAT. The GDB protocol hacks to allow this are
   FreeBSD specific. At the same time, the GDB protocol has packets
   for console output.
2004-07-11 01:44:07 +00:00
Marcel Moolenaar
0aefe3632e Add new options for the KDB framework. This commit merely adds them and
in particular not without removing the options they replace or in the
proper location in this file. The purpose of this commit is to make it
possible to commit changes in parts without causing massive build
breakages. At least, that's the intend. I have no idea if it actually
works out as I hope...
2004-07-10 19:34:06 +00:00
Brian Somers
0ac4013324 Change the following environment variables to kernel options:
bootp -> BOOTP
    bootp.nfsroot -> BOOTP_NFSROOT
    bootp.nfsv3 -> BOOTP_NFSV3
    bootp.compat -> BOOTP_COMPAT
    bootp.wired_to -> BOOTP_WIRED_TO

- i.e. back out the previous commit.  It's already possible to
pxeboot(8) with a GENERIC kernel.

Pointed out by: dwmalone
2004-07-08 22:35:36 +00:00
Brian Somers
59e1ebc9b5 Change the following kernel options to environment variables:
BOOTP -> bootp
    BOOTP_NFSROOT -> bootp.nfsroot
    BOOTP_NFSV3 -> bootp.nfsv3
    BOOTP_COMPAT -> bootp.compat
    BOOTP_WIRED_TO -> bootp.wired_to

This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:

    bootp="YES"
    bootp.nfsroot="YES"
    bootp.nfsv3="YES"
    bootp.wired_to="bge1"

or even setting the variables manually from the OK prompt.
2004-07-08 13:40:33 +00:00
Tim J. Robbins
3bc482ec1c By popular request, add a workaround that allows large (>128GB or so)
FAT32 filesystems to be mounted, subject to some fairly serious limitations.

This works by extending the internal pseudo-inode-numbers generated from
the file's starting cluster number to 64-bits, then creating a table
mapping these into arbitrary 32-bit inode numbers, which can fit in
struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings
do not persist across unmounts or reboots, so it's not possible to export
these filesystems through NFS. The mapping table may grow to be rather
large, and may grow large enough to exhaust kernel memory on filesystems
with millions of files.

Don't enable this option unless you understand the consequences.
2004-07-03 13:22:38 +00:00
John Baldwin
0c0b25ae91 Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switch to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
Pawel Jakub Dawidek
e1237b285b Introduce GEOM_LABEL class.
This class is used for detecting volume labels on file systems:
UFS, MSDOSFS (FAT12, FAT16, FAT32) and ISO9660.
It also provide native labelization (there is no need for file system).

g_label_ufs.c is based on geom_vol_ffs from Gordon Tetlow.
g_label_msdos.c and g_label_iso9660.c are probably hacks, I just found
where volume labels are stored and I use those offsets here,
but with this class it should be easy to do it as it should be done by
someone who know how.
Implementing volume labels detection for other file systems also should
be trivial.

New providers are created in those directories:
/dev/ufs/ (UFS1, UFS2)
/dev/msdosfs/ (FAT12, FAT16, FAT32)
/dev/iso9660/ (ISO9660)
/dev/label/ (native labels, configured with glabel(8))

Manual page cleanups and some comments inside were submitted by
Simon L. Nielsen, who was, as always, very helpful. Thanks!
2004-07-02 19:40:36 +00:00
John Baldwin
ef0ebfc351 Add two new kernel options to allow rudimentary profiling of the internal
hash tables used in the sleep queue and turnstile code.  Each option adds
a sysctl tree under debug containing the maximum depth of any bucket in
the hash table as well as a separate node for each bucket (or chain)
containing the current depth and maximum depth for that bucket.
2004-06-29 02:30:12 +00:00
Robert Watson
d07af9d904 Add options NETGRAPH_FEC to hook up ng_fec.c to the LINT build. 2004-06-27 02:36:33 +00:00