Commit Graph

576 Commits

Author SHA1 Message Date
Robert Watson
07881ef960 Add 'options AUDIT' and associate various .c files with the AUDIT
option.  We always build audit_syscalls.c so that the system call
stubs can return ENOSYS rather than the system call code
generating SIGSYS for the system calls.  We are not yet ready to
add AUDIT to LINT, as the prototypes for system call arguments
won't be there until after the system calls for audit are added.

Much work from:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-01 21:00:16 +00:00
Pawel Jakub Dawidek
847a2a1716 Add buffer corruption protection (RedZone) for kernel's malloc(9).
It detects both: buffer underflows and buffer overflows bugs at runtime
(on free(9) and realloc(9)) and prints backtraces from where memory was
allocated and from where it was freed.

Tested by:	kris
2006-01-31 11:09:21 +00:00
Gleb Smirnoff
75ee267c22 Merge the //depot/user/yar/vlan branch into CVS. It contains some collective
work by yar, thompsa and myself. The checksum offloading part also involves
work done by Mihail Balikov.

The most important changes:

o   Instead of global linked list of all vlan softc use a per-trunk
  hash. The size of hash is dynamically adjusted, depending on
  number of entries. This changes struct ifnet, replacing counter
  of vlans with a pointer to trunk structure. This change is an
  improvement for setups with big number of VLANs, several interfaces
  and several CPUs. It is a small regression for a setup with a single
  VLAN interface.
    An alternative to dynamic hash is a per-trunk static array with
  4096 entries, which is a compile time option - VLAN_ARRAY. In my
  experiments the array is not an improvement, probably because such
  a big trunk structure doesn't fit into CPU cache.
o   Introduce an UMA zone for VLAN tags. Since drivers depend on it,
  the zone is declared in kern_mbuf.c, not in optional vlan(4) driver.
  This change is a big improvement for any setup utilizing vlan(4).
o   Use rwlock(9) instead of mutex(9) for locking. We are the first
  ones to do this! :)
o   Some drivers can do hardware VLAN tagging + hardware checksum
  offloading. Add an infrastructure for this. Whenever vlan(4) is
  attached to a parent or parent configuration is changed, the flags
  on vlan(4) interface are updated.

In collaboration with:	yar, thompsa
In collaboration with:	Mihail Balikov <mihail.balikov interbgc.com>
2006-01-30 13:45:15 +00:00
John Baldwin
3f08bd8bce Add a basic reader/writer lock implementation to the kernel. This
implementation is by no means perfect as far as some of the algorithms
that it uses and the fact that it is missing some functionality (try
locks and upgrades/downgrades are not there yet), however it does seem
to work in my local testing.  There is more detail in the comments in the
code, but the short version follows.

A reader/writer lock is very much like a regular mutex: it cannot be held
across a voluntary sleep; it can be acquired in an interrupt thread; if
the lock is held by a writer then the priority of any threads that block
on the lock will be lent to the owner; the simple case lock operations all
are done in a single atomic op.  It also shares some similiarities
with sx locks: it supports reader/writer semantics (multiple readers,
but single writers); readers are allowed to recurse, but writers are not.

We can extend this implementation further by either improving algorithms
or adding new functionality, but this should at least give us a base to
work with now.

Reviewed by:	arch (in theory)
Tested on:	i386 (4 cpu box with a kernel module that used 4 threads
		that randomly chose between read locks and write locks
		that ran w/o panicing for over a day solid.  It usually
		panic'd within a few seconds when there were bugs during
		testing. :)  The kernel module source is available on
		request.)
2006-01-27 23:13:26 +00:00
Poul-Henning Kamp
d3e64681d6 Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43)
to COMPAT_43TTY.

Add COMPAT_43TTY to NOTES and */conf/GENERIC

Compile tty_compat.c only under the new option.

Spit out
	#warning "Old BSD tty API used, please upgrade."
if ioctl_compat.h gets #included from userland.
2006-01-10 09:19:10 +00:00
Warner Losh
5c65ae3a88 New option: NO_FFS_SNAPSHOT. I did this in p4 about the same time
that NetBSD implemented it independently of them (don't know which one
was actually first).  This saves about 24k for those times you don't
need snapshot support (like when running off a ram disk, or in an
embedded environment where size matters).
2006-01-06 04:44:09 +00:00
Alexander Leidinger
ef39c05baa MI changes:
- provide an interface (macros) to the page coloring part of the VM system,
   this allows to try different coloring algorithms without the need to
   touch every file [1]
 - make the page queue tuning values readable: sysctl vm.stats.pagequeue
 - autotuning of the page coloring values based upon the cache size instead
   of options in the kernel config (disabling of the page coloring as a
   kernel option is still possible)

MD changes:
 - detection of the cache size: only IA32 and AMD64 (untested) contains
   cache size detection code, every other arch just comes with a dummy
   function (this results in the use of default values like it was the
   case without the autotuning of the page coloring)
 - print some more info on Intel CPU's (like we do on AMD and Transmeta
   CPU's)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by:	Chad David <davidc@acns.ab.ca> [1]
Reviewed by:		alc, arch (in 2004)
Discussed with:		alc, Chad David, arch (in 2004)
2005-12-31 14:39:20 +00:00
Ruslan Ermilov
e9484e32a1 Remove all redundant option file names that don't hurt readability. 2005-12-12 10:15:11 +00:00
Craig Rodrigues
e1fd210e51 Hook XFS into kernel build. 2005-12-12 01:14:59 +00:00
David Xu
0e263b06af Add option P1003_1B_MQUEUE. 2005-12-03 01:40:38 +00:00
John Baldwin
0c43612a35 Sort. 2005-11-23 18:11:24 +00:00
John Baldwin
021eda1d85 Remove the sx(4) driver at the request of the author. The author
originally wrote it for 4.x and hasn't really had the time to fully update
it to 5.x and later.  Also, the author doesn't use the hardware anymore as
well.  If someone does need this driver they can always resurrect it from
the Attic.

Requested by:	Frank Mayhar frank at exit dot com
2005-10-14 18:24:58 +00:00
Gleb Smirnoff
f0796cd26c - Don't pollute opt_global.h with DEVICE_POLLING and introduce
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
  can be compiled as loadable modules.

Reviewed by:	bde
2005-10-05 10:09:17 +00:00
Max Laier
b6de9e91bd Remove bridge(4) from the tree. if_bridge(4) is a full functional
replacement and has additional features which make it superior.

Discussed on:	-arch
Reviewed by:	thompsa
X-MFC-after:	never (RELENG_6 as transition period)
2005-09-27 18:10:43 +00:00
Warner Losh
d1d8e770db No ED_NO_MIIBUS no more. Not one more or the same number of non positive options 2005-09-18 20:53:53 +00:00
Pawel Jakub Dawidek
5ca1fcfe06 Connect GEOM_ELI class to the build.
MFC after:	1 week
2005-07-27 21:47:55 +00:00
Pawel Jakub Dawidek
869de95743 Connect GZERO to the build.
MFC after:	3 days
2005-07-25 10:49:05 +00:00
Takanori Watanabe
f35f3346a2 Add options for sl811.
Pointed out by: nyan
2005-07-15 05:12:49 +00:00
Peter Wemm
b107fcf877 Add COMPAT_FREEBSD5
Approved by:	re
2005-06-30 00:09:18 +00:00
Doug White
ff50922b16 Backout the change I made before 5.4-R since I wasn't aware that it was only
a problem with one particular switch module.  Create a kernel option
BGE_FAKE_AUTONEG that restores the 5.4 behavior, which should make the DNLK
switch module work. IBM/Intel blades with Intel or AD switch modules should
work without patching or kernel options with this commit.

Hardware for testing provided by several folks, including
Danny Braniss <danny@cs.huji.ac.il>, Achim Patzner <ap@bnc.net>,
and OffMyServer.

Approved by: re
2005-06-24 21:43:47 +00:00
Peter Wemm
4da0d332f4 Move HWPMC_HOOKS into its own opt_hwpmc_hooks.h file. It doesn't merit
being in opt_global.h and forcing a global recompile when only a few files
reference it.

Approved by:  re
2005-06-24 00:16:57 +00:00
Jean-Sébastien Pédron
fe98fb3210 Connect reiserfs build to every platforms, not only i386 and pc98.
Reviewed by:	mux (mentor)
Approved by:	re (dougb)
2005-06-21 10:17:55 +00:00
Gleb Smirnoff
e9110049aa Attach ng_tcpmss to the build. 2005-06-10 08:05:13 +00:00
Stephan Uphoff
6097174e4d Add IPI support for preempting a thread on another CPU.
MFC after:	3 weeks
2005-06-09 18:23:54 +00:00
Gleb Smirnoff
73e872668d Make NETGRAPH_DEBUG a kernel option, so that it can't be turned off
without hacking source.

In collaboration with:	ru, julian
2005-05-16 08:25:55 +00:00
Yoshihiro Takahashi
164e09ddb4 - Move the NPX_DEBUG option to options.{i386,pc98} and use opt_npx.h.
- Move npx related defines to {i386,pc98}/include/npx.h to remove #include
  {isa,cbus}.h.
2005-05-12 12:47:41 +00:00
Gleb Smirnoff
c4c9b52b87 ng_nat - a netgraph(4) node, which does NAT 2005-05-05 23:41:21 +00:00
Gleb Smirnoff
376ea9726b libalias is now buildable as kernel module 2005-05-05 22:43:04 +00:00
Darren Reed
0c3757df94 Patches from Ruslan Ermilov to address problems compiling LINT 2005-04-28 16:33:15 +00:00
Joseph Koshy
ebccf1e3a6 Bring a working snapshot of hwpmc(4), its associated libraries, userland utilities
and documentation into -CURRENT.

Bump FreeBSD_version.

Reviewed by:	alc, jhb (kernel changes)
2005-04-19 04:01:25 +00:00
Jeff Roberson
dfe614392f - Move LOOKUP_SHARED from opt_global.h to opt_vfs.h so we don't have
to recompile the whole kernel if we change it.
2005-04-03 23:49:13 +00:00
Søren Schmidt
8ca4df3299 This is the much rumoured ATA mkIII update that I've been working on.
o       ATA is now fully newbus'd and split into modules.
        This means that on a modern system you just load "atapci and ata"
        to get the base support, and then one or more of the device
        subdrivers "atadisk atapicd atapifd atapist ataraid".
        All can be loaded/unloaded anytime, but for obvious reasons you
        dont want to unload atadisk when you have mounted filesystems.

o       The device identify part of the probe has been rewritten to fix
        the problems with odd devices the old had, and to try to remove
        so of the long delays some HW could provoke. Also probing is done
	without the need for interrupts, making earlier probing possible.

o       SATA devices can be hot inserted/removed and devices will be created/
        removed in /dev accordingly.
	NOTE: only supported on controllers that has this feature:
	Promise and Silicon Image for now.
	On other controllers the usual atacontrol detach/attach dance is
	still needed.

o	Support for "atomic" composite ATA requests used for RAID.

o       ATA RAID support has been rewritten and and now supports these
        metadata formats:
                 "Adaptec HostRAID"
                 "Highpoint V2 RocketRAID"
                 "Highpoint V3 RocketRAID"
                 "Intel MatrixRAID"
                 "Integrated Technology Express"
                 "LSILogic V2 MegaRAID"
                 "LSILogic V3 MegaRAID"
                 "Promise FastTrak"
                 "Silicon Image Medley"
		 "FreeBSD PseudoRAID"

o       Update the ioctl API to match new RAID levels etc.

o       Update atacontrol to know about the new RAID levels etc
        NOTE: you need to recompile atacontrol with the new sys/ata.h,
        make world will take care of that.
	NOTE2: that rebuild is done differently from the old system as
	the rebuild is now done piggybacked on read requests to the
	array, so atacontrol simply starts a background "dd" to rebuild
	the array.

o       The reinit code has been worked over to be much more robust.

o       The timeout code has been overhauled for races.

o	Support of new chipsets.

o       Lots of fixes for bugs found while doing the modulerization and
        reviewing the old code.

Missing or changed features from current ATA:

o       atapi-cd no longer has support for ATAPI changers. Todays its
        much cheaper and alot faster to copy those CD images to disk
        and serve them from there. Besides they dont seem to be made
        anymore, maybe for that exact reason.

o       ATA RAID can only read metadata from all the above metadata formats,
	not write all of them (Promise and Highpoint V2 so far). This means
	that arrays can be picked up from the BIOS, but they cannot be
	created from FreeBSD. There is more to it than just the missing
	write metadata support, those formats are not unique to a given
	controller like Promise and Highpoint formats, instead they exist
	for several types, and even worse, some controllers can have
	different formats and its impossible to tell which one.
	The outcome is that we cannot reliably create the metadata of those
	formats and be sure the controller BIOS will understand it.
	However write support is needed to update/fail/rebuild the arrays
	properly so it sits fairly high on the TODO list.

o       So far atapicam is not supported with these changes. When/if this
	will change is up to the maintainer of atapi-cam so go there for
	questions.

HW donated by:  Webveveriet AS
HW donated by:  Frode Nordahl
HW donated by:  Yahoo!
HW donated by:  Sentex
Patience by:	Vife and my boys (and even the cats)
2005-03-30 12:03:40 +00:00
Dag-Erling Smørgrav
bcc1205c89 Add PSEUDOFS_TRACE option. 2005-03-14 16:04:27 +00:00
Maxim Konovalov
8cbe387216 o s/opt_ifpw.h/opt_ipfw.h/ in the previous commit.
Submitted by:	YONETANI Tomokazu
2005-03-06 11:22:49 +00:00
Andre Oppermann
099dd0430b Bring back the full packet destination manipulation for 'ipfw fwd'
with the kernel compile time option:

 options IPFIREWALL_FORWARD_EXTENDED

This option has to be specified in addition to IPFIRWALL_FORWARD.

With this option even packets targeted for an IP address local
to the host can be redirected.  All restrictions to ensure proper
behaviour for locally generated packets are turned off.  Firewall
rules have to be carefully crafted to make sure that things like
PMTU discovery do not break.

Document the two kernel options.

PR:		kern/71910
PR:		kern/73129
MFC after:	1 week
2005-02-22 17:40:40 +00:00
Gleb Smirnoff
a97719482d Add CARP (Common Address Redundancy Protocol), which allows multiple
hosts to share an IP address, providing high availability and load
balancing.

Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.

FreeBSD port done solely by Max Laier.

Patch by:	mlaier
Obtained from:	OpenBSD (mickey, mcbride)
2005-02-22 13:04:05 +00:00
Warner Losh
969eaf2179 Break out obscure ISA cards into their own files, as well as ne2000
and wd80x3 support.  Make the obscure ISA cards optional, and add
those options to NOTES on i386 (note: the ifdef around the whole code
is for module building).  Tweak pc98 ed support to include wd80x3 too.
Add goo for alpha too.

The affected cards are the 3Com 3C503, HP LAN+ and SIC (whatever that
is).  I couldn't find any of these for sale on ebay, so they are
untested.  If you have one of these cards, and send it to me, I'll
ensure that you have no future problems with it...

Minor cleanups as well by using functions rather than cut and paste
code for some probing operations (where the function call overhead is
lost in the noise).

Remove use of kvtop, since they aren't required anymore.  This driver
needs to get its memory mapped act together, however, and use bus
space.  It doesn't right now.

This reduces the size of if_ed.ko from about 51k to 33k on my laptop.
2005-02-09 20:03:40 +00:00
Gleb Smirnoff
f2a7ef4e00 Hook up ng_ipfw to kernel build. 2005-02-05 12:15:56 +00:00
Bosko Milekic
e4eb384b47 Bring in MemGuard, a very simple and small replacement allocator
designed to help detect tamper-after-free scenarios, a problem more
and more common and likely with multithreaded kernels where race
conditions are more prevalent.

Currently MemGuard can only take over malloc()/realloc()/free() for
particular (a) malloc type(s) and the code brought in with this
change manually instruments it to take over M_SUBPROC allocations
as an example.  If you are planning to use it, for now you must:

	1) Put "options DEBUG_MEMGUARD" in your kernel config.
	2) Edit src/sys/kern/kern_malloc.c manually, look for
	   "XXX CHANGEME" and replace the M_SUBPROC comparison with
	   the appropriate malloc type (this might require additional
	   but small/simple code modification if, say, the malloc type
	   is declared out of scope).
	3) Build and install your kernel.  Tune vm.memguard_divisor
	   boot-time tunable which is used to scale how much of kmem_map
	   you want to allott for MemGuard's use.  The default is 10,
	   so kmem_size/10.

ToDo:
	1) Bring in a memguard(9) man page.
	2) Better instrumentation (e.g., boot-time) of MemGuard taking
	   over malloc types.
	3) Teach UMA about MemGuard to allow MemGuard to override zone
	   allocations too.
	4) Improve MemGuard if necessary.

This work is partly based on some old patches from Ian Dowse.
2005-01-21 18:09:17 +00:00
Pawel Jakub Dawidek
560cb85703 Connect SHSEC GEOM class to the build. 2005-01-11 18:18:40 +00:00
Sam Leffler
d487b9ed20 update for new ath hal 2004-12-08 18:20:53 +00:00
Peter Wemm
b7c3f3a9d5 Catch a few more autofs references.
Submitted by:  obrien
2004-11-12 19:44:30 +00:00
Robert Watson
df970488b3 Move the 'debug' sysctl tree under options SYSCTL_DEBUG. It generates
an inordinate amount of synchronous console output that is fairly
undesirable on slower serial console.  It's easily hit by accident
when frobbing other sysctls late at night.
2004-10-27 19:26:01 +00:00
Poul-Henning Kamp
08d0c00b91 Per recent HEADSUP: Disconnect (old)vinum from the kernel build.
Users should move to the new geom_vinum implementation instead.

The refcount logic which is being added to devices to enable safe module
unloading and the buf/vm work also in progress would require a major rework
of the (old)-vinum code to comply with the new semantics.

The actual source files will not be removed until I have coordinated with
the geomvinum people if they need any bits repo-copied etc.
2004-09-23 08:34:50 +00:00
Gleb Smirnoff
cec50dea12 Attach ng_netflow to kernel build.
Approved by:	julian (mentor)
2004-09-16 20:35:28 +00:00
Alfred Perlstein
0793d4d1e4 Hook autofs to the build. 2004-09-02 20:44:56 +00:00
Scott Long
9923b511ed Turn PREEMPTION into a kernel option. Make sure that it's defined if
FULL_PREEMPTION is defined.  Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c).  This
is a possible MT5 candidate.
2004-09-02 18:59:15 +00:00
Brooks Davis
b443062227 General modernization of coda:
- Ditch NVCODA
 - Don't use a static major
 - Don't declare functions extern

Reviewed by:	peter
2004-09-01 01:19:52 +00:00
Peter Wemm
f37a929ca1 Kill count device support from config. I've changed the last few
remaining consumers to have the count passed as an option.  This is
i4b, pc98/wdc, and coda.

Bump configvers.h from 500013 to 600000.

Remove heuristics that tried to parse "device ed5" as 5 units of the ed
device.  This broke things like the snd_emu10k1 device, which required
quotes to make it parse right.  The no-longer-needed quotes have been
removed from NOTES, GENERIC etc.  eg, I've removed the quotes from:
   device  snd_maestro
   device  "snd_maestro3"
   device  snd_mss

I believe everything will still compile and work after this.
2004-08-30 23:03:58 +00:00
Dag-Erling Smørgrav
0eac4495db Remove the HW_WDOG option; it serves no purpose.
MFC after:	3 days
2004-08-29 11:10:09 +00:00
Robert Watson
1d8cd39e71 Change the default disposition of debug.mpsafenet from 0 to 1, which
will cause the network stack to operate without the Giant lock by
default.  This change has the potential to improve performance by
increasing parallelism and decreasing latency in network processing.

Due to the potential exposure of existing or new bugs, the following
compatibility functionality is maintained:

- It is still possible to disable Giant-free operation by setting
  debug.mpsafenet to 0 in loader.conf.

- Add "options NET_WITH_GIANT", which will restore the default value of
  debug.mpsafenet to 0, and is intended for use on systems compiled with
  known unsafe components, or where a more conservative configuration is
  desired.

- Add a new declaration, NET_NEEDS_GIANT("componentname"), which permits
  kernel components to declare dependence on Giant over the network
  stack.  If the declaration is made by a preloaded module or a compiled
  in component, the disposition of debug.mpsafenet will be set to 0 and
  a warning concerning performance degraded operation printed to the
  console.  If it is declared by a loadable kernel module after boot, a
  warning is displayed but the disposition cannot be changed.  This is
  implemented by defining a new SYSINIT() value, SI_SUB_SETTINGS, which
  is intended for the processing of configuration choices after tunables
  are read in and the console is available to generate errors, but
  before much else gets going.

This compatibility behavior will go away when we've finished the last
of the locking work and are confident that operation is correct.
2004-08-28 15:11:13 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00
John-Mark Gurney
000968010a add options MPROF_BUFFERS and MPROF_HASH_SIZE that adjust the sizes of
the mutex profiling buffers.  Document them in the man page and in NOTES.
Ensure _HASH_SIZE is larger than _BUFFERS with a cpp error.
2004-08-19 06:38:26 +00:00
Andre Oppermann
9b932e9e04 Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland
and preserves the ipfw ABI.  The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

 In general ipfw is now called through the PFIL_HOOKS and most associated
 magic, that was in ip_input() or ip_output() previously, is now done in
 ipfw_check_[in|out]() in the ipfw PFIL handler.

 IPDIVERT is entirely handled within the ipfw PFIL handlers.  A packet to
 be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
 reassembly.  If not, or all fragments arrived and the packet is complete,
 divert_packet is called directly.  For 'tee' no reassembly attempt is made
 and a copy of the packet is sent to the divert socket unmodified.  The
 original packet continues its way through ip_input/output().

 ipfw 'forward' is done via m_tag's.  The ipfw PFIL handlers tag the packet
 with the new destination sockaddr_in.  A check if the new destination is a
 local IP address is made and the m_flags are set appropriately.  ip_input()
 and ip_output() have some more work to do here.  For ip_input() the m_flags
 are checked and a packet for us is directly sent to the 'ours' section for
 further processing.  Destination changes on the input path are only tagged
 and the 'srcrt' flag to ip_forward() is set to disable destination checks
 and ICMP replies at this stage.  The tag is going to be handled on output.
 ip_output() again checks for m_flags and the 'ours' tag.  If found, the
 packet will be dropped back to the IP netisr where it is going to be picked
 up by ip_input() again and the directly sent to the 'ours' section.  When
 only the destination changes, the route's 'dst' is overwritten with the
 new destination from the forward m_tag.  Then it jumps back at the route
 lookup again and skips the firewall check because it has been marked with
 M_SKIP_FIREWALL.  ipfw 'forward' has to be compiled into the kernel with
 'option IPFIREWALL_FORWARD' to enable it.

 DUMMYNET is entirely handled within the ipfw PFIL handlers.  A packet for
 a dummynet pipe or queue is directly sent to dummynet_io().  Dummynet will
 then inject it back into ip_input/ip_output() after it has served its time.
 Dummynet packets are tagged and will continue from the next rule when they
 hit the ipfw PFIL handlers again after re-injection.

 BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
 they did before.  Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

 conf/files
	Add netinet/ip_fw_pfil.c.

 conf/options
	Add IPFIREWALL_FORWARD option.

 modules/ipfw/Makefile
	Add ip_fw_pfil.c.

 net/bridge.c
	Disable PFIL_HOOKS if ipfw for bridging is active.  Bridging ipfw
	is still directly invoked to handle layer2 headers and packets would
	get a double ipfw when run through PFIL_HOOKS as well.

 netinet/ip_divert.c
	Removed divert_clone() function.  It is no longer used.

 netinet/ip_dummynet.[ch]
	Neither the route 'ro' nor the destination 'dst' need to be stored
	while in dummynet transit.  Structure members and associated macros
	are removed.

 netinet/ip_fastfwd.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_fw.h
	Removed 'ro' and 'dst' from struct ip_fw_args.

 netinet/ip_fw2.c
	(Re)moved some global variables and the module handling.

 netinet/ip_fw_pfil.c
	New file containing the ipfw PFIL handlers and module initialization.

 netinet/ip_input.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.  ip_forward() does not longer require
	the 'next_hop' struct sockaddr_in argument.  Disable early checks
	if 'srcrt' is set.

 netinet/ip_output.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_var.h
	Add ip_reass() as general function.  (Used from ipfw PFIL handlers
	for IPDIVERT.)

 netinet/raw_ip.c
	Directly check if ipfw and dummynet control pointers are active.

 netinet/tcp_input.c
	Rework the 'ipfw forward' to local code to work with the new way of
	forward tags.

 netinet/tcp_sack.c
	Remove include 'opt_ipfw.h' which is not needed here.

 sys/mbuf.h
	Remove m_claim_next() macro which was exclusively for ipfw 'forward'
	and is no longer needed.

Approved by:	re (scottl)
2004-08-17 22:05:54 +00:00
Pawel Jakub Dawidek
e81856c34c Connect RAID3 GEOM class to the build. 2004-08-16 06:36:21 +00:00
David Malone
1f44b0a1b5 Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD
have already done this, so I have styled the patch on their work:

        1) introduce a ip_newid() static inline function that checks
        the sysctl and then decides if it should return a sequential
        or random IP ID.

        2) named the sysctl net.inet.ip.random_id

        3) IPv6 flow IDs and fragment IDs are now always random.
        Flow IDs and frag IDs are significantly less common in the
        IPv6 world (ie. rarely generated per-packet), so there should
        be smaller performance concerns.

The sysctl defaults to 0 (sequential IP IDs).

Reviewed by:	andre, silby, mlaier, ume
Based on:	NetBSD
MFC after:	2 months
2004-08-14 15:32:40 +00:00
Max Khon
75261008d7 Add geom_uzip -- geom class that implements read-only compressed disks.
Currently supports cloop V2.0 disk compression format.
May support more formats in future.
2004-08-13 09:40:58 +00:00
Hartmut Brandt
a7e2239469 Allow the ATM call control module to be built into the kernel. 2004-08-12 15:01:59 +00:00
Pawel Jakub Dawidek
8a8fbaca32 Connect GEOM_MIRROR class to the build. 2004-07-30 23:18:53 +00:00
Pawel Jakub Dawidek
bd58d1d222 Remove the old geom_mirror class.
Approved by:	phk
2004-07-30 22:50:21 +00:00
Robert Watson
a9abdce44a Add "options ADAPTIVE_GIANT" which causes Giant to also be treated in
an adaptive fashion when adaptive mutexes are enabled.  The theory
behind non-adaptive Giant is that Giant will be held for long periods
of time, and therefore spinning waiting on it is wasteful.  However,
in MySQL benchmarks which are relatively Giant-free, running Giant
adaptive makes an observable difference on SMP (5% transaction rate
improvement).  As such, make adaptive behavior on Giant an option so
it can be more widely benchmarked.
2004-07-27 16:34:48 +00:00
Gleb Smirnoff
31578ac8ae Add ng_device(4) to LINT.
Reviewed by:	marks
Approved by:	julian (mentor)
2004-07-20 12:42:54 +00:00
Alexander Kabaev
8024a9da08 Unbreak kernel compiles by preserving an old opt_adaptive_mutexes.h file
name.
2004-07-18 18:21:39 +00:00
Scott Long
701f140800 Enable ADAPTIVE_MUTEXES by default by changing the sense of the option to
NO_ADAPTIVE_MUTEXES.  This option has been enabled by default on amd64 for
quite some time, and has been extensively tested on i386 and sparc64.  It
shows measurable performance gains in many circumstances, and few negative
effects.  It would be nice in t he future if adaptive mutexes actually went
to sleep after a certain amount of spinning, but that will require quite a
bit more testing.
2004-07-18 15:59:03 +00:00
Marcel Moolenaar
e2ee21738f Update for the KDB framework:
o  Rename WITNESS_DDB to WITNESS_KDB. In the new world order KDB is the
   acronym to use for debugging related code. The DDB option is used
   to enable the DDB debugger backend only.
o  Likewise, rename DDB_TRACE to KDB_TRACE, rename DDB_UNATTENDED to
   KDB_UNATTENDED and rename SC_HISTORY_DDBKEY to SC_HISTORY_KDBKEY.
o  Remove DDB_NOKLDSYM. The new DDB backend supports pre-linker symbol
   lookups as well as KLD symbol lookups at the same time.
o  Remove GDB_REMOTE_CHAT. The GDB protocol hacks to allow this are
   FreeBSD specific. At the same time, the GDB protocol has packets
   for console output.
2004-07-11 01:44:07 +00:00
Marcel Moolenaar
0aefe3632e Add new options for the KDB framework. This commit merely adds them and
in particular not without removing the options they replace or in the
proper location in this file. The purpose of this commit is to make it
possible to commit changes in parts without causing massive build
breakages. At least, that's the intend. I have no idea if it actually
works out as I hope...
2004-07-10 19:34:06 +00:00
Brian Somers
0ac4013324 Change the following environment variables to kernel options:
bootp -> BOOTP
    bootp.nfsroot -> BOOTP_NFSROOT
    bootp.nfsv3 -> BOOTP_NFSV3
    bootp.compat -> BOOTP_COMPAT
    bootp.wired_to -> BOOTP_WIRED_TO

- i.e. back out the previous commit.  It's already possible to
pxeboot(8) with a GENERIC kernel.

Pointed out by: dwmalone
2004-07-08 22:35:36 +00:00
Brian Somers
59e1ebc9b5 Change the following kernel options to environment variables:
BOOTP -> bootp
    BOOTP_NFSROOT -> bootp.nfsroot
    BOOTP_NFSV3 -> bootp.nfsv3
    BOOTP_COMPAT -> bootp.compat
    BOOTP_WIRED_TO -> bootp.wired_to

This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:

    bootp="YES"
    bootp.nfsroot="YES"
    bootp.nfsv3="YES"
    bootp.wired_to="bge1"

or even setting the variables manually from the OK prompt.
2004-07-08 13:40:33 +00:00
Tim J. Robbins
3bc482ec1c By popular request, add a workaround that allows large (>128GB or so)
FAT32 filesystems to be mounted, subject to some fairly serious limitations.

This works by extending the internal pseudo-inode-numbers generated from
the file's starting cluster number to 64-bits, then creating a table
mapping these into arbitrary 32-bit inode numbers, which can fit in
struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings
do not persist across unmounts or reboots, so it's not possible to export
these filesystems through NFS. The mapping table may grow to be rather
large, and may grow large enough to exhaust kernel memory on filesystems
with millions of files.

Don't enable this option unless you understand the consequences.
2004-07-03 13:22:38 +00:00
John Baldwin
0c0b25ae91 Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switch to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
Pawel Jakub Dawidek
e1237b285b Introduce GEOM_LABEL class.
This class is used for detecting volume labels on file systems:
UFS, MSDOSFS (FAT12, FAT16, FAT32) and ISO9660.
It also provide native labelization (there is no need for file system).

g_label_ufs.c is based on geom_vol_ffs from Gordon Tetlow.
g_label_msdos.c and g_label_iso9660.c are probably hacks, I just found
where volume labels are stored and I use those offsets here,
but with this class it should be easy to do it as it should be done by
someone who know how.
Implementing volume labels detection for other file systems also should
be trivial.

New providers are created in those directories:
/dev/ufs/ (UFS1, UFS2)
/dev/msdosfs/ (FAT12, FAT16, FAT32)
/dev/iso9660/ (ISO9660)
/dev/label/ (native labels, configured with glabel(8))

Manual page cleanups and some comments inside were submitted by
Simon L. Nielsen, who was, as always, very helpful. Thanks!
2004-07-02 19:40:36 +00:00
John Baldwin
ef0ebfc351 Add two new kernel options to allow rudimentary profiling of the internal
hash tables used in the sleep queue and turnstile code.  Each option adds
a sysctl tree under debug containing the maximum depth of any bucket in
the hash table as well as a separate node for each bucket (or chain)
containing the current depth and maximum depth for that bucket.
2004-06-29 02:30:12 +00:00
Robert Watson
d07af9d904 Add options NETGRAPH_FEC to hook up ng_fec.c to the LINT build. 2004-06-27 02:36:33 +00:00
Robert Watson
9d56413324 Add options NETGRAPH_EIFACE, which causes ng_eiface.c to be built into
the kernel, similar to NETGRAPH_IFACE for ng_iface.c.  It appears to
have been omitted when added to the kernel.
2004-06-27 02:25:38 +00:00
Paul Saab
6d90faf3d8 Add support for TCP Selective Acknowledgements. The work for this
originated on RELENG_4 and was ported to -CURRENT.

The scoreboarding code was obtained from OpenBSD, and many
of the remaining changes were inspired by OpenBSD, but not
taken directly from there.

You can enable/disable sack using net.inet.tcp.do_sack. You can
also limit the number of sack holes that all senders can have in
the scoreboard with net.inet.tcp.sackhole_limit.

Reviewed by:	gnn
Obtained from:	Yahoo! (Mohan Srinivasan, Jayanth Vijayaraghavan)
2004-06-23 21:04:37 +00:00
Max Laier
02b199f158 Link ALTQ to the build and break with ABI for struct ifnet. Please recompile
your (network) modules as well as any userland that might make sense of
sizeof(struct ifnet).
This does not change the queueing yet. These changes will follow in a
seperate commit. Same with the driver changes, which need case by case
evaluation.

__FreeBSD_version bump will follow.

Tested-by:	(i386)LINT
2004-06-13 17:29:10 +00:00
Poul-Henning Kamp
1930e303cf Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1.  We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.
2004-06-11 11:16:26 +00:00
Pawel Jakub Dawidek
7dc92b13d0 - Connect geom(8) and its libraries to the build.
- Connect geom_stripe and geom_nop modules to the build.
- Connect STRIPE and NOP classes to the LINT build.
- Disconnect gconcat(8) from the build.

Supported by:	Wheel - Open Technologies - http://www.wheel.pl
2004-05-20 10:37:13 +00:00
Warner Losh
560c97db54 Expose USBVERBOSE as a first-class option. It will be needed soon as
an option.  Note that this option doesn't follow the normal USB_ or
Uxxx_ convention.  That's because it is this way in the upstream
provider and I didn't want to change that.
2004-05-13 03:15:04 +00:00
Doug Ambrisko
a00d3e6140 Remove new options and my prevention of system freeze when the sio probe
returns okay when HW probe fails.  This happens when comconsole flag is
set but VGA console is used instead.

Back out requested by:  bde (He will be looking at other solutions from scratch)
2004-05-03 22:35:28 +00:00
Pawel Jakub Dawidek
7226443d39 Allow geom_concat and geom_gate to be compiled in kernel. 2004-05-03 21:18:56 +00:00
Robert Watson
0a05006dd2 Add MAC_STATIC, a kernel option that disables internal MAC Framework
synchronization protecting against dynamic load and unload of MAC
policies, and instead simply blocks load and unload.  In a static
configuration, this allows you to avoid the synchronization costs
associated with introducing dynamicism.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, McAfee Research
2004-05-03 20:53:05 +00:00
Doug Ambrisko
33c5911242 Some enhancements and bug fix.
-  Define option FORCECONSPEED to force the serial console to
        be CONSPEED.  I've run into a lot of boards in which
        the detect for prior speed doesn't work and ends up with
        broken console since it is at the wrong speed.
     -  If a serial port is marked as a console, but console=vidconsole
        and if the serial ports doesn't exist it will be probed and
        attached at a 8250 chip.  Then writes to that will freeze the
        system.
     -  Add an option flags 0x400000 to mark this as a potential
        comconsole in-case the one flaged with 0x10 does not exist
        in the system.

This makes it easier to deploy on systems with one or two serial ports.

Obtained from:	IronPort
2004-04-30 21:16:52 +00:00
Maksim Yevmenkin
b84b10f92f Address few style issues pointed out by bde
Reviewed by:	bde, ru
2004-04-27 16:38:15 +00:00
Roman Kurakin
0a6818e29c Connect ng_sppp to the build process. 2004-04-24 22:03:02 +00:00
Maksim Yevmenkin
666ea1b65f Make sure Bluetooth stuff can be statically compiled into kernel
Submitted by:	ps
Reviewed by:	imp (mentor), ru
2004-04-23 19:48:43 +00:00
Scott Long
fce240d124 garbage collect ASR_MEASURE_PERFORMANCE 2004-04-21 20:18:06 +00:00
Nate Lawson
e27f806a59 As promised a while ago, remove DA_OLD_QUIRKS and all quirks it was enabling.
These are no longer needed now that we don't send 6-byte commands to RBC
devices.
2004-04-19 03:33:55 +00:00
Warner Losh
9dc313a3f7 Add glue for new sx driver. 2004-04-11 20:01:18 +00:00
John Baldwin
535eb30962 Add a new kernel option MUTEX_WAKE_ALL that changes the mutex unlock code
to awaken all waiters when a contested mutex is released instead of just
the highest priority waiter.  If the various threads are awakened in
sequence then each thread may acquire and release the lock in question
without contention resulting in fewer expensive unlock and lock
operations.  This old behavior of waking just the highest priority is
still used if this option is specified.  Making the algorithm conditional
on a kernel option will allows us to benchmark both cases later and
determine which one should be used by default.

Requested by:	tanimura-san
2004-04-06 19:12:24 +00:00
Vinod Kashyap
9e429d3871 Moved comments on 3ware 9000 series RAID controller driver options from
options to NOTES.
2004-03-31 18:46:13 +00:00
Scott Long
662d381879 Give in to the oblique nagging and move AAC and AHC/AHD comments out of
/sys/conf/options and into /sys/conf/NOTES
2004-03-31 08:22:09 +00:00
Vinod Kashyap
f646fe1a8d Added options for 3ware 9000 series RAID controller driver (twa). 2004-03-30 18:53:18 +00:00
Scott Long
846ca5d04f Remove RAIDFrame. It hasn't worked since GEOM replaced the old disk
mini-layer.  I don't have time to bing it forward into the GEOM world, and
no one else has stepped forward to claim it.  It'll be in the Attic for safe
keeping for now.
2004-03-16 12:23:43 +00:00
Benno Rice
bde778e9f2 Add a netgraph node to handle ATM LLC encapsulation. This currently handles
ethernet (tested) and FDDI (not tested).  The main use for this is on ADSL (or
other ATM) connections where bridged ethernet is used, PPPoE being a prime
example.

There is no manual page as yet, I will write one shortly.

Reviewed by:	harti
2004-03-08 10:54:35 +00:00
Poul-Henning Kamp
4103b7652d Rename the WATCHDOG option to SW_WATCHDOG and make it use the
generic watchdoc(9) interface.

Make watchdogd(8) perform as watchdog(8) as well, and make it
possible to specify a check command to run, timeout and sleep
periods.

Update watchdog(4) to talk about the generic interface and add
new watchdog(8) page.
2004-02-28 20:56:35 +00:00
Max Laier
cc5934f5af Tweak existing header and other build infrastructure to be able to build
pf/pflog/pfsync as modules. Do not list them in NOTES or modules/Makefile
(i.e. do not connect it to any (automatic) builds - yet).

Approved by: bms(mentor)
2004-02-26 03:53:54 +00:00
Bruce Evans
d01cde0480 Fixed some insertion sort errors. 2004-02-25 09:35:35 +00:00
Poul-Henning Kamp
9aece96fc0 Add DDB_NUMSYM option which in addition to the symbolic representation
also prints the actual numerical value of the symbol in question.

Users of addr2line(1) will be less proficient in hex arithmetic as a
consequence.

This amongst other things means that traceback lines change from:
   siointr1(c4016800,c073bda0,0,c06b699c,69f) at siointr1+0xc5
to
   siointr1(c4016800,c073bda0,0,c06b699c,69f) at 0xc062b0bd = siointr1+0xc5

I made this an option to avoid bikesheds.
~
~
~
2004-02-24 22:51:42 +00:00
Bruce M Simpson
1cfd4b5326 Initial import of RFC 2385 (TCP-MD5) digest support.
This is the first of two commits; bringing in the kernel support first.
This can be enabled by compiling a kernel with options TCP_SIGNATURE
and FAST_IPSEC.

For the uninitiated, this is a TCP option which provides for a means of
authenticating TCP sessions which came into being before IPSEC. It is
still relevant today, however, as it is used by many commercial router
vendors, particularly with BGP, and as such has become a requirement for
interconnect at many major Internet points of presence.

Several parts of the TCP and IP headers, including the segment payload,
are digested with MD5, including a shared secret. The PF_KEY interface
is used to manage the secrets using security associations in the SADB.

There is a limitation here in that as there is no way to map a TCP flow
per-port back to an SPI without polluting tcpcb or using the SPD; the
code to do the latter is unstable at this time. Therefore this code only
supports per-host keying granularity.

Whilst FAST_IPSEC is mutually exclusive with KAME IPSEC (and thus IPv6),
TCP_SIGNATURE applies only to IPv4. For the vast majority of prospective
users of this feature, this will not pose any problem.

This implementation is output-only; that is, the option is honoured when
responding to a host initiating a TCP session, but no effort is made
[yet] to authenticate inbound traffic. This is, however, sufficient to
interwork with Cisco equipment.

Tested with a Cisco 2501 running IOS 12.0(27), and Quagga 0.96.4 with
local patches. Patches for tcpdump to validate TCP-MD5 sessions are also
available from me upon request.

Sponsored by:	sentex.net
2004-02-11 04:26:04 +00:00