Commit Graph

474 Commits

Author SHA1 Message Date
John-Mark Gurney
000968010a add options MPROF_BUFFERS and MPROF_HASH_SIZE that adjust the sizes of
the mutex profiling buffers.  Document them in the man page and in NOTES.
Ensure _HASH_SIZE is larger than _BUFFERS with a cpp error.
2004-08-19 06:38:26 +00:00
Andre Oppermann
9b932e9e04 Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland
and preserves the ipfw ABI.  The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

 In general ipfw is now called through the PFIL_HOOKS and most associated
 magic, that was in ip_input() or ip_output() previously, is now done in
 ipfw_check_[in|out]() in the ipfw PFIL handler.

 IPDIVERT is entirely handled within the ipfw PFIL handlers.  A packet to
 be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
 reassembly.  If not, or all fragments arrived and the packet is complete,
 divert_packet is called directly.  For 'tee' no reassembly attempt is made
 and a copy of the packet is sent to the divert socket unmodified.  The
 original packet continues its way through ip_input/output().

 ipfw 'forward' is done via m_tag's.  The ipfw PFIL handlers tag the packet
 with the new destination sockaddr_in.  A check if the new destination is a
 local IP address is made and the m_flags are set appropriately.  ip_input()
 and ip_output() have some more work to do here.  For ip_input() the m_flags
 are checked and a packet for us is directly sent to the 'ours' section for
 further processing.  Destination changes on the input path are only tagged
 and the 'srcrt' flag to ip_forward() is set to disable destination checks
 and ICMP replies at this stage.  The tag is going to be handled on output.
 ip_output() again checks for m_flags and the 'ours' tag.  If found, the
 packet will be dropped back to the IP netisr where it is going to be picked
 up by ip_input() again and the directly sent to the 'ours' section.  When
 only the destination changes, the route's 'dst' is overwritten with the
 new destination from the forward m_tag.  Then it jumps back at the route
 lookup again and skips the firewall check because it has been marked with
 M_SKIP_FIREWALL.  ipfw 'forward' has to be compiled into the kernel with
 'option IPFIREWALL_FORWARD' to enable it.

 DUMMYNET is entirely handled within the ipfw PFIL handlers.  A packet for
 a dummynet pipe or queue is directly sent to dummynet_io().  Dummynet will
 then inject it back into ip_input/ip_output() after it has served its time.
 Dummynet packets are tagged and will continue from the next rule when they
 hit the ipfw PFIL handlers again after re-injection.

 BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
 they did before.  Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

 conf/files
	Add netinet/ip_fw_pfil.c.

 conf/options
	Add IPFIREWALL_FORWARD option.

 modules/ipfw/Makefile
	Add ip_fw_pfil.c.

 net/bridge.c
	Disable PFIL_HOOKS if ipfw for bridging is active.  Bridging ipfw
	is still directly invoked to handle layer2 headers and packets would
	get a double ipfw when run through PFIL_HOOKS as well.

 netinet/ip_divert.c
	Removed divert_clone() function.  It is no longer used.

 netinet/ip_dummynet.[ch]
	Neither the route 'ro' nor the destination 'dst' need to be stored
	while in dummynet transit.  Structure members and associated macros
	are removed.

 netinet/ip_fastfwd.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_fw.h
	Removed 'ro' and 'dst' from struct ip_fw_args.

 netinet/ip_fw2.c
	(Re)moved some global variables and the module handling.

 netinet/ip_fw_pfil.c
	New file containing the ipfw PFIL handlers and module initialization.

 netinet/ip_input.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.  ip_forward() does not longer require
	the 'next_hop' struct sockaddr_in argument.  Disable early checks
	if 'srcrt' is set.

 netinet/ip_output.c
	Removed all direct ipfw handling code and replace it with the new
	'ipfw forward' handling code.

 netinet/ip_var.h
	Add ip_reass() as general function.  (Used from ipfw PFIL handlers
	for IPDIVERT.)

 netinet/raw_ip.c
	Directly check if ipfw and dummynet control pointers are active.

 netinet/tcp_input.c
	Rework the 'ipfw forward' to local code to work with the new way of
	forward tags.

 netinet/tcp_sack.c
	Remove include 'opt_ipfw.h' which is not needed here.

 sys/mbuf.h
	Remove m_claim_next() macro which was exclusively for ipfw 'forward'
	and is no longer needed.

Approved by:	re (scottl)
2004-08-17 22:05:54 +00:00
Pawel Jakub Dawidek
e81856c34c Connect RAID3 GEOM class to the build. 2004-08-16 06:36:21 +00:00
David Malone
1f44b0a1b5 Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD
have already done this, so I have styled the patch on their work:

        1) introduce a ip_newid() static inline function that checks
        the sysctl and then decides if it should return a sequential
        or random IP ID.

        2) named the sysctl net.inet.ip.random_id

        3) IPv6 flow IDs and fragment IDs are now always random.
        Flow IDs and frag IDs are significantly less common in the
        IPv6 world (ie. rarely generated per-packet), so there should
        be smaller performance concerns.

The sysctl defaults to 0 (sequential IP IDs).

Reviewed by:	andre, silby, mlaier, ume
Based on:	NetBSD
MFC after:	2 months
2004-08-14 15:32:40 +00:00
Max Khon
75261008d7 Add geom_uzip -- geom class that implements read-only compressed disks.
Currently supports cloop V2.0 disk compression format.
May support more formats in future.
2004-08-13 09:40:58 +00:00
Hartmut Brandt
a7e2239469 Allow the ATM call control module to be built into the kernel. 2004-08-12 15:01:59 +00:00
Pawel Jakub Dawidek
8a8fbaca32 Connect GEOM_MIRROR class to the build. 2004-07-30 23:18:53 +00:00
Pawel Jakub Dawidek
bd58d1d222 Remove the old geom_mirror class.
Approved by:	phk
2004-07-30 22:50:21 +00:00
Robert Watson
a9abdce44a Add "options ADAPTIVE_GIANT" which causes Giant to also be treated in
an adaptive fashion when adaptive mutexes are enabled.  The theory
behind non-adaptive Giant is that Giant will be held for long periods
of time, and therefore spinning waiting on it is wasteful.  However,
in MySQL benchmarks which are relatively Giant-free, running Giant
adaptive makes an observable difference on SMP (5% transaction rate
improvement).  As such, make adaptive behavior on Giant an option so
it can be more widely benchmarked.
2004-07-27 16:34:48 +00:00
Gleb Smirnoff
31578ac8ae Add ng_device(4) to LINT.
Reviewed by:	marks
Approved by:	julian (mentor)
2004-07-20 12:42:54 +00:00
Alexander Kabaev
8024a9da08 Unbreak kernel compiles by preserving an old opt_adaptive_mutexes.h file
name.
2004-07-18 18:21:39 +00:00
Scott Long
701f140800 Enable ADAPTIVE_MUTEXES by default by changing the sense of the option to
NO_ADAPTIVE_MUTEXES.  This option has been enabled by default on amd64 for
quite some time, and has been extensively tested on i386 and sparc64.  It
shows measurable performance gains in many circumstances, and few negative
effects.  It would be nice in t he future if adaptive mutexes actually went
to sleep after a certain amount of spinning, but that will require quite a
bit more testing.
2004-07-18 15:59:03 +00:00
Marcel Moolenaar
e2ee21738f Update for the KDB framework:
o  Rename WITNESS_DDB to WITNESS_KDB. In the new world order KDB is the
   acronym to use for debugging related code. The DDB option is used
   to enable the DDB debugger backend only.
o  Likewise, rename DDB_TRACE to KDB_TRACE, rename DDB_UNATTENDED to
   KDB_UNATTENDED and rename SC_HISTORY_DDBKEY to SC_HISTORY_KDBKEY.
o  Remove DDB_NOKLDSYM. The new DDB backend supports pre-linker symbol
   lookups as well as KLD symbol lookups at the same time.
o  Remove GDB_REMOTE_CHAT. The GDB protocol hacks to allow this are
   FreeBSD specific. At the same time, the GDB protocol has packets
   for console output.
2004-07-11 01:44:07 +00:00
Marcel Moolenaar
0aefe3632e Add new options for the KDB framework. This commit merely adds them and
in particular not without removing the options they replace or in the
proper location in this file. The purpose of this commit is to make it
possible to commit changes in parts without causing massive build
breakages. At least, that's the intend. I have no idea if it actually
works out as I hope...
2004-07-10 19:34:06 +00:00
Brian Somers
0ac4013324 Change the following environment variables to kernel options:
bootp -> BOOTP
    bootp.nfsroot -> BOOTP_NFSROOT
    bootp.nfsv3 -> BOOTP_NFSV3
    bootp.compat -> BOOTP_COMPAT
    bootp.wired_to -> BOOTP_WIRED_TO

- i.e. back out the previous commit.  It's already possible to
pxeboot(8) with a GENERIC kernel.

Pointed out by: dwmalone
2004-07-08 22:35:36 +00:00
Brian Somers
59e1ebc9b5 Change the following kernel options to environment variables:
BOOTP -> bootp
    BOOTP_NFSROOT -> bootp.nfsroot
    BOOTP_NFSV3 -> bootp.nfsv3
    BOOTP_COMPAT -> bootp.compat
    BOOTP_WIRED_TO -> bootp.wired_to

This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:

    bootp="YES"
    bootp.nfsroot="YES"
    bootp.nfsv3="YES"
    bootp.wired_to="bge1"

or even setting the variables manually from the OK prompt.
2004-07-08 13:40:33 +00:00
Tim J. Robbins
3bc482ec1c By popular request, add a workaround that allows large (>128GB or so)
FAT32 filesystems to be mounted, subject to some fairly serious limitations.

This works by extending the internal pseudo-inode-numbers generated from
the file's starting cluster number to 64-bits, then creating a table
mapping these into arbitrary 32-bit inode numbers, which can fit in
struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings
do not persist across unmounts or reboots, so it's not possible to export
these filesystems through NFS. The mapping table may grow to be rather
large, and may grow large enough to exhaust kernel memory on filesystems
with millions of files.

Don't enable this option unless you understand the consequences.
2004-07-03 13:22:38 +00:00
John Baldwin
0c0b25ae91 Implement preemption of kernel threads natively in the scheduler rather
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
  determine if a thread about to be added to a run queue should be
  preempted to directly.  If it is not safe to preempt or if the new
  thread does not have a high enough priority, then the function returns
  false and sched_add() adds the thread to the run queue.  If the thread
  should be preempted to but the current thread is in a nested critical
  section, then the flag TDF_OWEPREEMPT is set and the thread is added
  to the run queue.  Otherwise, mi_switch() is called immediately and the
  thread is never added to the run queue since it is switch to directly.
  When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
  then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
  setrunqueue() now does all the correct work.  This also removes the
  do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
  supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
  chance to run if the architecture supports native preemption since
  the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
  architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
  preemption, namely alpha, i386, and amd64.

This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.

Approved by:	scottl (with his re@ hat)
2004-07-02 20:21:44 +00:00
Pawel Jakub Dawidek
e1237b285b Introduce GEOM_LABEL class.
This class is used for detecting volume labels on file systems:
UFS, MSDOSFS (FAT12, FAT16, FAT32) and ISO9660.
It also provide native labelization (there is no need for file system).

g_label_ufs.c is based on geom_vol_ffs from Gordon Tetlow.
g_label_msdos.c and g_label_iso9660.c are probably hacks, I just found
where volume labels are stored and I use those offsets here,
but with this class it should be easy to do it as it should be done by
someone who know how.
Implementing volume labels detection for other file systems also should
be trivial.

New providers are created in those directories:
/dev/ufs/ (UFS1, UFS2)
/dev/msdosfs/ (FAT12, FAT16, FAT32)
/dev/iso9660/ (ISO9660)
/dev/label/ (native labels, configured with glabel(8))

Manual page cleanups and some comments inside were submitted by
Simon L. Nielsen, who was, as always, very helpful. Thanks!
2004-07-02 19:40:36 +00:00
John Baldwin
ef0ebfc351 Add two new kernel options to allow rudimentary profiling of the internal
hash tables used in the sleep queue and turnstile code.  Each option adds
a sysctl tree under debug containing the maximum depth of any bucket in
the hash table as well as a separate node for each bucket (or chain)
containing the current depth and maximum depth for that bucket.
2004-06-29 02:30:12 +00:00
Robert Watson
d07af9d904 Add options NETGRAPH_FEC to hook up ng_fec.c to the LINT build. 2004-06-27 02:36:33 +00:00
Robert Watson
9d56413324 Add options NETGRAPH_EIFACE, which causes ng_eiface.c to be built into
the kernel, similar to NETGRAPH_IFACE for ng_iface.c.  It appears to
have been omitted when added to the kernel.
2004-06-27 02:25:38 +00:00
Paul Saab
6d90faf3d8 Add support for TCP Selective Acknowledgements. The work for this
originated on RELENG_4 and was ported to -CURRENT.

The scoreboarding code was obtained from OpenBSD, and many
of the remaining changes were inspired by OpenBSD, but not
taken directly from there.

You can enable/disable sack using net.inet.tcp.do_sack. You can
also limit the number of sack holes that all senders can have in
the scoreboard with net.inet.tcp.sackhole_limit.

Reviewed by:	gnn
Obtained from:	Yahoo! (Mohan Srinivasan, Jayanth Vijayaraghavan)
2004-06-23 21:04:37 +00:00
Max Laier
02b199f158 Link ALTQ to the build and break with ABI for struct ifnet. Please recompile
your (network) modules as well as any userland that might make sense of
sizeof(struct ifnet).
This does not change the queueing yet. These changes will follow in a
seperate commit. Same with the driver changes, which need case by case
evaluation.

__FreeBSD_version bump will follow.

Tested-by:	(i386)LINT
2004-06-13 17:29:10 +00:00
Poul-Henning Kamp
1930e303cf Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1.  We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.
2004-06-11 11:16:26 +00:00
Pawel Jakub Dawidek
7dc92b13d0 - Connect geom(8) and its libraries to the build.
- Connect geom_stripe and geom_nop modules to the build.
- Connect STRIPE and NOP classes to the LINT build.
- Disconnect gconcat(8) from the build.

Supported by:	Wheel - Open Technologies - http://www.wheel.pl
2004-05-20 10:37:13 +00:00
Warner Losh
560c97db54 Expose USBVERBOSE as a first-class option. It will be needed soon as
an option.  Note that this option doesn't follow the normal USB_ or
Uxxx_ convention.  That's because it is this way in the upstream
provider and I didn't want to change that.
2004-05-13 03:15:04 +00:00
Doug Ambrisko
a00d3e6140 Remove new options and my prevention of system freeze when the sio probe
returns okay when HW probe fails.  This happens when comconsole flag is
set but VGA console is used instead.

Back out requested by:  bde (He will be looking at other solutions from scratch)
2004-05-03 22:35:28 +00:00
Pawel Jakub Dawidek
7226443d39 Allow geom_concat and geom_gate to be compiled in kernel. 2004-05-03 21:18:56 +00:00
Robert Watson
0a05006dd2 Add MAC_STATIC, a kernel option that disables internal MAC Framework
synchronization protecting against dynamic load and unload of MAC
policies, and instead simply blocks load and unload.  In a static
configuration, this allows you to avoid the synchronization costs
associated with introducing dynamicism.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, McAfee Research
2004-05-03 20:53:05 +00:00
Doug Ambrisko
33c5911242 Some enhancements and bug fix.
-  Define option FORCECONSPEED to force the serial console to
        be CONSPEED.  I've run into a lot of boards in which
        the detect for prior speed doesn't work and ends up with
        broken console since it is at the wrong speed.
     -  If a serial port is marked as a console, but console=vidconsole
        and if the serial ports doesn't exist it will be probed and
        attached at a 8250 chip.  Then writes to that will freeze the
        system.
     -  Add an option flags 0x400000 to mark this as a potential
        comconsole in-case the one flaged with 0x10 does not exist
        in the system.

This makes it easier to deploy on systems with one or two serial ports.

Obtained from:	IronPort
2004-04-30 21:16:52 +00:00
Maksim Yevmenkin
b84b10f92f Address few style issues pointed out by bde
Reviewed by:	bde, ru
2004-04-27 16:38:15 +00:00
Roman Kurakin
0a6818e29c Connect ng_sppp to the build process. 2004-04-24 22:03:02 +00:00
Maksim Yevmenkin
666ea1b65f Make sure Bluetooth stuff can be statically compiled into kernel
Submitted by:	ps
Reviewed by:	imp (mentor), ru
2004-04-23 19:48:43 +00:00
Scott Long
fce240d124 garbage collect ASR_MEASURE_PERFORMANCE 2004-04-21 20:18:06 +00:00
Nate Lawson
e27f806a59 As promised a while ago, remove DA_OLD_QUIRKS and all quirks it was enabling.
These are no longer needed now that we don't send 6-byte commands to RBC
devices.
2004-04-19 03:33:55 +00:00
Warner Losh
9dc313a3f7 Add glue for new sx driver. 2004-04-11 20:01:18 +00:00
John Baldwin
535eb30962 Add a new kernel option MUTEX_WAKE_ALL that changes the mutex unlock code
to awaken all waiters when a contested mutex is released instead of just
the highest priority waiter.  If the various threads are awakened in
sequence then each thread may acquire and release the lock in question
without contention resulting in fewer expensive unlock and lock
operations.  This old behavior of waking just the highest priority is
still used if this option is specified.  Making the algorithm conditional
on a kernel option will allows us to benchmark both cases later and
determine which one should be used by default.

Requested by:	tanimura-san
2004-04-06 19:12:24 +00:00
Vinod Kashyap
9e429d3871 Moved comments on 3ware 9000 series RAID controller driver options from
options to NOTES.
2004-03-31 18:46:13 +00:00
Scott Long
662d381879 Give in to the oblique nagging and move AAC and AHC/AHD comments out of
/sys/conf/options and into /sys/conf/NOTES
2004-03-31 08:22:09 +00:00
Vinod Kashyap
f646fe1a8d Added options for 3ware 9000 series RAID controller driver (twa). 2004-03-30 18:53:18 +00:00
Scott Long
846ca5d04f Remove RAIDFrame. It hasn't worked since GEOM replaced the old disk
mini-layer.  I don't have time to bing it forward into the GEOM world, and
no one else has stepped forward to claim it.  It'll be in the Attic for safe
keeping for now.
2004-03-16 12:23:43 +00:00
Benno Rice
bde778e9f2 Add a netgraph node to handle ATM LLC encapsulation. This currently handles
ethernet (tested) and FDDI (not tested).  The main use for this is on ADSL (or
other ATM) connections where bridged ethernet is used, PPPoE being a prime
example.

There is no manual page as yet, I will write one shortly.

Reviewed by:	harti
2004-03-08 10:54:35 +00:00
Poul-Henning Kamp
4103b7652d Rename the WATCHDOG option to SW_WATCHDOG and make it use the
generic watchdoc(9) interface.

Make watchdogd(8) perform as watchdog(8) as well, and make it
possible to specify a check command to run, timeout and sleep
periods.

Update watchdog(4) to talk about the generic interface and add
new watchdog(8) page.
2004-02-28 20:56:35 +00:00
Max Laier
cc5934f5af Tweak existing header and other build infrastructure to be able to build
pf/pflog/pfsync as modules. Do not list them in NOTES or modules/Makefile
(i.e. do not connect it to any (automatic) builds - yet).

Approved by: bms(mentor)
2004-02-26 03:53:54 +00:00
Bruce Evans
d01cde0480 Fixed some insertion sort errors. 2004-02-25 09:35:35 +00:00
Poul-Henning Kamp
9aece96fc0 Add DDB_NUMSYM option which in addition to the symbolic representation
also prints the actual numerical value of the symbol in question.

Users of addr2line(1) will be less proficient in hex arithmetic as a
consequence.

This amongst other things means that traceback lines change from:
   siointr1(c4016800,c073bda0,0,c06b699c,69f) at siointr1+0xc5
to
   siointr1(c4016800,c073bda0,0,c06b699c,69f) at 0xc062b0bd = siointr1+0xc5

I made this an option to avoid bikesheds.
~
~
~
2004-02-24 22:51:42 +00:00
Bruce M Simpson
1cfd4b5326 Initial import of RFC 2385 (TCP-MD5) digest support.
This is the first of two commits; bringing in the kernel support first.
This can be enabled by compiling a kernel with options TCP_SIGNATURE
and FAST_IPSEC.

For the uninitiated, this is a TCP option which provides for a means of
authenticating TCP sessions which came into being before IPSEC. It is
still relevant today, however, as it is used by many commercial router
vendors, particularly with BGP, and as such has become a requirement for
interconnect at many major Internet points of presence.

Several parts of the TCP and IP headers, including the segment payload,
are digested with MD5, including a shared secret. The PF_KEY interface
is used to manage the secrets using security associations in the SADB.

There is a limitation here in that as there is no way to map a TCP flow
per-port back to an SPI without polluting tcpcb or using the SPD; the
code to do the latter is unstable at this time. Therefore this code only
supports per-host keying granularity.

Whilst FAST_IPSEC is mutually exclusive with KAME IPSEC (and thus IPv6),
TCP_SIGNATURE applies only to IPv4. For the vast majority of prospective
users of this feature, this will not pose any problem.

This implementation is output-only; that is, the option is honoured when
responding to a host initiating a TCP session, but no effort is made
[yet] to authenticate inbound traffic. This is, however, sufficient to
interwork with Cisco equipment.

Tested with a Cisco 2501 running IOS 12.0(27), and Quagga 0.96.4 with
local patches. Patches for tcpdump to validate TCP-MD5 sessions are also
available from me upon request.

Sponsored by:	sentex.net
2004-02-11 04:26:04 +00:00
Warner Losh
65b4a1b917 Remote meteor driver. It hasn't compiled in over 3 years. If someone
makes it compile again, and can test it, we can restore the driver to
the tree.
2003-12-07 04:41:11 +00:00
Warner Losh
6ee2f106aa The dgb driver is redundant with the digi driver in the tree. It uses
lots of old interfaces, and digi now supports all cards that dgb
supported.  The author of the driver says that this is no longer
necessary.

Approved by: babkin@
2003-12-07 04:18:52 +00:00