3077 Commits

Author SHA1 Message Date
Peter Wemm
f1b665c8fe Revive backed out pmap related changes from Feb 2002. The highlights are:
- It actually works this time, honest!
- Fine grained TLB shootdowns for SMP on i386.  IPI's are very expensive,
  so try and optimize things where possible.
- Introduce ranged shootdowns that can be done as a single IPI.
- PG_G support for i386
- Specific-cpu targeted shootdowns.  For example, there is no sense in
  globally purging the TLB cache for where we are stealing a page from
  the local unshared process on the local cpu.  Use pm_active to track
  this.
- Add some instrumentation for the tlb shootdown code.
- Rip out SMP code from <machine/cpufunc.h>
- Try and fix some very bogus PG_G and PG_PS interactions that were bad
  enough to cause vm86 bios calls to break.  vm86 depended on our existing
  bugs and this was the cause of the VESA panics last time.
- Fix the silly one-line error that caused the 'panic: bad pte' last time.
- Fix a couple of other silly one-line errors that should have caused more
  pain than they did.

Some more work is needed:
- pmap_{zero,copy}_page[_idle].  These can be done without IPI's if we
  have a hook in cpu_switch.
- The IPI handlers need some cleanup.  I have a bogus %ds load that can
  be avoided.
- APTD handling is rather bogus and appears to be a large source of
  global TLB IPI shootdowns for no really good reason.

I see speedups of between 1.5% and ~4% on buildworlds in a while 1 loop.
I expect to see a bigger difference when there is significant pageout
activity or the system otherwise has memory shortages.

I have backed out a few optimizations that I had been using over the last
few days in order to be a little more conservative.  I'll revisit these
again over the next few days as the dust settles.

New option:  DISABLE_PG_G - In case I missed something.
2002-07-12 07:56:11 +00:00
Alfred Perlstein
074453c230 Introduce syscall.master option 'COMPAT4' which allows one to wrap
syscalls for FreeBSD 4 compatibility.
Add kernel option COMPAT_FREEBSD4 to enable these syscalls.
2002-07-12 06:38:34 +00:00
Bruce Evans
b147fcf936 Fixed misspelling of "hint." as "hints." in the description of the "hint."
keyword and in the description of rp's hints.

Didn't fix rp's hints being mostly in comments so that they are harder to
use (they don't get linted either way because makeLINT.sh strips them and
there is no compile-time syntax checking of hints anyway).
2002-07-11 20:43:37 +00:00
Ruslan Ermilov
6dc6a04be5 Do not override the standard `distribute' target that is currently
available from bsd.obj.mk.

The native version was identical (and pretty much unused except in
the -DMODULES_WITH_WORLD case, which it is not for "make release")
except that the "bin" -> "base" change of the default DISTRIBUTION
name did not propagate here.
2002-07-11 14:13:37 +00:00
Kenneth D. Merry
2c8f5a28bb Move the MSIZE and MCLSHIFT options out of the undocumented section in
NOTES.  Add some comments about the potential problems associated with NIC
driver modules and changing these options.

Fix sorting problems in sys/conf/options with the MSIZE and MCLSHIFT
options.

Reviewed by:	bde
2002-07-11 04:15:53 +00:00
Matt Jacob
e05ec8935c Enable ISP SBus support. 2002-07-11 03:26:39 +00:00
Benno Rice
99bc8c72f7 Add setjmp (needed for DDB). 2002-07-10 12:26:17 +00:00
Benno Rice
45b4eca56d Add DDB support. 2002-07-10 12:21:54 +00:00
Josef Karthauser
04b401aa8a It's not "usio" anymore, it's "ucom".
Submitted by:	nsayer
2002-07-10 01:42:25 +00:00
David E. O'Brien
8442e07371 Desupport the TurboChannel Alpha's. This means the DEC3000/300* Pelic*
and DEC3000/[4-9]00 Flamingo/Sandpiper families.
2002-07-09 19:20:18 +00:00
Mitsuru IWASAKI
98479b041b Resolve conflicts arising from the ACPI CA 20020611 import. 2002-07-09 17:54:02 +00:00
Benno Rice
98f8e6c099 Driver for the Apple UniNorth Host-PCI bridge.
This is in a PowerMac-specific subdirectory as it is hoped that we will support
more than just the PowerMac platform.
2002-07-09 13:34:09 +00:00
Benno Rice
3008110e16 Add ofw_pci.c in the pci case. 2002-07-09 13:29:18 +00:00
Benno Rice
25b60a3b49 1) Add busdma machdep code.
2) Add bus_pio.h and bus_memio.h (which do nothing).

Submitted by:	Peter Grehan <peterg@ptree32.com.au> (1)
2002-07-09 12:47:14 +00:00
Benno Rice
ca01920852 Driver for OpenPIC compatible interrupt controllers.
It's fairly PowerMac specific at the moment, but that should be fixable.
2002-07-09 11:26:10 +00:00
Benno Rice
f6a7723dff Add interrupt handling support code.
I've tried to make this fairly platform-independant as some PowerPC platforms
may not have openpic-style interrupt controllers.  This may not have the best
performance but it works for now.
2002-07-09 11:12:20 +00:00
Mark Peek
b7c5c8fb06 Back out previous TCBHASHSIZE change. This should not be a kernel option.
Pointed out by:	bde
2002-07-08 22:00:43 +00:00
Mark Peek
08d6c46194 Document TCBHASHSIZE in NOTES and add it to the allowable kernel options.
PR:		32912
Submitted by:	Carl Schmidt <carl@slackerbsd.org>
MFC after:	3 days
2002-07-08 02:53:59 +00:00
Benno Rice
f00abca0c4 Add bmtphy.c 2002-07-05 11:08:55 +00:00
Maxime Henrion
2b4edb69f1 Move every code related to mount(2) in a new file, vfs_mount.c.
The file vfs_conf.c which was dealing with root mounting has
been repo-copied into vfs_mount.c to preserve history.
This makes nmount related development easier, and help reducing
the size of vfs_syscalls.c, which is still an enormous file.

Reviewed by:	rwatson
Repo-copy by:	peter
2002-07-02 17:09:22 +00:00
David E. O'Brien
d2be885e99 This is the start of the FreeBSD/x86_64 kernel. 2002-06-30 08:05:21 +00:00
Julian Elischer
e602ba25fd Part 1 of KSE-III
The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by:	Almost everyone who counts
	(at various times, peter, jhb, matt, alfred, mini, bernd,
	and a cast of thousands)

	NOTE: this is still Beta code, and contains lots of debugging stuff.
	expect slight instability in signals..
2002-06-29 17:26:22 +00:00
Benno Rice
825467cae1 Add in_cksum.c 2002-06-29 09:50:20 +00:00
Benno Rice
6c2a062580 Many fixes to low-level trap and interrupt handling:
- Tidy up clock code.  Don't repeatedly call hardclock().
- Remove intrnames, decrnest and intrcnt from locore.s
- Coalesce all trap handling into a single stub that then calls a dispatch
  function.

Submitted by:	Peter Grehan <peterg@ptree32.com.au>
2002-06-29 09:28:21 +00:00
Luigi Rizzo
9758b77ff1 The new ipfw code.
This code makes use of variable-size kernel representation of rules
(exactly the same concept of BPF instructions, as used in the BSDI's
firewall), which makes firewall operation a lot faster, and the
code more readable and easier to extend and debug.

The interface with the rest of the system is unchanged, as witnessed
by this commit. The only extra kernel files that I am touching
are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In
userland I only had to touch those programs which manipulate the
internal representation of firewall rules).

The code is almost entirely new (and I believe I have written the
vast majority of those sections which were taken from the former
ip_fw.c), so rather than modifying the old ip_fw.c I decided to
create a new file, sys/netinet/ip_fw2.c .  Same for the user
interface, which is in sbin/ipfw/ipfw2.c (it still compiles to
/sbin/ipfw).  The old files are still there, and will be removed
in due time.

I have not renamed the header file because it would have required
touching a one-line change to a number of kernel files.

In terms of user interface, the new "ipfw" is supposed to accepts
the old syntax for ipfw rules (and produce the same output with
"ipfw show". Only a couple of the old options (out of some 30 of
them) has not been implemented, but they will be soon.

On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon
also between options), and write things like

ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any

This should make rulesets slightly more compact (and lines longer!),
by condensing 2 or more of the old rules into single ones.

Also, as an example of how easy the rules can be extended, I have
implemented an 'address set' match pattern, where you can specify
an IP address in a format like this:

        10.20.30.0/26{18,44,33,22,9}

which will match the set of hosts listed in braces belonging to the
subnet 10.20.30.0/26 . The match is done using a bitmap, so it is
essentially a constant time operation requiring a handful of CPU
instructions (and a very small amount of memmory -- for a full /24
subnet, the instruction only consumes 40 bytes).

Again, in this commit I have focused on functionality and tried
to minimize changes to the other parts of the system. Some performance
improvement can be achieved with minor changes to the interface of
ip_fw_chk_t. This will be done later when this code is settled.

The code is meant to compile unmodified on RELENG_4 (once the
PACKET_TAG_* changes have been merged), for this reason
you will see #ifdef __FreeBSD_version in a couple of places.
This should minimize errors when (hopefully soon) it will be time
to do the MFC.
2002-06-27 23:02:18 +00:00
Kenneth D. Merry
98cb733c67 At long last, commit the zero copy sockets code.
MAKEDEV:	Add MAKEDEV glue for the ti(4) device nodes.

ti.4:		Update the ti(4) man page to include information on the
		TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options,
		and also include information about the new character
		device interface and the associated ioctls.

man9/Makefile:	Add jumbo.9 and zero_copy.9 man pages and associated
		links.

jumbo.9:	New man page describing the jumbo buffer allocator
		interface and operation.

zero_copy.9:	New man page describing the general characteristics of
		the zero copy send and receive code, and what an
		application author should do to take advantage of the
		zero copy functionality.

NOTES:		Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS,
		TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.

conf/files:	Add uipc_jumbo.c and uipc_cow.c.

conf/options:	Add the 5 options mentioned above.

kern_subr.c:	Receive side zero copy implementation.  This takes
		"disposable" pages attached to an mbuf, gives them to
		a user process, and then recycles the user's page.
		This is only active when ZERO_COPY_SOCKETS is turned on
		and the kern.ipc.zero_copy.receive sysctl variable is
		set to 1.

uipc_cow.c:	Send side zero copy functions.  Takes a page written
		by the user and maps it copy on write and assigns it
		kernel virtual address space.  Removes copy on write
		mapping once the buffer has been freed by the network
		stack.

uipc_jumbo.c:	Jumbo disposable page allocator code.  This allocates
		(optionally) disposable pages for network drivers that
		want to give the user the option of doing zero copy
		receive.

uipc_socket.c:	Add kern.ipc.zero_copy.{send,receive} sysctls that are
		enabled if ZERO_COPY_SOCKETS is turned on.

		Add zero copy send support to sosend() -- pages get
		mapped into the kernel instead of getting copied if
		they meet size and alignment restrictions.

uipc_syscalls.c:Un-staticize some of the sf* functions so that they
		can be used elsewhere.  (uipc_cow.c)

if_media.c:	In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid
		calling malloc() with M_WAITOK.  Return an error if
		the M_NOWAIT malloc fails.

		The ti(4) driver and the wi(4) driver, at least, call
		this with a mutex held.  This causes witness warnings
		for 'ifconfig -a' with a wi(4) or ti(4) board in the
		system.  (I've only verified for ti(4)).

ip_output.c:	Fragment large datagrams so that each segment contains
		a multiple of PAGE_SIZE amount of data plus headers.
		This allows the receiver to potentially do page
		flipping on receives.

if_ti.c:	Add zero copy receive support to the ti(4) driver.  If
		TI_PRIVATE_JUMBOS is not defined, it now uses the
		jumbo(9) buffer allocator for jumbo receive buffers.

		Add a new character device interface for the ti(4)
		driver for the new debugging interface.  This allows
		(a patched version of) gdb to talk to the Tigon board
		and debug the firmware.  There are also a few additional
		debugging ioctls available through this interface.

		Add header splitting support to the ti(4) driver.

		Tweak some of the default interrupt coalescing
		parameters to more useful defaults.

		Add hooks for supporting transmit flow control, but
		leave it turned off with a comment describing why it
		is turned off.

if_tireg.h:	Change the firmware rev to 12.4.11, since we're really
		at 12.4.11 plus fixes from 12.4.13.

		Add defines needed for debugging.

		Remove the ti_stats structure, it is now defined in
		sys/tiio.h.

ti_fw.h:	12.4.11 firmware.

ti_fw2.h:	12.4.11 firmware, plus selected fixes from 12.4.13,
		and my header splitting patches.  Revision 12.4.13
		doesn't handle 10/100 negotiation properly.  (This
		firmware is the same as what was in the tree previously,
		with the addition of header splitting support.)

sys/jumbo.h:	Jumbo buffer allocator interface.

sys/mbuf.h:	Add a new external mbuf type, EXT_DISPOSABLE, to
		indicate that the payload buffer can be thrown away /
		flipped to a userland process.

socketvar.h:	Add prototype for socow_setup.

tiio.h:		ioctl interface to the character portion of the ti(4)
		driver, plus associated structure/type definitions.

uio.h:		Change prototype for uiomoveco() so that we'll know
		whether the source page is disposable.

ufs_readwrite.c:Update for new prototype of uiomoveco().

vm_fault.c:	In vm_fault(), check to see whether we need to do a page
		based copy on write fault.

vm_object.c:	Add a new function, vm_object_allocate_wait().  This
		does the same thing that vm_object allocate does, except
		that it gives the caller the opportunity to specify whether
		it should wait on the uma_zalloc() of the object structre.

		This allows vm objects to be allocated while holding a
		mutex.  (Without generating WITNESS warnings.)

		vm_object_allocate() is implemented as a call to
		vm_object_allocate_wait() with the malloc flag set to
		M_WAITOK.

vm_object.h:	Add prototype for vm_object_allocate_wait().

vm_page.c:	Add page-based copy on write setup, clear and fault
		routines.

vm_page.h:	Add page based COW function prototypes and variable in
		the vm_page structure.

Many thanks to Drew Gallatin, who wrote the zero copy send and receive
code, and to all the other folks who have tested and reviewed this code
over the years.
2002-06-26 03:37:47 +00:00
Warner Losh
6b891daaa5 Partially back out the "make all interfaces standard" commit. There's
a small chance that it might have broken loading the miibus, so err on
the side of caution until I can figure out what is going on.  This
backs out all but the PCI, PCIB and ISA bus interfaces being
"standard," which have been well tested...
2002-06-24 01:53:26 +00:00
Warner Losh
8c575e95cd plxcard for OLDCARD almost certainly isn't going to happen. 2002-06-23 07:31:29 +00:00
Warner Losh
f24cd27f4f As disclosed to arch@, make more interfaces standard. This allows for
easier loading of modules that might refer to these interfaces.  None
of the code that implements them is standard, just the glue.  This
bloats the kernel a whopping 8k.

Silence on: arch@
2002-06-23 07:27:24 +00:00
Robert Watson
e35e7abac0 Remove CAPABILITIES from NOTES 2002-06-21 19:53:04 +00:00
Julian Elischer
a835396035 A node that creates a device entry in /dev (yay devfs)
so that /dev/mumble can be the entrypoint to some networking graph,
e.g. a tunnel or a remote tape drive or whatever...

Not fully tested (by me) yet.

Submitted by:	Mark Santcroos <marks@ripe.net>
MFC after:	3 weeks
2002-06-18 21:32:33 +00:00
Nick Hibma
d8dbc77c56 Make the speed used by gdb over serial settable in the kernel configuration.
This facilitates the use in circumstances where you are using a serial
console as well. GDB doesn't support anything higher than 9600 baud (19k2
if you are lucky), but the console does.
2002-06-18 21:30:37 +00:00
David E. O'Brien
97f9c29ef3 Allow one to configure `sio'. 2002-06-18 01:14:54 +00:00
Nick Hibma
dba3dc7bdc Use OBJDIR instead of CURDIR. This unbreaks loading modules through
'make load' if an object dir was, like it is used in /sys/modules. I.e.

	cd /sys/modules/umass
	make obj
	make
	make load

works again without having to install the module.

If no objdir was used the module in the current directory is used.
2002-06-17 20:01:06 +00:00
John Hay
cd669cef39 sppp needs slcompress.c nowadays.
PR:		39369
2002-06-17 05:40:49 +00:00
Maxime Henrion
2812d7722d Removed a duplicate -ffreestanding. It's already set in bsd.kern.mk.
Approved by:	bde
2002-06-16 10:42:05 +00:00
Robert Watson
a3cce19f7d kern_cap.c no longer needed. 2002-06-13 23:19:34 +00:00
Robert Watson
1bde53c130 POSIX.1e capabilities aren't here yet, don't put an option for it
in the options file.
2002-06-13 22:41:23 +00:00
Brooks Davis
22afbb6bb0 Remote pci.h/NPCI usage from i4b code.
Approved by:	hm
2002-06-13 06:04:28 +00:00
Poul-Henning Kamp
11b2dcdbbe Put geom_gpt.c under the GEOM option instead of having a special GEOM_GPT
option for it.
2002-06-10 18:49:41 +00:00
Jake Burkholder
f5ee661c9b Remove code from trap which is handled in userland now. 2002-06-08 07:17:19 +00:00
John Baldwin
363ba2bcfd According to Bruce, this file shouldn't have comments to describe what
options do.  Comments should be in NOTES and having the comments in two
places usually means that one place will just bitrot.  Thus, remove the
comment for KTRACE_REQUEST_POOL from the previous revision.

Requested by:	bde
2002-06-07 14:33:23 +00:00
John Baldwin
ea3fc8e4cd Overhaul the ktrace subsystem a bit. For the most part, the actual vnode
operations to dump a ktrace event out to an output file are now handled
asychronously by a ktrace worker thread.  This enables most ktrace events
to not need Giant once p_tracep and p_traceflag are suitably protected by
the new ktrace_lock.

There is a single todo list of pending ktrace requests.  The various
ktrace tracepoints allocate a ktrace request object and tack it onto the
end of the queue.  The ktrace kernel thread grabs requests off the head of
the queue and processes them using the trace vnode and credentials of the
thread triggering the event.

Since we cannot assume that the user memory referenced when doing a
ktrgenio() will be valid and since we can't access it from the ktrace
worker thread without a bit of hassle anyways, ktrgenio() requests are
still handled synchronously.  However, in order to ensure that the requests
from a given thread still maintain relative order to one another, when a
synchronous ktrace event (such as a genio event) is triggered, we still put
the request object on the todo list to synchronize with the worker thread.
The original thread blocks atomically with putting the item on the queue.
When the worker thread comes across an asynchronous request, it wakes up
the original thread and then blocks to ensure it doesn't manage to write a
later event before the original thread has a chance to write out the
synchronous event.  When the original thread wakes up, it writes out the
synchronous using its own context and then finally wakes the worker thread
back up.  Yuck.  The sychronous events aren't pretty but they do work.

Since ktrace events can be triggered in fairly low-level areas (msleep()
and cv_wait() for example) the ktrace code is designed to use very few
locks when posting an event (currently just the ktrace_mtx lock and the
vnode interlock to bump the refcoun on the trace vnode).  This also means
that we can't allocate a ktrace request object when an event is triggered.
Instead, ktrace request objects are allocated from a pre-allocated pool
and returned to the pool after a request is serviced.

The size of this pool defaults to 100 objects, which is about 13k on an
i386 kernel.  The size of the pool can be adjusted at compile time via the
KTRACE_REQUEST_POOL kernel option, at boot time via the
kern.ktrace_request_pool loader tunable, or at runtime via the
kern.ktrace_request_pool sysctl.

If the pool of request objects is exhausted, then a warning message is
printed to the console.  The message is rate-limited in that it is only
printed once until the size of the pool is adjusted via the sysctl.

I have tested all kernel traces but have not tested user traces submitted
by utrace(2), though they should work fine in theory.

Since a ktrace request has several properties (content of event, trace
vnode, details of originating process, credentials for I/O, etc.), I chose
to drop the first argument to the various ktrfoo() functions.  Currently
the functions just assume the event is posted from curthread.  If there is
a great desire to do so, I suppose I could instead put back the first
argument but this time make it a thread pointer instead of a vnode pointer.

Also, KTRPOINT() now takes a thread as its first argument instead of a
process.  This is because the check for a recursive ktrace event is now
per-thread instead of process-wide.

Tested on:	i386
Compiles on:	sparc64, alpha
2002-06-07 05:32:59 +00:00
Matthew N. Dodd
26837af419 'device hea' is no longer broken.
Add 'nowerror' to a few 'hea' files to ignore warnings on volatiles.
2002-06-07 02:04:09 +00:00
Justin T. Gibbs
cdd49e97b4 Hook up the ahd driver. 2002-06-06 16:35:58 +00:00
Prafulla Deuskar
a7fabc2b60 Added support for 82545EM and 82546EB based adapters.
Added Vlan support.

MFC after:	1 week
2002-06-03 22:30:51 +00:00
Matthew N. Dodd
26c1165dce Add new 'hea' driver files. 2002-06-03 09:14:12 +00:00
Alfred Perlstein
6e330f3e36 bde noticed that SOMAXCONN breaks pretty badly as an option for LINT.
so back it out.
2002-06-02 04:32:52 +00:00
Brooks Davis
09d225d8c3 The loop back device hasn't been a count device for a while so remove
the number of interfaces.
2002-05-31 06:28:13 +00:00
Takanori Watanabe
80f1001813 Make oldcard and newcard kernel module work. 2002-05-30 17:38:00 +00:00