freebsd-skq

Author	SHA1	Message	Date
Luigi Rizzo	9758b77ff1	The new ipfw code. This code makes use of variable-size kernel representation of rules (exactly the same concept of BPF instructions, as used in the BSDI's firewall), which makes firewall operation a lot faster, and the code more readable and easier to extend and debug. The interface with the rest of the system is unchanged, as witnessed by this commit. The only extra kernel files that I am touching are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In userland I only had to touch those programs which manipulate the internal representation of firewall rules. The code is almost entirely new (and I believe I have written the vast majority of those sections which were taken from the former ip_fw.c), so rather than modifying the old ip_fw.c I decided to create a new file, sys/netinet/ip_fw2.c . Same for the user interface, which is in sbin/ipfw/ipfw2.c (it still compiles to /sbin/ipfw). The old files are still there, and will be removed in due time. I have not renamed the header file because it would have required a one-line change to a number of kernel files. In terms of user interface, the new "ipfw" is supposed to accept the old syntax for ipfw rules (and produce the same output with "ipfw show"). Only a couple of the old options (out of some 30 of them) have not been implemented, but they will be soon. On the other hand, the new code has some very powerful extensions. First, you can put "or" connectives between match fields (and soon also between options), and write things like ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any This should make rulesets slightly more compact (and lines longer!), by condensing 2 or more of the old rules into single ones.
Also, as an example of how easily the rules can be extended, I have implemented an 'address set' match pattern, where you can specify an IP address in a format like this: 10.20.30.0/26{18,44,33,22,9} which will match the set of hosts listed in braces belonging to the subnet 10.20.30.0/26 . The match is done using a bitmap, so it is essentially a constant time operation requiring a handful of CPU instructions (and a very small amount of memory -- for a full /24 subnet, the instruction only consumes 40 bytes). Again, in this commit I have focused on functionality and tried to minimize changes to the other parts of the system. Some performance improvement can be achieved with minor changes to the interface of ip_fw_chk_t. This will be done later when this code is settled. The code is meant to compile unmodified on RELENG_4 (once the PACKET_TAG_* changes have been merged); for this reason you will see #ifdef __FreeBSD_version in a couple of places. This should minimize errors when (hopefully soon) it will be time to do the MFC.	2002-06-27 23:02:18 +00:00
Kenneth D. Merry	98cb733c67	At long last, commit the zero copy sockets code. MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes. ti.4: Update the ti(4) man page to include information on the TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options, and also include information about the new character device interface and the associated ioctls. man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated links. jumbo.9: New man page describing the jumbo buffer allocator interface and operation. zero_copy.9: New man page describing the general characteristics of the zero copy send and receive code, and what an application author should do to take advantage of the zero copy functionality. NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS, TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT. conf/files: Add uipc_jumbo.c and uipc_cow.c. conf/options: Add the 5 options mentioned above. kern_subr.c: Receive side zero copy implementation. This takes "disposable" pages attached to an mbuf, gives them to a user process, and then recycles the user's page. This is only active when ZERO_COPY_SOCKETS is turned on and the kern.ipc.zero_copy.receive sysctl variable is set to 1. uipc_cow.c: Send side zero copy functions. Takes a page written by the user and maps it copy on write and assigns it kernel virtual address space. Removes copy on write mapping once the buffer has been freed by the network stack. uipc_jumbo.c: Jumbo disposable page allocator code. This allocates (optionally) disposable pages for network drivers that want to give the user the option of doing zero copy receive. uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are enabled if ZERO_COPY_SOCKETS is turned on. Add zero copy send support to sosend() -- pages get mapped into the kernel instead of getting copied if they meet size and alignment restrictions. uipc_syscalls.c:Un-staticize some of the sf* functions so that they can be used elsewhere. 
(uipc_cow.c) if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid calling malloc() with M_WAITOK. Return an error if the M_NOWAIT malloc fails. The ti(4) driver and the wi(4) driver, at least, call this with a mutex held. This causes witness warnings for 'ifconfig -a' with a wi(4) or ti(4) board in the system. (I've only verified for ti(4)). ip_output.c: Fragment large datagrams so that each segment contains a multiple of PAGE_SIZE amount of data plus headers. This allows the receiver to potentially do page flipping on receives. if_ti.c: Add zero copy receive support to the ti(4) driver. If TI_PRIVATE_JUMBOS is not defined, it now uses the jumbo(9) buffer allocator for jumbo receive buffers. Add a new character device interface for the ti(4) driver for the new debugging interface. This allows (a patched version of) gdb to talk to the Tigon board and debug the firmware. There are also a few additional debugging ioctls available through this interface. Add header splitting support to the ti(4) driver. Tweak some of the default interrupt coalescing parameters to more useful defaults. Add hooks for supporting transmit flow control, but leave it turned off with a comment describing why it is turned off. if_tireg.h: Change the firmware rev to 12.4.11, since we're really at 12.4.11 plus fixes from 12.4.13. Add defines needed for debugging. Remove the ti_stats structure, it is now defined in sys/tiio.h. ti_fw.h: 12.4.11 firmware. ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13, and my header splitting patches. Revision 12.4.13 doesn't handle 10/100 negotiation properly. (This firmware is the same as what was in the tree previously, with the addition of header splitting support.) sys/jumbo.h: Jumbo buffer allocator interface. sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to indicate that the payload buffer can be thrown away / flipped to a userland process. socketvar.h: Add prototype for socow_setup. 
tiio.h: ioctl interface to the character portion of the ti(4) driver, plus associated structure/type definitions. uio.h: Change prototype for uiomoveco() so that we'll know whether the source page is disposable. ufs_readwrite.c:Update for new prototype of uiomoveco(). vm_fault.c: In vm_fault(), check to see whether we need to do a page based copy on write fault. vm_object.c: Add a new function, vm_object_allocate_wait(). This does the same thing that vm_object_allocate() does, except that it gives the caller the opportunity to specify whether it should wait on the uma_zalloc() of the object structure. This allows vm objects to be allocated while holding a mutex. (Without generating WITNESS warnings.) vm_object_allocate() is implemented as a call to vm_object_allocate_wait() with the malloc flag set to M_WAITOK. vm_object.h: Add prototype for vm_object_allocate_wait(). vm_page.c: Add page-based copy on write setup, clear and fault routines. vm_page.h: Add page based COW function prototypes and variable in the vm_page structure. Many thanks to Drew Gallatin, who wrote the zero copy send and receive code, and to all the other folks who have tested and reviewed this code over the years.	2002-06-26 03:37:47 +00:00
Warner Losh	6b891daaa5	Partially back out the "make all interfaces standard" commit. There's a small chance that it might have broken loading the miibus, so err on the side of caution until I can figure out what is going on. This backs out all but the PCI, PCIB and ISA bus interfaces being "standard," which have been well tested...	2002-06-24 01:53:26 +00:00
Warner Losh	8c575e95cd	plxcard for OLDCARD almost certainly isn't going to happen.	2002-06-23 07:31:29 +00:00
Warner Losh	f24cd27f4f	As disclosed to arch@, make more interfaces standard. This allows for easier loading of modules that might refer to these interfaces. None of the code that implements them is standard, just the glue. This bloats the kernel a whopping 8k. Silence on: arch@	2002-06-23 07:27:24 +00:00
Robert Watson	e35e7abac0	Remove CAPABILITIES from NOTES	2002-06-21 19:53:04 +00:00
Julian Elischer	a835396035	A node that creates a device entry in /dev (yay devfs) so that /dev/mumble can be the entrypoint to some networking graph, e.g. a tunnel or a remote tape drive or whatever... Not fully tested (by me) yet. Submitted by: Mark Santcroos <marks@ripe.net> MFC after: 3 weeks	2002-06-18 21:32:33 +00:00
Nick Hibma	d8dbc77c56	Make the speed used by gdb over serial settable in the kernel configuration. This facilitates the use in circumstances where you are using a serial console as well. GDB doesn't support anything higher than 9600 baud (19k2 if you are lucky), but the console does.	2002-06-18 21:30:37 +00:00
David E. O'Brien	97f9c29ef3	Allow one to configure `sio'.	2002-06-18 01:14:54 +00:00
Nick Hibma	dba3dc7bdc	Use OBJDIR instead of CURDIR. This unbreaks loading modules through 'make load' if an object dir was used, as it is in /sys/modules. I.e. 'cd /sys/modules/umass; make obj; make; make load' works again without having to install the module. If no objdir was used the module in the current directory is used.	2002-06-17 20:01:06 +00:00
John Hay	cd669cef39	sppp needs slcompress.c nowadays. PR: 39369	2002-06-17 05:40:49 +00:00
Maxime Henrion	2812d7722d	Removed a duplicate -ffreestanding. It's already set in bsd.kern.mk. Approved by: bde	2002-06-16 10:42:05 +00:00
Robert Watson	a3cce19f7d	kern_cap.c no longer needed.	2002-06-13 23:19:34 +00:00
Robert Watson	1bde53c130	POSIX.1e capabilities aren't here yet, don't put an option for it in the options file.	2002-06-13 22:41:23 +00:00
Brooks Davis	22afbb6bb0	Remove pci.h/NPCI usage from i4b code. Approved by: hm	2002-06-13 06:04:28 +00:00
Poul-Henning Kamp	11b2dcdbbe	Put geom_gpt.c under the GEOM option instead of having a special GEOM_GPT option for it.	2002-06-10 18:49:41 +00:00
Jake Burkholder	f5ee661c9b	Remove code from trap which is handled in userland now.	2002-06-08 07:17:19 +00:00
John Baldwin	363ba2bcfd	According to Bruce, this file shouldn't have comments to describe what options do. Comments should be in NOTES and having the comments in two places usually means that one place will just bitrot. Thus, remove the comment for KTRACE_REQUEST_POOL from the previous revision. Requested by: bde	2002-06-07 14:33:23 +00:00
John Baldwin	ea3fc8e4cd	Overhaul the ktrace subsystem a bit. For the most part, the actual vnode operations to dump a ktrace event out to an output file are now handled asynchronously by a ktrace worker thread. This enables most ktrace events to not need Giant once p_tracep and p_traceflag are suitably protected by the new ktrace_lock. There is a single todo list of pending ktrace requests. The various ktrace tracepoints allocate a ktrace request object and tack it onto the end of the queue. The ktrace kernel thread grabs requests off the head of the queue and processes them using the trace vnode and credentials of the thread triggering the event. Since we cannot assume that the user memory referenced when doing a ktrgenio() will be valid and since we can't access it from the ktrace worker thread without a bit of hassle anyways, ktrgenio() requests are still handled synchronously. However, in order to ensure that the requests from a given thread still maintain relative order to one another, when a synchronous ktrace event (such as a genio event) is triggered, we still put the request object on the todo list to synchronize with the worker thread. The original thread blocks atomically with putting the item on the queue. When the worker thread comes across a synchronous request, it wakes up the original thread and then blocks to ensure it doesn't manage to write a later event before the original thread has a chance to write out the synchronous event. When the original thread wakes up, it writes out the synchronous event using its own context and then finally wakes the worker thread back up. Yuck. The synchronous events aren't pretty but they do work. Since ktrace events can be triggered in fairly low-level areas (msleep() and cv_wait() for example) the ktrace code is designed to use very few locks when posting an event (currently just the ktrace_mtx lock and the vnode interlock to bump the refcount on the trace vnode).
This also means that we can't allocate a ktrace request object when an event is triggered. Instead, ktrace request objects are allocated from a pre-allocated pool and returned to the pool after a request is serviced. The size of this pool defaults to 100 objects, which is about 13k on an i386 kernel. The size of the pool can be adjusted at compile time via the KTRACE_REQUEST_POOL kernel option, at boot time via the kern.ktrace_request_pool loader tunable, or at runtime via the kern.ktrace_request_pool sysctl. If the pool of request objects is exhausted, then a warning message is printed to the console. The message is rate-limited in that it is only printed once until the size of the pool is adjusted via the sysctl. I have tested all kernel traces but have not tested user traces submitted by utrace(2), though they should work fine in theory. Since a ktrace request has several properties (content of event, trace vnode, details of originating process, credentials for I/O, etc.), I chose to drop the first argument to the various ktrfoo() functions. Currently the functions just assume the event is posted from curthread. If there is a great desire to do so, I suppose I could instead put back the first argument but this time make it a thread pointer instead of a vnode pointer. Also, KTRPOINT() now takes a thread as its first argument instead of a process. This is because the check for a recursive ktrace event is now per-thread instead of process-wide. Tested on: i386 Compiles on: sparc64, alpha	2002-06-07 05:32:59 +00:00
Matthew N. Dodd	26837af419	'device hea' is no longer broken. Add 'nowerror' to a few 'hea' files to ignore warnings on volatiles.	2002-06-07 02:04:09 +00:00
Justin T. Gibbs	cdd49e97b4	Hook up the ahd driver.	2002-06-06 16:35:58 +00:00
Prafulla Deuskar	a7fabc2b60	Added support for 82545EM and 82546EB based adapters. Added Vlan support. MFC after: 1 week	2002-06-03 22:30:51 +00:00
Matthew N. Dodd	26c1165dce	Add new 'hea' driver files.	2002-06-03 09:14:12 +00:00
Alfred Perlstein	6e330f3e36	bde noticed that SOMAXCONN breaks pretty badly as an option for LINT. so back it out.	2002-06-02 04:32:52 +00:00
Brooks Davis	09d225d8c3	The loop back device hasn't been a count device for a while so remove the number of interfaces.	2002-05-31 06:28:13 +00:00
Takanori Watanabe	80f1001813	Make oldcard and newcard kernel module work.	2002-05-30 17:38:00 +00:00
David E. O'Brien	31741f8a9e	PHK claims there is a crc32.c now.	2002-05-29 21:58:56 +00:00
David E. O'Brien	22f24d720a	Back out revision 1.639. PHK failed to commit the libkern file.	2002-05-29 21:57:27 +00:00
Poul-Henning Kamp	f4258597dc	Add one copy of crc32() and crc32_tab[] in libkern, and remove it two other places. Comment out crc32 related definitions in zlib.h, we don't seem to have the corresponding code in our kernel.	2002-05-29 20:24:09 +00:00
Jake Burkholder	1982efc5c2	Merge the code in pv.c into pmap.c directly. Place all page mappings onto the pv lists in the vm_page, even unmanaged kernel mappings. This is so that the virtual cachability of these mappings can be tracked when a page is mapped to more than one virtual address. All virtually cachable mappings of a physical page must have the same virtual colour, or illegal aliases can be created in the data cache. This is a bit tricky because we still have to recognize managed and unmanaged mappings, even though they are all on the pv lists.	2002-05-29 06:08:45 +00:00
Marcel Moolenaar	bcd46c600a	Add support to GEOM for GUID Partition Tables (GPTs). The support is currently conditional on both the GEOM and GEOM_GPT options to avoid getting GPT by default and having the MBR and GPT classes clash. The correct behaviour of the MBR class would be to back-off (reject) a MBR if it's a Protective MBR (a MBR with a single partition of type 0xEE that spans the whole disk, as far as the MBR is concerned). The correct behaviour of the GPT class would be to back-off (reject) a GPT if there's a MBR that's not a Protective MBR. At this stage it's so inconvenient to destroy a good MBR when working with GPTs that it's more convenient to have the MBR class back-off when it detects the GPT signature on disk and have the GPT class ignore the MBR. In sys/gpt.h UUIDs (GUIDs) for the following FreeBSD partitions have been defined: GPT_ENT_TYPE_FREEBSD FreeBSD slice with disklabel. This is the equivalent of the well-known FreeBSD MBR partition type. GPT_ENT_TYPE_FREEBSD_{SWAP|UFS|UFS2|VINUM} FreeBSD partitions in the context of disklabel. This is speculating on the idea to use the GPT to hold partitions instead of slices and removing the fixed (and low) limits we have on the number of partitions. This commit lacks a GPT image for the regression suite.	2002-05-28 09:04:48 +00:00
Marcel Moolenaar	52183d0145	Add uuidgen(2) and uuidgen(1). The uuidgen command, by means of the uuidgen syscall, generates one or more Universally Unique Identifiers compatible with OSF/DCE 1.1 version 1 UUIDs. From the Perforce logs (change 11995): Round of cleanups: o Give uuidgen() the correct prototype in syscalls.master o Define struct uuid according to DCE 1.1 in sys/uuid.h o Use struct uuid instead of uuid_t. The latter is defined in sys/uuid.h but should not be used in kernel land. o Add snprintf_uuid(), printf_uuid() and sbuf_printf_uuid() to kern_uuid.c for use in the kernel (currently geom_gpt.c). o Rename the non-standard struct uuid in kern/kern_uuid.c to struct uuid_private and give it a slightly better definition for better byte-order handling. See below. o In sys/gpt.h, fix the broken uuid definitions to match the now compliant struct uuid definition. See below. o In usr.bin/uuidgen/uuidgen.c catch up with struct uuid change. A note about byte-order: The standard failed to provide a non-conflicting and unambiguous definition for the binary representation. My initial implementation always wrote the timestamp as a 64-bit little-endian (2s-complement) integral. The clock sequence was always written as a 16-bit big-endian (2s-complement) integral. After a good night's sleep and a couple of Pan Galactic Gargle Blasters (not necessarily in that order :-) I reread the spec and came to the conclusion that the time fields are always written in the native byte order, provided the low, mid and hi chopping still occurs. The spec mentions that you "might need to swap bytes if you talk to a machine that has a different byte-order". The clock sequence is always written in big-endian order (as is the IEEE 802 address) because its division is resulting in bytes, making the ordering unambiguous.	2002-05-28 06:16:08 +00:00
Poul-Henning Kamp	291daf5735	Add a proof-of-concept encryption class. "The only hard problem in cryptography is key-management." All sectors are encrypted with AES in CBC mode using a constant key, currently compiled in and all zero. To activate this module, write the magic header on the partition: echo "<<FreeBSD-GEOM-AES>>" | dd conv=sync of=/dev/md98 The encrypted device will be one sector shorter and have ".aes" appended to its name. Sponsored by: DARPA & NAI Labs.	2002-05-26 18:14:38 +00:00
Jake Burkholder	a6b82b31b1	Remove a hack for using an external compiler if cross compiling.	2002-05-26 15:55:28 +00:00
Peter Wemm	e09d00a880	For now, make the .ifdef GCC3 case default. We should change -Wno-format back to -fformat-extensions (or whatever) when we have the functionality. We are gaining warnings again that should be fixed but they are being hidden by NO_WERROR and all the -Wformat noise.	2002-05-24 01:02:45 +00:00
Ruslan Ermilov	1cd1fdeaf5	Fixed broken ``make -jX install''. Spotted by: make release TARGET_ARCH=ia64	2002-05-23 07:25:01 +00:00
John Baldwin	2498cf8c42	Add code to make default mutexes adaptive if the ADAPTIVE_MUTEXES kernel option is used (not on by default). - In the case of trying to lock a mutex, if the MTX_CONTESTED flag is set, then we can safely read the thread pointer from the mtx_lock member while holding sched_lock. We then examine the thread to see if it is currently executing on another CPU. If it is, then we keep looping instead of blocking. - In the case of trying to unlock a mutex, it is now possible for a mutex to have MTX_CONTESTED set in mtx_lock but to not have any threads actually blocked on it, so we need to handle that case. In that case, we just release the lock as if MTX_CONTESTED was not set and return. - We do not adaptively spin on Giant as Giant is held for long times and it slows SMP systems down to a crawl (it was taking several minutes, like 5-10 or so, for my test alpha and sparc64 SMP boxes to boot up when they adaptively spun on Giant). - We only compile in the code to do this for SMP kernels; it doesn't make sense for UP kernels. Tested on: i386, alpha, sparc64	2002-05-21 20:47:11 +00:00
Noriaki Mitsunaga	15e19cbbe8	MFi386: 1.398-1.399 (${MACHINE_ARCH}_dump.c -> dump_machdep.c)	2002-05-21 04:13:08 +00:00
Jake Burkholder	f7c81a5182	De-inline the tlb demap functions. These were so big that gcc3.1 refused to inline them anyway. ;)	2002-05-20 16:10:17 +00:00
Yoshihiro Takahashi	db39e02e6b	MFi386: revision 1.400.	2002-05-19 13:20:05 +00:00
Yoshihiro Takahashi	05012df834	Remove unneeded entries.	2002-05-19 13:18:10 +00:00
Marcel Moolenaar	1a4a595c4b	Remove CWARNFLAGS and add GCC3. We handle GCC3.x specific flags centrally now that we have GCC3 in the tree. The GCC3 variable is a helper during the switch.	2002-05-19 03:41:48 +00:00
Marcel Moolenaar	c444f61706	Hook up the new linux_ptrace implementation. PR: 33299 Submitted by: Alexander N. Kabaev <ak03@gte.com>	2002-05-19 01:27:14 +00:00
Robert Watson	2bab796d96	Remove IFS from 5.0-CURRENT. This facilitates introducing UFS2 as IFS had its fingers deep in the belly of the UFS/FFS split. IFS will be reimplemented by the maintainer at a later date. Requested by: adrian (maintainer)	2002-05-19 00:11:08 +00:00
Tom Rhodes	d394511de3	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
Ian Dowse	2bf6dd18ba	The ufs/ffs files are no longer required by ext2fs.	2002-05-16 20:54:44 +00:00
Ian Dowse	9504abaad7	Complete the separation of ext2fs from ufs by copying the remaining shared code and converting all ufs references. Originally it may have made sense to share common features between the two filesystems, but recently it has only caused problems, the UFS2 work being the final straw. All UFS_* indirect calls are now direct calls to ext2_* functions, and ext2fs-specific mount and inode structures have been introduced.	2002-05-16 19:08:03 +00:00
Jeff Roberson	0e2d6cc899	Disable the shared locking namei() code for now. It breaks several stacking filesystems. This is on hold until the rest of VFS Locking is reviewed and deemed safe. It can be enabled with 'options LOOKUP_SHARED'.	2002-05-14 21:59:49 +00:00
Ruslan Ermilov	be1d673d24	Check that kldxref(8) exists before running it.	2002-05-14 07:49:12 +00:00
Benno Rice	289fc68db6	Build the fpu support routines.	2002-05-13 07:53:22 +00:00
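
The 'address set' match described in commit 9758b77ff1 above (10.20.30.0/26{18,44,33,22,9}) can be illustrated with a small bitmap lookup. This is only a sketch under stated assumptions: struct addr_set, addr_set_init(), addr_set_add(), addr_set_match() and the IP4() macro are hypothetical names invented here and are not the data structures or functions used in ip_fw2.c. It assumes addresses in host byte order and subnets no larger than a /24, so that 256 bits are enough.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build an IPv4 address in host byte order (hypothetical helper). */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical layout for illustration; not the ip_fw2.c instruction. */
struct addr_set {
    uint32_t base;    /* network address of the subnet */
    uint32_t size;    /* number of addresses in the subnet, at most 256 */
    uint32_t bits[8]; /* one bit per host: 8 * 32 = 256 bits */
};

/* Prepare an empty set for base/prefix_len (prefix_len between 24 and 32). */
static void
addr_set_init(struct addr_set *s, uint32_t base, unsigned prefix_len)
{
    assert(prefix_len >= 24 && prefix_len <= 32);
    memset(s, 0, sizeof(*s));
    s->size = 1u << (32 - prefix_len);
    s->base = base & ~(s->size - 1);
}

/* Mark host number 'host' (offset within the subnet) as a member. */
static void
addr_set_add(struct addr_set *s, uint32_t host)
{
    assert(host < s->size);
    s->bits[host / 32] |= 1u << (host % 32);
}

/* Return 1 if addr is in the set, 0 otherwise. No loop: constant time. */
static int
addr_set_match(const struct addr_set *s, uint32_t addr)
{
    /* Unsigned subtraction wraps to a large value when addr < base. */
    uint32_t host = addr - s->base;

    if (host >= s->size)
        return 0;
    return (int)((s->bits[host / 32] >> (host % 32)) & 1u);
}
```

With the set initialised for 10.20.30.0/26 and hosts 18, 44, 33, 22 and 9 added, addr_set_match() accepts 10.20.30.18 and rejects 10.20.30.19 as well as any address outside the /26. The lookup is one subtraction, one comparison and one shift-and-mask, which is the "handful of CPU instructions" idea from the commit message. In this sketch the 32-byte bitmap plus two 32-bit fields happens to total 40 bytes.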

3053 Commits