- Split the code out into if_clone.[ch].
- Locked struct if_clone. [1]
- Add a per-cloner match function rather then simply matching names of
the form <name><unit> and <name>.
- Use the match function to allow creation of <interface>.<tag>
vlan interfaces. The old way is preserved unchanged!
- Also the match function to allow creation of stf(4) interfaces named
stf0, stf, or 6to4. This is the only major user visible change in
that "ifconfig stf" creates the interface stf rather then stf0 and
does not print "stf0" to stdout.
- Allow destroy functions to fail so they can refuse to delete
interfaces. Currently, we forbid the deletion of interfaces which
were created in the init function, particularly lo0, pflog0, and
pfsync0. In the case of lo0 this was a panic implementation so it
does not count as a user visiable change. :-)
- Since most interfaces do not need the new functionality, an family of
wrapper functions, ifc_simple_*(), were created to wrap old style
cloner functions.
- The IF_CLONE_INITIALIZER macro is replaced with a new incompatible
IFC_CLONE_INITIALIZER and ifc_simple consumers use IFC_SIMPLE_DECLARE
instead.
Submitted by: Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1]
Reviewed by: andre, mlaier
Discussed on: net
Version 3.5 brings:
- Atomic commits of ruleset changes (reduce the chance of ending up in an
inconsistent state).
- A 30% reduction in the size of state table entries.
- Source-tracking (limit number of clients and states per client).
- Sticky-address (the flexibility of round-robin with the benefits of
source-hash).
- Significant improvements to interface handling.
- and many more ...
your (network) modules as well as any userland that might make sense of
sizeof(struct ifnet).
This does not change the queueing yet. These changes will follow in a
seperate commit. Same with the driver changes, which need case by case
evaluation.
__FreeBSD_version bump will follow.
Tested-by: (i386)LINT
conform to the rfc2734 and rfc3146 standard for IP over firewire and
should eventually supercede the fwe driver. Right now the broadcast
channel number is hardwired and we don't support MCAP for multicast
channel allocation - more infrastructure is required in the firewire
code itself to fix these problems.
all of the interface between the driver and the bus. This will enable
us to stop special casing eisa bus attachments in modules and treat them
like we treat all other busses.
In the longer run, we need to eliminate much (all?) of these interfaces
and switch to using the standard bus_alloc_resource(), but that's not
done right now.
# I've not updated the modules to include eisa, etc, just yet
Tested on: Compaq Proliant 3000/333 purchased for eisa work
mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.
Extensions to UMA worth noting:
- Better layering between slab <-> zone caches; introduce
Keg structure which splits off slab cache away from the
zone structure and allows multiple zones to be stacked
on top of a single Keg (single type of slab cache);
perhaps we should look into defining a subset API on
top of the Keg for special use by malloc(9),
for example.
- UMA_ZONE_REFCNT zones can now be added, and reference
counters automagically allocated for them within the end
of the associated slab structures. uma_find_refcnt()
does a kextract to fetch the slab struct reference from
the underlying page, and lookup the corresponding refcnt.
mbuma things worth noting:
- integrates mbuf & cluster allocations with extended UMA
and provides caches for commonly-allocated items; defines
several zones (two primary, one secondary) and two kegs.
- change up certain code paths that always used to do:
m_get() + m_clget() to instead just use m_getcl() and
try to take advantage of the newly defined secondary
Packet zone.
- netstat(1) and systat(1) quickly hacked up to do basic
stat reporting but additional stats work needs to be
done once some other details within UMA have been taken
care of and it becomes clearer to how stats will work
within the modified framework.
From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used. The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.
Additional things worth noting/known issues (READ):
- One report of 'ips' (ServeRAID) driver acting really
slow in conjunction with mbuma. Need more data.
Latest report is that ips is equally sucking with
and without mbuma.
- Giant leak in NFS code sometimes occurs, can't
reproduce but currently analyzing; brueffer is
able to reproduce but THIS IS NOT an mbuma-specific
problem and currently occurs even WITHOUT mbuma.
- Issues in network locking: there is at least one
code path in the rip code where one or more locks
are acquired and we end up in m_prepend() with
M_WAITOK, which causes WITNESS to whine from within
UMA. Current temporary solution: force all UMA
allocations to be M_NOWAIT from within UMA for now
to avoid deadlocks unless WITNESS is defined and we
can determine with certainty that we're not holding
any locks when we're M_WAITOK.
- I've seen at least one weird socketbuffer empty-but-
mbuf-still-attached panic. I don't believe this
to be related to mbuma but please keep your eyes
open, turn on debugging, and capture crash dumps.
This change removes more code than it adds.
A paper is available detailing the change and considering
various performance issues, it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.
Testing and Debugging:
rwatson,
brueffer,
Ketrien I. Saihr-Kesenchedra,
...
Reviewed by: Lots of people (for different parts)
been developed for use with FreeBSD, version 4.8 and later.
Submitted by: Hema Joyce
Reviewed by: Prafulla Deuskar
Approved by: Prafulla Deuskar
MFC after: 1 week
converts miidevs to a .h file, so rename to reflect that.
The usb and pccard versions have also been renamed and will be hooked
into the build system shortly (I've made the conversion in my p4
tree).
- Connect geom_stripe and geom_nop modules to the build.
- Connect STRIPE and NOP classes to the LINT build.
- Disconnect gconcat(8) from the build.
Supported by: Wheel - Open Technologies - http://www.wheel.pl
parameter).
Keep using it only in the i386 NOTES for now. It is fairly MI, but it
doesn't use bus-space and has a couple of i386 i/o instructions in pci
intitialization.
The VIA Nehemias is so obviously specific to i386 that it should not
be compiled on non-i386 platforms. The obviousness is in the fact that
all functions in nehemias.c are purely i386 inline assembly, guarded
by #ifdef __i386__
can more easily be used INSTEAD OF the hard-working Yarrow.
The only hardware source used at this point is the one inside
the VIA C3 Nehemiah (Stepping 3 and above) CPU. More sources will
be added in due course. Contributions welcome!
to select a serial console and debug port (resp). On ia64 these replace
the use of hints completely and take precedence over hints on alpha,
amd64 and i386. On sparc64 these variables are not yet recognised.
The reasons for introducing these variables are:
1. Hints have side-effects. They reserve the unit number for use by
isa or acpi devices and therefore cannot be used to select a pci
device. Also, the use of a unit number to select a device prior
to bus enumeration is nonsense. The new variables have no side-
effects and are not based on unit numbers.
2. Hints don't have the expression power to allow the sysadmin to
select UARTs that are not legacy PC devices and need the support
of compile-time constants to give the sysadmin some level of
flexibility.
The hw.uart.console and hw.uart.dbgport variables specify a list of
attributes. An attribute is a tag-value pair, seperated by a colon.
Attributes are seperated by a comma. Where possible, tags are the
same as those in /etc/remote (only br and pa in practice). Details
can be found in the manpage (not part of this commit).
Not tested on: amd64, pc98
from ddp_usrreq.c. Functions moved are:
at_pcballoc()
at_pcbconnect()
at_pcbdetach()
at_pcbdisconnect()
at_pcbsetaddr()
at_sockaddr()
Also moved are ddp_ports and ddpcb, global variables associated with DDP
pcbs. This makes PCB implementation more parallel to inet, inet6, and
ipx.
mini-layer. I don't have time to bing it forward into the GEOM world, and
no one else has stepped forward to claim it. It'll be in the Attic for safe
keeping for now.
Intel C/C++ compiler (lang/icc) to build the kernel.
The icc CPUTYPE CFLAGS use icc v7 syntax, icc v8 moans about them, but
doesn't abort. They also produce CPU specific code (new instructions
of the CPU, not only CPU specific scheduling), so if you get coredumps
with signal 4 (SIGILL, illegal instruction) you've used the wrong
CPUTYPE.
Incarnations of this patch survive gcc compiles and my make universe.
I use it on my desktop.
To use it update share/mk, add
/usr/local/intel/compiler70/ia32/bin (icc v7, works)
or
/usr/local/intel_cc_80/bin (icc v8, doesn't work)
to your PATH, make sure you have a new kernel compile directory
(e.g. MYKERNEL_icc) and run
CFLAGS="-O2 -ip" CC=icc make depend
CFLAGS="-O2 -ip" CC=icc make
in it.
Don't compile with -ipo, the build infrastructure uses ld directly to
link the kernel and the modules, but -ipo needs the link step to be
performed with Intel's linker.
Problems with icc v8:
- panic: npx0 cannot be emulated on an SMP system
- UP: first start of /bin/sh results in a FP exception
Parts of this commit contains suggestions or submissions from
Marius Strobl <marius@alchemy.franken.de>.
Reviewed by: silence on -arch
Submitted by: netchild
ethernet (tested) and FDDI (not tested). The main use for this is on ADSL (or
other ATM) connections where bridged ethernet is used, PPPoE being a prime
example.
There is no manual page as yet, I will write one shortly.
Reviewed by: harti
but a bit more reamins to be done. For now, it is usable.
PR:
Submitted by: Taku YAMAMOTO <taku@cent.saitama-u.ac.jp>
Reviewed by:
Approved by:
Obtained from:
MFC after:
sleep queue interface:
- Sleep queues attempt to merge some of the benefits of both sleep queues
and condition variables. Having sleep qeueus in a hash table avoids
having to allocate a queue head for each wait channel. Thus, struct cv
has shrunk down to just a single char * pointer now. However, the
hash table does not hold threads directly, but queue heads. This means
that once you have located a queue in the hash bucket, you no longer have
to walk the rest of the hash chain looking for threads. Instead, you have
a list of all the threads sleeping on that wait channel.
- Outside of the sleepq code and the sleep/cv code the kernel no longer
differentiates between cv's and sleep/wakeup. For example, calls to
abortsleep() and cv_abort() are replaced with a call to sleepq_abort().
Thus, the TDF_CVWAITQ flag is removed. Also, calls to unsleep() and
cv_waitq_remove() have been replaced with calls to sleepq_remove().
- The sched_sleep() function no longer accepts a priority argument as
sleep's no longer inherently bump the priority. Instead, this is soley
a propery of msleep() which explicitly calls sched_prio() before
blocking.
- The TDF_ONSLEEPQ flag has been dropped as it was never used. The
associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been
dropped and replaced with a single explicit clearing of td_wchan.
TD_SET_ONSLEEPQ() would really have only made sense if it had taken
the wait channel and message as arguments anyway. Now that that only
happens in one place, a macro would be overkill.
pf/pflog/pfsync as modules. Do not list them in NOTES or modules/Makefile
(i.e. do not connect it to any (automatic) builds - yet).
Approved by: bms(mentor)
to a new mac_inet.c. This code is now conditionally compiled based
on inet support being compiled into the kernel.
Move socket related MAC Framework entry points from mac_net.c to a new
mac_socket.c.
To do this, some additional _enforce MIB variables are now non-static.
In addition, mbuf_to_label() is now mac_mbuf_to_label() and non-static.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
This is the first of two commits; bringing in the kernel support first.
This can be enabled by compiling a kernel with options TCP_SIGNATURE
and FAST_IPSEC.
For the uninitiated, this is a TCP option which provides for a means of
authenticating TCP sessions which came into being before IPSEC. It is
still relevant today, however, as it is used by many commercial router
vendors, particularly with BGP, and as such has become a requirement for
interconnect at many major Internet points of presence.
Several parts of the TCP and IP headers, including the segment payload,
are digested with MD5, including a shared secret. The PF_KEY interface
is used to manage the secrets using security associations in the SADB.
There is a limitation here in that as there is no way to map a TCP flow
per-port back to an SPI without polluting tcpcb or using the SPD; the
code to do the latter is unstable at this time. Therefore this code only
supports per-host keying granularity.
Whilst FAST_IPSEC is mutually exclusive with KAME IPSEC (and thus IPv6),
TCP_SIGNATURE applies only to IPv4. For the vast majority of prospective
users of this feature, this will not pose any problem.
This implementation is output-only; that is, the option is honoured when
responding to a host initiating a TCP session, but no effort is made
[yet] to authenticate inbound traffic. This is, however, sufficient to
interwork with Cisco equipment.
Tested with a Cisco 2501 running IOS 12.0(27), and Quagga 0.96.4 with
local patches. Patches for tcpdump to validate TCP-MD5 sessions are also
available from me upon request.
Sponsored by: sentex.net
delete it each time its run and have it regenerated each time by make.
I used a quick hackish script rather than putting it in the files file
and used the before-depend rule to avoid the depend/no-depend hacks.
a long time: lmc The LAN Media Corp PCI WAN driver based on tulip.
This driver hasn't compiled for 3 years since the PCI compat shims
were removed, and Lan Media appears to have gone out of business.
These cards appear to be rare (a recent search of ebay had no hits).
Should someone wish to revive this driver, submitting patches to make
it compile plus a testing report will bring it back.
aid other kernel code, especially code which can be in a module such as
the acpi_cpu(4) driver, to work properly with both SMP and UP kernels.
The exported symbols include mp_ncpus, all_cpus, mp_maxid, smp_started, and
the smp_rendezvous() function. This also means that CPU_ABSENT() is now
always implemented the same on all kernels.
Approved by: re (scottl)
the routing table. Move all usage and references in the tcp stack
from the routing table metrics to the tcp hostcache.
It caches measured parameters of past tcp sessions to provide better
initial start values for following connections from or to the same
source or destination. Depending on the network parameters to/from
the remote host this can lead to significant speedups for new tcp
connections after the first one because they inherit and shortcut
the learning curve.
tcp_hostcache is designed for multiple concurrent access in SMP
environments with high contention and is hash indexed by remote
ip address.
It removes significant locking requirements from the tcp stack with
regard to the routing table.
Reviewed by: sam (mentor), bms
Reviewed by: -net, -current, core@kame.net (IPv6 parts)
Approved by: re (scottl)
* Use the cpu_idle_hook() to do idling for C1-C3.
* Use both _CST and the FADT to detect Cx states.
* Use both _PTC and P_CNT for controlling throttling.
* Add a notify handler to detect changes in _CST and _PSS
* Call the _INI function for each processor if present. This will be
done by ACPI-CA in the future.
* Fix a bug on SMP systems where CPUs will attach multiple times if the
bus is rescan.
* Document new sysctls for controlling idling.
Short description of ip_fastforward:
o adds full direct process-to-completion IPv4 forwarding code
o handles ip fragmentation incl. hw support (ip_flow did not)
o sends icmp needfrag to source if DF is set (ip_flow did not)
o supports ipfw and ipfilter (ip_flow did not)
o supports divert, ipfw fwd and ipfilter nat (ip_flow did not)
o returns anything it can't handle back to normal ip_input
Enable with sysctl -w net.inet.ip.fastforwarding=1
Reviewed by: sam (mentor)
in various kernel objects to represent security data, we embed a
(struct label *) pointer, which now references labels allocated using
a UMA zone (mac_label.c). This allows the size and shape of struct
label to be varied without changing the size and shape of these kernel
objects, which become part of the frozen ABI with 5-STABLE. This opens
the door for boot-time selection of the number of label slots, and hence
changes to the bound on the number of simultaneous labeled policies
at boot-time instead of compile-time. This also makes it easier to
embed label references in new objects as required for locking/caching
with fine-grained network stack locking, such as inpcb structures.
This change also moves us further in the direction of hiding the
structure of kernel objects from MAC policy modules, not to mention
dramatically reducing the number of '&' symbols appearing in both the
MAC Framework and MAC policy modules, and improving readability.
While this results in minimal performance change with MAC enabled, it
will observably shrink the size of a number of critical kernel data
structures for the !MAC case, and should have a small (but measurable)
performance benefit (i.e., struct vnode, struct socket) do to memory
conservation and reduced cost of zeroing memory.
NOTE: Users of MAC must recompile their kernel and all MAC modules as a
result of this change. Because this is an API change, third party
MAC modules will also need to be updated to make less use of the '&'
symbol.
Suggestions from: bmilekic
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
turnstiles to implement blocking isntead of implementing a thread queue
directly. These turnstiles are somewhat similar to those used in Solaris 7
as described in Solaris Internals but are also different.
Turnstiles do not come out of a fixed-sized pool. Rather, each thread is
assigned a turnstile when it is created that it frees when it is destroyed.
When a thread blocks on a lock, it donates its turnstile to that lock to
serve as queue of blocked threads. The queue associated with a given lock
is found by a lookup in a simple hash table. The turnstile itself is
protected by a lock associated with its entry in the hash table. This
means that sched_lock is no longer needed to contest on a mutex. Instead,
sched_lock is only used when manipulating run queues or thread priorities.
Turnstiles also implement priority propagation inherently.
Currently turnstiles only support mutexes. Eventually, however, turnstiles
may grow two queue's to support a non-sleepable reader/writer lock
implementation. For more details, see the comments in sys/turnstile.h and
kern/subr_turnstile.c.
The two primary advantages from the turnstile code include: 1) the size
of struct mutex shrinks by four pointers as it no longer stores the
thread queue linkages directly, and 2) less contention on sched_lock in
SMP systems including the ability for multiple CPUs to contend on different
locks simultaneously (not that this last detail is necessarily that much of
a big win). Note that 1) means that this commit is a kernel ABI breaker,
so don't mix old modules with a new kernel and vice versa.
Tested on: i386 SMP, sparc64 SMP, alpha SMP
dcons(4): very simple console and gdb port driver
dcons_crom(4): FireWire attachment
dconschat(8): User interface to dcons
Tested with: i386, i386-PAE, and sparc64.
security/mac/mac_net.c
security/mac/mac_pipe.c
security/mac/mac_process.c
security/mac/mac_system.c
security/mac/mac_vfs.c
Note: Here begins a period of NOTES/LINT build breakage due to duplicate
symbols that will shortly be removed from kern_mac.c.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Though this is still incomplete and has some missing features such as
exclusive login and event notification, it may be enough for someone
who wants to play with it.
This driver is supposed to work with firewire(4), targ(4) of CAM(4)
and scsi_target(8) which can be found in /usr/share/example/scsi_target.
This driver doesn't require sbp(4) which implements initiator mode.
Sample configuration:
Kernel: (you can use modules as well)
device firewire
device scbus
device targ
device sbp_targ
After reboot:
# mdconfig -a -t malloc -s 10m
md0
# scsi_target 0:0:0 /dev/md0
(Assuming sbp_targ0 on scbus0)
You should find the 10MB HDD on FreeBSD/MacOS X/WinXP or whatever connected
to the target using FireWire.
Manpage is not finished yet.
do exactly the same as vop_nopoll() for consistency and put a
comment in the two pointing at each other.
Retire seltrue() in favour of no_poll().
Create private default functions in kern_conf.c instead of public
ones.
Change default strategy to return the bio with ENODEV instead of
doing nothing which would lead the bio stranded.
Retire public nullopen() and nullclose() as well as the entire band
of public no{read,write,ioctl,mmap,kqfilter,strategy,poll,dump}
funtions, they are the default actions now.
Move the final two trivial functions from subr_xxx.c to kern_conf.c
and retire the now empty subr_xxx.c
ethernet chips. This driver is pretty simple, however it contains
special DSP initialization code which is needed in order to get
the chip to negotiate a gigE link. (This special initialization
may not be needed in subsequent chip revs.) Also:
- Fix typo in if_rlreg.h (RL_GMEDIASTAT_1000MPS -> RL_GMEDIASTAT_1000MBPS)
- Deal with shared interrupts in re_intr(): if interface isn't up,
return.
- Fix another bug in re_gmii_writereg() (properly apply data field mask)
- Allow PHY driver to read the RL_GMEDIASTAT register via the
re_gmii_readreg() register (this is register needed to determine
real time link/media status).
written by Stuart Walsh and Duncan Barclay (with some kibbitzing by
me). I'm checking it in on Stuart's behalf.
The BCM4401 is built into several x86 laptop and desktop systems. For the
moment, I have only enabled it in the x86 kernel config because although
it's a PCI device, I haven't heard of any standalone NICs that use it. If
somebody knows of one, we can easily add it to the other arches.
This driver uses register/structure data gleaned from the Linux
driver released by Broadcom, but does not contain any of the code
from the Linux driver itself. It uses busdma.
rl(4) driver and put it in a new re(4) driver. The re(4) driver shares
the if_rlreg.h file with rl(4) but is a separate module. (Ultimately
I may change this. For now, it's convenient.)
rl(4) has been modified so that it will never attach to an 8139C+
chip, leaving it to re(4) instead. Only re(4) has the PCI IDs to
match the 8169/8169S/8110S gigE chips. if_re.c contains the same
basic code that was originally bolted onto if_rl.c, with the
following updates:
- Added support for jumbo frames. Currently, there seems to be
a limit of approximately 6200 bytes for jumbo frames on transmit.
(This was determined via experimentation.) The 8169S/8110S chips
apparently are limited to 7.5K frames on transmit. This may require
some more work, though the framework to handle jumbo frames on RX
is in place: the re_rxeof() routine will gather up frames than span
multiple 2K clusters into a single mbuf list.
- Fixed bug in re_txeof(): if we reap some of the TX buffers,
but there are still some pending, re-arm the timer before exiting
re_txeof() so that another timeout interrupt will be generated, just
in case re_start() doesn't do it for us.
- Handle the 'link state changed' interrupt
- Fix a detach bug. If re(4) is loaded as a module, and you do
tcpdump -i re0, then you do 'kldunload if_re,' the system will
panic after a few seconds. This happens because ether_ifdetach()
ends up calling the BPF detach code, which notices the interface
is in promiscuous mode and tries to switch promisc mode off while
detaching the BPF listner. This ultimately results in a call
to re_ioctl() (due to SIOCSIFFLAGS), which in turn calls re_init()
to handle the IFF_PROMISC flag change. Unfortunately, calling re_init()
here turns the chip back on and restarts the 1-second timeout loop
that drives re_tick(). By the time the timeout fires, if_re.ko
has been unloaded, which results in a call to invalid code and
blows up the system.
To fix this, I cleared the IFF_UP flag before calling ether_ifdetach(),
which stops the ioctl routine from trying to reset the chip.
- Modified comments in re_rxeof() relating to the difference in
RX descriptor status bit layout between the 8139C+ and the gigE
chips. The layout is different because the frame length field
was expanded from 12 bits to 13, and they got rid of one of the
status bits to make room.
- Add diagnostic code (re_diag()) to test for the case where a user
has installed a broken 32-bit 8169 PCI NIC in a 64-bit slot. Some
NICs have the REQ64# and ACK64# lines connected even though the
board is 32-bit only (in this case, they should be pulled high).
This fools the chip into doing 64-bit DMA transfers even though
there is no 64-bit data path. To detect this, re_diag() puts the
chip into digital loopback mode and sets the receiver to promiscuous
mode, then initiates a single 64-byte packet transmission. The
frame is echoed back to the host, and if the frame contents are
intact, we know DMA is working correctly, otherwise we complain
loudly on the console and abort the device attach. (At the moment,
I don't know of any way to work around the problem other than
physically modifying the board, so until/unless I can think of a
software workaround, this will have do to.)
- Created re(4) man page
- Modified rlphy.c to allow re(4) to attach as well as rl(4).
Note that this code works for the sample 8169/Marvell 88E1000 NIC
that I have, but probably won't work for the 8169S/8110S chips.
RealTek has sent me some sample NICs, but they haven't arrived yet.
I will probably need to add an rlgphy driver to handle the on-board
PHY in the 8169S/8110S (it needs special DSP initialization).
of what uart(4) is and/or is not see the initial commit log of one
of the files in sys/dev/uart (or see share/man/man4/uart.4).
Note that currently pc98 shares the MD file with i386. This needs
to change when pc98 support is fleshed-out to properly support the
various UARTs. A good example is sparc64 in this respect.
We build uart(4) as a module on all platforms. This may break
the ppc port. That depends on whether they do actually build
modules.
To use uart(4) on alpha, one must use the NO_SIO option.
o Introduce PUC_PORT_TYPE_UART so that we can attach to uart(4),
o Introduce port sub-types (eg PUC_PORT_UART_NS8250, PUC_PORT_UART_Z8530)
to handle different hardware and determine resource sizes.
o Introduce two new IVARs: PUC_IVAR_SUBTYPE and PUC_IVAR_REGSHFT. Both
are used by uart(4) to get sufficient information to talk to the HW.
o Introduce PUC_FLAGS_ALTRES to tell puc(4) to try memory mapped I/O
if I/O port space cannot be allocated, or vice versa.
o Have ports of type PUC_PORT_TYPE_COM attach to uart(1) if attaching
to sio(4) fails (due to not having the sio driver).
o Put struct puc_device_description in struct puc_softc instead of
having a pointer to a device description in the softc. This allows
us to create device descriptions on the fly without having to use
malloc() or otherwise have them staticly defined.
o Move puc_find_description() from puc.c to puc_pci.c as it's specific
to PCI.
o Add EBUS and SBUS frontends for use on sparc64. Note that the P in
puc stands for PCI, so we kinda mess things up here. It's too soon
to worry about it though. We'll know what to do about it in time.
NOTE: This commit changes the behaviour of puc(4) to not quieten the
device probe and attach for child devices. The uart(4) driver provides
additional device description that is valuable to have.
change also disables interrupts around non-S4 suspends whereas before we
did not do this. Our version of AcpiEnterSleepStateS4bios was almost
identical to the ACPICA version.
Restructure the way ATA/ATAPI commands are processed, use a common
ata_request structure for both. This centralises the way requests
are handled so locking is much easier to handle.
The driver is now layered much more cleanly to seperate the lowlevel
HW access so it can be tailored to specific controllers without touching
the upper layers. This is needed to support some of the newer
semi-intelligent ATA controllers showing up.
The top level drivers (disk, ATAPI devices) are more or less still
the same with just corrections to use the new interface.
Pull ATA out from under Gaint now that locking can be done in a sane way.
Add support for a the National Geode SC1100. Thanks to Soekris engineering
for sponsoring a Soekris 4801 to make this support.
Fixed alot of small bugs in the chipset code for various chips now
we are around in that corner anyways.
found only many tv-cards.
We currently use more ore less evil hacks (slow_msp_audio sysctl) to
configure the various variants of these chips in order to have
stereo autodetection work. Nevertheless, this doesn't always work
even though it _should_, according to the specs.
This is, for example, the case for some popular Hauppauge models sold
sold in Germany.
However, the Linux driver always worked for me and others. Looking at
the sourcecode you will find that the linux-driver uses a very much
enhanced approach to program the various msp34xx chipset variants,
which is also found in the specs for these chips.
This is a port of the Linux MSP34xx code, written by Gerd Knorr
<kraxel@bytesex.org>, who agreed to re-release his code under a
BSD license for this port.
A new config option "BKTR_NEW_MSP34XX_DRIVER" is added, which is required
to enable the new driver. Otherwise the old code is used.
The msp34xx.c file is diff-reduced to the linux-driver to make later
modifications easier, thus it doesn't follow style(9) in most cases.
Approved by: roger (committing this, no time to test/review),
keichii (code review)
it attaches to all existing NATM network interfaces in the system
and creates a HARP physical interface for each of them. This allows
us to use the same set of ATM drivers for all ATM stuff. It is
possible to use the same interface for HARP, NATM and netgraph at the
same time.
is not natural and needlessly exposes a lot of dirty laundry.
Move private interfaces between the two from swap_pager.h to swap_pager.c
and staticize as much as possible.
No functional change.
with a ProATM-155 and an IDT evaluation board and should also work
with a ProATM-25 (it seems to work at least, I cannot really measure
what the card emits). The driver has been tested on i386 and sparc64,
but should work an other archs also. It supports UBR, CBR, ABR and VBR;
AAL0, AAL5 and AALraw. As an additional feature VCI/VPI 0/0 can be
opened for receiving in AALraw mode and receives all cells not claimed
by other open VCs (even cells with invalid GFC, VPI and VCI fields and
OAM cells).
Thanks to Christian Bucari from ProSum for lending two cards and answering
my questions.
large to huge amounts of small or medium sized receive buffers. The problem
with these situations is that they eat up the available DMA address space
very quickly when using mbufs or even mbuf clusters. Additionally this
facility provides a direct mapping between 32-bit integers and these buffers.
This is needed for devices originally designed for 32-bit systems. Ususally
the virtual address of the buffer is used as a handle to find the buffer as
soon as it is returned by the card. This does not work for 64-bit machines
and hence this mapping is needed.
having the PCI-ISA bridge driver depend on both pci and isa.
- Have the PCI-EISA bridge driver depend on both pci and eisa as well.
- Make acpi_isab.c depend on acpi and isa.
Submitted by: Marius Strobl <marius@alchemy.franken.de> (1,2)
ACPI nodes with the plug and play ID's defined for a "Generic ISA Bus
Device" as defined in section 10.7 of the ACPI 2.0 specification. This
gives machines like the Libretto that contain a fake ISA bus that is not
connected via a PCI-ISA bridge an ISA bus for ISA devices to attach to.
Tested by: markm
with a new implementation that has a mostly reentrant "addchar"
routine, supports multiple message buffers in the kernel, and hides
the implementation details from callers.
The new code uses a kind of sequence number to represend the current
read and write positions in the buffer. This approach (suggested
mainly by bde) permits the read and write pointers to be maintained
separately, which reduces the number of atomic operations that are
required. The "mostly reentrant" above refers to the way that while
it is now always safe to have any number of concurrent writers,
readers could see the message buffer after a writer has advanced
the pointers but before it has witten the new character.
Discussed on: freebsd-arch
redundant paths to the same device.
This class reacts to a label in the first sector of the device,
which is created the following way:
# "0123456789abcdef012345..."
# "<----magic-----><-id-...>
echo "GEOM::FOX someid" | dd of=/dev/da0 conv=sync
NB: Since the fact that multiple disk devices are in fact the same
device is not known to GEOM, the geom taste/spoil process cannot
fully catch all corner cases and this module can therefore be
confused if you do the right wrong things.
NB: The disk level drivers need to do the right thing for this to
be useful, and that is not by definition currently the case.
toggle several media options (sonet/sdh, for example) with ifconfig and
to see the carrier state in ifconfig's output. It gives also read/write
access (given the right privilegs) to the S/Uni registers to user space
programs.
ethernet controller. The driver has been tested with the LinkSys
USB200M adapter. I know for a fact that there are other devices out
there with this chip but don't have all the USB vendor/device IDs.
Note: I'm not sure if this will force the driver to end up in the
install kernel image or not. Special magic needs to be done to exclude
it to keep the boot floppies from bloating again, someone please
advise.
monitors the entropy data harvested by crypto drivers to verify it complies
with FIPS 140-2. If data fails any test then the driver discards it and
commences continuous testing of harvested data until it is deemed ok.
Results are collected in a statistics block and, optionally, reported on
the console. In normal use the overhead associated with this driver is
not noticeable.
Note that drivers must (currently) be compiled specially to enable use.
Obtained from: original code by Jason L. Wright
included in XFree86 4.3, but includes some fixes. Notable changes include
Radeon 8500-9100 support, PCI Radeon/Rage 128 support, transform & lighting
support for Radeons, and vblank syncing support for r128, radeon, and mga.
The gamma driver was removed due to lack of any users.
drain routines are done by swi_net, which allows for better queue control
at some future point. Packets may also be directly dispatched to a netisr
instead of queued, this may be of interest at some installations, but
currently defaults to off.
Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
permit users and groups to bind ports for TCP or UDP, and is intended
to be combined with the recently committed support for
net.inet.ip.portrange.reservedhigh. The policy is twiddled using
sysctl(8). To use this module, you will need to compile in MAC
support, and probably set reservedhigh to 0, then twiddle
security.mac.portacl.rules to set things as desired. This policy
module only restricts ports explicitly bound using bind(), not
implicitly bound ports where the port number is selected by the
IP stack. It appears to work properly in my local configuration,
but needs more broad testing.
A sample policy might be:
# sysctl security.mac.portacl.rules="uid:425:tcp:80,uid:425:tcp:79"
This permits uid 425 to bind TCP sockets to ports 79 and 80. Currently
no distinction is made for incoming vs. outgoing ports with TCP,
although that would probably be easy to add.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
time and there's no indication that it will improve anytime soon.
By removing support for SimOS it is possible to build LINT on
Alpha, which is considered more important at the moment.
Not objected to on: alpha@
type M_STRING, now defined in malloc.h. Useful when string parsing
must occur using the kernel strsep() and we want to avoid toasting
the source string.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
This moves all chipset specific code to a new file 'ata-chipset.c'.
Extensive use of tables and pointers to avoid having the same switch
on chipset type in several places, and to allow substituting various
functions for different HW arch needs.
Added PIO mode setup and all DMA modes.
Support for all known SiS chipsets. Thanks to Christoph Kukulies for
sponsoring a nice ASUS P4S8X SiS648 based board for this work!
Tested on: i386, PC98, alpha and sparc64
for the agp module, and add agp to the list of modules to compile for alpha.
Add an alpha_mb() to agp_flush_cache for alpha -- it's not correct but may
improve the situation, and it's what linux and NetBSD do.
Insted of embedding a struct g_stat in consumers and providers, merely
include a pointer.
Remove a couple of <sys/time.h> includes now unneeded.
Add a special allocator for struct g_stat. This allocator will allocate
entire pages and hand out g_stat functions from there. The "id" field
indicates free/used status.
Add "/dev/geom.stats" device driver whic exports the pages from the
allocator to userland with mmap(2) in read-only mode.
This mmap(2) interface should be considered a non-public interface and
the functions in libgeom (not yet committed) should be used to access
the statistics data.
included in the kernel. Include imgact_elf.c in conf/files, instead of
both imgact_elf32.c and imgact_elf64.c, which will use the default word
size for an architecture as defined in machine/elf.h. Architectures that
wish to build an additional image activator for an alternate word size can
include either imgact_elf32.c or imgact_elf64.c in files.${ARCH}, which
allows it to be dependent on MD options instead of solely on architecture.
Glanced at by: peter
With a 1 byte transmit fifo, 3 byte receive fifo, and wierd multiplexed I/O
designed for a Z80 cpu, this chip redefines suckage.
Based on the openbsd and netbsd drivers. Only really works as a console,
modem support is not complete since I can't test it.
This mostly consists of functionality to serialize accesses to
the two ATA channels (which can also be used to "fix" certain
PCI based controllers).
Add support for Acard controllers.
Enable the ATA driver in PC98 GENERIC, and add device hints.
Update man page with latest support.
The PC98 core team has kindly provided me with a PC98
machine that made this all possible, thanks to all that
contributed to that effort, without that this would
probably newer have been possible..
Approved by: re@
Previously these were libc functions but were requested to
be made into system calls for atomicity and to coalesce what
might be two entrances into the kernel (signal mask setting
and floating point trap) into one.
A few style nits and comments from bde are also included.
Tested on alpha by: gallatin
No functional changes, but:
+ the mrouting module now should behave the same as the compiled-in
version (it did not before, some of the rsvp code was not loaded
properly);
+ netinet/ip_mroute.c is now truly optional;
+ removed some redundant/unused code;
+ changed many instances of '0' to NULL and INADDR_ANY as appropriate;
+ removed several static variables to make the code more SMP-friendly;
+ fixed some minor bugs in the mrouting code (mostly, incorrect return
values from functions).
This commit is also a prerequisite to the addition of support for PIM,
which i would like to put in before DP2 (it does not change any of
the existing APIs, anyways).
Note, in the process we found out that some device drivers fail to
properly handle changes in IFF_ALLMULTI, leading to interesting
behaviour when a multicast router is started. This bug is not
corrected by this commit, and will be fixed with a separate commit.
Detailed changes:
--------------------
netinet/ip_mroute.c all the above.
conf/files make ip_mroute.c optional
net/route.c fix mrt_ioctl hook
netinet/ip_input.c fix ip_mforward hook, move rsvp_input() here
together with other rsvp code, and a couple
of indentation fixes.
netinet/ip_output.c fix ip_mforward and ip_mcast_src hooks
netinet/ip_var.h rsvp function hooks
netinet/raw_ip.c hooks for mrouting and rsvp functions, plus
interface cleanup.
netinet/ip_mroute.h remove an unused and optional field from a struct
Most of the code is from Pavlin Radoslavov and the XORP project
Reviewed by: sam
MFC after: 1 week
"refreshing" the label on the vnode before use, just get the label
right from inception. For single-label file systems, set the label
in the generic VFS getnewvnode() code; for multi-label file systems,
leave the labeling up to the file system. With UFS1/2, this means
reading the extended attribute during vfs_vget() as the inode is
pulled off disk, rather than hitting the extended attributes
frequently during operations later, improving performance. This
also corrects sematics for shared vnode locks, which were not
previously present in the system. This chances the cache
coherrency properties WRT out-of-band access to label data, but in
an acceptable form. With UFS1, there is a small race condition
during automatic extended attribute start -- this is not present
with UFS2, and occurs because EAs aren't available at vnode
inception. We'll introduce a work around for this shortly.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
they may be statically linked into the kernel. Note that statically
linked modules, unlike dynamically linked modules, get INVARIANTS,
so if there are INVARIANTS failures, you'll bump into them rather
than not. Add the options to NOTES.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
- Add detach support to the driver so that you can kldunload the module.
Note that currently rc_detach() fails to detach a unit if any of its
child devices are open, thus a kldunload will fail if any of the tty
devices are currently open.
- sys/i386/isa/ic/cd180.h was moved to sys/dev/ic/cd180.h as part of
this change.
Requested by: rwatson
Tested by: rwatson
This is an encryption module designed for to secure denial of access
to the contents of "cold disks" with or without destruction activation.
Major features:
* Based on AES, MD5 and ARC4 algorithms.
* Four cryptographic barriers:
1) Pass-phrase encrypts the master key.
2) Pass-phrase + Lock data locates master key.
3) 128 bit key derived from 2048 bit master key protects sector key.
3) 128 bit random single-use sector keys protect data payload.
* Up to four different changeable pass-phrases.
* Blackening feature for provable destruction of master key material.
* Isotropic disk contents offers no information about sector contents.
* Configurable destination sector range allows steganographic deployment.
This commit adds the kernel part, separate commits will follow for the
userland utility and documentation.
This software was developed for the FreeBSD Project by Poul-Henning Kamp and
NAI Labs, the Security Research Division of Network Associates, Inc. under
DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA CHATS
research program.
Many thanks to Robert Watson, CBOSS Principal Investigator for making this
possible.
Sponsored by: DARPA & NAI Labs.
changes for "LSILogic"
(2) enabled non-disk support through CAM interface
(3) HA_INQ (a) enabled tagged queuing (b) disable reset during
driver loading (b) renamed BSDi string to LSI
(4) disabled detecting disk devices during SCSI INQUIRY
(5) changed dcdb single element sglist to send one entire buffer chunk
(6) nsgelem not set in sglist
(7) ap_data_transfer_length not set for dcdb
(8) changed "struct thread" to "d_thread_t" for compatibliity { xxx_open,
xxx_close, xxx_ioctl }
(9) miscellaneous compatiblity fixes
(10) bug fix for 0x0409/0x1000 card
(11) added compiling amr_cam.c in sys/conf/files
(12) added compiling amr_cam.c in sys/modules/amr/Makefile
Reviewed by:ps
MFC after:1 week
1 week
configuration stuff as well as conditional code in the IPv4 and IPv6
areas. Everything is conditional on FAST_IPSEC which is mutually
exclusive with IPSEC (KAME IPsec implmentation).
As noted previously, don't use FAST_IPSEC with INET6 at the moment.
Reviewed by: KAME, rwatson
Approved by: silence
Supported by: Vernier Networks
- Begin moving scheduler specific functionality into sched_4bsd.c
- Replace direct manipulation of scheduler data with hooks provided by the
new api.
- Remove KSE specific state modifications and single runq assumptions from
kern_switch.c
Reviewed by: -arch
allow us to avoid nasty by-hand string parsing stuff in a number of
places in the kernel, reducing the risk of unexpected consequences
for kernel correctness.
among other things, the DEVFS rule subsystem to match nodes against a
path pattern supplied by the user.
fnmatch.c was repo-copied from src/lib/libc/gen/fnmatch.c, and the
only changes to it are those necessary to make it compile in the
kernel. The relevant parts of fnmatch.h were imported into libkern.h.
Approved by: -arch
NB: But it will enable it in all kernels not having options "NO_GEOM"
Put the GEOM related options into the intended order.
Add "options NO_GEOM" to all kernel configs apart from NOTES.
In some order of controlled fashion, the NO_GEOM options will be
removed, architecture by architecture in the coming days.
There are currently three known issues which may force people to
need the NO_GEOM option:
boot0cfg/fdisk:
Tries to update the MBR while it is being used to control
slices. GEOM does not allow this as a direct operation.
SCSI floppy drives:
Appearantly the scsi-da driver return "EBUSY" if no media
is inserted. This is wrong, it should return ENXIO.
PC98:
It is unclear if GEOM correctly recognizes all variants of
PC98 disklabels. (Help Wanted! I have neither docs nor HW)
These issues are all being worked.
Sponsored by: DARPA & NAI Labs.
This allocate the best IRQ to boot-disable devices (have IRQ 0).
Allocated IRQ will be used for PCI interrupt routing when ACPI is
enabled.
Note that verbose messaging enabled for the time being so that
people can easily notice the strange behavior if it happened.
gets signals operating based on a TailQ, and is good enough to run X11,
GNOME, and do job control. There are some intricate parts which could be
more refined to match the sigset_t versions, but those require further
evaluation of directions in which our signal system can expand and contract
to fit our needs.
After this has been in the tree for a while, I will make in kernel API
changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo
more robustly, such that we can actually pass information with our
(queued) signals to the userland. That will also result in using a
struct ksiginfo pointer, rather than a signal number, in a lot of
kern_sig.c, to refer to an individual pending signal queue member, but
right now there is no defined behaviour for such.
CODAFS is unfinished in this regard because the logic is unclear in
some places.
Sponsored by: New Gold Technology
Reviewed by: bde, tjr, jake [an older version, logic similar]
aac driver dependent on the linux emulation module. This was
especially bad for the release engineers who tried to move the
aac driver from the kernel onto the drivers floppy. The linux
compat bits for this driver are now in their own driver, aac_linux.
It can be loaded as a module or compiled into the kernel. For
the latter case, the AAC_COMPAT_LINUX option is needed, along with
the COMPAT_LINUX option.
I've tested this in every configuration I can think of. This is an
MFC candidate for 4.7.
Idea from: rwatson
MFC after: 3 days
so that it is MI. Allow nfs_mountroot to return an error if the nfs_diskless
struct is not valid, rather than panicing later on. Call nfs_setup_diskless()
from nfs_mountroot if NFS_ROOT is defined, like bootpc_init(). Removed legacy
root mount support for sparc64, and enabled NFS_ROOT by default.
i4bq931, i4b, isic, iwic, ifpi, ifpi2, ifpnp, ihfc, and itjc are
no longer count devices. Also remove a few other instances of N<DEVICE>
being used to control compilation of whole files.
Reviewed by: hm
This feature can be disabled via the AHD/AHC_REG_PRETTY_PRINT kernel
option.
The ahc driver now uses the same debug options mechanism as ahd:
AHC_DEBUG - Compile in debugging code
AHC_DEBUG_OPTS - String of debug options as listed in aic7xxx.h
This is an architecture that present a thing message passing interface
to the OS. You can query as to how many ports and what kind are attached
and enable them and so on.
A less grand view is that this is just another way to package SCSI (SPI or
FC) and FC-IP into a one-driver interface set.
This driver support the following hardware:
LSI FC909: Single channel, 1Gbps, Fibre Channel (FC-SCSI only)
LSI FC929: Dual Channel, 1-2Gbps, Fibre Channel (FC-SCSI only)
LSI 53c1020: Single Channel, Ultra4 (320M) (Untested)
LSI 53c1030: Dual Channel, Ultra4 (320M)
Currently it's in fair shape, but expect a lot of changes over the
next few weeks as it stabilizes.
Credits:
The driver is mostly from some folks from Jeff Roberson's company- I've
been slowly migrating it to broader support that I it came to me as.
The hardware used in developing support came from:
FC909: LSI-Logic, Advansys (now Connetix)
FC929: LSI-Logic
53c1030: Antares Microsystems (they make a very fine board!)
MFC after: 3 weeks
The CAM<>ATAPI layer was submitted by "Thomas Quinot <thomas@cuivre.fr.eu.org>"
changes form the version on the net by me (formatting, ability to be used
alone without the ATAPI native device driver, proper speed reporting...)
See /sys/conf/NOTES for usage.
Submitted by: Thomas Quinot <thomas@cuivre.fr.eu.org>
kernel access control.
Modify procfs so that (when mounted multilabel) it exports process MAC
labels as the vnode labels of procfs vnodes associated with processes.
Approved by: des
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
This driver actually works slightly better on -stable than on -current
(the system locks on detach on -current), so it should be MFC'd somewhat
sooner.
This driver currently points out a difficulty in the sound device framework.
The PCM unregister routine is allowed to refuse the detach if the device is
in use. In the case of a USB device, however, this unregistration is much more
mandatory in nature, since the device is *actually* gone when this call is
made. The sound subsystem really should not refuse an unregistration and
should take its own steps to reject further I/O. As a result, if you detach
a USB sound device while it is in use, you can expect a panic shortly
thereafter.
This device cannot currently record audio. Some routines are unwritten as
of yet in uaudio.c to support recording.
This device hangs my -current box on detach. I don't know why. This does
not happen on my -stable machine.
Obtained from: Hiroyuki Aizu
MFC after: 2 weeks
handler in the kernel at the same time. Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment. This is a big help for execing i386 binaries
on ia64. The ELF exec code grows the ability to map partial pages when
there is a page size difference, eg: emulating 4K pages on 8K or 16K
hardware pages.
Flesh out the i386 emulation support for ia64. At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.
Obtained from: dfr (mostly, many tweaks from me).
administrator to define certain properties of new devfs nodes before
they become visible to the userland. Both static (e.g., /dev/speaker)
and dynamic (e.g., /dev/bpf*, some removable devices) nodes are
supported. Each DEVFS mount may have a different ruleset assigned to
it, permitting different policies to be implemented for things like
jails.
Approved by: phk
one out of a block cipher. This has 2 advantages:
1) The code is _much_ simpler
2) We aren't committing our security to one algorithm (much as we
may think we trust AES).
While I'm here, make an explicit reseed do a slow reseed instead
of a fast; this is in line with what the original paper suggested.
The file vfs_conf.c which was dealing with root mounting has
been repo-copied into vfs_mount.c to preserve history.
This makes nmount related development easier, and help reducing
the size of vfs_syscalls.c, which is still an enormous file.
Reviewed by: rwatson
Repo-copy by: peter
The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)
Reviewed by: Almost everyone who counts
(at various times, peter, jhb, matt, alfred, mini, bernd,
and a cast of thousands)
NOTE: this is still Beta code, and contains lots of debugging stuff.
expect slight instability in signals..
This code makes use of variable-size kernel representation of rules
(exactly the same concept of BPF instructions, as used in the BSDI's
firewall), which makes firewall operation a lot faster, and the
code more readable and easier to extend and debug.
The interface with the rest of the system is unchanged, as witnessed
by this commit. The only extra kernel files that I am touching
are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In
userland I only had to touch those programs which manipulate the
internal representation of firewall rules).
The code is almost entirely new (and I believe I have written the
vast majority of those sections which were taken from the former
ip_fw.c), so rather than modifying the old ip_fw.c I decided to
create a new file, sys/netinet/ip_fw2.c . Same for the user
interface, which is in sbin/ipfw/ipfw2.c (it still compiles to
/sbin/ipfw). The old files are still there, and will be removed
in due time.
I have not renamed the header file because it would have required
touching a one-line change to a number of kernel files.
In terms of user interface, the new "ipfw" is supposed to accepts
the old syntax for ipfw rules (and produce the same output with
"ipfw show". Only a couple of the old options (out of some 30 of
them) has not been implemented, but they will be soon.
On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon
also between options), and write things like
ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any
This should make rulesets slightly more compact (and lines longer!),
by condensing 2 or more of the old rules into single ones.
Also, as an example of how easy the rules can be extended, I have
implemented an 'address set' match pattern, where you can specify
an IP address in a format like this:
10.20.30.0/26{18,44,33,22,9}
which will match the set of hosts listed in braces belonging to the
subnet 10.20.30.0/26 . The match is done using a bitmap, so it is
essentially a constant time operation requiring a handful of CPU
instructions (and a very small amount of memmory -- for a full /24
subnet, the instruction only consumes 40 bytes).
Again, in this commit I have focused on functionality and tried
to minimize changes to the other parts of the system. Some performance
improvement can be achieved with minor changes to the interface of
ip_fw_chk_t. This will be done later when this code is settled.
The code is meant to compile unmodified on RELENG_4 (once the
PACKET_TAG_* changes have been merged), for this reason
you will see #ifdef __FreeBSD_version in a couple of places.
This should minimize errors when (hopefully soon) it will be time
to do the MFC.
MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes.
ti.4: Update the ti(4) man page to include information on the
TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options,
and also include information about the new character
device interface and the associated ioctls.
man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated
links.
jumbo.9: New man page describing the jumbo buffer allocator
interface and operation.
zero_copy.9: New man page describing the general characteristics of
the zero copy send and receive code, and what an
application author should do to take advantage of the
zero copy functionality.
NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS,
TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.
conf/files: Add uipc_jumbo.c and uipc_cow.c.
conf/options: Add the 5 options mentioned above.
kern_subr.c: Receive side zero copy implementation. This takes
"disposable" pages attached to an mbuf, gives them to
a user process, and then recycles the user's page.
This is only active when ZERO_COPY_SOCKETS is turned on
and the kern.ipc.zero_copy.receive sysctl variable is
set to 1.
uipc_cow.c: Send side zero copy functions. Takes a page written
by the user and maps it copy on write and assigns it
kernel virtual address space. Removes copy on write
mapping once the buffer has been freed by the network
stack.
uipc_jumbo.c: Jumbo disposable page allocator code. This allocates
(optionally) disposable pages for network drivers that
want to give the user the option of doing zero copy
receive.
uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are
enabled if ZERO_COPY_SOCKETS is turned on.
Add zero copy send support to sosend() -- pages get
mapped into the kernel instead of getting copied if
they meet size and alignment restrictions.
uipc_syscalls.c:Un-staticize some of the sf* functions so that they
can be used elsewhere. (uipc_cow.c)
if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid
calling malloc() with M_WAITOK. Return an error if
the M_NOWAIT malloc fails.
The ti(4) driver and the wi(4) driver, at least, call
this with a mutex held. This causes witness warnings
for 'ifconfig -a' with a wi(4) or ti(4) board in the
system. (I've only verified for ti(4)).
ip_output.c: Fragment large datagrams so that each segment contains
a multiple of PAGE_SIZE amount of data plus headers.
This allows the receiver to potentially do page
flipping on receives.
if_ti.c: Add zero copy receive support to the ti(4) driver. If
TI_PRIVATE_JUMBOS is not defined, it now uses the
jumbo(9) buffer allocator for jumbo receive buffers.
Add a new character device interface for the ti(4)
driver for the new debugging interface. This allows
(a patched version of) gdb to talk to the Tigon board
and debug the firmware. There are also a few additional
debugging ioctls available through this interface.
Add header splitting support to the ti(4) driver.
Tweak some of the default interrupt coalescing
parameters to more useful defaults.
Add hooks for supporting transmit flow control, but
leave it turned off with a comment describing why it
is turned off.
if_tireg.h: Change the firmware rev to 12.4.11, since we're really
at 12.4.11 plus fixes from 12.4.13.
Add defines needed for debugging.
Remove the ti_stats structure, it is now defined in
sys/tiio.h.
ti_fw.h: 12.4.11 firmware.
ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13,
and my header splitting patches. Revision 12.4.13
doesn't handle 10/100 negotiation properly. (This
firmware is the same as what was in the tree previously,
with the addition of header splitting support.)
sys/jumbo.h: Jumbo buffer allocator interface.
sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to
indicate that the payload buffer can be thrown away /
flipped to a userland process.
socketvar.h: Add prototype for socow_setup.
tiio.h: ioctl interface to the character portion of the ti(4)
driver, plus associated structure/type definitions.
uio.h: Change prototype for uiomoveco() so that we'll know
whether the source page is disposable.
ufs_readwrite.c:Update for new prototype of uiomoveco().
vm_fault.c: In vm_fault(), check to see whether we need to do a page
based copy on write fault.
vm_object.c: Add a new function, vm_object_allocate_wait(). This
does the same thing that vm_object allocate does, except
that it gives the caller the opportunity to specify whether
it should wait on the uma_zalloc() of the object structre.
This allows vm objects to be allocated while holding a
mutex. (Without generating WITNESS warnings.)
vm_object_allocate() is implemented as a call to
vm_object_allocate_wait() with the malloc flag set to
M_WAITOK.
vm_object.h: Add prototype for vm_object_allocate_wait().
vm_page.c: Add page-based copy on write setup, clear and fault
routines.
vm_page.h: Add page based COW function prototypes and variable in
the vm_page structure.
Many thanks to Drew Gallatin, who wrote the zero copy send and receive
code, and to all the other folks who have tested and reviewed this code
over the years.
a small chance that it might have broken loading the miibus, so err on
the side of caution until I can figure out what is going on. This
backs out all but the PCI, PCIB and ISA bus interfaces being
"standard," which have been well tested...
easier loading of modules that might refer to these interfaces. None
of the code that implements them is standard, just the glue. This
bloats the kernel a whopping 8k.
Silence on: arch@
so that /dev/mumble can be the entrypoint to some networking graph,
e.g. a tunnel or a remote tape drive or whatever...
Not fully tested (by me) yet.
Submitted by: Mark Santcroos <marks@ripe.net>
MFC after: 3 weeks
is currently conditional on both the GEOM and GEOM_GPT options to
avoid getting GPT by default and having the MBR and GPT classes
clash.
The correct behaviour of the MBR class would be to back-off (reject)
a MBR if it's a Protective MBR (a MBR with a single partition of type
0xEE that spans the whole disk (as far as the MBR is concerned).
The correct behaviour if the GPT class would be to back-off (reject)
a GPT if there's a MBR that's not a Protective MBR.
At this stage it's inconvenient to destroy a good MBR when working
with GPTs that it's more convenient to have the MBR class back-off
when it detects the GPT signature on disk and have the GPT class
ignore the MBR.
In sys/gpt.h UUIDs (GUIDs) for the following FreeBSD partitions
have been defined:
GPT_ENT_TYPE_FREEBSD
FreeBSD slice with disklabel. This is the equivalent of
the well-known FreeBSD MBR partition type.
GPT_ENT_TYPE_FREEBSD_{SWAP|UFS|UFS2|VINUM}
FreeBSD partitions in the context of disklabel. This is
speculating on the idea to use the GPT to hold partitions
instead if slices and removing the fixed (and low) limits
we have on the number of partitions.
This commit lacks a GPT image for the regression suite.
The uuidgen command, by means of the uuidgen syscall, generates one
or more Universally Unique Identifiers compatible with OSF/DCE 1.1
version 1 UUIDs.
From the Perforce logs (change 11995):
Round of cleanups:
o Give uuidgen() the correct prototype in syscalls.master
o Define struct uuid according to DCE 1.1 in sys/uuid.h
o Use struct uuid instead of uuid_t. The latter is defined
in sys/uuid.h but should not be used in kernel land.
o Add snprintf_uuid(), printf_uuid() and sbuf_printf_uuid()
to kern_uuid.c for use in the kernel (currently geom_gpt.c).
o Rename the non-standard struct uuid in kern/kern_uuid.c
to struct uuid_private and give it a slightly better definition
for better byte-order handling. See below.
o In sys/gpt.h, fix the broken uuid definitions to match the now
compliant struct uuid definition. See below.
o In usr.bin/uuidgen/uuidgen.c catch up with struct uuid change.
A note about byte-order:
The standard failed to provide a non-conflicting and
unambiguous definition for the binary representation. My initial
implementation always wrote the timestamp as a 64-bit little-endian
(2s-complement) integral. The clock sequence was always written
as a 16-bit big-endian (2s-complement) integral. After a good
nights sleep and couple of Pan Galactic Gargle Blasters (not
necessarily in that order :-) I reread the spec and came to the
conclusion that the time fields are always written in the native
by order, provided the the low, mid and hi chopping still occurs.
The spec mentions that you "might need to swap bytes if you talk
to a machine that has a different byte-order". The clock sequence
is always written in big-endian order (as is the IEEE 802 address)
because its division is resulting in bytes, making the ordering
unambiguous.
"The only hard problem in cryptography is key-management."
All sectors are encrypted with AES in CBC mode using a constant key,
currently compiled in and all zero.
To activate this module, write the magic header on the partition:
echo "<<FreeBSD-GEOM-AES>>" | dd conv=sync of=/dev/md98
The encrypted device will be one sector shorter and have ".aes"
appended to its name.
Sponsored by: DARPA & NAI Labs.
IFS had its fingers deep in the belly of the UFS/FFS split. IFS
will be reimplemented by the maintainer at a later date.
Requested by: adrian (maintainer)
shared code and converting all ufs references. Originally it may
have made sense to share common features between the two filesystems,
but recently it has only caused problems, the UFS2 work being the
final straw.
All UFS_* indirect calls are now direct calls to ext2_* functions,
and ext2fs-specific mount and inode structures have been introduced.
0xdeadc0de and then check for it just before memory is handed off as part
of a new request. This will catch any post free/pre alloc modification of
memory, as well as introduce errors for anything that tries to dereference
it as a pointer.
This code takes the form of special init, fini, ctor and dtor routines that
are specificly used by malloc. It is in a seperate file because additional
debugging aids will want to live here as well.
ever connect a SCSI Cdrom/Tape/Jukebox/Scanner/Printer/kitty-litter-scooper
to your high-end RAID controller. The interface to the arrays is still
via the block interface; this merely provides a way to circumvent the
RAID functionality and access the SCSI buses directly. Note that for
somewhat obvious reasons, hard drives are not exposed to the da driver
through this interface, though you can still talk to them via the pass
driver. Be the first on your block to low-level format unsuspecting
drives that are part of an array!
To enable this, add the 'aacp' device to your kernel config.
MFC after: 3 days
- Add stubs for EISA and SBUS cards.
(VME, FutureBUS, and TurboChannel stubs not provided.)
- Add infrastructure to build driver and bus front-end modules.
time-of-day clocks, ported from NetBSD. The front-ends are expected
to be at least partly machine-dependent; the sparc64 EBus and SBus
ones will be commited to MD directories for now (in a subsequent commit).
a set of helper routines to deal with real-time clocks. The generic
functions access the clock diver using a kobj interface. This is intended
to reduce code reduplication and make it easy to support more than one
clock model on a single architecture.
This code is currently only used on sparc64, but it is planned to convert
the code of the other architectures to it later.
I have not been able to find very much information about the PC98
extended partition layout so this is gleaned from the source in
our pc98 architecture. Corrections and patched very welcome.
Sponsored by: DARPA and NAI Labs.
The detection code in this method is written so that it should work on
all architectures which means that you can plug a Sun disk into a i386
now and access the partitions.
We still need an endian-agnostic ufs/ffs before this is really
interresting, but the main focus was to get sparc64 onto the GEOM
trail.
This makes other power-management system (APM for now) to be able to
generate power profile change events (ie. AC-line status changes), and
other kernel components, not only the ACPI components, can be notified
the events.
- move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
- call power_profile_set_state() also from APM driver when AC-line
status changes
- add call-back function for Crusoe LongRun controlling on power
profile changes for a example