reducues the maintenance load for the mutex code. The only MD portions
of the mutex code are in machine/mutex.h now, which include the assembly
macros for handling mutexes as well as optionally overriding the mutex
micro-operations. For example, we use optimized micro-ops on the x86
platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option. In the new code,
mtx_assert() only depends on INVARIANTS, allowing other kernel developers
to have working mutex assertiions without having to include all of the
mutex debugging code. The SMP_DEBUG kernel option has been renamed to
MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack. Instead, we dynamically allocate
seperate mtx_debug structures on the fly in mtx_init, except for mutexes
that are initiated very early in the boot process. These mutexes
are declared using a special MUTEX_DECLARE() macro, and use a new
flag MTX_COLD when calling mtx_init. This is still somewhat hackish,
but it is less evil than the mtx_f filler struct, and the mtx struct is
now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic
operations on the mutex mtx_lock field to make it easier for other archs
to override/optimize mutex ops if needed. These new tiny ops also clean
up the code in some places by replacing long atomic operation function
calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying
to obtain a sleep mutex. Calling mi_switch() would bogusly release
Giant before switching to the next process. Instead, inline most of the
code from mi_switch() in the mtx_enter_hard() function. Note that when
we finally kill Giant we can back this out and go back to calling
mi_switch().
in most of the atomic operations. Now for these operations, you can
use the normal atomic operation, you can use the operation with a read
barrier, or you can use the operation with a write barrier. The function
names follow the same semantics used in the ia64 instruction set. An
atomic operation with a read barrier has the extra suffix 'acq', due to
it having "acquire" semantics. An atomic operation with a write barrier
has the extra suffix 'rel'. These suffixes are inserted between the
name of the operation to perform and the typename. For example, the
atomic_add_int() function now has 3 variants:
- atomic_add_int() - this is the same as the previous function
- atomic_add_acq_int() - this function combines the add operation with a
read memory barrier
- atomic_add_rel_int() - this function combines the add operation with a
write memory barrier
- Add 'ptr' to the list of types that we can perform atomic operations
on. This allows one to do atomic operations on uintptr_t's. This is
useful in the mutex code, for example, because the actual mutex lock is
a pointer.
- Add two new operations for doing loads and stores with memory barriers.
The new load operations use a read barrier before the load, and the
new store operations use a write barrier after the load. For example,
atomic_load_acq_int() will atomically load an integer as well as
enforcing a read barrier.
write caching is disabled on both SCSI and IDE disks where large
memory dumps could take up to an hour to complete.
Taking an i386 scsi based system with 512MB of ram and timing (in
seconds) how long it took to complete a dump, the following results
were obtained:
Before: After:
WCE TIME WCE TIME
------------------ ------------------
1 141.820972 1 15.600111
0 797.265072 0 65.480465
Obtained from: Yahoo!
Reviewed by: peter
and associated user-level signal trampoline glue.
Without this patch, an SA_SIGINFO style handler can be installed by a linux
app, but if the handler accesses its sip argument, it will get a garbage
pointer and likely segfault.
We currently supply a valid pointer, but its contents are mainly
garbage. Filling this in properly is future work.
This is the second of 3 commits that will get IBM's JDK 1.3 working with
FreeBSD ...
loads, prints the copyright, and either hangs or locks solid. The
PC tends to be in the data segment and the RA is in XentMM
Doug really came up with the fix, I'm just the monkey typing. Doug says:
The alpha can only support 64k of globals with $gp pointing at
base+32k so that the code can use 16bit signed offsets from $gp to
access it. .... it is possible to have multiple .got subsections
and the linker handles this with the relocations for 'ldgp' pseudo
instructions. [Without this patch] the code in exception.s has been
linked to use a different gp from locore.s (where pal_kgp is set).
Reviewed by: dfr
Before calling kernacc(), make sure that we're not calling it
with a K0SEG address.
This gets alphas booting with SMP_DEBUG & INVARIANTS options
approved by: jhb
Replace all in-tree uses with <sys/mouse.h> which repo-copied a few
moments ago from src/sys/i386/include/mouse.h by peter.
This is also the appropriate fix for exo-tree sources.
Put warnings in <machine/mouse.h> to discourage use.
November 15th 2000 the warnings will be converted to errors.
January 15th 2001 the <machine/mouse.h> files will be removed.
Replace all in-tree uses with necessary subset of <sys/{fb,kb,cons}io.h>.
This is also the appropriate fix for exo-tree sources.
Put warnings in <machine/console.h> to discourage use.
November 15th 2000 the warnings will be converted to errors.
January 15th 2001 the <machine/console.h> files will be removed.
check in the [basic.link] section of the C++ standard wrong. gcc-2.7.2.3
apparently doesn't do the check, so the bug doesn't affect RELENG_3.
PR: 16170, 21427
Submitted by: Max Khon <fjoe@lark.websci.ru> (i386 version)
Discussed with: jdp
return through doreti to handle ast's. This is necessary for the
clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded.
This is needed because both hardclock() and statclock() need to run in
the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block
during the clock interrupt. Instead, use two p_flag's in the proc struct
to mark the current process as having a pending SIGVTALRM or a SIGPROF
and let them be delivered during ast() when hardclock() has finished
running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was
broken on the x86 if it was turned on since cpl is gone. It's only use
was to bogusly run softclock() directly during hardclock() rather than
scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin
mutex. Since the spin mutex already handles disabling/restoring
interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending
signals in hardclock() that are to be delivered in ast().
Submitted by: jakeb (making statclock safe in a fast interrupt)
Submitted by: cp (concept of delaying signals until ast())
- Make softinterrupts (SWI's) almost completely MI, and divorce them
completely from the x86 hardware interrupt code.
- The ihandlers array is now gone. Instead, there is a MI shandlers array
that just contains SWI handlers.
- Most of the former machine/ipl.h files have moved to a new sys/ipl.h.
- Stub out all the spl*() functions on all architectures.
Submitted by: dfr
to accomodate the changes.
Here's a list of things that have changed (I may have left out a few); for a
relatively complete list, see http://people.freebsd.org/~bmilekic/mtx_journal
* Remove old (once useful) mcluster code for MCLBYTES > PAGE_SIZE which
nobody uses anymore. It was great while it lasted, but now we're moving
onto bigger and better things (Approved by: wollman).
* Practically re-wrote the allocation macros in sys/sys/mbuf.h to accomodate
new allocations which grab the necessary lock.
* Make sure that necessary mbstat variables are manipulated with
corresponding atomic() routines.
* Changed the "wait" routines, cleaned it up, made one routine that does
the job.
* Generalized MWAKEUP() macro. Got rid of m_retry and m_retryhdr, as they
are now included in the generalized "wait" routines.
* Sleep routines now use msleep().
* Free lists have locks.
* etc... probably other stuff I'm missing...
Things to look out for and work on later:
* find a better way to (dynamically) adjust EXT_COUNTERS
* move necessity to recurse on a lock from drain routines by providing
lock-free lower-level version of MFREE() (and possibly m_free()?).
* checkout include of mutex.h in sys/sys/mbuf.h - probably violating
general philosophy here.
The code has been reviewed quite a bit, but problems may arise... please,
don't panic! Send me Emails: bmilekic@freebsd.org
Reviewed by: jlemon, cp, alfred, others?
Previously, these cards were supported by the lnc driver (and they
still are, but the pcn driver will claim them first), which is fine
except the lnc driver runs them in 16-bit LANCE compatibility mode.
The pcn driver runs these chips in 32-bit mode and uses the RX alignment
feature to achieve zero-copy receive. (Which puts it in the same
class as the xl, fxp and tl chipsets.) This driver is also MI, so it
will work on the x86 and alpha platforms. (The lnc driver is still
needed to support non-PCI cards. At some point, I'll need to newbusify
it so that it too will me MI.)
The Am79c978 HomePNA adapter is also supported.
newbus for referencing device interrupt handlers.
- Move the 'struct intrec' type which describes interrupt sources into
sys/interrupt.h instead of making it just be a x86 structure.
- Don't create 'ithd' and 'intrec' typedefs, instead, just use 'struct ithd'
and 'struct intrec'
- Move the code to translate new-bus interrupt flags into an interrupt thread
priority out of the x86 nexus code and into a MI ithread_priority()
function in sys/kern/kern_intr.c.
- Remove now-uneeded x86-specific headers from sys/dev/ata/ata-all.c and
sys/pci/pci_compat.c.
re-enable interrupts when actually releasing the lock.
- Bring across some fixes to propagate_priority from the x86 code.
(It still doesn't work properly, however.)
- Use the SMTX state when putting a process that blocks on a mutex to sleep.
- Use mi_switch instead of cpu_switch so that accounting works properly as
well as other things.
- Bring across DDB protection of the spinlock timeout panic which is useful
in a multiple CPU system when 1 CPU enters the debugger holding the
sched_lock so that the other CPU doesn't panic as well resulting in all
sorts of fun things.
- Bring across various other small changes in format strings and comments
to sync up with the x86 code.
fixes a serious problem with the previous version where an input could
have been placed in the same register as an output which would stop
the inline from working properly.
* Redo atomic_{set,clear,add,subtract}_{32,64} as inlines since the code
sequence is shorter than the call sequence to the code in atomic.s.
I will remove the functions from atomic.s after a grace period to allow
people to rebuild kernel modules.
and mtx_exit(). This change tracks the i386 version.
Rename mtx_enter(), mtx_try_enter(), and mtx_exit() and wrap them with cpp
macros that expand to pass filename and line number information. This is
necessary since we're using inline functions instead of macros now.
Add const to the filename pointers passed througout the mtx and witness
code.
include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).
Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
machines. The patch uses an existing global variable in place of the
newbus accessor to get at use_bwx.
This is a quick fix to get miatas booting again; somebody
with more newbus skills than I can muster will have to correct it.
Matt Jacob's description of the problem from the -alpha list:
The IVAR accessor stuff for pcib is incompletely specified for CIA. There's
only one accessor defined, and that's to get the BUS instance number.
<..>
The device methods that try and get at the use_bwx get overriden because
there's only one ivar for CIA's pcib, and that's for hose #, and it's always
zero.
foo_pcib_[read|write]_config() functions rather than relying on
a break or return being in the CFG macro.
This fixes a panic later in the boot process on a UP1000. From
inspection, it looks like this fixes a similar problem in the tsunami code.
Approved by: dfr
the drivers.
* Remove legacy inx/outx support from chipset and replace with macros
which call busspace.
* Rework pci config accesses to route through the pcib device instead of
calling a MD function directly.
With these changes it is possible to cleanly support machines which have
more than one independantly numbered PCI busses. As a bonus, the new
busspace implementation should be measurably faster than the old one.
In summary:
o This file has been moved to sys/compat/linux,
o Any MD syscalls in this file are moved to
linux_machdep.c in sys/i386/linux,
o Include directives, makefiles and config files
have been updated.
that should be better.
The old code counted references to mbuf clusters by using the offset
of the cluster from the start of memory allocated for mbufs and
clusters as an index into an array of chars, which did the reference
counting. If the external storage was not a cluster then reference
counting had to be done by the code using that external storage.
NetBSD's system of linked lists of mbufs was cosidered, but Alfred
felt it would have locking issues when the kernel was made more
SMP friendly.
The system implimented uses a pool of unions to track external
storage. The union contains an int for counting the references and
a pointer for forming a free list. The reference counts are
incremented and decremented atomically and so should be SMP friendly.
This system can track reference counts for any sort of external
storage.
Access to the reference counting stuff is now through macros defined
in mbuf.h, so it should be easier to make changes to the system in
the future.
The possibility of storing the reference count in one of the
referencing mbufs was considered, but was rejected 'cos it would
often leave extra mbufs allocated. Storing the reference count in
the cluster was also considered, but because the external storage
may not be a cluster this isn't an option.
The size of the pool of reference counters is available in the
stats provided by "netstat -m".
PR: 19866
Submitted by: Bosko Milekic <bmilekic@dsuper.net>
Reviewed by: alfred (glanced at by others on -net)
gcc's internal exit() prototypes and the (futile) hackery that we did to
try and avoid warnings. main() was renamed for similar reasons.
Remove an exit related hack from makesyscalls.sh.
ether_ifdetach().
The former consolidates the operations of if_attach(), ng_ether_attach(),
and bpfattach(). The latter consolidates the corresponding detach operations.
Reviewed by: julian, freebsd-net
hose are 16 PCI instances apart. This allows us to recognize secondary
PCI busses (at least to a first level) until the pci infrastructure is
fixed.
Turn on support for secondary cycles, too. Redo debug printouts.
Instead, for now (until we get a pci infrastructure cleanup),
assign the PCI bus number to be mcpcia bus instance << 4. This
is to allow secondary bridges some room to be recongnized on
4100 systems.
SYSCTL_LONG macro to be consistent with other integer sysctl variables
and require an initial value instead of assuming 0. Update several
sysctl variables to use the unsigned types.
PR: 15251
Submitted by: Kelly Yancey <kbyanc@posi.net>
irongate chipset (used in the UP1000) which does not support scatter/gather
DMA. We'll still use scatter gather if the core logic chipset supports it.
Reviewed by: dfr
fields, not lex/yacc grammar so it is not an exact match but should be
close enough for most cases.
Deal with 'port?', 'irq?' style specifications. These are parsed as
seperate values in lex/yacc in config(8) but tripped up this helper tool.
deal with filename arguments. It is amazing how much you forget over time.
Thanks to the people that reminded me this. I knew there was an easy way
that didn't involve messing with $argv, filehandles, etc, but just could
not remember - all of my books are on the opposite side of the planet..
Use Warner Losh's "hint" driver to decode ascii strings to fill the
resource table at boot time.
config(8) no longer generates an ioconf.c table - ie: the configuration
no longer has to be compiled into the kernel. You can reconfigure your
isa devices with the likes of this at loader(8) time:
set hint.ed.0.port=0x320
userconfig will be rewritten to use this style interface one day and will
move to /boot/userconfig.4th or something like that.
It is still possible to statically compile in a set of hints into a kernel
if you do not wish to use loader(8). See the "hints" directive in GENERIC
as an example.
All device wiring has been moved out of config(8). There is a set of
helper scripts (see i386/conf/gethints.pl, and the same for alpha and pc98)
that extract the 'at isa? port foo irq bar' from the old files and produces
a hints file. If you install this file as /boot/device.hints (and update
/boot/defaults/loader.conf - You can do a build/install in sys/boot) then
loader will load it automatically for you. You can also compile in the
hints directly with: hints "device.hints" as well.
There are a few things that I'm not too happy with yet. Under this scheme,
things like LINT would no longer be useful as "documentation" of settings.
I have renamed this file to 'NOTES' and stored the example hints strings
in it. However... this is not something that config(8) understands, so
there is a script that extracts the build-specific data from the
documentation file (NOTES) to produce a LINT that can be config'ed and
built. A stack of man4 pages will need updating. :-/
Also, since there is no longer a difference between 'device' and
'pseudo-device' I collapsed the two together, and the resulting 'device'
takes a 'number of units' for devices that still have it statically
allocated. eg: 'device fe 4' will compile the fe driver with NFE set
to 4. You can then set hints for 4 units (0 - 3). Also note that
'device fe0' will be interpreted as "zero units of 'fe'" which would be
bad, so there is a config warning for this. This is only needed for
old drivers that still have static limits on numbers of units.
All the statically limited drivers that I could find were marked.
Please exercise EXTREME CAUTION when transitioning!
Moral support by: phk, msmith, dfr, asmodai, imp, and others
loader for alpha (Yay!) we still need to explicitly look for boot_verbose-
I assume because the boothowto flags aren't passed to us at boot like x86.
Do some minor cosmetics as well.