appropriate function, rather than doing a horse-and-buggy
acquire. They now take the mutex type as an arg and can be
used with sleep as well as spin mutexes.
All calls to mtx_init() for mutexes that recurse must now include
the MTX_RECURSE bit in the flag argument variable. This change is in
preparation for an upcoming (further) mutex API cleanup.
The witness code will call panic() if a lock is found to recurse but
the MTX_RECURSE bit was not set during the lock's initialization.
The old MTX_RECURSE "state" bit (in mtx_lock) has been renamed to
MTX_RECURSED, which is more appropriate given its meaning.
The following locks have been made "recursive," thus far:
eventhandler, Giant, callout, sched_lock, possibly some others declared
in the architecture-specific code, all of the network card driver locks
in pci/, as well as some other locks in dev/ stuff that I've found to
be recursive.
Reviewed by: jhb
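
As a rough userland illustration of the MTX_RECURSE rule described above
(not the kernel implementation), the sketch below models a lock that
remembers whether recursion was permitted at initialization and refuses a
recursive acquire otherwise. The toy_mtx type, the TOY_* constants, and the
integer owner ids are all hypothetical. The real witness code panics rather
than returning an error, and the real flag values differ.

```c
#include <stdbool.h>

/* Hypothetical values; not the kernel's MTX_RECURSE / MTX_RECURSED. */
#define TOY_MTX_RECURSE   0x1   /* init-time option: recursion allowed */
#define TOY_MTX_RECURSED  0x2   /* state bit: lock is currently recursed */

struct toy_mtx {
    int      flags;   /* options given at init time */
    int      state;   /* run-time state bits */
    int      owner;   /* 0 when unowned, otherwise the owner's id */
    unsigned depth;   /* number of acquires beyond the first */
};

static void toy_mtx_init(struct toy_mtx *m, int flags)
{
    m->flags = flags;
    m->state = 0;
    m->owner = 0;
    m->depth = 0;
}

/*
 * Returns true on success. A false return models either contention
 * (this toy never blocks) or the case where witness would panic:
 * a recursive acquire on a lock initialized without TOY_MTX_RECURSE.
 */
static bool toy_mtx_acquire(struct toy_mtx *m, int self)
{
    if (m->owner == 0) {
        m->owner = self;
        return true;
    }
    if (m->owner != self)
        return false;
    if (!(m->flags & TOY_MTX_RECURSE))
        return false;
    m->state |= TOY_MTX_RECURSED;
    m->depth++;
    return true;
}

static void toy_mtx_release(struct toy_mtx *m)
{
    if (m->depth > 0) {
        if (--m->depth == 0)
            m->state &= ~TOY_MTX_RECURSED;
    } else {
        m->owner = 0;
    }
}
```

The split between an init-time option and a run-time state bit mirrors the
MTX_RECURSE / MTX_RECURSED distinction in the text.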
non-386 atomic_load_acq(). %eax is an input since its value is used in
the cmpxchg instruction, but we don't care what value it is, so setting
it to a specific value is just wasteful. Thus, it is being used without
being initialized as the warning stated, but it is ok for it to be used
because its value isn't important. Thus, we are only sort of lying when
we say it is an output only operand.
- Add "cc" to the clobber list for atomic_load_acq() since the cmpxchgl
changes ZF.
slow enough as it is, without having to constantly check that it really
is an i386 still. It was possible to compile out the conditionals for
faster cpus by leaving out 'I386_CPU', but it was not possible to
unconditionally compile for the i386. You got the runtime checking whether
you wanted it or not. This makes I386_CPU mutually exclusive with the
other cpu types, and tidies things up a little in the process.
Reviewed by: alfred, markm, phk, benno, jlemon, jhb, jake, grog, msmith,
jasone, dcs, des (and a bunch more people who encouraged it)
compiling errors where gcc would run out of registers.
- Add "cc" to the list of clobbers for micro-ops where we perform
instructions that alter %eflags.
- Use xchgl instead of cmpxchgl to release a spin lock. This could allow
for more efficient register allocation as we no longer mandate that %eax
be used.
- Reenable the optimized mutex micro-ops in the non-i386 case.
that modules can call.
- Remove the old gcc <= 2.8 versions of the atomic ops.
- Resort the order of some things in the file so that there is only
one #ifdef for KLD_MODULE, and so that all WANT_FUNCTIONS stuff is
moved to the bottom of the file.
- Remove ATOMIC_ACQ_REL() and just use explicit macros instead.
time I tinkered around here. Since INTREN is called from the interrupt
critical path now, it should not be too expensive. In this case, we
look at the bits being changed to decide which 8 bit IO port to write to
rather than unconditionally writing to both. I could probably have gone
further and only done the write if the bits actually changed, but that
seemed overkill for the usual case in interrupt threads.
[an outb is rather expensive when it has to cross the ISA bus]
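
The idea of writing only the 8-bit port whose bits are changing can be
sketched in userland as follows. This is an illustration under stated
assumptions, not the committed code: fake_outb() stands in for outb() so
the sketch runs anywhere, the port numbers are the conventional ISA 8259A
mask-register ports, and toy_intren() and the other names are hypothetical.

```c
#include <stdint.h>

/* Conventional 8259A mask-register ports (assumed, not from the text). */
#define PIC1_MASK_PORT 0x21
#define PIC2_MASK_PORT 0xA1

/* Stand-in for outb(): records writes instead of touching hardware. */
static unsigned port_writes;
static uint8_t  last_value[2];   /* [0] = master PIC, [1] = slave PIC */

static void fake_outb(uint16_t port, uint8_t val)
{
    last_value[port == PIC2_MASK_PORT] = val;
    port_writes++;
}

/* Interrupt mask; a set bit means the source is masked (disabled). */
static uint16_t imen = 0xFFFF;

/*
 * Enable (unmask) the sources in `bits`, writing only the 8-bit
 * port(s) that cover those bits rather than both unconditionally.
 */
static void toy_intren(uint16_t bits)
{
    imen &= (uint16_t)~bits;
    if (bits & 0x00FF)
        fake_outb(PIC1_MASK_PORT, (uint8_t)(imen & 0xFF));
    if (bits & 0xFF00)
        fake_outb(PIC2_MASK_PORT, (uint8_t)(imen >> 8));
}
```

Enabling a single source therefore costs one port write instead of two.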
exactly the same functionality via a sysctl, making this feature
a run-time option.
The default is 1(ON), which means that the /dev/random device will
NOT block at startup.
Setting kern.random.sys.seeded to 0(OFF) will cause /dev/random
to block until the next reseed, at which stage the sysctl
will be changed back to 1(ON).
While I'm here, clean up the sysctls, and make them dynamic.
Reviewed by: des
Tested on Alpha by: obrien
implement memory fences for the 486+. The 386 still uses versions w/o
memory fences as all operations on the 386 are not program ordered.
The 386 versions are not MP safe.
declarations of a variable of the same name. The one in the outer block
was unused and probably just slipped in at one point or another. This
silences a compiler warning.
__FreeBSD_version 500015 can be used to detect their disappearance.
- Move the symbols for SMP_prvspace and lapic from globals.s to
locore.s.
- Remove globals.s with extreme prejudice.
symbols in globals.s.
PCPU_GET(name) returns the value of the per-cpu variable
PCPU_PTR(name) returns a pointer to the per-cpu variable
PCPU_SET(name, val) sets the value of the per-cpu variable
In general these are not yet used, compatibility macros remain.
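
For readers unfamiliar with this accessor style, here is a minimal userland
sketch of how PCPU_GET/PCPU_PTR/PCPU_SET style macros can be built with
token pasting. It is not the kernel's definition: the toy_globaldata
structure, the single toy_cpu0 instance, and the TOY_ prefixes are
hypothetical, and the kernel locates the current CPU's data by other means.

```c
/* Hypothetical, trimmed-down per-CPU structure for illustration only. */
struct toy_globaldata {
    int   gd_cpuid;
    void *gd_curproc;
};

/* One instance stands in for "the current CPU's" data in this sketch. */
static struct toy_globaldata toy_cpu0;

/* Callers name the member without its gd_ prefix. */
#define TOY_PCPU_GET(name)       (toy_cpu0.gd_##name)
#define TOY_PCPU_PTR(name)       (&toy_cpu0.gd_##name)
#define TOY_PCPU_SET(name, val)  (toy_cpu0.gd_##name = (val))
```

A caller would write TOY_PCPU_SET(cpuid, 3) and later read it back with
TOY_PCPU_GET(cpuid), never touching the structure layout directly.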
Unifdef SMP struct globaldata, this makes variables such as cpuid
available for UP as well.
Rebuilding modules is probably a good idea, but I believe old
modules will still work, as most of the old infrastructure
remains.
as multi-processor kernels. The old way made it difficult for kernel
modules to be portable between uni-processor and multi-processor
kernels. It is no longer necessary to jump through hoops.
- always load %fs with the private segment on entry to the kernel
- change the type of the self referential pointer from struct privatespace
to struct globaldata
- make the globaldata symbol have value 0 in all cases, so the symbols
in globals.s are always offsets, not aliases for fields in globaldata
- define the globaldata space used for uniprocessor kernels in C, rather
than assembler
- change the assembly language accessors to use %fs, add a macro
PCPU_ADDR(member, reg), which loads the register reg with the address
of the per-cpu variable member
vm86_trap() to return to the calling program directly. vm86_trap()
doesn't return, thus it was never returning to trap() to release
Giant. Thus, release Giant before calling vm86_trap().
struct swblock entries by dividing the number of the entries by 2
until the swap metadata fits.
- Reject swapon(2) upon failure of swap_zone allocation.
This is just a temporary fix. Better solutions include:
(suggested by: dillon)
o reserving swap in SWAP_META_PAGES chunks, and
o swapping the swblock structures themselves.
Reviewed by: alfred, dillon
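
The halving strategy described above can be sketched as below. This is a
userland illustration only: the toy_zone_try_fn hook and every toy_ name
are hypothetical stand-ins for the real zone allocation, and the example
hooks simply pretend that a fixed amount of room is available.

```c
#include <stddef.h>

/* Hypothetical hook: nonzero if a zone for `nentries` entries fits. */
typedef int (*toy_zone_try_fn)(size_t nentries);

/*
 * Halve the requested entry count until the allocation fits.
 * Returns the count that succeeded, or 0 if nothing fit, in which
 * case the caller would reject swapon(2).
 */
static size_t toy_size_swap_zone(size_t wanted, toy_zone_try_fn try_alloc)
{
    size_t n = wanted;

    while (n > 0 && !try_alloc(n))
        n /= 2;
    return n;
}

/* Example hooks: room for at most 1000 entries, or for none at all. */
static int toy_fits_1000(size_t nentries)
{
    return nentries <= 1000;
}

static int toy_never_fits(size_t nentries)
{
    (void)nentries;
    return 0;
}
```

With toy_fits_1000, a request for 4096 entries is tried at 4096, 2048, and
1024 before succeeding at 512.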
variables from i386 assembly language. The syntax is PCPU(member)
where member is the capitalized name of the per-cpu variable, without
the gd_ prefix. Example: movl %eax,PCPU(CURPROC). The capitalization
is due to using the offsets generated by genassym rather than the symbols
provided by linking with globals.o. asmacros.h is the wrong place for
this but it seemed as good a place as any for now. The old implementation
in asnames.h has not been removed because it is still used to de-mangle
the symbols used by the C variables for the UP case.
of explicit calls to lockmgr. Also provides macros for the flags
passed to specify shared, exclusive or release which map to the
lockmgr flags. This is so that the use of lockmgr can be easily
replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
the witness code is compiled in. Without this, the witness code doesn't
notice that sched_lock is released by fork_trampoline() and thus gets all
confused about spin lock order later on.
held and panic if so (conditional on witness).
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the
panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from
a system call, which were missing for alpha and ia64.
- Move PCI core code to dev/pci.
- Split bridge code out into separate modules.
- Remove the descriptive strings from the bridge drivers. If you
want to know what a device is, use pciconf. Add support for
broadly identifying devices based on class/subclass, and for
parsing a preloaded device identification database so that if
you want to waste the memory, you can identify *anything* we know
about.
- Remove machine-dependent code from the core PCI code. APIC interrupt
mapping is performed by shadowing the intline register in
machine-dependent code.
- Bring interrupt routing support to the Alpha
(although many platforms don't yet support routing or mapping
interrupts entirely correctly). This resulted in spamming
<sys/bus.h> into more places than it really should have gone.
- Put sys/dev on the kernel/modules include path. This avoids
having to change *all* the pci*.h includes.
calling the C functions mtx_enter_hard() and mtx_exit_hard() clobbers them.
Note that %eax is also not call safe, but it is already clobbered due to
cmpxchg. However, now we are back to not compiling again, so these macros
are still left disabled for now.
that of MTX_EXIT. Don't assume that the reg parameter to MTX_ENTER
holds curproc, load it explicitly. Put semi-colons at the end of
the macros to be more consistent and so it's harder to forget them
when these change.
SMP problem. Compaq, in their infinite wisdom, forgot to put the IO apic
intpin #0 connection to the 8259 PIC into the mptable. This hack is to
look and see if intpin #0 has *no* table entry and add a fake ExtInt
entry for the remap routines to use. isa/clock.c will still test the
interrupts. This entry is only ever used on an already broken system.
mpapic.c. This gives us the benefit of C type checking. These functions
are not called in any critical paths and are not used by the interrupt
routines.
spending, which was unused now that all software interrupts have
their own thread. Make the legacy schednetisr use an atomic op
for setting bits in the netisr mask.
Reviewed by: jhb
Also, while here, run up to 32 interrupt sources on APIC systems.
Normalize INTREN/INTRDIS so they are the same on both UP and SMP systems
rather than sometimes a macro, and sometimes a function.
Reviewed by: jhb, jakeb
MPLOCKED macro
(2) Use decimal 12 rather than hex 0xc in an addl
(3) Implement MTX_ENTER for the I386_CPU case
(4) Use semi-colons between instructions to allow MTX_ENTER
and MTX_ENTER_WITH_RECURSION to be assembled
(5) Use incl instead of incw to increment the recursion count
(6) 10 is not a valid label, use 7, 8 and 9 rather than 8, 9 and 10
(7) Sort numeric labels
Submitted by: bde (2, 4, and 5)
pushl that of the new process, rather than doing a movl (%esp) and
assuming that the stack has been set up right. This makes the initial
stack setup slightly more sane, and will make it easier to stick
an interrupted process onto the run queue without its knowing.
process is on the alternate stack or not. For compatibility
with sigstack(2) state is being updated if such is needed.
We now determine whether the process is on the alternate
stack by looking at its stack pointer. This allows a process
to siglongjmp from a signal handler on the alternate stack
to the place of the sigsetjmp on the normal stack. When
maintaining state, this would have invalidated the state
information and caused a subsequent signal to be delivered
on the normal stack instead of the alternate stack.
PR: 22286
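
A stack-pointer test of the kind described above can be sketched as a
single range check. This is an illustration, not the committed code: the
toy_altstack structure and toy_on_altstack() are hypothetical names, and
addresses are modelled as plain integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the alternate-stack description. */
struct toy_altstack {
    uintptr_t ss_sp;     /* lowest address of the alternate stack */
    size_t    ss_size;   /* its size in bytes */
};

/*
 * True if stack pointer `sp` lies within the alternate stack.
 * The unsigned subtraction wraps to a very large value when
 * sp < ss_sp, so one comparison covers both bounds.
 */
static bool toy_on_altstack(const struct toy_altstack *as, uintptr_t sp)
{
    return (sp - as->ss_sp) < as->ss_size;
}
```

Because the answer is derived from the stack pointer each time, no stored
"on stack" flag can go stale after a siglongjmp back to the normal stack.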
-current and RELENG_4 with GENERIC.
NKPT is the number of initial bootstrap page table pages we create for
the kernel during startup. Once VM is up, we resize it as needed, but
with 4G ram, the size of the vm_page_t structures was pushing it over
the limit. The fact that trimmed down kernels boot on 4G ram machines
suggests that we were pretty close to the edge.
The "30" is arbitrary, but smaller than the 'nkpt' variable on all
machines that I checked.
timeout. If DIAGNOSTIC is turned on, then display a message to the console
with a map of which CPUs failed to stop or restart. This gives an SMP box
at least a fighting chance of getting into DDB if one of the other CPUs has
interrupts disabled.
counter register in-CPU.
This is to be used as a fast "timer", where linearity is more important
than time, and multiple lines in the linearity caused by multiple CPUs
in an SMP machine is not a problem.
This adds no code whatsoever to the FreeBSD kernel until it is actually
used, and then as a single-instruction inline routine (except for the
80386 and 80486 where it is some more inline code around nanotime(9)).
Reviewed by: bde, kris, jhb
- Use the mutex in hardclock to ensure no races between it and
softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
CALLOUT_MPSAFE, which specifies that a callout handler does not
need giant. There is still no way to set this flag when
registering a callout.
Reviewed by: -smp@, jlemon
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.
Tested by: John Hay <jhay@icomtek.csir.co.za>
jhb
instead of DIAGNOSTIC.
- Remove the p_wchan check as it no longer applies since a process may be
switched out during CURSIG() within msleep() or mawait().
- Remove an extra sanity check only needed during the early SMPng work.
acquire Giant as needed in functions that call mi_switch(). The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.
Submitted by: witness
sched_lock. This is needed for kernel threads that are created before
interrupts are enabled, such as kthreads created by kld's at
SI_SUB_KLD (for example, the random kthread).
Tested by: phk
syscall compare against a variable sv_minsigstksz in struct
sysentvec so as to properly take the size of the machine- and
ABI dependent struct sigframe into account.
The SVR4 and iBCS2 modules continue to have a minsigstksz of
8192 to preserve behavior. The real values (if different) are
not known at this time. Other ABI modules use the real
values.
The native MINSIGSTKSZ is now defined as follows:
Arch    MINSIGSTKSZ
----    -----------
alpha   4096
i386    2048
ia64    12288
Reviewed by: mjacob
Suggested by: bde
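
The per-ABI size check described above might look roughly like the sketch
below. It is a userland illustration under assumptions: toy_sysentvec
models only the one field named in the text, the toy_ names are
hypothetical, and ENOMEM is used because that is the error sigaltstack(2)
documents for a stack that is too small.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down ABI descriptor. */
struct toy_sysentvec {
    size_t sv_minsigstksz;   /* minimum signal stack size for this ABI */
};

/* Values taken from the text: native i386, and the SVR4 module. */
static const struct toy_sysentvec toy_i386_native = { 2048 };
static const struct toy_sysentvec toy_svr4        = { 8192 };

/* Returns 0 if the requested stack is large enough, else ENOMEM. */
static int toy_check_sigstack(const struct toy_sysentvec *sv, size_t ss_size)
{
    return (ss_size < sv->sv_minsigstksz) ? ENOMEM : 0;
}
```

The same 4096-byte request is thus accepted for a native i386 process but
rejected for one running under the SVR4 module.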
systems.
From the PR:
When 'probe.slot' is PCI_SLOTMAX (== 31) and 'probe.func' is 7,
call to 'pci_cfgread()' here and machine suddenly hangs up.
I don't know why... (or 450GX chipset's bug?)
PR: i386/20379
Submitted by: Masayuki FUKUI <fukui@sonic.nm.fujitsu.co.jp>
comments on the same line like so:
device foo # FooInc Brand NetEther cards
Also, move the wireless NIC cards to their own section.
Add commented out wl driver in wireless section.
Remove obsolete or redundant comments about some of the wireless cards
that used to apply but don't since we've removed 'at foobus'.
There should be no functional changes in this change.
happen when the vm system maps past the end of an object or tries
to map a zero length object, the pmap layer misses the fact that
offsets wrap into negative numbers and we get stuck.
Found by: Joost Pol aka Nohican <nohican@marcella.niets.org>
Submitted by: tegge
- Look for a hardwired interrupt in the routing table for this
bus/device/pin (we already did this).
- Look for another device with the same link byte which has a hardwired
interrupt.
- Look for a PCI device matching an entry with the same link byte
which has already been assigned an interrupt, and use that.
- Look for a routable interrupt listed in the "PCI only" interrupts
field and use that.
- Pick the first interrupt that's marked as routable and use that.
This removes support for booting current kernels with very old bootblocks.
Device driver writers: Please remove initializations for the d_bmaj
field in your cdevsw{}.
because it only takes a struct tag which makes it impossible to
use unions, typedefs etc.
Define __offsetof() in <machine/ansi.h>
Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>
Remove myriad of local offsetof() definitions.
Remove includes of <stddef.h> in kernel code.
NB: Kernel code should *never* include from /usr/include!
Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.
Deprecate <struct.h> with a warning. The warning turns into an error on
01-12-2000 and the file gets removed entirely on 01-01-2001.
Partial reviews by: various.
Significant brucifications by: bde
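
To show why a macro taking a full type name is more flexible than one
taking only a struct tag, here is a small sketch. toy_offsetof is a
hypothetical name chosen so it cannot clash with a system-provided
__offsetof. The null-pointer form shown is the classic idiom, which
compilers tolerate but the C standard does not strictly guarantee; current
compilers usually implement offsetof with a builtin instead.

```c
#include <stddef.h>

/* Classic pointer-arithmetic form under a hypothetical name. */
#define toy_offsetof(type, field) ((size_t)(&((type *)0)->field))

/* A typedef'd structure: there is no struct tag to pass at all. */
typedef struct {
    char tag;
    long value;
} toy_rec_t;

/* A union with a nested member, also reachable through the macro. */
union toy_u {
    int i;
    struct {
        short a;
        short b;
    } pair;
};
```

Both toy_offsetof(toy_rec_t, value) and toy_offsetof(union toy_u, pair.b)
are valid uses, which a tag-only macro could not express.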
type of software interrupt. Roughly, what used to be a bit in spending
now maps to a swi thread. Each thread can have multiple handlers, just
like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
software interrupt thread to run, so spending, NSWI, and the shandlers
array are no longer needed. We can now have an arbitrary number of
software interrupt threads. When you register a software interrupt
thread via sinthand_add(), you get back a struct intrhand that you pass
to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
more intuitive. Also, prefix all the members of struct intrhand with
'ih_'.
- Make swi_net() a MI function since there is now no point in it being
MD.
Submitted by: cp
(a NetBSD port for NEC PC-98x1 machines). They are ncv for NCR 53C500,
nsp for Workbit Ninja SCSI-3, and stg for TMC 18C30 and 18C50.
I thank NetBSD/pc98 and bsd-nomads people.
Obtained from: NetBSD/pc98
reduces the maintenance load for the mutex code. The only MD portions
of the mutex code are in machine/mutex.h now, which include the assembly
macros for handling mutexes as well as optionally overriding the mutex
micro-operations. For example, we use optimized micro-ops on the x86
platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option. In the new code,
mtx_assert() only depends on INVARIANTS, allowing other kernel developers
to have working mutex assertions without having to include all of the
mutex debugging code. The SMP_DEBUG kernel option has been renamed to
MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack. Instead, we dynamically allocate
separate mtx_debug structures on the fly in mtx_init, except for mutexes
that are initiated very early in the boot process. These mutexes
are declared using a special MUTEX_DECLARE() macro, and use a new
flag MTX_COLD when calling mtx_init. This is still somewhat hackish,
but it is less evil than the mtx_f filler struct, and the mtx struct is
now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic
operations on the mutex mtx_lock field to make it easier for other archs
to override/optimize mutex ops if needed. These new tiny ops also clean
up the code in some places by replacing long atomic operation function
calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying
to obtain a sleep mutex. Calling mi_switch() would bogusly release
Giant before switching to the next process. Instead, inline most of the
code from mi_switch() in the mtx_enter_hard() function. Note that when
we finally kill Giant we can back this out and go back to calling
mi_switch().
in most of the atomic operations. Now for these operations, you can
use the normal atomic operation, you can use the operation with a read
barrier, or you can use the operation with a write barrier. The function
names follow the same semantics used in the ia64 instruction set. An
atomic operation with a read barrier has the extra suffix 'acq', due to
it having "acquire" semantics. An atomic operation with a write barrier
has the extra suffix 'rel'. These suffixes are inserted between the
name of the operation to perform and the typename. For example, the
atomic_add_int() function now has 3 variants:
- atomic_add_int() - this is the same as the previous function
- atomic_add_acq_int() - this function combines the add operation with a
read memory barrier
- atomic_add_rel_int() - this function combines the add operation with a
write memory barrier
- Add 'ptr' to the list of types that we can perform atomic operations
on. This allows one to do atomic operations on uintptr_t's. This is
useful in the mutex code, for example, because the actual mutex lock is
a pointer.
- Add two new operations for doing loads and stores with memory barriers.
The new load operations use a read barrier before the load, and the
new store operations use a write barrier after the store. For example,
atomic_load_acq_int() will atomically load an integer as well as
enforcing a read barrier.
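
The naming scheme above (operation, then 'acq' or 'rel', then type) can be
illustrated with C11 atomics. This is only an analogy under stated
assumptions: the sketch_ wrappers are hypothetical, they are not the
kernel's implementation, and C11 acquire/release orderings are used as the
nearest portable stand-in for the barriers described rather than as an
exact equivalent.

```c
#include <stdatomic.h>

/* Plain variant: atomic, with no ordering beyond atomicity. */
static inline void sketch_atomic_add_int(atomic_uint *p, unsigned v)
{
    atomic_fetch_add_explicit(p, v, memory_order_relaxed);
}

/* 'acq' variant: the add carries acquire semantics. */
static inline void sketch_atomic_add_acq_int(atomic_uint *p, unsigned v)
{
    atomic_fetch_add_explicit(p, v, memory_order_acquire);
}

/* 'rel' variant: the add carries release semantics. */
static inline void sketch_atomic_add_rel_int(atomic_uint *p, unsigned v)
{
    atomic_fetch_add_explicit(p, v, memory_order_release);
}

/* Load with acquire semantics. */
static inline unsigned sketch_atomic_load_acq_int(atomic_uint *p)
{
    return atomic_load_explicit(p, memory_order_acquire);
}

/* Store with release semantics. */
static inline void sketch_atomic_store_rel_int(atomic_uint *p, unsigned v)
{
    atomic_store_explicit(p, v, memory_order_release);
}
```

A typical pairing is a release store by the thread publishing data and an
acquire load by the thread consuming it.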
write caching is disabled on both SCSI and IDE disks where large
memory dumps could take up to an hour to complete.
Taking an i386 scsi based system with 512MB of ram and timing (in
seconds) how long it took to complete a dump, the following results
were obtained:
Before:                 After:
WCE  TIME               WCE  TIME
------------------      ------------------
1    141.820972         1    15.600111
0    797.265072         0    65.480465
Obtained from: Yahoo!
Reviewed by: peter