control the number of lines per page rather than a constant. The variable
can be examined and changed in ddb as '$lines'. Setting the variable to
0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
newlines and carriage returns so that one can rub out content on the
current line via '\r \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
the routine exits.
- Add some aliases to the simple pager to make it more compatible with
more(1): 'e' and 'j' do a single line. 'd' does half a page, and
'f' does a full page.
MFC after: 1 month
Inspired by: kris
the final set of traces -- someone with more busdma background
will probably want to review and expand this, as well as port to
other platforms. This tracing is sufficient to identify key
busdma events on i386, and in particular to draw attention to
bounce buffering events that may have a substantial performance
impact.
modes on a tty structure.
Both the ".init" and the current settings are initialized allowing
the function to be used both at attach and open time.
The function takes an argument to decide if echoing should be enabled.
Echoing should not be enabled for regular physical serial ports
unless they are consoles, in which case they should be configured
by ttyconsolemode() instead.
Use the new function throughout.
invalidate the TLB(s) if the old mapping wasn't used by the CPU. With
network interfaces that implement checksum off-loading, the old mapping is
almost never used by the CPU, only by the device driver for setting up the
DMA operation.
Reviewed by: tegge@
Restructure pmap_enter() to prevent the loss of a page modified (PG_M) bit
in a race between processors. (This restructuring assumes the newly atomic
pte_load_store() for correct operation.)
Reviewed by: tegge@
PR: i386/61852
RAM. Many older, legacy bridges only allow allocation from this
range. This only appies to devices who don't have their memory
assigned by the BIOS (since we allocate the ranges so assigned
exactly), so should have minimal impact.
Hoewver, for CardBus bridges (cbb), they rarely get the resources
allocated by the BIOS, and this patch helps them greatly. Typically
the 'bad Vcc' messages are caused by this problem.
that is no longer required. (In fact, it is not clear that it was ever
required in HEAD or RELENG_4, only RELENG_3 required a work-around.) Now,
as before revision 1.251, if the preexisting PTE is invalid, pmap_enter()
does not call pmap_invalidate_page() to update the TLB(s).
Note: Even with this change, the handling of a copy-on-write fault is
inefficient, in such cases pmap_enter() calls pmap_invalidate_page() twice.
Discussed with: bde@
PR: kern/16568
need to mask off the page offset bits. (This operation made some sense
prior to i386/i386/pmap.c revision 1.254 when we passed a physical address
rather than a vm_page pointer to pmap_enter().)
uses predate the change in the pmap_enter() interface that replaced the
page's physical address by the address of its vm_page structure. The
PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page
structure that was being passed in.
1. Process p1 is currently being swapped in.
2. Process p2 calls linux_ptrace(PTRACE_GETFPXREGS, p1_pid, ...)
3. After acquiring a reference to FIRST_THREAD_IN_PROC(p1),
p2 blocks in faultin() while p1 finishes being swapped in.
This means p2 won't get back the lock on p1 until after p1's
threads are runnable.
4. After p1 is swapped in, the first thread in p1 exits.
5. p2 now uses its dangling reference to p1's first thread.
pmap_copy(). This entails additional locking in pmap_copy() and the
addition of a "flags" parameter to the page table page allocator for
specifying whether it may sleep when memory is unavailable. (Already,
pmap_copy() checks the availability of memory, aborting if it is scarce.
In theory, another CPU could, however, allocate memory between
pmap_copy()'s check and the call to the page table page allocator,
causing the current thread to release its locks and sleep. This change
makes this scenario impossible.)
Reviewed by: tegge@
multiprocessors. Specifically, the error is conditioning the call to
pmap_invalidate_page() on whether the pmap is active on the current CPU.
This call must be unconditional. Regardless of whether the pmap is active
on the CPU performing _pmap_unwire_pte_hold(), it could be active on another
CPU. For example, a call to pmap_remove_all() by the page daemon could
result in a call to _pmap_unwire_pte_hold() with the pmap inactive on the
current CPU and active on another CPU. In such circumstances, failing to
call pmap_invalidate_page() results in a stale TLB entry on the other CPU
that still maps the now deallocated page table page. What happens next is
typically a mysterious panic in pmap_enter() by the other CPU, either
"pmap_enter: attempted pmap_enter on 4MB page" or "pmap_enter: pte vanished,
va: 0x%lx". Both occur because the former page table page has been recycled
and allocated to a new purpose. Consequently, it no longer contains zeroes.
See also Peter's i386/i386/pmap.c revision 1.448 and the related e-mail
thread last year.
Many thanks to the engineers at Sandvine for providing clear and concise
information until all of the pieces of the puzzle fell into place and
for testing an earlier patch.
MT5 Candidate
a stack trace from ddb, the output will pause with a '--More--' prompt
every 18 lines. If you hit Enter, it will print another line and prompt
again. If you hit space it will output another page and then prompt.
If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
to print. This is useful if your trace happens to walk back onto
0xdeadc0de and gets stuck in an endless loop.
MFC after: 1 month
Tested on: i386, alpha, sparc64
the page table page's wired count rather than its hold count to contain
the reference count. My rationale for this change is based on several
factors:
1. The machine-independent and pmap layers used the same hold count field
in subtly different ways. The machine-independent layer uses the hold
count to implement a form of ephemeral wiring that is used by pipes,
physio, etc. In other words, subsystems where we wish to temporarily
block a page from being swapped out while it is mapped into the kernel's
address space. Such pages are never removed from the page queues.
Instead, the page daemon recognizes a non-zero hold count to mean "hands
off this page." In contrast, page table pages are never in the page
queues; they are wired from birth to death. The hold count was being
used as a kind of reference count, specifically, the number of valid
page table entries within the page. Not surprisingly, these two
different uses imply different synchronization rules: in the machine-
independent layer access to the hold count requires the page queues
lock; whereas in the pmap layer the pmap lock is required. Thus,
continued use by the pmap layer of vm_page_unhold(), which asserts that
the page queues lock is held, made no sense.
2. _pmap_unwire_pte_hold() was too forgiving in its handling of the wired
count. An unexpected wired count on a page table page was ignored and
the underlying page leaked.
3. In a word, microoptimization. Using the wired count exclusively, rather
than a combination of the wired and hold counts, makes the code slightly
smaller and faster.
Reviewed by: tegge@
not sure yet about 5.x... MFC if needed.
Also fixes small problems with examining some registers and
some specific gdb transfer problems.
As the patch says:
This is not a pretty patch and only meant as a temporary
fix until a better solution is committed.
PR: i386/71715
Submitted by: Stephan Uphoff <ups@tree.com>
MFC after: 1 week
and which takes a M_WAITOK/M_NOWAIT flag argument.
Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.
Problem uncovered by: tegge
the loss of a page modified (PG_M) bit in a race between processors.
Quoting Tor:
One scenario where the old code could cause a lost PG_M bit is a
multithreaded linux program (or FreeBSD program using the
linuxthreads port) where one thread was starting a subprocess.
The thread doing fork() would call vmspace_fork(), which would then
call vm_map_copy_entry() which would call pmap_protect() on an area
possibly accessed by other threads.
Additionally, make the clearing of PG_M by pmap_protect() unconditional if
write permission is removed. Previously, PG_M could persist on a read-only
unmanaged page. That seems inconsistent and confusing.
In collaboration with: tegge@
MT5 candidate
PR: 61852
fully initialed when the pmap layer tries to call sched_pini() early in the
boot and results in an quick panic. Use ke_pinned instead as was originally
done with Tor's patch.
Approved by: julian
value was only enough for 8GB of RAM, the new value can do 16GB. This still
isn't optimal since it doesn't scale. Fixing this for amd64 looks to be
fairly easy, but for i386 will be quite difficult.
Reviewed by: peter
scheduler specific extension to it. Put it in the extension as
the implimentation details of how the pinning is done needn't be visible
outside the scheduler.
Submitted by: tegge (of course!) (with changes)
MFC after: 3 days
VT6122 gigabit ethernet chip and integrated 10/100/1000 copper PHY.
The vge driver has been added to GENERIC for i386, pc98 and amd64,
but not to sparc or ia64 since I don't have the ability to test
it there. The vge(4) driver supports VLANs, checksum offload and
jumbo frames.
Also added the lge(4) and nge(4) drivers to GENERIC for i386 and
pc98 since I was in the neighborhood. There's no reason to leave them
out anymore.
across frames. Basically, if the current frame is for the
'dblfault_handler' function, then get the next %eip and %ebp values to use
from the original TSS of the thread that has the saved state when the
double fault triggered.
MFC after: 4 days
and was propagated to nearly every platform. The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum. However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change. The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.
This is a MT5 candidate.
Reviewed by: marcel
Submitted by: sos (in part)
It can be switched back once 5.3 is tested and released. Also turn on
PREEMPTION as many of the stability problems with it have been fixed.
MT5: 3 days.
but with slightly cleaned up interfaces.
The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.
The KSE (or td_sched) structure is now allocated per thread and has no
allocation code of its own.
Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.
Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.
The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.
A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.
Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.
Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by: scottl, peter
MFC after: 1 week
FULL_PREEMPTION is defined. Add a runtime warning to ULE if PREEMPTION is
enabled (code inspired by the PREEMPTION warning in kern_switch.c). This
is a possible MT5 candidate.
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument.
Having both proc and thread as an argumen tjust gives an opportunity for
them to get out sync.
MFC after: 3 days
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.
MFC after: 2 days
remaining consumers to have the count passed as an option. This is
i4b, pc98/wdc, and coda.
Bump configvers.h from 500013 to 600000.
Remove heuristics that tried to parse "device ed5" as 5 units of the ed
device. This broke things like the snd_emu10k1 device, which required
quotes to make it parse right. The no-longer-needed quotes have been
removed from NOTES, GENERIC etc. eg, I've removed the quotes from:
device snd_maestro
device "snd_maestro3"
device snd_mss
I believe everything will still compile and work after this.
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
instruction of a functions implementation. It holds the address
of a function descriptor. Hence the user(), btrap(), eintr() and
bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
be layed-out contiguously. This can not be achieved on ia64 and is
generally just bad programming.
The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.
This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...
Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386
valid pmap to the pmap functions that require one. Remove the checks for
NULL. (These checks have their origins in the Mach pmap.c that was
integrated into BSD. None of the new code written specifically for
FreeBSD included them.)
compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.
If no hooks are connected the entire packet filter hooks section and related
activities are jumped over. This removes any performance impact if no hooks
are active.
Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
The C code assumes that the carry bit is always kept from the previous
operation. However, the pointer indexing requires another add operation.
Thus, the carry bit from the first operation is tromped over by the
"addl" operation that ends up following it, so the "adcl" that follows
that has no effect because the carry bit is cleared before it.
The result is checksum failure on received packets.
The larger issue is that there isn't any other way of preventing the compiler
inserting arbitrary instructions between different __asm statements (and
that the commit message in revision 1.13 of in_cksum.h is wrong on
this point). From
http://developer.apple.com/documentation/DeveloperTools/gcc-3.3/gcc/Extended-Asm.html
---8<---8<---8<---
You can't expect a sequence of volatile asm instructions to remain
perfectly consecutive. If you want consecutive output, use a single
asm. Also, GCC will perform some optimizations across a volatile
asm instruction; GCC does not "forget everything" when it encounters
a volatile asm instruction the way some other compilers do.
---8<---8<---8<---
Also, this change also makes the ASM code much easier to read.
PR: 69257
Submitted by: Mike Bristow <mike@urgle.com>, Qing Li <qing.li@bluecoat.com>
directly. This removes a few more users of the stackgap and also marks
the syscalls using these wrappers MP safe where appropriate.
Tested on: i386 with linux acroread5
Compiled on: i386, alpha LINT
We were obtaining different spin mutexes (which disable interrupts after
aquisition) and spin waiting for delivery. For example, KSE processes
do LDT operations which use smp_rendezvous, while other parts of the
system are doing things like tlb shootdowns with a different mutex.
This patch uses the common smp_rendezvous mutex for all MD home-grown
IPIs that spinwait for delivery. Having the single mutex means that
the spinloop to aquire it will enable interrupts periodically, thus
avoiding the cross-ipi deadlock.
Obtained from: dwhite, alc
Reviewed by: jhb
with VmWare 4.x. At least with VmWare version 4.5.2, i386 version of
atomic_cmpset_int() is about 30 times slower than non-i386 version. It
makes this delta a good 5.3 MFC candidate, since otherwise it will
mislead users who run FreeBSD under modern VmWare otherwise.
Since pmap_enter() calls pmap_invalidate_page(), which needs interrupts
enabled in the SMP case, we defer the disable to right before saving the
register context. This has been incorrect for about a year but caused no
real problems because the identity page never actually replaces a previously
mapped page and suspend/resume on SMP systems has been uncommon.
Tested by: sos
MFC after: 3 days
parent rather than track resources locally. The original code
was incomplete in that it would only honor requests for resources
that already exist in its resource list. This prevented many ISA
identify routines from allocating temporary resources. Passing
the requests up to legacy's parent losing no functionality and
allows these requests to succeed.
Reviewed by: imp, jhb
Approved by: RE
logical CPUs on a system to be used as a dedicated watchdog to cause a
drop to the debugger and/or generate an NMI to the boot processor if
the kernel ceases to respond. A sysctl enables the watchdog running
out of the processor's idle thread; a callout is launched to reset a
timer in the watchdog. If the callout fails to reset the timer for ten
seconds, the watchdog will fire. The sysctl allows you to select which
CPU will run the watchdog.
A sample "debug.leak_schedlock" is included, which causes a sysctl to
spin holding sched_lock in order to trigger the watchdog. On my Xeons,
the watchdog is able to detect this failure mode and break into the
debugger, which cannot otherwise be done without an NMI button.
This option does not currently work with sched_ule due to ule's push
notion of scheduling, similar to machdep.hlt_logical_cpus failing to
work with that scheduler.
On face value, this might seem somewhat inefficient, but there are a
lot of dual-processor Xeons with HTT around, so using one as a watchdog
for testing is not as inefficient as one might fear.
* Serialize access to the sysctl routines and the notify handler
* Assert that the sx lock is held in any functions they call.
* Note that recursively calling to re-enable the hotkeys is sub-optimal.
to allow dumping per-thread machine specific notes. On ia64 we use this
function to flush the dirty registers onto the backingstore before we
write out the PRSTATUS notes.
Tested on: alpha, amd64, i386, ia64 & sparc64
Not tested on: arm, powerpc
a standard configuration similar to [NO_]ADAPTIVE_MUTEXES. This
feature causes Giant to be included in the set of mutexes adaptively
spun on. It appears to have a positive effect on performance on SMP
across several workloads, including measurements of a 16% improvement
on buildworld, and 30%+ improvement for MySQL using the supersmack
benchmark with Giant over the network stack; a 6% improvement without
Giant on the network stack (as a result of less giant contention).
fix the obvious bugs, nastier ones reside below the surfac), and having
it commented out here just encourages people to try it.
# I'm not removing it from the base system, yet.
calls. Note that the information included is a bit different from the
existing KTR traces generated on powerpc, as I'm primarily interested
in kernel context (thread, syscall #, proc, etc), not the user
arguments to the system call. Some convergence would be useful here.
location (for the wake code). It should not be needed since we don't
map other pages at the same location and if there was an old mapping, it
would be restored by a fault. The old code had serious problems, namely
that it was restoring the new page it had just removed (not opage) and
it could only guess at the right protection (since there's no
pmap_extract_protect function). Thanks to Alan Cox for explaining much
of this to me.
Also, remove a commented-out initializecpu() call since it is not needed.
Restoring the cpu context is better than attempting to init from scratch.
Reviewed by: alc (earlier version)
spin-wait code to use the same spin mutex (smp_tlb_mtx) as the TLB ipi
and spin-wait code snippets so that you can't get into the situation of
one CPU doing a TLB shootdown to another CPU that is doing a lazy pmap
shootdown each of which are waiting on each other. With this change, only
one of the CPUs would do an IPI and spin-wait at a time.
vm_page_sleep_if_busy() and the page table page's busy flag as a
synchronization mechanism on page table pages.
Also, relocate the inline pmap_unwire_pte_hold() so that it can be used
to shorten _pmap_unwire_pte_hold() on alpha and amd64. This places
pmap_unwire_pte_hold() next to a comment that more accurately describes
it than _pmap_unwire_pte_hold().
being defined, define and use a new MD macro, cpu_spinwait(). It only
expands to something on i386 and amd64, so the compiled code should be
identical.
Name of the macro found by: jhb
Reviewed by: jhb
pic_eoi_source() into one call. This halves the number of spinlock operations
and indirect function calls in the normal case of handling a normal (ithread)
interrupt. Optimize the atpic and ioapic drivers to use inlines where
appropriate in supporting the intr_execute_handlers() change.
This knocks 900ns, or roughly 1350 cycles, off of the time spent servicing an
interrupt in the common case on my 1.5GHz P4 uniprocessor system. SMP systems
likely won't see as much of a gain due to the ioapic being more efficient than
the atpic. I'll investigate porting this to amd64 soon.
Reviewed by: jhb
their own directory and module, leaving the MD parts in the MD
area (the MD parts _are_ part of the modules). /dev/mem and /dev/io
are now loadable modules, thus taking us one step further towards
a kernel created entirely out of modules. Of course, there is nothing
preventing the kernel from having these statically compiled.
- Enable recursion on the page queues lock. This allows calls to
vm_page_alloc(VM_ALLOC_NORMAL) and UMA's obj_alloc() with the page
queues lock held. Such calls are made to allocate page table pages
and pv entries.
- The previous change enables a partial reversion of vm/vm_page.c
revision 1.216, i.e., the call to vm_page_alloc() by vm_page_cowfault()
now specifies VM_ALLOC_NORMAL rather than VM_ALLOC_INTERRUPT.
- Add partial locking to pmap_copy(). (As a side-effect, pmap_copy()
should now be faster on i386 SMP because it no longer generates IPIs
for TLB shootdown on the other processors.)
- Complete the locking of pmap_enter() and pmap_enter_quick(). (As of now,
all changes to a user-level pmap on alpha, amd64, and i386 are performed
with appropriate locking.)
dereference curthread. It is called only from critical_{enter,exit}(),
which already dereferences curthread. This doesn't seem to affect SMP
performance in my benchmarks, but improves MySQL transaction throughput
by about 1% on UP on my Xeon.
Head nodding: jhb, bmilekic
the thread ID and call db_trace_thread().
Since arm has all the logic in db_stack_trace_cmd(), rename the
new DB_COMMAND function to db_stack_trace to avoid conflicts on
arm.
While here, have db_stack_trace parse its own arguments so that
we can use a more natural radix for IDs. If the ID is not a thread
ID, or more precisely when no thread exists with the ID, try if
there's a process with that ID and return the first thread in it.
This makes it easier to print stack traces from the ps output.
requested by: rwatson@
tested on: amd64, i386, ia64
a fast interrupt handler doing an swi_sched(). This fixed the lockups I
saw on my laptop when using xmms in KDE and on rwatson's MySQL benchmarks
on SMP. This will eventually be removed and/or modified when I figure out
what the root cause is and fix that.
pmap_remove_page(). The reason being that pmap_pte_quick() requires
the page queues lock, which is already held, rather than Giant.
- Assert that the page queues lock is held in pmap_remove_page() and
pmap_remove_pte().
future:
rename ttyopen() -> tty_open() and ttyclose() -> tty_close().
We need the ttyopen() and ttyclose() for the new generic cdevsw
functions for tty devices in order to have consistent naming.
pmap_protect() and pmap_remove(). In general, they require the lock in
order to modify a page's pv list or flags. In some cases, however,
pmap_protect() can avoid acquiring the lock.
for unknown events.
A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
pmap_remove_pages(). (The implementation of pmap_remove_pages() is
optional. If pmap_remove_pages() is unimplemented, the acquisition and
release of the page queues lock is unnecessary.)
Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().
a problem that could also be fixed differently without reverting previous
attempts to fix DELAY while the debugger is active (rev 1.204). The bug
was that the i8254 implements a countdown timer, while for (k)db_active
a countup timer was implemented. This resulted in premature termination
and consequently the breakage of DELAY. The fix (relative to rev 1.211)
is to implement a countdown timer for the kdb_active case. As such the
ability to step clock initialization is preserved and DELAY does what is
expected of it.
Blushed: bde :-)
Submitted by: bde
Most of the changes are a direct result of adding thread awareness.
Typically, DDB_REGS is gone. All registers are taken from the
trapframe and backtraces use the PCB based contexts. DDB_REGS was
defined to be a trapframe on all platforms anyway.
Thread awareness introduces the following new commands:
thread X switch to thread X (where X is the TID),
show threads list all threads.
The backtrace code has been made more flexible so that one can
create backtraces for any thread by giving the thread ID as an
argument to trace.
With this change, ia64 has support for breakpoints.
o Make debugging code conditional upon KDB instead of DDB.
o Declare ksym_start and ksym_end as extern and initialize them.
This was previously and bogusly handled by DDB itself.
o Call kdb_enter() instead of Debugger().
o Remove implementation of Debugger().
debugger is not active. The fixes breakages of DELAY() when
running in the debugger, because not calling getit() when the
debugger is active yields a DELAY that doesn't.
o s/ddb_on_nmi/kdb_on_nmi/g
o Rename sysctl machdep.ddb_on_nmi to machdep.kdb_on_nmi
o Make debugging support conditional upon KDB instead of DDB.
o Call kdb_reenter() when kdb_active is non-zero.
o Call kdb_trap() to enter the debugger when not already active.
o Update comments accordingly.
o Remove misplaced prototype of kdb_trap().
o Make debugging code conditional upon KDB instead of DDB.
o Call kdb_enter() instead of Debugger().
o Remove local (static) variable in_debugger. Use kdb_active instead.
a PCB from a trapframe for purposes of unwinding the stack. The PCB
is used as the thread context and all but the thread that entered the
debugger has a valid PCB.
This function can also be used to create a context for the threads
running on the CPUs that have been stopped when the debugger got
entered. This however is not done at the time of this commit.
in which multiple (presumably different) debugger backends can be
configured and which provides basic services to those backends.
Besides providing services to backends, it also serves as the single
point of contact for any and all code that wants to make use of the
debugger functions, such as entering the debugger or handling of the
alternate break sequence. For this purpose, the frontend has been
made non-optional.
All debugger requests are forwarded or handed over to the current
backend, if applicable. Selection of the current backend is done by
the debug.kdb.current sysctl. A list of configured backends can be
obtained with the debug.kdb.available sysctl. One can enter the
debugger by writing to the debug.kdb.enter sysctl.
backend improves over the old GDB support in the following ways:
o Unified implementation with minimal MD code.
o A simple interface for devices to register themselves as debug
ports, ala consoles.
o Compression by using run-length encoding.
o Implements GDB threading support.
bootp -> BOOTP
bootp.nfsroot -> BOOTP_NFSROOT
bootp.nfsv3 -> BOOTP_NFSV3
bootp.compat -> BOOTP_COMPAT
bootp.wired_to -> BOOTP_WIRED_TO
- i.e. back out the previous commit. It's already possible to
pxeboot(8) with a GENERIC kernel.
Pointed out by: dwmalone
BOOTP -> bootp
BOOTP_NFSROOT -> bootp.nfsroot
BOOTP_NFSV3 -> bootp.nfsv3
BOOTP_COMPAT -> bootp.compat
BOOTP_WIRED_TO -> bootp.wired_to
This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:
bootp="YES"
bootp.nfsroot="YES"
bootp.nfsv3="YES"
bootp.wired_to="bge1"
or even setting the variables manually from the OK prompt.
When two drivers share an ISA DMA channel, they both call isa_dmainit()
and the second call fails if DIAGNOSTIC is on.
If isa_dmainit() was already called successfully, just return silently.
This only works if both drivers agree on the bounce buffer size,
but since sharing DMA is usually only possible on very special
hardware and then typically only for devices of the same type (which
would have multiple instances of the same device driver), this is
not a problem in practice.
belong in the respective drivers. I've not removed ALL of them, as a
few still haven't moved. I've just removed the ones that aren't used.
# these can be removed from amd64, but I'm having issues getting to
# sledge at the moment for a build.
honor the alignment and boundary constraints in the dma tag when loading
buffers. Previously, these constraints were only honored when allocating
memory via bus_dmamem_alloc(). Now, bus_dmamap_load() will automatically
use bounce buffers when needed.
Also add a set of sysctls to monitor the global busdma stats. These are:
hw.busdma.free_bpages
hw.busdma.reserved_bpages
hw.busdma.active_bpages
hw.busdma.total_bpages
hw.busdma.total_bounced
hw.busdma.total_deferred
than as one-off hacks in various other parts of the kernel:
- Add a function maybe_preempt() that is called from sched_add() to
determine if a thread about to be added to a run queue should be
preempted to directly. If it is not safe to preempt or if the new
thread does not have a high enough priority, then the function returns
false and sched_add() adds the thread to the run queue. If the thread
should be preempted to but the current thread is in a nested critical
section, then the flag TDF_OWEPREEMPT is set and the thread is added
to the run queue. Otherwise, mi_switch() is called immediately and the
thread is never added to the run queue since it is switch to directly.
When exiting an outermost critical section, if TDF_OWEPREEMPT is set,
then clear it and call mi_switch() to perform the deferred preemption.
- Remove explicit preemption from ithread_schedule() as calling
setrunqueue() now does all the correct work. This also removes the
do_switch argument from ithread_schedule().
- Do not use the manual preemption code in mtx_unlock if the architecture
supports native preemption.
- Don't call mi_switch() in a loop during shutdown to give ithreads a
chance to run if the architecture supports native preemption since
the ithreads will just preempt DELAY().
- Don't call mi_switch() from the page zeroing idle thread for
architectures that support native preemption as it is unnecessary.
- Native preemption is enabled on the same archs that supported ithread
preemption, namely alpha, i386, and amd64.
This change should largely be a NOP for the default case as committed
except that we will do fewer context switches in a few cases and will
avoid the run queues completely when preempting.
Approved by: scottl (with his re@ hat)
pv entries per 1GB of user virtual memory. (eg: if we had 1GB file was
mmaped into 30 processes, that would theoretically reduce the KVA demand by
30MB for pv entries. In reality though, we limit pv entries so we don't
have that many at once.)
We used to store the vm_page_t for the page table page. But we recently
had the pa of the ptp, or can calculate it fairly quickly. If we wanted
to avoid the shift/mask operation in pmap_pde(), we could recover the
pa but that means we have to store it for a while.
This does not measurably change performance.
Suggested by: alc
Tested by: alc
This should fix problems with older SMP systems that only have ISA/EISA
IRQs when routing virgin PCI interrupts as well as on other boxes whose
MADT does not have any interrupt override entries for ISA IRQs that are
used to route PCI interrupts even in APIC mode.
- Allow ioapic_set_{nmi,smi,extint}() to be called multiple times on the
same pin so long as the pin's mode is the same as the mode being
requested.
- Add a notion of bus type for the interrupt associated with interrupt pin.
This is needed so that we can force all EISA interrupts to be active high
in the forthcoming ioapic_config_intr().
- Fix a bug for EISA systems that didn't remap IRQs. This would have broken
EISA systems that tried to disable mixed mode for IRQ 0.
in npxsetregs() too. npxsetregs() must overwrite the previous state, and
it is never paired with an npxgetregs() that would defuse the previous
state (since npxgetregs() would have fninit'ed the state, leaving nothing
to do).
PR: 68058 (this should complete the fix)
Tested by: Simon Barner <barner@in.tum.de>
unnecessary because cpu_setregs() and/or npxinit() always sets CR0_TS
during system initialization, and CR0_TS is set in the next statement
(fpstate_drop()) if necessary after system initialization. Setting
it unnecessarily was less than a pessimization since it broke the
invariant that the npx can be used without an npxdna() trap if
fpucurthread is non-null. The broken invariant became harmful when I
added an fnclex to npxdrop().
Removed setting of CR0_MP in exec_setregs(). This was similarly
unnecessary but was harmless.
Updated comments (mainly by removing them). Things are simpler now
that we have cpu_setregs() and don't support a math emulator or pretend
to support not having either a math emulator or an npx.
Removed the ifdef for avoiding setting CR0_NE in the !SMP case in
cpu_setregs(). npx_probe() should reverse the setting if it wants to
force IRQ13 exception handling for testing.
frstor can trap despite it being a control instruction, since it bogusly
checks for pending exceptions in the state that it is overwriting.
This used to be a non-problem because frstor was always paired with a
previous fnsave, and fnsave does an implicit fninit so any pending
exceptions only remain live in the saved state. Now frstor is sometimes
paired with npxdrop() and we must do a little more than just forget
that the npx was used in npxdrop() to avoid a trap later. This is a
non-problem in the FXSR case because fxrstor doesn't do the bogus check.
FXSR is part of SSE, and npxdrop() is only in FreeBSD-5.x, so this bug
only affected old machines running FreeBSD-5.x.
PR: 68058
devclass will be present even if the driver was disabled by a hint. Using
device_get_softc() provides the right info even if it's overkill.
Explained by: jhb
placates gcc which seems to like to complain about -1 being assigned to
an unsigned value. It is well defined and intended, but since signess bugs
are being hunted just change to 0xffffffff.
o Mask the lower 8 bits, not the lower 4 bits for the ai_capabilities word.
All 8 bits are defined and the 0xf was almost certainly a typo.
o Define APM_UNKNOWN to 0xff for emulation layer.
Otherwise, the setting of the PG_M bit by one processor could be lost if
another processor is simultaneously changing the PG_W bit.
Reviewed by: tegge@
present and thus that the PnPBIOS probe should be skipped instead of
having ACPI zero out the PnPBIOStable pointer.
- Make the PnPBIOStable pointer static to i386/i386/bios.c now that that is
the only place it is used.
Dividing by 0 in order to check for irq13/exception16 delivery apparently
always causes an irq13 even if we have configured for exception16 (by
setting CR0_NE). This was expected, but the timing of the irq13 was
unexpected. Without CR0_NE, the irq13 is delivered synchronously at
least on my test machine, but with CR0_NE it is delivered a little
later (about 250 nsec) in PIC mode and much later (5000-10000 nsec)
in APIC mode. So especially in APIC mode, the irq13 may arrive after
it is supposed to be shut down. It should then be masked, but the
shutdown is incomplete, so the irq goes to a null handler that just
reports it as stray. The fix is to wait a bit after dividing by 0 to
give a good chance of the irq13 being handled by its proper handler.
Removed the hack that was supposed to recover from the incomplete shutdown
of irq13. The shutdown is now even more incomplete, or perhaps just
incomplete in a different way, but the hack now has no effect because
irq13 is edge triggered and handling of edge triggered interrupts is
now optimized by skipping their masking. The hack only worked due
to it accidentally not losing races.
The incomplete shutdown of irq13 still allows unprivileged users to
generate a stray irq13 (except on systems where irq13 is actually used)
by unmasking an npx exception and causing one. The exception gets
handled properly by the exception 16 handler. A spurious irq13 is
delivered asynchronously but is harmless (as in the probe) because it
is almost perfectly not handled by the null interrupt handler.
Perfectly not handling it involves mainly not resetting the npx busy
latch. This prevents further irq13's despite them not being masked in
the [A]PIC.
size_t and size_t *, respectively. Update callers for the new interface.
This is a better fix for overflows that occurred when dumping segments
larger than 2GB to core files.
APIC interrupt pin based on the settings in the corresponding interrupt
source structure.
- Use ioapic_program_intpin() in place of manual frobbing of the intpin
configuration in ioapic_program_destination() and ioapic_register().
- Use ioapic_program_intpin() to implement suspend/resume support for I/O
APICs.
mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.
Extensions to UMA worth noting:
- Better layering between slab <-> zone caches; introduce
Keg structure which splits off slab cache away from the
zone structure and allows multiple zones to be stacked
on top of a single Keg (single type of slab cache);
perhaps we should look into defining a subset API on
top of the Keg for special use by malloc(9),
for example.
- UMA_ZONE_REFCNT zones can now be added, and reference
counters automagically allocated for them within the end
of the associated slab structures. uma_find_refcnt()
does a kextract to fetch the slab struct reference from
the underlying page, and lookup the corresponding refcnt.
mbuma things worth noting:
- integrates mbuf & cluster allocations with extended UMA
and provides caches for commonly-allocated items; defines
several zones (two primary, one secondary) and two kegs.
- change up certain code paths that always used to do:
m_get() + m_clget() to instead just use m_getcl() and
try to take advantage of the newly defined secondary
Packet zone.
- netstat(1) and systat(1) quickly hacked up to do basic
stat reporting but additional stats work needs to be
done once some other details within UMA have been taken
care of and it becomes clearer to how stats will work
within the modified framework.
From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used. The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.
Additional things worth noting/known issues (READ):
- One report of 'ips' (ServeRAID) driver acting really
slow in conjunction with mbuma. Need more data.
Latest report is that ips is equally sucking with
and without mbuma.
- Giant leak in NFS code sometimes occurs, can't
reproduce but currently analyzing; brueffer is
able to reproduce but THIS IS NOT an mbuma-specific
problem and currently occurs even WITHOUT mbuma.
- Issues in network locking: there is at least one
code path in the rip code where one or more locks
are acquired and we end up in m_prepend() with
M_WAITOK, which causes WITNESS to whine from within
UMA. Current temporary solution: force all UMA
allocations to be M_NOWAIT from within UMA for now
to avoid deadlocks unless WITNESS is defined and we
can determine with certainty that we're not holding
any locks when we're M_WAITOK.
- I've seen at least one weird socketbuffer empty-but-
mbuf-still-attached panic. I don't believe this
to be related to mbuma but please keep your eyes
open, turn on debugging, and capture crash dumps.
This change removes more code than it adds.
A paper is available detailing the change and considering
various performance issues, it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.
Testing and Debugging:
rwatson,
brueffer,
Ketrien I. Saihr-Kesenchedra,
...
Reviewed by: Lots of people (for different parts)
of this micro-optimization occurs when we call pmap_enter() to wire an
already mapped page. Because of the micro-optimization, we fail to
mark the PTE as wired. Later, on teardown of the address space,
pmap_remove_pages() destroys the PTE before vm_fault_unwire() has
unwired the page. (pmap_remove_pages() is not supposed to destroy
wired PTEs. They are destroyed by a later call to pmap_remove().)
Thus, the page becomes lost.
Note: The page is not lost if the application called munlock(2), only
if it relies on teardown of the address space to unwire its pages.
For the historically inclined, this bug was introduced by a
megacommit, revision 1.182, roughly six years ago.
Leak observed by: green@ and dillon independently
Patch submitted by: dillon at backplane dot com
Reviewed by: tegge@
MFC after: 1 week
been developed for use with FreeBSD, version 4.8 and later.
Submitted by: Hema Joyce
Reviewed by: Prafulla Deuskar
Approved by: Prafulla Deuskar
MFC after: 1 week
gmon and struct gmonhdr was originally just to represent the kernel
(profiling) clock frequency and it remains poorly suited to representing
the frequencies of fast counters like the TSC. It broke a year or two
ago. This quick fix keeps it working for another year or month or two
until TSC frequencies can exceed 2^32, by dividing the frequency by 2.
Dividing the frequency by 4 would work for a little longer but would
lose a little too much precision.
Fixed profiling of trap, syscall and interrupt handlers and some
ordinary functions, essentially by backing out half of rev.1.106 of
i386/exception.s. The handlers must be between certain labels for
the purposes of profiling, and this was broken by scattering them in
separately compiled .s files, especially for ordinary functions that
ended up between the labels. Merge the files by #including them as
before, except with different pathnames and better comments and
organization. Changes to the scattered files are minimal -- just
move the labels to the file that does the #includes.
This also partly fixes profiling of IPIs -- all IPI handlers are now
correctly classified as interrupt handlers, but many are still missing
mcount calls.
vm86bios.s is included as before, but it is now between the labels for
interrupt handlers again, which seems to be wrong since half of it is
for a non-interrupt handler.
polarity rather than assuming that level triggered IRQs use active low and
edge triggered IRQs use active high. Both the MultiProcessor 1.4
and ACPI 2.0 Specifications state in their examples that level triggered
EISA IRQs are active low, but in practice they seem to be active high.
Reported by: Nik Azim Azam nskyline_r35 at yahoo dot com
high resolution kernel profiling (options GUPROF. "U" in GUPROF stands
for microseconds resolution, but the resolution is now smaller than 1
nanosecond on multi-GHz machines and the accuracy is heading towards
1 nanosecond too). Arches that support GUPROF must now provide certain
macros for the calibration. GUPROF is now only supported for i386's,
so the absence of the new macros for other arches doesn't break anything
that wasn't already broken. amd64's have uncommitted support for
GUPROF, and sparc64's have support that seems to be complete except
here (there was an #error for non-i386 cases; now there are undefined
macros).
Changed the asms a little:
- declare them as __volatile. They must not be moved, and exporting a
label across asms is technically incorrect, so try harder to stop gcc
moving them.
- don't put the non-clobbered register "bx" in the clobber list. The
clobber lists are still more conservative than necessary.
- drop the non-support for gcc-1. It just gave a better error message,
and this is not useful since compiling with gcc-1 would cause thousands
of worse error messages.
- drop the support for aout.
to <sys/gmon.h>. Cleaned them up a little by not attempting to ifdef
for incomplete and out of date support for GUPROF in userland, as in
the sparc64 version.
elf_reloc() backends for two reasons. First, to support the possibility
of there being two elf linkers in the kernel (eg: amd64), and second, to
pass the relocbase explicitly (for relocating .o format kld files).
are used.
- Reduce duplication of a couple of macros removing the duplicates from
ich.h.
- Remove unused macros from icu.h as well as locore protection as this
header is no longer included in assembly sources.
- Require the APIC enumerators to explicitly enable mixed mode by calling
ioapic_enable_mixed_mode(). Calling this function tells the apic driver
that the PC-AT 8259A PICs are present and routable through the first I/O
APIC via an ExtINT pin. The mptable enumerator always calls this
function for now. The MADT enumerator only enables mixed mode if the
PC-AT compatability flag is set in the MADT header.
- Allow mixed mode to be enabled or disabled via a 'hw.apic.mixed_mode'
tunable. By default this tunable is set to 1 (true). The kernel option
NO_MIXED_MODE changes the default to 0 to preserve existing behavior, but
adding 'hw.apic.mixed_mode=0' to loader.conf achieves the same effect.
- Only use mixed mode to route IRQ 0 if it is both enabled by the APIC
enumerator and activated by the loader tunable. Note that both
conditions must be true, so if the APIC enumerator does not enable mixed
mode, then you can't set the tunable to try to override the enumerator.
to map. If the checksum fails, the table is unmapped and a NULL pointer
returned.
- For ACPI version >= 2.0, check the extended checksum of the RSDP.
AcpiOsGetRootPointer() already checks the version 1.0 checksum.
- Remap the full MADT table at the end of madt_probe() so that we verify
its checksum before saying it is really there.
Requested by: njl
individual asm versions. The global lock is shared between the BIOS and
OS and thus cannot use our mutexes. It is defined in section 5.2.9.1 of
the ACPI specification.
Reviewed by: marcel, bde, jhb
host-PCI bridge device and find a valid $PIR.
- Make pci_pir_parse() private to pci_pir.c and have pir0's attach routine
call it instead of having legacy_pcib_attach() call it.
- Implement suspend/resume support for the $PIR by giving pir0 a resume
method that calls the BIOS to reroute each link that was already routed
before the machine was suspended.
- Dump the state of the routed flag in the links display code.
- If a link's IRQ is set by a tunable, then force that link to be re-routed
the first time it is used.
- Move the 'Found $PIR' message under bootverbose as the pir0 description
line lists the number of entries already. The pir0 line also only shows
up if we are actually using the $PIR which is a bonus.
- Use BUS_CONFIG_INTR() to ensure that any IRQs used by a PCI link are
set to level/low trigger/polarity.
active low polarity when using the PIC interrupt model. This should fix
broken SCI interrupts on machines when not using the APIC where the BIOS
doesn't program the ELCR to level trigger for the ACPI SCI.
Requested by: njl
polarity for a specified IRQ. The intr_config_intr() function wraps
this pic method hiding the IRQ to interrupt source lookup.
- Add a config_intr() method to the atpic(4) driver that reconfigures
the interrupt using the ELCR if possible and returns an error otherwise.
- Add a config_intr() method to the apic(4) driver that just logs any
requests that would change the existing programming under bootverbose.
Currently, the only changes the apic(4) driver receives are due to bugs
in the acpi(4) driver and its handling of link devices, hence the reason
for such requests currently being ignored.
- Have the nexus(4) driver on i386 implement the bus_config_intr() function
by calling intr_config_intr().
and intr_polarity enums for passing around interrupt trigger modes and
polarity rather than using the magic numbers 0 for level/low and 1 for
edge/high.
- Convert the mptable parsing code to use the new ELCR wrapper code rather
than reading the ELCR directly. Also, use the ELCR settings to control
both the trigger and polarity of EISA IRQs instead of just the trigger
mode.
- Rework the MADT's handling of the ACPI SCI again:
- If no override entry for the SCI exists at all, use level/low trigger
instead of the default edge/high used for ISA IRQs.
- For the ACPI SCI, use level/low values for conforming trigger and
polarity rather than the edge/high values we use for all other ISA
IRQs.
- Rework the tunables available to override the MADT. The
hw.acpi.force_sci_lo tunable is no longer supported. Instead, there
are now two tunables that can independently override the trigger mode
and/or polarity of the SCI. The hw.acpi.sci.trigger tunable can be
set to either "edge" or "level", and the hw.acpi.sci.polarity tunable
can be set to either "high" or "low". To simulate hw.acpi.force_sci_lo,
set hw.acpi.sci.trigger to "level" and hw.acpi.sci.polarity to "low".
If you are having problems with ACPI either causing an interrupt storm
or not working at all (e.g., the power button doesn't turn invoke a
shutdown -p now), you can try tweaking these two tunables to find the
combination that works.
IRQ is edge triggered or level triggered. For ISA interrupts, we assume
that edge triggered interrupts are always active high and that level
triggered interrupts are always active low.
- Don't disable an edge triggered interrupt in the PIC. This avoids
outb instructions to the actual PIC for traditional ISA IRQs such as
IRQ 1, 6, 14, and 15. (Fast interrupts such as IRQs 0 and 8 don't mask
their source, so this doesn't change anything for them.)
- For MCA systems we assume that all interrupts are level triggered and
thus need masking. Otherwise, we probe the ELCR. If it exists we trust
what it tells us regarding which interrupts are level triggered. If it
does not exist, we assume that IRQs 0, 1, 2, and 8 are edge triggered
and that all other IRQs are level triggered and need masking.
- Instruct the ELCR mini-driver to restore its saved state during resume.
register controlled the trigger mode and polarity of EISA interrupts.
However, it appears that most (all?) PCI systems use the ELCR to manage
the trigger mode and polarity of ISA interrupts as well since ISA IRQs used
to route PCI interrupts need to be level triggered with active low
polarity. We check to see if the ELCR exists by sanity checking the value
we get back ensuring that IRQS 0 (8254), 1 (atkbd), 2 (the link from the
slave PIC), and 8 (RTC) are all clear indicating edge trigger and active
high polarity.
This mini-driver will be used by the atpic driver to manage the trigger and
polarity of ISA IRQs. Also, the mptable parsing code will use this mini
driver rather than examining the ELCR directly.
interrupt source.
- Only do an outb() to the PIC to clear a bit in imen if the bit is set.
- Add a NUM_ISA_IRQS macro to replace uglier
'sizeof(array) / sizeof(member)' expressions along with a CTASSERT() to
ensure that the macro is correct.
than using legacy_pcib_attach(). The MP Table drivers don't use the $PIR,
and the legacy_pcib_attach() function probes and parses the $PIR in
addition to adding the pci bus child device.
parameter).
Keep using it only in the i386 NOTES for now. It is fairly MI, but it
doesn't use bus-space and has a couple of i386 i/o instructions in pci
intitialization.
correct interrupt source.
- Cache a pointer to the i8254_intsrc's pending method to avoid several
pointer indirections in i8254_get_timecount().
Reported by: bde
modules is a very nice way to produce hard-to-find panics. Who would look for
a bug in a Makefile anyway?
Has anyone seen the pointy hat? :-o
Approved by: njl (mentor)
gadgets (hotkeys, lcd, ...) on Asus laptops. I aim to closely track the
acpi4asus project which implements these features in the Linux kernel.
If this breaks your laptop, please let me know how it does it :-)
Approved by: njl (mentor)
there are a lot of other dependencies that preclude the kernel from
working). Instead, have a more generic note that isa should not be
removed. This should be less confusing for users.
different BIOSs use the same exact settings to mean two very different and
incompatible things for the SCI. Thus, if the SCI is remapped to a PCI
interrupt, we now trust the trigger/polarity that the MADT provides by
default. However, the SCI can be forced to level/lo as 1.10 did by setting
the tunable "hw.acpi.force_sci_lo" to a non-zero value from the loader.
Thus, if rev 1.10 caused an interrupt storm, it should nwo fix your
machine. If rev 1.10 fixed an interrupt storm on your machine, you
probably need to set the aforementioned tunable in /boot/loader.conf to
prevent the interrupt storm.
The more general problem of getting the SCI's trigger/polarity programmed
"correctly" (for some value of correctly meaning several workarounds for
broken BIOSs and inconsistent "implementations" of the ACPI standard) is
going to require more work, but this band-aid should improve the current
situation somewhat.
Requested by: njl
change the video output but use a separate device with a DSSX method
and a HID of "TOS6201" instead. We use a pseudo-driver to get the handle
for this object and pass it to the acpi_toshiba driver.
This is untested but seems to match the Linux Toshiba driver.
move its declaration to the machine-dependent header file on those
machines that use it. In principle, only i386 should have it.
Alpha and AMD64 should use their direct virtual-to-physical mapping.
- Remove pmap_kenter_temporary() from ia64. It is unused. Approved
by: marcel@
without the (defunct) isa compatibility shims. The new-bus-specific
parts are very similar to the ones for the pci probe and attach.
This was held up too long waiting for a repo copy to src/sys/dev/cy,
so I decided to fix the files in their old place. This gives easier
to read and merge diffs anyway.
The "count" line in src/sys/conf/files won't be changed until after
the repo copy, so old kernel configs that specify a count need not be
(and must not be) changed until then. The count is just ignored in
the driver. One unfinished detail is dynamic allocation of arrays
with <count> and (<count> * 32) entries, and iteration over the arrays.
This is now kludged with a fixed count of 10 (up to 10 cards with up
to 32 ports each).
Prodded by: imp
Submitted by: mostly by imp
Approved by: imp
common attach function so that the lock gets initialized in all cases.
This fixes breakage of the initialization of the lock in the pci case
in rev.1.135 (between the releases of 5.1 and 5.2). The lock is only
used in the SMP case, so this bug was not always fatal.
it belongs. Change the implementation to match those of rfs() and
rgs() for consistency and irrespective of whether the original was
more correct or not (technically speaking).
instead of treating it as an unimplemented syscall. This appears to make
StarOffice 7.0 Linux binaries work according to submitter; also tested
with nvidia driver by submitter.
Submitted by: Matthias Schuendehuette
with more than the normal amount of stack pages, however the stack
pointer always wound up being initialized using KSTACK_PAGES. It
should be using td->td_kstack_pages instead. This means that although
the vm subsystem would give you all the stack pages you asked for,
%esp would always be initialized as if you had just 2 pages, and
the rest would go to waste.
I wanted to use the 'give me more stack pages' feature of kthread_create()
because the Intel 2200BG NDIS driver does an alloca() of about 5000 bytes,
which wrecks the stack with the default 2 page size, and I was baffled
that no matter how much code I shoved into thread contexts with
allegedly larger stacks, the thing would still crash unless I changed
KSTACK_PAGES.
Note: this bug is present in _ALL_ arches at this point. Peter has
promised to merge this fix into all of them.
level of abstraction for any and all CPU mask and CPU bitmap variables
so that platforms have the ability to break free from the hard limit
of 32 CPUs, simply because we don't have more bits in an u_int. Note
that the type is not supposed to solve massive parallelism, where
the number of CPUs can be larger than the width of the widest integral
type. As such, cpumask_t is not supposed to be a compound type. If
such would be necessary in the future, we can deal with the issues
then and there. For now, it can be assumed that the type is integral
and unsigned.
With this commit, all MD definitions start off as u_int. This allows
us to phase-in cpumask_t at our leasure without breaking anything.
Once cpumask_t is used consistently, platforms can switch to wider
(or smaller) types if such would be beneficial (or not; whatever :-)
Compile-tested on: i386
options, status pointer and rusage pointer as arguments. It is up to
the caller to copyout the status and rusage to userland if needed. This
lets us axe the 'compat' argument and hide all that functionality in
owait(), by the way. This also cleans up some locking in kern_wait()
since it no longer has to drop locks around copyout() since all the
copyout()'s are deferred.
- Convert owait(), wait4(), and the various ABI compat wait() syscalls to
use kern_wait() rather than wait1() or wait4(). This removes a bit
more stackgap usage.
Tested on: i386
Compiled on: i386, alpha, amd64
dependent function by the same name and a machine-independent function,
sf_buf_mext(). Aside from the virtue of making more of the code machine-
independent, this change also makes the interface more logical. Before,
sf_buf_free() did more than simply undo an sf_buf_alloc(); it also
unwired and if necessary freed the page. That is now the purpose of
sf_buf_mext(). Thus, sf_buf_alloc() and sf_buf_free() can now be used
as a general-purpose emphemeral map cache.
Only cy, bs and wd in the tree still use it. I have a replacement for
cy that I need to test on ISA and PCI cards. bs and wd are pc98 only
drivers that appear to no longer be necessary. I'll be removing them
when I hear back from the pc98 people.
this driver is being retired. Remove it from the tree. If someone
wants to update it to the latest APIs and can test the hardware, it
can return to the tree.
COMPAT_PCI api. This API is going away, so this driver is going away
also.
If users are interested in updating this, please contact the author
since he has some preliminary work to move this to newer APIs.
driver uses COMPAT_ISA shims, and those shims are going away.
It can be brought back if someone updates it to the latest APIs, and
moves it to the appropriate place in the tree.
to build the kernel. It doesn't affect the operation if gcc.
Most of the changes are just adding __INTEL_COMPILER to #ifdef's, as
icc v8 may define __GNUC__ some parts may look strange but are
necessary.
Additional changes:
- in_cksum.[ch]:
* use a generic C version instead of the assembly version in the !gcc
case (ASM code breaks with the optimizations icc does)
-> no bad checksums with an icc compiled kernel
Help from: andre, grehan, das
Stolen from: alpha version via ppc version
The entire checksum code should IMHO be replaced with the DragonFly
version (because it isn't guaranteed future revisions of gcc will
include similar optimizations) as in:
---snip---
Revision Changes Path
1.12 +1 -0 src/sys/conf/files.i386
1.4 +142 -558 src/sys/i386/i386/in_cksum.c
1.5 +33 -69 src/sys/i386/include/in_cksum.h
1.5 +2 -0 src/sys/netinet/igmp.c
1.6 +0 -1 src/sys/netinet/in.h
1.6 +2 -0 src/sys/netinet/ip_icmp.c
1.4 +3 -4 src/contrib/ipfilter/ip_compat.h
1.3 +1 -2 src/sbin/natd/icmp.c
1.4 +0 -1 src/sbin/natd/natd.c
1.48 +1 -0 src/sys/conf/files
1.2 +0 -1 src/sys/conf/files.amd64
1.13 +0 -1 src/sys/conf/files.i386
1.5 +0 -1 src/sys/conf/files.pc98
1.7 +1 -1 src/sys/contrib/ipfilter/netinet/fil.c
1.10 +2 -3 src/sys/contrib/ipfilter/netinet/ip_compat.h
1.10 +1 -1 src/sys/contrib/ipfilter/netinet/ip_fil.c
1.7 +1 -1 src/sys/dev/netif/txp/if_txp.c
1.7 +1 -1 src/sys/net/ip_mroute/ip_mroute.c
1.7 +1 -2 src/sys/net/ipfw/ip_fw2.c
1.6 +1 -2 src/sys/netinet/igmp.c
1.4 +158 -116 src/sys/netinet/in_cksum.c
1.6 +1 -1 src/sys/netinet/ip_gre.c
1.7 +1 -2 src/sys/netinet/ip_icmp.c
1.10 +1 -1 src/sys/netinet/ip_input.c
1.10 +1 -2 src/sys/netinet/ip_output.c
1.13 +1 -2 src/sys/netinet/tcp_input.c
1.9 +1 -2 src/sys/netinet/tcp_output.c
1.10 +1 -1 src/sys/netinet/tcp_subr.c
1.10 +1 -1 src/sys/netinet/tcp_syncache.c
1.9 +1 -2 src/sys/netinet/udp_usrreq.c
1.5 +1 -2 src/sys/netinet6/ipsec.c
1.5 +1 -2 src/sys/netproto/ipsec/ipsec.c
1.5 +1 -1 src/sys/netproto/ipsec/ipsec_input.c
1.4 +1 -2 src/sys/netproto/ipsec/ipsec_output.c
and finally remove
sys/i386/i386 in_cksum.c
sys/i386/include in_cksum.h
---snip---
- endian.h:
* DTRT in C++ mode
- quad.h:
* we don't use gcc v1 anymore, remove support for it
Suggested by: bde (long ago)
- assym.h:
* avoid zero-length arrays (remove dependency on a gcc specific
feature)
This change changes the contents of the object file, but as it's
only used to generate some values for a header, and the generator
knows how to handle this, there's no impact in the gcc case.
Explained by: bde
Submitted by: Marius Strobl <marius@alchemy.franken.de>
- aicasm.c:
* minor change to teach it about the way icc spells "-nostdinc"
Not approved by: gibbs (no reply to my mail)
- bump __FreeBSD_version (lang/icc needs to know about the changes)
Incarnations of this patch survive gcc compiles since a loooong time,
I use it on my desktop. An icc compiled kernel works since Nov. 2003
(exceptions: snd_* if used as modules), it survives a build of the
entire ports collection with icc.
Parts of this commit contains suggestions or submissions from
Marius Strobl <marius@alchemy.franken.de>.
Reviewed by: -arch
Submitted by: netchild
in the non-_KERNEL case. This "fixes" applications that include
this "kernel-only" header and also include <strings.h> (or get
<strings.h> via the default _BSD_VISIBLE pollution in <string.h>.
In C++ there was a fatal error: the declaration specifies C linkage
but the implementation gives C++ linkage. In C there was only a
static/extern mismatch if the headers were included in a certain order
order, and a partially redundant declaration for all include orders;
gcc emits incomplete or wrong diagnostics for these, but only for
compiling with -Wsystem-headers and certain other warning options, so
the problem was usually not seen for C.
Ports breakage reported by: kris
pmap. For the kernel pmap, Giant is not required. In general, for
other pmaps, Giant is required by i386's pmap_pte() implementation.
Specifically, the use of PMAP2/PADDR2 is synchronized by Giant.
Note: In principle, updates to the kernel pmap's wired count could be
lost without Giant. However, in practice, we never use the kernel
pmap's wired count. This will be resolved when pmap locking appears.
- With the above change, cpu_thread_clean() and uma_large_free() need
not acquire Giant. (The first case is simply the revival of
i386/i386/vm_machdep.c's revision 1.226 by peter.)
in the Memory Mapped Configuration Region (MMCR) to reset the CPU.
If CPU_ELAN is set, try this first to reset the CPU before the
traditional way.
Without this change, my Compulab board powers down on 'reset' instead
of rebooting.
ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps,
there has been no reason for having this function on Alpha. Briefly,
when pmap_growkernel() relied upon the list of all processes to find and
update the various pmaps to reflect a growth in the kernel's valid
address space, pmap_init2() served to avoid a race between pmap
initialization and pmap_growkernel(). Specifically, pmap_pinit2() was
responsible for initializing the kernel portions of the pmap and
pmap_pinit2() was called after the process structure contained a pointer
to the new pmap for use by pmap_growkernel(). Thus, an update to the
kernel's address space might be applied to the new pmap unnecessarily,
but an update would never be lost.
- completely unused things
- all of rev.1.102 (C++ support). <sys/cdefs.h> is included by the
prerequisite <sys/types.h>. __BEGIN_DECLS/__END_DECLS has no effect
(except possibly if undefined behaviour is invoked using a hack like
defining away __inline) since this header doesn't really support any
extern functions.