interrupt such as IRQ 22 or 19. However, the ACPI BIOS still routes
interrupts from some PCI devices to the same intpin calling the pin
IRQ 22. Thus, ACPI expects to address a single interrupt source via two
different names. To work around this, if the SCI is remapped to a non-ISA
interrupt (i.e., greater than 15), then we use
acpi_OverrideInterruptLevel() function to tell ACPI to use IRQ 22 or 19
rather than IRQ 9 for the SCI.
Previously we would change IRQ 22 or 19's name to IRQ 9 when we encountered
such an Interrupt Source Override entry in the MADT which routed the SCI
properly but left PCI devices mapped to IRQ 22 or 19 w/o a routable
interrupt.
Tested by: sos
This ensures that uart gets a higher console priority than syscons when
a serial console is being used. Testing against the "console" environment
variable doesn't make sense since we only have one loader console driver.
should now only have HTT CPUs if they have explicitly asked for them
either by enabling HyperThreading in the BIOS or by using the
MPTABLE_FORCE_HTT kernel option.
should only be used if they are enabled in the BIOS. Now that we support
enumerating CPUs using the ACPI MADT, any HTT machine using ACPI should
respect the BIOS setting. For HTT machines with ACPI disabled in the
kernel, the MPTABLE_FORCE_HTT kernel option can be used to try to probe HTT
CPUs like have done in the past for the MP Table case. This option should
only be enabled if HTT is enabled in the BIOS.
that we currently do not keep track of whether the thread has
actually used the high FP registers before. If not, we should
not save them in the context which automaticly means that we
also would not restore them from the context. For now, do it
unconditionally so that we can reach functional completeness.
functions switched to using {g|s}et_mcontext(). The problem is that
sigreturn(), being a syscall, can be given an async. context (i.e.
one corresponding to an interrupt or trap). When this happens, we
try to return to user mode via epc_syscall_return with a trapframe
that can only be used to return to user mode via exception_restore.
To fix this, we check the frame's flags immediately prior to
epc_syscall_return and branch to exception_restore for non-syscall
frames. Modify the assertion in set_mcontext() to check that if
there's a mismatch, it's because of sigreturn().
ktr_resize_pool(); this eliminates a potential livelock.
Return ENOSPC only if we encountered an out-of-memory condition when
trying to increase the pool size.
Reviewed by: jhb, bde (style)
cache after a data access error we must discard all cache lines. When
disabled existing cache lines are not invalidated by stores to memory, so
we risk reading stale data that was cached before the data access error if
we don't flush them. This is especially fatal when the memory involved
is the active part of the kernel or user stack. For good measure we also
flush the instruction cache.
This fixes random crashes when the X server probes the PCI bus through
/dev/pci.
commit broke the world because it depended on namespace pollution that
was only in my version of <machine/bootinfo.h>. The include was removed
in rev.1.63 after the last reference to it went away in rev.1.61.
dsp_open: rearrange to only hold one lock at a time
dsp_close: ditto
mixer_hwvol_init: delete locking, the only consumer seems to
be the ess driver and it only call it a creation time, I
think the device will be stable across the sleepable malloc.
cmi interrupt routine: Release locks while caller chn_intr,
either this or do what emu10k1 does which is have no locks
at in the interrupt handler.
Submitted by: mat@cnd.mcgill.ca
consistency initialized. Consequently, a number of conditionals that
checked the validity of b_object before passing it to VM_OBJECT_LOCK()
and VM_OBJECT_UNLOCK() are no longer needed.
The reason this was done was to avoid a race to the root when an
NFS server went down. However a semi-recent change to the way that
the kernel's lookup() routine traverses mount points prevents this.
Rev 1.39 of vfs_lookup.c changed the ordering of locks such that we
aquire a shared lock on the mount point being accessed and then drop
the directory vnode lock before requesting the target lock.
With that in place we no longer need shared locks for NFS to prevent
race to the root lockups.
it is marked as RTF_UP. This appears to fix a crash that was sometimes
triggered when dhclient(8) tried to send a packet after an interface
had been detatched.
Reviewed by: sam
a resource leak. Move the resource deallocation code from fifo_close()
to a new function, fifo_cleanup(), and call fifo_cleanup() from
fifo_close() and the appropriate places in fifo_open().
Tested by: Lukas Ertl
Pointy hat to: truckman
because RFNOWAIT was being passed to kproc_create.
The result was that shutdown took quite a bit longer because this
errant "child" would not respond to termination signals from init
at system shutdown.
RFNOWAIT dissassociates itself from the caller by attaching to init
as a parent proc. We could have had the taskqueue proc listen for
SIGKILL, but being able to SIGKILL a potentially critical system
process doesn't seem like a good idea.
comment about this flag in rev.1.61. It is not historical like the
comment said; it is the flag that says that most of what is laboriously
put in the bootinfo struct is actually there. Newer kernels were
bootable by even the broken boot2 without losing anything except the
symbol table, but older kernels need at least the memory sizes.
Restoring the "|" with RB_BOOTINFO that was lost in rev.1.43 costs 5
bytes. The fix can be done in only 4 bytes by fixing some code that
was removed in rev.1.61 (put RB_BOOTINFO back in in the initial value
of "opts" and fix RBX_MASK to not clobber it.)
crashes that had sn0 as the irq that's being serviced, when there was
no sn0 in the system. This seems to prevent them. Also, we want to
wait until after we've registered with the network layer before we
turn on the interrupt spigot to avoid races.
- redo updating.
rijndael-api-fst.[ch]:
- switch to use new low level rijndael api.
- stop using u8, u16 and u32.
- space cleanup.
Tested by: gbde(8) and phk's test program
free one sem_undo with un_cnt == 0 instead of all of them. This is a
temporary workaround until the SLIST_FOREACH_PREVPTR loop gets fixed so
that it doesn't cause cycles in semu_list when removing multiple adjacent
items. It might be easier to just use (doubly-linked) LISTs here instead
of complicated SLIST code to achieve O(1) removals.
This bug manifested itself as a complete lockup under heavy semaphore use
by multiple processes with the SEM_UNDO flag set.
PR: 58984
Only update them in the newly created context to reflect the state
after copying the dirty registers onto the user stack. If we were to
update the trapframe, we lose the state at entry into the kernel. We
may need that after we create the context, such as for KSE upcalls.
We have to update the trapframe after writing the dirty registers to
the user stack for signal delivery to work. But this is best done in
sendsig() itself where it applies, not in get_mcontext() where it's
done unconditionally.
with debugger, so testing P_TRACED in SIGPENDING is useless. This test also
is the culprit which causes lots of 'failed to set signal flags properly for
ast()' to be printed on console which is just a false complaint.
must return EINVAL if size is zero. Submitted by: tegge
- In order to avoid a race condition in multithreaded applications, the
check and removal operations by munmap(2) must be in the same critical
section. To accomodate this, vm_map_check_protection() is modified to
require its caller to obtain at least a read lock on the map.
revision 1.142
date: 2003/10/11 03:04:26; author: toshii
Fix a done list handling bug which exhibits under high shared
interrupt rate and bus traffic. As the interrupt register is
read after checking hcca_done_head, there was a small chance
of dropping a done list. Ignore OHCI_WDH interrupt bit if
hcca_done_head is zero so that OHCI_WDH is processed later.
revision 1.141
date: 2003/09/10 20:08:29; author: mycroft;
Update actlen even in the case where a TD returns an error --
this is critical for the umass bulk-only STALL case.
revision 1.176
date: 2003/11/04 19:11:21; author: mycroft;
Ignore a CRCTO error on a SETUP transaction in combination with
STALLED or NAK. This fixes problems with the GL641.
- remove the unnecessary elm arg from SIMPLEQ_REMOVE_HEAD().
this mirrors the functionality of SLIST_REMOVE_HEAD() (the other
singly-linked list type) and FreeBSD's STAILQ_REMOVE_HEAD()
use set_mcontext() to restore the context in sigreturn(). Since we
put the syscall number and the syscall arguments in the trapframe
(we don't save the scratch registers for syscalls, which allows us
to reuse the space to our advantage), create a MD specific flag so
that we save the scratch registers even for syscalls. We would not
be able to restart a syscall otherwise.
The signal trampoline does not need to flush the regiters anymore,
because get_mcontext() already handles that. In fact, if we set up
the context correctly, we do not need to have a trampoline at all.
This change however only minimally changes the trampoline code. In
follow-up commits this can be further optimized.
Note that normally we preserve cfm and iip in the trapframe created
by the EPC syscall path when we restore a context in set_mcontext()
because those fields are not normally set for a synchronuous context.
The kernel puts the return address and frame info of the syscall
stub in there. By preserving these fields we hide this detail from
userland which allows us to use setcontext(2) for user created
contexts. However, sigreturn() is commonly called from the trampoline,
which means that if we preserve cfm and iip in all cases, we would
return to the trampoline after the sigreturn(), which means we hit
the safety net: we call exit(2). So, we do not preserve cfm and iip
when we have a synchronous context that also has scratch registers
(the uncommon context created by sendsig() only), under the assumption
that if such a context is created in userland, something special is
going on and the use of cfm and iip is then just another quirk. All
this is invisible in the common case.
if we drop into the pmap or vnode layers.
- Migrate the handling of zero-length msync(2)s into vm_map_sync() so that
multithread applications can't change the map between implementing the
zero-length hack in msync(2) and reacquiring the map lock in
vm_map_sync().
Reviewed by: tegge
Since all callers either passed 0 or 1 for clear_ret, define bit 0 in
the flags for use as clear_ret. Reserve bits 1, 2 and 3 for use by MI
code for possible (but unlikely) future use. The remaining bits are for
use by MD code.
This change is triggered by a need on ia64 to have another knob for
get_mcontext().
o When we're resetting the board, make sure that we error out the pending
CCBs first. Otherwise the aha_cmd won't accept further commands, such
as those that are used to reset the card (AOP_INITIALIZE_MBOX). This
appears to cause a cascade failure where no more commands are possible
to the card.
o Reduce from 10s down to 1s the amount of time we're willing to tolerate
the card being awol. This helps the above case.
o Add some error checking to two commands issued in the probe.
I have a dim memory of gibbs@ trying to tell me about this problem a
few years ago, so pointy hat to imp@ for sitting on it so long.
in the log message for kern_sched.c 1.83 (which should have been
repo-copied to preserve history for this file), the (4BSD) scheduler
algorithm only works right if stathz is nearly 128 Hz. The old
commit lock said 64 Hz; the scheduler actually wants nearly 16 Hz
but there was a scale factor of 4 to give the requirement of 64 Hz,
and rev.1.83 changed the scale factor so that the requirement became
128 Hz. The change of the scale factor was incomplete in the SMP
case. Then scheduling ticks are provided by smp_ncpu CPUs, and the
scheduler cannot tell the difference between this and 1 CPU providing
scheduling ticks smp_ncpu times faster, so we need another scale
factor of smp_ncp or an algorithm change.
This quick fix uses the scale factor without even trying to optimize
the runtime divisions required for this as is done for the other
scale factor.
The main algorithmic problem is the clamp on the scheduling tick counts.
This was 295; it is now approximately 295 * smp_ncpu. When the limit
is reached, threads get free timeslices and scheduling becomes very
unfair to the threads that don't hit the limit. The limit can be
reached and maintained in the worst case if the load average is larger
than (limit / effective_stathz - 1) / 2 = 0.65 now (was just 0.08 with
2 CPUs before this change), so there are algorithmic problems even for
a load average of 1. Fortunately, the worst case isn't common enough
for the problem to be very noticeable (it is mainly for niced CPU hogs
competing with less nice CPU hogs).
thread being waken up. The thread waken up can run at a priority as
high as after tsleep().
- Replace selwakeup()s with selwakeuppri()s and pass appropriate
priorities.
- Add cv_broadcastpri() which raises the priority of the broadcast
threads. Used by selwakeuppri() if collision occurs.
Not objected in: -arch, -current
that msync(2) is its only caller.
- Migrate the parts of the old vm_map_clean() that examined the internals
of a vm object to a new function vm_object_sync() that is implemented in
vm_object.c. At the same, introduce the necessary vm object locking so
that vm_map_sync() and vm_object_sync() can be called without Giant.
Reviewed by: tegge
are zx1 based machines and they don't particularly like it when we
poke at them with PC legacy code. The atkbd and psm devices were
disabled in the hints file so that one could enable them on machines
that support legacy devices, but that's not really something you can
expect from a first-time installer. This still leaves syscons (sc)
and the vga device, which were enabled by default and wrecking havoc
anyway. We could disable them by default like the atkbd and psm
devices, but there's really no point in pretending we're in a better
shape that way.
o pickup Giant in divert_packet to protect sbappendaddr since it
can be entered through MPSAFE callouts or through ip_input when
mpsafenet is 1
o add missing locking on output
o add locking to abort and shutdown
o add a ctlinput handler to invalidate held routing table references
on an ICMP redirect (may not be needed)
Supported by: FreeBSD Foundation
o add assertions in tcp_respond to validate inpcb locking assumptions
o use local variable instead of chasing pointers in tcp_respond
Supported by: FreeBSD Foundation
whether or not the isr needs to hold Giant when running; Giant-less
operation is also controlled by the setting of debug_mpsafenet
o mark all netisr's except NETISR_IP as needing Giant
o add a GIANT_REQUIRED assertion to the top of netisr's that need Giant
o pickup Giant (when debug_mpsafenet is 1) inside ip_input before
calling up with a packet
o change netisr handling so swi_net runs w/o Giant; instead we grab
Giant before invoking handlers based on whether the handler needs Giant
o change netisr handling so that netisr's that are marked MPSAFE may
have multiple instances active at a time
o add netisr statistics for packets dropped because the isr is inactive
Supported by: FreeBSD Foundation
PALM_4 initialisation hack. I've not confirmed it myself, but
seeing as we already don't use it for the Sony Clie_41, let's drop
it from the Clie_40 also and see what happens.
(Question: What about the Clie_S360 and Clie_NX60 devices? Do we
need to drop Palm4 from those as well? Possibly, but I've not had
any reports about those so I don't know.)
PR: kern/56575
MFC after: 3 days
is highly MD in an emulation environment since it operates on the host
environment. Although the setregs functions are really for exec support
rather than signals, they deal with the same sorts of context and include
files. So I put it there rather than create yet another file.
since there is no direct association between M:N thread and kse,
sometimes, a thread does not have a kse, in that case, return a pctcpu
from its last kse, it is not perfect, but gives a good number to be
displayed.
pmap_pte() and pmap_pte_quick(). The distinction being based upon the
locks that are held by the caller. When the given pmap is not the
current pmap, pmap_pte() should be used when Giant is held and
pmap_pte_quick() should be used when the vm page queues lock is held.
- When assigning to PMAP1 or PMAP2, include PG_A anf PG_M.
- Reenable the inlining of pmap_is_current().
In collaboration with: tegge
has a LOR between IPFW inpcb locks but I'm committing it now as the
lesser of two evils (the other being unlocked use of in_pcblookup).
Supported by: FreeBSD Foundation
successfully initialized in the label as a socket peer label, not a
socket label. For current policy modules, this didn't make a
difference, but if a policy module had label data in the peer label
that was to be GC'd in a different way than the normal socket label,
it might have been a problem.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
that it was obsoleted. it is better to fail than just hiding use
of ipsec_gethist() at build.
Sugessted by: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
Removed banal comments about ELAN*. Complain about ELAN* being misnamed
instead (so that these options are not obviously related to a CPU and
don't sort with CPU_ELAN).
Complain about CPU_DISABLE_CMPXCHG being in the wrong namespace.
to a routing table entry w/o bumping the reference count or locking
against the entry being free'd. This caused major havoc (for some
reason it appeared most frequently for folks running natd). Fix
is to bump the reference count whenever we copy the route cache
contents into a private copy so the entry cannot be reclaimed out
from under us. This is a short term fix as the forthcoming routing
table changes will eliminate this cache entirely.
Supported by: FreeBSD Foundation
Instead, let the vm objects be lazily instantiated at fault time. This
results in the allocation of fewer vm objects and vm map entries due to
aggregation in the vm system.
sure we handle stacked registers properly by taking into account
that:
1. bspstore points after the frame (due to cover),
2. we need to adjust for intermediate NaT collections.
not in scheduler specific data because eventually it will be required by
all schedulers.
- Implement sched_pin and unpin as an inline for now. If a scheduler needs
to do something more complicated than adjusting the pinned count we can
move this into a function later in an api compatible way.
o move it from subr_bus.c to netisr.c where it more properly belongs
o add NET_PICKUP_GIANT and NET_DROP_GIANT macros that will be used to
grab Giant as needed when MPSAFE operation is enabled
Supported by: FreeBSD Foundation
pin that is used by the default identity mapping if it still maps to the
old vector. The ACPI case might need some tweaking for the SCI interrupt
case since ACPI likes to address the intpin using both the IRQ remapped to
it as well as the previous existing PCI IRQ mapped to it.
Reported by: kan
on ia64. The bug is present in i386 as well but didn't show up due to
more relaxed page protections. This fix has been submitted to the vendor.
Submitted by: marcel
rather than signed. This fixes some cosmetics such as verbose printf's
for IRQs greater than 127.
- The calculation for next_ioapic_base was also adjusted so that it will
only complain once for each hole in the IRQs provided by ACPI for IO
APICs.
Reported by: Michal Mertl <mime@traveller.cz>
Don't put the name of the file in a comment. $FreeBSD$ gives more than
enough about the file's pathname.
Fixed misdescription of the file. It isn't the whole unified Makefile...
Moved the settings of WERROR and of the standard extra CFLAGS
-finline-limit and -fno-strict-aliasing to a less wrong place. They
were in the section for profiling.
of ffs_reload()'s mountp parameter to mp in rev.1.28 of ffs_vnops.c
had not been merged here.
ext2fs_reload() is still missing locking from not merging other changes
to ffs_reload(), but none of these is related to recent locking changes.
but can be enabled by setting hw.atm.hatmN.mpsafe in the kernel
environment to a non-zero value before loading the driver. When
the problems with network MPSAFEty have been sorted out this will
be removed and the driver will default to MPSAFE.
We put them directly onto the free list instead of calling the
external mbuf free routine (that routine would have cleaned the flag).
This fixes a bug which manifests itself in falsely reporting a lot of used
buffers when configuring the interface down.
conservative lock. The problem with the lock-less algorithm is that
it suffers from the ABA problem. Running an application with funnels
a couple of 100kpkts/s through the netgraph system on a dual CPU system
with MPSAFE drivers will panic almost immediatly with the old algorithm.
It may be possible to eliminate the contention between threads that insert
free items into the list and those that get free items by using the
Michael/Scott queue algorithm that has two locks.
Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to
operate on this mutex transparently.
Eventually new mutex will be protecting more fields in
struct mount, not only vnode list.
Discussed with: jeff
1.36 +73 -60 src/sys/compat/linux/linux_ipc.c
1.83 +102 -48 src/sys/kern/sysv_shm.c
1.8 +4 -0 src/sys/sys/syscallsubr.h
That change was intended to support vmware3, but
wantrem parameter is useless because vmware3 uses SYSV shared memory
to talk with X server and X server is native application.
The patch worked because check for wantrem was not valid
(wantrem and SHMSEG_REMOVED was never checked for SHMSEG_ALLOCATED segments).
Add kern.ipc.shm_allow_removed (integer, rw) sysctl (default 0) which when set
to 1 allows to return removed segments in
shm_find_segment_by_shmid() and shm_find_segment_by_shmidx().
MFC after: 1 week
the amd64 implementation of the pcpu macros is even more verbose than on
i386 and that causes gcc to way overestimate the complexity of this
2-instruction macro. The other platforms can probably lower their
default values.
from the xe driver. Should probably be removed when current probe/attach
problems with the driver are fixed, but is useful now when requesting
diagnostic information from users.
Reviewed by: imp (mentor)
isa_device pointer as its argument and uses that to call the driver's
interrupt handler passing the unit number as its argument. This should
fix COMPAT_OLDISA devices with a unit number of 0.
Reviewed by: peter
Reported by: bde
- share policy-on-socket for listening socket.
- don't copy policy-on-socket at all. secpolicy no longer contain
spidx, which saves a lot of memory.
- deep-copy pcb policy if it is an ipsec policy. assign ID field to
all SPD entries. make it possible for racoon to grab SPD entry on
pcb.
- fixed the order of searching SA table for packets.
- fixed to get a security association header. a mode is always needed
to compare them.
- fixed that the incorrect time was set to
sadb_comb_{hard|soft}_usetime.
- disallow port spec for tunnel mode policy (as we don't reassemble).
- an user can define a policy-id.
- clear enc/auth key before freeing.
- fixed that the kernel crashed when key_spdacquire() was called
because key_spdacquire() had been implemented imcopletely.
- preparation for 64bit sequence number.
- maintain ordered list of SA, based on SA id.
- cleanup secasvar management; refcnt is key.c responsibility;
alloc/free is keydb.c responsibility.
- cleanup, avoid double-loop.
- use hash for spi-based lookup.
- mark persistent SP "persistent".
XXX in theory refcnt should do the right thing, however, we have
"spdflush" which would touch all SPs. another solution would be to
de-register persistent SPs from sptree.
- u_short -> u_int16_t
- reduce kernel stack usage by auto variable secasindex.
- clarify function name confusion. ipsec_*_policy ->
ipsec_*_pcbpolicy.
- avoid variable name confusion.
(struct inpcbpolicy *)pcb_sp, spp (struct secpolicy **), sp (struct
secpolicy *)
- count number of ipsec encapsulations on ipsec4_output, so that we
can tell ip_output() how to handle the packet further.
- When the value of the ul_proto is ICMP or ICMPV6, the port field in
"src" of the spidx specifies ICMP type, and the port field in "dst"
of the spidx specifies ICMP code.
- avoid from applying IPsec transport mode to the packets when the
kernel forwards the packets.
Tested by: nork
Obtained from: KAME
- Add the following functions to the api: sched_bind(), sched_unbind(),
sched_pin(), and sched_unpin(). Bind/unbind are used for traditional
cpu binding. Pin and unpin are meant to allow the kernel to hold a thread
on a particular cpu so that it may cache per-cpu data without fear of
being migrated.
no matter where in the directory structure it may be. Use this and the "-k"
flag in the generated gdbinit files so that the "getsyms" function in gdb
requires no user intervention to run and will find every module if they're
in the kernel build's module directory. This is still quite useful for
cases where gdb knows that the path for some modules is /boot/kernel and
others are in the object directory for /usr/src/sys/$ARCH/compile/kernel.
Approved by: grog
waitrunningbufspace() calls so that they are always able to
proceed and clean up buffer space.
Submitted by: Brian Fundakowski Feldman <green@freebsd.org>
o Fix minor type problems
o Fix minor problem with a couple debug printfs.
o Default to a sane media type when none is reported.
o Minor style changes
The PR complains this will fix the IBM 300GL cards.
Submitted by: Max Gotlib
PR: 11462
Use zpfind() to see if the process became a zombie if pfind() doesn't find it
and if the caller wants to know about process death, so that the caller knows
the process died even if it happened before the kevent was actually registered.
MFC after: 1 week
This fixes a race condition (specifically with signal events) that could
lead to the kn being re-inserted into the list after it has been destroyed,
which is not something we want to happen.
PR: kern/58258