"scheduler" here has very little to do with scheduling. It is actually
the swapper, and it really must be the last SYSINIT'ed item like its
comment says, since proc0 metamorphoses into swapper by calling
scheduler() last in mi_start(), and scheduler() never returns.. Rev.1.29
of subr_4bsd.c broke this by adding another SI_ORDER_FIRST item
(kproc_start() for schedcpu_thread() onto the SI_SUB_RUN_SCHEDULER_LIST.
The sorting of SYSINITs with identical orders (at all levels) is
apparently nondeterministic, so this resulted in schedule() sometimes
being called second last and schedcpu_thread() not being called at all.
This quick fix just changes the code to almost match the comment
(SI_ORDER_FIRST -> SI_ORDER_ANY). "LAST" is misspelled "ANY", and
there is no way to ensure that there is only 1 very lst SYSINIT.
A more complete fix would remove the SYSINIT obfuscation.
won't associate in BSS mode if you use AUTHMODE_SHARED. I probably don't
understand enough to know when SHARED should be used vs. OPEN or WPA.
For now, go back to what works.
bit for this being the last CTIO2. It didn't matter since it really was the
last CTIO2 and the resources recycled, but still....
Add in CTIO3 define for future DAC work.
for direct-mapped addresses. Assume that any address less than KVA
is one of these and return it. Also assert that an address is KVA
does have a valid mapping - callers of pmap_kextract don't check
the return value, since they assume that they have a valid virtual
address.
addressing of memory. Makes a substantial improvement for apps that
stress the limited amount of KVM on PPC (e.g. untarring the ports tree).
uma_machdep.c stolen from amd64/ia64.
updated for the regparm ABI on amd64.
Context switch debug regs.
Update for fpu simplification
Don't needlessly reload %cr3, in case the cpu has the tlb flush filter
turned off. Re-add LAZY_SWITCH stubs.
profiling buffers and hash table. This makes it a lot easier to
do multiple profiling runs without rebooting or performing
gratuitous arithmetic. Sysctl is named debug.mutex.prof.reset.
Reviewed by: jake
- witness_lock() is split into two pieces: witness_checkorder() and
witness_lock(). Witness_checkorder() determines if acquiring a specified
lock at the time it is called would result in a lock order. It
optionally adds a new lock order relationship as well. witness_lock()
updates witness's data structures to assume that a lock has been acquired
by stick a new lock instance in the appropriate lock instance list.
- The mutex and sx lock functions now call checkorder() prior to trying to
acquire a lock and continue to call witness_lock() after the acquire is
completed. This will let witness catch a deadlock before it happens
rather than trying to do so after the threads have deadlocked (i.e. never
actually report it).
- A new function witness_defineorder() has been added that adds a lock
order between two locks at runtime without having to acquire the locks.
If the lock order cannot be added it will return an error. This function
is available to programmers via the WITNESS_DEFINEORDER() macro which
accepts either two mutexes or two sx locks as its arguments.
- A few simple wrapper macros were added to allow developers to call
witness_checkorder() anywhere as a way of enforcing locking assertions
in code that might acquire a certain lock in some situations. The
macros are: witness_check_{mutex,shared_sx,exclusive_sx} and take an
appropriate lock as the sole argument.
- The code to remove a lock instance from a lock list in witness_unlock()
was unnested by using a goto to vastly improve the readability of this
function.
instead of taskqueue_swi. This shaves from 1 to 10% of the overhead.
Overhaul the locking once more, there was a few possible races that
are now closed.
panic() so that the buffer overflow just beyond this point is always
caught, even when the code is not compiled with INVARIANTS.
Change chn_setblocksize() buffer reallocation code to attempt to avoid
the feed_vchan16() buffer overflow by attempting to always keep the
bufsoft buffer at least as large as the bufhard buffer.
Print a diagnositic message
Danger! %s bufsoft size increasing from %d to %d after CHANNEL_SETBLOCKSIZE()
if our best attempts fail. If feed_vchan16() were to be called by
the interrupt handler while locks are dropped in chn_setblocksize()
to increase the size bufsoft to match the size of bufhard, the panic()
code in feed_vchan16() will be triggered. If the diagnostic message
is printed, it is a warning that a panic is possible if the system
were to see events in an "unlucky" order.
Change the locking code to avoid the need for MTX_RECURSIVE mutexes.
Add the MTX_DUPOK option to the channel mutexes and change the locking
sequence to always lock the parent channel before its children to avoid
the possibility of deadlock.
Actually implement locking assertions for the channel mutexes and fix
the problems found by the resulting assertion violations.
Clean up the locking code in dsp_ioctl().
Allocate the channel buffers using the malloc() M_WAITOK option instead
of M_NOWAIT so that buffer allocation won't fail. Drop locks across
the malloc() calls.
Add/modify KASSERTS() in attempt to detect problems early.
Abuse layering by adding a pointer to the snd_dbuf structure that points
back to the pcm_channel that owns it. This allows sndbuf_resize() to do
proper locking without having to change the its API, which is used by
the hardware drivers.
Don't dereference a NULL pointer when setting hw.snd.maxautovchans
if a hardware driver is not loaded. Noticed by Ryan Sommers
<ryans at gamersimpact.com>.
Tested by: Stefan Ehmann <shoesoft AT gmx.net>
Tested by: matk (Mathew Kanner)
Tested by: Gordon Bergling <gbergling AT 0xfce3.net>
debug.ddb_use_printf sysctl, output kernel debugger data to both the
console and kernel message buffer via printf. This fixes the case where
backtrace() went directly to the console and should help debugging greatly.
Thanks to Ian Dowse for the work, minor edits or any bugs are by myself.
Submitted by: iedowse
the proc lock only if we actually need to perform a stop. This
avoids two locks and unlocks of the process lock each system call,
and wins me about 20% on a simply system call test (getuid(),
which would otherwise require no locking). This also has a net
improvement of about 10MB/s on some of the SMP bandwidth tests
I'm running.
Reviewed by: jhb
- malloc() returns a void* and does not need a cast
- when called with M_WAITOK, malloc() can not return NULL so don't
check for that case. The result of the check was bogus anyway since
it would leave the interface broken.
o remove extraneous bzero
o add SYSINIT to properly initialize ip4_def_policy
Submitted by: "Bjoern A. Zeeb" <bzeeb+freebsd@zabbadoz.net>
Submitted by: gnn@neville-neil.com
assure backward compatibility (conditional on !BURN_BRIDGES), look it up
by its old name first, and log a warning (but accept the setting) if it
was found. If both the old and new name are defined, the new name takes
precedence.
Also export vm.kmem_size as a read-only sysctl variable; I find it hard to
tune a parameter when I don't know its default value, especially when that
default value is computed at boot time.
kbd_attach() is called kbd[0-9]+, with a different unit number. This
makes it impossible to write a devd rule which will automatically
switch to a USB keyboard when one is attached, because there is no way
to guess the correct device node to pass to kbdcontrol.
Therefore, change kbd_attach() to create a device node using the
keyboard device's real name (atkbd0, ukbd0...), and create the
kbd[0-9]+ node as an alias for backward compatibility.
on an SIOCSIFADDR (by way of brain damage in net80211).
Also, avoid trying to set NDIS_80211_AUTHMODE_AUTO since the Microsoft
documentation I have recommends not using it, and the Centrino driver
seems to dislike being told to use it.
every system call, and that grabs and release the process lock each
time. Don't fix it (yet), but document it so we know to fix it.
Also should be a 5.3-RELEASE todo item.
and NdisCancelTimer(). NdisInitializeTimer() doesn't accept an NDIS
miniport context argument, so we have to derive it from the timer
function context (which is supposed to be the adapter private context).
NdisCancelTimer is now an alias for NdisMCancelTimer().
Also add stubs for NdisMRegisterDevice() and NdisMDeregisterDevice().
These are no-ops for now, but will likely get fleshed in once I start
working on the Am1771/Am1772 wireless driver.
can look at the ACPI tables. If the startup fails, we panic and tell the
user to try rebooting with ACPI disabled. Previously in this case we
would try to use $PIR interrupt routing which only works for the atpic
while using the apic to handle interrupts which would result in misrouted
interrupts and a hang at boot time with no error message.
- Read the SCI out of the FADT instead of hardcoding 9 when checking to see
if an interrupt override entry is for the SCI.
- Try to work around some BIOS brain damage for the SCI's programming by
forcing the SCI to be level triggered and active low if it is routed
to a non-ISA interrupt (greater than 15) or if it is identity mapped with
edge trigger and active high polarity. This should fix some of the hangs
with device apic and ACPI that some people see.
Reviewed by: njl
problem here still to be solved: the sockaddr_hci has still a 16 byte
field for the node name. The code currently does not correctly use the
length field in the sockaddr to handle the address length, so
node names get truncated to 15 characters when put into a sockaddr_hci.
introducing a START_NOW command. This command does not send
and GET_IFINDEX message downstream (to wait for the response from
the ETHERNET node), but directly starts the sending process. This allows
one to generate traffic as input for any hook on any node.
ifconfig(8) flag since header for version 2 is the same but IP payload
is prepended with additional 4-bytes field.
Inspired by: Roman Synyuk <roman@univ.kiev.ua>
MFC after: 2 weeks
attached when shutting down, kill our kthreads, but don't destroy
the mutex pool and uma zone resources since the driver shutdown
routine may need them later.
device that doesn't exists. I'm using my discretion and
committing without mentor approval since Seigo is away.
Noticed by: Maxime Henrion <mux@freebsd.org>
own file and make it opt-in, not mandatory, depending on CPU_ENABLE_LONGRUN
config(8) option.
PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
Discussed with: nate
MFC after: 2 weeks
than the switchin functions to guarantee that we're operating with the
correct tlb entry.
- Remove the post copy/zero tlb invalidations. It is faster to invalidate
an entry that is known to exist and so it is faster to invalidate after
use. However, some architectures implement speculative page table
prefetching so we can not be guaranteed that the invalidated entry is still
invalid when we re-enter any of these functions. As a result of this we
must always invalidate before use to be safe.
a deadlock in several years. Furthermore, the IPI code is currently
protected by a seperate spinlock. This only served to make IPIs twice as
expensive as they had to be which severely slowed down the IPI heavy ULE
scheduler.
SW_INVOL. Assert that one of these is set in mi_switch() and propery
adjust the rusage statistics. This is to simplify the large number of
users of this interface which were previously all required to adjust the
proper counter prior to calling mi_switch(). This also facilitates more
switch and locking optimizations.
- Change all callers of mi_switch() to pass the appropriate paramter and
remove direct references to the process statistics.
mutex profiling code. As with existing mutex profiling, measurement
is done with respect to mtx_lock() instances in the code, as opposed
to specific mutexes. In particular, measure two things:
(1) Lock contention. How often did this mtx_lock() call get made and
have to sleep (or almost sleep) waiting for the lock. This helps
identify the "victims" of contention.
(2) Hold contention. How often, while the lock was held by a thread
as a result of this mtx_lock(), did another thread try to acquire
the same mutex. This helps identify the causes of contention.
I'm currently exploring adding measurement of "time waited for the
lock", but the current implementation has proven useful to me so far
so I figured I'd commit it so others could try it out. Note that this
increases the size of mutexes when MUTEX_PROFILING is enabled, so you
might find you need to further bump UMA_BOOT_PAGES. Fixes welcome.
The once over: des, others
one which runs the actual update. This fixes a bug where there were
a delay in applying the frequency adjustment. In extreme cases this
could result in marginal stability of the kernel-pll.
full state. (When swap is added their state will change appropriately.)
2. Set swap_pager_full and swap_pager_almost_full to the full state when
the last swap device is removed.
Combined these changes eliminate nonsense messages from the kernel on swap-
less machines.
Item 2 submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz>
Prodding by: phk
For some very unclear reason this device contains a FTDI 8U232AM USB->COM
adapter, but reports different device id than original 8U232AM. At the same
time, it reports vendor id of FTDI.
Sponsored by: Porta Software Ltd
MFC after: 2 weeks
Suggested by: nate
- get rid of "magick" values in code and make sysctl's reflecting reality
on processor versions which have one or another frequency "forbidden"
due to errata.
MFC after: 2 weeks
Suggested by: nate
- get rid of "magick" values in code and make sysctl's reflecting reality
on processor versions which have one or another frequency "forbidden"
due to errata.
PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after: 2 weeks
the user requests a read-only mount. This is necessary because we
don't do the VOP_OPEN again if they upgrade a read-only mount to
read-write.
Noticed by: bde
The uidinfo code appears to be MPSAFE, and is referenced without Giant
elsewhere. While this grab of Giant was only made in fairly rare
circumstances (actually GC'ing on refcount==0), grabbing Giant here
potentially introduces lock order issues with any locks held by the
caller. So this probably won't help performance much unless you change
credentials a lot in an application, and leave a lot of file descriptors
and cached credentials around. However, it simplifies locking down
consumers of the credential interfaces.
Bumped into by: sam
Appeased: tjr
to a new prison_complete() task run by a task queue. This removes
a requirement for grabbing Giant in crfree(). Embed the 'struct task'
in 'struct prison' so that we don't have to allocate memory from
prison_free() (which means we also defer the FREE()).
With this change, I believe grabbing Giant from crfree() can now be
removed, but need to check the uidinfo code paths.
To avoid header pollution, move the definition of 'struct task'
to _task.h, and recursively include from taskqueue.h and jail.h; much
preferably to all files including jail.h picking up a requirement to
include taskqueue.h.
Bumped into by: sam
Reviewed by: bde, tjr
Replace wrong check returned EFBIG with EOVERFLOW handling from POSIX:
36708 [EOVERFLOW] The file is a regular file, nbyte is greater than 0, the
starting position is before the end-of-file, and the starting position is
greater than or equal to the offset maximum established in the open file
description associated with fildes.
ffs_write:
Replace u_int64_t cast with uoff_t cast which is more natural for types
used.
ffs_write & ffs_read:
Remove uio_offset and uio_resid checks for negative values, the caller
supposed to do it already. Add comments about it.
Reviewed by: bde
recwin and sendwin. This removes a big source of confusion and makes
following the code much easier.
Reviewed by: sam (mentor)
Obtained from: DragonFlyBSD rev 1.6 (hsu)
rid's and to deallocate resources if a failure occurs during attach. This
patch also fixes the driver to return failure if bus_alloc_resource() for
the IRQ fails rather than panic'ing on the next line by passing a NULL
resource to bus_setup_intr(). The other attachments already do all this.
Submitted by: Jun Su <csujun@263.net>
In case no real/physical IEEE 802 address is available, both the expired
"draft-leach-uuids-guids-01" (section "4. Node IDs when no IEEE 802
network card is available") and RFC 2518 (section "6.4.1 Node Field
Generation Without the IEEE 802 Address") recommend (quoted from RFC
2518):
"The ideal solution is to obtain a 47 bit cryptographic quality random
number, and use it as the low 47 bits of the node ID, with the _most_
significant bit of the first octet of the node ID set to 1. This bit
is the unicast/multicast bit, which will never be set in IEEE 802
addresses obtained from network cards; hence, there can never be a
conflict between UUIDs generated by machines with and without network
cards."
Unfortunately, this incorrectly explains how to implement this and
the FreeBSD UUID generator code inherited this generation bug from
the broken reference code in the standards draft. They should instead
specify the "_least_ significant bit of the first octet of the node ID"
as the multicast bit in a memory and hexadecimal string representation
of a 48-bit IEEE 802 MAC address.
This standards bug arised from a false interpretation, as the multicast
bit is actually the _most_ significant bit in IEEE 802.3 (Ethernet)
_transmission order_ of an IEEE 802 MAC address. The standards authors
forgot that the bitwise order of an _octet_ from a MAC address _memory_
and hexadecimal string representation is still always from left (MSB,
bit 7) to right (LSB, bit 0).
Fortunately, this UUID generation bug could have occurred on systems
without any Ethernet NICs only.
no-op on {i386/alpha/ia64/sparc64} where chars are signed by
default. Should help ARM and S390 which also suffer from this.
Tested on: ppc, i386, objdump disasm before/after diffs
Reviewed by: obrien, bde (a while back)
use a bounce buffer for the actual transfer to avoid crossing a 64k
boundary. To do this, we malloc a buffer twice as big as we need and then
find an aligned block within that buffer to do the transfer. The check
to see which part of the block we use used the wrong variable for part of
the condition meaning that in certain edge cases we would ask the BIOS to
cross a 64k boundary. The BIOS request would then fail resulting in file
transfers that just magically fail in the middle without any apparent
reason. Specifically, my tests for the splitfs boot floppies managed to
trigger this edge case.
MFC after: 1 week
X-MFC-info: along with fixes to libstand filesystems
Presumably, at some point, you had to include jail.h if you included
proc.h, but that is no longer required.
Result of: self injury involving adding something to struct prison
is NULL, otherwise ipsec4_process_packet() may try to m_freem() a
bad pointer.
In ipsec4_process_packet(), don't try to m_freem() 'm' twice; ipip_output()
already did it.
Obtained from: netbsd
o For traps, the cr.iip register points to the next instruction to
execute on interrupt return (modulo slot). Since we need to get
the bundle of the instruction that caused the FP fault/trap, make
sure we fetch the previous bundle if the next instruction is in
fact the first in a bundle.
o When we call the FPSWA handler, we need to tell it whether it's
a trap or a fault (first argument). This was hardcoded to mean a
fault.
Also, for FP faults, when a fault is converted to a trap, adjust the
cr.iip and cr.ipsr registers to point to the next instruction. This
makes sure that the SIGFPE handler gets a consistent state.