results unused; this, combined with gcc's -Werror option, raises a
warning that breaks buildkernel.
Fix this by removing upcall_free().
Reported by: various
Approved by: jeff
Approved by: re
Pointy hat to: attilio
dangerous races.
Fix these problems by adding correct locking for the members of 'struct
kse_upcall' and other related members of struct proc and struct thread.
For the moment, just leave ku_mflag and ku_flags "lazy" locked.
While here, clean up the code by removing the unused function kse_GC()
and merging upcall_link(), upcall_unlink(), and upcall_stash() into
their respective callers (they are static functions, very short, and
each called from a single place).
Reported by: pav
Tested by: pav (on some pointyhat cluster nodes)
Approved by: jeff
Approved by: re
Sponsored by:	NGX Italy (http://www.ngx.it)
print a one-line error message. Add some comments on not being able to
trust the day-of-week field (I'll act on these comments in a follow-up
commit).
Approved by: re
MFC after: 3 weeks
filt_ttyrdetach() etc. would later attempt to dereference cdev->si_tty,
causing a 0xdeadc0de dereference. Change the kn_hook value from the cdev
to the struct tty to avoid dereferencing the freed cdev.
In ttygone(), wake up select(), sigio and kevent() users in addition
to the queue sleepers.
Return EV_EOF from kevent filters if TS_GONE is set.
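A minimal sketch of the read filter after this change (shape as in the
old tty code, details simplified; not the verbatim diff):

	/* kn_hook now stores the struct tty, not the freeable cdev. */
	static int
	filt_ttyread(struct knote *kn, long hint)
	{
		struct tty *tp = (struct tty *)kn->kn_hook;

		if (tp->t_state & TS_GONE) {
			kn->kn_flags |= EV_EOF;	/* device is gone */
			return (1);
		}
		kn->kn_data = ttnread(tp);
		return (kn->kn_data > 0);
	}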
Submitted by: peter
Tested by: Peter Holm
Approved by: re (kensmith)
MFC after: 2 weeks
- Adjust the semantics of the lock_profiling stubs in the hard functions
so they are more accurate and trustworthy
- As for sx locks, disable the shared paths for lock_profiling.
lock_profiling has a subtle race which makes results coming from the
shared paths not completely trustworthy. A macro stub
(LOCK_PROFILING_SHARED) can be used to re-enable these paths, but it is
currently intended for development use only.
- style(9) fixes
Approved by: jeff, kmacy, jhb[1]
Approved by: re
[1] Had initial reservations not shared by others, conceded
in the end.
machines.
- Leave the long-term load balancer running by default once per second.
- Enable stealing load from the idle thread only when the remote processor
has more than two transferable tasks. Setting this to one further
improves buildworld. Setting it higher improves mysql. (See the sysctl
sketch after this list.)
- Remove the bogus pick_zero option. I had not intended to commit this.
- Entirely disallow migration for threads with SRQ_YIELDING set. This
balances out the extra migration allowed for with the load balancers.
It also makes pick_pri perform better as I had anticipated.
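A sketch of the kind of knob this tunes (variable name and description
assumed, following the kern.sched.* convention):

	static int steal_thresh = 2;
	SYSCTL_INT(_kern_sched, OID_AUTO, steal_thresh, CTLFLAG_RW,
	    &steal_thresh, 0,
	    "Minimum load on a remote CPU before stealing from it");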
Tested by: Dmitry Morozovsky <marck@rinet.ru>
Approved by: re
properly. We have to temporarily unlock the TDQ lock so we can lock
the thread and add it to the run queue. This is used only for KSE.
- When we add a thread via tdq_move() from sched_balance() we need to
IPI the target CPU if it's sitting in the idle thread or it'll never run.
Reported by: Rene Landan
Approved by: re
new code and third party modules which try to depend on it.
- Initialize sched_lock in sched_4bsd.c.
- Declare sched_lock in sparc64 pmap.c and assert that we're compiling
with SCHED_4BSD to prevent accidental crashes from running ULE. This
is the sole remaining file outside of the scheduler that uses the
global sched_lock.
Approved by: re
been in development for over 6 months as SCHED_SMP.
- Implement one spin lock per thread-queue. Threads assigned to a
run-queue point to this lock via td_lock (see the sketch after this
list).
- Improve the facility for assigning threads to CPUs now that sched_lock
contention no longer dominates scheduling decisions on larger SMP
machines.
- Re-write idle time stealing in an attempt to make it less damaging to
general performance. This is still disabled by default. See
kern.sched.steal_idle.
- Call the long-term load balancer from a callout rather than sched_clock()
so there are no locks held. This is disabled by default. See
kern.sched.balance.
- Parameterize many scheduling decisions via sysctls. Try to document
these via sysctl descriptions.
- General structural and naming cleanups.
- Document each function with comments.
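A rough sketch of the per-queue lock arrangement (field and function
names assumed, not the committed code):

	struct tdq {
		struct mtx	tdq_lock;	/* protects this run queue */
		/* ... run queues, load counters ... */
	};

	static void
	tdq_assign(struct tdq *tdq, struct thread *td)
	{
		mtx_assert(&tdq->tdq_lock, MA_OWNED);
		/* thread_lock(td) will now acquire the queue's lock. */
		td->td_lock = &tdq->tdq_lock;
	}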
Tested by: current@ amd64, x86, UP, SMP.
Approved by: re
kernels exposed by the recent fixes to resource limits for 32-bit processes
on 64-bit kernels:
- Let ABIs expose their maximum stack size via a new pointer in sysentvec
and use that in preference to maxssiz during exec() rather than always
using maxssiz for all processes (see the sketch after this list).
- Apply the ABI's limit fixup to the previous stack size when adjusting
RLIMIT_STACK to determine if the existing mapping for the stack needs to
be grown or shrunk (as well as how much it should be grown or shrunk).
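The exec()-time selection then looks roughly like this (sv_maxssiz being
the new sysentvec pointer; surrounding code elided):

	if (sv->sv_maxssiz != NULL)
		ssiz = *sv->sv_maxssiz;	/* ABI-specific cap */
	else
		ssiz = maxssiz;		/* global default */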
Approved by: re (kensmith)
- Adjust the semantics of the lock_profiling stubs in the hard functions
so they are more accurate and trustworthy
- Disable the shared paths for lock_profiling. lock_profiling has a
subtle race which makes results coming from the shared paths not
completely trustworthy. A macro stub (LOCK_PROFILING_SHARED) can be used
to re-enable these paths, but it is currently intended for development
use only (see the sketch after this list).
- Use consistent names for the lock_profiling automatic variables in the
hard functions
- Style fixes
- Add a CTASSERT covering how some flags are built
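A sketch of how the stub gates a shared path (the call site shown is
illustrative only):

	#ifdef LOCK_PROFILING_SHARED
		if (SX_SHARERS(x) == 0)
			lock_profile_obtain_lock_success(&sx->lock_object,
			    contested, waittime, file, line);
	#endif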
Discussed with: kmacy, kris
Approved by: jeff (mentor)
Approved by: re
it with netipsec now that KAME IPsec is gone.
While here add missing netinet6 directories.
Add comments about the ports needed to be able to run those targets.
Reviewed by: philip
Approved by: re (rwatson)
ftruncate(), but without the pad arg.
There are several reasons for this. Consider 'mmap()'. On AMD64, the
function call (and syscall) ABI allow for 6 register arguments. Additional
arguments go on the stack. mmap(2) has 6 arguments. However, the syscall
definition has an extra 'int pad' argument. This pushes it to 7 arguments,
which means one must spill into the memory stack. Since the kernel API
doesn't match the userland API, we have a hack in libc - libc/sys/mmap.c.
This implements the userland API by calling __syscall() with an extra
argument and the pad argument, for a total of 8 args. This is all
unnecessary and inconvenient for several things, including the kernel's
syscall handler code which now has to handle merging stack arguments with
register arguments. It is a big deal for certain 3rd party code.
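For illustration, the two forms side by side (the old declaration as it
appeared in syscalls.master):

	/* Old form: 7 arguments, one spills to the stack on amd64. */
	caddr_t freebsd6_mmap(caddr_t addr, size_t len, int prot, int flags,
	    int fd, int pad, off_t pos);
	/* New form: 6 arguments, all in registers, matching userland. */
	void *mmap(void *addr, size_t len, int prot, int flags, int fd,
	    off_t pos);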
I'm adding libc glue to make the transition totally painless. I had
intended to mark the old syscalls as COMPAT6, but the potential to shoot
your feet by building a new kernel without COMPAT_FREEBSD6 but with a
slightly older userland was too great. For now, they have manual
"freebsd6_" prefixes rather than being COMPAT6. They will go back to
being marked 'COMPAT6' after 7-stable starts.
Approved by: re (kensmith)
Also, change the visibility of compat syscalls slightly. Compat
syscalls were missing from 'syscalls.h' entirely. This additionally adds
them with their compat prefix, e.g. SYS_freebsd6_mmap.
Also, the syscalls.c name strings have different prefixes to differentiate
syscalls. Instead of several "old.mmap" strings, there will now be a
"compat.mmap" and a "compat6.mmap", etc. Before, both would have had the
same "old.mmap" label.
Approved by: re
shall not be called while holding the cdev mutex. The devfs_inos unrhdr
uses the cdev mutex as its mutex, thus creating this LOR situation.
Postpone calling free() in kern/subr_unit.c:alloc_unr() and nested functions
until the unrhdr mutex is dropped. Save the freed items on the ppfree list
instead, and provide the clean_unrhdrl() and clean_unrhdr() functions to
clean the list.
Call clean_unrhdrl() after the devfs_create() calls, immediately before
dropping the cdev mutex. devfs_create() is the only user of alloc_unrl()
in the tree.
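The resulting pattern in the devfs_create() path (devmtx being the cdev
mutex; sketch only):

	mtx_assert(&devmtx, MA_OWNED);
	ino = alloc_unrl(devfs_inos);	/* freed items go to ppfree */
	/* ... finish devfs_create() work ... */
	clean_unrhdrl(devfs_inos);	/* flush deferred free()s */
	mtx_unlock(&devmtx);		/* no free() ran under devmtx */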
Reviewed by: phk
Tested by: Peter Holm
LOR: 80
Approved by: re (kensmith)
can acquire shared filedescriptor locks in the appropriate cases.
- Remove Giant from calls that issue ioctls. The ioctl path has been
mpsafe for some time now.
- Only acquire Giant for VOP_ADVLOCK when the filesystem requires Giant;
advlock is now mpsafe.
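A sketch of the conditional Giant acquisition around advlock (id and
flags as in the file-locking path; names abbreviated):

	int vfslocked;

	vfslocked = VFS_LOCK_GIANT(vp->v_mount); /* no-op if mpsafe */
	error = VOP_ADVLOCK(vp, (caddr_t)p->p_leader, F_SETLK, &lf,
	    F_POSIX);
	VFS_UNLOCK_GIANT(vfslocked);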
Reviewed by: rwatson
Approved by: re
to protect this data structure instead.
- Preallocate an extra lockf structure in case we want to split a lock
on insert or delete.
- msleep() on the vnode interlock when blocking on a lock.
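The blocking case then looks roughly like this (as in kern_lockf.c):

	/* The vnode interlock doubles as the sleep mutex. */
	error = msleep(lock, VI_MTX(vp), priority, lockstr, 0);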
Reviewed by: rwatson
Approved by: re
- Use cpu_spinwait() in the spin loops in stop_cpus(), restart_cpus(), and
smp_rendezvous_action().
- Remove unneeded acq memory barriers in stop_cpus(), restart_cpus(), and
smp_rendezvous_action().
- Add an additional synch point in smp_rendezvous() to ensure that all the
CPUs will always see an up-to-date value of smp_rv_setup_func.
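For reference, a minimal rendezvous usage sketch (names hypothetical):

	static u_int ncpus_seen;

	static void
	count_cpu(void *arg)
	{
		atomic_add_int((u_int *)arg, 1);	/* once per CPU */
	}

	static void
	demo(void)
	{
		/* setup/teardown may be NULL; the added synch point
		   guarantees every CPU sees smp_rv_setup_func first. */
		smp_rendezvous(NULL, count_cpu, NULL, &ncpus_seen);
	}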
Reviewed by: attilio
Approved by: re (kensmith)
Tested on: alpha, amd64, i386, sparc64 SMP (for several years)
Lock cdev mutex too to close the race with tty being freed.
Relock clone_drain_lock to prevent the LOR with the proctree lock; this
requires adding #include <fs/devfs/devfs_int.h>.
Suggested by: tegge
Debugging help and testing by: Peter Holm
Approved by: re (kensmith)
Lock Giant in the clone handler.
Use destroy_dev_sched() explicitly from pty_maybecleanup() and postpone
pty_release() until both master and slave cdevs are destroyed, by setting
it as the callback for destroy_dev_sched().
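Sketched (softc and field names hypothetical):

	/* pty_release(pt) runs from the destroy_dev() context once the
	   cdev is destroyed, so the softc outlives both devices. */
	destroy_dev_sched(pt->pt_devs);
	destroy_dev_sched_cb(pt->pt_devc, pty_release, pt);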
Debugging help and testing by: Peter Holm
Approved by: re (kensmith)
destroy_dev() is called from a csw method, and no d_purge driver method
is provided. Transform the direct call to destroy_dev() into
destroy_dev_sched().
Reviewed by: njl (programming interface)
Debugging help and testing by: Peter Holm
Approved by: re (kensmith)
destroy_dev() from the d_close() cdev method would self-deadlock:
devfs_close() bumps the device thread reference count, and destroy_dev()
sleeps waiting for si_threadcount to reach zero for a cdev without a
d_purge method.
destroy_dev_sched() could be used instead from d_close(), to
schedule execution of destroy_dev() in another context. The
destroy_dev_sched_drain() function can be used to drain the scheduled
calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains
the dev_clone events to make sure no lingering devices are left after
the dev_clone event handler is deregistered.
The make_dev_credf(MAKEDEV_REF) function should be used from dev_clone
event handlers instead of make_dev()/make_dev_cred() to ensure that the
created device has its reference count bumped before the cdev mutex is
dropped inside make_dev().
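A minimal sketch of both rules (driver names hypothetical):

	static int
	foo_close(struct cdev *dev, int fflag, int devtype,
	    struct thread *td)
	{
		/* Deferred destruction: no self-deadlock waiting on
		   si_threadcount. */
		destroy_dev_sched(dev);
		return (0);
	}

	static void
	foo_clone(void *arg, struct ucred *cred, char *name, int namelen,
	    struct cdev **dev)
	{
		if (*dev != NULL || strcmp(name, "foo") != 0)
			return;
		/* MAKEDEV_REF: reference taken before devmtx drops. */
		*dev = make_dev_credf(MAKEDEV_REF, &foo_cdevsw, 0, NULL,
		    UID_ROOT, GID_WHEEL, 0600, "foo");
	}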
Reviewed by: tegge (early versions), njl (programming interface)
Debugging help and testing by: Peter Holm
Approved by: re (kensmith)
could lead to a deadlock).
- sleepq_set_timeout() acquires callout_lock (via callout_reset()) only
with the sleepqueue chain lock held
- msleep_spin() in _callout_stop_safe() locks the sleepqueue chain with
callout_lock held
To solve this, don't use msleep_spin() in _callout_stop_safe(); instead
use the sleepqueues directly, as inlined msleep_spin() code. Rearrange
the wakeup path to keep it consistent as well.
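Schematically, the drain path becomes (signatures approximate for the
sleepqueue KPI of the time):

	sleepq_lock(&callout_wait);	/* chain lock taken first ... */
	/* ... recheck that the callout is still being serviced ... */
	sleepq_add(&callout_wait, &callout_lock.lock_object, "codrain",
	    SLEEPQ_SLEEP, 0);
	sleepq_wait(&callout_wait);	/* ... so no LOR on callout_lock */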
Reported by: kris (via stress2 test suite)
Tested by: Timothy Redaelli <drizzt@gufi.org>
Reviewed by: jhb
Approved by: jeff (mentor)
Approved by: re
This is very similar to sx_init_flags(): it initializes the rwlock using
special flags passed as the third argument (RW_DUPOK, RW_NOPROFILE,
RW_NOWITNESS, RW_QUIET, RW_RECURSE).
Among these, the most important new feature is probably that rwlocks can
now be acquired recursively (for both shared and exclusive paths).
Because of the recursion counter, the ABI is changed.
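Usage sketch:

	static struct rwlock foo_rw;

	static void
	foo_init(void)
	{
		rw_init_flags(&foo_rw, "foo rwlock", RW_RECURSE);
		rw_wlock(&foo_rw);
		rw_wlock(&foo_rw);	/* recursion is now legal */
		rw_wunlock(&foo_rw);
		rw_wunlock(&foo_rw);
	}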
Tested by: Timothy Redaelli <drizzt@gufi.org>
Reviewed by: jhb
Approved by: jeff (mentor)
Approved by: re
Postpone the call to devfs_free() until after the cdev mutex is dropped.
Reuse the cdp_list link for queuing devices awaiting deletion on the
cdevp_free_list.
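Schematically (cdp being the cdev_priv; locking transitions elided):

	/* Under the cdev mutex: defer instead of calling devfs_free(). */
	TAILQ_INSERT_TAIL(&cdevp_free_list, cdp, cdp_list);
	/* After dropping the cdev mutex: */
	while ((cdp = TAILQ_FIRST(&cdevp_free_list)) != NULL) {
		TAILQ_REMOVE(&cdevp_free_list, cdp, cdp_list);
		devfs_free(&cdp->cdp_c);
	}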
Reported by: Hans Petter Selasky <hselasky c2i net>
Tested by: Peter Holm
Approved by: re (kensmith)
MFC after: 2 weeks
a privilege is checked against the real uid rather than the effective
uid, instead decide which uid to use in priv_check_cred() based on the
privilege passed in. We use the real uid for PRIV_MAXFILES,
PRIV_MAXPROC, and PRIV_PROC_LIMIT. Remove the definition of
SUSER_RUID; there are now no flags defined for priv_check_cred().
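The resulting shape inside priv_check_cred() (simplified):

	switch (priv) {
	case PRIV_MAXFILES:
	case PRIV_MAXPROC:
	case PRIV_PROC_LIMIT:
		if (cred->cr_ruid == 0)	/* real uid for resource limits */
			return (0);
		break;
	default:
		if (cred->cr_uid == 0)	/* effective uid otherwise */
			return (0);
		break;
	}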
Obtained from: TrustedBSD Project
- Move the rtc_mtx spin lock out from under #ifdef SMP as it's just
not SMP-specific.
- Add a new spin lock pcib_mtx for locking "fast" interrupt handlers
of host-to-PCI bridge drivers on sparc64.
- In tdq_choose() only assert that a thread does not have too high a
priority (low value) for the queue we removed it from. This will catch
bugs in priority elevation. It's not a serious error for the thread to
have too low a priority, since as an optimization we don't change queues
in this case.
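The assertion then only bounds the priority from one side (sketch for
the timeshare queue):

	KASSERT(td->td_priority >= PRI_MIN_TIMESHARE,
	    ("tdq_choose: priority %d too high for timeshare queue",
	    td->td_priority));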
Reported by: kris
or idle priority of another process owned by the same user. This means
that privilege in rtprio(2) (and rtprio_thread(2)) is required indirectly
via p_cansched(9) or directly to set realtime/idle privilege, rather than
directly affecting target process authorization.