- Fix a null-pointer dereference introduced when snapshotting
was added. This occurred because, unlike the previous code,
vn_start_write() doesn't always return a non-NULL mp, as
filesystems may not support the VOP_GETWRITEMOUNT() call. For
now, rely on two pointers, so that vn_finished_write() works
properly (see the sketch below).
- Fix locking problems on exit, introduced at various points in the
past (some when snapshots came in), where a vnode might not be
unlocked before being vrele'd in various error situations.
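A minimal caller sketch of the intended pattern (hypothetical helper;
flags and the actual work abbreviated), assuming the stock
vn_start_write()/vn_finished_write() interfaces:

#include <sys/param.h>
#include <sys/vnode.h>
#include <sys/mount.h>

static int
modify_vnode(struct vnode *vp)
{
        struct mount *mp;
        int error;

        mp = NULL;
        error = vn_start_write(vp, &mp, V_WAIT);
        if (error != 0)
                return (error);
        /*
         * mp may legitimately be NULL here when the filesystem does not
         * implement VOP_GETWRITEMOUNT(), so never dereference it blindly.
         */
        /* ... do the work that dirties the vnode ... */
        vn_finished_write(mp);          /* must tolerate mp == NULL */
        return (0);
}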
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
The binary format "bintime" is a 32.64 format; it will go to 64.64
when time_t does.
The bintime format is available to consumers of time in the kernel,
and is preferable where time intervals need to be accumulated.
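For reference, the representation and the accumulation primitive look
roughly like this (a sketch matching the description above):

struct bintime {
        time_t          sec;    /* 32 bits today, 64 when time_t grows */
        uint64_t        frac;   /* binary fraction of a second */
};

static __inline void
bintime_add(struct bintime *bt, const struct bintime *bt2)
{
        uint64_t u;

        u = bt->frac;
        bt->frac += bt2->frac;
        if (u > bt->frac)       /* fraction wrapped: carry into seconds */
                bt->sec++;
        bt->sec += bt2->sec;
}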
This change simplifies much of the magic math inside the timecounters
and improves the frequency and time precision by a couple of bits.
I have not been able to measure a performance difference which was not
a tiny fraction of the standard deviation on the measurements.
This is a low-functionality change that makes the kernel access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embedded there,
but all the code that assumes so is removed in preparation for the next
commit, which will actually move it out.
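In other words, consumers now do the equivalent of the following sketch
(field names approximate; the old embedded-field access is shown only as a
comment):

static struct thread *
proc_main_thread(struct proc *p)
{
        /* old: return (&p->p_thread);  -- relied on the embedding */
        return (TAILQ_FIRST(&p->p_threads));    /* FIRST_THREAD_IN_PROC(p) */
}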
Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
- Create a private list of active pmaps rather than abusing the list of all
processes when we need to look up pmaps (see the sketch after this list).
The process list needs an sx lock and we can't be acquiring sx locks in the
middle of cpu_switch() (pmap_activate() can call pmap_get_asn() from
cpu_switch()). Instead, we protect the list with a spinlock. This also means
the list is shorter, since a pmap can be used by more than one process and
we could (at least in theory) dink with pmaps more than once, but now we
only touch each pmap once when we have to update all of them.
- Wrap pmap_activate()'s code to get a new ASN in an explicit critical section
so that when it is called while doing an exec() we can't get preempted.
- Replace splhigh() in pmap_growkernel() with a critical section to prevent
preemption while we are adjusting the kernel page tables.
- Fix abuse of PCPU_GET(), which doesn't return an lvalue.
- Also add some slight cleanups to the ASN handling by introducing macros
instead of magic numbers for the ASN and ASN generations.
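The shape of the private pmap list is roughly the following (names here are
illustrative, not the committed ones):

static TAILQ_HEAD(, pmap) pmap_active_list =
    TAILQ_HEAD_INITIALIZER(pmap_active_list);
static struct mtx pmap_list_mtx;        /* initialized with MTX_SPIN */

static void
pmap_invalidate_all_asns(void)
{
        struct pmap *pm;

        /* A spin mutex is safe to take from cpu_switch(); an sx lock is not. */
        mtx_lock_spin(&pmap_list_mtx);
        TAILQ_FOREACH(pm, &pmap_active_list, pm_list)
                pm->pm_asngen = 0;      /* force a fresh ASN on next activation */
        mtx_unlock_spin(&pmap_list_mtx);
}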
Reviewed by: dfr
shared.
Also introduce vm_endcopy instead of using pointer tricks when
initializing new vmspaces.
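A sketch of the idea (assuming vm_endcopy is simply a marker member in
struct vmspace; the real layout and copy bounds may differ):

/* Copy only the head of the structure, with the boundary named explicitly
 * rather than computed from unrelated member offsets. */
bcopy(vm1, vm2, offsetof(struct vmspace, vm_endcopy));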
The race occurred because of how the reference was used:
    test vmspace reference,
    possibly block,
    decrement reference
When sharing a vmspace between multiple processes it was possible
for two processes exiting at the same time to test the reference
count, possibly block, and then have neither one free the vmspace
because neither would see the other's update.
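The problematic shape, and one safe shape, look roughly like this
(illustrative only; the atomic op is shown just to make the point, it is
not necessarily the committed fix):

/* racy: test, possibly block, then decrement */
if (vm->vm_refcnt == 1) {
        /* may block here; a second exiting process also sees refcnt == 1 */
        vmspace_teardown(vm);           /* hypothetical helper */
}
vm->vm_refcnt--;

/* safe: decrement first; only the releaser of the last reference frees */
if (atomic_fetchadd_int(&vm->vm_refcnt, -1) == 1)
        vmspace_teardown(vm);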
Submitted by: green
HZ=BIGNUM will strain the assumptions behind timecounters to the
point where they break.
This may or may not help people seeing microuptime() backwards messages.
Make the global timecounter variable volatile; it makes no difference in
the code GCC generates, but it represents the intent correctly.
Thanks to: jdp
MFC after: 2 weeks
call VOP_INACTIVE before placing the vnode back on the free list.
Otherwise there is a race condition on SMP machines between
getnewvnode() locking the vnode to reclaim it and vrele()
locking the vnode to inactivate it. This window of vulnerability
becomes exaggerated in the presence of filesystems that have
been suspended as the inactive routine may need to temporarily
release the lock on the vnode to avoid deadlock with the syncer
process.
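Schematically, the ordering becomes (a simplified sketch, not the literal
vrele() code):

/* last reference just went away */
if (vn_lock(vp, LK_EXCLUSIVE, td) == 0)
        VOP_INACTIVE(vp, td);   /* may transiently drop the lock if the
                                   filesystem is suspended */
/* only after inactivation is the vnode placed back on the free list */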
threads race for a file slot.
dup2(2) incorrectly assumes that if it needs to grow the ofiles
array it will get what it wants. This assertion was valid
before we allowed shared file descriptor tables but is now incorrect.
The assertion can trigger superfluous panics if the thread doing a
dup2 loses a race with another thread while possibly blocked in
the MALLOC call in fdalloc. Another thread may grab the slot we
are requesting, which makes fdalloc return something other than what
we asked for, triggering the bogus assertion.
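One tolerant shape (a sketch with approximate names, not the exact
committed code) is to retry until fdalloc() actually yields the slot we
asked for:

do {
        error = fdalloc(td, new, &fd);
        if (error)
                return (error);
        /* another thread may have taken `new' while fdalloc() blocked */
} while (fd != new);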
MFC after: 2 weeks
Reviewed by: phk
signal trampoline for old signals. The arches that support old signals
currently abuse sigreturn(2) instead. This mainly complicates things
and slightly breaks the new sigreturn(2).
COMPAT is too limited to support the correct configuration of osigreturn,
and this commit doesn't attempt to fix it; it just moves the bogusness:
osigreturn() must now be provided unconditionally even on arches that
don't really need it; previously it had to be provided under the bogus
condition defined(COMPAT_43).
other threads as well as speed up the interfaces.
To fix the race and accomplish the speedup, remove selholddrop and
pollholddrop. The entire concept is somewhat bogus because holding
the individual struct file pointers offers us no guarantees that
another thread context won't close it out from under us, thereby
removing our access to our own reference.
Selholddrop and pollholddrop also did multiple locks and unlocks
of mutexes _per-file_ in the fd arrays to be scanned; this needed
to be sped up.
Instead of using selholddrop and pollholddrop, simply hold the
filedesc lock over the selscan and pollscan functions. This should
protect us against close(2)'s on the files, as well as reduce the
multiple lock/unlock pairs per fd to a single lock over the filedesc.
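That is, each scan is now bracketed by a single lock (sketch; argument
lists abbreviated):

FILEDESC_LOCK(td->td_proc->p_fd);
error = selscan(td, ibits, obits, nd);
FILEDESC_UNLOCK(td->td_proc->p_fd);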
from 1 megabyte of ram per user to 2 megabytes of ram per user, and
reduce the cap from 512 to 384. 512 leaves around 240 MB of KVM available
while 384 leaves 270 MB of KVM available. Available KVM is important
in order to deal with zalloc and kernel malloc area growth.
Reviewed by: mckusick
MFC: either before 4.5 if re's agree, or after 4.5
This allows obtaining crash dumps from panics that occur during late stages
of kernel initialisation, before the system enters single-user mode.
MFC after: 2 weeks
replace mutex_lock calls on uidinfo with macro calls:
mtx_lock(&uidp->ui_mtx) -> UIDINFO_LOCK(uidp)
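Presumably defined along these lines (sketch):

#define UIDINFO_LOCK(ui)        mtx_lock(&(ui)->ui_mtx)
#define UIDINFO_UNLOCK(ui)      mtx_unlock(&(ui)->ui_mtx)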
Terry Lambert <tlambert2@mindspring.com> helped with this.
sleeping on a process object but changed the corresponding
wakeup()s to the thread object. The result was that non-raw
aio ops waited for an aio daemon to time out before action
was taken. Now, we sleep on the thread object.
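The underlying rule is that the sleep channel and the wakeup channel must
be the same address, otherwise only the timeout ever wakes the sleeper
(names below are illustrative):

/* aio daemon side */
tsleep(aiop->aiothread, PRIBIO, "aiordy", aiod_timeout);

/* submitter side: must hit the identical channel */
wakeup(aiop->aiothread);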
PR: kern/34016
operation. The vgonel() code has always called vclean() but until we
started proactively freeing vnodes it would never actually be called with
a dirty vnode, so this situation did not occur prior to the vnlru() code.
Now that we proactively free vnodes when kern.maxvnodes is hit, however,
vclean() winds up with work to do and improperly generates the warnings.
Reviewed by: peter
Approved by: re (for MFC)
MFC after: 1 day