is currently executing when we try to remove it in exit1(). Without this,
it was possible for the callout to bogusly rearm itself and eventually
refire after the process had been freed, resulting in a panic.
PR: kern/51964
Reported by: Jilles Tjoelker <jilles@stack.nl>
Reviewed by: tegge, bde
curthread. Unlike td_flags, this field does not need any locking.
- Replace the td_inktr and td_inktrace variables with equivalent private
thread flags.
- Move TDF_OLDMASK over to the private flags field so it no longer requires
sched_lock.
second and equalizing the load between the two most imbalanced CPUs. This
is intended to clear up long term load imbalances that would not be handled
by the 'pull' method in sched_choose().
- Pull out some bits of sched_choose() into a kseq_move() function that moves
an arbitrary thread from one kseq to another.
adding it to the nice tables. Therefore, in kseq_add_nice, we should
keep in mind that the load will be 1 if we are the only thread, and not
0.
- Assert that the sched lock is held in all the appropriate places.
- Increase the scope of the sched lock in sched_pctcpu_update().
- Hold the sched lock in sched_runnable(). It is not held by the caller.
"", temporarily map it to a call to extattr_list_vp() to provide
compatibility for older applications using the "" API to retrieve
EA lists.
Use VOP_LISTEXTATTR() to support extattr_list_vp() rather than
VOP_GETEXTATTR(..., "", ...).
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
specific attribute name. It will have the same semantics as the
older vop_getextattr() "retrieve the names" hack, returning
a buffer with ASCII nul-separated names.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
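A minimal sketch of how a caller might walk a buffer of nul-separated
names such as the one described above. The helper name walk_name_list()
and the callback type are hypothetical, and the buffer layout is assumed
only from the wording of this message; check extattr(2) for the layout
your system actually returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: invoked once per attribute name. */
typedef void (*name_cb)(const char *name, void *arg);

/*
 * Walk 'len' bytes of nul-separated names in 'buf' and return how many
 * complete, non-empty names were found.  A trailing fragment without a
 * terminating nul is ignored rather than read past the buffer.
 */
static size_t
walk_name_list(const char *buf, size_t len, name_cb cb, void *arg)
{
	size_t off = 0, count = 0;

	assert(buf != NULL || len == 0);
	while (off < len) {
		const char *end = memchr(buf + off, '\0', len - off);

		if (end == NULL)
			break;			/* unterminated tail */
		if (end != buf + off) {		/* skip empty names */
			if (cb != NULL)
				cb(buf + off, arg);
			count++;
		}
		off = (size_t)(end - buf) + 1;
	}
	return (count);
}
```

Passing the byte count returned by the list call as 'len' keeps the walk
inside the data the kernel actually filled in.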
we were passing in a void* representing the PCB of the parent thread.
Now we pass a pointer to the parent thread itself.
The prime reason for this change is to allow cpu_set_upcall() to copy
(parts of) the trapframe instead of having it done in MI code in each
caller of cpu_set_upcall(). Copying the trapframe cannot always be
done with a simple bcopy() or may not always be optimal that way. On
ia64 specifically the trapframe contains information that is specific
to an entry into the kernel and can only be used by the corresponding
exit from the kernel. A trapframe copied verbatim from another frame
is in most cases useless without some additional normalization.
Note that this change removes the assignment to td->td_frame in some
implementations of cpu_set_upcall(). The assignment is redundant.
A previous call to cpu_thread_setup() already did the exact same
assignment. An added benefit of removing the redundant assignment is
that we can now change td_pcb without nasty side-effects.
This change officially marks the ability on ia64 for 1:1 threading.
Not tested on: amd64, powerpc
Compile & boot tested on: alpha, sparc64
Functionally tested on: i386, ia64
extattr_list_link() system calls, which return a list of extended
attributes defined for a vnode referenced by a file descriptor
or path name. Currently, we just invoke VOP_GETEXTATTR() since
it will convert a request for an empty name into a query for a
name list, which was the old (more hackish) API. At some point
in the near future, we'll push the distinction between get and
list down to the vnode operation layer, but this provides access
to the new API for applications in the short term.
Pointed out by: Dominic Giampaolo <dbg@apple.com>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
file/directory/link, rather than using a less explicit hack on
the extattr retrieval API:
extattr_list_fd()
extattr_list_file()
extattr_list_link()
The existing API was counter-intuitive, and poorly documented.
The prototypes for these system calls are identical to
extattr_get_*(), but without a specific attribute name to
leave NULL.
Pointed out by: Dominic Giampaolo <dbg@apple.com>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Don't copyin() data we are about to overwrite.
Add a flag to tell userland that KSE is officially "DONE" with the
mailbox and has gone away.
Obtained from: davidxu@
we failed to put the bucket back into the general cache/container.
Also, fix a bad assumption. There was a KASSERT() that aimed to
guarantee that whenever the pcpu container's mc_starved was > 0,
that whatever bucket we were freeing to was an empty bucket,
assuming it belonged to the pcpu container cache. However, there
is at least one case where this is not true anymore; consider:
1) All containers empty, next thread to try to alloc will touch
a pcpu container, notice it's empty, and increment the pcpu
container's mc_starved.
2) Some other thread frees an mbuf belonging to a bucket in
the general cache/container. Then it frees another mbuf
belonging to the same bucket (still in gen container).
3) Some third thread tries to allocate an mbuf from the pcpu
container and, since it is empty, grabs one mbuf now available
in the general cache. It then moves the non-empty bucket it
took that mbuf from (the bucket the thread in (2) freed to)
to the pcpu container.
4) A final thread tries to free an mbuf belonging to the
NON-EMPTY bucket mentioned in (2) and (3) and, since
the pcpu container's mc_starved is > 0, but the bucket
is obviously non-empty, it trips on the KASSERT.
This meant that one could potentially get a panic in some
cases when out of mbufs and clusters. The problem could
be mitigated by commenting out some cv_signal() calls,
but I'm assuming that was pure coincidence and this is
the correct fix.
- Use a hash of umtx queues to queue blocked threads. We hash on pid and the
virtual address of the umtx structure. This eliminates cases where we
previously held a lock across a casuptr call.
Reviewed by: jhb (quickly)
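A rough sketch of hashing on pid and the umtx's virtual address, as
described above. The names uq_key, uq_hash(), uq_key_equal() and
UQ_HASH_BUCKETS, the table size, and the mixing constant are all
assumptions made for illustration; this is not the committed code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	UQ_HASH_BUCKETS	128	/* hypothetical size, a power of two */

/* Hypothetical key: owning process ID plus the umtx's user address. */
struct uq_key {
	int32_t		pid;
	uintptr_t	addr;
};

/* Map a key to a bucket index in [0, UQ_HASH_BUCKETS). */
static size_t
uq_hash(const struct uq_key *key)
{
	uint64_t h;

	assert(key != NULL);
	/* Drop low alignment bits of the address, then mix in the pid. */
	h = (uint64_t)(key->addr >> 3) ^
	    ((uint64_t)(uint32_t)key->pid * 2654435761u);
	h ^= h >> 16;
	return ((size_t)(h & (UQ_HASH_BUCKETS - 1)));
}

/* Two waiters share a queue only when both pid and address match. */
static int
uq_key_equal(const struct uq_key *a, const struct uq_key *b)
{
	return (a->pid == b->pid && a->addr == b->addr);
}
```

Including the pid in the key keeps equal virtual addresses in different
processes from being treated as the same umtx.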
the lameness of the kstack code. The EPC overhaul de-lame-ified the
kstack code by removing the need for contigmalloc(). We can now
allocate stacks using malloc(). We probably want to make the stacks
swappable as well so that we can make it MI. But that's another story.
why certain exceptions are made, note an inconsistency between
FreeBSD and some other implementations regarding IPC_M, and let
suser() generate our EPERM rather than forcing it ourselves.
Remove a carriage return that crept in in the last commit.
Reviewed by: gordon
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
the stack to be changed in a way incompatible with elf32_map_insert()
where we used data_buf without initializing it for when the partial
mapping results in a misaligned image (typical when the page size
implied by the image is not the same as the page size in use by the
kernel). Since data_buf is passed by reference to vm_map_find(), the
compiler cannot warn about it.
While here, move all local variables to the top of the function.
in the kernel, the sysctl_register() call would fail, as expected.
However, when unloading this module again, the kernel would then panic
in sysctl_unregister(). Print an error message instead.
Submitted by: Nicolai Petri <nicolai@catpipe.net>
Reviewed by: imp
Approved by: re@ (jhb)
buf_start() to avoid triggering a panic in softdep_disk_io_initiation()
if b_iocmd happened to be BIO_READ. The later initialisation of
b_iocmd in cluster_wbuild() could probably be moved to before the
buf_start() call, but this patch keeps the change as simple as
possible.
This is reported to fix occasional "softdep_disk_io_initiation: read"
panics, especially on NFS servers.
Reported by: Nick Hilliard <nick@netability.ie>
Tested by: Nick Hilliard <nick@netability.ie>
Approved by: re (rwatson)
because we could fail due to a small buffer and loop and rerun. If this
happens, then the vsnprintf() will have already taken the arguments off
the va_list. For i386 and others, this doesn't matter because the
va_list type is passed as a copy. But on powerpc and amd64, this is
fatal because the va_list is a reference to an external structure that
keeps the vararg state due to the more complicated argument passing system.
On amd64, arguments can be passed as follows:
First 6 int/pointer type arguments go in registers, the rest go on
the memory stack.
Float and double are similar, except using SSE registers.
long double (80 bit precision) are similar except using the x87 stack.
Where the 'next argument' comes from depends on how many have been
processed so far and what type it is. For amd64, gcc keeps this state
somewhere that is referenced by the va_list.
I found a description that showed the va_copy was required here:
http://mirrors.ccs.neu.edu/cgi-bin/unixhelp/man-cgi?va_end+9
The single unix spec doesn't mention va_copy() at all.
Anyway, the problem was that the sysctl kern.geom.conf* nodes would panic
due to walking off the end of the va_arg lists in vsnprintf. A better fix
would be to have sbuf_vprintf() use a single pass and call kvprintf()
with a callback function that stored the results and grew the buffer
as needed.
Approved by: re (scottl)
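The retry problem described above can be illustrated in userland C. This
is a sketch only: format_retry_v() and format_retry() are hypothetical
helpers, not the sbuf code, and they use realloc() where the kernel would
use its own allocator. The point is that each pass formats from a fresh
va_copy() of the caller's va_list, so a second pass never sees a
half-consumed list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: format into a heap buffer, growing it and
 * retrying until the output fits.  Returns NULL on failure; the caller
 * frees the result.
 */
static char *
format_retry_v(size_t initial, const char *fmt, va_list ap)
{
	size_t size = initial > 0 ? initial : 1;
	char *buf = NULL;

	assert(fmt != NULL);
	for (;;) {
		char *nbuf = realloc(buf, size);
		va_list ap_copy;
		int len;

		if (nbuf == NULL) {
			free(buf);
			return (NULL);
		}
		buf = nbuf;

		/* Consume a copy each pass; 'ap' itself stays untouched. */
		va_copy(ap_copy, ap);
		len = vsnprintf(buf, size, fmt, ap_copy);
		va_end(ap_copy);

		if (len < 0) {
			free(buf);
			return (NULL);
		}
		if ((size_t)len < size)
			return (buf);
		size = (size_t)len + 1;	/* text plus terminating NUL */
	}
}

/* Variadic front end for the helper above. */
static char *
format_retry(size_t initial, const char *fmt, ...)
{
	va_list ap;
	char *s;

	va_start(ap, fmt);
	s = format_retry_v(initial, fmt, ap);
	va_end(ap);
	return (s);
}
```

Starting with a deliberately small 'initial' size forces at least one
retry, which is the case that breaks without va_copy() on platforms where
va_list refers to external state.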
size and the kernel's heap size, specifically, vm_kmem_size. This
function allows a maximum of 40% of the vm_kmem_size to be used for
vnodes and vm objects. This is a conservative bound based upon recent
problem reports. (In other words, a slight increase in this percentage
may be safe.)
Finally, machines with less than ~3GB of RAM should be unaffected
by this change, i.e., the maximum number of vnodes should remain
the same. If necessary, machines with 3GB or more of RAM can increase
the maximum number of vnodes by increasing vm_kmem_size.
Desired by: scottl
Tested by: jake
Approved by: re (rwatson,scottl)
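A small sketch of the 40% bound described above. The function name
vnode_cap() and its parameters are hypothetical: 'phys_limit' stands for
whatever limit is derived from physical memory, and 'per_vnode_bytes' for
the combined size of one vnode and its vm object. Neither value is taken
from the source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the smaller of the physical-memory limit
 * and the number of vnodes that fit in 40% (2/5) of the kernel heap.
 */
static uint64_t
vnode_cap(uint64_t phys_limit, uint64_t kmem_size, uint64_t per_vnode_bytes)
{
	uint64_t kmem_limit;

	assert(per_vnode_bytes > 0);
	/* Divide before multiplying so 2 * kmem_size cannot overflow. */
	kmem_limit = kmem_size / 5 * 2 / per_vnode_bytes;
	return (phys_limit < kmem_limit ? phys_limit : kmem_limit);
}
```

With a large heap the physical-memory limit wins and the result is
unchanged; only when the heap is small relative to RAM does the 40% bound
take effect, which matches the behaviour the message describes.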