vfs_export() fails. Restoring the old options and flags after a successful
VFS_MOUNT(9) call may cause the file system's internal state to become
inconsistent with the mount options and flags. Specifically, the FFS
superblock fs_ronly field and the MNT_RDONLY flag may get out of sync.
PR: kern/133614
Discussed on: freebsd-hackers
the debugger back-end has changed. This means that switching from ddb
to gdb no longer requires a "step", which can be dangerous on an
already-crashed kernel.
Also add a capability to get from the gdb back-end back to ddb, by
typing ^C in the console window.
While here, simplify kdb_sysctl_available() by using
sbuf_new_for_sysctl(), and use strlcpy() instead of strncpy() since the
strlcpy() semantics are desired.
MFC after: 1 month
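The difference between the strncpy() and strlcpy() semantics mentioned
above can be illustrated in user space. This is only a sketch:
demo_strlcpy() and demo_is_terminated() are hypothetical helpers written
for this example (not the libc or kernel implementations), so that it
builds on systems without strlcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copy at most dstsize - 1 bytes and
 * always NUL-terminate when dstsize > 0.  Returns strlen(src), so callers
 * can detect truncation by comparing the result against dstsize.
 */
static size_t
demo_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize > 0) {
		size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (srclen);
}

/* Returns 1 if buf[0..size) contains a NUL terminator, 0 otherwise. */
static int
demo_is_terminated(const char *buf, size_t size)
{
	return (memchr(buf, '\0', size) != NULL);
}
```

With a 4-byte destination and the source "debugger", strncpy() fills all
four bytes and leaves no terminator, while the strlcpy()-style copy
yields "deb" and reports the untruncated length.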
VNET socket push back:
try to minimize the number of places where we have to switch vnets
and narrow down the time we stay switched. Add assertions to the
socket code to catch possibly unset vnets as seen in r204147.
While this reduces the number of vnet recursions in some places like
NFS, POSIX local sockets and some netgraph, .. recursions are
impossible to fix.
The current expectations are documented at the beginning of
uipc_socket.c along with the other information there.
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
Reviewed by: jhb
Tested by: zec
Tested by: Mikolaj Golub (to.my.trociny gmail.com)
MFC after: 2 weeks
Catch a set vnet upon return to user space. This usually
means return paths with CURVNET_RESTORE() missing.
If VNET_DEBUG is turned on we can even tell the function
that did the CURVNET_SET() which is really helpful; else
we print "N/A".
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
Reviewed by: jhb
MFC after: 11 days
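The check described above can be modelled in user space. The sketch
below is not the kernel's CURVNET_SET()/CURVNET_RESTORE() implementation;
every demo_* and DEMO_* name is hypothetical, and a plain global stands
in for the per-thread current vnet pointer. It assumes the debug variant
that records the setting function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_vnet { int id; };

/* Model of the "current vnet" and of the function that last set it. */
static struct demo_vnet *demo_curvnet;
static const char *demo_curvnet_setter;

/* Save the previous state in locals, then switch to the new vnet. */
#define DEMO_CURVNET_SET(vnet)						\
	struct demo_vnet *demo_saved_vnet = demo_curvnet;		\
	const char *demo_saved_setter = demo_curvnet_setter;		\
	demo_curvnet = (vnet);						\
	demo_curvnet_setter = __func__

/* Undo the matching DEMO_CURVNET_SET() in the same function. */
#define DEMO_CURVNET_RESTORE()						\
	demo_curvnet = demo_saved_vnet;					\
	demo_curvnet_setter = demo_saved_setter

static void
demo_good_path(struct demo_vnet *vnet)
{
	DEMO_CURVNET_SET(vnet);
	/* ... work in the context of vnet ... */
	DEMO_CURVNET_RESTORE();
}

static void
demo_leaky_path(struct demo_vnet *vnet)
{
	DEMO_CURVNET_SET(vnet);
	(void)demo_saved_vnet;		/* silence unused warnings */
	(void)demo_saved_setter;
	/* Bug being modelled: return path without the restore. */
}

/*
 * Check made on "return to user space": NULL if no vnet is left set,
 * otherwise the name of the function that set it, or "N/A" if unknown.
 */
static const char *
demo_userret_check(void)
{
	if (demo_curvnet == NULL)
		return (NULL);
	return (demo_curvnet_setter != NULL ? demo_curvnet_setter : "N/A");
}
```

After demo_good_path() the check finds nothing; after demo_leaky_path()
it reports "demo_leaky_path" as the function that left the vnet set.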
KASSERT()s and eliminate the rest.
Replace excessive printf()s and a panic() in bufdone_finish() with a
KASSERT() in vm_page_io_finish().
Reviewed by: kib
Make VNET_ASSERT() available with either VNET_DEBUG or INVARIANTS.
Change the syntax to match KASSERT() to allow more flexible panic
messages rather than having a printf with hardcoded arguments
before panic.
Adjust the few assertions we have to the new format (and enhance
the output).
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
Reviewed by: jhb
MFC after: 2 weeks
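The KASSERT()-style syntax referred to above takes the message as a
parenthesized printf argument list. A user-space sketch of that shape
follows; DEMO_VNET_ASSERT(), demo_panic() and demo_check_vnet() are
hypothetical names for this example, and demo_panic() records the
message instead of halting so the behaviour can be observed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char demo_panic_msg[128];
static int demo_panicked;

/* Stand-in for panic(9): format and record the message, do not halt. */
static void
demo_panic(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(demo_panic_msg, sizeof(demo_panic_msg), fmt, ap);
	va_end(ap);
	demo_panicked = 1;
}

/*
 * KASSERT()-style assertion: msg is a complete, parenthesized argument
 * list, so each call site chooses its own format string and arguments.
 */
#define DEMO_VNET_ASSERT(exp, msg) do {					\
	if (!(exp))							\
		demo_panic msg;						\
} while (0)

static void
demo_check_vnet(const void *vnet, const char *caller)
{
	DEMO_VNET_ASSERT(vnet != NULL,
	    ("%s: vnet is NULL (called from %s)", __func__, caller));
}
```

Because the message is an ordinary argument list, a call site can add
context such as the caller's name without a separate printf before the
panic.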
attributes for preloaded modules/images. In particular, MODINFO_ADDR has
the added complexity of not always being relocated properly. Rather than
kluging this in the various components that are affected, we handle it
in a centralized place (preload_fetch_addr()). To that end, expose a new
variable, preload_addr_relocate, that MD initialization code can set and
that turns the address attribute into a valid kernel VA.
Architectures that need the relocation: arm & powerpc (at least).
Components that can utilize this: acpi(4), md(4), fb(4), pci(4), ZFS, geli.
Sponsored by: Juniper Networks
misnamed since it was introduced and should not be globally exposed
with this name. The equivalent functionality is now available using
kern_yield(curthread->td_user_pri). The function remains
undocumented.
Bump __FreeBSD_version.
- entirely eliminate some calls to uio_yield() as being unnecessary,
such as in a sysctl handler.
- move should_yield() and maybe_yield() to kern_synch.c and move the
prototypes from sys/uio.h to sys/proc.h
- add a slightly more generic kern_yield() that can replace the
functionality of uio_yield().
- replace source uses of uio_yield() with the functional equivalent,
or in some cases do not change the thread priority when switching.
- fix a logic inversion bug in vlrureclaim(), pointed out by bde@.
- instead of using the per-cpu last switched ticks, use a per thread
variable for should_yield(). With PREEMPTION, the only reasonable
use of this is to determine if a lock has been held a long time and
relinquish it. Without PREEMPTION, this is essentially the same as
the per-cpu variable.
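The per-thread variant of should_yield() described in the last item can
be sketched in user space as follows. This is a model under assumptions,
not the kernel code: the demo_* names, the field demo_last_switch_tick
and the threshold value are all hypothetical.

```c
#include <assert.h>

struct demo_thread {
	unsigned int demo_last_switch_tick;	/* tick of last yield */
	int yields;				/* number of yields */
};

static unsigned int demo_ticks;			/* models the tick counter */
static const unsigned int demo_hogticks = 10;	/* hypothetical threshold */

/* True once this thread has run demo_hogticks ticks without yielding. */
static int
demo_should_yield(const struct demo_thread *td)
{
	return (demo_ticks - td->demo_last_switch_tick >= demo_hogticks);
}

/* Model of a yield: note when it happened and count it. */
static void
demo_kern_yield(struct demo_thread *td)
{
	td->demo_last_switch_tick = demo_ticks;
	td->yields++;
}

/* The common check-and-yield case. */
static void
demo_maybe_yield(struct demo_thread *td)
{
	if (demo_should_yield(td))
		demo_kern_yield(td);
}
```

Keeping the timestamp in the thread rather than per CPU means the
decision reflects how long this particular thread has gone without
yielding, regardless of what else ran on the CPU in between.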
MI ucontext_t and x86 MD parts.
The kernel allocates the structures on the stack, so not clearing the
reserved fields and padding leaks kernel stack contents.
Noted and discussed with: bde
MFC after: 2 weeks
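The leak fixed by the ucontext_t change above can be illustrated with a
small user-space model. struct demo_ucontext and the demo_* functions
are hypothetical; an explicit reserved array stands in for both reserved
fields and compiler padding so that the effect is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_ucontext {
	int uc_flags;
	int demo_spare[4];	/* reserved: never set by the fill code */
};

/*
 * Fill a structure that lives on the caller's stack.  With clear != 0
 * the whole structure is zeroed first, so bytes the fill code does not
 * set cannot carry stale stack contents out to the consumer.
 */
static void
demo_fill(struct demo_ucontext *uc, int flags, int clear)
{
	if (clear)
		memset(uc, 0, sizeof(*uc));
	uc->uc_flags = flags;
}

/* Returns 1 if every reserved word is zero. */
static int
demo_spare_is_clean(const struct demo_ucontext *uc)
{
	size_t i;

	for (i = 0; i < sizeof(uc->demo_spare) / sizeof(uc->demo_spare[0]); i++)
		if (uc->demo_spare[i] != 0)
			return (0);
	return (1);
}
```

If the structure starts out holding arbitrary bytes, only the variant
that clears it first leaves the reserved words zero.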
should_yield(). Use this in various places. Encapsulate the common
case of check-and-yield into a new function maybe_yield().
Change several checks for a magic number of iterations to use
should_yield() instead.
MFC after: 1 week
collect phases. The unp_discard() function executes
unp_externalize_fp(), which might make the socket eligible for gc-ing,
and then, later, taskqueue will close the socket. Since unp_gc()
dropped the list lock to do the malloc, close might happen after the
mark step but before the collection step, causing collection to not
find the socket and miss one array element.
I believe that the race was there before r216158, but the stated
revision made the window much wider by postponing the close to
taskqueue sometimes.
Only process as many array elements as there are sockets found during
the second phase of gc [1]. Take the linkage lock and recheck the
eligibility of the socket for gc, as well as call fhold() under the
linkage lock.
Reported and tested by: jmallett
Submitted by: jmallett [1]
Reviewed by: rwatson, jeff (possibly)
MFC after: 1 week
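The shape of the fix above, processing only as many entries as the
collect phase actually found, can be sketched in user space. This is a
simplified model, not the unp_gc() code: the demo_* names are
hypothetical, locking is omitted, and a reference counter stands in for
fhold().

```c
#include <assert.h>
#include <stddef.h>

struct demo_sock {
	int eligible;	/* still a gc candidate? */
	int refs;	/* models the file reference count */
};

/*
 * Collect phase.  'expected' is the count from the mark phase, taken
 * before the lock was dropped to allocate 'out'.  Eligibility is
 * rechecked here, and the return value is the number of entries that
 * were really stored; the caller must process only that many.
 */
static size_t
demo_collect(struct demo_sock *socks, size_t nsocks,
    struct demo_sock **out, size_t expected)
{
	size_t found = 0, i;

	for (i = 0; i < nsocks && found < expected; i++) {
		if (!socks[i].eligible)
			continue;	/* closed since the mark phase */
		socks[i].refs++;	/* models fhold() */
		out[found++] = &socks[i];
	}
	return (found);
}
```

If one of three marked sockets is closed before the collect phase, the
function returns 2 and the third array slot is never touched by the
caller.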
each of the threads needs more while current pool of the buffers is
exhausted, then neither thread can make progress.
Switch to nowait allocations once the first buffer has been obtained.
Reported by: az
Reviewed by: alc (previous version)
Tested by: pho
MFC after: 1 week
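The allocation policy described above can be modelled as follows. This
is a user-space sketch with hypothetical demo_* names, not the kernel
buffer allocator: a counter records where a real waiting allocation
would have gone to sleep.

```c
#include <assert.h>

enum demo_flag { DEMO_WAITOK, DEMO_NOWAIT };

struct demo_pool {
	int nfree;		/* buffers currently available */
	int would_sleep;	/* times a waiting caller would block */
};

/* Returns 1 if a buffer was obtained, 0 otherwise. */
static int
demo_getbuf(struct demo_pool *p, enum demo_flag flag)
{
	if (p->nfree > 0) {
		p->nfree--;
		return (1);
	}
	if (flag == DEMO_WAITOK)
		p->would_sleep++;	/* a real caller would block here */
	return (0);
}

/*
 * Gather up to 'want' buffers.  Only the first request may wait; once
 * one buffer is held, further requests are nowait, so a thread never
 * sleeps on an exhausted pool while holding buffers that another
 * thread needs.
 */
static int
demo_gather(struct demo_pool *p, int want)
{
	int got = 0;

	while (got < want) {
		if (!demo_getbuf(p, got == 0 ? DEMO_WAITOK : DEMO_NOWAIT))
			break;
		got++;
	}
	return (got);
}
```

A caller that wants four buffers from a pool of two ends up with two
and proceeds with a smaller batch instead of blocking.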
The fdcheckstd() function makes sure fds 0, 1 and 2 are open by opening
/dev/null. If this fails (e.g. missing devfs or wrong permissions),
fdcheckstd() will return failure and the process will exit as if it received
SIGABRT. The KASSERT is only to check that kern_open() returns the expected
fd, given that it succeeded.
Tripping the KASSERT is most likely if fd 0 is open but fd 1 or 2 are not.
MFC after: 2 weeks
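The ordering described above, failing first and asserting on the
returned descriptor only after a successful open, can be sketched with
POSIX calls in user space. demo_fdcheckstd() is a hypothetical analogue
for illustration, not the kernel function, and assert() stands in for
KASSERT().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make sure descriptors 0, 1 and 2 are open, opening /dev/null for any
 * that are closed.  Returns 0 on success and -1 if /dev/null cannot be
 * opened.
 */
static int
demo_fdcheckstd(void)
{
	int fd, i;

	for (i = 0; i <= 2; i++) {
		if (fcntl(i, F_GETFD) != -1 || errno != EBADF)
			continue;	/* descriptor i is already open */
		fd = open("/dev/null", O_RDWR);
		if (fd == -1)
			return (-1);	/* fail first; fd means nothing here */
		/*
		 * Only valid after success: descriptors below i are open,
		 * so the lowest free descriptor must be i.
		 */
		assert(fd == i);
	}
	return (0);
}
```

Checking the descriptor before checking for failure would compare a
meaningless value against i, which is the situation the text describes
when fd 0 is open but a later descriptor is not.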
sbuf_new_for_sysctl(9). This allows using an sbuf with a SYSCTL_OUT
drain for extremely large amounts of data where the caller knows that
appropriate references are held, and sleeping is not an issue.
Inspired by: rwatson
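The idea of an sbuf with a drain can be modelled in user space: a small
fixed buffer hands its contents to a callback whenever it fills, so the
total output is not limited by the buffer size. This is a sketch only;
struct demo_sbuf and the demo_* functions are hypothetical and are not
the sbuf(9) API, and demo_sink_drain() merely stands in for a
SYSCTL_OUT-style consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int demo_drain_fn(void *arg, const char *data, size_t len);

struct demo_sbuf {
	char buf[8];		/* deliberately tiny staging buffer */
	size_t len;		/* bytes currently staged */
	demo_drain_fn *drain;	/* consumer for staged bytes */
	void *drain_arg;
	int error;		/* first drain error, sticky */
};

/* Push staged bytes to the drain; after an error, discard instead. */
static void
demo_sbuf_flush(struct demo_sbuf *sb)
{
	if (sb->len > 0 && sb->error == 0)
		sb->error = sb->drain(sb->drain_arg, sb->buf, sb->len);
	sb->len = 0;
}

/* Append a string, flushing whenever the staging buffer is full. */
static void
demo_sbuf_cat(struct demo_sbuf *sb, const char *s)
{
	for (; *s != '\0'; s++) {
		if (sb->len == sizeof(sb->buf))
			demo_sbuf_flush(sb);
		sb->buf[sb->len++] = *s;
	}
}

/* Example drain target: append to a caller-provided sink. */
struct demo_sink {
	char data[64];
	size_t len;
};

static int
demo_sink_drain(void *arg, const char *data, size_t len)
{
	struct demo_sink *sink = arg;

	if (len > sizeof(sink->data) - sink->len)
		return (-1);	/* sink is full */
	memcpy(sink->data + sink->len, data, len);
	sink->len += len;
	return (0);
}
```

A string longer than the 8-byte staging buffer still arrives intact at
the sink, provided the caller performs a final flush.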
nonfunctional. Wiring the user buffer has only been done explicitly
since r101422.
Mark the kern.disks sysctl as MPSAFE since it is and it seems to have
been mis-using the NOLOCK flag.
Partially break the KPI (but not the KBI) for the sysctl_req 'lock'
field since this member should be private and the "REQ_LOCKED" state
seems meaningless now.
before checking the validity of the next buffer pointer. Otherwise, the
buffer might be reclaimed after the check, causing the iteration to run
into the wrong buffer.
Reported and tested by: pho
MFC after: 1 week
- Move the realtime priority range up above kernel sleep priorities and
just below interrupt thread priorities.
- Contract the interrupt and kernel sleep priority ranges a bit so that
the timesharing priority band can be increased. The new timeshare range
is now slightly larger than the old realtime + timeshare ranges.
- Change the ULE scheduler to no longer use realtime priorities for
interactive threads. Instead, the larger timeshare range is now split
into separate subranges for interactive and non-interactive ("batch")
threads. The end result is that interactive threads and non-interactive
threads still use the same priority ranges as before, but realtime
threads now have a separate, dedicated priority range.
- Do not modify the priority of non-timeshare threads in sched_sleep()
or via cv_broadcastpri(). Realtime and idle priority threads will
no longer have their priorities affected by sleeping in the kernel.
Reviewed by: jeff
interactive timeshare threads (PRI_*_INTERACTIVE) and non-interactive
timeshare threads (PRI_*_BATCH) and use these instead of PRI_*_REALTIME
and PRI_*_TIMESHARE. No functional change.
Reviewed by: jeff
PI_DISKLOW. While here, rename PI_TTYLOW to PI_TTY.
- Add a macro PI_SWI() that takes a SWI_* constant as an argument and
returns the suitable thread priority.
That revision introduces a bug that is more visible than the problems
it is trying to fix.
Since my time is very limited at the moment, I will recommit this patch
only once it is fully fixed.
Reported by: dim, Nicholas Esborn
PT_GNU_STACK program header, if present and enabled. Two new sysctls
are provided, kern.elf32.nxstack and kern.elf64.nxstack, that enable
PT_GNU_STACK handling for ABIs of the specified bit size, provided the
ABI supports the shared page.
Inform rtld about the access mode of the initial stack mapping via the
AT_STACKPROT aux vector.
At the moment, the default is disabled, waiting for the usermode
support bits.
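How a PT_GNU_STACK header can determine the stack protection is sketched
below in user space. This is not the kernel's ELF image activator:
struct demo_phdr is a cut-down hypothetical program header, the
nxstack_enabled parameter models the sysctl knob, and falling back to a
read/write/execute stack when the header is absent or the knob is off is
an assumption of this sketch. The PT_GNU_STACK and PF_* values are the
standard ELF constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PT_GNU_STACK	0x6474e551u
#define DEMO_PF_X		0x1u
#define DEMO_PF_W		0x2u
#define DEMO_PF_R		0x4u

/* Only the two program header fields this sketch needs. */
struct demo_phdr {
	uint32_t p_type;
	uint32_t p_flags;
};

/*
 * Return the PF_* permissions to use for the initial stack mapping.
 * The header is honoured only when the knob is enabled.
 */
static uint32_t
demo_stack_prot(const struct demo_phdr *ph, size_t n, int nxstack_enabled)
{
	const uint32_t def = DEMO_PF_R | DEMO_PF_W | DEMO_PF_X;
	size_t i;

	if (!nxstack_enabled)
		return (def);
	for (i = 0; i < n; i++)
		if (ph[i].p_type == DEMO_PT_GNU_STACK)
			return (ph[i].p_flags);
	return (def);
}
```

A binary that carries a PT_GNU_STACK header without the execute bit gets
a non-executable stack only while the knob is enabled.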
setting the SV_SHP flag and providing a pointer to the vm object and the
mapping address. Provide a simple allocator to carve space in the page,
tailored to placing code with alignment restrictions.
Enable shared page use for amd64, both native and 32bit FreeBSD
binaries. The page is privately mapped at the top of the user address
space, moving the start of the stack one page down. Move the signal
trampoline code from the top of the stack to the shared page.
Reviewed by: alc
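A simple allocator that carves aligned space out of a single page, as
described above, could look like the following bump allocator. This is
an illustrative user-space sketch, not the kernel's allocator: the
demo_* names and the 4096-byte page size are assumptions, and the
alignment must be a power of two.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE	4096u

struct demo_shared_page {
	size_t fill;	/* offset of the first unused byte */
};

/*
 * Reserve 'size' bytes at an offset that is a multiple of 'align'
 * (a power of two).  Returns the offset within the page, or -1 if the
 * request does not fit.  Space is never freed.
 */
static long
demo_shared_page_alloc(struct demo_shared_page *sp, size_t size,
    size_t align)
{
	size_t start = (sp->fill + align - 1) & ~(align - 1);

	if (size > DEMO_PAGE_SIZE || start > DEMO_PAGE_SIZE - size)
		return (-1);
	sp->fill = start + size;
	return ((long)start);
}
```

A caller such as signal trampoline setup would ask for the size of its
code with the alignment that code requires, then copy the code to the
returned offset.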
to match the desired priority in td_priority. Otherwise the first time
thread0 used a borrowed priority it would drop down to PUSER instead of
PVM.
- Explicitly initialize the starting priority of new kprocs to PVM to
avoid inheriting some random priority from thread0.
MFC after: 2 weeks
thread and proc have been copied and zeroed from the old thread and
proc. Otherwise attempts to modify thread or process data in sched_fork()
could be undone.
- Don't copy td_{base,}_user_pri from the old thread to the new thread in
sched_fork_thread() in ULE. This is already done courtesy of the bcopy()
of the thread copy region.
- Always initialize the real priority (td_priority) of new threads to the
new thread's base priority (td_base_pri) to avoid bogusly inheriting a
borrowed priority from the parent thread.
MFC after: 2 weeks
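The last item above can be illustrated with a minimal user-space model.
struct demo_thread and demo_fork_thread() are hypothetical; only the
field names td_priority and td_base_pri come from the text, and a plain
structure copy stands in for the bcopy() of the thread copy region.

```c
#include <assert.h>

struct demo_thread {
	int td_priority;	/* effective priority, may be borrowed */
	int td_base_pri;	/* the thread's own base priority */
};

/*
 * Create a child from a parent.  The copy carries the parent's
 * effective priority along, so it is reset to the child's base
 * priority rather than left at a value the parent merely borrowed.
 */
static void
demo_fork_thread(const struct demo_thread *parent,
    struct demo_thread *child)
{
	*child = *parent;	/* models the bcopy() of the copy region */
	child->td_priority = child->td_base_pri;
}
```

If the parent is running at a borrowed priority at the time of the fork,
the child still starts at its base priority.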