This function permits a range of one scatter/gather list to be appended to
another sglist. This can be used to construct a scatter/gather list that
reorders or duplicates ranges from one or more existing scatter/gather
lists.
Sponsored by: Chelsio Communications
Previously, when the VI_TRYLOCK failed, we would spin under the mutex
that protects the vnode active list until we either succeeded or
noticed that we had hogged the CPU. Since we were violating the lock
order, this would guarantee that we would become a hog under any
deadlock condition (e.g. a race with vdrop(9) on the same vnode). In
the presence of many concurrent threads in sync(2) or vdrop etc, the
victim could hang for a long time.
Now, avoid spinning by dropping and reacquiring the locks in the
conventional lock order when the trylock fails. This requires a dance
with the vnode hold count.
Submitted by: Tom Rix <trix@juniper.net>
Tested by: pho
Differential revision: https://reviews.freebsd.org/D10692
is blocked. The spurious wakeup might result in spurious EINTR.
The reschedule_signals() function is called when the calling thread
has the signal mask changed. For each newly blocked signal, we try to
find a thread which might have the signal not blocked. If no such
thread exists, sigtd() returns random thread, which must not be waken
up. I decided that re-checking, as suggested by PR submitter, is more
reasonable change than to change sigtd() interface, due to other uses
of sigtd(). signotify() already performs this check.
Submitted by: Duane <parakleta@darkreality.org>
PR: 219228
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
When a thread enters ptracestop(), for example because it had received
SIGSTOP from ptrace(PT_ATTACH), it attempts to suspend other threads in
the same process. In the case of a thread sleeping interruptibly in an
SBDRY section, sig_suspend_threads() must wake the thread and allow it to
reach the user-mode boundary. However, sig_suspend_threads() would
erroneously avoid waking up such threads, resulting in an apparent hang.
Reviewed by: kib
Tested by: pho
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
the code auto-generated for *.m - kobj_lookup_method(9) is useful;
for example in back-ends or base class device drivers in order to
determine whether a default method has been overridden. Thus, allow
for the kobj_method_t pointer argument - used by KOBJOPLOOKUP in
order to update the cache entry - of kobj_lookup_method(9), to be
NULL. Actually, that pointer is redundant as it's just set to the
same kobj_method_t that the kobj_lookup_method(9) function returns
in the first place, but probably it serves to reduce the number of
instructions generated for KOBJOPLOOKUP.
- For the same reason, move updating kobj_lookup_{hits,misses} (if
KOBJ_STATS is defined) from kobj_lookup_method(9) to KOBJOPLOOKUP.
As a side-effect, this gets rid of the convoluted approach of always
incrementing kobj_lookup_hits in KOBJOPLOOKUP and then in case of
a cache miss, decrementing it in kobj_lookup_method(9) again.
The previous misuse of sys_sigqueue() was sending random register or
stack garbage to 64-bit targets. The freebsd32 implementation preserves
the sival_int member of value when signaling a 64-bit process.
Document the mixed ABI implementation of union sigval and the
incompability of sival_ptr with pointer integrity schemes.
Reviewed by: kib, wblock
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10605
Add IRQ placement-only and ithread-only API variants. intr_event_bind
has been extended with sibling methods, as it has many more callsites in
existing code.
Reviewed by: kib@, adrian@ (earlier version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D10586
Some notes:
- Only i386 and amd64 layouts are checked, other Tier-1 (or close to
it) architectures would benefit from the same check.
- Unconditional enabling of the asserts depend on the stability of locks
memory layout. If locks are optimized to avoid bloat when some debugging
or profiling features turned off, it makes sense to only assert layout
for production configs.
Reviewed by: badger, emaste, jhb, vangyzen
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D10526
This check has been redundant since it was introduced in r162554.
Reviewed by: emaste, glebius
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10322
in place. To do per-cpu stats, convert all fields that previously were
maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
before we have set up UMA and we can do counter_u64_alloc(), provide an
early counter mechanism:
o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
so that at early stages of boot, before counters are allocated we already
point to a counter that can be safely written to.
o For sparc64 that required a whole dummy pcpu[MAXCPU] array.
Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.
This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html
Reviewed by: kib, gallatin, marius, lidl
Differential Revision: https://reviews.freebsd.org/D10156
This fixes some panics after disconnecting mounted disks.
Submitted by: imp (slightly different version, which I've then lost)
Reviewed by: kib, imp, mckusick
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D9674
by the shutdown(2) system call. This ability has been lost as part of the svn
revision 285910.
Reviewed by: ed, rwatson, glebius, hiren
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10351
do for streaming sockets.
And do more cleanup in the sbappendaddr_locked_internal() to prevent
leak information from existing mbuf to the one, that will be possible
created later by netgraph.
Suggested by: glebius
Tested by: Irina Liakh <spell at itl ua>
MFC after: 1 week
The arm64 binutils only accepts 0 as an offset to the Load-Acquire Register
instructions where llvm will acceps both 0 and 0x0. The thread switching
code uses these with SCHED_ULE to block waiting for a lock to be released.
As the offset of the data to be loaded is zero this is safe, however it is
useful to keep the offset in the instruction to document what is being
loaded.
To work around this issue in binutils only generate the 0x prefix for
non-zero values.
Reported by: kan
Sponsored by: DARPA, AFRL
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().
Reviewed by: gnn, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10313
This matches the getcwd() definition.
This is technically an ABI change, but that would only effect 64-bit
big-endian platforms that pass arguments on the stack. We have none of
those.
Reviewed by: jhb
Obtained from: CheriABI
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9428
was issued during VM-initiated i/o (pageout), so that the function
does not try to flush or remove pages or wait for the vm object
paging-in-progress counter.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
X-Differential revision: https://reviews.freebsd.org/D10241
Don't zero unused pointer members again.
Per discussion with secteam we are not issuing an advisory for this
issue as we have no current evidence it leaks exploitable information.
Reviewed by: rwatson, glebius, delphij
MFC after: 1 day
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10227
As posix_fadvise() does not lock the vnode argument, don't capture
detailed vnode information for the time being.
Obtained from: TrustedBSD Project
MFC after: 3 weeks
Sponsored by: DARPA, AFRL
This requires minor changes to the audit framework to allow capturing
paths that are not filesystem paths (i.e., will not be canonicalised
relative to the process current working directory and/or filesystem
root).
Obtained from: TrustedBSD Project
MFC after: 3 weeks
Sponsored by: DARPA, AFRL
resulting in a process dumping core in the corefile.
Also extend procstat to view select members of 'struct ptrace_lwpinfo'
from the contents of the note.
Sponsored by: Dell EMC Isilon
map the 'which' argument into a suitable audit event identifier for the
specific operation requested.
Obtained from: TrustedBSD Project
MFC after: 3 weeks
Sponsored by: DARPA, AFRL
that used to work via the bold hack).
Fix the table entry for bright black. Fix spelling of plain black in
nearby table entries (use the macro for black everywhere everywhere).
Fix the currently-unused non-bright color table to not have bright
colors in entries 9-15.
Improve nearby comments. Start converting to the xterm terminology
and default rendering of "bright" instead of "light" for bright
colors.
Syscons wasn't affected by the bug since I optimized it a little by
converting colors 0-15 directly. This also fixes the layering of
the conversion for these colors.
Apply the same optimization to vt (actually the layer above it). This
also moves the conversion 1 closer to the correct layer for colors
0-15.
The optimization of just avoiding 2 calls to a trivial function is worth
about 10% for simple output to the virtual buffer with occasional
rendering. The optimization is so large because the 2 calls are done
on every character, so although there are too many other calls and
other instructions per character, there are only about 10 times as
many. Old versions of syscons were about 10 times faster for simple
output, by using a fast path with about 12 instructions per character.
Rendering to even slow hardware takes relatively little time provided
it is rarely actually done.
crash when the file shrinks. This also fixes sendfile(2) not sending more
data in a case when the file grows, and the request is open-ended or
specifies a size that is greater than old file size.
PR: 217789
Reviewed by: gallatin
MFC after: 10 days
The existing ELF image activator requires the brandinfo to provide such
a string unconditionally, even if the executable format in question
doesn't use this type of branding. Skip matching when it's a null
pointer.
Reviewed by: kib
MFC after: 2 weeks
This is done so that the thread state changes during the switch
are not confused with the thread state changes reported when the thread
spins on a lock.
Here is an example, three consecutive entries for the same thread (from top to
bottom):
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"sleep", attributes: prio:84, wmesg:"-", lockname:"(null)"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"spinning", attributes: lockname:"sched lock 1"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"running", attributes: none
The above trace could leave an impression that the final state of
the thread was "running".
After this change the sleep state will be reported after the "spinning"
and "running" states reported for the sched lock.
Reviewed by: jhb, markj
MFC after: 1 week
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D9961
matches static binaries.
Interpretation of the 'static' there is that the binary must not
specify an interpreter. In particular, shared objects are matched by
the brand if BI_CAN_EXEC_DYN is also set.
This improves precision of the brand matching, which should eliminate
surprises due to brand ordering.
Revert r315701.
Discussed with and tested by: ed (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This will provide a slightly better smoking gun than just stating
"can't remove non-dynamic nodes!" when calling sysctl_ctx_free(9)
and sysctl_remove_{name,oid}(9) with a non-dynamic (likely
static) sysctl.
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Because of integer types, the timeout calculation result was limited to
INT_MAX / (1000 * hz) seconds. For systems with hz=10000, this is only 215
seconds. Perform the calculation with 64-bit math to allow sleeping for the
full INT_MAX / hz interval (215000 seconds on such hz=10000 systems).
Submitted by: Scott Ferris <sferris at isilon.com>
Sponsored by: Dell EMC Isilon
We must ensure there's space for the terminating null in the temporary
buffer in imgact_binmisc_populate_interp().
Note that there's no buffer overflow here because xbe->xbe_interpreter's
length and null termination is checked in imgact_binmisc_add_entry()
before imgact_binmisc_populate_interp() is called. However, the latter
should correctly enforce its own bounds.
Reviewed by: sbruno
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D10042
vm_pindex_t's into a vm_ooffset_t.
The length given to shm_dotruncate() must never be negative. Assert this.
Tidy up a comment.
Reviewed by: kib
MFC after: 1 week
Add a clock_nanosleep() syscall, as specified by POSIX.
Make nanosleep() a wrapper around it.
Attach the clock_nanosleep test from NetBSD. Adjust it for the
FreeBSD behavior of updating rmtp only when interrupted by a signal.
I believe this to be POSIX-compliant, since POSIX mentions the rmtp
parameter only in the paragraph about EINTR. This is also what
Linux does. (NetBSD updates rmtp unconditionally.)
Copy the whole nanosleep.2 man page from NetBSD because it is complete
and closely resembles the POSIX description. Edit, polish, and reword it
a bit, being sure to keep any relevant text from the FreeBSD page.
Reviewed by: kib, ngie, jilles
MFC after: 3 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10020