branded as well as unbranded binaries. This will be required to add
support for the new ELFv2 ABI on powerpc64, which is distinguished from
ELFv1 by the contents of the ELF header's flags field.
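
As a rough illustration of the distinction above: on powerpc64 the ABI
version is carried in the low two bits of the ELF header's e_flags. The
macro and function names below are made up for this sketch and are not
taken from the change itself.

```c
#include <stdint.h>

/* Low two bits of e_flags hold the powerpc64 ABI version (sketch name). */
#define SKETCH_PPC64_ABI_MASK 0x3u

/* Returns 0 (unspecified), 1 (ELFv1) or 2 (ELFv2). */
unsigned int
sketch_ppc64_abi_version(uint32_t e_flags)
{
	return (e_flags & SKETCH_PPC64_ABI_MASK);
}
```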
Reviewed by: imp
MFC after: 2 weeks
- While at it, arrange #ifndefs in kern_dump.c more intelligently; it's
rather confusing to have multiple competing and/or unused functions in
the kernel.
during iteration instead of relocking it for each traversed rule.
Reviewed by: mjg@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4110
new return codes of -1 were mistakenly being considered "true". Callout_stop
now returns -1 to indicate the callout had either already completed or
was not running, and 0 to indicate it could not be stopped. Also update
the manual page so that the callout_stop and callout_reset descriptions
are consistent and no longer refer to "non-zero".
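
A minimal sketch of the pitfall, not the actual kernel code: a caller that
tests the return value for "non-zero" treats -1 as success. The enum and
helper names are hypothetical, and the value 1 for "stopped" is an
assumption, since the text above only names -1 and 0.

```c
#include <stdbool.h>

enum sketch_callout_rc {
	SKETCH_CALLOUT_NOT_PENDING = -1, /* already completed or not running */
	SKETCH_CALLOUT_NOT_STOPPED = 0,  /* could not be stopped */
	SKETCH_CALLOUT_STOPPED = 1       /* assumed: callout was cancelled */
};

/* Buggy pattern: any non-zero value, including -1, counts as "stopped". */
bool
sketch_stopped_buggy(int rc)
{
	return (rc != 0);
}

/* Safer pattern: only a positive value means the callout was stopped. */
bool
sketch_stopped_fixed(int rc)
{
	return (rc > 0);
}
```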
MFC after: 1 Month with associated callout change.
certain kernel structures for use by debuggers. This mostly aids
in examining cores from a kernel without debug symbols as a debugger
can infer these values if debug symbols are available.
One set of variables describes the layout of 'struct linker_file' to
walk the list of loaded kernel modules.
A second set of variables describes the layout of 'struct proc' and
'struct thread' to walk the list of processes in the kernel and the
threads in each process.
The 'pcb_size' variable is used to index into the stoppcbs[] array.
The 'vm_maxuser_address' is used to distinguish kernel virtual addresses
from user addresses. This doesn't have to be perfect, and
'vm_maxuser_address' is a cheap and simple way to differentiate kernel
pointers from simple values like TIDs and PIDs.
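
A debugger-side check along these lines might look like the sketch below.
The function name is hypothetical, and treating 'vm_maxuser_address' as
the first address past user space is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Heuristic only: values at or above the user address limit are taken to
 * be kernel pointers; small values such as TIDs and PIDs fall below it.
 */
bool
sketch_looks_like_kernel_pointer(uint64_t value, uint64_t vm_maxuser_address)
{
	return (value >= vm_maxuser_address);
}
```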
While here, annotate the fields in struct pcb used by kgdb on amd64
and i386 to note that their ABI should be preserved. Annotations for
other platforms will be added in the future.
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D3773
should be used by TCP for sure in its cleanup of the IN-PCB (will be coming shortly).
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D4076
If you attempt to set a pcpu limit that is higher than
100% using rctl (for instance, you want a jail to be
able to use 2 cores on your system so you set pcpu to
200%) the thing you are trying to limit becomes unthrottled.
PR: 189870
Submitted by: dustinwenz@ebureau.com
Reviewed by: trasz
MFC after: 1 week
as otherwise most of the time is spent resolving UIDs to names.
Reviewed by: mjg@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4059
variable during mp_start() which is too late. Move this to mp_setmaxid()
where other architectures set it and move x86 assertions to MI code.
Reviewed by: kib (x86 part)
is non-zero.
- Include the process address in the PROC_ASSERT_HELD() and
PROC_ASSERT_NOT_HELD() assertion messages so that the corresponding
process can be found easily when debugging.
MFC after: 1 week
Use the right intmax_t type instead of intptr_t in a few remaining
places.
Add support for CTLFLAG_TUN for the new fixed-width types. Bruce will be
upset that the new handlers silently truncate tuned quad-sized inputs,
but so do all of the existing handlers.
Add the new types to debug_dump_node, for whatever use that is.
Bump FreeBSD_version again, for good measure. We are changing
SYSCTL_HANDLER_ARGS and a member of struct sysctl_oid to intmax_t.
Correct the sysctl typed NULL values for the fixed-width types. (Hat
tip: hps@.)
Suggested by: hps (partial)
Sponsored by: EMC / Isilon Storage Division
Things seem to get stuck in low memory conditions where no bufs are available:
the reclamation path is called to wake up the daemon, but no sleeping is done.
Because of this, we are stuck in a tight loop in the current process and
never run said reclamation path.
This was introduced in r289279. This is only a temporary workaround
to restore system usefulness until the more permanent solutions can be
found.
Tested:
* Carambola2, 64MB (and 32MB by manual config.)
Add S8, S16, S32, and U32 types; add SYSCTL*() macros for them, as well
as for the existing 64-bit types. (While SYSCTL*QUAD and UQUAD macros
already exist, they do not take the same sort of 'val' parameter that
the other macros do.)
Clean up the documented "types" in the sysctl.9 document. (These are
macros and thus not real types, but the manual page documents intent.)
The sysctl_add_oid(9) arg2 has been bumped from intptr_t to intmax_t to
accommodate 64-bit types on 32-bit pointer architectures.
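
To illustrate why the wider type is needed, here is a small sketch in
which int32_t stands in for intptr_t on an architecture with 32-bit
pointers. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the value would survive being stored in a 32-bit intptr_t. */
bool
sketch_fits_in_intptr32(intmax_t value)
{
	return (value >= INT32_MIN && value <= INT32_MAX);
}
```

A 64-bit sysctl value such as 1 << 40 fails this check, so it cannot be
passed through a 32-bit arg2 without truncation.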
This is just the kernel support piece; the userspace sysctl(1) support
will follow in a later patch.
Submitted by: Ravi Pokala <rpokala@panasas.com>
Reviewed by: cem
Relnotes: no
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D4091
not be found. Otherwise, relocations against such symbols will be silently
ignored instead of causing an error to be raised.
Reviewed by: kib
MFC after: 1 week
whether an error is recoverable. Always re-dirty the buffer on errors
from write requests. The invalidation we used to do for errors not EIO
doesn't need to be done for a device that's really gone, since that's
done in a different path.
Reviewed by: mckusick@, kib@
self-documented, and eases addition of new ops.
For the similar reasons, eliminate UMTX_OP_MAX. nitems() handles the
only use of the symbol.
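
A sketch of the pattern, assuming an op table built with designated
initializers: the table's own size bounds the dispatch, so no separate
MAX constant has to be kept in sync. The op numbers, handlers, and all
names here are hypothetical; the macro body matches nitems() from
<sys/param.h>.

```c
#include <errno.h>
#include <stddef.h>

#define sketch_nitems(x) (sizeof((x)) / sizeof((x)[0]))

/* Hypothetical op numbers, for illustration only. */
enum { SKETCH_OP_LOCK = 0, SKETCH_OP_UNLOCK = 1, SKETCH_OP_WAIT = 2 };

typedef int (*sketch_op_fn)(void);

static int sketch_lock(void) { return (0); }
static int sketch_unlock(void) { return (0); }
static int sketch_wait(void) { return (0); }

/* Designated initializers show which handler serves which op. */
static const sketch_op_fn sketch_op_table[] = {
	[SKETCH_OP_LOCK] = sketch_lock,
	[SKETCH_OP_UNLOCK] = sketch_unlock,
	[SKETCH_OP_WAIT] = sketch_wait,
};

/* Reject ops beyond the table; the bound follows the table automatically. */
int
sketch_dispatch(unsigned int op)
{
	if (op >= sketch_nitems(sketch_op_table))
		return (EINVAL);
	return (sketch_op_table[op]());
}
```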
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
if they are not required for mounting rootfs. However, it's possible
that some setups try to mount them in mountcritlocal (i.e., from fstab).
Export the list of current root mount holds using a new sysctl,
vfs.root_mount_hold, and make mountcritlocal retry if "mount -a" fails
and the list is not empty.
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3709
When destroying a character device the si_devsw field is set to NULL
before all references are gone, to indicate the character device is
going away. This can cause a NULL-dereference fault inside physio().
The callers of physio() should own a thread reference on the cdev and
if si_devsw is seen as non-NULL, it is usable during the execution of
the function. Otherwise, an ENXIO error code is returned.
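
The check can be modelled in userspace roughly as follows. This is a
sketch with stand-in structures and names, not the real physio() code:
si_devsw is read once and the function bails out with ENXIO if it is NULL.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for struct cdevsw and struct cdev. */
struct sketch_cdevsw { int d_placeholder; };
struct sketch_cdev { struct sketch_cdevsw *si_devsw; };

int
sketch_physio(struct sketch_cdev *dev)
{
	/* Read si_devsw once; NULL means the device is going away. */
	struct sketch_cdevsw *csw = dev->si_devsw;

	if (csw == NULL)
		return (ENXIO);
	/* The real function would perform the I/O through csw here. */
	return (0);
}
```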
Reviewed by: kib
MFC after: 2 weeks
is 0. Without this change it was sleeping for one tick. Maybe not a big
deal, but it makes the share/dtrace/blocking script report it.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D3814
Sponsored by: Wheel Systems, http://wheelsystems.com
linux_syscallnames[] from linux_* to linux32_* to avoid conflicts with
linux64.ko. While here, add support for linux64 binaries to systrace.
- Update NOPROTO entries in amd64/linux/syscalls.master to match the
main table to fix systrace build.
- Add a special case for union l_semun arguments to the systrace
generation.
- The systrace_linux32 module now only builds the systrace_linux32.ko
module on amd64.
- Add a new systrace_linux module that builds on both i386 and amd64.
For i386 it builds the existing systrace_linux.ko. For amd64 it
builds a systrace_linux.ko for 64-bit binaries.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D3954
For CloudABI we need to initialize the registers of new threads
differently based on whether the thread got created through a fork or
through simple thread creation.
Add a flag, TDP_FORKING, that is set by do_fork() and cleared by
fork_exit(). This can be tested against in schedtail.
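
The flag's lifetime can be sketched in userspace like this. The bit
value, the structure, and the function names are placeholders rather than
the kernel's definitions.

```c
#include <stdbool.h>

#define SKETCH_TDP_FORKING 0x1 /* hypothetical bit value */

struct sketch_thread { int td_pflags; };

/* Set while the new thread is being created by a fork. */
void
sketch_do_fork(struct sketch_thread *td)
{
	td->td_pflags |= SKETCH_TDP_FORKING;
}

/* Cleared once the forked thread leaves the fork path. */
void
sketch_fork_exit(struct sketch_thread *td)
{
	td->td_pflags &= ~SKETCH_TDP_FORKING;
}

/* What a schedtail hook could test to pick the register setup. */
bool
sketch_created_by_fork(const struct sketch_thread *td)
{
	return ((td->td_pflags & SKETCH_TDP_FORKING) != 0);
}
```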
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D3973
r289660:
Do not allow ptrace(PT_TRACE_ME) to be executed when the process is
already traced.
Do not allow ptrace(PT_TRACE_ME) to be executed when there is no parent
which can trace the process, i.e. when the parent is already init.
Note that after the PT_TRACE_ME request the process is unkillable and
non-continuable until a debugger is attached or the parent is killed; the
latter clears the P_TRACED state. Since init clearly would not debug the
caller, and cannot be killed, disallow creation of unkillable
processes.
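
The two checks can be sketched as below. The structure and function names
are stand-ins, and the specific error codes (EBUSY, EPERM) are
assumptions; the text above only says that the request is refused.

```c
#include <errno.h>
#include <stdbool.h>

struct sketch_proc {
	bool traced;         /* stands in for the P_TRACED state */
	bool parent_is_init; /* true when the parent is init */
};

int
sketch_trace_me(struct sketch_proc *p)
{
	if (p->traced)
		return (EBUSY);  /* already traced */
	if (p->parent_is_init)
		return (EPERM);  /* nobody could ever attach or be killed */
	p->traced = true;
	return (0);
}
```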
Reviewed by: jhb, pho
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D3908
When establishing the locking state for several lock types (including
blockable mutexes and sx) failed, locking primitives try to spin while
the owner thread is running. The spinning loop performs the test for
running condition by dereferencing the owner->td_state field of the
owner thread. If the owner thread exited while spinner was put off
the processor, it is harmless to access reused struct thread owner,
since in some near future the current processor would notice the owner
change and make appropriate progress. But it could be that the page
which carried the freed struct thread was unmapped, then we fault
(this cannot happen on amd64).
For now, disallowing free of the struct thread seems to be good
enough, and tests which create a lot of threads once did not
demonstrate regressions.
Reviewed by: jhb, pho
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D3908
atapicd(4) has been removed since r249083, and if a system has more than one
optical drive, it will likely be /dev/cd1.
Update mount.conf(8) to reflect the change in behavior.
MFC after: never
Sponsored by: EMC / Isilon Storage Division
executable image. Keep one page (arbitrary) limit on the max allowed
size of the PT_NOTES.
The ELF image activators still require that program headers of the
executable are fully contained in the first page of the image file.
Reviewed by: emaste, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D3871
8x performance improvement in a micro benchmark on a 4 socket machine.
- Get buffer headers from a per-cpu uma cache that sits in front of the
free queue.
- Use a per-cpu quantum cache in vmem to eliminate contention for kva.
- Use multiple clean queues according to buffer cache size to eliminate
clean queue lock contention.
- Introduce a bufspace daemon that attempts to prevent getnewbuf() callers
from blocking or doing direct recycling.
- Close some bufspace allocation races that could lead to endless
recycling.
- Further the transition to a more modern style of small functions grouped
by prefix in order to manage growing complexity.
Sponsored by: EMC / Isilon
Reviewed by: kib
Tested by: pho
packets and/or state transitions from each TCP socket. That would help with
narrowing down certain problems we see in the field that are hard to reproduce
without understanding the history of how we got into a certain state. This
change provides just that.
It saves copies of the last N packets in a list in the tcpcb. When the tcpcb is
destroyed, the list is freed. I thought this was likely to be more
performance-friendly than saving copies of the tcpcb. Plus, with the packets,
you should be able to reverse-engineer what happened to the tcpcb.
To enable the feature, you will need to compile a kernel with the TCPPCAP
option. Even then, the feature defaults to being deactivated. You can activate
it by setting a positive value for the number of captured packets. You can do
that on either a global basis or on a per-socket basis (via a setsockopt call).
There is no way to get the packets out of the kernel other than using kmem or
getting a coredump. I thought that would help some of the legal/privacy concerns
regarding such a feature. However, it should be possible to add a future effort
to export them in PCAP format.
I tested this at low scale, and found that there were no mbuf leaks and the peak
mbuf usage appeared to be unchanged with and without the feature.
The main performance concern I can envision is the number of mbufs that would be
used on systems with a large number of sockets. If you save five packets per
direction per socket and have 3,000 sockets, that will consume at least 30,000
mbufs just to keep these packets. I tried to reduce the concerns associated with
this by limiting the number of clusters (not mbufs) that could be used for this
feature. Again, in my testing, that appears to work correctly.
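
The estimate above works out as in the following sketch. The function
name is hypothetical, and it assumes one mbuf per saved packet, which is
why the result is only a lower bound.

```c
/* Lower bound: one mbuf per saved packet, two directions, every socket. */
unsigned long
sketch_min_saved_mbufs(unsigned long packets_per_direction,
    unsigned long sockets)
{
	return (packets_per_direction * 2UL * sockets);
}
```

With five packets per direction and 3,000 sockets this gives
5 * 2 * 3,000 = 30,000 mbufs.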
Differential Revision: D3100
Submitted by: Jonathan Looney <jlooney at juniper dot net>
Reviewed by: gnn, hiren
This removes the need for manually changing this flag for Google Chrome
users. It also improves compatibility with Linux applications running under
the Linuxulator compatibility layer, and possibly also helps in porting software
from Linux.
Generally speaking, the flag allows applications to create the shared memory
segment, attach it, remove it, and then continue to use it and to reattach it
later. This means that the kernel will automatically "clean up" after the
application exits.
It could be argued that it's against POSIX. However, SUSv3 says this
about IPC_RMID: "Remove the shared memory identifier specified by shmid from
the system and destroy the shared memory segment and shmid_ds data structure
associated with it." From my reading, we break it in any case by deferring
removal of the segment until it's detached; we won't break it any more
by also deferring removal of the identifier.
This is the behaviour exhibited by Linux since... probably always, and
also by OpenBSD since the following commit:
revision 1.54
date: 2011/10/27 07:56:28; author: robert; state: Exp; lines: +3 -8;
Allow segments to be used even after they were marked for deletion with
the IPC_RMID flag.
This is permitted as an extension beyond the standards and this is similar
to what other operating systems like linux do.
MFC after: 1 month
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3603
struct thread and kernel stack for the thread. Otherwise, a load
similar to a fork bomb would exhaust KVA and possibly kmem, mostly due
to the struct proc being type-stable.
The nprocs counter is changed from being protected by allproc_lock sx
to be an atomic variable. Note that ddb/db_ps.c:db_ps() use of nprocs
was unsafe before, and is still unsafe, but it seems that the only
possible undesired consequence is the harmless warning printed when
allproc linked list length does not match nprocs.
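
A lock-free counter of this kind can be sketched with C11 atomics as
below. This is a userspace model, not the kernel code: the function name
is hypothetical, and the limit check with rollback is an assumption about
how such a counter would typically be used.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Reserve a process slot without a lock; undo the increment on failure. */
bool
sketch_reserve_proc(atomic_int *nprocs, int maxproc)
{
	int before = atomic_fetch_add(nprocs, 1);

	if (before >= maxproc) {
		atomic_fetch_sub(nprocs, 1);
		return (false);
	}
	return (true);
}
```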
Diagnosed by: Svatopluk Kraus <onwahe@gmail.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week