When sendfile_getobj() is called on a DTYPE_SHM file, it never
initializes error, which is eventually returned to the caller.
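
The bug class can be sketched in userland C. This is an illustration
only: getobj_sketch() and the obj_kind values are hypothetical and do
not reproduce the kernel function. It assumes the fix is to start
error at 0 so that a branch with nothing to report returns success.

```c
#include <errno.h>

/* Hypothetical object kinds, standing in for the kernel's file types. */
enum obj_kind { KIND_VNODE, KIND_SHM, KIND_OTHER };

/*
 * Sketch of the fixed pattern: error starts at 0, so a branch that
 * never assigns it (here KIND_SHM) returns success instead of an
 * indeterminate stack value.
 */
int
getobj_sketch(enum obj_kind kind, int vnode_ok)
{
	int error = 0;	/* the initialization the buggy pattern lacks */

	switch (kind) {
	case KIND_VNODE:
		if (!vnode_ok)
			error = EINVAL;
		break;
	case KIND_SHM:
		/* No failure path here; error must already be 0. */
		break;
	default:
		error = EINVAL;
		break;
	}
	return (error);
}
```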
Differential Revision: https://reviews.freebsd.org/D1989
Reviewed by: kib
Reported by: Brainy Code Scanner, by Maxime Villard.
currently a spin lock. Apparently, the only reason for this is that
umtx_thread_exit() is called under the process spinlock, which put the
requirement on the umtx_lock. Note that the witness static order list
is wrong for the umtx_lock; umtx_lock is explicitly before any thread
lock, so it is also before sleepq locks.
Change umtx_lock to be a sleepable mutex. For the reason above, the
calls to umtx_thread_exit() are moved from thread_exit() earlier in
each caller, when the process spin lock is not yet taken.
Discussed with: jhb
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
consistently. This also matches the per-cpu pointer declaration
anyway.
This changes the tweak we give to the load from -32..31 to be 0..31
which seems more in line with the rest of the code (- rnd and the -=
64). It should also provide the randomness we need, and may fix a
signedness bug in the old code (it isn't clear that the effect was
intentional as opposed to sloppy, and the right shift of a negative
signed value is implementation-defined to boot).
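
A minimal sketch of producing a tweak in 0..31 from an unsigned
random value, which avoids shifting a signed quantity altogether.
sketch_random() is a hypothetical xorshift generator used only for
illustration; it is not the scheduler's own generator, and
load_tweak() is likewise an invented name.

```c
#include <stdint.h>

/* Hypothetical xorshift generator; NOT the kernel's sched_random(). */
static uint32_t
sketch_random(uint32_t *state)
{
	uint32_t x = *state;	/* state must be seeded non-zero */

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return (x);
}

/* Reduce an unsigned random value to a tweak in the range 0..31. */
uint32_t
load_tweak(uint32_t *state)
{
	return (sketch_random(state) % 32);
}
```

Because every operand is unsigned, the result cannot go negative and
no signed shift is involved.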
This restores sched_balance() behavior when it used random().
Differential Revision: https://reviews.freebsd.org/D1981
prevent errors from yanking devices out from under filesystems. Only
care about special vnodes on devfs; special nodes on other kinds of
filesystems do not have special properties.
Sponsored by: EMC / Isilon Storage Division
Submitted by: Conrad Meyer
MFC after: 1 week
jail's creation parameters. This allows the kernel version to be reliably
spoofed within the jail whether examined directly with sysctl or
indirectly with the uname -r and -K options.
The values can only be set at jail creation time, to eliminate the need
for any locking when accessing the values via sysctl.
The overridden values are inherited by nested jails (unless the config for
the nested jails also overrides the values).
There is no sanity or range checking, other than disallowing an empty
release string or a zero release date, by design. The system
administrator is trusted to set sane values. Setting values that are
newer than the actual running kernel will likely cause compatibility
problems.
Differential Revision: https://reviews.freebsd.org/D1948
Relnotes: yes
we need randomness in ULE. This removes random() call from the
rebalance interval code.
Submitted by: Harrison Grundy
Differential Revision: https://reviews.freebsd.org/D1968
to its previous, unowned state. This avoids compounding an existing
problem of inconsistent ownership.
Submitted by: Eric van Gyzen <eric_van_gyzen@dell.com>
Obtained from: Dell Inc.
PR: 198914
MFC after: 1 week
is empty, look up the umtx_pi and disown it if the current thread owns it.
This can happen if a signal or timeout removed the last waiter from
the queue, but there is still a thread in do_lock_pi() holding a reference
on the umtx_pi. The unlocking thread might not own the umtx_pi in this case,
but if it does, it must disown it to keep the ownership consistent between
the umtx_pi and the umutex.
Submitted by: Eric van Gyzen <eric_van_gyzen@dell.com>
with advice from: Elliott Rabe and Jim Muchow, also at Dell Inc.
Obtained from: Dell Inc.
PR: 198914
message. This can happen when an application is sending packets too big
for the path MTU and recvmsg() will return zero (indicating no data)
but there will be a cmsghdr with cmsg_type set to IPV6_PATHMTU.
Remove the KASSERT(), which dereferences a NULL pointer in that case.
Also call m_freem() only when m isn't NULL.
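
From the application side, a zero-length recvmsg() result can still
carry control data. A minimal sketch of scanning for the IPV6_PATHMTU
record follows; find_pathmtu() is a hypothetical helper name, and the
code assumes the RFC 3542 struct ip6_mtuinfo layout from
<netinet/in.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes struct ip6_mtuinfo on glibc */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Scan the control data of a received message for an IPV6_PATHMTU
 * record.  Returns 1 and stores the MTU if one is found, else 0.
 */
int
find_pathmtu(struct msghdr *msg, uint32_t *mtu)
{
	struct cmsghdr *cm;
	struct ip6_mtuinfo info;

	assert(msg != NULL && mtu != NULL);
	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level != IPPROTO_IPV6 ||
		    cm->cmsg_type != IPV6_PATHMTU)
			continue;
		if (cm->cmsg_len < CMSG_LEN(sizeof(info)))
			continue;	/* too short to hold the record */
		memcpy(&info, CMSG_DATA(cm), sizeof(info));
		*mtu = info.ip6m_mtu;
		return (1);
	}
	return (0);
}
```

A caller would run this after recvmsg() returns, whether or not any
payload bytes arrived.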
PR: 197882
MFC after: 1 week
Sponsored by: Yandex LLC
Introduce fget_fcntl which performs appropriate checks when needed.
This removes a branch from fget_unlocked.
Introduce fget_mmap dealing with cap_rights_to_vmprot conversion.
This removes a branch from _fget.
Modify fget_unlocked to pass sequence counter to interested callers so
that they can perform their own checks and make sure the result was
obtained from stable & current state.
Reviewed by: silence on -hackers
instead of preprocessor macros.
This will make debugger output of 'print *m' exactly match the names
we use in code, making life of a kernel hacker way more pleasant. This
also allows renaming struct_m_ext back to m_ext.
STAILQs and SLISTs using the same structure field as good old m_next
and m_nextpkt linkage occupy.
New code is encouraged to use queue(3) macros, instead of reinventing
the wheel. However, better not to have a mixture of old style and
queue(3) in one file or subsystem.
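
As a rough illustration of the queue(3) style, here is a userland
sketch of a packet queue built on STAILQ. struct pkt and the pktq_*
helpers are hypothetical; they are not struct mbuf and do not use its
fields.

```c
#include <sys/queue.h>
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet record; it is NOT struct mbuf. */
struct pkt {
	int			len;
	STAILQ_ENTRY(pkt)	link;	/* plays the role of a next-packet pointer */
};

STAILQ_HEAD(pktq, pkt);

/* Append a packet to the tail of the queue. */
void
pktq_enqueue(struct pktq *q, struct pkt *p)
{
	assert(q != NULL && p != NULL);
	STAILQ_INSERT_TAIL(q, p, link);
}

/* Remove and return the head packet, or NULL if the queue is empty. */
struct pkt *
pktq_dequeue(struct pktq *q)
{
	struct pkt *p = STAILQ_FIRST(q);

	if (p != NULL)
		STAILQ_REMOVE_HEAD(q, link);
	return (p);
}

/* Sum the lengths of all queued packets. */
int
pktq_bytes(struct pktq *q)
{
	struct pkt *p;
	int total = 0;

	STAILQ_FOREACH(p, q, link)
		total += p->len;
	return (total);
}
```

The queue head must be set up with STAILQ_INIT() (or
STAILQ_HEAD_INITIALIZER) before the first enqueue.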
Reviewed by: rwatson, rrs, rpaulo
Differential Revision: D1499
This is a more generic version of taskqueue_start_threads_pinned()
which only supports a single cpuid.
This originally came from John Baldwin <jhb@> who implemented it
as part of a push towards NUMA awareness in drivers. I started implementing
something similar for RSS and NUMA, then found he already did it.
I'd like to axe taskqueue_start_threads_pinned() so it doesn't become
part of a longer-term API. (Read: hps@ wants to MFC things, and
if I don't do this soon, he'll MFC what's here. :-)
I have a follow-up commit which converts the intel drivers over
to using the cpuset version of this function, so we can eventually
nuke the pinned version.
Tested:
* igb, ixgbe
Obtained from: jhbbsd
children. Handle the situation instead of asserting that it is
impossible.
Reported and tested by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
It is safe to move the call to socantsendmore_locked() after
sbdrop_locked() as long as we hold the sockbuf lock across the two
calls.
CR: D1805
Reviewed by: adrian, kmacy, julian, rwatson
includes the shared page allowing debuggers to use the signal trampoline
code to identify signal frames in core dumps.
Differential Revision: https://reviews.freebsd.org/D1828
Reviewed by: alc, kib
MFC after: 1 week
- vfs.recycles counts the number of vnodes forcefully recycled to avoid
exceeding kern.maxvnodes.
- vfs.vnodes_created counts the number of vnodes created by successful
calls to getnewvnode().
Differential Revision: https://reviews.freebsd.org/D1671
Reviewed by: kib
MFC after: 1 week
code in my last commit. The cc_exec_next is used to track the next
when a direct call is being made from callout. It is *never* used
in the indirect method. When macro-izing I made it so that it
would separate out direct/vs/non-direct. This is incorrect and can
cause panics as Peter Holm has found for me (Thanks so much Peter for
all your help in this). What this change does is restore that behavior
but also get rid of the cc_next from the array and instead make it
be part of the base callout structure. This way no one else will get
confused since we will never use it for non-direct.
Reviewed by: Peter Holm and more importantly tested by him ;-)
MFC after: 3 days
Sponsored by: Netflix Inc.
unmount, which causes error from nmount(2) call when performing
MNT_DELEXPORT over the directory which ceased to be a mount point.
The race is legitimate and innocent, but results in a chatty mountd.
Silence it by providing a distinguished error code for the situation,
and ignoring the error in the mountd loop.
Based on the patch by: Andreas Longwitz <longwitz@incore.de>
Prodded and tested by: bdrewery
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
This change implements a notification (via devctl) to userland when
the kernel produces coredumps after a process has crashed.
devd can then run a specific command to produce a human readable crash
report. The command is most usually a helper that runs gdb/lldb
commands on the file/coredump pair. It's possible to use this
functionality for implementing automatic generation of crash reports.
devd(8) will be notified of the full path of the binary that crashed and
the full path of the coredump file.
is being done in the callout code and harmonizes the macro
use:
1) The callout_active() will lie. Basically if a migration
is occurring and the callout is about to expire and the
migration has been deferred, the callout_active will no
longer return true until after the migration. This confuses
and breaks callers that are doing callout_init(&c, 1); such
as TCP.
2) The migration code had a bug in it where when migrating, if
two calls to callout_reset came in and they both collided with
the callout on the wheel about to run, then the second call to
callout_reset would corrupt the list the callout wheel uses
putting the callout thread into an endless loop.
3) Per imp, I have fixed all the macro occurrences in the code that
were for the most part being ignored.
Phabricator D1711 and looked at by lstewart and jhb and sbruno.
Reviewed by: kostikbel, imp, adrian, hselasky
MFC after: 3 days
Sponsored by: Netflix Inc.
allows the user to request administrative changes to individual devices
such as attaching or detaching drivers or disabling and re-enabling devices.
- Add a new /dev/devctl2 character device which uses ioctls for device
requests. The ioctls use a common 'struct devreq' which is somewhat
similar to 'struct ifreq'.
- The ioctls identify the device to operate on via a string. This
string can either be the device's name, or it can be a bus-specific
address. (For unattached devices, a bus address is the only way to
locate a device.) Bus drivers register an eventhandler to claim
unrecognized device names that the driver recognizes as a valid address.
Two buses currently support addresses: ACPI recognizes any device
in the ACPI namespace via its full path starting with "\" and
the PCI bus driver recognizes an address specification of
'pci[<domain>:]<bus>:<slot>:<func>' (identical to the PCI selector
strings supported by pciconf).
- To make it easier to cut and paste, change the PnP location string
in the PCI bus driver to output a full PCI selector string rather
than 'slot=<slot> function=<func>'.
- Add a devctl(3) interface in libdevctl which provides a wrapper around
the ioctls and is the preferred interface for other userland code.
- Add a devctl(8) program which is a simple wrapper around the requests
supported by devctl(3).
- Add a device_is_suspended() function to check DF_SUSPENDED.
- Add a resource_unset_value() function that can be used to remove a
hint from the kernel environment. This is used to clear a
hint.<driver>.<unit>.disabled hint when re-enabling a boot-time
disabled device.
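
The PCI address format 'pci[<domain>:]<bus>:<slot>:<func>' mentioned
in the list above can be parsed roughly as follows. This is a userland
sketch only: struct pci_sel_sketch and parse_pci_selector() are
hypothetical names, range checks are omitted, and it assumes the
domain defaults to 0 when left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the parsed selector fields. */
struct pci_sel_sketch {
	unsigned domain, bus, slot, func;
};

/*
 * Parse "pci[<domain>:]<bus>:<slot>:<func>".  Returns 0 on success,
 * -1 if the string does not have that shape.  Range checks are omitted.
 */
int
parse_pci_selector(const char *s, struct pci_sel_sketch *out)
{
	unsigned v[4];
	int used;

	assert(s != NULL && out != NULL);
	if (strncmp(s, "pci", 3) != 0)
		return (-1);
	s += 3;

	/* Try the four-field form first; %n records how much was consumed. */
	used = 0;
	if (sscanf(s, "%u:%u:%u:%u%n", &v[0], &v[1], &v[2], &v[3],
	    &used) == 4 && used > 0 && s[used] == '\0') {
		out->domain = v[0];
		out->bus = v[1];
		out->slot = v[2];
		out->func = v[3];
		return (0);
	}

	/* Fall back to the three-field form with an implied domain of 0. */
	used = 0;
	if (sscanf(s, "%u:%u:%u%n", &v[0], &v[1], &v[2], &used) == 3 &&
	    used > 0 && s[used] == '\0') {
		out->domain = 0;
		out->bus = v[0];
		out->slot = v[1];
		out->func = v[2];
		return (0);
	}
	return (-1);
}
```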
Reviewed by: imp (parts)
Requested by: imp (changing PCI location string)
Relnotes: yes
flag value is already exposed via dv_flags, just not the meaning of the
flags themselves. Use these constants to annotate devices that are
disabled or suspended in devinfo output.
in kernel config files.
Put VERBOSE_SYSINIT in its own option header so the one file,
init_main.c, can use it instead of requiring an entire kernel recompile
to change one file.
chances of finding problems related to wraparound sooner.
This comes from P4 change 167856 on 2009/08/26 around when we had problems
with the TCP stack with ticks after 24 days of uptime.
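
For context on the 24 days: assuming a 32-bit signed tick counter and
hz = 1000 (an assumption, not stated above), the counter passes 2^31
after 2^31 / 1000 / 86400, roughly 24.86 days. A common wrap-safe way
to order two tick values is to compare their difference rather than
the values themselves. ticks_before() below is a hypothetical helper
sketching that idiom.

```c
/*
 * True if tick value a is earlier than b, even when the counter has
 * wrapped between them.  The subtraction is done in unsigned
 * arithmetic, which wraps by definition; the conversion to int is
 * implementation-defined but yields the two's complement value on
 * common platforms.  Valid while the two values are less than 2^31
 * ticks apart.
 */
int
ticks_before(unsigned a, unsigned b)
{
	return ((int)(a - b) < 0);
}
```

A plain "a < b" comparison gives the wrong answer for a value taken
just before the wrap and one taken just after it.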