instead of preprocessor macros.
This will make debugger output of 'print *m' exactly match the names
we use in code, making life of a kernel hacker way more pleasant. And
this also allows to rename struct_m_ext back to m_ext.
STAILQs and SLISTs using the same structure field as good old m_next
and m_nextpkt linkage occupy.
New code is encouraged to use queue(3) macros, instead of implementing
the wheel. However, better not to have a mixture of old style and
queue(3) in one file or subsystem.
Reviewed by: rwatson, rrs, rpaulo
Differential Revision: D1499
This is a more generic version of taskqueue_start_threads_pinned()
which only supports a single cpuid.
This originally came from John Baldwin <jhb@> who implemented it
as part of a push towards NUMA awareness in drivers. I started implementing
something similar for RSS and NUMA, then found he already did it.
I'd like to axe taskqueue_start_threads_pinned() so it doesn't become
part of a longer-term API. (Read: hps@ wants to MFC things, and
if I don't do this soon, he'll MFC what's here. :-)
I have a follow-up commit which converts the intel drivers over
to using the cpuset version of this function, so we can eventually
nuke the the pinned version.
Tested:
* igb, ixgbe
Obtained from: jhbbsd
children. Handle the situation instead asserting that it is
impossible.
Reported and tested by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
It is safe to move the call to socantsendmore_locked() after
sbdrop_locked() as long as we hold the sockbuf lock across the two
calls.
CR: D1805
Reviewed by: adrian, kmacy, julian, rwatson
includes the shared page allowing debuggers to use the signal trampoline
code to identify signal frames in core dumps.
Differential Revision: https://reviews.freebsd.org/D1828
Reviewed by: alc, kib
MFC after: 1 week
- vfs.recycles counts the number of vnodes forcefully recycled to avoid
exceeding kern.maxvnodes.
- vfs.vnodes_created counts the number of vnodes created by successful
calls to getnewvnode().
Differential Revision: https://reviews.freebsd.org/D1671
Reviewed by: kib
MFC after: 1 week
code in my last commit. The cc_exec_next is used to track the next
when a direct call is being made from callout. It is *never* used
in the in-direct method. When macro-izing I made it so that it
would separate out direct/vs/non-direct. This is incorrect and can
cause panics as Peter Holm has found for me (Thanks so much Peter for
all your help in this). What this change does is restore that behavior
but also get rid of the cc_next from the array and instead make it
be part of the base callout structure. This way no one else will get
confused since we will never use it for non-direct.
Reviewed by: Peter Holm and more importantly tested by him ;-)
MFC after: 3 days.
Sponsored by: Netflix Inc.
unmount, which causes error from nmount(2) call when performing
MNT_DELEXPORT over the directory which ceased to be a mount point.
The race is legitimate and innocent, but results in the chatty mountd.
Silence it by providing an distinguished error code for the situation,
and ignoring the error in mountd loop.
Based on the patch by: Andreas Longwitz <longwitz@incore.de>
Prodded and tested by: bdrewery
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
This change implements a notification (via devctl) to userland when
the kernel produces coredumps after a process has crashed.
devd can then run a specific command to produce a human readable crash
report. The command is most usually a helper that runs gdb/lldb
commands on the file/coredump pair. It's possible to use this
functionality for implementing automatic generation of crash reports.
devd(8) will be notified of the full path of the binary that crashed and
the full path of the coredump file.
is being done in the callout code and harmonizes the macro
use.:
1) The callout_active() will lie. Basically if a migration
is occuring and the callout is about to expire and the
migration has been deferred, the callout_active will no
longer return true until after the migration. This confuses
and breaks callers that are doing callout_init(&c, 1); such
as TCP.
2) The migration code had a bug in it where when migrating, if
a two calls to callout_reset came in and they both collided with
the callout on the wheel about to run, then the second call to
callout_reset would corrupt the list the callout wheel uses
putting the callout thread into a endless loop.
3) Per imp, I have fixed all the macro occurance in the code that
were for the most part being ignored.
Phabricator D1711 and looked at by lstewart and jhb and sbruno.
Reviewed by: kostikbel, imp, adrian, hselasky
MFC after: 3 days
Sponsored by: Netflix Inc.
allows the user to request administrative changes to individual devices
such as attach or detaching drivers or disabling and re-enabling devices.
- Add a new /dev/devctl2 character device which uses ioctls for device
requests. The ioctls use a common 'struct devreq' which is somewhat
similar to 'struct ifreq'.
- The ioctls identify the device to operate on via a string. This
string can either by the device's name, or it can be a bus-specific
address. (For unattached devices, a bus address is the only way to
locate a device.) Bus drivers register an eventhandler to claim
unrecognized device names that the driver recognizes as a valid address.
Two buses currently support addresses: ACPI recognizes any device
in the ACPI namespace via its full path starting with "\" and
the PCI bus driver recognizes an address specification of
'pci[<domain>:]<bus>:<slot>:<func>' (identical to the PCI selector
strings supported by pciconf).
- To make it easier to cut and paste, change the PnP location string
in the PCI bus driver to output a full PCI selector string rather
than 'slot=<slot> function=<func>'.
- Add a devctl(3) interface in libdevctl which provides a wrapper around
the ioctls and is the preferred interface for other userland code.
- Add a devctl(8) program which is a simple wrapper around the requests
supported by devctl(3).
- Add a device_is_suspended() function to check DF_SUSPENDED.
- Add a resource_unset_value() function that can be used to remove a
hint from the kernel environment. This is used to clear a
hint.<driver>.<unit>.disabled hint when re-enabling a boot-time
disabled device.
Reviewed by: imp (parts)
Requested by: imp (changing PCI location string)
Relnotes: yes
flag value is already exposed via dv_flags, just not the meaning of the
flags themselves. Use these constants to annotate devices that are
disabled or suspended in devinfo output.
in kernel config files..
put VERBOSE_SYSINIT in it's own option header so the one file,
init_main.c, can use it instead of requiring an entire kernel recompile
to change one file..
chances of finding problems related to wraparound sooner.
This comes from P4 change 167856 on 2009/08/26 around when we had problems
with the TCP stack with ticks after 24 days of uptime.
before pipeclose() is called, since for !PIPE_NAMED case, when peer is
already closed, the pipe pair memory is freed.
Submitted by: luke.tw@gmail.com
PR: 197246
Tested by: pho
MFC after: 3 days
subverted by userspace into cycle. Both umtx_propagate_priority() and
umtx_repropagate_priority() would then loop infinitely, owning the
spinlock.
Check for the cycle using standard Floyd' algorithm before doing the
pass in the affected functions. Add simple check for condition of
tricking the thread into a wait for itself, which could be easily
simulated by usermode without race.
Found by: Eric van Gyzen <eric@vangyzen.net>
In collaboration with: Eric van Gyzen <eric@vangyzen.net>
Tested by: pho
MFC after: 1 week
being held before sleeping.
This has bitten me (in ath(4)) once before and I'd like to see this
not bite anyone else.
Differential Revision: D1638
Reviewed by: jhb, hselasky
MFC after: 1 week
The core kernel part is patch file utimes.2008.4.diff from
pluknet@FreeBSD.org. I updated the code for API changes, added the manual
page and added compatibility code for old kernels. There is also audit and
Capsicum support.
A new UTIME_* constant might allow setting birthtimes in future.
Differential Revision: https://reviews.freebsd.org/D1426
Submitted by: pluknet (partially)
Reviewed by: delphij, pluknet, rwatson
Relnotes: yes
in bitfield argument is wrong, as it will be treated as bit 10, causing any
code printing >=10 bits with bit 10 on as having a trailing comma.
Newline (intended one) should be part of the format string (already present
in the examples).
Also fix grammar and kill EOL whitespace in comment while here.
PR: 195005
Approved by: bdrewery
FreeBSD developers need more time to review patches in the surrounding
areas like the TCP stack which are using MPSAFE callouts to restore
distribution of callouts on multiple CPUs.
Bump the __FreeBSD_version instead of reverting it.
Suggested by: kmacy, adrian, glebius and kib
Differential Revision: https://reviews.freebsd.org/D1438
We obtain a stable copy and store it in local 'fde' variable. Storing another
copy (based on aforementioned variable) does not serve any purpose.
No functional changes.
The only potential in-tree consumer (_fdrop) special-cased it and returns 0
0 on its own instead of calling badfo_close.
Remove the special case since it is not needed and very unlikely to encounter
anyway.
No objections from: kib
Prior to this change CLOCK_MONOTONIC could go backwards when the timecounter
hardware was changed via 'sysctl kern.timecounter.hardware'. This happened
because the vdso timehands update was missing the special treatment in
tc_windup() when changing timecounters.
Reviewed by: kib
in r277199. Acquire the neccessary reference in delist_dev_locked()
and inform destroy_devl() about it using CDP_UNREF_DTR flag.
Fix some style nits, add asserts.
Discussed with: hselasky
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
attachment to the process. Note that the command is not intended to
be a security measure, rather it is an obfuscation feature,
implemented for parity with other operating systems.
Discussed with: jilles, rwatson
Man page fixes by: rwatson
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
== SIG_DFL or SIG_IGN. Sloppy code does not fully initialize struct
sigaction for such cases, and being too demanding in the case of
default handler does not catch anything.
Reported and tested by: Alex Tutubalin <lexa@lexa.ru>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- Close a migration race where callout_reset() failed to set the
CALLOUT_ACTIVE flag.
- Callout callback functions are now allowed to be protected by
spinlocks.
- Switching the callout CPU number cannot always be done on a
per-callout basis. See the updated timeout(9) manual page for more
information.
- The timeout(9) manual page has been updated to reflect how all the
functions inside the callout API are working. The manual page has
been made function oriented to make it easier to deduce how each of
the functions making up the callout API are working without having
to first read the whole manual page. Group all functions into a
handful of sections which should give a quick top-level overview
when the different functions should be used.
- The CALLOUT_SHAREDLOCK flag and its functionality has been removed
to reduce the complexity in the callout code and to avoid problems
about atomically stopping callouts via callout_stop(). If someone
needs it, it can be re-added. From my quick grep there are no
CALLOUT_SHAREDLOCK clients in the kernel.
- A new callout API function named "callout_drain_async()" has been
added. See the updated timeout(9) manual page for a complete
description.
- Update the callout clients in the "kern/" folder to use the callout
API properly, like cv_timedwait(). Previously there was some custom
sleepqueue code in the callout subsystem, which has been removed,
because we now allow callouts to be protected by spinlocks. This
allows us to tear down the callout like done with regular mutexes,
and a "td_slpmutex" has been added to "struct thread" to atomically
teardown the "td_slpcallout". Further the "TDF_TIMOFAIL" and
"SWT_SLEEPQTIMO" states can now be completely removed. Currently
they are marked as available and will be cleaned up in a follow up
commit.
- Bump the __FreeBSD_version to indicate kernel modules need
recompilation.
- There has been several reports that this patch "seems to squash a
serious bug leading to a callout timeout and panic".
Kernel build testing: all architectures were built
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D1438
Sponsored by: Mellanox Technologies
Reviewed by: jhb, adrian, sbruno and emaste
more generally make it easier to extend 'struct mbuf in the future', make
a number of changes to the data structure:
- As we anticipate embedding mbufs headers within variable-size regions of
memory in the future, change the definitions of byte arrays embedded in
mbufs to be of size [0] rather than [MLEN] and [MHLEN]. In fact, the
cxgbe driver already uses 'struct mbuf' on the front of other storage
sizes, but we would like the global mbuf allocator do be able to do this
as well.
- Fold 'struct m_hdr' into 'struct mbuf' itself, eliminating a set of
macros that aliased 'mh_foo' field names to 'm_foo' names such as
'm_next'. These present a particular problem as we would like to add
new mbuf-header fields -- e.g., 'm_size' -- that, if similarly named via
macros, would introduce collisions with many other variable names in the
kernel.
- Rename 'struct m_ext' to 'struct struct_m_ext' so that we can add
compile-time assertions without bumping into the still-extant 'm_ext'
macro.
- Remove the MSIZE compile-time assertion for 'struct mbuf', but add new
assertions for alignment of embedded data arrays (64-bit alignment even
on 32-bit platforms), and for the sizes the mbuf header, packet header,
and m_ext structure.
- Document that these assertions exist in comments in mbuf.h.
This change is not intended to cause (non-trivial) behavioural
differences, but is a precursor to further mbuf-allocator work.
Differential Revision: https://reviews.freebsd.org/D1483
Reviewed by: bz, gnn, np, glebius ("go ahead, I trust you")
Sponsored by: EMC / Isilon Storage Division
"delist_dev()" function. Make sure the character device structure
doesn't go away until the end of the "destroy_dev()" function due to
concurrently running cleanup code inside "devfs_populate()".
MFC after: 1 week
Reported by: dchagin@