The NVMe specification does not define a maximum or optimal delete
size, so technically max delete size is min(full size of namespace,
2^32 - 1 LBAs). A single delete operation for a multi-TB NVMe
namespace though may take much longer to complete than the nvme(4)
I/O timeout period. So choose a sensible default here that is still
suitably large to minimize the number of overall delete operations.
This also fixes possible uint32_t overflow on initial TRIM operation
for zpool create operations for NVMe namespaces with >4G LBAs.
MFC after: 3 days
Sponsored by: Intel
This hack will be removed in a few weeks. It is here to fix incremental
builds of SSH between r291941 and r294370.
Reported by: jmallett
MFC after: 1 day
Sponsored by: EMC / Isilon Storage Division
Summary:
Migrate to using the semi-opaque type rman_res_t to specify rman resources. For
now, this is still compatible with u_long.
This is step one in migrating rman to use uintmax_t for resources instead of
u_long.
Going forward, this could feasibly be used to specify architecture-specific
definitions of resource ranges, rather than baking a specific integer type into
the API.
This change has been broken out to facilitate MFC'ing drivers back to 10 without
breaking ABI.
Reviewed By: jhb
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5075
Several attempts to fix this logic was done after r241298, which were
all reverted, yet this change was not.
The .h file does not depend on the .c file, so do not impose such a
dependency on it. They are generated by the same command but do not
depend on each other. Restore the .ORDER which should handle parallel build
issues. This fixes an actual bug where the .h file is not recreated
when missing [1]. For example:
cd lib/libc
make cleanobj
make nsparser.h
rm nsparser.h
make nsparser.h # will not rebuild nsparser.h
I have been trying to track down a build problem where nsparser.h is
missing when nslexer.o is built. It is possible this is related.
Reported by: bde [1]
https://lists.freebsd.org/pipermail/svn-src-all/2012-October/059481.htmlhttps://lists.freebsd.org/pipermail/svn-src-all/2012-October/060038.html
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
DIRDEPS_BUILD does not yet support PROGS having their own dependency
file.
Overriding .MAKE.DEPENDFILE here causes major problems with the meta
mode logic since it creates the Makefile.depend as '.depend' resulting
in infinite loops in make due to dirdeps.mk including .depend endlessly.
X-MFC-With: r294752
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
We have had this user-modifable DEPENDFILE variable forever that does nothing
relevant for the user since fmake always used '.depend'. Bmake
introduced the .MAKE.DEPENDFILE variable that can be modified to change
the name of '.depend'.
Prior to r284288, bsd.progs.mk was setting .MAKE.DEPENDFILE to allow
working incremental builds. This was modified most likely to not
conflict with the META MODE handling of .MAKE.DEPENDFILE as it has a lot
more special logic for that variable.
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
Currently dtrace(1) -Go does not properly rebuild the target if it
exists. It results in missing symbols.
dtrace -C -x nolibs -G -o usdt.o -s /root/git/freebsd/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/json/usdt.d tst.usdt.o
dtrace: target object (usdt.o) already exists. Please remove the target
dtrace: object and rebuild all the source objects if you wish to run the DTrace
dtrace: linking process again
cc -O2 -pipe -O0 -g -I/root/git/freebsd/cddl/usr.sbin/dtrace/tests/common/json -std=gnu99 -fstack-protector-strong -Qunused-arguments -o tst.usdt.exe.full tst.usdt.o usdt.o
tst.usdt.o: In function `main':
/root/git/freebsd/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/json/tst.usdt.c:56: undefined reference to `__dtrace_bunyan_fake___log__debug'
/root/git/freebsd/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/json/tst.usdt.c:60: undefined reference to `__dtrace_bunyan_fake___log__debug'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [tst.usdt.exe.full] Error code 1
This is a consequence of r212358.
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
sent using roundrobin protocol and set a better granularity and distribution
among the interfaces. Tuning the number of packages sent by interface can
increase throughput and reduce unordered packets as well as reduce SACK.
Example of usage:
# ifconfig bge0 up
# ifconfig bge1 up
# ifconfig lagg0 create
# ifconfig lagg0 laggproto roundrobin laggport bge0 laggport bge1 \
192.168.1.1 netmask 255.255.255.0
# ifconfig lagg0 rr_limit 500
Reviewed by: thompsa, glebius, adrian (old patch)
Approved by: bapt (mentor)
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D540
control algorithm options. The argument is variable length and is opaque
to TCP, forwarded directly to the algorithm's ctl_output method.
Provide new includes directory netinet/cc, where algorithm specific
headers can be installed.
The new API doesn't yet have any in tree consumers.
The original code written by lstewart.
Reviewed by: rrs, emax
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D711
This fixes incremental build of OpenSSH after the recent upgrade.
For example, in secure/lib/libssh, -include ssh_namespace.h is used on
all files. This is not tracked in the .depend file though due to
MKDEP_CFLAGS not including it. The ssh example was broken in r291941
when not using FAST_DEPEND due to the .depend bug. FAST_DEPEND was not
affected by this because it generates dependencies at compile time and
thus sees the -include.
This ugly make syntax could be simpler for bmake by using :tW but
fmake-compatible syntax is used since this needs to be MFC'd all the way
to stable/9.
Also add a temporary hack to workaround existing checkouts building
incrementally with a .depend file not having these headers.
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
This was a regression in r290629, which was revealed partly in r294360.
Once 'make depend' has ran it will generate all headers already. Thus
even with FAST_DEPEND lacking proper dependencies before building, it
will not have any missing headers. Once objects are compiled the depend
files will be generated with proper dependencies.
Sponsored by: EMC / Isilon Storage Division
This reworks r289254 and removes ALL_SUBDIR_TARGETS.
Because there is an include guard in this file there is no need for
LOCAL_ or ?= on SUBDIR_TARGETS or STANDALONE_SUBDIR_TARGETS. These can
just be set via src.conf. By the time bsd.subdir.mk is included it will
just append the values to the existing value and work fine. This allows
a consistent way to append to these variables without introducing a
LOCAL_ var for STANDALONE_SUBDIR_TARGETS or renaming the historical
SUBDIR_TARGETS.
Sponsored by: EMC / Isilon Storage Division
If filemon is used then there is no need to generate dependency files during
compilation as the .meta files will achieve the same result.
This is a temporary solution until FAST_DEPEND is default. Once that is
default there will be an option to disable dependency generation entirely
as it is only useful if an incremental build is planned, thus META_MODE+filemon
can enable that option to short-circuit all FAST_DEPEND-related logic.
Sponsored by: EMC / Isilon Storage Division
This API has no in-tree consumers at the moment but is useful to at least
one out-of-tree consumer, and naturally complements existing vnode refcount
functions (vholdl(9), vdropl(9)).
Obtained from: kib (sys/ portion)
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D4947
Differential Revision: https://reviews.freebsd.org/D4953
The .MAKEFLAGS check inside of the .for loop is extremely slow for some
reason. Just moving it out of the loop trimmed -V lookup time from 11
seconds to 1 second in the kernel obj directory.
Sponsored by: EMC / Isilon Storage Division
Some classes of IOAT hardware prefetch reads. DMA operations that
depend on the result of prior DMA operations must use the DMA_FENCE flag
to prevent stale reads.
(E.g., I've hit this personally on Broadwell-EP. The Broadwell-DE has a
different IOAT unit that is documented to not pipeline DMA operations.)
Sponsored by: EMC / Isilon Storage Division
canonical place, and the nit-pickers are welcome to move this
information there with a cross reference.
Differential Review: https://reviews.freebsd.org/D4860
Allow user-specified warning flag overrides for specific files under
bsd.sys.mk, in the same way kern.mk does.
This will to be used by future commits.
MFC after: 2 weeks
X-MFC-With: r293268
Sponsored by: Multiplay
This commit, fix a core dump on ypldap(8) related with memory allocation.
Also an example of how to set the ypldap.conf(5) properly is added to
examples files.
A new user _ypldap is required to be able to run ypldap(8) as well as
in a chroot mode.
Reviewed by: rodrigc (mentor), bjk
Approved by: bapt (mentor)
Relnotes: Yes
Sponsored by: gandi.net
Differential Revision: https://reviews.freebsd.org/D4744
option to invert the polarity in software. Also add an option to capture
very narrow pulses by using the hardware's MSR delta-bit capability of
latching line state changes.
This effectively reverts the mistake I made in r286595 which was based on
empirical measurements made on hardware using TTL-level signaling, in which
the logic levels are inverted from RS-232. Thus, this re-syncs the polarity
with the requirements of RFC 2783, which is writen in terms of RS-232
signaling.
Narrow-pulse mode uses the ability of most ns8250 and similar chips to
provide a delta indication in the modem status register. The hardware is
able to notice and latch the change when the pulse width is shorter than
interrupt latency, which results in the signal no longer being asserted by
time the interrupt service code runs. When running in this mode we get
notified only that "a pulse happened" so the driver synthesizes both an
ASSERT and a CLEAR event (with the same timestamp for each). When the pulse
width is about equal to the interrupt latency the driver may intermittantly
see both edges of the pulse. To prevent generating spurious events, the
driver implements a half-second lockout period after generating an event
before it will generate another.
Differential Revision: https://reviews.freebsd.org/D4477
It is built in libgcc_s.so and libgcc_eh.a to simplify transition.
It is enabled by default on arm64 (where we previously had no other
unwinder) and may be enabled for testing on other platforms by setting
WITH_LLVM_LIBUNWIND in src.conf(5).
Also add compiler-rt's __gcc_personality_v0 implementation for use with
the LLVM unwinder.
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4787
ioat_acquire_reserve() is an extended version of ioat_acquire(). It
allows users to reserve space in the channel for some number of
descriptors. If this succeeds, it guarantees that at least submission
of N valid descriptors will succeed.
Sponsored by: EMC / Isilon Storage Division
Due to FreeBSD system-wide limits on number of MSI-X vectors
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321),
it may be desirable to allocate fewer than the maximum number
of vectors for an NVMe device, in order to save vectors for
other devices (usually Ethernet) that can take better
advantage of them and may be probed after NVMe.
This tunable is expressed in terms of minimum number of CPUs
per I/O queue instead of max number of queues per controller,
to allow for a more even distribution of CPUs per queue. This
avoids cases where some number of CPUs have a dedicated queue,
but other CPUs need to share queues. Ideally the PR referenced
above will eventually be fixed and the mechanism implemented
here becomes obsolete anyways.
While here, fix a bug in the CPUs per I/O queue calculation to
properly account for the admin queue's MSI-X vector.
Reviewed by: gallatin
MFC after: 3 days
Sponsored by: Intel
Immediate problem fixed by the new KPI is the long-standing race
between device creation and assignments to cdev->si_drv1 and
cdev->si_drv2, which allows the window where cdevsw methods might be
called with si_drv1,2 fields not yet set. Devices typically checked
for NULL and returned spurious errors to usermode, and often left some
methods unchecked.
The new function interface is designed to be extensible, which should
allow to add more features to make_dev_s(9) without inventing yet
another name for function to create devices, while maintaining KPI and
even KBI backward-compatibility.
Reviewed by: hps, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D4746
This will ensure that the variable was not set as a make override, in
make.conf, src.conf or src-env.conf. It allows setting the value in
src-env.conf when using WITH_AUTO_OBJ since that case properly handles
changing .OBJDIR (except if MAKEOBJDIRPREFIX does not yet exist which is
being discussed to be changed).
This change allows setting a default MAKEOBJDIRPREFIX via local.sys.env.mk.
Sponsored by: EMC / Isilon Storage Division
for libraries that follow the soft float ABI. It's only supported on
armv6 as a transition to the new hard float ABI, so mark as broken
everywhere else.
For determining the compiler version, quote the string to be echo'd,
otherwise the command might fail. This is because clang -v now results
in the following:
FreeBSD clang version 3.8.0 (trunk 256633) (based on LLVM 3.8.0svn)
The second "3.8.8svn)" string tripped up the shell command.
MFC after: 3 days
otherwise the command might fail. This is because clang -v now results
in the following:
FreeBSD clang version 3.8.0 (trunk 256633) (based on LLVM 3.8.0svn)
The second "3.8.8svn)" string tripped up the shell command.
The mdio driver interface is generally useful for devices that require
MDIO without the full MII bus interface. This lifts the driver/interface
out of etherswitch(4), and adds a mdio(4) man page.
Submitted by: Landon Fuller <landon@landonf.org>
Differential Revision: https://reviews.freebsd.org/D4606
POSIX requires for the c99 compiler.
(In fact, our c99(1) already ignores -lxnet; but our make(1) doesn't set
${CC} correctly, and our cc(1) treats xnet like any other library.)
Reviewed by: kib
While here, explicitly note the requirement that the BAR(s) must be
allocated prior to calling pci_alloc_msix().
Reviewed by: andrew, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D4688
exhausted.
It is possible for a bug in the code (or, theoretically, even unusual
network conditions) to exhaust all possible mbufs or mbuf clusters.
When this occurs, things can grind to a halt fairly quickly. However,
we currently do not call mb_reclaim() unless the entire system is
experiencing a low-memory condition.
While it is best to try to prevent exhaustion of one of the mbuf zones,
it would also be useful to have a mechanism to attempt to recover from
these situations by freeing "expendable" mbufs.
This patch makes two changes:
a) The patch adds a generic API to the UMA zone allocator to set a
function that should be called when an allocation fails because the
zone limit has been reached. Because of the way this function can be
called, it really should do minimal work.
b) The patch uses this API to try to free mbufs when an allocation
fails from one of the mbuf zones because the zone limit has been
reached. The function schedules a callout to run mb_reclaim().
Differential Revision: https://reviews.freebsd.org/D3864
Reviewed by: gnn
Comments by: rrs, glebius
MFC after: 2 weeks
Sponsored by: Juniper Networks
Different revisions support different operations. Refer to Intel
External Design Specifications to figure out what your hardware
supports.
Sponsored by: EMC / Isilon Storage Division
o With new KPI consumers can request contiguous ranges of pages, and
unlike before, all pages will be kept busied on return, like it was
done before with the 'reqpage' only. Now the reqpage goes away. With
new interface it is easier to implement code protected from race
conditions.
Such arrayed requests for now should be preceeded by a call to
vm_pager_haspage() to make sure that request is possible. This
could be improved later, making vm_pager_haspage() obsolete.
Strenghtening the promises on the business of the array of pages
allows us to remove such hacks as swp_pager_free_nrpage() and
vm_pager_free_nonreq().
o New KPI accepts two integer pointers that may optionally point at
values for read ahead and read behind, that a pager may do, if it
can. These pages are completely owned by pager, and not controlled
by the caller.
This shifts the UFS-specific readahead logic from vm_fault.c, which
should be file system agnostic, into vnode_pager.c. It also removes
one VOP_BMAP() request per hard fault.
Discussed with: kib, alc, jeff, scottl
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
LLDB is usable for userland core file and live debugging on amd64, and
for userland core file debugging on arm64. In general it works at least
as well on FreeBSD as our in-tree gdb version, so enable it by default
to allow for broader use and testing.
An LLDB tutorial is available at http://lldb.llvm.org/tutorial.html, and
a table mapping GDB commands to LLDB commands can be found at
http://lldb.llvm.org/lldb-gdb.html .
LLDB also has some level of support for FreeBSD on arm, mips, i386,
and powerpc, but is not yet ready to have them enabled by default.
Reviewed by: gnn
Relnotes: Yes
This is not wrong, but was unexpected. Using <empty>:H results in '.' which
then using the rest of the conversion was added in RELDIR. This was also
causing an empty _DP_DIRDEPS to resolve to SRCTOP for DIRDEPS.
Sponsored by: EMC / Isilon Storage Division
This logic is potentially included multiple times, so overwrite the temporary
variable rather than append to it.
Sponsored by: EMC / Isilon Storage Division
This is because LDADD+=-lFOO is not the same as LDADD+=-lprivateFOO which is
what the private libs in LIBADD are.
Sponsored by: EMC / Isilon Storage Division
system call information such as system call arguments. Initially this
will consist of pulling duplicated code out of truss and kdump though it
may prove useful for other utilities in the future.
This commit moves the shared utrace(2) record parser out of kdump into
the library and updates kdump and truss to use it. One difference from
the previous version is that the library version treats unknown events
that start with the "RTLD" signature as unknown events. This simplifies
the interface and allows the consumer to decide how to handle all
non-recognized events. Instead, this function only generates a string
description for known malloc() and RTLD records.
Reviewed by: bdrewery
Differential Revision: https://reviews.freebsd.org/D4537
In I/OAT, this is done through the INTRDELAY register. On supported
platforms, this register can coalesce interrupts in a set period to
avoid excessive interrupt load for small descriptor workflows. The
period is configurable anywhere from 1 microsecond to 16.38
milliseconds, in microsecond granularity.
Sponsored by: EMC / Isilon Storage Division
These use ld(1), effectively -nostdlib, and don't need any of these
normal dependencies.
kmod builds also define PROG so just checking for KMOD here seems to be
the easiest to handle it.
Sponsored by: EMC / Isilon Storage Division
RISC-V is a new ISA designed to support computer research and education, and
is now become a standard open architecture for industry implementations.
This is a minimal set of changes required to run 'make kernel-toolchain'
using external (GNU) toolchain.
The FreeBSD/RISC-V project home: https://wiki.freebsd.org/riscv.
Reviewed by: andrew, bdrewery, emaste, imp
Sponsored by: DARPA, AFRL
Sponsored by: HEIF5
Differential Revision: https://reviews.freebsd.org/D4445
mps(4) sends StartStopUnit to SATA direct-access devices during shutdown.
Document the tunables which control that behavior.
PR: 195033
Reviewed by: scottl
Approved by: jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D4456
The hardware supports descriptors with two non-contiguous pages. This
allows issuing one descriptor for an 8k copy from/to non-contiguous but
otherwise page-aligned memory.
Sponsored by: EMC / Isilon Storage Division
Older ccache don't work with an empty CCACHE_PATH value. They will error with:
ccache: FATAL: Could not find compiler "cc" in PATH
make: "/mnt/bdrewery/git/onefs/src/share/mk/bsd.compiler.mk" line 134: Unable to determine compiler type for /usr/local/bin/ccache cc. Consider setting COMPILER_TYPE.
Sponsored by: EMC / Isilon Storage Division
STAGE_OBJTOP and STAGE_HOST_OBJTOP.
These will always be overridden in sub-makes when building in-tree, but
are exported for the benefit of hooking in external builds, such as
ports.
Sponsored by: EMC / Isilon Storage Division
- Don't bother looking up REVISION/BRANCH/etc from release/, or the
CPUTYPE check, as these are not used for makeman and wastes time. The also
invokes auto.obj.mk after I reverted auto.obj.mk ignoring -V in r291312.
- Don't modify CC or PATH when WITH_CCACHE_BUILD or WITH_META_MODE is enabled
as it leads to bsd.compiler.mk errors.
Sponsored by: EMC / Isilon Storage Division
These helper functions can be used to read in or write a buffer from or to
an arbitrary process' address space. Without them, this can only be done
using proc_rwmem(), which requires the caller to fill out a uio. This is
onerous and results in code duplication; the new functions provide a simpler
interface which is sufficient for most existing callers of proc_rwmem().
This change also adds a manual page for proc_rwmem() and the new functions.
Reviewed by: jhb, kib
Differential Revision: https://reviews.freebsd.org/D4245
Debug data files are now built by default with 'make buildworld' and
installed with 'make installworld'. This facilitates debugging but
requires more disk space both during the build and for the installed
world. Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes
in src.conf(5).
Reviewed by: bdrewery, eadler, vangyzen
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4018
This will save time generating dependency files that we didn't expect
due to cases where SRCS!=OBJS or for building custom targetted objects
in Makefiles that do not end up in the DEPENDOBJS list.
This uses a bmake trick to modify CFLAGS based on ${.TARGET}. A
.PARSEDIR check is done for the sake of MFC safety.
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
Rather than try to guess at all of the OBJS variables just use SRCS
using the same patterns that mkdep does. This also fixes a mistake
where dependencies were being generated with FAST_DEPEND when they were
not for mkdep. This happens when OBJS!=SRCS as is the case in
gnu/lib/csu where SRCS has 1 file and OBJS has several other files that
does not even contain the 1 SRCS file. Generally in these cases the
OBJS have custom dependencies defined in their Makefile. If we generate
dependencies for those and then load a .depend file, then .IMPSRC may
contain duplicate sources and lead to errors such as:
cc: error: cannot specify -o when generating multiple output files
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
It was allowed before, but make it very explicit it is allowed now. And
prefer 'bool' to older types that were used for the same purpose -- int and
boolean_t.
Like with the C99 fixed-width types, use common sense when changing old
code.
No igor regressions.
Suggested by: bde <20151205031713.T3286@besplex.bde.org>
Reviewed by: glebius, davide, bapt (earlier versions)
Reviewed by: imp
Feedback from: julian
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D4384
Submitted by: Artem V. Andreev <Artem.Andreev at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4355
My changes in r291635 broke 'make install*' for DIRDEPS_BUILD but also
revealed that some other targets were not guaranteed to be created if
there was a SUBDIR defined. One example is 'installfiles' was never
defined if SUBDIR was not empty.
Sponsored by: EMC / Isilon Storage Division
The problem was that 'afterinstall' was not coming after SUBDIRs were
installed which was the expectation at least in sys/modules for kldxref.
Reported by: np
Pointyhat to: bdrewery
Sponsored by: EMC / Isilon Storage Division
the real build file.
This lessens the need to define DPADD_<lib> and LDADD_<lib> to just very
special cases.
Sponsored by: EMC / Isilon Storage Division
This would cause it to be included everywhere in the build since it is
the MAKESYSPATH. This leads to including dirdeps.mk more times than
desired.
Sponsored by: EMC / Isilon Storage Division
bsd.prog.mk and bsd.lib.mk already make OBJS depend on headers when there is
not .OBJDIR/.depend file, which is still true for the initial meta mode builds.
If there was something to benefit the meta mode build here then it should be
extended to the non-meta mode build as well.
Some of the problems here were just DPSRCS being hooked up wrongly, fixed in
r291330.
The logic itself is flawed as 'buildfiles' is in a different part of the
dependency tree than the objects and headers are, so the objects will still be
built independent from 'buildfiles'. 'buildfiles' is not ordered in the build
before objects.
Sponsored by: EMC / Isilon Storage Division
camdd(8) utility.
CCBs may be queued to the driver via the new CAMIOQUEUE ioctl, and
completed CCBs may be retrieved via the CAMIOGET ioctl. User
processes can use poll(2) or kevent(2) to get notification when
I/O has completed.
While the existing CAMIOCOMMAND blocking ioctl interface only
supports user virtual data pointers in a CCB (generally only
one per CCB), the new CAMIOQUEUE ioctl supports user virtual and
physical address pointers, as well as user virtual and physical
scatter/gather lists. This allows user applications to have more
flexibility in their data handling operations.
Kernel memory for data transferred via the queued interface is
allocated from the zone allocator in MAXPHYS sized chunks, and user
data is copied in and out. This is likely faster than the
vmapbuf()/vunmapbuf() method used by the CAMIOCOMMAND ioctl in
configurations with many processors (there are more TLB shootdowns
caused by the mapping/unmapping operation) but may not be as fast
as running with unmapped I/O.
The new memory handling model for user requests also allows
applications to send CCBs with request sizes that are larger than
MAXPHYS. The pass(4) driver now limits queued requests to the I/O
size listed by the SIM driver in the maxio field in the Path
Inquiry (XPT_PATH_INQ) CCB.
There are some things things would be good to add:
1. Come up with a way to do unmapped I/O on multiple buffers.
Currently the unmapped I/O interface operates on a struct bio,
which includes only one address and length. It would be nice
to be able to send an unmapped scatter/gather list down to
busdma. This would allow eliminating the copy we currently do
for data.
2. Add an ioctl to list currently outstanding CCBs in the various
queues.
3. Add an ioctl to cancel a request, or use the XPT_ABORT CCB to do
that.
4. Test physical address support. Virtual pointers and scatter
gather lists have been tested, but I have not yet tested
physical addresses or scatter/gather lists.
5. Investigate multiple queue support. At the moment there is one
queue of commands per pass(4) device. If multiple processes
open the device, they will submit I/O into the same queue and
get events for the same completions. This is probably the right
model for most applications, but it is something that could be
changed later on.
Also, add a new utility, camdd(8) that uses the asynchronous pass(4)
driver interface.
This utility is intended to be a basic data transfer/copy utility,
a simple benchmark utility, and an example of how to use the
asynchronous pass(4) interface.
It can copy data to and from pass(4) devices using any target queue
depth, starting offset and blocksize for the input and ouptut devices.
It currently only supports SCSI devices, but could be easily extended
to support ATA devices.
It can also copy data to and from regular files, block devices, tape
devices, pipes, stdin, and stdout. It does not support queueing
multiple commands to any of those targets, since it uses the standard
read(2)/write(2)/writev(2)/readv(2) system calls.
The I/O is done by two threads, one for the reader and one for the
writer. The reader thread sends completed read requests to the
writer thread in strictly sequential order, even if they complete
out of order. That could be modified later on for random I/O patterns
or slightly out of order I/O.
camdd(8) uses kqueue(2)/kevent(2) to get I/O completion events from
the pass(4) driver and also to send request notifications internally.
For pass(4) devcies, camdd(8) uses a single buffer (CAM_DATA_VADDR)
per CAM CCB on the reading side, and a scatter/gather list
(CAM_DATA_SG) on the writing side. In addition to testing both
interfaces, this makes any potential reblocking of I/O easier. No
data is copied between the reader and the writer, but rather the
reader's buffers are split into multiple I/O requests or combined
into a single I/O request depending on the input and output blocksize.
For the file I/O path, camdd(8) also uses a single buffer (read(2),
write(2), pread(2) or pwrite(2)) on reads, and a scatter/gather list
(readv(2), writev(2), preadv(2), pwritev(2)) on writes.
Things that would be nice to do for camdd(8) eventually:
1. Add support for I/O pattern generation. Patterns like all
zeros, all ones, LBA-based patterns, random patterns, etc. Right
Now you can always use /dev/zero, /dev/random, etc.
2. Add support for a "sink" mode, so we do only reads with no
writes. Right now, you can use /dev/null.
3. Add support for automatic queue depth probing, so that we can
figure out the right queue depth on the input and output side
for maximum throughput. At the moment it defaults to 6.
4. Add support for SATA device passthrough I/O.
5. Add support for random LBAs and/or lengths on the input and
output sides.
6. Track average per-I/O latency and busy time. The busy time
and latency could also feed in to the automatic queue depth
determination.
sys/cam/scsi/scsi_pass.h:
Define two new ioctls, CAMIOQUEUE and CAMIOGET, that queue
and fetch asynchronous CAM CCBs respectively.
Although these ioctls do not have a declared argument, they
both take a union ccb pointer. If we declare a size here,
the ioctl code in sys/kern/sys_generic.c will malloc and free
a buffer for either the CCB or the CCB pointer (depending on
how it is declared). Since we have to keep a copy of the
CCB (which is fairly large) anyway, having the ioctl malloc
and free a CCB for each call is wasteful.
sys/cam/scsi/scsi_pass.c:
Add asynchronous CCB support.
Add two new ioctls, CAMIOQUEUE and CAMIOGET.
CAMIOQUEUE adds a CCB to the incoming queue. The CCB is
executed immediately (and moved to the active queue) if it
is an immediate CCB, but otherwise it will be executed
in passstart() when a CCB is available from the transport layer.
When CCBs are completed (because they are immediate or
passdone() if they are queued), they are put on the done
queue.
If we get the final close on the device before all pending
I/O is complete, all active I/O is moved to the abandoned
queue and we increment the peripheral reference count so
that the peripheral driver instance doesn't go away before
all pending I/O is done.
The new passcreatezone() function is called on the first
call to the CAMIOQUEUE ioctl on a given device to allocate
the UMA zones for I/O requests and S/G list buffers. This
may be good to move off to a taskqueue at some point.
The new passmemsetup() function allocates memory and
scatter/gather lists to hold the user's data, and copies
in any data that needs to be written. For virtual pointers
(CAM_DATA_VADDR), the kernel buffer is malloced from the
new pass(4) driver malloc bucket. For virtual
scatter/gather lists (CAM_DATA_SG), buffers are allocated
from a new per-pass(9) UMA zone in MAXPHYS-sized chunks.
Physical pointers are passed in unchanged. We have support
for up to 16 scatter/gather segments (for the user and
kernel S/G lists) in the default struct pass_io_req, so
requests with longer S/G lists require an extra kernel malloc.
The new passcopysglist() function copies a user scatter/gather
list to a kernel scatter/gather list. The number of elements
in each list may be different, but (obviously) the amount of data
stored has to be identical.
The new passmemdone() function copies data out for the
CAM_DATA_VADDR and CAM_DATA_SG cases.
The new passiocleanup() function restores data pointers in
user CCBs and frees memory.
Add new functions to support kqueue(2)/kevent(2):
passreadfilt() tells kevent whether or not the done
queue is empty.
passkqfilter() adds a knote to our list.
passreadfiltdetach() removes a knote from our list.
Add a new function, passpoll(), for poll(2)/select(2)
to use.
Add devstat(9) support for the queued CCB path.
sys/cam/ata/ata_da.c:
Add support for the BIO_VLIST bio type.
sys/cam/cam_ccb.h:
Add a new enumeration for the xflags field in the CCB header.
(This doesn't change the CCB header, just adds an enumeration to
use.)
sys/cam/cam_xpt.c:
Add a new function, xpt_setup_ccb_flags(), that allows specifying
CCB flags.
sys/cam/cam_xpt.h:
Add a prototype for xpt_setup_ccb_flags().
sys/cam/scsi/scsi_da.c:
Add support for BIO_VLIST.
sys/dev/md/md.c:
Add BIO_VLIST support to md(4).
sys/geom/geom_disk.c:
Add BIO_VLIST support to the GEOM disk class. Re-factor the I/O size
limiting code in g_disk_start() a bit.
sys/kern/subr_bus_dma.c:
Change _bus_dmamap_load_vlist() to take a starting offset and
length.
Add a new function, _bus_dmamap_load_pages(), that will load a list
of physical pages starting at an offset.
Update _bus_dmamap_load_bio() to allow loading BIO_VLIST bios.
Allow unmapped I/O to start at an offset.
sys/kern/subr_uio.c:
Add two new functions, physcopyin_vlist() and physcopyout_vlist().
sys/pc98/include/bus.h:
Guard kernel-only parts of the pc98 machine/bus.h header with
#ifdef _KERNEL.
This allows userland programs to include <machine/bus.h> to get the
definition of bus_addr_t and bus_size_t.
sys/sys/bio.h:
Add a new bio flag, BIO_VLIST.
sys/sys/uio.h:
Add prototypes for physcopyin_vlist() and physcopyout_vlist().
share/man/man4/pass.4:
Document the CAMIOQUEUE and CAMIOGET ioctls.
usr.sbin/Makefile:
Add camdd.
usr.sbin/camdd/Makefile:
Add a makefile for camdd(8).
usr.sbin/camdd/camdd.8:
Man page for camdd(8).
usr.sbin/camdd/camdd.c:
The new camdd(8) utility.
Sponsored by: Spectra Logic
MFC after: 1 week
Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port. This change allows additional non-netmap
interfaces to be configured on each port. Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.
Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).
T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).
One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are not accounted to the netmap interface.
The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.
The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats. There is currently no way to
clear the stats of an individual VI.
Reviewed by: np
MFC after: 1 month
Sponsored by: Chelsio
like the various d_*_t typedefs since it declared a function pointer rather
than a function. Add a new d_priv_dtor_t typedef that declares the function
and can be used as a function prototype. The previous typedef wasn't
useful outside of the cdevpriv implementation, so retire it.
The name d_priv_dtor_t was chosen to be more consistent with cdev methods
since it is commonly used in place of d_close_t even though it is not a
direct pointer in struct cdevsw.
Reviewed by: kib, imp
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D4340
This is to fix 'make all' causing it to recurse on both 'all' and 'buildconfig'
due to 'buildconfig' being in ALL_SUBDIR_TARGETS and being a dependency of
'all'.
This now adds all of the '*includes', '*files' targets as subdir targets,
allowing them to recurse.
This also removes the need for some 'realinstall' hacks in bsd.subdir.mk since
it no longer recurses; only 'install' will recurse and call the proper
'beforeinstall', 'realinstall', and 'afterinstall' in each sub-directory.
This fixes 'make includes' and 'make files' to not be a rerolled ${MAKE}
sub-shell but to rather just recurse on 'inclues' and 'files'. This avoids
various issues such as the one fixed in r289462. As such revert Makefile.inc1
back to using 'includes' which avoids an extra tree walk and parallelizes
the includes phases better.
Makefile.inc1 includes a guard so that 'make all' will not use SUBDIR_PARALLEL,
added in r289438. This is so users do not get a probably broken build if they
run 'make all' from the top-level. Before the change in this commit, the
workaround for 'make everything' was 'par-all' which would depend on 'all' and
cause a proper parallel recursion. Now that will not work so a new
_PARALLEL_SUBUDIR_OK is used to allow it.
This is still part of an effort to combine bsd.(files|incs|confs).mk and move
some of its logic out of bsd.subdir.mk, as attempted in r289282 and reverted in
r289331. This commit fixes the problems found there which was mostly double
recursing during 'includes' which would recurse on itself and 'buildincludes'
and 'installincludes', all in parallel. The logic is still in bsd.subdir.mk
for now.
I've been cautious about this commit but have experienced no breakage on the
tree except for the 'par-all' case which was already a hack. If something foo
is depending on something bar that should recurse, it is very likely that the
foo target is being recursed on already meaning that bar will still effectively
recurse once sub-directories call foo.
Discussed on: arch@
MFC after: never
Sponsored by: EMC / Isilon Storage Division
This is to fix 'make all' causing it to recurse on both 'all' and 'buildconfig'
due to 'buildconfig' being in ALL_SUBDIR_TARGETS and being a dependency of
'all'.
This now adds all of the '*includes', '*files' targets as subdir targets,
allowing them to recurse.
This also removes the need for some 'realinstall' hacks in bsd.subdir.mk since
it no longer recurses; only 'install' will recurse and call the proper
'beforeinstall', 'realinstall', and 'afterinstall' in each sub-directory.
This fixes 'make includes' and 'make files' to not be a rerolled ${MAKE}
sub-shell but to rather just recurse on 'inclues' and 'files'. This avoids
various issues such as the one fixed in r289462. As such revert Makefile.inc1
back to using 'includes' which avoids an extra tree walk and parallelizes
the includes phases better.
Makefile.inc1 includes a guard so that 'make all' will not use SUBDIR_PARALLEL,
added in r289438. This is so users do not get a probably broken build if they
run 'make all' from the top-level. Before the change in this commit, the
workaround for 'make everything' was 'par-all' which would depend on 'all' and
cause a proper parallel recursion. Now that will not work so a new
_PARALLEL_SUBUDIR_OK is used to allow it.
This is still part of an effort to combine bsd.(files|incs|confs).mk and move
some of its logic out of bsd.subdir.mk, as attempted in r289282 and reverted in
r289331. This commit fixes the problems found there which was mostly double
recursing during 'includes' which would recurse on itself and 'buildincludes'
and 'installincludes', all in parallel. The logic is still in bsd.subdir.mk
for now.
I've been cautious about this commit but have experienced no breakage on the
tree except for the 'par-all' case which was already a hack. If something foo
is depending on something bar that should recurse, it is very likely that the
foo target is being recursed on already meaning that bar will still effectively
recurse once sub-directories call foo.
Discussed on: arch@
MFC after: never
Sponsored by: EMC / Isilon Storage Division
Fix current findings, which should fix cases of NO_SHARED not building
properly.
Given libfoo:
- Ensure that a LIBFOO is set. For INTERNALLIBS advise setting this in
src.libnames.mk, otherwise bsd.libnames.mk.
- Ensure that a LIBFOODIR is properly set.
- Ensure that _DP_foo is set and matches the LIBADD in the build of foo's own
Makefile
Sponsored by: EMC / Isilon Storage Division
I'm not sure why this was here, none of these use pthread themselves and
none of the consumers are broken with removing this.
Sponsored by: EMC / Isilon Storage Division
The proper place for this list is _DP_dtrace.
Due to removing the LDADD_dtrace, more LIBADD are needed in
cddl/usr.sbin/dtrace to prevent underlinking.
This fixes overlinking in cddl/usr.sbin/lockstat and
cddl/usr.sbin/plockstat.
Sponsored by: EMC / Isilon Storage Division
This change came in r281332 which was reducing overlinking in mt(1) but
currently mt(1) is linked with sbuf when it does not need it due to the
LDADD_mt+=${LDADD_sbuf}. Only libmt needs sbuf.
Add sbuf to _DP_mt so static linkage of libmt picks it up.
Sponsored by: EMC / Isilon Storage Division
Fix current findings.
Given libfoo:
- Ensure that a LIBFOO is set. For INTERNALLIBS advise setting this in
src.libnames.mk, otherwise bsd.libnames.mk.
- Ensure that a LIBFOODIR is properly set.
- Ensure that _DP_foo is set and matches the LIBADD in the build of foo's own
Makefile
Sponsored by: EMC / Isilon Storage Division
This does not really fix anything currently since _WITHOUT_SRCCONF must be
defined in the environment or local.sys.*.mk, but is proper and needed for
downstream fixes. I am working towards reworking src.conf inclusion still.
Sponsored by: EMC / Isilon Storage Division
This is not properly respecting WITHOUT or ARCH dependencies in target/.
Doing so requires a massive effort to rework targets/ to do so. A
better approach will be to either include the SUBDIR Makefiles directly
and map to DIRDEPS or just dynamically lookup the SUBDIR. These lose
the benefit of having a userland/lib, userland/libexec, etc, though and
results in a massive package. The current implementation of targets/ is
very unmaintainable.
Currently rescue/rescue and sys/modules are still not connected.
Sponsored by: EMC / Isilon Storage Division
- Support more of the toolchain from TOOLSDIR.
- This also improves 'make bootstrap-tools' to pass, for example,
AS=/usr/bin/as to Makefile.inc1, which will tell cross-tools to use
external toolchain support and avoid building things we won't be using
in the build.
- Always set the PATH to contain the staged TOOLSDIR directories when
not building the bootstrap targets.
The previous version was only setting this at MAKE.LEVEL==0 and if the
TOOLSDIR existed. Both of these prevented using staged tools that were
built during the build though as DIRDEPS with .host dependencies, such
as the fix for needing usr.bin/localedef.host in r291311.
This is not a common tool so we must build and use it during the build,
and need to be prepared to change PATH as soon as it appears.
This should also fix the issue of host dependencies disappearing from
Makefile.depend and then reappearing due to the start of the fresh build not
having the directory yet, resulting in the tools that were built not actually
being used.
- Only use LEGACY_TOOLS while building in Makefile.inc1. After r291317
and r291546 there is no need to add LEGACY_TOOLS into the PATH for
the pseudo/targets/toolchain build.
- Because the pseudo/targets/toolchain will now build its own
[clang-]tblgen, the special logic in clang.build.mk is no longer needed.
- LEGACY_TOOLS is no longer used outside of targets/pseudo/bootstrap-tools
so is no longer passed into the environment in its build.
Sponsored by: EMC / Isilon Storage Division
Doing this causes more trouble than it is worth regarding cyclic
dependencies. It should not be needed after cleaning up MACHINE=host
builds in r291324.
Sponsored by: EMC / Isilon Storage Division
IPv4/IPv6 checksum offloading and VLAN tag insertion/stripping.
Since uether doesn't provide a way to announce driver specific offload
capabilities to upper stack, checksum offloading support needs more work
and will be done in the future.
Special thanks to Hayes Wang from RealTek who gave input.
This is mostly working around the converts/iconv port having '../ces/file.o'
in its OBJS list which resulted in '.depend../ces/file.o'. Now it will have
'.depend.._ces_file.o'.
Other implementations have :T which would result in '.depend.file.o' here, but
that could lead to collisions.
X-MFC-With: r291554
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
-MP creates empty targets for all dependency files, which can be useful when a
dependency is deleted from the file system. This would otherwise cause an
error for "don't know how to build FOO" since the .depend file is included
with the dependency registered.
This is mostly a workaround for the misc/dahdi-kmod port using '::' for one of
its dependencies, while -MP uses just ':'. This results in an 'Inconsistent
operator for' error.
X-MFC-With: r290433
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division