9899:2011 Appendix K 3.7.4.1.
Other needed supporting types, defines and constraint_handler
infrastructure is added as specified in the C11 spec.
Submitted by: Tom Rix <trix@juniper.net>
Sponsored by: Juniper Networks
Discussed with: ed
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D9903
Differential revision: https://reviews.freebsd.org/D10161
The goal of this work is to remove the explicit dependency for ctl(4)
on iscsi(4), so end-users without iscsi(4) support in the kernel can
use ctl(4) for its other functions.
This allows those without iscsi(4) support built into the kernel to use
ctl(4) as a test mechanism. As a sidenote, this was possible around the
10.0-RELEASE period, but made impossible for end-users without iscsi(4)
between 10.0-RELEASE and 11.0-RELEASE.
Automatically load cfiscsi(4) from ctladm(8) and ctld(8) for backwards
compatibility with previously releases. The automatic loading feature is
compiled into the beforementioned tools if MK_ISCSI == yes when building
world.
Add a manpage for cfiscsi(4) and refer to it in ctl(4).
Differential Revision: D10099
MFC after: 2 months
Relnotes: yes
Reviewed by: mav, trasz
Sponsored by: Dell EMC Isilon
map the 'which' argument into a suitable audit event identifier for the
specific operation requested.
Obtained from: TrustedBSD Project
MFC after: 3 weeks
Sponsored by: DARPA, AFRL
instrument security event auditing rather than relying on conventional BSM
trail files or audit pipes:
- Add a set of per-event 'commit' probes, which provide access to
particular auditable events at the time of commit in system-call return.
These probes gain access to audit data via the in-kernel audit_record
data structure, providing convenient access to system-call arguments and
return values in a single probe.
- Add a set of per-event 'bsm' probes, which provide access to particular
auditable events at the time of BSM record generation in the audit
worker thread. These probes have access to the in-kernel audit_record
data structure and BSM representation as would be written to a trail
file or audit pipe -- i.e., asynchronously in the audit worker thread.
DTrace probe arguments consist of the name of the audit event (to support
future mechanisms of instrumenting multiple events via a single probe --
e.g., using classes), a pointer to the in-kernel audit record, and an
optional pointer to the BSM data and its length. For human convenience,
upper-case audit event names (AUE_...) are converted to lower case in
DTrace.
DTrace scripts can now cause additional audit-based data to be collected
on system calls, and inspect internal and BSM representations of the data.
They do not affect data captured in the audit trail or audit pipes
configured in the system. auditd(8) must be configured and running in
order to provide a database of event information, as well as other audit
configuration parameters (e.g., to capture command-line arguments or
environmental variables) for the provider to operate.
Reviewed by: gnn, jonathan, markj
Sponsored by: DARPA, AFRL
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D10149
A request may be queued while the queue lock is dropped when the mirror is
being destroyed. The corresponding wakeup would be lost, possibly resulting
in an apparent hang of the mirror worker thread.
Tested by: pho (part of a larger patch)
MFC after: 1 week
Sponsored by: Dell EMC Isilon
The worker thread will destroy the mirror provider as part of its teardown
sequence. The call made sense in the initial revision of gmirror, but
became unnecessary in r137248.
Tested by: pho (part of a larger diff)
MFC afteR: 2 weeks
Sponsored by: Dell EMC Isilon
doesn't help clang, but buys us another 32 bytes for gcc 4.2.1. It
also eliminates a warning from gcc 6.3.0 that says inlining this would
be unhelpful.
position. Especially the screen size, and potentially everything except
the input state and attributes. Do this by changing the cursor position
setting method to a general syncing method.
Use proper constructors instead of copying to create kernel terminal
contexts. We really want clones and not new instances, but there is
no method for cloning and there is nothing in the active instance that
needs to be cloned exactly.
Add proper destructors for kernel terminal contexts. I doubt that the
destructor code has every been reached, but if it was then it leaked the
memory of the clones.
Remove freeing of statically allocated memory for the non-kernel terminal
context for the same terminal as the kernel. This is in the nearly
unreachable code. This used to not happen because delicate context
swapping made the user context use the dynamic memory and kernel
context the static memory. I didn't restore this swapping since it
would have been unnatural to have all kernel contexts except 1 dynamic.
The constructor for terminal context has bad layering for reasons
related to the bug. It has to return static memory early before
malloc() works. Callers also can't allocate memory until after the
first constructor selects an emulator and tells upper layers the size
of its context. After that, the cloning hack required the cloning
code to allocate the memory, but for all other constructors it would
be better for the terminal layer to allocate and deallocate the
memory in all cases.
Zero the memory when allocating terminal contexts dynamically.
This is being done to make it easier to change in the future--this action might be
needed sooner rather than later because of gcc 6.3.0 bailing, stating that there
is negative free space left (deficit) in the boot2 bootloader.
MFC after: 2 months
Sponsored by: Dell EMC Isilon
self_reloc.c doesn't initialize `rel` in all cases in the C code, however, the value
might be initialized properly on the stack in the assembly code.
For right now (because this doesn't seem to be breaking anything and my initializing
the stack value could break something since it's called from assembly code) disable
the warning for self_reloc.c. More investigation should be done to determine the
appropriate response to this warning (either intialize the value or find a smarter
way to deal with the warning).
A long MFC timeout is being set for this change to allow a better solution for the
issue to be developed in that time period.
MFC after: 2 months
Reported by: Jenkins (FreeBSD-head-amd64-gcc job)
Tested with: amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by: Dell EMC Isilon
With some file system the ls is unable to display file types.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D10066
Move the time related function into time.c, keep the same logic as libefi.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D10058
`-Wno-missing-variable-declarations` is a clang-specific flag,
so gcc (not 4.2.1, in particular 6.3.0 in my case) dies when
it's passed the flag.
X-MFC with: r304321
Reported by: amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by: Dell EMC Isilon
This is a better pattern to follow when creating the bootloaders and doing
the relevant space checks to make sure that the sizes aren't exceeded (and
thus, copy-pasting is a bit less error prone).
MFC after: 3 days
Sponsored by: Dell EMC Isilon
This variable has been unused since its inception in r40106.
MFC after: 3 days
Reported by: amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by: Dell EMC Isilon
zfssubr.c already defines this statically. Besides, zfsimpl.c defined it, but
didn't use it.
This fixes a -Wredundant-decls warning.
MFC after: 3 days
Reported by: amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by: Dell EMC Isilon
This fixes several -Wshadow warnings introduced in r192194, but now errors
with gcc 6.3.0.
MFC after: 3 days
Reported by: amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by: Dell EMC Isilon
This fixes a -Wempty-body warning with gcc 6.3.0 when PART_DEBUG is undefined.
MFC after: 3 days
Reported by: Jenkins (FreeBSD-head-amd64-gcc job)
Tested with: amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by: Dell EMC Isilon
bde enabled -fno-guess-branch-probability in 2003, well before our
current compiler was imported. At the time it produced weirdly orded
code. It no longer does that. It also saves 0-4 bytes depending on
other options.
kan disabled unit-at-a-time in 2004 because it badly mangled boot2 so
it wouldn't work. That too was before the 4.2.1 compiler, where it no
longer does that. This saves 44 bytes.
I had planned to document why they were needed, but when I discovered
their antiquity, I removed them and boot2 still works and is
smaller. In qemu, the old and new boot2's behaved identically.
These are gcc specific hacks, and won't affect clang-built boot2
at all.
Don't compile geliargs into the image and don't pass geliargs to the respective
bootloader code via __exec(..).
This saves a negligible amount of memory/disk space.
X-MFC with: r296963
Obtained from: Isilon OneFS
Sponsored by: Dell EMC Isilon
cfumass(4) is not usable if usfs(4) is loaded or compiled into the
kernel. Remove usfs so that the user may kldload the USB mass storage
target they prefer.
PR: 218169
Reviewed by: trasz, hselasky (no objection)
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D10153
tcp_output.c was using a route on the stack for IPv6, which does not
allow route caching or LLE/ndp caching. Switch to using the route
(v6 flavor) in the in_pcb, which was already present, which caches
both L3 and L2 lookups.
Reviewed by: gnn hiren
MFC after: 2 weeks
clang is now inserting .file directives with the entire path in
them. This is fine, except that our sed peephole optimizer removes
them if ${SRCTOP} or ${OBJTOP} contains 'align' or 'nop', leading to
build failures. The sed peephole optimizer removes useful things for
boot2 when used with clang, so restrict its use to gcc. Also, gcc no
longer generates nops to pad things, so there's no point in removing
it. Specialize the optimization to just removing the .align 4 lines to
preclude inadvertant path matching.
Sponsored by: Netflix
Commit brought to you the path: /home/xxx/NCD-3592-logsynopts/FreeBSD
the thread that deals with socket state changes. This eliminates
various bad races with the ithread.
Submitted by: KrishnamRaju ErapaRaju @ Chelsio (original version)
MFC after: 3 days
Sponsored by: Chelsio Communications
This patch brings 802.1q support for Marvell 88E606x ethernet switches.
Test is done on 88E6065 chip (Aterm WR1200).
Submitted by: Hiroki Mori <yamori813@yahoo.co.jp>
Reviewed by: mizhka
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10144
7603 xuio_stat_wbuf_* should be declared (void)
illumos/illumos-gate@99aa8b550599aa8b5505https://www.illumos.org/issues/7603
The funcs are declared k&r style, where the args are not specified:
void xuio_stat_wbuf_copied();
They should be declared to take no arguments:
void xuio_stat_wbuf_copied(void);
Need to change both .c and .h.
Author: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
the LinuxKPI for accessing user-space memory in the kernel.
Add functions to hold and wire physical page(s) based on a given range
of user-space virtual addresses.
Add functions to get and put a reference on, wire, hold, mark
accessed, copy and dirty a physical page.
Add new VM related structures and defines as a preparation step for
advancing the memory map capabilities of the LinuxKPI.
Add function to figure out if a virtual address was allocated using
malloc().
Add function to convert a virtual kernel address into its physical
page pointer.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Don't execute any of g_mirror_shutdown_post_sync() when panicking. We
cannot safely idle the mirror or stop synchronization in that state, and
the current attempts to do so complicate debugging of gmirror itself.
- Check for a non-NULL panicstr instead of using SCHEDULER_STOPPED(). The
latter was added for use in the locking primitives.
Reviewed by: mav, pjd
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
The change introduced a dependency between genassym.c and header files
generated from .m files, but that dependency is not specified in the
make files.
Also, the change could be not as useful as I thought it was.
Reported by: dchagin, Manfred Antar <null@pozo.com>, and many others
that used to work via the bold hack).
Fix the table entry for bright black. Fix spelling of plain black in
nearby table entries (use the macro for black everywhere everywhere).
Fix the currently-unused non-bright color table to not have bright
colors in entries 9-15.
Improve nearby comments. Start converting to the xterm terminology
and default rendering of "bright" instead of "light" for bright
colors.
Syscons wasn't affected by the bug since I optimized it a little by
converting colors 0-15 directly. This also fixes the layering of
the conversion for these colors.
Apply the same optimization to vt (actually the layer above it). This
also moves the conversion 1 closer to the correct layer for colors
0-15.
The optimization of just avoiding 2 calls to a trivial function is worth
about 10% for simple output to the virtual buffer with occasional
rendering. The optimization is so large because the 2 calls are done
on every character, so although there are too many other calls and
other instructions per character, there are only about 10 times as
many. Old versions of syscons were about 10 times faster for simple
output, by using a fast path with about 12 instructions per character.
Rendering to even slow hardware takes relatively little time provided
it is rarely actually done.
database in the kernel audit implementation, similar the exist
class mapping database. This will be used by the DTrace audit
provider to map audit event identifiers originating in the
system-call table back into strings for the purposes of setting
probe names. The database is initialised and maintained by
auditd(8), which reads values in from the audit_events
configuration file, and then manages them using the A_GETEVENT
and A_SETEVENT auditon(2) operations.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, AFRL
MFC after: 3 weeks
The change seems to be more in the nomenclature than in the way the
topology is advertised by the hardware.
Tested by: truckman (earlier version of the change)
MFC after: 2 weeks
Implement timeouts for register-based DMAR commands. Tunable/sysctl
hw.dmar.timeout specifies the timeout in nanoseconds, set it to zero
to allow infinite wait. Default is 1ms.
Runtime modification of the sysctl is not safe, it is allowed for
debugging.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- Add a new "qsize" parameter in audit_control and the getacqsize(3) API to
query it, allowing to set the kernel's maximum audit queue length.
- Add support to push a mapping between audit event names and event numbers
into the kernel (where supported) using new A_GETEVENT and A_SETEVENT
auditon(2) operations.
- Add audit event identifiers for a number of new (and not-so-new) FreeBSD
system calls including those for asynchronous I/O, thread management, SCTP,
jails, multi-FIB support, and misc. POSIX interfaces such as
posix_fallocate(2) and posix_fadvise(2).
- On operating systems supporting Capsicum, auditreduce(1) and praudit(1) now
run sandboxed.
- Empty "flags" and "naflags" fields are now permitted in audit_control(5).
Many thanks to Christian Brueffer for producing the OpenBSM release and
importing/tagging it in the vendor branch. This release will allow improved
auditing of a range of new FreeBSD functionality, as well as non-traditional
events (e.g., fine-grained I/O auditing) not required by the Orange Book or
Common Criteria.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, AFRL
MFC after: 3 weeks
I fixed this in 1997, but the fix was over-engineered and fragile and
was broken in 2003 if not before. i386 parameters were copied to 8
other arches verbatim, mostly after they stopped working on i386, and
mostly without the large comment saying how the values were chosen on
i386. powerpc has a non-verbatim copy which just changes the uncritical
parameter and seems to add a sign extension bug to it.
Just treat negative offsets as offsets if they are no more negative than
-db_offset_max (default -64K), and remove all the broken parameters.
-64K is not very negative, but it is enough for frame and stack pointer
offsets since kernel stacks are small.
The over-engineering was mainly to go more negative than -64K for the
negative offset format, without affecting printing for more than a
single address.
Addresses in the top 64K of a (full 32-bit or 64-bit) address space
are now printed less well, but there aren't many interesting ones.
For arches that have many interesting ones very near the top (e.g.,
68k has interrupt vectors there), there would be no good limit for
the negative offset format and -64K is a good as anything.
1) They are using wrong tag (Tx) + map (Rx) combination.
2) Rx descriptor is already synchronized in iwn_notif_intr()
3) It's not needed for transmitted data since device does not change
mbuf contents.
Tested with Intel 6205 (amd64), STA mode.
Newer VGAs don't support any mono modes, but bugs in the tables created
2 virtual mono modes (#45 90x43 and #112 80x43) that behaved more
strangely than crashing. 90-column modes are tweaked 80-column ones
and also fail to work on newer VGAs. #45 did crash (hang) on some
hardware.
it to a separate state for each CPU.
Terminal "input" is user or kernel output. Its state includes the current
parser state for escape sequences and multi-byte characters, and some
results of previous parsing (mainly attributes), and in teken the cursor
position, but not completed output. This state must be switched for kernel
output since the kernel can preempt anything, including itself, and this
must not affect the preempted state more than necessary. Since vty0 is
shared, it is necessary to affect the frame buffer and cursor position and
history, but escape sequences must not be affected and attributes for
further output must not be affected.
This used to work. The syscons terminal state contained mainly the parser
state for escape sequences and attributes, but not the cursor position,
and was switched. This was first broken by SMP and/or preemptive kernels.
Then there should really be a separate state for each thread, and one more
for ddb, or locking to prevent preemption. Serialization of printf() helps.
But it is arcane that full syscons escape sequences mostly work in kernel
printf(), and I have never seen them used except by me to test this fix.
They worked perfectly except for the races, since "input" from the kernel
was not special in any way.
This was broken to use teken. The general switch was removed, and the
kernel normal attribute was switched specially. The kernel reverse
attribute (config option SC_CONS_REVERSE_ATTR) became unused, and is
still unusable because teken doesn't support default reverse attributes
(it used to only be used via the ANSI escape sequence to set reverse
video).
The only new difficulty for using teken seems to be that the cursor
position is in the "input" state, so it must be updated in the active
input state for each half of the switch. Do this to complete the
restoration.
The per-CPU state is mainly to make per-CPU coloring work cleanly, at
a cost of some space. Each CPU gets its own full set of attribute
(not just the current attribute) maintained in the usual way. This
also reduces races from unserialized printf()s. However, this gives
races for serialized printf()s that otherwise have none. Nothing
prevents the CPU doing the a printf() changing in the middle of an
escape sequence.
optimization.
This fixes building with gcc-4.2.1 (it doesn't support SSE4).
gas-2.17.50 [FreeBSD] supports SSE4 instructions, so this doesn't
need using .byte directives.
This fixes depending on host user headers in the kernel.
Fix user includes (don't depend on namespace pollution in <nmmintrin.h>
that is not included now).
The instrinsics had no advantages except to sometimes avoid compiler
pessimixations. clang understands them a bit better than inline asm,
and generates better looking code which also runs better for cem, but
for me it just at the same speed or slower by doing excessive
unrollowing in all the wrong places. gcc-4.2.1 also doesn't understand
what it is doing with unrolling, but with -O3 somehow it does more
unrolling that helps.
Reduce 1 of the the compiler pessimizations (copying a variable which
already satisfies an "rm" constraint in a good way by being in memory
and not used again, to different memory and accessing it there. Force
copying it to a register instead).
Try to optimize the inner loops significantly, so as to run at full
speed on smaller inputs. The algorithm is already very MD, and was
tuned for the throughput of 3 crc32 instructions per cycle found on
at least Sandybridge through Haswell. Now it is even more tuned for
this, so depends more on the compiler not rearranging or unrolling
things too much. The main inner loop for should have no difficulty
runing at full speed on these CPUs unless the compiler unrolls it too
much. However, the main inner loop wasn't even used for buffers smaller
than 24K. Now it is used for buffers larger than 384 bytes. Now it
is not so long, and the main outer loop is used more. The new
optimization is to try to arrange that the outer loop runs in parallel
with the next inner loop except for the final iteration; then reduce
the loop sizes significantly to take advantage of this.
Approved by: cem
Not tested in production by: bde
Some code was additionally moved for (future) lock splitting.
Tested with Intel 6205, STA mode.
Differential Revision: https://reviews.freebsd.org/D10106
We don't have enouch space to store full VFP context within mcontext
stucture. Due to this:
- follow i386/amd64 way and store VFP state outside of the mcontext_t
but point to it. Use the size of VFP state structure as an 'magic'
indicator of the saved VFP state presence.
- teach set_mcontext() about this external storage.
- for signal delivery, store VFP state to expanded 'struct sigframe'.
Submited by: Andrew Gierth (initial version)
PR: 217611
MFC after: 2 weeks
Kernel environment variable hw.busdma.default can take values 'bounce'
and 'dmar' and selects corresponding busdma backend as default.
Per-device environment variable hw.busdma.pci<domain>.<bus>.<slot>.<func>
takes the same values and overrides hw.busdma.default for the given device.
Note that even with hw.busdma.default=bounce, DMA translation engines
are still started if DMARs are enabled, to disable them use
hw.dmar.dma tunable, as before.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
FreeBSD uses upstream DTB for RPi3 build and compatibility string for
i2c device is different there. Add this new string to compatibility data.
Reported by: Karl Denninger
MFC after: 3 days
Do not try to use errno(2) codes here; instead, just return unique
value (1) when radio is disabled via hardware switch and another
one (-1) for any other error in initialization path.
Tested with Intel 6205, STA mode.
The change is more intrusive than I would like because the feature
requires that a vector number is written to a special register.
Thus, now the vector number has to be provided to lapic_eoi().
It was readily available in the IO-APIC and MSI cases, but the IPI
handlers required more work.
Also, we now store the VMM IPI number in a global variable, so that it
is available to the justreturn handler for the same reason.
Reviewed by: kib
MFC after: 6 weeks
Differential Revision: https://reviews.freebsd.org/D9880
ip_forward, TCP/IPv6, and probably SCTP leaked references to L2 cache
entry because they used their own routes on the stack, not in_pcb routes.
The original model for route caching was callers that provided a route
structure to ip{,6}input() would keep the route, and this model was used
for L2 caching as well. Instead, change L2 caching to be done by default
only when using a route structure in the in_pcb; the pcb deallocation
code frees L2 as well as L3 cacches. A separate change will add route
caching to TCP/IPv6.
Another suggestion was to have the transport protocols indicate willingness
to use L2 caching, but this approach keeps the changes in the network
level
Reviewed by: ae gnn
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10059
As noted in the comment, nothing special needs to be done to destroy
the unneeded context after the allocation race, but the context memory
itself still should to be freed.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
It does not make sense since identity mapping already provides the
required mapping for RMRR ranges. More, since identity page tables do
not reflect content of map entries for id domains, creating RMRR
entries makes domain data inconsistent.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This can significantly reduce scan duration thus saving time and power.
EBS failure reported by FW disables EBS for current connection. It is
re-enabled upon new connection attempt on any WLAN interface.
Obtained from: dragonflybsd.git 89f579e9823a5c446ca172cf82bbc210d6a054a4
If this is the last running vap wait until device will be powered off
(fixes panic when 'ifconfig wlan0 destroy' is executed for running iwn(4)
interface).
Tested with:
- Intel 6205, STA mode.
- RTL8188EU, STA / IBSS modes.
- RTL8821AU, STA / HOSTAP modes.
Long ago, perhaps only on i386, kernel text was mapped read-only and
it was necessary to change the mapping to read-write to set breakpoints
in kernel text. Other writes by ddb to kernel text were also allowed.
This write protection is harder to implement with 4MB pages, and was
lost even for 4K pages when 4MB pages were implemented. So changing
the mapping became useless. It was actually worse than useless since
it followed followed various null and otherwise garbage pointers to
not change random memory instead of the mapping. (On i386s, the
pointers became good in pmap_bootstrap(), and on amd64 the pointers
became bad in pmap_bootstrap() if not before.)
Another bug broke detection of following of null pointers on i386,
except early in boot where not detecting this was a feature. When
I fixed the bug, I accidentally broke the feature and soon got traps
in db_write_bytes(). Setting breakpoints early in ddb was broken.
kib pointed out that a clean way to do the adjustment would be to use
a special [sub]map giving a small window on the bytes to be written.
The trap handler didn't know how to fix up errors for pagefaults
accessing the map itself. Such errors rarely need fixups, since most
traps for the map are for the first access which is a read.
Reviewed by: kib
crash when the file shrinks. This also fixes sendfile(2) not sending more
data in a case when the file grows, and the request is open-ended or
specifies a size that is greater than old file size.
PR: 217789
Reviewed by: gallatin
MFC after: 10 days
- in mcontext_t, rename newer used 'union __vfp' to equaly sized 'mc_spare'.
Space allocated by 'union __vfp' is too small and cannot hold full
VFP context.
- move structures defined in fp.h to more appropriate headers.
- remove all unused VFP structures.
MFC after: 2 weeks
illumos/illumos-gate@8363e80ae7https://github.com/illumos/illumos-gate/commit/8363e80ae72609660f6090766ca8c2c18https://www.illumos.org/issues/7303
This change introduces a new weighting algorithm to improve metaslab selection.
The new weighting algorithm relies on the SPACEMAP_HISTOGRAM feature. As a result,
the metaslab weight now encodes the type of weighting algorithm used
(size-based vs segment-based).
This also introduce a new allocation tracing facility and two new dcmds to help
debug allocation problems. Each zio now contains a zio_alloc_list_t structure
that is populated as the zio goes through the allocations stage. Here's an
example of how to use the tracing facility:
> c5ec000::print zio_t io_alloc_list | ::walk list | ::metaslab_trace
MSID DVA ASIZE WEIGHT RESULT VDEV
- 0 400 0 NOT_ALLOCATABLE ztest.0a
- 0 400 0 NOT_ALLOCATABLE ztest.0a
- 0 400 0 ENOSPC ztest.0a
- 0 200 0 NOT_ALLOCATABLE ztest.0a
- 0 200 0 NOT_ALLOCATABLE ztest.0a
- 0 200 0 ENOSPC ztest.0a
1 0 400 1 x 8M 17b1a00 ztest.0a
> 1ff2400::print zio_t io_alloc_list | ::walk list | ::metaslab_trace
MSID DVA ASIZE WEIGHT RESULT VDEV
- 0 200 0 NOT_ALLOCATABLE mirror-2
- 0 200 0 NOT_ALLOCATABLE mirror-0
1 0 200 1 x 4M 112ae00 mirror-1
- 1 200 0 NOT_ALLOCATABLE mirror-2
- 1 200 0 NOT_ALLOCATABLE mirror-0
1 1 200 1 x 4M 112b000 mirror-1
- 2 200 0 NOT_ALLOCATABLE mirror-2
If the metaslab is using segment-based weighting then the WEIGHT column will
display the number of segments available in the bucket where the allocation
attempt was made.
Author: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Chris Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
I got a report of this source file not building on Raspberry Pi. It's
interesting that this only fails for that target and not for others.
Again, that's no reason not to include the right headers.
PR: 217969
Reported by: Johannes Jost Meixner
MFC after: 1 week
Add support for early boot access to NVRAM variables, using a new
bhnd_nvram_data_getvar_direct() API to support zero-allocation direct
reading of NVRAM variables from a bhnd_nvram_io instance backed by the
CFE NVRAM device.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9913
The existing ELF image activator requires the brandinfo to provide such
a string unconditionally, even if the executable format in question
doesn't use this type of branding. Skip matching when it's a null
pointer.
Reviewed by: kib
MFC after: 2 weeks
r315083 essentially reverted r263954 which was made for a good reason,
but didn't take into account AACRAID_DEBUG.
Now both types of build should be clean.
MFC after: 5 days
No MFC to: stable/10
I think this message is not very useful for end user. Also its formatting
does not match other messages printed at that time. Those who really need
this information can always find it in `camcontrol negotiate daX -v`.
MFC after: 2 weeks
This way we can avoid blocking the whole queue in the low memory
situations. It's better to sacrifice some I/O performance by not doing
the aggregation than to add an indefinite wait for more memory.
Reviewed by: smh
MFC after: 2 weeks
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D9999
This is done so that the thread state changes during the switch
are not confused with the thread state changes reported when the thread
spins on a lock.
Here is an example, three consecutive entries for the same thread (from top to
bottom):
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"sleep", attributes: prio:84, wmesg:"-", lockname:"(null)"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"spinning", attributes: lockname:"sched lock 1"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"running", attributes: none
The above trace could leave an impression that the final state of
the thread was "running".
After this change the sleep state will be reported after the "spinning"
and "running" states reported for the sched lock.
Reviewed by: jhb, markj
MFC after: 1 week
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D9961
Newbus handles multiple equally named device classes without problems,
so there is no reason to use slightly cryptic "<foo>_shdci" for them.
In contrast, the driver module name must be unique, so "<foo>_shdci"
is the right name for it.
* All the supported firmwares have these flags set.
* This removes the following flags:
IWM_UCODE_TLV_FLAGS_PM_CMD_SUPPORT,
IWM_UCODE_TLV_FLAGS_NEWBT_COEX,
IWM_UCODE_TLV_FLAGS_BF_UPDATED,
IWM_UCODE_TLV_FLAGS_D3_CONTINUITY_API,
IWM_UCODE_TLV_FLAGS_STA_KEY_CMD,
IWM_UCODE_TLV_FLAGS_DEVICE_PS_CMD,
IWM_UCODE_TLV_FLAGS_SCHED_SCAN,
IWM_UCODE_TLV_FLAGS_RX_ENERGY_API,
IWM_UCODE_TLV_FLAGS_TIME_EVENT_API_V2
* Also remove definitions and code for dealing with the v1 time-event api.
* Remove unneeded calc_rssi() function.
Obtained from: dragonflybsd.git d078c812418d0e2c3392e99fa25fc776d07bdfad
matches static binaries.
Interpretation of the 'static' there is that the binary must not
specify an interpreter. In particular, shared objects are matched by
the brand if BI_CAN_EXEC_DYN is also set.
This improves precision of the brand matching, which should eliminate
surprises due to brand ordering.
Revert r315701.
Discussed with and tested by: ed (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Prevent possible races in the pf_unload() / pf_purge_thread() shutdown
code. Lock the pf_purge_thread() with the new pf_end_lock to prevent
these races.
Use a shared/exclusive lock, as we need to also acquire another sx lock
(VNET_LIST_RLOCK). It's fine for both pf_purge_thread() and pf_unload()
to sleep,
Pointed out by: eri, glebius, jhb
Differential Revision: https://reviews.freebsd.org/D10026
Similar to the change for sendmsg(), create a pointer size independent
implementation of recvmsg() and let cloudabi32 and cloudabi64 call into
it. In case userspace requests one or more file descriptors, call
kern_recvit() in such a way that we get the control message headers in
an mbuf. Iterate over all of the headers and copy the file descriptors
to userspace.
CloudABI executables are statically linked and don't have an
interpreter. Setting the interpreter path to NULL used to work
previously, but r314851 introduced code that checks the string
unconditionally. Running CloudABI executables now causes a null pointer
dereference.
Looking at the rest of imgact_elf.c, it seems various other codepaths
already leaned on the fact that the interpreter path is set. Let's just
go ahead and pick an obviously incorrect interpreter path to appease
imgact_elf.c.
MFC after: 1 week
Reduce the potential amount of code duplication between cloudabi32 and
cloudabi64 by creating a cloudabi_sock_recv() utility function. The
cloudabi32 and cloudabi64 modules will then only contain code to convert
the iovecs to the native pointer size.
In cloudabi_sock_recv(), we can now construct an SCM_RIGHTS cmsghdr in
an mbuf and pass that on to kern_sendit().
This will provide a slightly better smoking gun than just stating
"can't remove non-dynamic nodes!" when calling sysctl_ctx_free(9)
and sysctl_remove_{name,oid}(9) with a non-dynamic (likely
static) sysctl.
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Because of integer types, the timeout calculation result was limited to
INT_MAX / (1000 * hz) seconds. For systems with hz=10000, this is only 215
seconds. Perform the calculation with 64-bit math to allow sleeping for the
full INT_MAX / hz interval (215000 seconds on such hz=10000 systems).
Submitted by: Scott Ferris <sferris at isilon.com>
Sponsored by: Dell EMC Isilon
We must ensure there's space for the terminating null in the temporary
buffer in imgact_binmisc_populate_interp().
Note that there's no buffer overflow here because xbe->xbe_interpreter's
length and null termination is checked in imgact_binmisc_add_entry()
before imgact_binmisc_populate_interp() is called. However, the latter
should correctly enforce its own bounds.
Reviewed by: sbruno
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D10042
Let firmware do its best first, and if it can't, try software recovery.
I would remove software timeout handler completely, but found bunch of
complains on command timeout on sparc64 mailing list few years ago, so
better be safe in case of interrupt loss.
MFC after: 2 weeks
For three years now CAM does not use SIM lock, but still enforces SIM to
use it. Remove this requirement, allowing SIMs to have any locking they
prefer, if they pass no mutex to cam_sim_alloc().
MFC after: 2 weeks
This is a painful change, but it is needed. On the one hand, we avoid
modifying them, and this slows down some ideas, on the other hand we still
eventually modify them and tools like netstat(1) never work on next version of
FreeBSD. We maintain a ton of spares in them, and we already got some ifdef
hell at the end of tcpcb.
Details:
- Hide struct inpcb, struct tcpcb under _KERNEL || _WANT_FOO.
- Make struct xinpcb, struct xtcpcb pure API structures, not including
kernel structures inpcb and tcpcb inside. Export into these structures
the fields from inpcb and tcpcb that are known to be used, and put there
a ton of spare space.
- Make kernel and userland utilities compilable after these changes.
- Bump __FreeBSD_version.
Reviewed by: rrs, gnn
Differential Revision: D10018
Since the uset can set dhcp.interface-mtu, we need to try to validate the
value. So we verify if the conversion to int is successful and we will not
allow to set value greater than max IPv4 packet size.
Also use snprintf for safety.
Reviewed by: allanjude, bapt
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D8492
In r315408, disk_cleanup was removed, which is called at
sys/boot/userboot/userboot/userboot_disk.c:113.
This causes bhyveload to fail.
PR: 217935
Reported by: Fabian Freyer
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D10060
vm_pindex_t's into a vm_ooffset_t.
The length given to shm_dotruncate() must never be negative. Assert this.
Tidy up a comment.
Reviewed by: kib
MFC after: 1 week
mmc(4). For the most part, this consists of support for:
- Switching the signal voltage (VCCQ) to 1.8 V or (if supported
by the host controller) to 1.2 V,
- setting the UHS mode as appropriate in the SDHCI_HOST_CONTROL2
register,
- setting the power class in the eMMC device according to the
core supply voltage (VCC),
- using different bits for enabling a bus width of 4 and 8 bits
in the the eMMC device at DDR or higher timings respectively,
- arbitrating timings faster than high speed if there actually
are additional devices on the same MMC bus.
Given that support for DDR52 is not denoted by SDHCI capability
registers, availability of that timing is indicated by a new
quirk SDHCI_QUIRK_MMC_DDR52 and only enabled for Intel SDHCI
controllers so far. Generally, what it takes for a sdhci(4)
front-end to enable support for DDR52 is to hook up the bridge
method mmcbr_switch_vccq (which especially for 1.2 V signaling
support is chip/board specific) and the sdhci_set_uhs_timing
sdhci(4) method.
As a side-effect, this change also fixes communication with
some eMMC devices at SDR high speed mode with 52 MHz due to
the signaling voltage and UHS bits in the SDHCI controller no
longer being left in an inappropriate state.
Compared to 52 MHz at SDR high speed which typically yields
~45 MB/s with the eMMC chips tested, throughput goes up to
~80 MB/s at DDR52.
Additionally, this change already adds infrastructure and quite
some code for modes up to HS400ES and SDR104 respectively (I did
not want to add to much stuff at a time, though). Essentially,
what is still missing in order to be able to activate support
for these latter is is support for and handling of (re-)tuning.
o In sdhci(4), add two tunables hw.sdhci.quirk_clear as well as
hw.sdhci.quirk_set, which (when hooked up in the front-end)
allow to set/clear sdhci(4) quirks for debugging and testing
purposes. However, especially for SDHCI controllers on the
PCI bus which have no specific support code so far and, thus,
are picked up as generic SDHCI controllers, hw.sdhci.quirk_set
allows for setting the necessary quirks (if required).
o In mmc(4), check and handle the return values of some more
function calls instead of assuming that everything went right.
In case failures actually are not problematic, indicate that
by casting the return value to void.
Reviewed by: jmcneill
This should reduce overhead for aggregates (since every second frame
clears the queue and reschedules the task there is no need to cancel
the callout here; let it just run once at the end - even if queue is
empty).
Reported by: adrian
calculated at runtime based on how long it takes to set up an event in
hardware. This fixes the intermittant 1-minute hang at boot on imx5
systems, and also the occasional oversleeping while running. It doesn't
affect imx6 systems, which use different hardware for eventtimers.
It turns out that it usually takes about 30 timer ticks to set up the timer
compare register, and the old hard-coded minimum period was 10 ticks. On
the rare occasions when a timeout event that short was set up, we'd miss
the event and have to wait about 64 seconds for counter rollover before
the compare interrupt would fire.
Instead of just hardcoding a new bigger value, the code now measures the
time it takes to do the register read/write sequence to set up the compare
register, scales it up by 1.5x to be safe, and calculates the minimum event
period from the result. In the real world, the minimum period works out to
about 750 nanoseconds on imx5 hardware.
It turns out to be surprisingly expensive to access the gpt hardware (on the
order of 150ns per read/write). To cut down on the overhead of setting up
each eventtimer event, eliminate read-modify-write sequences to manage the
compare interrupt enable, by keeping a shadow copy of the hardware register
and only writing to the hardware when the enable bits really change.
This should allow to drop 'ieee80211_ff_[age/flush]' calls from drivers
(an additional call can be made from ieee80211_tx_complete()
for non-default ieee80211_ffagemax values to prevent stalls -
but it will require an additional counter for transmitted frames).
Tested with RTL8821AU, STA mode (A-MSDU part only).
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D9984
Simplify the logic for clipping the range returned by the pager to fit
within the map entry.
Use atop() rather than OFF_TO_IDX() on addresses.
Reviewed by: kib
MFC after: 1 week
For 24xx and above use 2 vectors (default and response queue).
For 26xx and above use 3 vectors (default, response and ATIO queues).
Due to global lock interrupt hardlers never run simultaneously now, but
at least this allows to save one regitster read per interrupt.
MFC after: 2 weeks