implementation.
The kernel RPC code, which is responsible for the low-level scheduling
of incoming NFS requests, contains a throttling mechanism that
prevents too much kernel memory from being tied up by NFS requests
that are being serviced. When the throttle is engaged, the RPC layer
stops servicing incoming NFS sockets, resulting ultimately in
backpressure on the clients (if they're using TCP). However, this is
a very heavy-handed mechanism as it prevents all clients from making
any requests, regardless of how heavy or light they are. (Thus, when
engaged, the throttle often prevents clients from even mounting the
filesystem.) The throttle mechanism applies specifically to requests
that have been received by the RPC layer (from a TCP or UDP socket)
and are queued waiting to be serviced by one of the nfsd threads; it
does not limit the amount of backlog in the socket buffers.
The original implementation limited the total bytes of queued requests
to the minimum of a quarter of (nmbclusters * MCLBYTES) and 45 MiB.
The former limit seems reasonable, since requests queued in the socket
buffers and replies being constructed to the requests in progress will
all require some amount of network memory, but the 45 MiB limit is
plainly ridiculous for modern memory sizes: when running 256 service
threads on a busy server, 45 MiB would result in just a single
maximum-sized NFS3PROC_WRITE queued per thread before throttling.
Removing this limit exposed integer-overflow bugs in the original
computation, and related bugs in the routines that actually account
for the amount of traffic enqueued for service threads. The old
implementation also attempted to reduce accounting overhead by
batching updates until each queue is fully drained, but this is prone
to livelock, resulting in repeated accumulate-throttle-drain cycles on
a busy server. Various data types are changed to long or unsigned
long; explicit 64-bit types are not used due to the unavailability of
64-bit atomics on many 32-bit platforms, but those platforms also
cannot support nmbclusters large enough to cause overflow.
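The overflow described above can be illustrated with a minimal C sketch. The function names and the 2048-byte cluster size are assumptions made for illustration and are not taken from the kernel sources. The sketch only shows how a 32-bit product of nmbclusters and MCLBYTES wraps, while the same product computed in unsigned long does not on platforms where long is 64 bits wide.

```c
#include <stdint.h>

/* Assumed cluster size for illustration; the kernel's MCLBYTES may differ. */
#define SKETCH_MCLBYTES 2048u

/* Hypothetical: a quarter of cluster memory computed in 32-bit arithmetic. */
static uint32_t
quarter_clusters_32(uint32_t nmbclusters)
{
	/* The product wraps modulo 2^32 before the division happens. */
	return ((nmbclusters * SKETCH_MCLBYTES) / 4u);
}

/* Hypothetical: the same value computed in unsigned long. */
static unsigned long
quarter_clusters_long(unsigned long nmbclusters)
{
	return ((nmbclusters * (unsigned long)SKETCH_MCLBYTES) / 4ul);
}

/* Share of the old 45 MiB cap available to each service thread, in bytes. */
static unsigned long
old_cap_per_thread(unsigned long nthreads)
{
	return ((45ul * 1024ul * 1024ul) / nthreads);
}
```

Under these assumptions, 4194304 clusters make the 32-bit product wrap to 0, whereas the unsigned long version yields 2 GiB on a 64-bit platform. Dividing 45 MiB among 256 threads leaves 184320 bytes (180 KiB) per thread.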
This code (in a 10.1 kernel) is presently running on production NFS
servers at CSAIL.
Summary of this revision:
* Removes 45 MiB limit on requests queued for nfsd service threads
* Fixes integer-overflow and signedness bugs
* Avoids unnecessary throttling by not deferring accounting for
completed requests
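The accounting change summarized above can be sketched as follows. This is a simplified, single-threaded illustration with hypothetical names, not the kernel RPC code, which would also need atomic updates or locking. Each completed request releases its bytes immediately, so the throttle can disengage without waiting for a whole queue to drain.

```c
#include <stdbool.h>

/* Hypothetical accounting state; all names are illustrative only. */
struct rpc_throttle_sketch {
	unsigned long queued_bytes;	/* bytes received but not yet serviced */
	unsigned long limit_bytes;	/* throttle threshold */
};

/* Charge a newly queued request against the limit. */
static void
sketch_enqueue(struct rpc_throttle_sketch *t, unsigned long len)
{
	t->queued_bytes += len;
}

/* Release a request's bytes as soon as it completes (no batching). */
static void
sketch_complete(struct rpc_throttle_sketch *t, unsigned long len)
{
	/* Clamp so the unsigned counter never wraps below zero. */
	t->queued_bytes -= (len <= t->queued_bytes) ? len : t->queued_bytes;
}

/* The throttle is engaged while queued bytes meet or exceed the limit. */
static bool
sketch_throttled(const struct rpc_throttle_sketch *t)
{
	return (t->queued_bytes >= t->limit_bytes);
}
```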
Differential Revision: https://reviews.freebsd.org/D2165
Reviewed by: rmacklem, mav
MFC after: 30 days
Relnotes: yes
Sponsored by: MIT Computer Science & Artificial Intelligence Laboratory
%rdi, %rsi, etc. are inadvertently bypassed along with the check to
see if the instruction needs to be repeated per the 'rep' prefix.
Add "MOVS" instruction support for the 'MMIO to MMIO' case.
Reviewed by: neel
Per Austin group issue #884, sh should not import IFS from the environment
but always set it to $' \t\n'. For wordexp(), however, it is documented and
useful for it to use IFS from the environment.
Since sh currently imports IFS from the environment, this change has no
functional effect.
MFC after: 1 week
The __builtin_init_dwarf_reg_size_table function is unimplemented in
clang 3.6 for AArch64. Comment it out for now and replace it with
a message and abort.
Tracked in upstream LLVM PR 22997
https://llvm.org/bugs/show_bug.cgi?id=22997
Submitted by: andrew
specially aml8726-m6 and aml8726-m8b SoC based devices.
The aml8726-m6 SoC is found in devices such as the Visson ATV-102.
The Hardkernel ODROID-C1 board has the aml8726-m8b SoC.
The following support is included:
Basic machdep code
SMP
Interrupt controller
Clock control driver (aka gate)
Pinctrl
Timer
Real time clock
UART
GPIO
I2C
SD controller
SDXC controller
USB
Watchdog
Random number generator
PLL / Clock frequency measurement
Frame buffer
Submitted by: John Wehle
Approved by: stas (mentor)
still need libc_pic for a few things, but this is expected to be ready
soon.
Differential Revision: https://reviews.freebsd.org/D2136
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
- Convert errx(-1, ..) to errx(1, ..)
- Move the aio(4) checks to a single function (aio_available); use modfind(2)
instead of depending on SIGSYS (doesn't work when aio(4) support is missing,
not documented in the aio syscall manpages).
- Use aio_available liberally in the testcase functions
- Use mkstemp(3) + unlink(2) instead of mktemp(3)
- Fix some -Wunused warnings
- Bump WARNS to 6
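The mkstemp(3) + unlink(2) pattern mentioned in the list above can be sketched as follows. The helper name and the file name template are assumptions made for illustration and are not taken from the test sources.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an anonymous scratch file.
 * Returns an open descriptor, or -1 on failure.
 */
static int
sketch_scratch_fd(void)
{
	char path[] = "/tmp/aiotest.XXXXXX";	/* template name is an assumption */
	int fd;

	fd = mkstemp(path);	/* creates and opens the file in one step */
	if (fd == -1)
		return (-1);
	(void)unlink(path);	/* the file lives on until fd is closed */
	return (fd);
}
```

Unlike mktemp(3), which only generates a name, mkstemp(3) creates and opens the file in one step, so no other process can claim the name between generation and open.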
MFC after: 1 week
Submitted by: mjohnston [*]
Sponsored by: EMC / Isilon Storage Division
on Intel processors. Clear the spurious dependency by explicitly xoring
the destination register of popcnt.
Use bitcount64() instead of re-implementing SWAR locally, for
processors without the popcnt instruction.
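For readers unfamiliar with the term, SWAR ("SIMD within a register") bit counting can be sketched as below. This is the generic textbook technique under a hypothetical name. It is not claimed to be the implementation of bitcount64().

```c
#include <stdint.h>

/* Hypothetical name; counts the set bits in a 64-bit word without popcnt. */
static int
sketch_popcount64(uint64_t x)
{
	/* Sum adjacent bits into 2-bit fields. */
	x = x - ((x >> 1) & 0x5555555555555555ULL);
	/* Sum adjacent 2-bit fields into 4-bit fields. */
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	/* Sum adjacent 4-bit fields into bytes. */
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	/* The multiply adds all bytes together into the top byte. */
	return ((int)((x * 0x0101010101010101ULL) >> 56));
}
```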
Reviewed by: jhb
Discussed with: jilles (previous version)
Sponsored by: The FreeBSD Foundation
rather than 20. The MP 1.4 specification states in Appendix B.2:
"A period of 20 microseconds should be sufficient for IPI dispatch to
complete under normal operating conditions".
(Note that this appears to be separate from the 10 millisecond (INIT) and
200 microsecond (STARTUP) waits after the IPIs are dispatched.) The
Intel SDM is silent on this issue as far as I can tell.
At least some hardware requires 60 microseconds as noted in the PR, so
bump this to 100 to be on the safe side.
PR: 197756
Reported by: zaphod@berentweb.com
MFC after: 1 week
As described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU
linkers ld.bfd and ld.gold currently only support a subset of the
whole range of AArch64 ELF TLS relocations. Furthermore, they assume
that some of the code sequences to access thread-local variables are
produced in a very specific sequence. When the sequence is not as the
linker expects, it can silently mis-relax/mis-optimize the
instructions.
Even if that were not the case, it's good to produce the exact
sequence, as that ensures that linkers can perform optimizing
relaxations.
This patch:
* implements support for 16MiB TLS area size instead of 4GiB TLS area
size. Ideally clang would grow an -mtls-size option to allow support
for both, but that's not part of this patch.
* by default doesn't produce local dynamic access patterns, as even
modern ld.bfd and ld.gold linkers do not support the associated
relocations. An option (-aarch64-elf-ldtls-generation) is added to
enable generation of the local dynamic code sequence, but it is off by
default.
* makes sure that the exact expected code sequence for local dynamic
and general dynamic accesses is produced, by making use of a new
pseudo instruction. The patch also removes two
(AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing
AArch64-specific pseudo SDNode instructions that are superseded by
the new one (TLSDESC_CALLSEQ).
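For context, the access sequences discussed above are what a compiler emits for thread-local variables such as the one in this minimal C11 sketch. The names are hypothetical, and which TLS access model is actually selected depends on factors such as -fPIC and symbol visibility.

```c
/* Hypothetical thread-local variable; each thread gets its own copy. */
_Thread_local int sketch_tls_counter;

/* Every access to the variable goes through a TLS access sequence. */
int
sketch_tls_bump(void)
{
	return (++sketch_tls_counter);
}
```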
Submitted by: Kristof Beyls
Differential Revision: https://reviews.freebsd.org/D2175