This is required by programs like sockstat that read variably sized
sysctls such as kern.file. The normal path has no such restriction and
the restriction was added without comment along with initial support for
freebsd32 in 2002 (r100384).
Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15438
I210 restore functionality if pxeboot rom is enabled on this device.
r333345 attempted to determine if this code was needed or it was some kind
of work around for a problem. Turns out, its definitely a work around for
hardware locking and synchronization that manifests itself if the option
Rom is enabled and is selected as a boot device (there was a PXE attempt).
Reviewed by: mmacy
Differential Revision: https://reviews.freebsd.org/D15439
Creating a pool with a temporary name fails when we also specify custom
dataset properties: this is because we mistakenly call
zfs_set_prop_nvlist() on the "real" pool name which, as expected,
cannot be found because the SPA is present in the namespace with the
temporary name.
Fix this by specifying the correct pool name when setting the dataset
properties.
Author: loli10K <ezomori.nozomu@gmail.com>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Obtained from: ZFS on Linux, zfsonlinux/zfs@4ceb8dd6fd
MFC after: 1 week
- Driver support for hardware NAT.
- Driver support for swapmac action.
- Validate a request to create a hashfilter against the filter mask.
- Add a hashfilter config file for T5.
Sponsored by: Chelsio Communications
in the commit log of r321385 has been confirmed via the public VLI54
erratum. Thus, stop advertising DDR52 for these controllers.
Note that this change should hardly make a difference in practice as
eMMC chips from the same era as these SoCs most likely support HS200
at least, probably even up to HS400ES.
Use the new epoch based reclamation API. Now the hot paths will not
block at all, and the sx lock is used for the softc data. This fixes LORs
reported where the rwlock was obtained when the sxlock was held.
Submitted by: mmacy
Reported by: Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by: sbruno
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15355
Kernel debuggers depend on symbol names to find stack frames with a
trapframe rather than a normal stack frame. The labels used for the
shared interrupt entry point for the PTI and non-PTI cases did not
match the existing patterns confusing debuggers. Add the '.L' prefix
to mark these symbols as local so they are not visible in the symbol
table.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Chelsio Communications
The length of SCTP packets is always a multiple of 4. Therefore,
ensure that the MTUs used are also a multiple of 4.
Thanks to Irene Ruengeler for providing an earlier version of this
patch.
MFC after: 1 week
r333273 and partially reverted with r333594.
Older CPUs implement addition of offsets into the page table by a
bitwise OR rather than actual addition, which only works if the table is
aligned at a multiple of its own size (they also require it to be aligned
at a multiple of 256KB). Newer ones do not have that requirement, but it
hardly matters to enforce it anyway.
The original code was failing on newer systems with huge amounts of RAM
(> 512 GB), in which the page table was 4 GB in size. Because the
bootstrap memory allocator took its alignment parameter as an int, this
turned into a 0, removing any alignment constraint at all and making
the MMU fail. The first round of this patch (r333273) fixed this case by
aligning it at 256 KB, which broke older CPUs. Fix this instead by widening
the alignment parameter.
Once a pmc owner is added to the pmc_ss_owners list it is
visible for all to see. We don't want this to happen until
setup is complete.
Reported by: mjg
Approved by: sbruno
- fix load/unload race by allocating the per-domain list structure at boot
- fix long extant vm map LOR by replacing pmc_sx sx_slock with global_epoch
to protect the liveness of elements of the pmc_ss_owners list
Reported by: pho
Approved by: sbruno
The INVARIANTS checks in epoch_wait() were intended to
prevent the block handler from returning with locks held.
What it in fact did was preventing anything except Giant
from being held across it. Check that the number of locks
held has not changed instead.
Approved by: sbruno@
In the reply to an ExchangeID operation, the NFSv4.1 server returns a
"scope" value (eir_server_scope). If this value is the same, it indicates
that two servers share state, which is never the case for FreeBSD servers.
As such, the value needs to be unique and it was without this patch.
However, I just found out that it is not supposed to change when the
server reboots and without this patch, it did change.
This patch fixes eir_server_scope so that it does not change when the
server is rebooted.
The only affect not having this patch has is that Linux clients don't
reclaim opens and locks after a server reboot, which meant they lost
any byte range locks held before the server rebooted.
It only affects NFSv4.1 mounts and the FreeBSD NFSv4.1 client was not
affected by this bug.
MFC after: 1 week
- GC the _nopreempt routines
- to really benefit we'd need a separate routine
- they're not currently in use
- they complicate the API for no benefit at this time
- check that we're actually in a epoch section at exit
- handle epoch_call() early in boot
- Fix copyright declaration language
Approved by: sbruno@
vm_page_queue(), added in r333256, generalizes vm_pageout_page_queued(),
so use it instead. No functional change intended.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D15402
For a fairly rare case of a client doing an ExchangeID after a hard reboot,
the old confirmed clientid still exists, but some clients use a new
co_verifier. For this case, the server was not freeing up the sessions on
the old confirmed clientid.
This patch fixes this case. It also adds two LIST_INIT() macros, which are
actually no-ops, since the structure is malloc()d with M_ZERO so the pointer
is already set to NULL.
It should have minimal impact, since the only way I could exercise this
code path was by doing a hard power cycle (pulling the plus) on a machine
running Linux with a NFSv4.1 mount on the server.
Originally spotted during testing of the ESXi 6.5 client.
Tested by: andreas.nagy@frequentis.com
MFC after: 2 months
When an NFSv4.1 session is busy due to a callback being in progress,
nfsrv_freesession() should return NFSERR_BACKCHANBUSY instead of NFS_OK.
The only effect this has is that the DestroySession operation will report
the failure for this case and this probably has little or no effect on a
client. Spotted by inspection and no failures related to this have been
reported.
MFC after: 2 months
- Create getblkx(9) variant of getblk(9) which can return error.
- Add GB_NOSPARSE flag for getblk()/getblkx() which requests that BMAP
was performed before the buffer is created, and EJUSTRETURN returned
in case the requested block does not exist.
- Make ffs_read() use GB_NOSPARSE to avoid instantiating buffer (and
allocating the pages for it), copying from zero_region instead.
The end result is less page allocations and buffer recycling when a
hole is read, which is important for some benchmarks.
Requested and reviewed by: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D14917
It appears that domain information is set correctly independent
of whether or not NUMA is defined. However, there is no memory
backing secondary domains leading to allocation failure.
Reported by: pho@, np@
Approved by: sbruno@
unwind_frame() may be instrumented by FBT, leading to recursion into
dtrace_probe(). Manually inline unwind_frame() as we do with stack
unwinding code for other architectures.
Submitted by: Domagoj Stolfa
Reviewed by: manu
MFC after: 1 week
Sponsored by: DARPA / AFRL
Differential Revision: https://reviews.freebsd.org/D15359
Don't enable regulator on attach but dealt with them on power_up/power_off
Only set the voltage for the signaling regulator since I don't have boards
that can change the supply voltage.
Enable 1.8v signaling voltage.
Only do a reset of the controller at attach and init it at power_up.
We use to enable some interrupts in reset, only enable the interrupts
we are interested in when doing a request.
While here remove the regulators handling in power_on as it is very wrong
and will be dealt with in another commit.
Tested on: A31, A64
is executed on the right stack already. No copy from the entry stack
to the kstack must be performed for vm86 bios call code to function.
To access the pcb flags on kernel entry, unconditionally switch to
kernel address space if vm86 mode is detected.
This fixes very early vm86 bios calls, typically done when boot is
performed by boot2 without loader, and kernel falls back to BIOS calls
to get SMAP.
Reported by: bde
Sponsored by: The FreeBSD Foundation
PCB_VM86CALL pcb flag not set should be treated same as return to
userspace.
Most important, the address space must be switched. This fixes
usermode vm86 operations after the 4/4 split.
Sponsored by: The FreeBSD Foundation
exception code is copied to the trampoline.
The correct value is then copied to trampoline automatically, so
tramp_idleptd_reloced can be eliminated.
This will allow to use the same exception entry code to handle traps
from vm86 bios calls on early boot stage, as after the trampoline is
configured.
Sponsored by: The FreeBSD Foundation
Record common_tssd, the descriptor to be written in GDT to point to
the common TSS, before LTR is executed. The LTR instruction sets the
loaded descriptor type to 386 TSS busy, which traps on reloads.
Sponsored by: The FreeBSD Foundation
On non-trivial SMP systems the contention on the pmc_owner mutex leads
to a substantial number of samples captured being from the pmc process
itself. This change a) makes buffers larger to avoid contention on the
global list b) makes the working sample buffer per cpu.
Run pmcstat in the background (default event rate of 64k):
pmcstat -S UNHALTED_CORE_CYCLES -O /dev/null sleep 600 &
Before:
make -j96 buildkernel -s >&/dev/null 3336.68s user 24684.10s system 7442% cpu 6:16.50 total
After:
make -j96 buildkernel -s >&/dev/null 2697.82s user 1347.35s system 6058% cpu 1:06.77 total
For more realistic overhead measurement set the sample rate for ~2khz
on a 2.1Ghz processor:
pmcstat -n 1050000 -S UNHALTED_CORE_CYCLES -O /dev/null sleep 6000 &
Collecting 10 samples of `make -j96 buildkernel` from each:
x before
+ after
real time:
N Min Max Median Avg Stddev
x 10 76.4 127.62 84.845 88.577 15.100031
+ 10 59.71 60.79 60.135 60.179 0.29957192
Difference at 95.0% confidence
-28.398 +/- 10.0344
-32.0602% +/- 7.69825%
(Student's t, pooled s = 10.6794)
system time:
N Min Max Median Avg Stddev
x 10 2277.96 6948.53 2949.47 3341.492 1385.2677
+ 10 1038.7 1081.06 1070.555 1064.017 15.85404
Difference at 95.0% confidence
-2277.47 +/- 920.425
-68.1574% +/- 8.77623%
(Student's t, pooled s = 979.596)
x no pmc
+ pmc running
real time:
HEAD:
N Min Max Median Avg Stddev
x 10 58.38 59.15 58.86 58.847 0.22504567
+ 10 76.4 127.62 84.845 88.577 15.100031
Difference at 95.0% confidence
29.73 +/- 10.0335
50.5208% +/- 17.0525%
(Student's t, pooled s = 10.6785)
patched:
N Min Max Median Avg Stddev
x 10 58.38 59.15 58.86 58.847 0.22504567
+ 10 59.71 60.79 60.135 60.179 0.29957192
Difference at 95.0% confidence
1.332 +/- 0.248939
2.2635% +/- 0.426506%
(Student's t, pooled s = 0.264942)
system time:
HEAD:
N Min Max Median Avg Stddev
x 10 1010.15 1073.31 1025.465 1031.524 18.135705
+ 10 2277.96 6948.53 2949.47 3341.492 1385.2677
Difference at 95.0% confidence
2309.97 +/- 920.443
223.937% +/- 89.3039%
(Student's t, pooled s = 979.616)
patched:
N Min Max Median Avg Stddev
x 10 1010.15 1073.31 1025.465 1031.524 18.135705
+ 10 1038.7 1081.06 1070.555 1064.017 15.85404
Difference at 95.0% confidence
32.493 +/- 16.0042
3.15% +/- 1.5794%
(Student's t, pooled s = 17.0331)
Reviewed by: jeff@
Approved by: sbruno@
Differential Revision: https://reviews.freebsd.org/D15155
The Linux client now uses the TestStateID operation, so this patch adds
support for it to the NFSv4.1 server. The FreeBSD client never uses this
operation, so it should not be affected.
MFC after: 2 months
r333175 updated the join_group functions, but not the leave_group ones.
Reviewed by: sbruno
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15393
Part 3 of many ...
The VPC framework relies heavily on cloning pseudo interfaces
(vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc).
This pulls in that piece. Some ancillary changes get pulled
in as a side effect.
Reviewed by: shurd@
Approved by: sbruno@
Sponsored by: Joyent, Inc.
Differential Revision: https://reviews.freebsd.org/D15347