A Build-ID is an identifier generated at link time to uniquely identify
ELF binaries. It allows efficient confirmation that an executable or
shared library and a corresponding standalone debuginfo file match.
(Otherwise, a checksum of the debuginfo file must be calculated when
opening it in a debugger.)
The FreeBSD base system includes GNU bfd ld 2.17.50 as the linker for
architectures other than arm64. Build-ID support was added to bfd ld
shortly after that version, so was not previously available to us.
We can now start making use of Build-ID as we migrate to using lld or
bfd ld from ports, conditionally enabled based on the LINKER_TYPE and
LINKER_VERSION make variables added in r320244 and subsequent commits.
Reviewed by: dim
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11314
This allows them to be sent in a non truncated way and addresses a warning
given by newver versions of gcc.
Thanks to Anselm Jonas Scholl for reporting it and providing a patch.
FreeBSD note: the actual change has been in FreeBSD since r297848. This
commit accounts for integration of that change with subsequent changes,
especially r320156 (MFV of r318946) and r314274.
illumos/illumos-gate@403a8da73c403a8da73chttps://www.illumos.org/issues/5220
There are disk devices that have logical sector size larger than 512B, for
example 4KB. That is, their physical sector size is larger than 512B and they
do not provide emulation for 512B sector sizes. For such devices both a data
offset and a data size must be properly aligned. L2ARC should arrange that
because it uses physical I/O.
zio_vdev_io_start() performs a necessary transformation if io_size is not
aligned to vdev_ashift, but that is done only for logical I/O. Something
similar should be done in L2ARC code.
* a temporary write buffer should be allocated if the original buffer is
not going to be compressed and its size is not aligned
* size of a temporary compression buffer should be ashift aligned
* for the reads, if a size of a target buffer is not sufficiently large and
it is not aligned then a temporary read buffer should be allocated
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>
MFC after: 3 weeks
illumos/illumos-gate@0255edcc850255edcc85https://www.illumos.org/issues/8056
The send size estimate for a zvol can be too low, if the size of the record
headers (dmu_replay_record_t's) is a significant portion of the size.
This is typically the case when the data is highly compressible, especially
with embedded blocks.
The problem is that dmu_adjust_send_estimate_for_indirects() assumes that
blocks are the size of the "recordsize" property (128KB).
However, for zvols, the blocks are the size of the "volblocksize" property
(8KB). Therefore, we estimate that there will be 16x less record headers than
there really will be.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
MFC after: 3 weeks
FreeBSD note: this commit removes small differences between what mav
committed to FreeBSD in r308782 and what ended up committed to illumos
after addressing all review comments.
illumos/illumos-gate@c5ee46810fc5ee46810fhttps://www.illumos.org/issues/7578
After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.
Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG. Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.
Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size. In some
cases efficiency stopped to almost as low as 50%. In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas. This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently. Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.
While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Steven Hartland <steven.hartland@multiplay.co.uk>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Alexander Motin <mav@FreeBSD.org>
MFC after: 3 weeks
Relocatable linking in aarch64 ld from binutils 2.25.1 does not work.
The linker corrupts the references to the external symbols which are
defined by other object in the linking set and should therefore lose
the GOT entry.
The problem is fixed in later versions of GNU ld and does not exist in
the in-tree lld linker that we now use by default for arm64, so the
workaround can be removed.
Reviewed by: kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11302
The EFI memory descriptor 64-bit aligns PhysicalStart on both 32- and
64-bit platforms. Make the padding explicit for i386 EFI.
Submitted by: Siva Mahadevan <smahadevan@freebsdfoundation.org>
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11301
- Rename _SKIP_READ_DEPEND to _SKIP_DEPEND since it also avoids writing.
- This now uses .NOMETA to avoid reading any .meta files related to
DEPENDOBJS. Objects not in OBJS/DEPENDOBJS may still have their .meta
files read in if they are in the dependency graph.
- This also avoids statting .meta and .depend files in the META_MODE +
-DNO_FILEMON case.
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
ext4 on linux has always supported more than 32000 directories through
the dir_nlink feature, but FreeBSD was unable to catch up on this feature.
As part of the 64 bit inode changes nlink_t has been extended and this
feature is now possible.
Submitted by: Fedor Uporov
Differential Revision: https://reviews.freebsd.org/D11210
initialized.
bdrewery@ has reported panics "newnfs_copycred: negative nfsc_ngroups".
The only way I can see that this occurs is that the credentials field of
the open structure gets used before being filled in.
I am not sure quite how this happens, but for the file create case, the
code is serialized via the vnode lock on the directory. If, somehow, a
link to the same file gets created just after file creation, this might
occur.
This patch ensures that the credentials field is initialized to a reasonable
set of credentials before the structure is linked into any list, so I
this should ensure it is initialized before use.
I am committing the patch now, since bdrewery@ notes that the panics
are intermittent and it may be months before he knows if the patch fixes
his problem.
Reported by: bdrewery
MFC after: 2 weeks
instantiated.
Calling pmap_copy() on non-faulted anonymous memory entries is useless.
Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This patch disables outer cache sync in PL310 driver
by adding "arm,io-coherent" property. In addition to
the previous patches it was the last bit needed
for enabling proper operation of Armada 38x SoCs
with the IO cache coherency.
Submitted by: Michal Mazur <mkm@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: mmel
Differential revision: https://reviews.freebsd.org/D11204
Armada 38x SoCs, in order to work properly in IO-coherent mode,
requires an update of the MBUS windows attributesd.
This patch also configures nexus coherent dma tag, because all
busses and children devices have to inherit this setting in runtime.
The latter has to be executed as a sysinit (SI_SUB_DRIVERS type),
so that bus_dma_tag_create() can be executed properly.
Submitted by: Michal Mazur <mkm@semihalf.com>
Marcin Wojtas <mw@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: ian
Differential revision: https://reviews.freebsd.org/D11203
Allow to set the dma tag for nexus in the platform init code,
so that all busses and devices would be able to inherit it.
This change is useful e.g. for setting coherent dma tag for
the platforms with hardware IO cache coherency.
Submitted by: ian
Michal Mazur <mkm@semihalf.com>
Reviewed by: ian
Differential revision: https://reviews.freebsd.org/D11202
- Inherit BUS_DMA_COHERENT flag from parent buses
- Use cacheable memory attributes on dma coherent platform
- Disable cache synchronization on coherent platform
Changes are based on ARMv8 busdma code and commit r299683.
Submitted by: Michal Mazur <mkm@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: ian
Differential revision: https://reviews.freebsd.org/D11201
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11286
memory map request. When the VM fault handler is NULL a return code of
VM_PAGER_BAD is returned from the character device's pager populate
handler. This fixes compatibility with Linux.
MFC after: 1 week
Sponsored by: Mellanox Technologies
All of the problems were related to the FreeBSD-only features.
One was caused by a mismerge in the zfsbootcfg support code.
All others were in the TRIM support code.
MFC after: 1 week
X-MFC with: r320156
All of the problems were related to the FreeBSD-only features.
One was caused by a mismerge in the zfsbootcfg support code.
All others were in the TRIM support code.
Reported by: ken,
O. Hartmann <ohartmann@walstatt.org>,
Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no>
MFC after: 1 week
X-MFC with: r320156
On some windows hosts TEST_UNIT_READY command will return
SRB_STATUS_ERROR and sense data "NOT READY asc:3a,1 (Medium
not present - tray closed)", this occurs periodically, and
not hurt anything else. So, we prefer to ignore this kind
of errors.
PR: 219973
Submitted by: Hongjiang Zhang <hongzhan microsoft com>
MFC after: 3 days
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D11271
ARM kernel modules require .text relocations (DT_TEXTREL) in shared
object ouptut, which is not allowed by default by lld. Add the -znotext
option to enable this. For simplicity add it unconditionally: it is
already default and thus either redundant (GNU BFD ld and gold from
ports) or ignored as an unknown option (GNU BFD ld 2.17.50 in the base
system).
Reviewed by: kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11250
being overwritten, they are set only bits (cleared by hardware).
Disable the Acknowledge of the controller slave address. The slave mode is
not supported.
Make sure the interrupt flag bit is being cleared as recommended, add a
delay() _after_ clear the interrupt bit.
Sponsored by: Rubicon Communications, LLC (Netgate)
illumos/illumos-gate@770499e185770499e185https://www.illumos.org/issues/8021
The ARC buf data project (known simply as "ABD" since its genesis in the ZoL
community) changes the way the ARC allocates `b_pdata` memory from using linear
`void *` buffers to using scatter/gather lists of fixed-size 1KB chunks. This
improves ZFS's performance by helping to defragment the address space occupied
by the ARC, in particular for cases where compressed ARC is enabled. It could
also ease future work to allocate pages directly from `segkpm` for minimal-
overhead memory allocations, bypassing the `kmem` subsystem.
This is essentially the same change as the one which recently landed in ZFS on
Linux, although they made some platform-specific changes while adapting this
work to their codebase:
1. Implemented the equivalent of the `segkpm` suggestion for future work
mentioned above to bypass issues that they've had with the Linux kernel memory
allocator.
2. Changed the internal representation of the ABD's scatter/gather list so it
could be used to pass I/O directly into Linux block device drivers. (This
feature is not available in the illumos block device interface yet.)
FreeBSD notes:
- the actual (default) chunk size is 4KB (despite the text above saying 1KB)
- we can try to reimplement ABDs, so that they are not permanently
mapped into the KVA unless explicitly requested, especially on
platforms with scarce KVA
- we can try to use unmapped I/O and avoid intermediate allocation of a
linear, virtual memory mapped buffer
- we can try to avoid extra data copying by referring to chunks / pages
in the original ABD
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Dan Kimmel <dan.kimmel@delphix.com>
MFC after: 3 weeks
I think that the change is still good, but reconciling it with a planned
merge of the ARC buf data scatter-ization is a bit more tedious
than I can handle.
MFC after: 17 days
From the linux tune2fs(8) manpage:
"Allow the kernel to initialize bitmaps and inode tables and keep a high
watermark for the unused inodes in a filesystem, to reduce e2fsck(8) time.
This first e2fsck run after enabling this feature will take the full time,
but subsequent e2fsck runs will take only a fraction of the original time,
depending on how full the file system is."
Submitted by: Fedor Uporov
Differential Revision: https://reviews.freebsd.org/D11211
When a PL310 cache is used on a system that provides hardware
coherency, the outer cache sync operation is useless, and can be
skipped. Moreover, on some systems, it is harmful as it causes
deadlocks between the Marvell coherency mechanism, the Marvell PCIe
or Crypto controllers and the Cortex-A9.
To avoid this, this commit introduces a new Device Tree property
'arm,io-coherent' for the L2 cache controller node, valid only for the
PL310 cache. It identifies the usage of the PL310 cache in an I/O
coherent configuration. Internally, it makes the driver disable the
outer cache sync operation.
Note, that other outer-cache operations are not removed, as they may
be needed for certain situations, such as booting secondary CPUs.
Moreover, in order to enable IO coherent operation, the decision
whether to use L2 cache maintenance callbacks is done in busdma
layer, which was enabled in one of the previous commits.
Submitted by: Michal Mazur <mkm@semihalf.com>
Marcin Wojtas <mw@semihalf.com>
Reviewed by: mmel
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D11245
There is a hardware problem between Cortex-A9 CPUs and on-chip devices
in Armada 38X SoCs that may cause hang on heavy load. This can be
however worked around by mapping all registers and PCI IO
as strongly ordered instead of device memory.
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Reviewed by: mmel
Tested by: mw_semihalf.com
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D10218