determining the softc of the bridge in psycho_route_interrupt(). [1]
While at it, update the corresponding comment that the code in
question is also necessary for U30s in addition to E450s (a fact
that has been known for ages).
PR: 218478
Submitted by: Yoshihiko Iwama
The code specified the length of a layout as INT64_MAX instead of
UINT64_MAX. This could result in getting a layout for less than the
full file for extremely large files. Although having little practical
effect, this patch corrects this in the code.
Detected during recent testing of the pNFS server.
MFC after: 2 weeks
In exceptional circumstances, an MCA exception will trigger when the
freelist is exhausted. In such a case, no error will be logged on the list
and 'mca_count' will not be incremented.
Prior to this patch, all CPUs that received the exception would spin
forever.
With this change, the CPU that detects the error but finds the freelist
empty will proceed to panic the machine, ending the deadlock.
A follow-up to r260457.
Reported by: Ryan Libby <rlibby at gmail.com>
Reviewed by: jhb@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D10536
It improves interoperability with if_bridge, which may need to disable
some capabilities not supported by other members. IMHO there is still
open question about LRO capability, which may need to be disabled on
physical interface.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
7740 fix for 6513 only works in hole punching case, not truncation
illumos/illumos-gate@7de35a3ed07de35a3ed0https://www.illumos.org/issues/7740
The problem is that dbuf_findbp will return ENOENT if the block it's
trying to find is beyond the end of the file. If that happens, we assume
there is no birth time, and so we lose that information when we write
out new blkptrs. We should teach dbuf_findbp to look for things that are
beyond the current end, but not beyond the absolute end of the file.
To verify, create a large file, truncate it to a short length, and then
write beyond the end. Check with zdb to make sure that there are no
holes with birth time zero (will appear as gaps).
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Paul Dagnelie <pcd@delphix.com>
7743 per-vdev-zaps have no initialize path on upgrade
illumos/illumos-gate@555da5111b555da5111bhttps://www.illumos.org/issues/7743
When loading a pool that had been created before the existance of
per-vdev zaps, on a system that knows about per-vdev zaps, the
per-vdev zaps will not be allocated and initialized.
This appears to be because the logic that would have done so, in
spa_sync_config_object(), is not reached under normal operation. It is
only reached if spa_config_dirty_list is non-empty.
The fix is to add another `AVZ_ACTION_` enum that will allow this code
to be reached when we detect that we're loading an old pool, even when
there are no dirty configs.
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
7613 ms_freetree[4] is only used in syncing context
illumos/illumos-gate@5f145778015f14577801https://www.illumos.org/issues/7613
metaslab_t:ms_freetree[TXG_SIZE] is only used in syncing context. We should
replace it with two trees: the freeing tree (ranges that we are freeing this
syncing txg) and the freed tree (ranges which have been freed this txg).
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Some notes:
- Only i386 and amd64 layouts are checked, other Tier-1 (or close to
it) architectures would benefit from the same check.
- Unconditional enabling of the asserts depend on the stability of locks
memory layout. If locks are optimized to avoid bloat when some debugging
or profiling features turned off, it makes sense to only assert layout
for production configs.
Reviewed by: badger, emaste, jhb, vangyzen
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D10526
7586 remove #ifdef __lint hack from dmu.h
illumos/illumos-gate@4ba5b961634ba5b96163https://www.illumos.org/issues/7586
The #ifdef __lint in dmu.h is ugly, and it would be nice not to duplicate it if
we add other inline functions into header files in ZFS, especially since it is
difficult to make any other solution work across all compilation targets. We
should switch to disabling the lint flags that are failing instead.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Dan Kimmel <dan.kimmel@delphix.com>
available and if that is true make use of them.
Thank you very much to Andrew Turner for providing help and review the patch!
Reviewed by: andrew
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10499
7580 ztest failure in dbuf_read_impl
illumos/illumos-gate@1a01181fdc1a01181fdchttps://www.illumos.org/issues/7580
We need to prevent any reader whenever we're about the zero out all the
blkptrs. To do this we need to grab the dn_struct_rwlock as writer in
dbuf_write_children_ready and free_children just prior to calling bzero.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: George Wilson <george.wilson@delphix.com>
- Rename the default implementation of 'pcib_request_feature' and add
a pcib_request_feature() wrapper function (as is often done for
new-bus APIs implemented via kobj) that accepts a single function.
Previously the call to pcib_request_feature() ended up invoking the
method on the great-great-grandparent of the bridge device instead
of the grandparent. For a bridge that was a direct child of pci0 on
x86 this resulted in the method skipping over the Host-PCI bridge
driver and being invoked against nexus0
- When invoking _OSC from a Host-PCI bridge driver, invoke
device_get_softc() against the Host-PCI bridge device instead of the
child bridge that is requesting HotPlug. Using the wrong softc data
resulted in garbage being passed for the ACPI handle causing the
_OSC call to fail.
- While here, perform some other cleanups to _OSC handling in the ACPI
Host-PCI bridge driver:
- Don't invoke _OSC when requesting a control that has already been
granted by the firmware.
- Don't set the first word of the capability array before invoking
_OSC. This word is always set explicitly by acpi_EvaluateOSC()
since it is UUID-independent.
- Don't modify the set of granted controls unless _OSC doesn't exist
(which is treated as always successful), or the _OSC method
doesn't fail.
- Don't require an _OSC status of 0 for success. _OSC always
returns the updated control mask even if it returns a non-zero
status in the first word.
- Whine if _OSC ever tries to revoke a previously-granted control.
(It is not supposed to do that.)
- While here, add constants for the _OSC status word in acpivar.h
(though currently unused).
Reported by: adrian
Reviewed by: imp
MFC after: 1 week
Tested on: Lenovo x220
Differential Revision: https://reviews.freebsd.org/D10520
7606 dmu_objset_find_dp() takes a long time while importing pool
illumos/illumos-gate@7588687e6b7588687e6bhttps://www.illumos.org/issues/7606
When importing a pool with a large number of filesystems within the same
parent filesystem, we see that dmu_objset_find_dp() takes a long time.
It is called from 3 places: spa_check_logs(), spa_ld_claim_log_blocks(),
and spa_load_verify().
There are several ways to improve performance here:
1. We don't really need to do spa_check_logs() or
spa_ld_claim_log_blocks() if the pool was closed cleanly.
2. spa_load_verify() uses dmu_objset_find_dp() to check that no
datasets have too long of names.
3. dmu_objset_find_dp() is slow because it's doing
zap_value_search() (which is O(N sibling datasets)) to determine
the name of each dsl_dir when it's opened. In this case we
actually know the name when we are opening it, so we can provide
it and avoid the lookup.
This change implements fix#3 from the above list; i.e. make
dmu_objset_find_dp() provide the name of the dataset so that we don't
have to search for it.
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <prashksp@gmail.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>
linux_page_address() function in the LinuxKPI. This solves an issue
where the return value from linux_page_address() is passed to
kmem_free().
MFC after: 1 week
Sponsored by: Mellanox Technologies
The nfsv4_seqsession() call returns NFSERR_REPLYFROMCACHE when it has a
reply in the session, due to a requestor retry. The code erroneously
assumed a return of 0 for this case. This patch fixes this and adds
a KASSERT(). This would be an extremely rare occurrence. It was found
during code inspection during the pNFS server development.
MFC after: 2 weeks
busdma know, so that on architectures where dma isn't always coherent, we
know we don't have to write-back/invalidates cachelines on DMA operations.
Reviewed by: andrew, mav
validation of SEG.ACK as the first step. If the ACK is not acceptable,
a RST segment should be sent and the segment should be dropped.
Up to now, the segment was partially processed.
This patch moves the check for the SEG.ACK validation up to the front
as required.
Reviewed by: hiren, gnn
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D10424
PCB SP cache acquires extra reference, when SP is stored in the cache.
Release this reference when PCB is destroyed in ipsec_delete_pcbpolicy().
In ipsec_copy_pcbpolicy() release reference to SP in case if sp_in or
sp_out are not NULL.
Reported by: Slawa Olhovchenkov <slw at zxy spb ru>
MFC after: 1 week
before we increase irq again, or we'd end up choosing an irq, and then
really using the next one, even if it's not available.
Also in the inner loop, correct the end check so that we check every irq,
even the last one.
This makes the msk(4) adapter able to use MSI on Softiron Overdrive 1000.
This check has been redundant since it was introduced in r162554.
Reviewed by: emaste, glebius
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10322
7252 7628 compressed zfs send / receive
illumos/illumos-gate@5602294fda5602294fdahttps://www.illumos.org/issues/7252
This feature includes code to allow a system with compressed ARC enabled to
send data in its compressed form straight out of the ARC, and receive data in
its compressed form directly into the ARC.
https://www.illumos.org/issues/7628
We should have longer, more readable versions of the ZFS send / recv options.
7628 create long versions of ZFS send / receive options
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: David Quigley <dpquigl@davequigley.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Dan Kimmel <dan.kimmel@delphix.com>
wait for the next enqueue from the driver.
Reviewed by: gnn@, hselasky@, gallatin@
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D10432
patm(4) devices.
Maintaining an address family and framework has real costs when we make
infrastructure improvements. In the case of NATM we support no devices
manufactured in the last 20 years and some will not even work in modern
motherboards (some newer devices that patm(4) could be updated to
support apparently exist, but we do not currently have support).
With this change, support remains for some netgraph modules that don't
require NATM support code. It is unclear if all these should remain,
though ng_atmllc certainly stands alone.
Note well: FreeBSD 11 supports NATM and will continue to do so until at
least September 30, 2021. Improvements to the code in FreeBSD 11 are
certainly welcome.
Reviewed by: philip
Approved by: harti
The NFSv4 RFCs give a server the option of allowing the use of an open
stateid for write access to be used for a Read operation.
This patch enables this by default and adds a sysctl to disable it,
for anyone who does not want this capability.
Allowing this is particularily useful for a pNFS Data Server (DS), since
they are not permitted to allow the use of special stateids.
Discovered during recent testing of the pNFS server under development.
MFC after: 2 weeks
BHND_EROM_DUMP() method.
Dump the EROM tables to the coneole on mips/broadcom devices if bootverbose
is enabled; this functionality is primarily useful when debugging SoC EROM
parsing and device matching issues during early boot.
Reviewed by: mizhka
Approved by: adrian (mentor)
Sponsored by: Plausible Labs
Differential Revision: https://reviews.freebsd.org/D10122
kernel calls this directly so the event handler is not called, meaning
the computer fails to reboot.
Tested by: cognet
MFC after: 1 week
Sponsored by: DARPA, AFRL