It is required to hold lock that is associated with buffer ring before
flushing drbr.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
Lack of this lock was causing crash if down was called in
parallel with the initialization routine.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
the commit message; as actually implemented, the intent is to retry
up to 2 ms for controllers to enable bus power.
Noticed by: ian@, rgrimes@
Additional note: Among others, the problem addressed by r320577 is
the APL32 ("Storage Controllers May Not Be Power Gated") erratum.
Hopefully, along with r318282, r320577 works around the remaining
problems seen with Intel Apollo Lake eMMC and SDXC controllers.
The vm_map_fixed() and vm_map_stack() VM functions return Mach error
codes. Convert them into errno values before returning result from
exec_new_vmspace().
While there, modernize the comment and do minor style adjustments.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Upstream DTS for A64 SoC doesn't provide a /clocks node as Linux switched
to ccu-ng
This commit adds the necessary bits to boot on pine64 with latest DTS from
upstream.
USB is not working for now and some node aren't present in the DTS (like the
PMU, Power Management Unit).
Tested on: Pine64
iflib - reset fl-ifl_fragidx to 0 on iflib_fl_bufs_free(). This caused the
panic in em/igb when adding it to a bridge device.
iflib - Handle out of order packet delivery from hardware in support of LRO
Out of order updates to rxd's is fixed in r315217. However, it is not
completely fixed. While refilling the buffers, iflib is not considering
the out of order descriptors. Hence, it is refilling sequentially.
"idx" variable in _iflib_fl_refill routine is incremented sequentially.
By doing refilling sequentially, it will override the SGEs that
are *IN USE* by other connections. Fix is to maintain a bitmap of
rx descriptors and differentiate the used one with unused one and
refill only at the unused indices. This patch also fixes a
few bugs in bnxt, related to the same feature.
Submitted by: bhargava.marreddy@broadcom.com
Reviewed by: venkatkumar.duvvuru@broadcom.com shurd
Differential Revision: https://reviews.freebsd.org/D10681
Instead of using GID_FT SNS request to get list of registered FCP ports,
use GID_PT to get list of all Nx_Ports, and then use GFF_ID and/or GFT_ID
requests to find whether they are FCP and target capable.
The problem with old approach is that GID_FT does not report ports without
FC-4 type registered. In particular it was impossible to boot OS from
FreeBSD FC target using QLogic FC BIOS, since one does not register FC-4
type even on new cards and so ignored by old code as incompatible.
As a side bonus this allows initiator to skip pointless logins to other
initiators by fetching that information from SNS instead.
In case some switches do not implement GFF_ID/GFT_ID correctly, add sysctls
to disable that functionality. I handled broken GFF_ID of my Brocade 200E,
but there may be other switches with different bugs.
Linux also uses GID_PT, but GFF_ID is disabled by default there, and GFT_ID
is not supported.
Sponsored by: iXsystems, Inc.
It is useful to know exactly what features may be lacking when trying to
mount ext4 filesystems.
Submitted by: Fedor Uporov
Differential Revision: https://reviews.freebsd.org/D11208
to demote it to 512 pages, then remove each of these. We can just remove
the l2 map directly. This is what the intel pmaps already do.
Sponsored by: DARPA, AFRL
gap entry in the vm map being smaller than the sysctl-derived stack guard
size. Otherwise, the value of max_grow can suffer from overflow, and the
roundup(grow_amount, sgrowsiz) will not be properly capped, resulting in
an assertion failure.
In collaboration with: kib
MFC after: 3 days
All 32bit MIPS ABIs align uint64_t on 8-byte. Since struct kevent32
is defined using 32bit types to avoid extra alignment on amd64/i386,
layout of the structure needs paddings on PowerPC and apparently MIPS.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D11434
This patch was inspired by an opposite change made to shrink the code
for the boot loader.
On my i7-4770, it increases the skein1024 speed from 470 to 550 MB/s
Reviewed by: sbruno
MFC after: 1 month
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D7824
This fixes an integer underflow in efipart_realstrategy, which causes
crashes when an I/O operation's start point is after the end of the disk.
This can happen when trying to detect filesystems on very small disks.
This can occur if a BIOS freebsd-boot partition exists on a system when the
EFI loader is being used.
PR: 219000
Submitted by: Eric McCorkle <eric@metricspace.net>
Reviewed by: cem (previous version), tsoome (previous version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10559
By default LLD links with relocations disallowed against readonly
sections (e.g., .text), but the 32-bit ARM EFI & uboot boot bits require
such relocations. -znotext is either ignored as an unknown -z option
(in-tree lld 2.17.50) or is already the default (GNU ld or GNU gold from
ports) so we can just add it unconditionally to allow building with LLD.
This is similar to the change in r320179 for the kernel link.
Sponsored by: The FreeBSD Foundation
start address is not required to be page aligned. However, the loop
within pmap_invalidate_cache_range() that performs the actual cache
line invalidations requires that the starting address be truncated to
a multiple of the cache line size. This change corrects an error in
that truncation.
Submitted by: Brett Gutstein <bgutstein@rice.edu>
Reviewed by: kib
MFC after: 1 week
were unneeded as we tell the tlb the pagetables are in cached memory. This
gives us a small, but statistically significant improvement over just
removing the PTE_SYNC cases.
While here remove PTE_SYNC, it's now unneeded.
Sponsored by: DARPA, AFRL
--Remove special-case handling of sparc64 bus_dmamap* functions.
Replace with a more generic mechanism that allows MD busdma
implementations to generate inline mapping functions by
defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>. This
is currently useful for sparc64, x86, and arm64, which all
implement non-load dmamap operations as simple wrappers
around map objects which may be bus- or device-specific.
--Remove NULL-checked bus_dmamap macros. Implement the
equivalent NULL checks in the inlined x86 implementation.
For non-x86 platforms, these checks are a minor pessimization
as those platforms do not currently allow NULL maps. NULL
maps were originally allowed on arm64, which appears to have
been the motivation behind adding arm[64]-specific barriers
to bus_dma.h, but that support was removed in r299463.
--Simplify the internal interface used by the bus_dmamap_load*
variants and move it to bus_dma_internal.h
--Fix some drivers that directly include sys/bus_dma.h
despite the recommendations of bus_dma(9)
Reviewed by: kib (previous revision), marius
Differential Revision: https://reviews.freebsd.org/D10729
performance.
To find in the leaf bitmap all ranges of sufficient length, use a doubling
strategy with shift-and-and until each bit still set represents a bit
sequence of length 'count', or until the bitmask is zero. In the latter
case, update the hint based on the first bit sequence length not found to
be available. For example, seeking an interval of length 12, the set bits
of the bitmap would represent intervals of length 1, then 2, then 3, then
6, then 12. If no bits are set at the point when each bit represents an
interval of length 6, then the hint can be updated to 5 and the search
terminated.
If long-enough intervals are found, discard those before the cursor. If
any remain, use binary search to find the position of the first of them,
and allocate that interval.
Submitted by: Doug Moore <dougm@rice.edu>
Reviewed by: kib, markj
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D11426
gcc produces a "variably modified X at file scope" warning for
structures that use these size definitions. I think the definitions are
actually fine but can be rephrased with the __CONST_RING_SIZE macro more
cleanly anyway.
Reviewed by: markj, royger
Approved by: markj (mentor)
Sponsored by: Dell EMC Isilon
Differential revision: https://reviews.freebsd.org/D11417
The acq barrier in refcount_acquire() has no use, constructor must
ensure that the changes are visible before publication by other means.
Last release must sync/with the constructor and all updaters.
This is based on the refcount/shared_ptr analysis I heard at the Hans
Boehm and Herb Sutter talks about C++ atomics.
Reviewed by: alc, jhb, markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D11270
recycles the current vm space. Otherwise, an mlockall(MCL_FUTURE) could
still be in effect on the process after an execve(2), which violates the
specification for mlockall(2).
It's pointless for vm_map_stack() to check the MEMLOCK limit. It will
never be asked to wire the stack. Moreover, it doesn't even implement
wiring of the stack.
Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11421
For ATIOs it is pointless to report isp_loopid to CAM, since in other
places it operates with port database record IDs, not with loop IDs.
For INOTs target_id/target_lun seems were never set, so wildcard INOTs
probably were not working correctly when LUN IDs were important.
Process core notes for a 32-bit process running on a 64-bit host need to
use 32-bit structures so that the note layout matches the layout of notes
of a core dump of a 32-bit process under a 32-bit kernel.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D11407
struct g_kevent_args.
On some architectures, e.g. PowerPC, there is additional padding in uap.
Reported and tested by: andreast
Sponsored by: The FreeBSD Foundation
The implementation is using fixed size array allocated in asm module,
need to use proper array declaration for C source.
CID: 1376405
Reported by: Coverity, cem
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D11321
which is able to manipulate the clock and data lines directly.
When an i2c bus is hung by a slave device stuck in the middle of a
transaction that didn't complete properly, this function manipulates the
clock and data lines in a sequence known to reliably reset slave devices.
The most common cause of a hung i2c bus is a system reboot in the middle of
an i2c transfer (so it doesnt' happen often, but now there is a way other
than power cycling to recover from it).
nostop option is set, if a start was issued.
The nostop option doesn't mean "never issue a stop" it means "only issue
a stop after the last in a series of transfers". If the transfer ends
due to error, then that was the last transfer in the series, and a stop
is required.
Before this change, any error during a transfer when nostop is set would
effectively hang the bus, because sc->started would never get cleared,
and that caused all future calls to iicbus_start() to return an error
because it looked like the bus was already active. (Unrelated errors in
handling the nostop option, to be addressed separately, could lead to
this bus hang condition even on busses that don't set the nostop option.)
If an NFSv3 server were to reply with weak cache consistency attributes,
but not post operation attributes, the client would use garbage attributes
from memory. This was spotted during work on the code for the NFSv4.1 client.
I have never seen evidence that this happens and it wouldn't make sense
for an NFSv3 server to do this, so this patch is basically "theoretical",
but does fix the problem if a server were to do the above.
PR: 219552
MFC after: 2 weeks
When a pin is set for input the value in the DR will be the same as the PSR.
When a pin is set for output the value in the DR is the value output to the
pad, and the value in the PSR is the actual electrical level sensed on the
pad, and they can be different if the pad is configured for open-drain mode
and some other entity on the board is driving the line low.
group. Change all code points that open-coded this functionality
to use the new function. This commit is a refactoring with no
change in functionality.
In the future this change allows more robust checking of cylinder
group reads along the lines discussed in the hardening UFS session
at BSDCan (retry I/O, add checksums, etc). For more detail see the
session notes at https://wiki.freebsd.org/DevSummit/201706/HardeningUFS
Reviewed by: kib
The implementation of ZFS refcount_t uses the emulated illumos mutex
(the sx lock) and the waiting memory allocation when ZFS_DEBUG is
enabled. This makes refcount_t unsuitable for use in GEOM g_up
thread where sleeping is prohibited.
When importing the ABD change I modified vdev_geom using illumos
vdev_disk as an example. As a result, I added a call to abd_return_buf
in vdev_geom_io_intr. The latter is called on g_up thread while the
former uses refcount_t.
This change fixes the problem by deferring the abd_return_buf call to
the previously unused vdev_geom_io_done that is called on a ZFS zio
taskqueue thread where sleeping is allowed.
A side bonus of this change is that now a vdev zio has a pointer
to its corresponding bio while the zio is active.
Reported by: Shawn Webb <shawn.webb@hardenedbsd.org>
Tested by: Shawn Webb <shawn.webb@hardenedbsd.org>
MFC after: 1 week
X-MFC with: r320156
This finishes what r245164 started and makes open(..., O_APPEND) work again
after r299753.
- Pass ioflags, incl. IO_APPEND, down to the direct write backend (r245164
added it to only the bio backend).
- (r299753 changed the WRONLY backend from bio to direct.)
PR: 220185
Reported by: Ben RUBSON <ben.rubson at gmail.com>
Reviewed by: bapt@, rmacklem@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11348
a hint.
Right now, for non-fixed mmap(2) calls, addr is de-facto interpreted
as the absolute minimal address of the range where the mapping is
created. The VA allocator only allocates in the range [addr,
VM_MAXUSER_ADDRESS]. This is too restrictive, the mmap(2) call might
unduly fail if there is no free addresses above addr but a lot of
usable space below it.
Lift this implementation limitation by allocating VA in two passes.
First, try to allocate above addr, as before. If that fails, do the
second pass with less restrictive constraints for the start of
allocation by specifying minimal allocation address at the max bss
end, if this limit is less than addr.
One important case where this change makes a difference is the
allocation of the stacks for new threads in libthr. Under some
configuration conditions, libthr tries to hint kernel to reuse the
main thread stack grow area for the new stacks. This cannot work by
design now after grow area is converted to stack, and there is no
unallocated VA above the main stack. Interpreting requested stack
base address as the hint provides compatibility with old libthr and
with (mis-)configured current libthr.
Reviewed by: alc
Tested by: dim (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
the "effective MSS" for the connection. The chip expects it this way.
Submitted by: Krishnamraju Eraparaju @ Chelsio
MFC after: 3 days
Sponsored by: Chelsio Communications
If a peripheral driver (e.g. da, sa, cd) is added or removed from the
peripheral driver list while an unrelated peripheral driver instance (e.g.
da0, sa5, cd2) is going away and is inside camperiphfree(), we could
dereference an invalid pointer.
When peripheral drivers are added or removed (see periphdriver_register()
and periphdriver_unregister()), the peripheral driver array is resized
and existing entries are moved.
Although we hold the topology lock while we traverse the peripheral driver
list, we retain a pointer to the location of the peripheral driver pointer
and then drop the topology lock. So we are still vulnerable to the list
getting moved around while the lock is dropped.
To solve the problem, cache a copy of the peripheral driver pointer. If
its storage location in the list changes while we have the lock dropped, it
won't have any effect.
This doesn't solve the issue that peripheral drivers ("da", "cd", as opposed
to individual instances like "da0", "cd0") are not generally part of a
reference counting scheme to guard against deregistering them while there
are instances active. The caller (generally the person unloading a module)
has to be aware of active drivers and not unload something that is in use.
sys/cam/cam_periph.c:
In camperiphfree(), cache a pointer to the peripheral driver
instance to avoid holding a pointer to an invalid memory location
in the event that the peripheral driver list changes while we have
the topology lock dropped.
PR: kern/219701
Submitted by: avg
MFC after: 3 days
Sponsored by: Spectra Logic
Without the allocation length set, the target will either reject
the command or complete it without transferring any data.
This fixes the REPORT ZONES command for SCSI ZBC protocol devices,
as well as ATA ZAC protocol devices that are behind a SCSI to ATA
translation layer. (LSI/Broadcom's 12Gb SAS adapters translate ZBC
commands to ZAC commands.) Those are Host Aware and Host Managed SMR
drives.
This will fix REPORT ZONE commands sent to the da(4) driver via the
GEOM bio interface and zonectl, and REPORT ZONE commands sent from
camcontrol(8).
Note that in the case of camcontrol(8), we currently only send
SCSI ZBC commands to native SCSI protocol devices, not ATA devices
behind a SAT layer.
sys/cam/scsi/scsi_da.c:
Fill in the length field in scsi_zbc_in().
MFC after: 3 days
Sponsored by: Spectra Logic
and "next_skip" variables. The "skip" value in struct blist has long been
a 64-bit quantity but various functions have implicitly truncated this
value to 32 bits. Now, all arithmetic involving the "skip" value is 64
bits wide. (This should allow us to relax the size limit on a swap device
in the swap pager.)
Maintain the ability to test this allocator as a user-space application by
including <stdbool.h>.
Remove an unused variable from blst_radix_print().
Reviewed by: kib, markj
MFC after: 4 weeks
Differential Revision: https://reviews.freebsd.org/D11358
changed from %d to a long-width type.
Use uintmax_t casting and %ju to futureproof the format string against
potential changes with either the #define or the implementation-specific
definition for offsetof(..).
The fields exist on all versions of the filesystem and using them is a mount
option on linux. For FreeBSD, the corresponding i_uid and i_gid are always
long enough so use them by default.
Reviewed by: Fedor Uporov
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D11354
Without disabling interrupts it's possible for another thread to preempt
and update the registers post-read (tlb1_read_entry) or pre-write
(tlb1_write_entry), and confuse the kernel with mixed register states.
MFC after: 2 weeks
Only amd64 (because of i386) needs 32-bit time_t compat now, everything else is
64-bit time_t. Rather than checking on all 64-bit time_t archs, only check the
oddball amd64/i386.
Reviewed By: emaste, kib, andrew
Differential Revision: https://reviews.freebsd.org/D11364
RPI1-B, Alix and APU2 boards as well as NanoBSD with the following message:
vnode_pager_generic_getpages_done: I/O read error 5
Seems the breakage was because it was missed to include acr in glabel update.
Reported by: Peter Blok <pblok@bsd4all.org>,
madpilot, imp and trasz.
Reviewed by: trasz
Tested by: Peter Blok and madpilot.
MFC after: 3 days.
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11365
After r307132 the sbuf buffer is malloc()ed, but corresponding
sbuf_delete() call was missing.
Fix a nearby whitespace bug.
MFC after: 3 days
Sponsored by: Dell EMC Isilon
vfs.nfsd.nfsd_enable_stringtouid, but in reverse - when set to 1,
it forces the NFSv4 server to return numeric UIDs and GIDs instead
of "user@domain" strings. This helps with clients that can't
translate returned identifiers, eg when rerooting.
The same can be achieved by just never running nfsuserd(8),
but the sysctl is useful to toggle the behaviour back and forth
without rebooting.
Reviewed by: rmacklem (earlier version)
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11326
3306 zdb should be able to issue reads in parallel
illumos/illumos-gate/31d7e8fa33fae995f558673adb22641b5aa8b6e1
https://www.illumos.org/issues/3306
The upstream change was made before we started to import upstream commits
individually. It was imported into the illumos vendor area as r242733.
That commit was MFV-ed in r260138, but as the commit message says
vdev_file.c was left intact.
This commit actually implements the parallel I/O for vdev_file using a
taskqueue with multiple thread. This implementation does not depend on
the illumos or FreeBSD bio interface at all, but uses zio_t to pass
around all the relevent data. So, the code looks a bit different from
the upstream.
This commit also incorporates ZoL commit
zfsonlinux/zfs/bc25c9325b0e5ced897b9820dad239539d561ec9 that fixed
https://github.com/zfsonlinux/zfs/issues/2270
We need to use a dedicated taskqueue for exactly the same reason as ZoL
as we do not implement TASKQ_DYNAMIC.
Obtained from: illumos, ZFS on Linux
MFC after: 2 weeks
AKA Make time_t 64 bits on powerpc(32).
PowerPC currently (until now) was one of two architectures with a 32-bit time_t
on 32-bit archs (the other being i386). This is an ABI breakage, so all ports,
and all local binaries, *must* be recompiled.
Tested by: andreast, others
MFC after: Never
Relnotes: Yes
A NFSv4.1/pNFS server using File Layout can specify that Commit operations
are to be done against the DS instead of MDS. Since no extant pNFS
server did this, the code was untested and "#ifdef notyet".
The FreeBSD pNFS server I am developing does specify that Commits be done
through the DS, so the code has been enabled/tested.
This patch should only affect the case of a pNFS server that specfies
Commits through the DS.
PR: 219551
MFC after: 2 weeks
the requested protection.
The syscall returns success without changing the protection of the
guard. This is consistent with the current mprotect(2) behaviour on
the unmapped ranges. More important, the calls performed by libc and
libthr to allow execution of stacks, if requested by the loaded ELF
objects, do the expected change instead of failing on the grow space
guard.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
If mmap(2) is called with the MAP_STACK flag and the size which is
less or equal to the initial stack mapping size plus guard,
calculation of the mapping layout created zero-sized guard. Attempt
to create such entry failed in vm_map_insert(), causing the whole
mmap(2) call to fail.
Fix it by adjusting the initial mapping size to have space for
non-empty guard. Reject MAP_STACK requests which are shorter or equal
to the configured guard pages size.
Reported and tested by: Manfred Antar <null@pozo.com>
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week