map entries list, and that it does not overlap with the previous and
next entries.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Assume the number of descriptors used is a reasonable value to
increment this otherwise opaque field by.
While here, reduce a minor difference between the legacy and
multiqueue transmit paths.
MFC after: 1 week
This requires the VMware vmxnet3 device to flip the start of packet
descriptor's generation before the rest of the packet's descriptors
have been loaded into the Rx ring. I've never observed this behavior,
and it seems to make the most sense not to do it this way. But it is
not a lot of work for the driver to handle this situation just in case.
MFC after: 1 week
The sbp_cam_detach_target can be called from sbp_post_explore function
on the first target that is not really attached and it was written with
the corresponding safety check in place to tolerate that. Unfortunately
the recent locking cleanup did add a locking assertion that tries to
dereference the target->sbp pointer unconditionally, which causes a less
than desirable outcome. Since the assertion is useful, just initialize
the target sbp pointer once when the sbp device is being initialized instead
of when the target is being attached. This makes the assertion work in all
cases and fixes the crash on boot.
performing cpuid calls.
Also add a new way to specify the level type to cpucontrol(8), as
reported in the manpage.
Sponsored by: EMC / Isilon storage division
Reviewed by: bdrewery, gcooper
Tested by: bdrewery
instead of trying to cache it.
Previously, we only trusted the state if we did not have a cached state.
However, once a state was cached, the _STA method was always ignored.
Specifically, once a power resource had been turned on once (e.g.
during resume), the driver assumed it was always on even if _STA said it
was off and never turned it back on. This prevented the power resource
from being turned back on if a laptop was resumed twice, for example.
To fix, just remove the cached state entirely and always use the results
of _STA. The loops already skip any resources where _STA fails.
Submitted by: trasz (initial patch to invoke _ON)
MFC after: 1 week
corresponding flag(s) in the new map entry. Previously, the caller was
responsible for setting them after vm_map_insert() returned.
Pass MAP_STACK_GROWS_DOWN to vm_map_insert() from vm_map_growstack() when
extending the stack in the downward direction.
Together these changes slightly simplify the caller's task when creating a
downward growing stack. In particular, the caller no longer needs to clip
the previous entry, because the new stack entry can't possibly coalesce
with the previous entry.
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
On one hand this allows removing the CTL_FLAG_TASK_PENDING flag, the handling
of which significantly complicates fine-grained locking. On the other hand
it reduces task management request latency even below what that flag could.
The downside is that task management code is no longer allowed to sleep, but
that is not needed anyway now.
Discussed with: ken
This should allow aborting commands doing mostly disk I/O, such as VERIFY
or WRITE SAME. Before this change CTL_FLAG_ABORT was only checked around
data moves, which for these commands may not happen for a very long time.
MFC after: 2 weeks
SPC-4 recommends that a T10 vendor ID based LUN ID be created by concatenating
product name and serial number (and istgt follows that). But the product name
is 16 bytes long by itself, so a total length of 16 bytes is clearly not enough
to fit both.
To keep compatibility with existing configurations, pad short device IDs
to old length of 16, same as before.
This change probably breaks CTL user-level ABI, so control tools should
be rebuilt after this change.
MFC after: 2 weeks
we're now back to the pre-r228483 level of default verbosity. This in
turn again typically allows for reading information that userland might
have printed on the screen before initiating a halt, but still permits
debugging potential device shutdown problems on system shutdown via
CAM_DEBUG etc.
Reviewed by: mav
MFC after: 3 days
Sponsored by: Bally Wulff Games & Entertainment GmbH
and prevents the request from deleting existing mappings in the
region, failing instead.
Reviewed by: alc
Discussed with: jhb
Tested by: markj, pho (previous version, as part of the bigger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
for any outstanding commands to be properly aborted by CTL.
Without it, in some cases (such as files backing the LUNs
stored on failing disk drives), terminating a busy session
would result in panic.
Reviewed by: mav@ (earlier version)
Sponsored by: The FreeBSD Foundation
Fix the gate in xen_pv_lapic_ipi_vectored to prevent access to element
at position nitems(xen_ipis).
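
The off-by-one described above can be illustrated with a small sketch.
The table and the lookup function are hypothetical stand-ins, not the
actual xen_apic.c code; only the nitems() idiom comes from the message.

```c
#include <stddef.h>

/* Same idea as the nitems() macro from FreeBSD's <sys/param.h>. */
#define nitems(x) (sizeof((x)) / sizeof((x)[0]))

/* Hypothetical table standing in for xen_ipis. */
static const char *ipi_names[] = { "resched", "bitmap", "stop" };

/*
 * Return the entry for idx, or NULL when idx is out of range.
 * The gate must use >=, because idx == nitems(ipi_names) is already
 * one past the last valid element.
 */
static const char *
ipi_lookup(size_t idx)
{
	if (idx >= nitems(ipi_names))
		return (NULL);
	return (ipi_names[idx]);
}
```

A gate written with > instead of >= would let index 3 through in this
example and read past the end of the table.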
Sponsored by: Citrix Systems R&D
Coverity ID: 1223203
Approved by: gibbs
- Don't compare the DMA map to NULL to determine if bus_dmamap_unload()
should be called when releasing a static allocation. Instead, compare
the bus address against 0.
- Don't assume that the DMA map for static allocations is NULL. Instead,
save the value set by bus_dmamem_alloc() so it can later be passed to
bus_dmamem_free(). Also, add missing calls to bus_dmamap_unload() in
these cases before freeing the buffer.
- Use the bus address from the bus_dma callback instead of calling
vtophys() on the address allocated by bus_dmamem_alloc().
Reviewed by: kan
- Add missing calls to bus_dmamap_unload() in et(4).
- Check the bus address against 0 to decide when to call
bus_dmamap_unload() instead of comparing the bus_dma map against NULL.
- Check the virtual address against NULL to decide when to call
bus_dmamem_free() instead of comparing the bus_dma map against NULL.
- Don't clear bus_dma map pointers to NULL for static allocations.
Instead, treat the value as completely opaque.
- Pass the correct virtual address to bus_dmamem_free() in wpi(4) instead
of trying to free a pointer to the virtual address.
Reviewed by: yongari
The functions' definitions are protected by #ifdef SMP.
Keeping the apic_ops.ipi_*() methods NULL allows catching their use
on UP machines.
Reviewed by: royger
Sponsored by: The FreeBSD Foundation
permissions test, forgotten in r164033.
Refactor the permission checks for utimes(2) into the vnode helper
function vn_utimes_perm(9), and simplify its code compared with the
UFS original by writing the call to VOP_ACCESSX only once. Use the
helper for UFS(5), tmpfs(5), devfs(5) and msdosfs(5).
Reported by: bde
Reviewed by: bde, trasz
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
the queue where to enqueue pages that are going to be unwired.
- Add stronger checks to the enqueue/dequeue for the pagequeues when
adding and removing pages to them.
Of course, for unmanaged pages the queue parameter of vm_page_unwire() will
be ignored, just as the active parameter is today.
This makes adding new pagequeues quicker.
This change effectively modifies the KPI. __FreeBSD_version will,
however, be bumped only when the full cache of free pages is
evicted.
Sponsored by: EMC / Isilon storage division
Reviewed by: alc
Tested by: pho
Make the backends' data_submit method support not only read and write
requests, but also two new ones: verify and compare. Verify just checks the
readability of the data at the specified location without transferring it
outside. Compare reads the specified data and compares it to the received
data, returning an error if they differ.
VERIFY(10/12/16) commands request either verify or compare from the backend,
depending on the BYTCHK CDB field. The COMPARE AND WRITE command is executed
in two stages: first it requests compare, and then, if that succeeded,
requests write. Atomicity of the operation is guaranteed by the CTL request
ordering code.
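
The two-stage COMPARE AND WRITE flow can be sketched on a plain memory
buffer. The names and status codes below are hypothetical and this is
not the CTL backend code. The sketch is single-threaded, so it does not
model the request ordering that provides atomicity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum cw_status { CW_OK = 0, CW_MISCOMPARE = 1 };

/*
 * Stage 1: compare the stored data with the expected data.
 * Stage 2: only if they match, overwrite the stored data.
 */
static enum cw_status
compare_and_write(unsigned char *stored, const unsigned char *expected,
    const unsigned char *replacement, size_t len)
{
	if (memcmp(stored, expected, len) != 0)
		return (CW_MISCOMPARE);
	memcpy(stored, replacement, len);
	return (CW_OK);
}
```

On a miscompare the stored data is left untouched, which is the
property an initiator relies on.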
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
This is needed because syscons depends on ISA.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
x86/isa/isa.c:
- Allow the ISA bus to attach to xenpv.
Switch the initialization of gnttab to use an unused physical memory
range for both PVHVM and PVH.
In the past PVHVM was using the xenpci BAR, but there's no reason to
do that, and in fact FreeBSD was probably doing it because it was the
way it was done in Windows, where drivers probably cannot request
unused physical memory ranges, but it was never enforced in the
hypervisor.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
xen/gnttab.c:
- Allocate contiguous physical memory for grant table frames for both
PVHVM and PVH.
- Since gnttab is not a device, use the xenpv device in order to
request for this allocation.
dev/xen/xenpci/xenpcivar.h:
dev/xen/xenpci/xenpci.c:
- Remove the now unused xenpci_alloc_space and xenpci_alloc_space_int
functions.
xen/gnttab.h:
- Change the prototype of gnttab_init and gnttab_resume, which now
take a device_t parameter.
dev/xen/control/control.c:
x86/xen/xenpv.c:
- Changes to accommodate the new prototype of gnttab_init and
gnttab_resume.
Currently the grant table is initialized from xenstore, but a better
place to do this would be xenpv, so move grant table initialization
there.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
x86/xen/xenpv.c:
- Add gnttab initialization.
xen/xenstore/xenstore.c:
- Remove gnttab initialization.
For PVH guests the xenstore parameters are fetched from the start_info
struct, just like on PV.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
xen/xenstore/xenstore.c:
- Fetch xenstore event channel port from start_info.
Add the PV shutdown hook to PVH.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
dev/xen/control/control.c:
- Make xen_pv_shutdown_final available on XENHVM builds.
- Register the Xen PV shutdown hook for PVH guests.
Introduce a Xen specific nexus that is going to be used by Xen PV/PVH
guests.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
x86/xen/xen_nexus.c:
- Introduce a Nexus to use on Xen PV(H) guests, this prevents PV(H)
guests from using the legacy Nexus.
conf/files.amd64:
conf/files.i386:
- Add the xen nexus to the build.
Since there's no ACPI on PVH guests, we need to create a dummy CPU
device in order to fill the pcpu->pc_device field.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
dev/xen/pvcpu/pvcpu.c:
- Create a dummy CPU device for PVH guests in order to fill the
per-cpu pc_device field.
conf/files:
- Add the pvcpu device to kernels using XEN or XENHVM options.
Create a dummy bus so top level Xen devices can attach to it (instead
of attaching directly to the nexus). This allows all the Xen
related devices to be grouped under a single bus.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
x86/xen/xenpv.c:
- Attach the xenpv bus when running as a Xen guest.
- Attach the ISA bus if needed, in order to attach syscons.
conf/files.amd6:
conf/files.i386:
- Include the xenpv.c file in the build of i386/amd64 kernels using
XENHVM.
dev/xen/console/console.c:
dev/xen/timer/timer.c:
xen/xenstore/xenstore.c:
- Attach to the xenpv bus instead of the Nexus.
dev/xen/xenpci/xenpci.c:
- Xen specific devices on PVHVM guests are no longer attached to the
xenpci device; they are instead attached to the xenpv bus. Remove
the now unused methods.
Create the necessary hooks in order to provide a Xen PV APIC
implementation that can be used on PVH. Most of the lapic ops
shouldn't be called on Xen, since we trap those operations at a higher
layer.
Sponsored by: Citrix Systems R&D
Approved by: gibbs
x86/xen/hvm.c:
x86/xen/xen_apic.c:
- Move IPI related code to xen_apic.c
x86/xen/xen_apic.c:
- Introduce Xen PV APIC implementation, most of the functions of the
lapic interface should never be called when running as PV(H) guest,
so make sure FreeBSD panics when trying to use one of those.
- Define the Xen APIC implementation in xen_apic_ops.
xen/xen_pv.h:
- Extern declaration of the xen_apic struct.
x86/xen/pv.c:
- Use xen_apic_ops as apic_ops when running as PVH guest.
conf/files.amd64:
conf/files.i386:
- Include the xen_apic.c file in the build of i386/amd64 kernels
using XENHVM.
This is needed for Xen PV(H) guests, since there's no hardware lapic
available on this kind of domains. This commit should not change
functionality.
Sponsored by: Citrix Systems R&D
Reviewed by: jhb
Approved by: gibbs
amd64/include/cpu.h:
amd64/amd64/mp_machdep.c:
i386/include/cpu.h:
i386/i386/mp_machdep.c:
- Remove lapic_ipi_vectored hook from cpu_ops, since it's now
implemented in the lapic hooks.
amd64/amd64/mp_machdep.c:
i386/i386/mp_machdep.c:
- Use lapic_ipi_vectored directly, since it's now an inline function
that will call the appropriate hook.
x86/x86/local_apic.c:
- Prefix bare metal public lapic functions with native_ and mark them
as static.
- Define default implementation of apic_ops.
x86/include/apicvar.h:
- Declare the apic_ops structure and create inline functions to
access the hooks, so the change is transparent to existing users of
the lapic_ functions.
x86/xen/hvm.c:
- Switch to use the new apic_ops.
The header structure consists of two 1-byte elements, but it must always
be describable by a single SG entry. Note for consistency, specify the
alignment everywhere, even if the structure has the appropriate natural
alignment since it contains a uint16_t.
Obtained from: DragonFlyBSD
MFC after: 1 week
These defines are applicable to userland too, but virtqueue.h contains
the kernel virtqueue interface, and is therefore not usable in userland.
Note that Linux places these defines in virtio_ring.h, but I don't want
the drivers to include that header file, so as to keep the VirtIO ring
opaque to everything but the virtqueue.
MFC after: 1 week
The eventual goal is to share this file with userland, so
remove the macro that is only specific for virtio_pci(4).
Instead, add the VIRTIO_PCI_CONFIG_OFF macro from Linux to
get the config size whether MSIX is enabled or not.
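
For illustration, the macro has roughly the following shape. The values
20 and 24 are my understanding of the legacy virtio PCI header layout as
used by Linux and should be treated as an assumption here; the helper
function is hypothetical.

```c
/*
 * Offset of the device-specific configuration area. With MSI-X enabled
 * the common header carries two extra 16-bit vector registers, so the
 * device configuration starts 4 bytes later (assumed values).
 */
#define VIRTIO_PCI_CONFIG_OFF(msix_enabled)	((msix_enabled) ? 24 : 20)

/* Hypothetical helper: absolute offset of a device config field. */
static unsigned int
device_config_offset(int msix_enabled, unsigned int field_offset)
{
	return (VIRTIO_PCI_CONFIG_OFF(msix_enabled) + field_offset);
}
```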
MFC after: 1 week
would be read once and cached in a local variable so that the resource limit
check and map entry insertion would be guaranteed to use the same value.
However, the value being passed to vm_map_insert() is still from "sgrowsiz"
and not the local variable. Correct this oversight.
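
The read-once pattern can be sketched as follows. The variable and
function names are hypothetical and stand in for the tunable and for the
map insertion; this is not the vm_map code.

```c
#include <stdint.h>

/* Hypothetical tunable that another thread may change at any time. */
static volatile uint64_t grow_size = 128 * 1024;

/* Pretend insertion: records the size it was asked to map. */
static uint64_t last_inserted;

static void
insert_entry(uint64_t size)
{
	last_inserted = size;
}

/*
 * Read the tunable once. Both the limit check and the insertion use the
 * same snapshot, so a concurrent update cannot slip in between them.
 */
static int
grow(uint64_t limit)
{
	uint64_t size = grow_size;	/* single read */

	if (size > limit)
		return (-1);
	insert_entry(size);		/* the snapshot, not grow_size */
	return (0);
}
```

Passing grow_size again to insert_entry() would reintroduce the window
that the local copy is meant to close.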
Reviewed by: kib
sysarch(2) code.
Use M_ZERO instead of explicit bzero(9). Do not check for failed
allocation when M_WAITOK is specified (which is always the case).
Use malloc(9) when allocating memory for the intermediate copy of the
user-supplied buffer.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
VM due to copyin(9) faulting while VFS locks are held is
deadlock-prone there in the same way as for the write(2) syscall.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
portsnap extract, where previously it would panic. Clearly someone
who knows pmap should optimize this code per alc's comment...
Submitted by: alc
MFC after: probably
avoid congestion on global mountlist_mtx mutex in vfs_busyfs(), while
traversing through the list of mount points.
This change significantly improves NFS server scalability, since it had
to do this translation for every request, and the global lock becomes quite
congested.
This code is optimized for a relatively small number of mount points.
On systems with hundreds of active mount points this simple cache may have
many collisions. But the original traversal code should also behave much
worse in that case, so we are not losing much.
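
A minimal sketch of such a small cache, to show where the collisions
come from. The structure, the table size and the hash are hypothetical,
and locking and object lifetime are ignored entirely.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE	16	/* hypothetical; small on purpose */

struct mount_stub {		/* stand-in for struct mount */
	uint32_t fsid;
};

static struct mount_stub *cache[CACHE_SIZE];

static size_t
slot_for(uint32_t fsid)
{
	return (fsid % CACHE_SIZE);
}

/* Remember a mount point, evicting whatever shared its slot. */
static void
cache_store(struct mount_stub *mp)
{
	cache[slot_for(mp->fsid)] = mp;
}

/* Fast path: NULL means the caller must fall back to the list walk. */
static struct mount_stub *
cache_lookup(uint32_t fsid)
{
	struct mount_stub *mp = cache[slot_for(fsid)];

	if (mp != NULL && mp->fsid == fsid)
		return (mp);
	return (NULL);
}
```

With many active mount points, two IDs often map to the same slot and
keep evicting each other, which is the collision cost mentioned above.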
Reviewed by: attilio
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
since it will almost certainly fail. Take the next bigger zone instead.
This situation should not happen with the original bucket zone configuration:
the "32 Bucket" zone uses "64 Bucket" and vice versa. But if the "64 Bucket"
zone lock is congested, the zone may grow its bucket size and start biting
itself.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
NULL to determine if bus_dmamap_unload() or bus_dmamem_free() should be
called. Instead, check the associated bus and virtual addresses.
- Don't clear static DMA maps to NULL.
Reviewed by: jfv
freeing them instead of after.
- Check the bus address of a static DMA buffer to decide if the associated
map should be unloaded.
- Don't try to destroy bus dma maps for static DMA buffers.
Reviewed by: davidcs
This is loosely based on Xorg changeset f57bc0e by Christian
Zander.
Submitted by: Wolf Ramovsky <wolf.ramovsky gmail.com>
via core (peter)
MFC after: 2 weeks
- Don't call xpt_free_path() in os_query_remove_device() and
always return TRUE.
- Update os_buildsgl() to support building a logical SG table, which
will be used by the lower RAID module.
- Return CAM_SEL_TIMEOUT status for SCSI commands that failed because the
target is missing.
Many thanks to HighPoint for providing this driver update.
Submitted by: Steve Chang
Reviewed by: mav
MFC after: 3 days
machines. Specifically, there was a mismatch between how the routine
allocation and deallocation operations accessed the population map
and how the aggressively optimized reservation-breaking operation
accessed it. So, problems only occurred when reservations were broken.
This change makes the routine operations access the population map in
the same way as the reservation breaking operation.
This bug was introduced in r259999.
PR: 187080
Tested by: jmg (on an "armeb" machine)
Sponsored by: EMC / Isilon Storage Division
In particular, don't check the value of the bus_dma map against NULL
to determine if either bus_dmamem_alloc() or bus_dmamap_load() succeeded.
Instead, assume that bus_dmamap_load() succeeded (and thus that
bus_dmamap_unload() should be called) if the bus address for a resource
is non-zero, and assume that bus_dmamem_alloc() succeeded (and thus
that bus_dmamem_free() should be called) if the virtual address for a
resource is not NULL.
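
The teardown rule can be sketched in userland with stubs. The structure
and the stub functions are hypothetical; they only stand in for
bus_dmamap_unload() and bus_dmamem_free() and do not reproduce their
real signatures.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-buffer state a driver might keep. */
struct dma_buf {
	void		*vaddr;	/* non-NULL once the allocation succeeded */
	uint64_t	 paddr;	/* non-zero once the load succeeded */
	void		*map;	/* opaque: never compared against NULL */
};

static int unloads, frees;

/* Stand-ins for bus_dmamap_unload() and bus_dmamem_free(). */
static void stub_unload(struct dma_buf *b) { (void)b; unloads++; }
static void stub_free(struct dma_buf *b) { free(b->vaddr); frees++; }

static void
dma_buf_release(struct dma_buf *b)
{
	if (b->paddr != 0) {		/* load succeeded: unload */
		stub_unload(b);
		b->paddr = 0;
	}
	if (b->vaddr != NULL) {		/* alloc succeeded: free */
		stub_free(b);
		b->vaddr = NULL;
	}
}
```

The map member is deliberately never inspected, so the release path
behaves the same whether the map value happens to be NULL or not.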
In many cases these bugs could result in leaks when a driver was detached.
Reviewed by: yongari
MFC after: 2 weeks
Direct bpf(4) consumers should now work fine with this tunable turned on.
In fact, the only case when optimized_writers can change program
behavior is a direct bpf(4) consumer setting its read filter to
a catch-all one.
MFC after: 2 weeks
Sponsored by: Yandex LLC
This partitioning scheme is used in DragonFlyBSD. It is similar to
BSD disklabel, but has the following improvements:
* metadata has its own dedicated place and isn't accessible through partitions;
* all offsets are 64-bit;
* supports 16 partitions by default (has reserved place for more);
* has reserved place for a backup label (but not yet implemented);
* has UUIDs for partitions and partition types.
No objections from: geom
MFC after: 2 weeks
Relnotes: yes
don't create a map before calling bus_dmamem_alloc() (such maps were
leaked). It is believed that the extra destroy of the map was generally
harmless since bus_dmamem_alloc() often uses special maps for which
bus_dmamap_destroy() is a no-op (e.g. on x86).
Reviewed by: scottl
shutdown by putting the former under !rebooting and turning the latter into
debug messages.
Reviewed by: hps
MFC after: 1 week
Sponsored by: Bally Wulff Games & Entertainment GmbH
injected into the guest. This allows the hypervisor to inject another
ExtINT or APIC vector as soon as the guest is able to process interrupts.
This change is not to address any correctness issue but to guarantee that
any pending APIC vector that was preempted by the ExtINT will be injected
as soon as possible. Prior to this change such pending interrupts could be
delayed until the next VM exit.
names so that encoding names are treated as case-insensitive. This allows
the use of 'utf-8' instead of 'UTF-8' for example and matches the behavior
of iconv(1).
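
A case-insensitive name comparison can be sketched with the POSIX
strcasecmp() function. The wrapper name is hypothetical and this is not
the code from the change itself.

```c
#include <strings.h>

/* Return non-zero when two encoding names differ only in letter case. */
static int
encoding_names_match(const char *a, const char *b)
{
	return (strcasecmp(a, b) == 0);
}
```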
PR: 167977
Submitted by: buganini@gmail.com
MFC after: 1 week
- Use the existing vbus locks instead of Giant for the CAM sim lock.
- Use callout(9) instead of timeout(9).
- Mark the interrupt handler as MPSAFE.
- Don't attempt to pass data in the softc from probe() to attach().
Reviewed by: Steve Chang <ychang@highpoint-tech.com>
Assisted by: delphij
* The way rings are updated changed with the last API bump.
Also sync ->head when moving slots in netmap_sw_to_nic().
* Remove a crashing selrecord() call.
* Unclog the logic surrounding netmap_rxsync_from_host().
* Add timestamping to RX host ring.
* Remove a couple of obsolete comments.
Submitted by: Franco Fichtner
MFC after: 3 days
Sponsored by: Packetwerk
flags, to rwlock. Lock it in read mode when used from subroutines
called from buffer release code paths.
needsbuffer is now updated using atomics, while the read lock of
nblock prevents losing the wakeups from bufspacewakeup() and
bufcountadd() in getnewbuf_bufd_help().
In several interesting loads, needsbuffer flags are never set, while
buffers are reused quickly. This causes brelse() and bqrelse() from
different threads to contend on nblock. Now they take nblock in
read mode, which, together with needsbuffer not needing an update, allows
higher parallelism.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
to !MAP_STACK mapping requests. For MAP_STACK | MAP_FIXED, clear any
mappings which could previously exist in the used range.
For this, teach vm_map_find() and vm_map_fixed() to handle
MAP_STACK_GROWS_DOWN or _UP cow flags, by calling a new
vm_map_stack_locked() helper, which is factored out from
vm_map_stack().
The side effect of the change is that MAP_STACK started obeying
MAP_ALIGNMENT and MAP_32BIT flags.
Reported by: rwatson
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Apparently for VMware Fusion (and presumably VMware Workstation/Player
since the PR states TSO is broken there too, but I cannot test), the
TCP header pseudo checksum calculated should only include the protocol
(IPPROTO_TCP) value, not also the lengths as the stack does instead.
VMware ESXi seems to ignore whatever value is in the TCP header checksum,
and it is a bit surprising there is a different behavior between the
VMware products. And it is unfortunate that on ESXi we are forced to do
this extra bit of work.
PR: kern/185849
MFC after: 3 days
on USB HUBs by moving the code into the USB explore threads. The
deadlock happens because child devices of the USB HUB don't have the
expected reference count when called from outside the explore
thread. Only the HUB device itself, which the IOCTL interface locks,
gets the correct reference count.
MFC after: 3 days
This is currently an opt-in build flag. Once ASLR support is ready and stable
it should be changed to opt-out and be enabled by default along with ASLR.
Each application Makefile uses opt-out to ensure that ASLR will be enabled by
default in new directories when the system is compiled with PIE/ASLR. [2]
Mark known build failures as NO_PIE for now.
The only known runtime failure was rtld.
[1] http://www.bsdcan.org/2014/schedule/events/452.en.html
Submitted by: Shawn Webb <lattera@gmail.com>
Discussed between: des@ and Shawn Webb [2]
This mostly avoids lock usage in getnewvnode_[drop_]reserve(), which
reduces the number of global vnode_free_list_mtx mutex acquisitions
from 4 to 2 per NFS request on ZFS, improving SMP scalability.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
The old design with a unified thread pool was good from the point of view
of thread utilization. But the single pool-wide mutex became a huge
congestion point for systems with many CPUs. To reduce the congestion, create
several thread groups within a pool (one group for every 6 CPUs and 12
threads), each group with its own mutex. Each connection is assigned to one
of the groups in round-robin fashion during its registration. File affinity
code may still move requests between the groups, but otherwise groups
are self-contained.
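
The group sizing and the round-robin assignment can be sketched as
below. Reading "one group for every 6 CPUs and 12 threads" as the
smaller of the two quotients, with a minimum of one group, is an
assumption of this sketch, and all names are hypothetical.

```c
/* Assumed rule: one group per 6 CPUs and per 12 threads, at least one. */
static int
group_count(int ncpus, int maxthreads)
{
	int by_cpu = ncpus / 6;
	int by_thr = maxthreads / 12;
	int n = by_cpu < by_thr ? by_cpu : by_thr;

	return (n < 1 ? 1 : n);
}

struct pool_stub {
	int ngroups;
	int next;	/* next group to hand out */
};

/* Hand out groups to new connections in round-robin order. */
static int
assign_group(struct pool_stub *p)
{
	int g = p->next;

	p->next = (p->next + 1) % p->ngroups;
	return (g);
}
```

In the real pool each group would carry its own mutex, so connections
assigned to different groups would not contend on a single lock.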
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
using a direct hook called from kern_vfs_bio_buffer_alloc().
Mark ffs_rawread.c as requiring both ffs and directio options to be
compiled into the kernel. Add ffs_rawread.c to the list of ufs.ko
module's sources.
In addition to fixing the layering violation, it also
allows linking the kernel when FFS is configured as a module and DIRECTIO is
enabled.
One consequence of the change is that ffs_rawread.o is always linked
into the module regardless of the DIRECTIO option. This is similar to
the option QUOTA and ufs_quota.c.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This allows slightly simplifying the svc_run_internal() code: if we processed
all the requests in a queue, then we know that a new one will not appear.
MFC after: 2 weeks
a partially populated reservation becomes fully populated, and decrease this
field when a fully populated reservation becomes partially populated.
Use this field to simplify the implementation of pmap_enter_object() on
amd64, arm, and i386.
On all architectures where we support superpages, the cost of creating a
superpage mapping is roughly the same as creating a base page mapping. For
example, both kinds of mappings entail the creation of a single PTE and PV
entry. With this in mind, use the page size field to make the
implementation of vm_map_pmap_enter(..., MAP_PREFAULT_PARTIAL) a little
smarter. Previously, if MAP_PREFAULT_PARTIAL was specified to
vm_map_pmap_enter(), that function would only map base pages. Now, it will
create up to 96 base page or superpage mappings.
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
- Revert r265427. It appears we are halting the DWC OTG host
controller schedule if we process events only at every SOF. When doing
split transactions we rely on events being processed quickly, and
waiting too long might cause data loss.
- We are not always able to meet the timing requirements of interrupt
endpoint split transactions. Switch from INTERRUPT to CONTROL endpoint
type for interrupt endpoint events until further notice, since CONTROL
endpoint events are more relaxed, reducing the chance of data
loss. See comment in code for more in-depth explanation.
- Simplify TT scheduling.
MFC after: 3 days
created to a symlink. This restriction (which was
inherited from OpenBSD) is not required by the NFS RFCs.
Since this is allowed by the old NFS server, it is a
POLA violation to not allow it. This patch modifies the
new NFS server to allow this.
Reported by: jhb
Reviewed by: jhb
MFC after: 3 days
- stdio.h is needed for fprintf()
- make memsize uint32_t to avoid errors due to overflow
- honor the *XPOLL flag in NIOCREGIF requests
- mmap fails with MAP_FAILED, not NULL.
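
The mmap(2) failure check from the last item can be sketched as follows.
The wrapper name is hypothetical; mmap() and MAP_FAILED are the standard
POSIX interface.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes of fd read-only; return NULL on failure. */
static void *
map_file(int fd, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

	/* mmap() reports failure with MAP_FAILED, never with NULL. */
	if (p == MAP_FAILED)
		return (NULL);
	return (p);
}
```

A caller that tested the raw mmap() result against NULL would treat a
failed mapping as valid and fault on first access.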
MFC after: 3 days
In the .o file, this only changes some line numbers (head amd64) because
element 0 is no longer explicitly initialized.
This should make bugs like FreeBSD-SA-14:12.ktrace less likely.
Discussed with: des
MFC after: 1 week
SoC's registers base address may differ between boards
(0xf1000000 or 0xd0000000). Therefore, in order to use
the proper CPU Boot Address Redirect register during SMP
initialization in mptramp the real, physical address has
to be passed to mptramp based on the value from DT.
Reviewed by: gber
During Armada's platform_mp_start_ap(), mptramp code
is being copied to the specific physical location (0xffff0000).
Before r265694 the address to which the code should be copied
was equal to the address of mpentry routine that followed the
mptramp in locore.S. Now the mptramp end address should be
exported and used as a copy limit.
Reviewed by: gber
- Remove double buffering interrupt and isochronous traffic via the
transaction translator. It can be avoided because the DWC OTG will
always delay the start split transactions for interrupt and
isochronous traffic, but will not delay the complete split
transactions, if we set the odd frame bit correctly.
- Need to check the transfer cache field in the device done function
to be sure all allocated channels are freed, not only the first
one of the transfer. This seems to resolve the control endpoint
transfer type quirk, which is now removed.
- Make sure any received data upon TX is dumped else RX path will
stop.
- Transmit isochronous data before receiving isochronous data as a
means to optimise the TT schedule.
- Implement a simple TT bandwidth scheduler.
- Cleanup use of old "td->error" variable.
- On interrupt IN traffic via the transaction translator we simply
ignore missed transfer opportunities and silently retry the
transaction upon next available time slot.
MFC after: 3 days
trims to the device assumes the list is sorted. Don't apply the
optimization of not sorting the queue when we have SSDs to the
delete_queue, since it causes more discard traffic to the drive. While
one could argue that the higher levels should coalesce the trims,
that's not done today, so some optimization at this level is needed.
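
Why collapsing depends on a sorted list can be shown with a small
sketch. The range structure and the function names are hypothetical and
this is not the CAM code. It sorts first and then merges neighbours in
one pass.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct range {
	uint64_t start;
	uint64_t len;
};

static int
range_cmp(const void *a, const void *b)
{
	const struct range *ra = a, *rb = b;

	return ((ra->start > rb->start) - (ra->start < rb->start));
}

/*
 * Sort the ranges by start, then merge neighbours that touch or overlap.
 * Returns the number of ranges left at the front of r[].
 */
static size_t
coalesce(struct range *r, size_t n)
{
	size_t i, out = 0;

	if (n == 0)
		return (0);
	qsort(r, n, sizeof(r[0]), range_cmp);
	for (i = 1; i < n; i++) {
		uint64_t end = r[out].start + r[out].len;

		if (r[i].start <= end) {
			uint64_t iend = r[i].start + r[i].len;

			if (iend > end)
				r[out].len = iend - r[out].start;
		} else
			r[++out] = r[i];
	}
	return (out + 1);
}
```

Without the sort, the single merge pass only sees queue neighbours, so
adjacent block ranges that were queued out of order stay separate and
each becomes its own discard request.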
CR: https://phabric.freebsd.org/D142
"Terminus BSD Console" is a derivative of Terminus that is provided
by Mr. Dimitar Zhekov under the 2-clause BSD license for use by the
FreeBSD vt(4) console.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
- Properly align temporary buffer to 32-bit.
- Add extra parentheses to make the expression clear.
- Range check the association ID received from hardware.
MFC after: 1 week
A similar fix should be applied to vmxnet, ixgbe, igb, i40e.
(some of them previously reported by Michael Tuexen)
Drivers using if_transmit are correct, and so are most of the
other drivers that reassign if_transmit.
Among other things, this bug causes panics when using netmap emulation
on top of generic drivers.
Approved by: bryanv
MFC after: 3 days
Core i7 and Westmere processors, the uncore PMC subsystem is
completely different from the uncore PMC on smaller versions of CPUs.
Disable existing uncore hwpmc code for EX, otherwise non-existing MSRs
are accessed.
The core PMCs seem to be identical for non-EX and EX, according to
the SDM.
Reviewed by: davide, fabient
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
- The R92S_TCR register is an 8-bit register. Don't access it like a
16-bit register.
- Disable parsing the delete station event, due to many false events.
- Ensure that there is only one transfer queue for each endpoint, so
that packets transmitted don't get out of order.
MFC after: 1 week
If you had a UFS2 FS that didn't have its superblock at SBLOCK_UFS2,
you'd end up corrupting your FS as the superblock is updated and written
to a different location...
makefs used to put the superblock at SBLOCK_UFS1 for UFS 2 FS's causing
this issue...
Reviewed by: silence from mckusick
MFC after: 1 week
selection. gethrtime() in our port is updated at HZ rate, so it is unusable
for this specific purpose, completely negating the benefit of multiple
taskqueues.
MFC after: 2 weeks
(7-bit device address << 1), always leaving the room for the read/write bit.
This commit converts ti_i2c and reverts r259127 on bcm2835_bsc to make them
compatible with 8-bit addresses. Prior to this commit an i2c device
would have different addresses depending on the controller it was attached
to (for example, when compared to any iicbb(4) based i2c controller), which
was a pretty annoying behavior.
Also, update the PMIC i2c address on beaglebone* DTS files to match the new
address scheme.
Now the userland utilities need to do the correct slave address shifting
(but it is going to work with any i2c controller on the system).
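
The shift that userland now has to perform can be sketched as follows.
The function name is hypothetical and the addresses in the usage note
are arbitrary examples.

```c
#include <stdint.h>

/* 8-bit form: 7-bit address in bits 7..1, bit 0 left for read/write. */
static uint8_t
i2c_addr_8bit(uint8_t addr7)
{
	return ((uint8_t)((addr7 & 0x7f) << 1));
}
```

For example, a device documented at 7-bit address 0x24 is addressed as
0x48 in this scheme.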
Discussed with: ian
MFC after: 2 weeks
The ti_i2c controller only works in the master mode and the i2c address
passed on iicbus_reset() is used to set the controller slave address when
operating as an i2c slave (which isn't currently supported).
When talking to a slave, the slave address is correctly provided to
ti_i2c_transfer().
o Always init locks and cv ASAP.
o Initialize driver-independent parts even if driver probing fails.
o Allow calling vt_upgrade at any time, for later loaded drivers.
o New window flag VWF_READY, to track whether a window is already initialized.
Other updates:
o Pass vd as a cookie for kbd_allocate.
o Do not blank window on driver replacement.
Tested by: hselasky (RPi), emaste(VGA, EFIFB, KMS), me
MFC after: 7 days
Sponsored by: The FreeBSD Foundation
Make it really work for native FreeBSD programs. Before this it was broken
for years due to different number of pointer dereferences in Linux and
FreeBSD IOCTL paths, permanently returning errors to FreeBSD programs.
This change breaks the driver FreeBSD IOCTL ABI, making it more strict,
but since it was not working anyway, who cares.
Add shims for 32-bit programs on 64-bit host, translating the argument
of the SG_IO IOCTL for both FreeBSD and Linux ABIs.
With this change I was able to run 32-bit Linux sg3_utils tools and simple
32 and 64-bit FreeBSD test tools on both 32 and 64-bit FreeBSD systems.
MFC after: 1 month
interface allows the ifnet structure to be defined as an opaque
type in NIC drivers. This then allows the ifnet structure to be
changed without a need to change or recompile NIC drivers.
Put differently, NIC drivers can be written and compiled once and
be used with different network stack implementations, provided of
course that those network stack implementations have an API and
ABI compatible interface.
This commit introduces the 'if_t' type to replace 'struct ifnet *'
as the type of a network interface. The 'if_t' type is defined as
'void *' to enable the compiler to perform type conversion to
'struct ifnet *' and vice versa where needed and without warnings.
The functions that implement the API are the only functions that
need to have an explicit cast.
The MII code has been converted to use the driver API to avoid
unnecessary code churn. Code churn comes from having to work with
both converted and unconverted drivers in correlation with having
callback functions that take an interface. By converting the MII
code first, the callback functions can be defined so that the
compiler will perform the typecasts automatically.
As soon as all drivers have been converted, the if_t type can be
redefined as needed and the API functions can be fixed to not need
an explicit cast.
The immediate beneficiaries of this change are:
1. Juniper Networks - The network stack implementation in Junos
is entirely different from FreeBSD's one and this change
allows Juniper to build "stock" NIC drivers that can be used
in combination with both the FreeBSD and Junos stacks.
2. FreeBSD - This change opens the door towards changing ifnet
and implementing new features and optimizations in the network
stack without it requiring a change in the many NIC drivers
FreeBSD has.
Submitted by: Anuranjan Shukla <anshukla@juniper.net>
Reviewed by: glebius@
Obtained from: Juniper Networks, Inc.
through a voltage divider (R163 and R164 on page 4 of BBB schematic).
Add a note about this on ti_adc(4) man page. The ti_adc(4) man page will
first appear on 10.1-RELEASE.
MFC after: 1 week
Suggested by: Sulev-Madis Silber (ketas)
Manual page reviewed by: brueffer (D127)
Reorganize the previous contents of the file as it is in Linux. The
eventual goal is to install the header files and share them between
the kernel and bhyve.
MFC after: 1 week
This _was_ right, a last minute suggestion and not enough testing makes
Adrian a bad boy.
Tested:
* igb(4) with RSS patches, by hand verifying each igb(4) taskqueue
tid from procstat -ka using cpuset -g -t <tid>.
If the user specifies in /boot/loader.conf:
loader_brand="mycustom-brand"
Then "mycustom-brand" will be executed instead of "fbsd-logo".
Submitted by: alfred
Obtained from: FreeNAS
and the actual PWM frequency.
Enforce the maximum value for the period sysctl.
The frequency sysctl now allows the direct setting of the PWM frequency (it
will try to find the best clkdiv and period for a given frequency, i.e.
the ones that will give the best PWM resolution).
This allows the use of lower frequencies on the PWM. Without changing the
clock prescaler the minimum PWM frequency was 1.52kHz.
PWM frequencies checked with an oscilloscope.
PWM output tested with some R/C servos at 50Hz.
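The clkdiv/period search described above can be sketched as follows. This is
only a model under stated assumptions: a 100 MHz input clock, a 16-bit period
counter and power-of-two prescalers are all assumed here (they give a minimum
of about 1.5 kHz without prescaling, in line with the figure quoted above),
and the function name is hypothetical.

```python
PWM_CLOCK_HZ = 100_000_000                 # assumed input clock
MAX_PERIOD = 0xFFFF                        # assumed 16-bit period counter
CLKDIVS = (1, 2, 4, 8, 16, 32, 64, 128)    # assumed prescaler choices


def pick_clkdiv_and_period(freq_hz: float):
    """Return (clkdiv, period) using the smallest prescaler that fits.

    The smallest usable prescaler yields the largest period count and
    therefore the finest duty-cycle resolution for the given frequency.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    for clkdiv in CLKDIVS:
        period = round(PWM_CLOCK_HZ / (clkdiv * freq_hz))
        if 1 <= period <= MAX_PERIOD:
            return clkdiv, period
    raise ValueError("frequency out of range for the assumed prescalers")


if __name__ == "__main__":
    # 50 Hz, as used for the R/C servo test.
    print(pick_clkdiv_and_period(50))
```

Under these assumptions a 50 Hz request needs a prescaler, because the
unscaled period count would not fit in 16 bits.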
it implicitly in vmm.ko.
Add ioctl VM_GET_CPUS to get the current set of 'active' and 'suspended' cpus
and display them via /usr/sbin/bhyvectl using the "--get-active-cpus" and
"--get-suspended-cpus" options.
This is in preparation for being able to reset virtual machine state without
having to destroy and recreate it.
flag has been added instead of FUTEX_WAIT to replace the FUTEX_WAIT
logic which needs to do gettimeofday() calls before the futex syscall
to convert the absolute timeout to a relative timeout.
Before this, CLOCK_MONOTONIC was used by the FUTEX_WAIT_BITSET op.
When the FUTEX_CLOCK_REALTIME is specified the timeout is an absolute
time, not a relative time. Rework futex_wait to handle this.
On the side fix the futex leak in error case and remove useless
parentheses.
Properly calculate the timeout for the CLOCK_MONOTONIC case.
MFC after: 3 days
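The difference between the two timeout conventions can be sketched as below.
This is a minimal model, not the futex_wait code: the function name and the
use of floating-point seconds are assumptions made for illustration.

```python
def remaining_wait(timeout: float, now: float, absolute: bool) -> float:
    """Return how long to sleep, in seconds, never negative.

    With absolute=True the timeout is a deadline on the selected clock,
    so the current time on that clock must be subtracted. With
    absolute=False the timeout is already a relative interval.
    """
    if absolute:
        return max(0.0, timeout - now)
    return max(0.0, timeout)


if __name__ == "__main__":
    import time
    now = time.monotonic()
    # A deadline two seconds in the future yields a wait of about 2 s.
    print(remaining_wait(now + 2.0, now, absolute=True))
```

A deadline that has already passed yields a zero wait rather than a
negative one.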
At attach, print the SCL and SDA pin numbers.
Remove a stray blank line.
Remove the GPIOBUS locking from gpioiic_reset(), it is already called with
this lock held. This fixes a crash when you try to scan the iicbus with
i2c(8).
get_scatter_segment() in get_fl_payload() fails. While here,
fix the code to adjust fl_bufs_used when a failure occurs for
any other scatter segment.
MFC after: 3 days
- Make the USB boot library more configurable.
- Resolve compile issues when cross building.
- Allow use of separate malloc.
- Allow use of separate endian macros.
Sponsored by: DARPA, AFRL
o Allow setting keymap in FDT, use hardcoded one by default
o Represent fallback keymap as a list rather than directly usable M*N array
Submitted by: Maxim Ignatenko <gelraen.ua@gmail.com>
current RADXA config. Radxa Rock (RR) boards have few types such as
RR (full version), RR Lite and some variants of RR engineering samples.
Add kernel config and FDT file for RR Lite board.
Approved by: stas (mentor)
If a vt(4) font does not exactly fit the screen dimensions, the console
window is offset so that it is centered. A rectangle is drawn at the
top, left, right, and bottom of the screen, to erase any leftovers that
are outside of the new usable console area.
If the x offset or y offset is 0 then the left border or top border
respectively is not drawn. The right and bottom borders may be one
pixel larger than necessary due to rounding, and are always drawn.
Prior to this change a 0 offset would result in a panic when calling
vt_drawrect with an x or y coordinate of -1.
Sponsored by: The FreeBSD Foundation
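The guard on the top and left borders can be modelled roughly as follows.
This sketch covers only those two borders; the function name, the inclusive
(x1, y1, x2, y2) rectangle convention and the way the offsets are derived
from the font size are assumptions, not the vt(4) code.

```python
def top_and_left_borders(screen_w, screen_h, font_w, font_h):
    """Return inclusive (x1, y1, x2, y2) rectangles for top and left borders.

    A border is emitted only when its offset is non-zero, so no rectangle
    ever ends at coordinate -1.
    """
    # Leftover pixels are split evenly so the text area is centered.
    off_x = (screen_w % font_w) // 2
    off_y = (screen_h % font_h) // 2
    rects = []
    if off_y > 0:
        rects.append((0, 0, screen_w - 1, off_y - 1))   # top border
    if off_x > 0:
        rects.append((0, 0, off_x - 1, screen_h - 1))   # left border
    return rects


if __name__ == "__main__":
    # 800x600 with an 8x16 font: only a top border is needed.
    print(top_and_left_borders(800, 600, 8, 16))
```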
vt_grow may be called with a new size that's larger than previous but
does not require reallocation - for example, when the number of columns
is the same and new number of rows is less than the history size.
Prior to this change we would fail to update vb_scr_size, and then hit
a KASSERT when trying to write to the newly visible rows.
Sponsored by: The FreeBSD Foundation
ifa_ifwithnet() and ifa_ifwithdstaddr(). The legacy functions will call the
_fib() versions with RT_ALL_FIBS, preserving legacy behavior.
sys/net/if_var.h
sys/net/if.c
Add legacy-compatible functions as described above. Ensure legacy
behavior when RT_ALL_FIBS is passed as fibnum.
sys/netinet/in_pcb.c
sys/netinet/ip_output.c
sys/netinet/ip_options.c
sys/net/route.c
sys/net/rtsock.c
sys/netinet6/nd6.c
Call with _fib() functions if we must use a specific fib, or the
legacy functions otherwise.
tests/sys/netinet/fibs_test.sh
tests/sys/netinet/udp_dontroute.c
Improve the udp_dontroute test. The bug that this test exercises is
that ifa_ifwithnet() will return the wrong address, if multiple
interfaces have addresses on the same subnet but with different
fibs. The previous version of the test only considered one possible
failure mode: that ifa_ifwithnet_fib() might fail to find any
suitable address at all. The new version also checks whether
ifa_ifwithnet_fib() finds the correct address by checking where the
ARP request goes.
Reported by: bz, hrs
Reviewed by: hrs
MFC after: 1 week
X-MFC-with: 264905
Sponsored by: Spectra Logic
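The relationship between the fib-aware lookup and the legacy wrapper can be
modelled in a few lines of Python. This is a sketch only: the interface
table, the function names lookup_net_fib() and lookup_net(), and the
sentinel value chosen for RT_ALL_FIBS are assumptions for illustration and
do not reproduce the kernel code.

```python
import ipaddress

RT_ALL_FIBS = -1  # sentinel meaning "match any fib" (value assumed here)

# Hypothetical interface table: (name, address/prefix, fib).
IFADDRS = [
    ("tap0", ipaddress.ip_interface("192.0.2.1/24"), 0),
    ("tap1", ipaddress.ip_interface("192.0.2.2/24"), 1),
]


def lookup_net_fib(addr, fibnum, table=IFADDRS):
    """Return the first interface on addr's subnet that is in fibnum."""
    target = ipaddress.ip_address(addr)
    for name, ifaddr, fib in table:
        if fibnum != RT_ALL_FIBS and fib != fibnum:
            continue  # wrong fib: skip even if the subnet matches
        if target in ifaddr.network:
            return name
    return None


def lookup_net(addr, table=IFADDRS):
    """Legacy-style wrapper: ignore fibs entirely."""
    return lookup_net_fib(addr, RT_ALL_FIBS, table)


if __name__ == "__main__":
    print(lookup_net("192.0.2.9"), lookup_net_fib("192.0.2.9", 1))
```

With two interfaces on the same subnet in different fibs, the legacy-style
lookup returns the first match, while the fib-aware lookup returns the
interface that belongs to the requested fib.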
(XScale mainly) expects the memory located before the kernel to be mapped,
and use it to allocate the page tables, the various stacks, etc.
A better fix would probably be to rewrite the various bla_machdep.c to stop
using that RAM, but I'm not so inclined to do it, especially since I don't
have hardware for all of them.
correctly prepare KGSBASE msr to restore the user descriptor base on
the last swapgs during return to usermode.
Reported and tested by: peterj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
mode.
Put the htonl(), htons(), ntohl() and ntohs() declarations under
__POSIX_VISIBLE >= 200112. POSIX.1-2001 and newer require these to be
exposed from <netinet/in.h> (as well as <arpa/inet.h>).
Note that it may be unnecessary to check __POSIX_VISIBLE >= 200112 because
older versions of POSIX and the C standard do not define this header.
However, other places in the same file already perform the check.
PR: 188316
Submitted by: Christian Neukirchen
- Update FDT file for BERI DE4 boards.
- Add needed kernel configuration keywords.
- Rename module to saf1761otg so that the device unit number does not
interfere with the hardware ID in dmesg.
Sponsored by: DARPA, AFRL
- Use an interrupt filter for handling the data path interrupts. This
increases the throughput significantly.
- Implement support for USB suspend and resume in USB host mode.
Sponsored by: DARPA, AFRL
they then pick up an opt_global.h from KERNBUILDDIR having PAE defined.
Thus, build all modules by default except those which still really are
defective as of r266799.
- Minor style cleanup.
MFC after: 1 week
flags that has several bits cleared. The RF_WANTED and RF_FIRSTSHARE
bits are invalid in this context, and we want to defer setting RF_ACTIVE
in r_flags until later. This should make rman_get_flags() return
the correct answer in all cases.
Add a KASSERT() to catch callers which incorrectly pass the RF_WANTED
or RF_FIRSTSHARE flags.
Do a strict equality check on the share type bits of flags. In
particular, do an equality check on RF_PREFETCHABLE. The previous
code would allow one type of mismatch of RF_PREFETCHABLE but disallow
the other type of mismatch. Also, ignore the RF_ALIGNMENT_MASK
bits since alignment validity should be handled by the amask check.
This field contains an integer value, but previous code did a strange
bitwise comparison on it.
Leave the original value of flags unmolested as a minor debug aid.
Change the start+amask overflow check to a KASSERT() since it is just
meant to catch a highly unlikely programming error in the caller.
Reviewed by: jhb
MFC after: 1 month
- Make the USB hardware skip PTDs which are not allocated.
- Peek host memory twice. Sometimes the PTD status is incorrectly
returned as zero.
- Ensure the host channel is always freed when software TD
is completing.
- Add correct configuration of interrupt polarity and type.
- Set CERR to 2 for asynchronous traffic to avoid having to
reactivate the PTD when a NAK token is received.
- Fix detection of STALL PID.
Sponsored by: DARPA, AFRL
For IPv6-in-IPv4, you may need to do the following command
on the tunnel interface if it is configured as IPv4 only:
ifconfig <interface> inet6 -ifdisabled
Code logic inspired from NetBSD.
PR: kern/169438
Submitted by: emeric.poupon@netasq.com
Reviewed by: fabient, ae
Obtained from: NETASQ
physical addresses.
- Nuke the unused softc of emujoy(4).
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
MFC after: 3 days
Sponsored by: Bally Wulff Games & Entertainment GmbH
- Based on actual usage and on what Linux does, dummy_page.addr should
contain the physical bus address of the dummy page rather than its
virtual one. As a side-effect, correcting this bug fixes compilation
with PAE support enabled by getting rid of an inappropriate cast.
- Also based on actual usage of dummy_page.addr, theoretically Radeon
devices could do a maximum of 44-bit DMA. In reality, though, it is
more likely that they only support 32-bit DMA, at least that is what
radeon_gart_table_ram_alloc() sets up for, too. However, passing ~0
to drm_pci_alloc() as maxaddr parameter translates to 64-bit DMA on
amd64/64-bit machines. Thus, use BUS_SPACE_MAXSIZE_32BIT instead,
which the existing 32-bit DMA limits within the drm2 code spelled as
0xFFFFFFFF should also be changed to.
Reviewed by: dumbbell
MFC after: 1 week
Sponsored by: Bally Wulff Games & Entertainment GmbH
Some Linux futex ops atomically verify that the futex address uaddr
(uval) contains the value val. Comparing signed uval and unsigned val
may lead to an unexpected result, mostly to a deadlock.
So copyin uaddr to an unsigned int to compare the parameters correctly.
While here change ktr records to print parameters in more readable format.
Tested by: eadler@
MFC after: 3 days
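The effect of reading the futex word with the wrong signedness can be
modelled as below. Python integers do not wrap, so this sketch only models
the case where the signed value is widened before the comparison; the
function name and the sample value are assumptions for illustration.

```python
import struct


def futex_value_matches(raw: bytes, val: int, signed: bool) -> bool:
    """Compare a 4-byte little-endian futex word against val.

    signed=True models reading the word as a signed int, signed=False
    models reading it as an unsigned int.
    """
    fmt = "<i" if signed else "<I"
    (uval,) = struct.unpack(fmt, raw)
    return uval == val


if __name__ == "__main__":
    val = 0x80000001                 # high bit set
    raw = struct.pack("<I", val)     # the word as stored in user memory
    print(futex_value_matches(raw, val, signed=True))
    print(futex_value_matches(raw, val, signed=False))
```

A value with the high bit set compares unequal when read as signed, so a
waiter would never see the expected value; reading it as unsigned makes the
comparison behave as intended.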
situation checked by assert is verified to not take place in
vm_map_wire(), and protection permissions on the wired entry can be
revoked afterward.
Reported by: markj
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
was not allocating space for the parameter save area in the stack frame.
If the compiler chose to save the argument to the signal handler on the
stack, it would overwrite the first 32 bits of the sigaction struct with
it, corrupting it for a subsequent invocation.
PR: powerpc/183040
MFC after: 8 days
BUS_DMA_KMEM_ALLOC. They serve the same purpose, but using the flag
means that the map can be NULL again, which in turn enables significant
optimizations for the common case of no bouncing.
Obtained from: Netflix, Inc.
MFC after: 3 days
- Switch from timeout() to callout_*() for per-request timers.
- Use device_find_child() in the identify routine.
- Use device_printf() instead of passing device_get_nameunit() to
printf().
- Expand the SBP_LOCK coverage simplifying the locking.
- Uninline STAILQ_FOREACH_SAFE().
Tested by: sbruno
Add a new zfs property, "redundant_metadata" which can have values "all" or
"most". The default will be "all", which is the current behavior. When set
to all, ZFS stores an extra copy of all metadata. If a single on-disk block
is corrupt, at worst a single block of user data (which is recordsize bytes
long) can be lost.
Setting to "most" will cause us to only store 1 copy of level-1 indirect
blocks of user data files. This can improve performance of random writes,
because less metadata has to be written. In practice, at worst about
100 blocks (of recordsize bytes each) of user data can be lost if a single
on-disk block is corrupt.
The exact behavior of which metadata blocks are stored redundantly may change
in future releases.
Illumos issue: 3835 zfs need not store 2 copies of all metadata
MFC after: 2 weeks
guest for which the rules regarding xsetbv emulation are known. In
particular future extensions like AVX-512 have interdependencies among
feature bits that could allow a guest to trigger a #GP in the host with
the current approach of allowing anything the host supports.
- Add proper checking of Intel MPX and AVX-512 XSAVE features in the
xsetbv emulation and allow these features to be exposed to the guest if
they are enabled in the host.
- Expose a subset of known-safe features from leaf 0 of the structured
extended features to guests if they are supported on the host including
RDFSBASE/RDGSBASE, BMI1/2, AVX2, AVX-512, HLE, ERMS, and RTM. Aside
from AVX-512, these features are all new instructions available for use
in ring 3 with no additional hypervisor changes needed.
Reviewed by: neel
Netmap gets its own hardware-assisted virtual interface and won't take
over or disrupt the "normal" interface in any way. You can use both
simultaneously.
For kernels with DEV_NETMAP, cxgbe(4) carves out an ncxl<N> interface
(note the 'n' prefix) in the hardware to accompany each cxl<N>
interface. These two ifnet's per port share the same wire but really
are separate interfaces in the hardware and software. Each gets its own
L2 MAC addresses (unicast and multicast), MTU, checksum caps, etc. You
should run netmap on the 'n' interfaces only, that's what they are for.
With this, pkt-gen is able to transmit > 45Mpps out of a single 40G port
of a T580 card. 2 port tx is at ~56Mpps total (28M + 28M) as of now.
Single port receive is at 33Mpps but this is very much a work in
progress. I expect it to be closer to 40Mpps once done. In any case
the current effort can already saturate multiple 10G ports of a T5 card
at the smallest legal packet size. T4 gear is totally untested.
trantor:~# ./pkt-gen -i ncxl0 -f tx -D 00:07:43:ab:cd:ef
881.952141 main [1621] interface is ncxl0
881.952250 extract_ip_range [275] range is 10.0.0.1:0 to 10.0.0.1:0
881.952253 extract_ip_range [275] range is 10.1.0.1:0 to 10.1.0.1:0
881.962540 main [1804] mapped 334980KB at 0x801dff000
Sending on netmap:ncxl0: 4 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> 00:07:43:ab:cd:ef)
881.962562 main [1882] Sending 512 packets every 0.000000000 s
881.962563 main [1884] Wait 2 secs for phy reset
884.088516 main [1886] Ready...
884.088535 nm_open [457] overriding ifname ncxl0 ringid 0x0 flags 0x1
884.088607 sender_body [996] start
884.093246 sender_body [1064] drop copy
885.090435 main_thread [1418] 45206353 pps (45289533 pkts in 1001840 usec)
886.091600 main_thread [1418] 45322792 pps (45375593 pkts in 1001165 usec)
887.092435 main_thread [1418] 45313992 pps (45351784 pkts in 1000834 usec)
888.094434 main_thread [1418] 45315765 pps (45406397 pkts in 1002000 usec)
889.095434 main_thread [1418] 45333218 pps (45378551 pkts in 1001000 usec)
890.097434 main_thread [1418] 45315247 pps (45405877 pkts in 1002000 usec)
891.099434 main_thread [1418] 45326515 pps (45417168 pkts in 1002000 usec)
892.101434 main_thread [1418] 45333039 pps (45423705 pkts in 1002000 usec)
893.103434 main_thread [1418] 45324105 pps (45414708 pkts in 1001999 usec)
894.105434 main_thread [1418] 45318042 pps (45408723 pkts in 1002001 usec)
895.106434 main_thread [1418] 45332430 pps (45377762 pkts in 1001000 usec)
896.107434 main_thread [1418] 45338072 pps (45383410 pkts in 1001000 usec)
...
Relnotes: Yes
Sponsored by: Chelsio Communications.
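To put the packet rates above in context, the following sketch computes the
theoretical Ethernet packet rate at the minimum frame size and recomputes a
rate from a packet count and interval. The function names are hypothetical,
and using integer division for the measured rate is an assumption; it
matches the first sample in the log above.

```python
ETH_MIN_FRAME = 64   # bytes, minimum Ethernet frame including FCS
ETH_OVERHEAD = 20    # bytes on the wire: 8 preamble/SFD + 12 inter-frame gap


def line_rate_pps(link_bps: int, frame_bytes: int = ETH_MIN_FRAME) -> float:
    """Theoretical packets per second for a link at a given frame size."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD) * 8)


def measured_pps(pkts: int, usec: int) -> int:
    """Packets per second from a packet count over an interval in usec."""
    return pkts * 1_000_000 // usec


if __name__ == "__main__":
    per_10g = line_rate_pps(10_000_000_000)
    rate = measured_pps(45289533, 1001840)
    # How many 10G ports the measured rate could fill at minimum size.
    print(int(per_10g), rate, int(rate // per_10g))
```

At the minimum frame size a 10G port carries about 14.88 Mpps, so a
transmit rate of roughly 45 Mpps corresponds to about three such ports.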
uart2: <Intel AMT - PM965/GM965 KT Controller> port 0x1830-0x1837
mem 0xfe024000-0xfe024fff irq 17 at device 3.3 on pci0
uart2: console (115200,n,8,1)
Tested as tty and serial console. Seems "fine"
- Put "_LE_" into the register access macros to indicate little endian
byte order is expected by the hardware.
- Avoid using the bounce buffer when not strictly needed. Try to move
data directly using bus-space functions first.
- Ensure we preserve the reserved bits in the power down mode
register. Else the hardware goes into a non-recoverable state.
- Always use 32-bit access when writing or reading registers or FIFOs,
because the hardware is 32-bit oriented and doesn't really understand 8-
and 16-bit access.
- Correct writes to the memory address register. There is no need to
shift the register offset.
- Correct interval for interrupt endpoints.
- Optimise 90ns internal memory buffer read delay.
- Rename PDT into PTD, which is how the datasheet writes it.
- Add missing programming for activating host controller PTDs.
Sponsored by: DARPA, AFRL
mappings. Instead, they should be first mapping to an RSS bucket and
then querying the RSS bucket -> CPU ID mapping to figure out the target
CPU.
When (if?) RSS rebalancing is implemented or some other (non round-robin)
distribution of work from buckets to CPU IDs, various bits of code - both
userland and kernel - will need to know how this mapping works.
So, to support this:
* Add a new function rss_m2bucket() - this maps an mbuf to a given bucket.
Anything which is currently doing hash -> CPU work may instead wish to
do hash -> bucket, and then query the bucket->cpuid map for which
CPU it belongs on. Or, map it to a bucket, then re-pin that bucket ->
CPU during a rebalance operation.
* For userland applications which wish to exploit affinity to RSS buckets,
the bucket -> CPU ID mapping is now available via a sysctl.
net.inet.rss.bucket_mapping lists the bucket to CPU ID mapping via
a list of bucket:cpu pairs.
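A userland consumer of that mapping might look roughly like the sketch
below. The exact sysctl output format is not shown in this message, so
whitespace-separated bucket:cpu pairs are an assumption, as are the function
names and the use of a power-of-two mask to go from hash to bucket.

```python
def parse_bucket_mapping(text: str) -> dict:
    """Parse whitespace-separated 'bucket:cpu' pairs into {bucket: cpu}."""
    mapping = {}
    for pair in text.split():
        bucket, cpu = pair.split(":")
        mapping[int(bucket)] = int(cpu)
    return mapping


def cpu_for_hash(flow_hash: int, mapping: dict):
    """Map hash -> bucket -> CPU; assumes a power-of-two bucket count."""
    bucket = flow_hash & (len(mapping) - 1)
    return bucket, mapping[bucket]


if __name__ == "__main__":
    # Hypothetical output of the net.inet.rss.bucket_mapping sysctl.
    mapping = parse_bucket_mapping("0:0 1:1 2:2 3:3")
    print(cpu_for_hash(0x1236, mapping))
    # A rebalance only changes the bucket -> CPU step, not hash -> bucket.
    mapping[2] = 0
    print(cpu_for_hash(0x1236, mapping))
```

Keeping the two steps separate is what allows a bucket to be re-pinned to
another CPU without changing which bucket a flow hashes to.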
Use armv7_setttb that sets proper PT attributes.
Get rid of unused CPU functions, put nullop instead.
Replace obsolete pj4b_/arm11_ functions with the appropriate armv7_ ones.
API function 'vie_calculate_gla()'.
While the current implementation is simplistic it forms the basis of doing
segmentation checks if the guest is in 32-bit protected mode.
of the guest linear address space. These APIs in turn use a new ioctl
'VM_GLA2GPA' to convert the guest linear address to guest physical.
Use the new copyin/copyout APIs when emulating ins/outs instruction in
bhyve(8).
taskqueue worker thread(s) to.
For now it isn't a taskqueue/taskthread error to fail to pin
to the given cpuid.
Thanks to rpaulo@, kib@ and jhb@ for feedback.
Tested:
* igb(4), with local RSS patches to pin taskqueues.
TODO:
* ask the doc team for help in documenting the new API call.
* add a taskqueue_start_threads_cpuset() method which takes
a cpuset_t - but this may require a bunch of surgery to
bring cpuset_t into scope.
'struct vm_guest_paging'.
Check for canonical addressing in vmm_gla2gpa() and inject a protection
fault into the guest if a violation is detected.
If the page table walk is restarted in vmm_gla2gpa() then reset 'ptpphys' to
point to the root of the page tables.
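The canonical-address test mentioned above can be illustrated with a short
sketch. It assumes 48-bit linear addresses (4-level paging), where bits
63..47 must all be equal; the function name is hypothetical and this is not
the vmm_gla2gpa() code.

```python
def is_canonical(gla: int) -> bool:
    """True if bits 63..47 of a 64-bit address are all zeros or all ones."""
    top = (gla & 0xFFFFFFFFFFFFFFFF) >> 47   # the 17 uppermost bits
    return top == 0 or top == 0x1FFFF


if __name__ == "__main__":
    for gla in (0x00007FFFFFFFFFFF, 0x0000800000000000, 0xFFFF800000000000):
        print(hex(gla), is_canonical(gla))
```

An address that fails this check is the case in which a fault would be
injected into the guest instead of walking the page tables.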
indicate the faulting linear address.
If the guest PML4 entry has the PG_PS bit set then inject a page fault into
the guest with the PGEX_RSV bit set in the error_code.
Get rid of redundant checks for the PG_RW violations when walking the page
tables.
memory ordering model allows writes to different devices to complete out
of order, leading to a situation where the write that clears an interrupt
source at a device can complete after a write that unmasks and EOIs the
interrupt at the interrupt controller, leading to a spurious re-interrupt.
This adds a generic barrier function specific to the needs of interrupt
controllers, and calls that function from the GIC and TI AINTC controllers.
There may still be other soc-specific controllers that need to make the call.
Reviewed by: cognet, Svatopluk Kraus <onwahe@gmail.com>
MFC after: 3 days
Idle priority is not even time-share, so if the system is busy in any way,
those events may never be executed. Since in some cases the system waits
for events processed by that thread, that may cause deadlocks.
to be consistent with mutex destruction in ipf_log_soft_destroy(). As a
result mutex destruction in ipf_log_soft_fini() is redundant.
Approved by: glebius (mentor)
Obtained from: darrenr (author)
the kmem object lock is held. Do the pmap_remove() before acquiring the
kmem object lock.
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
The CUSE library is a wrapper for the devfs kernel functionality which
is exposed through /dev/cuse . In order to function the CUSE kernel
code must either be enabled in the kernel configuration file or loaded
separately as a module. Currently none of the committed items are
connected to the default builds, except for installing the needed
header files. The CUSE code will be connected to the default world and
kernel builds in a follow-up commit.
The CUSE module was written by Hans Petter Selasky, somewhat inspired
by similar functionality found in FUSE. The CUSE library can be used
for many purposes. Currently CUSE is used when running Linux kernel
drivers in user-space, which need to create a character device node to
communicate with its applications. CUSE has full support for almost
all devfs functionality found in the kernel:
- kevents
- read
- write
- ioctl
- poll
- open
- close
- mmap
- private per file handle data
Requested by several people. Also see "multimedia/cuse4bsd-kmod" in
ports.
the UART FIFO.
The emulation is constrained in a number of ways: 64-bit only, doesn't check
for all exception conditions, limited to i/o ports emulated in userspace.
Some of these constraints will be relaxed in followup commits.
Requested by: grehan
Reviewed by: tychon (partially and a much earlier version)
the proper ICWx initialization sequence. It assumes, probably correctly, that
the boot firmware has done the 8259 initialization.
Since grub-bhyve does not initialize the 8259 this write to the mask register
takes a code path in which 'error' remains uninitialized (ready=0,icw_num=0).
Fix this by initializing 'error' at the start of the function.