This paves way to nuke the hv_device, which is actually an unncessary
indirection.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7028
This paves way to nuke the hv_device, which is actually an unncessary
indirection.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7027
This makes life easier during the transition period to nuke the hv_device.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7026
This prepares to remove the unnecessary offer message embedding in
hv_vmbus_channel.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7020
This prepares to remove the unnecessary offer message embedding in
hv_vmbus_channel.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7019
This prepares to remove the unnecessary offer message embedding in
hv_vmbus_channel.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7015
This prepares to remove the unnecessary offer message embedding in
hv_vmbus_channel.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7014
cause a crash.
Because dummynet calls pie_cleanup() while holding a mutex, pie_cleanup()
is not able to use callout_drain() to make sure that all callouts are
finished before it returns, and callout_stop() is not sufficient to make
that guarantee. After pie_cleanup() returns, dummynet will free a
structure that any remaining callouts will want to access.
Fix these problems by allocating a separate structure to contain the
data used by the callouts. In pie_cleanup(), call callout_reset_sbt()
to replace the normal callout with a cleanup callout that does the cleanup
work for each sub-queue. The instance of the cleanup callout that
destroys the last flow will also free the extra allocated block of memory.
Protect the reference count manipulation in the cleanup callout with
DN_BH_WLOCK() to be consistent with all of the other usage of the reference
count where this lock is held by the dummynet code.
Submitted by: Rasool Al-Saadi <ralsaadi@swin.edu.au>
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7174
If the hypervisor version is smaller than 4.6.0. Xen commits 74fd00 and
70a3cb are required on the hypervisor side for this to be fixed, and those
are only included in 4.6.0, so stay on the safe side and disable MSI-X
interrupt migration on anything older than 4.6.0.
It should not cause major performance degradation unless a lot of MSI-X
interrupts are allocated.
Sponsored by: Citrix Systems R&D
MFC after: 3 days
Reviewed by: jhb
Differential revision: https://reviews.freebsd.org/D7148
For multi-channel devices, once the primary channel is closed,
a set of 'rescind' messages for sub-channels will be delivered
by Hypervisor. Sub-channel MUST be freed according to these
'rescind' messages; directly re-openning sub-channels in the
same fashion as the primary channel's re-opening does NOT work
at all.
After the primary channel is re-opened, requested # of sub-
channels will be delivered though 'channel offer' messages, and
this set of newly offered channels can be opened along side with
the primary channel.
This unbreaks the MTU setting for hn(4), which requires re-
openning all existsing channels upon MTU change.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6978
Instead of global variable, vmbus version is accessed through
a vmbus DEVMETHOD now.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6953
Pin the channel to cpu0 by default. Drivers having special channel-cpu
mapping requirement should call vmbus_channel_cpu_{set,rr}() themselves.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6918
Despite the implication (process has pending signals -> the current
thread marked for AST and has TDF_NEEDSIGCHK set) is not true due to
other thread might manipulate its signal blocking mask, it should still
hold for the single-threaded processes. Enable check for the condition
for single-threaded case, and replicate it from userret() to ast() as
well, where we check that ast indeed has no signal to deliver.
Note that the check is under DIAGNOSTIC, it is not enabled for INVARIANTS
but !DIAGNOSTIC since it imposes too heavy-weight locking for day-to-day
used debugging kernel.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
AST must not execute with TDF_SBDRY or TDF_SEINTR/TDF_SERESTART thread
flags set, which is asserted in userret(). As the consequence, -1 return
from cursig() must not be possible.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This also fixes memory leakge if sub-connect messages are needed.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6878
The current command response handling discards status and xfer
length unconditionally, so that all of the commands would be
considered successful, even if errors happened. When errors
really happens, this causes all kinds of wiredness, since the
buffer will not be filled on the host side and sense data will
be ignored.
Most of the time, errors do not happen, however, error does
happen for the request sent immediately after the disk resizing.
Discarding the SCSI status (SCSI_STATUS_CHECK_COND) and sense
data (capacity changes) prevents the disk resizing from working
properly.
This commit saves the response status and xfer length properly
for later use.
Submitted by: Dexuan Cui <decui microsoft com>
Noticed by: sephe
MFC after: 3 days
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D7181
NVRAM, ChipCommon, etc).
This extends the existing handling of NVRAM core discovery to support
locating additional devices that may be attached either directly as real
cores, or indirectly via ChipCommon (e.g. bhnd_pmu).
When attached as a SoC root bus (as opposed to a bridged WiFi device),
the platform devices may not be attached until later bus passes,
necessitating delayed discovery/initialization.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D6962
- Cast 32-bit register values to uintmax_t for use with %jx.
- Add special-case return address handling for MipsKernGenException to
avoid early termination of stack walking in the exception handler
stack frame.
Submitted by: Michael Zhilin <mizhka@gmail.com>
Reviewed by: ray
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D6907
on arm64 and all SoCs using the old FIFO register location are 32-bit only,
so unconditionally use the new location for arm64.
Reviewed by: andrew, manu
By definition (enum __drm_capabilities), cases other than CAP_SYS_ADMIN
aren't possible. Add in a KASSERT safety belt and return false in
!INVARIANTS case if an invalid value is passed in, as it would be a
programmer error.
This fixes a -Wreturn-type error with gcc 5.3.0.
Differential Revision: https://reviews.freebsd.org/D7188
MFC after: 1 week
Reported by: devel/amd64-gcc (5.3.0)
Reviewed by: dumbbell
Sponsored by: EMC / Isilon Storage Division
If tf_trapno contains garbage which appears to be equal to T_NMI,
e.g. due to thread previously entered kernel due to NMI, doreti
sequence skips ast, and does so until a trap or hardware interrupt
occur.
The visible effects of the issue are quite confusing. First, signals
delivery is postponed in observable ways. In particular, the
guarantee that unblocked async signals queue is flushed before a
return from syscall, is broken. Second, if there are pending signals,
all interruptible sleeps of the stuck thread are aborted immediately.
Since modern CPUs are relatively fast and tickless kernel generates
low interrupt rate, the faulty condition might exist for long time (in
an application time scale).
In collaboration with: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
if vnode is VMIO. For VMIO vnodes, set BO_DEAD in vm_object_terminate().
The vnode_destroy_object(), when calling into vm_object_terminate(),
must be able to flush buffers. BO_DEAD purpose is to quickly destroy
buffers on write when the underlying vnode is not operable any more
(one example is the devfs node after geom is gone). Setting BO_DEAD
for reclaiming vnode before object is terminated is premature, and
results in unability to flush buffers with live SU dependencies from
vinvalbuf() in vm_object_terminate().
Reported by: David Cross <dcrosstech@gmail.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
So that we don't need to access the global vmbus softc.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6863
The device probe/attach has been move to a different thread, so the
reasons to create the channel asynchronously are no longer valid.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6862
Similar to r300836, r301800, and r302550, cl and ct will always
be non-NULL as they're allocated using the mem_alloc routines,
which always use `malloc(..., M_WAITOK)`.
MFC after: 1 week
Reported by: Coverity
CID: 1007342
Sponsored by: EMC / Isilon Storage Division
Similar to r300836 and r301800, cl and cu will always be non-NULL as they're
allocated using the mem_alloc routines, which always use
`malloc(..., M_WAITOK)`.
Deobfuscating the cleanup path fixes a leak where if cl was NULL and
cu was not, cu would not be free'd, and also removes a duplicate test for
cl not being NULL.
MFC after: 1 week
Reported by: Coverity
CID: 1007033, 1007344
Sponsored by: EMC / Isilon Storage Division
While I'm here, remove the useless message type from message process
array, which is not used and serves no purposes at all.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6858
And use this new APIs for Initial Contact post message Hypercall.
More post message Hypercalls will be converted.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6830
I don't know what errata is mentioned there, I was unable to find it, but
setting limit before the base simply does not work at all. According to
specification attempt to set limit out of the present window range resets
it to zero, effectively disabling it. And that is what I see in practice.
Fixing this properly disables access for remote side to our memory until
respective xlat is negotiated and set. As I see, Linux does the same.
At that point link is quite likely not established yet, so messing with
scratch registers is premature there. Original commit message mentioned
code diff reduction from Linux, but this line is not present in Linux now.
In some cases, the driver must handle given properties located in
specific OF subnode. Instead of creating duplicate set of function, add
'node' as argument to existing functions, defaulting it to device OF node.
MFC after: 3 weeks
those system calls audit event identifiers AUE_READ and AUE_WRITE.
While auditing file-descriptor I/O is not required by the Common
Criteria, in practice this proves useful for both live and forensic
analysis.
NB: freebsd32 already assigns AUE_READ and AUE_WRITE to read(2) and
write(2).
MFC after: 3 days
Sponsored by: DARPA, AFRL
audit trail. This was not required for Common Criteria auditing
(which requires only that the intent to read or write be audited
at the time of open(2)), but is useful for contemporary live
analysis and forensics.
MFC after: 3 days
Sponsored by: DARPA, AFRL
operates on a specific OF node instead of the pass in device's OF node.
Reviewed by: andrew, mmel
Differential Revision: https://reviews.freebsd.org/D6957
about the desirability of auditing the number, as it was in fact in the
wrong place (in the common path for open(2) and openat(2), and only the
latter accepts a file-descriptor argument). Where other ABIs support
openat(2), it may be necessary to do additional argument auditing as it is
not performed in kern_openat(9).
MFC after: 3 days
Sponsored by: DARPA, AFRL
FreeBSD support NX bit on X86_64 processors out of the box, for i386 emulation
use READ_IMPLIES_EXEC flag, introduced in r302515.
While here move common part of mmap() and mprotect() code to the files in compat/linux
to reduce code dupcliation between Linuxulator's.
Reported by: Johannes Jost Meixner, Shawn Webb
MFC after: 1 week
XMFC with: r302515, r302516
In Linux if this flag is set, PROT_READ implies PROT_EXEC for mmap().
Linux/i386 set this flag automatically if the binary requires executable stack.
READ_IMPLIES_EXEC flag will be used in the next Linux mmap() commit.
read(2), write(2), dup(2), and mmap(2). This auditing is not
required by the Common Criteria (and hence was not being
performed), but is valuable in both contemporary live analysis
and forensic use cases.
MFC after: 3 days
Sponsored by: DARPA, AFRL
The bus_region_* APIs accept the number of data items to be read, while
the code was passing the total number of bytes, resulting in an overflow
of the SPROM parser's buffer.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D7168
For some reason hack with sending MSI-X interrupts by writing to remote
LAPIC memory works only for 32-bit BARs, that are available only if split
BARs mode is enabled in BIOS. If it is not, complain loudly and fall back
to less efficient workaround.
Predicates are DIF objects whose return value is compared with zero to
determine whether the corresponding probe body is to be executed. The return
value itself is the contents of a 64-bit DIF register, but it was being
truncated to an int before the comparison. This meant that a predicate such
as /0x100000000/ would evaluate to false.
Reported by: rwatson
MFC after: 3 days
All armv6 processors are plenty fast enough for HZ=1000.
No changes are made for older arm systems, because some chips are a bit
wimpy for 1000 while others do fine, so it has to be set on a per-config
basis.
For compatibility reasons make driver not report any checksum offload by
default, since there is indeed none. But if administrator knows that
interface is used only for local traffic, he can enable fake checksum
offload manually on both sides to save some CPU cycles, since the data
are already protected by CRC32 of PCIe link.
Sponsored by: iXsystems, Inc.
pf returns PF_PASS, PF_DROP, ... in the netpfil hooks, but the hook callers
expect to get E<foo> error codes.
Map the returns values. A pass is 0 (everything is OK), anything else means
pf ate the packet, so return EACCES, which tells the stack not to emit an ICMP
error message.
PR: 207598
This allows at least first three doorbells to work very close to normal
hardware, properly signaling events to upper layers without spurious or
lost events. Doorbells above the first three may still report spurious
events due to lack of reliable information, but they are rarely used.
It is odd idea to serialize different MSI-X vectors. Use of rmlocks
here allows them to execute in parallel, but still protects ctx.
If upper layers require any additional serialization -- they can
do it by themselves.
This follows NTB subsystem modularization in Linux, tuning it to FreeBSD
native NewBus interfaces. This change allows to support different types
of hardware with different drivers, support multiple NTB instances in a
system, ntb_transport module use for needs other then if_ntb, etc.
Sponsored by: iXsystems, Inc.
Since SBARxSZ register can be write-once, it can be unusable for disabling
the SBAR. For such case also set SBARxBASE to zero to not intersect with
config BAR.
* add support to read the timer and capability
* add support to enable/disable the location timer.
On AR9380 at least, enabling the location timer is required to make
the timer tick, otherwise location packets return a timestamp of 0.
However, it then makes /all/ RX packets use the RX location timestamp
instead of the TSF timestamp.
So, unless I find another magical way to do location timestamping,
we will have to dynamically switch things on/off and ensure the
TX/RX path handles the "different" timestamps correctly.
Tested:
* AR9380, STA mode
* LOC_INFO is mostly just "did this packet come with a locationing
timestamp instead of TSF";
* Decode not-sounding, uploaded-data, data-valid, data type and
number of extension spatial streams.
* If fast_ts is set then the TX timestamp is the fast timestamp, not
normal TSF.
* If the TX descriptor has the position bit set then request locationing
and clear sounding-disable. This way we (a) get the response with
the TX timestamp from the location side of things, and (b) we get
a CSI dump of the response ACK, which we will eventually use in the
locationing path.
* the code already stored the length of the RX desc, which I never used.
So, use that and retire the new flag I introduced a while ago.
* Introduce a TX timestamp length field and capability.
It turns out that this value is not used within the system call code
under normal conditions, except when using tracing tools like ktrace.
If we forget to set this value, it is set to random garbage. This may
cause ktrace to hang indefinitely, making it impossible to kill.
Reported by: Michael Plass
PR: 210800
MFC before: 11.0-RELEASE
* extend the TX timestamp to 32 bits, as the AR5416 and later does a full
32 bit TX timestamp instead of 15 or 16 bits.
* add RX descriptor fields for PHY uploaded information (coming soon)
* add flags for RX/TX fast timestamp, hardware upload, etc
* add a flag for TX to request ToD/ToA location information.