times [1]. Assert that the pmap passed to pmap_remove_pages() is only
active on current CPU.
Submitted by: alc [1]
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Thus we can use IF_AFDATA_RLOCK() instead of IF_AFDATA_LOCK() when doing
lla_lookup() without LLE_CREATE flag.
Reviewed by: glebius, adrian
MFC after: 1 week
Sponsored by: Yandex LLC
When we encounter an I/O error on a piece of metadata while deleting
a file system or zvol, we don't update the bptree_entry_phys_t's
bookmark. This would lead to double free of bp's which will lead to
space map corruption.
Instead of tolerating and allowing the corruption, panic immediately.
See Illumos #4390 for more details.
4391 panic system rather than corrupting pool if we hit bug 4390
Illumos/illumos-gate@8b36997aa2
MFC after: 2 weeks
The operation was documented and implemented partially (both from a
type and architecture perspective) on 2013-08-21 and got used in
ZFS with revision 260150 (zfeature.c) and since ZFS is supported on
ia64, the lack of having atomic_swap became problem.
- Use simple ++ for rare events.
- Use uma_zone_get_cur() to get knowledge about space left in cache.
- Convert many fields of struct ng_netflow_info to 64 bit.
Tested by: Viktor Velichkin <avisom yandex.ru>
Sponsored by: Nginx, Inc.
hides the setjmp/longjmp semantics of VM enter/exit. vmx_enter_guest() is used
to enter guest context and vmx_exit_guest() is used to transition back into
host context.
Fix a longstanding race where a vcpu interrupt notification might be ignored
if it happens after vmx_inject_interrupts() but before host interrupts are
disabled in vmx_resume/vmx_launch. We now called vmx_inject_interrupts() with
host interrupts disabled to prevent this.
Suggested by: grehan@
Fix race condition in DELAY function: sc->tc was not initialized yet when
time_counter pointer was set, what resulted in NULL pointer dereference.
Export sysfreq to dts.
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Using unmapped BIOs causes failure inside bus_dmamap_sync, since
this function requires valid MVA address, which is not present
if mapping is not set up.
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
Add suport for setting triggering level and polarity in GIC.
New function pointer was added to nexus which corresponds
to the function which sets level/sense in the hardware (GIC).
Submitted by: Wojciech Macek <wma@semihalf.com>
Obtained from: Semihalf
to lla_lookup().
This drastically reduces the very high lock contention when doing parallel
TCP throughput tests (> 1024 sockets) with IPv6.
Tested:
* parallel IPv6 TCP bulk data exchange, 8192 sockets
MFC after: 1 week
Sponsored by: Netflix, Inc.
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47
NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.
MFC after: 2 weeks
(Note: this change is not applicable to FreeBSD and the file
is not included in build. It's integrated for completeness).
4128 disks in zpools never go away when pulled
illumos/illumos-gate@39cddb10a3
MFC after: 2 weeks
3306 zdb should be able to issue reads in parallel
3321 'zpool reopen' command should be documented in the man page
and help message
illumos/illumos-gate@31d7e8fa33
FreeBSD porting notes: the kernel part of this changeset depends
on Solaris buf(9S) interfaces and are not really applicable for
our use. vdev_disk.c is patched as-is to reduce diverge from
upstream, but vdev_file.c is left intact.
MFC after: 2 weeks
no longer any need for the page's PG_CACHED and PG_FREE flags to be set and
cleared while the free page queues lock is held. Thus, vm_page_alloc(),
vm_page_alloc_contig(), and vm_page_alloc_freelist() can wait until after
the free page queues lock is released to clear the page's flags. Moreover,
the PG_FREE flag can be retired. Now that the reservation system no longer
uses it, its only uses are in a few assertions. Eliminating these
assertions is no real loss. Other assertions catch the same types of
misbehavior, like doubly freeing a page (see r260032) or dirtying a free
page (free pages are invalid and only valid pages can be dirtied).
Eliminate an unneeded variable from vm_page_alloc_contig().
Sponsored by: EMC / Isilon Storage Division
Atmel boards I have.
# All Samsung, Toshiba and SanDisk parts will need to be in this table
# since they don't conform to the ONFI specification (they are all Toggle
# parts). There's some standards for the additional bytes so there's some hope
# to decode them automatically on a per-vendor basis, but even that has
# problems (and is what motivated the ONFI parameter page).
header is split out into its own BD for processing by the firmware. When
this split occurred the data length in the BD was not being set correctly
resulting in packet corruption.
Approved by: davidcd (mentor)
modules which require this flag to compile. Use a GCC_MS_EXTENSIONS
variable, defined in kern.pre.mk, which can be used to easily supply the
flag (or not), depending on the compiler type.
MFC after: 3 days
directly to the linker (LD_FLAGS) from flags passed indirectly, via the
compiler driver (LDFLAGS).
This is because several Makefiles under sys/boot/i386 and sys/boot/pc98
use ${LD} directly to link, and the normal LDFLAGS value should not be
used in these cases.
MFC after: 3 days
aggregation IDs, as is done in the upstream illumos code. This still
requires some FreeBSD-specific code, as our vmem API is not identical to the
one in illumos.
Submitted by: Mike Ma <mikemandarine@gmail.com>
by receive code waiting for data digest even when the data segment was
empty. It didn't actually read it, but it waited until those four bytes
become available in the socket buffer, i.e. until any other PDU (such as NOP)
came in.
PR: kern/185240
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
bits of the flowid as each other, resulting in a poor distribution of
packets among queues in certain cases. Work around this by adding a
set of sysctls for controlling a bit-shift on the flowid when doing
multi-port aggrigation in lagg and lacp. By default, lagg/lacp will
now use bits 16 and higher instead of 0 and higher.
Reviewed by: max
Obtained from: Netflix
MFC after: 3 days
- Nuke code setting PCI_POWERSTATE_D0; pci(4) already does that for type 0
devices.
- There's no need to keep track of resource IDs.
- Quiesce the interrupt before actually detaching.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
MFC after: 1 week
- Nuke code setting PCI_POWERSTATE_D0; pci(4) already does that for type 0
devices.
- Use PCIR_BAR instead of a homegrown macro.
- There's no need to keep track of resource IDs.
- Quiesce the interrupt before actually detaching.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
MFC after: 1 week
- Nuke code setting PCI_POWERSTATE_D0; pci(4) already does that for type 0
devices.
- Use PCIR_BAR instead of a homegrown macro.
- There's no need to keep track of resource IDs.
- Quiesce the interrupt before actually detaching.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
- Nuke dupe $FreeBSD$.
MFC after: 1 week
hw.ral.msi_disable (defaulting to using MSI).
- Probe with BUS_PROBE_DEFAULT instead of 0.
- Nuke code setting PCI_POWERSTATE_D0; pci(4) already does that for type 0
devices.
- Use PCIR_BAR instead of a homegrown macro.
- There's no need to keep track of resource IDs.
- Release resources again in case attaching fails.
- Quiesce the interrupt before detaching.
- Sprinkle const.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
- Trim headers.
- Nuke dupe $FreeBSD$.
MFC after: 1 week
an interface:
- in in_control() skip over not AF_INET addresses.
- in in_aifaddr_ioctl() and in_difaddr_ioctl() do correct check
of address family, w/o accessing memory beyond struct ifaddr.
Sponsored by: Nginx, Inc.
- #if 0 the currently unused paired port linking and unlinking of dual
adapters.
- Simplify MSI/MSI-X allocation and release. For a single one, we don't need
to fiddle with the MSI/MSI-X count and pci_release_msi(9) is smart enough
to just do nothing in case of INTx.
- Canonicalize actions taken on attach failure and detach.
- Remove the remainder of incomplete support for older FreeBSD versions.
MFC after: 1 week
- Simplify MSI allocation and release. For a single one, we don't need to
fiddle with the MSI count and pci_release_msi(9) is smart enough to just
do nothing in case of INTx.
- Don't allocate MSI as RF_SHAREABLE.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
MFC after: 1 week
- Based on lessons learnt with dc(4) (see r185750), add bus space barriers to
the MII bitbang read and write functions as well as to instances of page
switching.
- Add missing locking to ed_ifmedia_{upd,sts}().
- Canonicalize some messages.
- Based on actual functionality, ED_TC5299J_MII_DIROUT should be rather named
ED_TC5299J_MII_DIRIN.
- Remove unused headers.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
MFC after: 1 week
Actually, text versions of generic commands are not used, since ngctl(8)
uses binary messages for them. And to request a text command one needs
a working ngctl(8). That's why the bug was never discovered. I'm pondering
on removing the text support for generic commands.
Found by: dim with clang 3.4
fiddle with the MSI count and pci_release_msi(9) is smart enough to just
do nothing in case of INTx.
- Don't allocate MSI as RF_SHAREABLE.
MFC after: 1 week
better off living in aac_pci.c, but it doesn't seem worth creating a
aac_pci_detach() and it's also not the first PCI-specific bit in aac.c
MFC after: 3 days
is only to stop gcc complaining about anonymous unions, which clang does
not do. For clang 3.4 however, -fms-extensions enables the Microsoft
__wchar_t type, which clashes with our own types.h.
MFC after: 3 days
change virt_size(), virt_dumphdrs() and virt_dumpdata() into its callback
functions.
In virt_foreach() we iterate over all the virtual memory regions that we
want in the minidump. For now, just start with the PBVM (= kernel text
and data plus preloaded modules). The core file this produces can already
be used to work out the libkvm changes that need to be made to support it.
In parallel, we can flesh out the in-kernel bits to dump more of what we
need in a minidump without changing the core file structure.
This chip doesn't require the temperature sensor offset, either v1 or
v2. Doing so causes the initial calibration test to fail.
Tested:
* Intel Centrino 6150
Change the way that reservations keep track of which pages are in use.
Instead of using the page's PG_CACHED and PG_FREE flags, maintain a bit
vector within the reservation. This approach has a couple benefits.
First, it makes breaking reservations much cheaper because there are
fewer cache misses to identify the unused pages. Second, it is a pre-
requisite for supporting two or more reservation sizes.
The handler is now called after the register value is updated in the virtual
APIC page. This will make it easier to handle APIC-write VM-exits with APIC
register virtualization turned on.
This also implies that we need to keep a snapshot of the last value written
to a LVT register. We can no longer rely on the LVT registers in the APIC
page to be "clean" because the guest can write anything to it before the
hypervisor has had a chance to sanitize it.
registers.
The handler is now called after the register value is updated in the virtual
APIC page. This will make it easier to handle APIC-write VM-exits with APIC
register virtualization turned on.
We can no longer rely on the value of 'icr_timer' on the APIC page
in the callout handler. With APIC register virtualization the value of
'icr_timer' will be updated by the processor in guest-context before an
APIC-write VM-exit.
Clear the 'delivery status' bit in the ICRLO register in the write handler.
With APIC register virtualization the write happens in guest-context and
we cannot prevent a (buggy) guest from setting this bit.
except the chunks aren't physical memory regions but virtual memory
regions. In both cases, the core file is an ELF file and flags in
the header allow libkvm to distinguish one from the other.
Having ncneg diverge with the actual length of the ncneg tailq causes
NULL dereference.
Add assertion that an entry taken from ncneg queue is indeed negative.
Reported by and discussed with: avg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
bio_completed, only manage bio_resid, e.g. sa(4).
Reported and tested by: Manfred Antar <null@pozo.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
region is claimed by a new entry.
Pass MAP_STACK_GROWS_DOWN and MAP_STACK_GROWS_UP flags to
vm_map_insert() from vm_map_stack(), to really turn off coalescing
code and call to vm_map_simplify_entry() [1].
Reported by: avg, peter, many
Tested by: avg, peter
Noted by: avg [1]
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The handler is now called after the register value is updated in the virtual
APIC page. This will make it easier to handle APIC-write VM-exits with APIC
register virtualization turned on.
Additionally, mask all the LVT entries when the vlapic is software-disabled.
This seems to cause issues with jemalloc + {dhclient, sshd}.
Thus, revert this for now until the root cause can be found and
fixed.
This should quieten some runtime problems with the Raspberry Pi.
PR: kern/185046
MFC after: 3 days
This fixes the problem, when gmirror starts again just after stop.
The problem occurs when gmirror's component has geom label with equal size.
E.g. gpt and gptid have the same size as partition, diskid has the same
size as entire disk. When gmirror's geom has been destroyed, glabel
creates its providers and this initiate retaste.
Now "gmirror destroy" command is available. It destroys geom and also
erases gmirror's metadata.
MFC after: 2 weeks
The handlers are now called after the register value is updated in the virtual
APIC page. This will make it easier to handle APIC-write VM-exits with APIC
register virtualization turned on.
Additionally, we need to ensure that the value of these registers is always
correctly reflected in the virtual APIC page, because there is no VM exit
when the guest reads these registers with APIC register virtualization.
that we don't have a good way (yet) to iterate over the mapped pages by
virtual address and simply try each page within the range. Given that we
call pmap_remove() over the entire 2^63 bytes of address space, it takes
a while for pmap_remove to have tried all 2^50 pages.
By using pmap_remove_pages() we use the PV list to find all mappings.
Change derived from a patch by: alc
argument, cast the incoming 0 argument to void *, to silence a warning
from clang 3.4 ("expression which evaluates to zero treated as a null
pointer constant of type 'void *' [-Wnon-literal-null-conversion]").
MFC after: 3 days
- Host Identity Protocol (RFC5201)
- Shim6 Protocol (RFC5533)
- 2x experimentation and testing (RFC3692, RFC4727)
This does not indicate interest to implement/support these protocols,
but they are part of the "IPv6 Extension Header Types" [1] based on RFC7045
and might thus be needed by filtering and next header parsing
implementations.
References: [1] http://www.iana.org/assignments/ipv6-parameters
Obtained from: http://www.iana.org/assignments/protocol-numbers
MFC after: 1 week
by removing unsued file local functions and then unused callees.
A lot more warnings to resolve but someone had to break the ice.
MFC after: 10 days
X-Comment: I am not the new maintainer; chime in, it's ours.
- Do not update the histogram for items we are any way deleting from cache.
- Do not update the histogram if nfsrc_tcphighwater is not set.
- Remove some extra math operations.
emulation.
The vlapic initialization and cleanup is done via processor specific vmm_ops.
This will allow the VT-x/SVM modules to layer any hardware-assist for APIC
emulation or virtual interrupt delivery on top of the vlapic device model.
Add a parameter to 'vcpu_notify_event()' to distinguish between vlapic
interrupts versus other events (e.g. NMI). This provides an opportunity to
use hardware-assists like Posted Interrupts (VT-x) or doorbell MSR (SVM)
to deliver an interrupt to a guest without causing a VM-exit.
Get rid of lapic_pending_intr() and lapic_intr_accepted() and use the
vlapic_xxx() counterparts directly.
Associate an 'Apic Page' with each vcpu and reference it from the 'vlapic'.
The 'Apic Page' is intended to be referenced from the Intel VMCS as the
'virtual APIC page' or from the AMD VMCB as the 'vAPIC backing page'.
- Remove locking, since all module(9) events are running under &Giant.
- Use TAILQ for protocol handlers and fix a bug which led to
infinite cycle. Bug found in VirtualBox [1]
- Simplify code everywhere.
- Fix documentation.
[1] https://www.virtualbox.org/pipermail/vbox-dev/2013-November/011936.html
PR: 183792 [1]
Submitted by: Valery Ushakov <uwe NetBSD.org> [1]
Sponsored by: Nginx, Inc.
when a Getattr for a file is done by a client other than the one that
holds the file's delegation. This would only happen when delegations
are enabled and the problem is fixed by this patch.
MFC after: 1 week
reported to the freebsd-fs mailing list. I believe the problem was
caused by the Readdir operation using VFS_VGET() for a snapshot file entry
instead of VOP_LOOKUP(). This would not occur for NFSv3, since it
will do a VFS_VGET() of "." which fails with ENOTSUPP at the beginning
of the directory, whereas NFSv4 does not check "." or "..". This
patch adds a call to VFS_VGET() for the directory being read to check
for ENOTSUPP.
I also observed that the mount_on_fileid and fsid attributes were
not correct at the snapshot's auto mountpoints when looking at packet
traces for the Readdir. This patch fixes the attributes by doing a check
for different v_mount structure, even if the vnode v_mountedhere is not
set.
Reported by: jas@cse.yorku.ca
Tested by: jas@cse.yorku.ca
Reviewed by: asomers
MFC after: 1 week
They are stored as two separate characters in the vtbuf, so copy-pasting
will cause them to be passed to terminal_input_char() twice. Extend
terminal_input_char() to explicitly discard characters with TF_CJK_RIGHT
set. This causes only the left part to generate input.
4171 clean up spa_feature_*() interfaces
4172 implement extensible_dataset feature for use by other zpool
features
illumos/illumos-gate@2acef22db7
MFC after: 2 weeks
4168 ztest assertion failure in dbuf_undirty
4169 verbatim import causes zdb to segfa
4170 zhack leaves pool in ACTIVE state
illumos/illumos-gate@7fdd916c47
MFC after: 2 weeks
nfsv4_fillattr() as NULLs for the Getattr callback. This caused
nfsv4_fillattr() to not fill in the Change attribute for the reply.
I believe this was a violation of the RFC, but had little effect on
server behaviour. This patch passes a non-NULL p argument to fix this.
MFC after: 1 week
- Add a generic routine to trigger an LVT interrupt that supports both
fixed and NMI delivery modes.
- Add an ioctl and bhyvectl command to trigger local interrupts inside a
guest. In particular, a global NMI similar to that raised by SERR# or
PERR# can be simulated by asserting LINT1 on all vCPUs.
- Extend the LVT table in the vCPU local APIC to support CMCI.
- Flesh out the local APIC error reporting a bit to cache errors and
report them via ESR when ESR is written to. Add support for asserting
the error LVT when an error occurs. Raise illegal vector errors when
attempting to signal an invalid vector for an interrupt or when sending
an IPI.
- Ignore writes to reserved bits in LVT entries.
- Export table entries the MADT and MP Table advertising the stock x86
config of LINT0 set to ExtInt and LINT1 wired to NMI.
Reviewed by: neel (earlier version)
o Forward termianl framebuffer ioctl to fbd.
o Forward terminal mmap request to fbd.
o Move inclusion of sys/conf.h to vt.h.
Sponsored by: The FreeBSD Foundation
for the Getattr and Recall callbacks. This patch fixes it.
Since the NFSv4.1 specific error codes would only happen for
abnormal circumstances, this patch has little effect, in practice.
MFC after: 1 week
Instead of taking 8 specific bytes of file handle to identify file during
RPC thread affitinity handling, use trivial hash of the full file handle.
ZFS's struct zfid_short does not have padding field after the length field,
as result, originally picked 8 bytes are loosing lower 16 bits of object ID,
causing many false matches and unneeded requests affinity to same thread.
This fix substantially improves NFS server latency and scalability in SPEC
NFS benchmark by more flexible use of multiple NFS threads.
Sponsored by: iXsystems, Inc.
Instead of only wrapping when in the 'wrapped state', also force
wrapping when the character to be rendered does not fit on the line
anymore.
Tested by: lwhsu
capture mode together with the timecounter's PPS polling feature to get
very accurate PPS capture without any interrupt processing (or latency).
Hardware timers 4 through 7 have associated capture-trigger input pins.
When the PPS support is compiled in the code automatically chooses the
first timer it finds that has the capture-trigger pin set to input mode
(this is configured via the fdt data).
- Use named constants for register bits, instead of mystery numebrs
scattered around in the code.
- Use inline functions for bus space read/write, instead of macros
that rely on global variables.
- Move the timecounter struct into the softc instead of treating it
as a global variable. Backlink from it to the softc.
- This leaves a pointer to the softc as the only static/global variable
and it's now used only by DELAY().
state before the requested state transition. This guarantees that there is
exactly one ioctl() operating on a vcpu at any point in time and prevents
unintended state transitions.
More details available here:
http://lists.freebsd.org/pipermail/freebsd-virtualization/2013-December/001825.html
Reviewed by: grehan
Reported by: Markiyan Kushnir (markiyan.kushnir at gmail.com)
MFC after: 3 days
clang-specific or gcc-specific flags, introduce the following new
variables for use in Makefiles:
CFLAGS.clang
CFLAGS.gcc
CXXFLAGS.clang
CXXFLAGS.gcc
In bsd.sys.mk, these get appended to the regular CFLAGS or CXXFLAGS for
the right compiler.
MFC after: 1 week
The priority goes from "error" to "debug".
Connectors are polled every 10 seconds. Reading EDID is part of this
polling. However, when an invalid EDID is returned, this error message
is logged. When using Newcons for instance, having a kernel message
every 10 seconds is getting annoying.
Now that it's a debug message, it'll be logged only if hw.dri.debug is
enabled. This fix console spamming for some users.
Tested by: Larry Rosenman <ler@lerctr.org>
clients. Mask RX interrupts while grabbed on the atmel serial
driver. This UART interrupts every character. When interrupts are
enabled at the mountroot> prompt, this means the ISR eats the
characters. Rather than try to create a cooperative buffering system
for the low level kernel console, instead just mask out the ISR. For
NS8250 and decsendents this isn't needed, since interrupts only happen
after 14 or more characters (depending on the fifo settings). Plumb
such that these are optional so there's no change in behavior for all
the other UART clients. ddb worked on this platform because all
interrupts were disabled while it was running, so this problem wasn't
noticed. The mountroot> issue has been around for a very very long
time.
MFC after: 3 days
... for msleep/cv_*wait() return values, where wait_event*() is used
on Linux. ERESTARTSYS is the return code expected by callers when the
operation was interrupted.
For instance, this is the case of radeon_cs_ioctl() (radeon_cs.c): if
an error occurs, and the code isn't ERESTARTSYS (eg. EINTR), it logs an
error.
Note that ERESTARTSYS is defined as ERESTART, but this keeps callers'
code close to Linux.
Submitted by: avg@ (previous version)
to the soreceive(). This exposed a bug. When reading from a raw socket,
when our fake limit is depleted, we receive a truncated mbuf chain, with
m->m_pkthdr.len > m_length(m). The first problem is that MSG_TRUNC was not
handled. The second one is that we didn't reinit uio_resid in our endless
loop (neither flags), and if socket buffer contained several records, then
we quickly deplete our fake limit. The third bug, actually introduced in
r248885, is that MJUMPAGESIZE isn't enough to handle maximum packet that
ng_ksocket(4) can theoretically receive.
Changes:
- Reinit uio_resid and flags before every call to soreceive().
- Set maximum acceptable size of packet to IP_MAXPACKET. As for now the
module doesn't support INET6.
- Properly handle MSG_TRUNC return from soreceive().
PR: 184601
Submitted & tested by: Viktor Velichkin <avisom yandex.ru>
Sponsored by: Nginx, Inc.
Normal and bold fonts each have a glyph map for single or left half-
glyphs, and right half glyphs. The flag TF_CJK_RIGHT in term_char_t
requests the right half-glyph.
Reviewed by: ed@
Sponsored by: The FreeBSD Foundation
The previous code was checking the "VGA Enable" bit on the video card's
parent PCI-to-PCI bridge only. This didn't work for the case where the
video card is attached to the root PCI bus (ie. the card has no parent
PCI-to-PCI bridge).
Now, the new code:
1. checks the "VGA Enable" bit on the parent bridge only if it's a
PCI-to-PCI bridge;
2. always checks the "I/O" and "Memory address space decoding" bits
on the video card itself.
However, vendor-specific bits are not used.
This fixes the use of many integrated Radeon cards: without this patch,
we fail to detect them as the boot display and, when radeonkms looks for
the Video BIOS, it skips the shadow copy made by the System BIOS. It
then fails to fully initialize the card, because the shadow copy is the
only way to read the Video BIOS in these situations. A workaround was to
force the boot display selection using the "hw.pci.default_vgapci_unit"
tunable.
A previous version of this patch added a new function doing the checks.
Now, the vga_pci_is_boot_display() function is used to perform the
checks (only until the boot display is found) and return if the given
device is the boot display or not.
Furthermore, vga_pci_attach() logs "Boot video device" if the card being
attached it the Chosen One:
vgapci0: <VGA-compatible display> [...]
vgapci0: Boot video device
Reviewed by: kib@, jhb@ (both a previous version)
Tested by: lunatic_ (#freebsd-xorg, integrated Radeon card,
xmj (#freebsd-xorg, i915+NVIDIA cards)
This, and several subsequent commits, are suspend/resume for various PowerMac
drivers, which will include a change to the global suspend/resume code
eventually.
Introduce a new formatting bit (TF_CJK_RIGHT) that is set when putting a
cell that is the right part of a CJK fullwidth character. This will
allow drivers like vt(9) to support fullwidth characters properly.
emaste@ has a patch to extend vt(9)'s font handling to increase the
number of Unicode -> glyph maps from 2 ({normal,bold)} to 4
({normal,bold} x {left,right}). This will need to use this formatting
bit to determine whether to draw the left or right glyph.
Reviewed by: emaste
Do not insert active ports into pool->sp_active list if they are success-
fully assigned to some thread. This makes that list include only ports that
really require attention, and so traversal can be reduced to simple taking
the first one.
Remove idle thread from pool->sp_idlethreads list when assigning some
work (port of requests) to it. That again makes possible to replace list
traversals with simple taking the first element.
With this, also shut shut off the display (DPMS-style) and disable the clocking
when the backlight level is set to 0. This is taken from the radeonkms driver
(radeon_legacy_encoders.c) which doesn't yet support PowerPC, and won't for a
while, as it's missing full AGP support.
using cpuid can be quirky (this is the case of VMWare without the
vPMC support) but fail to probe hwpmc.
o Apply the fix for XEON family of processors as established by
315338-020 document (bug AJ85).
Sponsored by: EMC / Isilon storage division
Reviewed by: fabient
eneough queue space before queuing a bunch of IP fragments.
As the comment in the committed change says, in the post-if_transmit(),
post-SMP, post-preemption world, there's just too much overlapping
concurrent code paths and different approaches to driver transmit
queue management to have this code even remotely be effective.
The only specific place it could be useful is if ALTQ is enabled
but again it doesn't at all promise that all the fragments will be
transmitted anyway.
The main reason for committing this change is to disable a parallel
place where the drops counter is incremented. This is a side effect
of an upcoming change to ixgbe/cxgbe to handle the queue drops
counter slightly better.
Sponsored by: Netflix, Inc.
The least significant 8 bits of 'pm_flags' are now used for the IPI vector
to use for nested page table TLB shootdown.
Previously we used IPI_AST to interrupt the host cpu which is functionally
correct but could lead to misleading interrupt counts for AST handler. The
AST handler was also doing a lot more than what is required for the nested
page table TLB shootdown (EOI and IRET).
Qualcomm Snapdragon S4 and Snapdragon 400/600/800 SoCs and has architectural
similarities to ARM Cortex-A15. As for development boards IFC6400 series embedded
boards from Inforce Computing uses Snapdragon S4 Pro/APQ8064.
Approved by: stas (mentor)
When processing receive buffer, write the amount of data, expected
in present request record, into socket's so_rcv.sb_lowat to make stack
aware about our needs. When processing following upcalls, ignore them
until socket collect enough data to be read and processed in one turn.
This change reduces number of context switches and other operations
in RPC stack during large NFS writes (especially via non-Jumbo networks)
by order of magnitude.
After precessing current packet, take another look into the pending
buffer to find out whether the next packet had been already received.
If not, deactivate this port right there without making RPC code to
push this port to another thread just to find that there is nothing.
If the next packet is received partially, also deactivate the port, but
also update socket's so_rcv.sb_lowat to not be woken up prematurely.
This change additionally reduces number of context switches per NFS
request about in half.
Previously the code would just iterate over the whole tree as if it were
just a list.
Without this change I would observe X server becoming more and more
jerky over time.
MFC after: 5 days
covered by sbintime (LONG_MAX seconds).
Some programs use timeout values in excess of 1000 years. The conversion
to sbintime caused wrap-around on overflow, which resulted in short or
negative timeout values. This caused long delays on sockets opened by
affected programs (e.g. OpenSSH).
Kernels compiled without -fno-strict-overflow were not affected, apparently
because the compiler tested the sign of the timeout value before performing
the multiplication that lead to overflow.
When the -fno-strict-overflow option was added to CFLAGS, this optimization
was disabled and the test was performed on the result of the multiplication.
Negative products were caught and resulted in EINVAL being returned, but
wrap-around to positive values just shortened the timeout value to the
residue of the result that could be represented by sbintime.
The fix is to cap the timeout values at the maximum that can be represented
by sbintime, which is 2^31 - 1 seconds or more than 68 years.
After this change, the kernel can be compiled with -fno-strict-overflow
with no ill effects.
MFC after: 3 days
linker_unload_file() rather than kern_kldload() and kern_kldunload(). This
ensures that the handlers are invoked for files that are loaded/unloaded
automatically as dependencies. Previously, they were only invoked for files
loaded by a user.
As a side effect, the kld_load and kld_unload handlers are now invoked with
the kernel linker lock exclusively held.
Reported by: avg
Reviewed by: jhb
MFC after: 2 weeks
re-links dynamic states to default rule instead of
flushing on rule deletion.
This can be useful while performing ruleset reload
(think about `atomic` reload via changing sets).
Currently it is turned off by default.
MFC after: 2 weeks
Sponsored by: Yandex LLC
to fail and return error.
- Use make_dev_p() in tty_makedevf() instead of make_dev_cred().
- Always pass MAKEDEV_CHECKNAME flag.
- Optionally pass MAKEDEV_REF flag.
- Provide macro for compatibility with old API.
This fixes races with simultaneous creation and desctruction of
ttys, and makes it possible to call tty_makedevf() from device
cloners.
A race in tty_watermarks() still exist, since the latter drops
lock for M_WAITOK allocation. This will be addressed in separate
commit.
Reviewed by: kib
Sponsored by: Nginx, Inc.
child process that were inherited from its parent. However, this should
not be done in the case of a vfork, since the fork handler ends up removing
the tracepoints from the shared vm space, and userland DTrace probes in the
parent will no longer fire as a result.
Now the child of a vfork may trigger userland DTrace probes enabled in its
parent, so modify the fasttrap probe handler to handle this case and handle
the child process in the same way that it would handle the traced process.
In particular, if once traces function foo() in a process that vforks, and
the child calls foo(), fasttrap will treat this call as having come from the
parent. This is the behaviour of the upstream code.
While here, add #ifdef guards to some code that isn't present upstream.
MFC after: 1 month
in6addr_any and is not in the CLIP table either. This fixes a reported
TOE+IPv6 NULL-dereference panic in do_pass_open_rpl().
While here, stop creating hardware servers for any loopback address.
It's just a waste of server tids.
MFC after: 1 week
advisory lock cannot be obtained, prevent double-close of the vnode in
vn_close() called from the fdrop(), by resetting file' f_ops methods.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Besides not making sense, open(O_EXEC) for fifo creates fifoinfo with
zero readers and writers counts, which causes premature free of pipes.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
callers treat the MSI 'addr' and 'data' fields as opaque and also lets
bhyve implement multiple destination modes: physical, flat and clustered.
Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
Reviewed by: grehan@
This allows it to be better tracked as well as being able to leverage
UMA for more interesting/useful behaviour at a later date.
Sponsored by: Netflix, Inc.
Some Intel XHCI controlles timeout processing so-called "TRBs" when
the final LINK TRB of a so-called "TD" has the CHAIN-BIT set.
MFC after: 1 week
Tested by: glebius @
the TTY. In such a case, ttydev_close() is called multiple times and
each time, t_revokecnt is incremented and cv_broadcast() is called for
both the t_outwait and t_inwait condition variables.
Let's say revoke(2) comes in first and gets to call tty_drain() from
ttydev_leave(). Let's say that the revoke comes from init(8) as the
result of running "shutdown -r now". Since shutdown prints various
messages to the console before announing that the machine will reboot
immediately, let's also say that the output queue is not empty and
that tty_drain() has something to do. Let's assume this all happens
on a 9600 baud serial console, so it takes a time to drain.
The shutdown command will exit(2) and as such will end up closing
stdout. Let's say this close will come in second, bump t_revokecnt
and call tty_wakeup(). This has tty_wait() return prematurely and
the next thing that will happen is that the thread doing revoke(2)
will flush the TTY. Since the drain wasn't complete, the flush will
effectively drop whatever is left in t_outq.
This change takes into account that tty_drain() will return ERESTART
due to the fact that t_revokecnt was bumped and in that case simply
call tty_drain() again. The thread in question is already performing
the close so it can safely finish draining the TTY before destroying
the TTY structure.
Now all messages from shutdown will be printed on the serial console.
Obtained from: Juniper Networks, Inc.