before dropping process lock. Clear P_STOPPROF when doing wakeup.
Both issues caused thread to hang in stopprofclock() "stopprof" sleep.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
ZFS large block support.
Please note that booting from datasets that have recordsize greater
than 128KB is not supported (but it's Okay to enable the feature on
the pool). This *may* remain unchanged because of memory constraint.
Limited safety belt is provided for mounted root filesystem but use
caution is advised.
Illumos issue:
5027 zfs large block support
MFC after: 1 month
With command serialization used in CTL, there are no other commands to abort
when PREEMPT AND ABORT gets to run, so it is practically equal to PREEMPT.
MFC after: 1 week
have chosen different (and more traditional) stateless/statuful
NAT64 as translation mechanism. Last non-trivial commits to both
faith(4) and faithd(8) happened more than 12 years ago, so I assume
it is time to drop RFC3142 in FreeBSD.
No objections from: net@
The prior change to not enable LRO by default has confused several
people. The configurations where LRO is problematic is not the
typical use case for VirtIO, and due to other issues, this often
requires checksum offloading to be disabled anyways.
PR: 185864
MFC after: 2 weeks
It isn't safe to keep unreferenced ifaddrs. Use in6ifa_ifwithaddr() to
determine ifaddr corresponding to destination address. Since currently
we keep addresses with embedded scope zone, in6ifa_ifwithaddr is called
with zero zoneid and marked with XXX.
Also remove route and lle lookups from ip6_input. Use in6ifa_ifwithaddr()
instead.
Sponsored by: Yandex LLC
gcc requires variables to be initialised in two places. One of them
is correctly used only under the same conditional though.
For module builds properly check if the kernel supports INET or INET6,
as otherwise various mips kernels without IPv6 support would fail to build.
Create a proper stack frame for amd64 version of bcopy(). Note that
this also makes the stack properly aligned in the function, despite it
is not strictly needed.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- Remove duplicated sources between standard part of the kernel and
module. In particular, it caused duplicated lock initialization and
sysctl registration, both having bad consequences.
- Add missed source files to module.
- Static part of the kernel provides randomdev module, not
random_adaptors. Correct dependencies.
- Use cdev modules declaration macros.
Approved by: secteam (delphij)
Reviewed by: markm
buffer from asm to C, which reduces amount of arguments for inline asm
and simplifies constraints. Use unsigned types consistently.
Submitted by: bde
Approved by: secteam (delphij)
Reviewed by: markm
MFC after: 1 week
After resource allocation and release, resource list entry
stays non-NULL. This causes panic in ofwbus_alloc_resource()
on subsequent resource allocation.
Clean appropriate list entry on release to avoid this.
Obtained from: Semihalf
Reviewed by: ian
Sponsored by: The FreeBSD Foundation
Split it into two modules: if_gre(4) for GRE encapsulation and
if_me(4) for minimal encapsulation within IP.
gre(4) changes:
* convert to if_transmit;
* rework locking: protect access to softc with rmlock,
protect from concurrent ioctls with sx lock;
* correct interface accounting for outgoing datagramms (count only payload size);
* implement generic support for using IPv6 as delivery header;
* make implementation conform to the RFC 2784 and partially to RFC 2890;
* add support for GRE checksums - calculate for outgoing datagramms and check
for inconming datagramms;
* add support for sending sequence number in GRE header;
* remove support of cached routes. This fixes problem, when gre(4) doesn't
work at system startup. But this also removes support for having tunnels with
the same addresses for inner and outer header.
* deprecate support for various GREXXX ioctls, that doesn't used in FreeBSD.
Use our standard ioctls for tunnels.
me(4):
* implementation conform to RFC 2004;
* use if_transmit;
* use the same locking model as gre(4);
PR: 164475
Differential Revision: D1023
No objections from: net@
Relnotes: yes
Sponsored by: Yandex LLC
it, except Ethernet, where it carried ng_ether(4) pointer.
For now carry the pointer in if_l2com directly.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
- Support the KDB alt break sequence to enter the debugger,
panic, reboot, etc. [1]
- Provide emergency write feature description. Note that QEMU
does not implement this feature.
- Make the VTCON_FLAG_* defines sequential once again.
- When the multiple port feature is not negotiated, query the
rows and columns of the one console during the device attach
when the size feature is negotiated.
- Report failure to the device if hot plugging a port fails.
- Acknowledge the console port event with an open event. This
is required by the spec, but QEMU doesn't seem to care.
Submitted by: Juniper [1]
MFC after: 1 month
-Improved VF stability, thanks to changes from Ryan Stone,
and Juniper.
- RSS fixes in the ixlv driver
- link detection in the ixlv driver
- New sysctl's added in ixl and ixlv
- reset timeout increased for ixlv
- stability fixes in detach
- correct media reporting
- Coverity warnings fixed
- Many small bug fixes
- VF Makefile modified - nvm shared code needed
- remove unused sleep channels in ixlv_sc struct
Submitted by: Eric Joyner (committed by jfv)
MFC after: 1 week
It turns out an alignment of zero can lead to an endless loop in the
vm reservations code, so specifically disallow that. The manpage says
hardware which can do dma at any address should use a value of one, which
hints at the forbiddeness of zero without exactly saying it. Several
other conditions which could lead to insanity in working with the tag are
also checked now.
Every existing call to bus_dma_tag_create() (about 680 of them) was
eyeballed for violations of these things, and two alignment=0 glitches
were fixed. It's possible something was missed, but overall this
shouldn't lead to any arm users suddenly experiencing failures.
problems than it solves. SYSDIR is already defined almost always and
can be used instead. Working around the one case where it isn't is
much easier than working around the fact that @ may not exist in 18
other places.
Differential Revision: https://reviews.freebsd.org/D1100
Some virtual if drivers has (ab)used ifa ifa_rtrequest hook to enforce
route MTU to be not bigger that interface MTU. While ifa_rtrequest hooking
might be an option in some situation, it is not feasible to do MTU checks
there: generic (or per-domain) routing code is perfectly capable of doing
this.
We currrently have 3 places where MTU is altered:
1) route addition.
In this case domain overrides radix _addroute callback (in[6]_addroute)
and all necessary checks/fixes are/can be done there.
2) route change (especially, GW change).
In this case, there are no explicit per-domain calls, but one can
override rte by setting ifa_rtrequest hook to domain handler
(inet6 does this).
3) ifconfig ifaceX mtu YYYY
In this case, we have no callbacks, but ip[6]_output performes runtime
checks and decreases rt_mtu if necessary.
Generally, the goals are to be able to handle all MTU changes in
control plane, not in runtime part, and properly deal with increased
interface MTU.
This commit changes the following:
* removes hooks setting MTU from drivers side
* adds proper per-doman MTU checks for case 1)
* adds generic MTU check for case 2)
* The latter is done by using new dom_ifmtu callback since
if_mtu denotes L3 interface MTU, e.g. maximum trasmitted _packet_ size.
However, IPv6 mtu might be different from if_mtu one (e.g. default 1280)
for some cases, so we need an abstract way to know maximum MTU size
for given interface and domain.
* moves rt_setmetrics() before MTU/ifa_rtrequest hooks since it copies
user-supplied data which must be checked.
* removes RT_LOCK_ASSERT() from other ifa_rtrequest hooks to be able to
use this functions on new non-inserted rte.
More changes will follow soon.
MFC after: 1 month
Sponsored by: Yandex LLC
We have observed that arc_release() can be called concurrently with a
l2arc in-flight write.
Also, we have observed that arc_hdr_destroy() can be called from
arc_write_done() for a zio with ZIO_FLAG_IO_REWRITE flag in similar
circumstances.
Previously the l2arc headers would be freed while leaking their
associated compression buffers. Now the buffers are placed on
l2arc_free_on_write list for delayed freeing. This is similar to what
was already done to arc buffers that were supposed to be freed
concurrently with in-flight writes of those buffers.
In addition to fixing the discovered leaks this change also adds some
protective code to assert that a compression buffer associated with a
l2arc header is never leaked.
A new kstat l2_cdata_free_on_write is added. It keeps a count of
delayed compression buffer frees which previously would have been leaks.
Tested by: Vitalij Satanivskij <satan@ukr.net> et al
Requested by: many
MFC after: 2 weeks
Sponsored by: HybridCluster / ClusterHQ
It returns only current working directory of given process which saves a lot of
overhead over kern.proc.filedesc if given proc has a lot of open fds.
Submitted by: Tiwei Bie <btw mail.ustc.edu.cn> (slightly modified)
X-Additional: JuniorJobs project
For ZVOL-backed LUNs this allows to inform initiators if storage's used or
available spaces get above/below the configured thresholds.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
According to IANA RPC uaddr registry, there are no AFs
except IPv4 and IPv6, so it's not worth being too abstract here.
Remove ne_rtable[AF_MAX+1] and use explicit per-AF radix tries.
Use own initialization without relying on domattach code.
While I admit that this was one of the rare places in kernel
networking code which really was capable of doing multi-AF
without any AF-depended code, it is not possible anymore to
rely on dom* code.
While here, change terrifying "Invalid radix node head, rn:" message,
to different non-understandable "netcred already exists for given addr/mask",
but less terrifying. Since we know that rn_addaddr() returns NULL if
the same record already exists, we should provide more friendly error.
MFC after: 1 month
Therefore, to set histry size to 2000 lines, add the following line to
your kernel configuration file:
options SC_HISTORY_SIZE=2000
The default history remains at 500 lines.
MFC after: 1 week
some accumulated entropy twice and use that as the new key. Due to a
typo, we were using the output of the first hash round instead of the
second. Correct this, but eliminate temp[] since we can reuse hash[].
Also add comments explaining what is going on and why.
Noticed by: Sami Farin <sami.farin@gmail.com>
Reviewed by: markm@
Approved by: so (des)
default_pager_putpages() and swap_pager_putpages().
It is the same fix as was done for vnode_pager_putpages()
in r271586.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
especially for platforms where unaligned access is not allowed. Make
it possible to override the small buffer size.
A simple continuous read string test using libusb showed a reduction
in CPU usage from roughly 10% to less than 1% using a dual-core GHz
CPU, when the malloc() operation was skipped for small buffers.
MFC after: 2 weeks
vt(4) is a new console driver which brings features such as:
o Support for Unicode and double-width characters
o Integration with the KMS kernel video drivers
o Support for UEFI
You may need to update your console settings in /etc/rc.conf, most
probably the keymap. During boot, /etc/rc.d/syscons will indicate what
you need to do.
vt(4) still has issues and lacks some features compared to syscons(4).
See the wiki for up-to-date information:
https://wiki.freebsd.org/Newcons
If you want to keep using syscons(4), you can do so by adding the
following line to /boot/loader.conf:
kern.vty=sc
Differential Revision: https://reviews.freebsd.org/D1005
Discussed with: emaste@, nwhitehorn@, ray@
Relnotes: yes
on my part about north bridge/GPU pci ids and use of aperture.
Leave the agp_intel.c out of static compilation on amd64, it makes the
things consistent with agp.ko.
Pointed out by: tijl
Sponsored by: The FreeBSD Foundation
MFC after: 13 days
... and their associated tunables. This gives a way to know the list of
available connectors, no matter the driver.
The problem is that xrandr(1) can list connectors but it uses a
different naming.
MFC after: 1 week
Impact on capability races was small: it was possible to get a spurious
ENOTCAPABLE (early return), but it was not possible to bypass checks.
Tidy up some comments.
875. This intersects with the agp_i810.c, which supports all Intels
from i810 to Core i5/7. Both agp_intel.c and agp_i810.c are compiled
into kernel when device agp is specified in config, and agp_i810
attach seems to be selected by chance due to linking order.
Strip support for 810 and later from agp_intel.c. Since 440-class
chipsets do not support any long-mode capable CPUs, remove agp_intel.c
from amd64 kernel file list. Note that agp_intel.c is not compiled
into agp.ko on amd64 already.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
talked about. Explain where the mentioned trampoline located
(usermode), and the fact that attempt to exit last thread is denied in
kernel (by delegating the work to usermode).
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
By default, vt(4) gets the "preferred mode" from DRM, when using a DRM
video driver as its backend. The preferred mode is usually the native
screen resolution.
Now, if this mode isn't appropriate, a user can use loader tunables to
select a mode. The tunables are read in the following order:
1. kern.vt.fb.modes.$connector_name
2. kern.vt.fb.default_mode
For example, to set a 1024x768 mode, no matter the connector:
kern.vt.fb.default_mode="1024x768"
To set a 800x600 mode only on the laptop builtin screen:
kern.vt.fb.modes.LVDS-1="800x600"
MFC after: 1 week
Currently sizeof(struct filedesc0) is 1096 bytes, which means allocations from
malloc use 2048 bytes.
There is no easy way to shrink the structure <= 1024 an it is likely to grow in
the future.
random_adaptors_lock is held.
- Use sx_sleep instead of tsleep in read and write path to allow
another thread that registers a new random adapter when waiting.
Assert that random_adaptor is not NULL after reacquiring the lock.
- Capture EINTR/ERESTART from sx_sleep to allow the blocking cycle be
stopped when user requests so, while there also make short
read/write's return 0.
- Move M_WAITOK allocations out of lock scope.
In collobration with: kib, markm, ian, jilles
Reviewed by: kib, markm
Approved by: so
support for AVX on i386.
- Similar to amd64, move the FPU save area out of the PCB and instead
store saved FPU state in a variable-sized buffer after the PCB on the
stack.
- To support the variable PCB location, alter the locore code to only use
the bottom-most page of proc0stack for init386(). init386() returns
the correct stack pointer to locore which adjusts the stack for thread0
before calling mi_startup().
- Don't bother setting cr3 in thread0's pcb in locore before calling
init386(). It wasn't used (init386() overwrote it at the end) and
it doesn't work with the variable-sized FPU save area.
- Remove the new-bus attachment from npx. This was only ever useful for
external co-processors using IRQ13, but those have not been supported
for several years. npxinit() is now called much earlier during boot
(init386()) similar to amd64.
- Implement PT_{GET,SET}XSTATE and I386_GET_XFPUSTATE.
- npxsave() is now only called from context switch contexts so it can
use XSAVEOPT.
Differential Revision: https://reviews.freebsd.org/D1058
Reviewed by: kib
Tested on: FreeBSD/i386 VM under bhyve on Intel i5-2520
The problem was that only the kbdmux keyboard index was saved in
vd->vd_keyboard. This index is -1 when kbdmux isn't used. In this
case, the keyboard was correctly allocated, but the returned index was
discarded.
PR: 194718
MFC after: 1 week
are not suspended. In particular, on the SU-enabled vulumes, there is
no reason why, between the call to softdep_flushfiles() and
softdep_waitidle(), SU work items cannot be queued.
Correct the condition to trigger the panic by only checking when
forced operation is done. Convert direct panic() call into KASSERT(),
there is no invalid on-disk data structures directly involved, so
follow the usual debugging vs. non-debugging approach.
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
whether the shared request for already shared-locked lock could be
granted. Both problems result in the exclusive locker starvation.
The concurrent exclusive request is indicated by either
LK_EXCLUSIVE_WAITERS or LK_EXCLUSIVE_SPINNERS flags. The reverse
condition, i.e. no exclusive waiters, must check that both flags are
cleared.
Add a flag LK_NODDLKTREAT for shared lock request to indicate that
current thread guarantees that it does not own the lock in shared
mode. This turns back the exclusive lock starvation avoidance code;
see man page update for detailed description.
Use LK_NODDLKTREAT when doing lookup(9).
Reported and tested by: pho
No objections from: attilio
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
updating the GTT and flushing the AGP TLB by storing the GTT in
write-combining memory.
On x86 flushing the AGP TLB is done by an I/O operation or a store to a
MMIO register in uncacheable memory. Both cases imply that WC buffers are
flushed so no memory barriers are needed.
On powerpc there is no WC memory type. It maps to uncacheable memory and
two stores to uncacheable memory, such as to the GTT and then to an MMIO
register, are strongly ordered, so no memory barriers are needed either.
MFC after: 1 month
moving U-Boot specific code from libfdt.a to a new libuboot_fdt.a. This
needs to be a new library for linking to work correctly.
Differential Revision: https://reviews.freebsd.org/D1054
Reviewed by: ian, rpaulo (earlier version)
MFC after: 1 week
A new terminal_set_cursor() is added: it wraps the existing
teken_set_cursor() function.
In vtbuf_grow(), the cursor position is adjusted at the end of the
function. In vt_change_font(), we call terminal_set_cursor() just after
terminal_set_winsize_blank(), while the terminal is mute.
This fixes a bug where, after loading a kernel video driver which
increases the terminal window size, the cursor remains at its old
position, in other words, in the middle of the display content.
PR: 194421
MFC after: 1 week
hold the gpiobus lock between the gpio calls.
gpiobus_acquire_lock() now accepts a third parameter which tells gpiobus
what to do when the bus is already busy.
When GPIOBUS_WAIT wait is used, the calling thread will be put to sleep
until the bus became free.
With GPIOBUS_DONTWAIT the calling thread will receive EWOULDBLOCK right
away and then it can act upon.
This fixes the gpioiic(4) locking issues that arises when doing multiple
concurrent access on the bus.
of fuword(9) and suword(9). This makes the functions type-compatible
with volatile objects and does not require devolatile force, e.g. in
kern_umtx.c.
Requested by: bde
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
The check was recommened in the draft-ietf-ngtrans-mech-05.txt. But it isn't
clear, should it compare the source with all direct broadcast addresses in the
system or not.
RFC 4213 says it is enough to verify that the source address is the address
of the encapsulator, as configured on the decapsulator. And this verification
can be extended by administrator with any other forms of IPv4 ingress filtering.
Discussed with: glebius, melifaro
Sponsored by: Yandex LLC