This reduces amount of locking required for these zones.
Also, for cache only zones (UMA_ZFLAG_CACHE) accounting uz_items wasn't
correct at all, since they may allocate items directly from their backing
store and then free them via UMA underflowing uz_items.
Tested by: pho
atomic updates and reduces amount of data protected by zone lock.
During startup point these fields to EARLY_COUNTER. After startup
allocate them for all early zones.
Tested by: pho
Rather than mentioning the requirement for 4.x binaries but not
explaining why (it was assuming an upgrade from 4.x to 5.0-current),
explain when compat options are needed (for running existing host
binaries) in a more general way while using a more modern example
(COMPAT_FREEBSD11 for 11.x binaries). While here, explicitly mention
that a GENERIC kernel should always work.
Reported by: Robert Huff <roberthuff@rcn.com>
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18740
When the TCP window scale option is not used, and the window
opens up enough in one soreceive, a window update will not be sent.
For example, if recwin == 65535, so->so_rcv.sb_hiwat >= 262144, and
so->so_rcv.sb_hiwat <= 524272, the window update will never be sent.
This is because recwin and adv are clamped to TCP_MAXWIN << tp->rcv_scale,
and so will never be >= so->so_rcv.sb_hiwat / 4
or <= so->so_rcv.sb_hiwat / 8.
This patch ensures a window update is sent if the window opens by
TCP_MAXWIN << tp->rcv_scale, which should only happen when the window
size goes from zero to the max expressible.
This issue looks like it was introduced in r306769 when recwin was clamped
to TCP_MAXWIN << tp->rcv_scale.
MFC after: 1 week
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D18821
The cache_handle_range macro to handle the arm64 instruction and data
cache operations would return when it was complete. This causes problems
for arm64_icache_sync_range and arm64_icache_sync_range_checked as they
assume they can execute the i-cache handling instruction after it has been
called.
Fix this by making this assumption correct.
While here add missing instruction barriers and adjust the style to
match the rest of the assembly.
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18838
Sometimes, for negated tables, pf can log 'pfr_update_stats: assertion failed'.
This warning does not clarify anything for users, so silence it, just as
OpenBSD has.
PR: 234874
MFC after: 1 week
o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many
pbufs are we going to have set.
In various subsystems that are going to utilize pbufs create private zones
via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(),
and sets a limit on created zone. After startup preallocate pbufs according
to requirements of all pbuf zones.
Subsystems that used to have a private limit with old allocator now have
private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS,
swap, vnode pager.
The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9),
aio(4). They should have their private limits, but changing that is out of
scope of this commit.
o Fetch tunable value of kern.nswbuf from init_param2() and while here move
NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only
this option.
Default values aren't touched by this commit, but they probably should be
reviewed wrt to modern hardware.
This change removes a tight bottleneck from sendfile(2) operation, that
uses pbufs in vnode pager. Other pagers also would benefit from faster
allocation.
Together with: gallatin
Tested by: pho
Do not lose error condition by always returning 0 from set_led.
None of the calls to set_led checks for return value at the moment so
none of API consumers in base is affected.
PR: 231567
Submitted by: Bertrand Petit <bsdpr@phoe.frmug.org>
MFC after: 1 week
mv_pci driver reads PCI memory window layout from DTB data and if the
data is incomplete falls back to default value. The value is too small
to fit two PCI spaces for mwlwifi devices on WRT3200ACM so the resource
allocation for them fails. Increase the default to 4Mb from 1Mb so
the devices can be properly attached.
MFC after: 1 week
two zones sharing a keg may have different limits. Now this is going
to work:
zone = uma_zcreate();
uma_zone_set_max(zone, limit);
zone2 = uma_zsecond_create(zone);
uma_zone_set_max(zone2, limit2);
Kegs no longer have uk_maxpages field, but zones have uz_items. When
set, it may be rounded up to minimum possible CPU bucket cache size.
For small limits bucket cache can also be reconfigured to be smaller.
Counter uz_items is updated whenever items transition from keg to a
bucket cache or directly to a consumer. If zone has uz_maxitems set and
it is reached, then we are going to sleep.
o Since new limits don't play well with multi-keg zones, remove them. The
idea of multi-keg zones was introduced exactly 10 years ago, and never
have had a practical usage. In discussion with Jeff we came to a wild
agreement that if we ever want to reintroduce the idea of a smart allocator
that would be able to choose between two (or more) totally different
backing stores, that choice should be made one level higher than UMA,
e.g. in malloc(9) or in mget(), or whatever and choice should be controlled
by the caller.
o Sleeping code is improved to account number of sleepers and wake them one
by one, to avoid thundering herd problem.
o Flag UMA_ZONE_NOBUCKETCACHE removed, instead uma_zone_set_maxcache()
KPI added. Having no bucket cache basically means setting maxcache to 0.
o Now with many fields added and many removed (no multi-keg zones!) make
sure that struct uma_zone is perfectly aligned.
Reviewed by: markj, jeff
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D17773
shell scripts in base, removes the bogus "copyleft", adds the BeerWare license
header and uses rc.subr(8) new 'enable' keyword for adding entries in
rc.conf(5).
Submitted by: erdgeist <erdgeist@erdgeist.org>
Approved by: bapt
MFC after: 2 weeks
vm_kmem_size is u_long, and it might be not capable of holding page
count times PAGE_SIZE, even when scaled down by VM_KMEM_SIZE_SCALE. As
bde reported, 12G PAE config ends up with zero for kmem size.
Explicitly check for overflow and clamp kmem size at vm_kmem_size_max.
If we end up at zero size because VM_KMEM_SIZE_MAX is not defined,
panic with clear explanation rather then failing in a way which is
hard to relate.
Reported by: bde, pho
Tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18767
Add asn1_compile, make-roken, kcc, and slc to the OptionalObsoleteFiles.inc
so they would be removed during delete-old stage if the new world is built
without Kerberos support.
PR: 230725
Submitted by: Dmitry Wagin <dmitry.wagin@ya.ru>
MFC after: 1 week
Just like for Acer C270 chromebook the E820 extmem workaround is required for
FreeBSD to boot on Dell chromebook.
PR: 204916
Submitted by: Keith White <kwhite@site.uottawa.ca>
MFC after: 1 week
cpu_icache_sync_range(), except that it sets pcb_onfault to catch any page
fault, as doing cache maintenance operations for non-mapped generates a
data abort, and use it in freebsd32_sysarch(), so that a userland program
attempting to sync the icache with unmapped addresses doesn't crash the
kernel.
Spotted out by: andrew
r319722 separated struct socket and parts of the socket I/O path into
listening-socket-specific and dataflow-socket-specific pieces. Listening
socket connection notifications are now handled by solisten_wakeup() instead
of sowakeup(), but solisten_wakeup() does not currently post SIGIO to the
owning process.
PR: 234258
Reported by: Kenneth Adelman
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18664
There may be various side effects (device timeout, firmware and / or
kernel panic) when an invalid (or inapplicable - e.g., an MCS rate
for 11g-only device) is set; check rates before sending the frame to
the driver.
How-to-reproduce:
Set an MCS (real or bogus - with 0x80 bit set) rate in ibp_rate0 field
for any device that uses ieee80211_isratevalid() for rate checks -
rum(4), run(4), ural(4), bwi(4) or ral(4); if kernel is compiled
with INVARIANTS the check will result in "rate %d is basic/mcs?" panic.
Tested with WUSB54GC (rum(4)), AP mode.
MFC after: 1 week
pfctl has an issue with 'set skip on <group>', which causes inconsistent
behaviour: the set skip directive works initially, but does not take
effect when the same rules are re-applied.
PR: 229241
MFC after: 1 week
When we skip on a group the kernel will automatically skip on the member
interfaces. We still need to update our own cache though, or we risk
overruling the kernel afterwards.
This manifested as 'set skip' working initially, then not working when
the rules were reloaded.
PR: 229241
MFC after: 1 week
Don't clobber the low part of the register restoring the high component of.
This could lead to very bad behavior if it's an ABI-affected register.
While here, also mark the asm volatile in the SPE high save case, to match
the load case.
Reported by: Branden Bergren (git_bdragon.rtk0.net)
MFC after: 1 week
Summary: reloc_jmpslot function parameter 'defobj' is not used when using ELFv2
ABI
Submitted by: alfredo.junior_eldorado.org.br
Reviewed By: kib, git_bdragon.rtk0.net, emaste, jhibbits
Differential Revision: https://reviews.freebsd.org/D18808
Summary:
I was working on implementing ifuncs on powerpc64 elfv2 today, and I suddenly
realized that the reason I was having so much trouble with AT_HWCAP and
AT_HWCAP2 is they are missing from the sysentvec.
After adding them, the auxv is being filled like it should.
Submitted by: Brandon Bergren (git_bdragon.rtk0.net)
Differential Revision: https://reviews.freebsd.org/D18575
Family 15h is a bit of an oddball. Early models used the same temperature
register and spec (mostly[1]) as earlier CPU families.
Model 60h-6Fh and 70-7Fh use something more like Family 17h's Service
Management Network, communicating with it in a similar fashion. To support
them, add support for their version of SMU indirection to amdsmn(4) and use
it in amdtemp(4) on these models.
While here, clarify some of the deviceid macros in amdtemp(4) that were
added with arbitrary, incorrect family numbers, and remove ones that were
not used. Additionally, clarify intent and condition of heterogenous
multi-socket system detection.
[1]: 15h adds the "adjust range by -49°C if a certain condition is met,"
which previous families did not have.
Reported by: D. C. <tjoard AT gmail.com>
PR: 234657
Tested by: D. C. <tjoard AT gmail.com>
The IPI vector is static, and happens to be the most common interrupt by far
on some systems. Rather than searching for the interrupt every time, cache
the index.
This appears to yield a small performance boost, of about 8% reduction in
buildworld times, on my POWER9 system, when paired with r342975.
The XICS and XIVE need extra data beyond irq and vector. Rather than
performing a separate search, it's better for the general interrupt facility
to hold a private pointer, since the search already must be done anyway at
that level.
Summary:
GCC expects to link in a crtsavres.o on powerpc platforms. On
powerpc64 this is an empty file, but on powerpc and powerpcspe this does contain
some save/restore functions, which may not actually be necessary for newer
modern GCC and clang. This appeases the in-tree gcc, though, and is needed in
order to switch to the BSD CRTRBEGIN.
PR: 233751
Reviewed By: andrew
Differential Revision: https://reviews.freebsd.org/D18826
Check if rate control structures were allocated before trying to
access them in various places; this was possible before on
allocation failure (unlikely), but was revealed after r342211
where allocation was deferred.
In case if driver uses wlan_amrr(4) and it is loaded it
is possible to reproduce the panic via
sysctl net.wlan.<number>.rate_stats
(for wlan0 the number will be 0).
Tested with: RTL8188EE, AP mode + RTL8188CUS, STA mode.
MFC after: 3 days
bc.y: Rev 1.50
- write parse errors to stderr, prompted by Martijn Dekker
- we're only interactive if stdout en stderr are a tty as well as stdin
PR: 234430
Obtained from: OpenBSD
MFC after: 1 week
When building with KCOV enabled the compiler will insert function calls
to probes allowing us to trace the execution of the kernel from userspace.
These probes are on function entry (trace-pc) and on comparison operations
(trace-cmp).
Userspace can enable the use of these probes on a single kernel thread with
an ioctl interface. It can allocate space for the probe with KIOSETBUFSIZE,
then mmap the allocated buffer and enable tracing with KIOENABLE, with the
trace mode being passed in as the int argument. When complete KIODISABLE
is used to disable tracing.
The first item in the buffer is the number of trace event that have
happened. Userspace can write 0 to this to reset the tracing, and is
expected to do so on first use.
The format of the buffer depends on the trace mode. When in PC tracing just
the return address of the probe is stored. Under comparison tracing the
comparison type, the two arguments, and the return address are traced. The
former method uses on entry per trace event, while the later uses 4. As
such they are incompatible so only a single mode may be enabled.
KCOV is expected to help fuzzing the kernel, and while in development has
already found a number of issues. It is required for the syzkaller system
call fuzzer [1]. Other kernel fuzzers could also make use of it, either
with the current interface, or by extending it with new modes.
A man page is currently being worked on and is expected to be committed
soon, however having the code in the kernel now is useful for other
developers to use.
[1] https://github.com/google/syzkaller
Submitted by: Mitchell Horne <mhorne063@gmail.com> (Earlier version)
Reviewed by: kib
Testing by: tuexen
Sponsored by: DARPA, AFRL
Sponsored by: The FreeBSD Foundation (Mitchell Horne)
Differential Revision: https://reviews.freebsd.org/D14599
Extend the vendor class USB audio quirk to cover devices without
the USB audio control descriptor.
PR: 234794
MFC after: 1 week
Sponsored by: Mellanox Technologies
The goal of this change is to make it easier to use getconf to query
the number of available processors.
Sadly it's unclear per POSIX, which form (with a preceding _ or
lacking it) is correct. I will bring this up on the Austin Group list so
this point is clarified for implementors that might rely on this getconf
variable in future POSIX spec versions.
This is something I noticed when trying to import GoogleTest to FreeBSD
as one of the CI scripts uses this variable on Linux.
MFC after: 2 weeks
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D18640