if it can support the PRUS option (OOB). And then have
the new function call that to validate and give the
correct error response if needed to the user (rack
and bbr do not support obsoleted OOB data).
Sponsoered by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D24574
Eliminate the last rtalloc1() call to finish transition to the new routing
KPI defined in r359823.
Differential Revision: https://reviews.freebsd.org/D24663
After converting routing subsystem customers to use nexthop objects
defined in r359823, some fields in struct rtentry became unused.
This commit removes rt_ifp, rt_ifa, rt_gateway and rt_mtu from struct rtentry
along with the code initializing and updating these fields.
Cleanup of the remaining fields will be addressed by D24669.
This commit also changes the implementation of the RTM_CHANGE handling.
Old implementation tried to perform the whole operation under radix WLOCK,
resulting in slow performance and hacks like using RTF_RNH_LOCKED flag.
New implementation looks up the route nexthop under radix RLOCK, creates new
nexthop and tries to update rte nhop pointer. Only last part is done under
WLOCK.
In the hypothetical scenarious where multiple rtsock clients
repeatedly issue RTM_CHANGE requests for the same route, route may get
updated between read and update operation. This is addressed by retrying
the operation multiple (3) times before returning failure back to the
caller.
Differential Revision: https://reviews.freebsd.org/D24666
"F lock" is a switch between two sets of scancodes for function keys F1-F12
found on some Logitech and Microsoft PS/2 keyboards [1]. When "F lock" is
pressed, then F1-F12 act as function keys and produce usual keyscans for
these keys. When "F lock" is depressed, F1-F12 produced the same keyscans
but prefixed with E0.
Some laptops use [2] E0-prefixed F1-F12 scancodes for non-standard keys.
[1] https://www.win.tue.nl/~aeb/linux/kbd/scancodes-6.html
[2] https://reviews.freebsd.org/D21565
MFC after: 2 weeks
While there, remove ifdef around cs_target check in cfiscsi_ioctl_list().
I am not sure why this ifdef was added, but without this check code will
crash below on NULL dereference.
Submitted by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24587
run it). Make sure that we do. Simplify the flow a bit, and fix a
comment since we do need to do these things.
Noticed by: cperciva (not sure why my invariants kernel didn't trigger)
They have more differencies than similarities. For now there is lots
of code that would check for M_EXT only and work correctly on M_EXTPG
buffers, so still carry M_EXT bit together with M_EXTPG. However,
prepare some code for explicit check for M_EXTPG.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598
o Shrink sglist(9) functions to work with multipage mbufs down from
four functions to two.
o Don't use 'struct mbuf_ext_pgs *' as argument, use struct mbuf.
o Rename to something matching _epg.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598
next commit brings in second flag, so let them already be in the
future namespace.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598
but we need buffer of MLEN bytes. This isn't just a simplification,
but important fixup, because previous commit shrinked sizeof(struct
mbuf) down below MSIZE, and instantiating an mbuf on stack no longer
provides enough data.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598
The following series of patches addresses three things:
Now that array of pages is embedded into mbuf, we no longer need
separate structure to pass around, so struct mbuf_ext_pgs is an
artifact of the first implementation. And struct mbuf_ext_pgs_data
is a crutch to accomodate the main idea r359919 with minimal churn.
Also, M_EXT of type EXT_PGS are just a synonym of M_NOMAP.
The namespace for the newfeature is somewhat inconsistent and
sometimes has a lengthy prefixes. In these patches we will
gradually bring the namespace to "m_epg" prefix for all mbuf
fields and most functions.
Step 1 of 4:
o Anonymize mbuf_ext_pgs_data, embed in m_ext
o Embed mbuf_ext_pgs
o Start documenting all this entanglement
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598
- Make ctl_add_lun() synchronous. Asynchronous addition was used by
Copan's proprietary code long ago and never for upstream FreeBSD.
- Move LUN enable/disable calls from backends to CTL core.
- Serialize LUN modification and partially removal to avoid double frees.
- Slightly unify backends code.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
This removes support for the following algorithms:
- ARC4
- Blowfish
- CAST128
- DES
- 3DES
- MD5-HMAC
- Skipjack
Since /dev/crypto no longer supports 3DES, stop testing the 3DES KAT
vectors in cryptotest.py.
Reviewed by: cem (previous version)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24346
The changes in r359374 added various sanity checks in sessions and
requests created by crypto consumers in part to permit backend drivers
to make assumptions instead of duplicating checks for various edge
cases. One of the new checks was to reject sessions which provide a
pointer to a key while claiming the key is zero bits long.
IPsec ESP tripped over this as it passes along whatever key is
provided for NULL, including a pointer to a zero-length key when an
empty string ("") is used with setkey(8). One option would be to
teach the IPsec key layer to not allocate keys of zero length, but I
went with a simpler fix of just not passing any keys down and always
using a key length of zero for NULL algorithms.
PR: 245832
Reported by: CI
Examples of depecrated algorithms in manual pages and sample configs
are updated where relevant. I removed the one example of combining
ESP and AH (vs using a cipher and auth in ESP) as RFC 8221 says this
combination is NOT RECOMMENDED.
Specifically, this removes support for the following ciphers:
- des-cbc
- 3des-cbc
- blowfish-cbc
- cast128-cbc
- des-deriv
- des-32iv
- camellia-cbc
This also removes support for the following authentication algorithms:
- hmac-md5
- keyed-md5
- keyed-sha1
- hmac-ripemd160
Reviewed by: cem, gnn (older verisons)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24342
The addition of the HSM SBI extension to OpenSBI introduces a new
breaking change: secondary harts will remain parked in the firmware,
until they are brought up explicitly via sbi_hsm_hart_start(). Add
the call to do this, sending the secondary harts to mpentry.
If the HSM extension is not present, secondary harts are assumed to be
released by the firmware, as is the case for OpenSBI =< v0.6 and BBL.
In the case that the HSM call fails we exclude the CPU, notify the
user, and allow the system to proceed with booting.
Reviewed by: markj (older version)
Differential Revision: https://reviews.freebsd.org/D24497
APs enter the kernel at the same point as the BSP, the _start routine.
They then jump to mpentry, but not before storing the kernel's physical
load address in the s9 register. Extract this calculation into its own
routine, so that APs can be instructed to enter directly from mpentry.
Differential Revision: https://reviews.freebsd.org/D24495
All callers are currently filtering bad nsid to this function,
however, we'll have undefined behavior if that's not true. Add the
KASSERT to prevent that.
Previously procctl(PROC_PROTMAX_STATUS, ... used the PROC_ASLR_NOFORCE
macro for the "system-wide configured policy" status, instead of
PROC_PROTMAX_NOFORCE.
They both have a value of 3, so no functional change.
Sponsored by: The FreeBSD Foundation
Factoring some of the code in nfsm_dissct() out into separate functions
allows these functions to be used elsewhere in the NFS mbuf handling code.
Other uses of these functions will be done in future commits.
It also makes it easier to add support for ext_pgs mbufs, which is needed
for nfs-over-tls under development in base/projects/nfs-over-tls.
Although the algorithm in nfsm_dissct() is somewhat re-written by this
patch, the semantics of nfsm_dissct() should not have changed.
- maxio should be dp->d_maxsize. This is often MAXPHYS, but not always
(especially if MAXPHYS is > 1MB).
- Unlock the periph before returning. We don't need to relock it to
release the ccb.
- Make sure we release the ccb in error paths.
Reviewed by: cperciva
With a length of 16, the name ("<if name>:TX(<qid>):callout") typically
gets truncated.
PR: 245712
Reported by: ghuckriede@blackberry.com
MFC after: 1 week
Running TCP Cubic together with ECN could end up reducing cwnd down to 1 byte, if the
receiver continously sets the ECE flag, resulting in very poor transmission speeds.
In line with RFC6582 App. B, a lower bound of 2 MSS is introduced, as well as a typecast
to prevent any potential integer overflows during intermediate calculation steps of the
adjusted cwnd.
Reported by: Cheng Cui
Reviewed by: tuexen (mentor)
Approved by: tuexen (mentor)
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D23353
With these two ioctls implemented in the nda driver, nvmecontrol now
works with nda just like it does with nvd. It eliminates the need to
jump through odd hoops to get this data.
Add the nvmeX device to the XPT_PATH_INQ nvme specific
information. while one could figure this out by looking up the
domain🚌slot:function, it's a lot easier to have the SIM set it
directly since the sim knows this.
which can cause a TCP client to use invalid or stale TCP sequence numbers for ACK packets.
Packets with old sequence numbers are ignored and not used to update the send window size.
This might cause the TCP session to hang indefinitely under some circumstances.
Reported by: Cui Cheng
Reviewed by: tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D24515
Continue routing subsystem conversion to nhop objects defined in r359823.
Use fields from nhop structure instead of "struct rtentry" fields.
This is one of the last changes prior to removing rt_ifp, rt_ifa,
rt_gateway and rt_mtu from struct rtentry.
Differential Revision: https://reviews.freebsd.org/D24609
by not including the SYN bit sequence space in cwnd related calculations.
Snd_und is adjusted explicitly in all cases, outside the cwnd update, instead.
This fixes an off-by-one conformance issue with regular TCP sessions not
using Appropriate Byte Counting (RFC3465), sending one more packet during
the initial window than expected.
PR: 235256
Reviewed by: tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D19000
With the upcoming multipath changes described in D24141,
rt->rt_nhop can potentially point to a nexthop group instead of
an individual nhop.
To simplify caller handling of such cases, change ifa_rtrequest() callback
to pass changed nhop directly.
Differential Revision: https://reviews.freebsd.org/D24604
Openfirmare enumerates and installs the driver for all processors,
regardless of whether they will be started later (because of power
constrains for example).
MFC after: 3 weeks
Don't set initial voltage for regulators having their voltage already
in allowed range. As side effect of this change, we don't try to set
initial voltage for fixed voltage regulators - these don't have impemented
voltage set method so their initialization has always failed.
MFC after: 3 weeks
- always initialize selector of voltage signaling standard.
Various versions of U-boot leaves voltage signaling standard settings
for PMUIO2 domain in different state. Always initialize it
into expected state.
- start the driver as early as possible, the IO domains should be
initialized before other drivers are attached.
- rename RK3399 register to its name founds in TRM.
This is the second part of fixes for serial port corruption observed after
DT 5.6 import.
Reviewed by: manu
MFC after: 1 week
Currently functionality resides in rtsock.c, which is a controlling
interface, partially external to the routing subsystem.
Additionally, DDB-supporting functionality is > 100SLOC, which deserves
a separate file.
Given that, move this functionality to a newly-created net/route/ subdir.
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D24561
Nexthop objects implementation, defined in r359823,
introduced sys/net/route directory intended to hold all
routing-related code. Move recently-introduced route_temporal.c and
private route_var.h header there.
Differential Revision: https://reviews.freebsd.org/D24597
One of the goals of the new routing KPI defined in r359823
is to entirely hide`struct rtentry` from the consumers.
It will allow to improve routing subsystem internals and deliver
features much faster.
This is one of the last changes, effectively moving struct rtentry
definition to a net/route_var.h header, internal to the routing subsystem.
Differential Revision: https://reviews.freebsd.org/D24580
Otherwise, since the CV is not signalled until data is drained from the
socket, it is trivial to create an unkillable process using
sendfile(SF_SYNC) and a process-private PF_LOCAL socket pair. In
particular, the cv_wait() in sendfile() does not get interrupted until
data is drained from the receiving socket buffer.
Reported by: pho
Discussed with: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
A concurrent unlocked lookup can wire the page after
vm_page_release_locked() releases the last wiring, in which case
vm_page_release_locked() must not free the page. Once the xbusy lock is
acquired, that, the object lock and the fact that the page is unmapped
ensure that the wire count cannot increase, so re-check for new wirings
after the page is xbusied.
Update the comment above vm_page_wired() to reflect the new
synchronization rules.
Reported by: glebius
Reviewed by: alc, jeff, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24592
New fib[46]_lookup() functions support multipath transparently.
Given that, switch the last rtalloc_mpath_fib() calls to
dib4_lookup() and eliminate the function itself.
Note: proper flowid generation (especially for the outbound traffic) is a
bigger topic and will be handled in a separate review.
This change leaves flowid generation intact.
Differential Revision: https://reviews.freebsd.org/D24595
r360292 switched most of the remaining routing customers to a new KPI,
leaving a bunch of wrappers for old routing lookup functions unused.
Remove them from the tree as a part of routing cleanup.
Differential Revision: https://reviews.freebsd.org/D24569
This debugging code printing routing table data was introduced in rS25723,
22+ years ago. The last functional commit to this code was rS67534, 19 years ago.
The code has been turned off by default all this time.
Lastly, this code directly iterates radix tree and rtentries, which is not
not a proper interaction with routing system.
Differential Revision: https://reviews.freebsd.org/D24554
The NFS code had a bunch of Mac OS/X accessor functions named uio_XXX
left over from the port to Mac OS/X. Since that port is long forgotten,
replace the calls with the code generated by the FreeBSD macros for these
in nfskpiport.h. This allows the macros to be deleted from nfskpiport.h
and I think makes the code more readable.
This patch should not result in any semantic change.
This largely reuses the TLS TOE support added in r330884. However,
this uses the KTLS framework in upstream OpenSSL rather than requiring
Chelsio-specific patches to OpenSSL. As with the existing TLS TOE
support, use of RX offload requires setting the tls_rx_ports sysctl.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24453
Without this patch, sosend_generic() will try to use top->m_pkthdr.len,
assuming that the first mbuf has a pkthdr.
When a list of ext_pgs mbufs is passed in, the first mbuf is not a
pkthdr and cannot be post-r359919. As such, the value of top->m_pkthdr.len
is bogus (0 for my testing).
This patch fixes sosend_generic() to handle this case, calculating the
total length via m_length() for this case.
There is currently nothing that hands a list of ext_pgs mbufs to
sosend_generic(), but the nfs-over-tls kernel RPC code in
projects/nfs-over-tls will do that and was used to test this patch.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24568
This was changed in the review process for the flags sysctl. The
reasons for the change are no longer valid as the code changed after
that. Cast the one place where it might make a difference (but I don't
think it does). This restores the ability to see flags for softc in
gdb.
- Add a new TCP_RXTLS_ENABLE socket option to set the encryption and
authentication algorithms and keys as well as the initial sequence
number.
- When reading from a socket using KTLS receive, applications must use
recvmsg(). Each successful call to recvmsg() will return a single
TLS record. A new TCP control message, TLS_GET_RECORD, will contain
the TLS record header of the decrypted record. The regular message
buffer passed to recvmsg() will receive the decrypted payload. This
is similar to the interface used by Linux's KTLS RX except that
Linux does not return the full TLS header in the control message.
- Add plumbing to the TOE KTLS interface to request either transmit
or receive KTLS sessions.
- When a socket is using receive KTLS, redirect reads from
soreceive_stream() into soreceive_generic().
- Note that this interface is currently only defined for TLS 1.1 and
1.2, though I believe we will be able to reuse the same interface
and structures for 1.3.
This patch is intended to solve a specific problem that iavf(4)
encounters, but what it does can be extended to solve other issues.
To summarize the iavf(4) issue, if the PF driver configures VLAN
anti-spoof, then the VF driver needs to make sure no untagged traffic is
sent if a VLAN is configured, and vice-versa. This can be an issue when
a VLAN is being registered or unregistered, e.g. when a packet may be on
the ring with a VLAN in it, but the VLANs are being unregistered. This
can cause that tagged packet to go out and cause an MDD event.
To fix this, include a new interface-dependent function that drivers can
implement named IFDI_NEEDS_RESTART(). Right now, this function is called
in iflib_vlan_unregister/register() to determine whether the interface
needs to be stopped and started when a VLAN is registered or
unregistered. The default return value of IFDI_NEEDS_RESTART() is true,
so this fixes the MDD problem that iavf(4) encounters, since the
interface rings are flushed during a stop/init.
A future change to iavf(4) will implement that function just in case the
default value changes, and to make it explicit that this interface reset
is required when a VLAN is added or removed.
Reviewed by: gallatin@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D22086
Now that hw.machine_arch handles soft-float vs hard-float there is no
longer a reason for this config.
Submitted by: mhorne (kern.mk hunk)
Reviewed by: imp (earlier version), kp
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24544
Instead, copy the strings into a temporary buffer on the stack and
run strcmp on the copies.
Reviewed by: brooks, kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24567
For userland, MACHINE_ARCH reflects the current ABI via preprocessor
directives. For the kernel, the hw.machine_arch sysctl uses the ELF
header flags of the current process to select the correct MACHINE_ARCH
value.
Reviewed by: imp, kp
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24543
This extends some of the changes in place to support reporting support
for 32-bit ABIs to permit reporting hard-float vs soft-float ABIs.
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24542
latest rack and bbr in from the NF repo. When those come
in the OOB data handling will be fixed where Skyzaller crashes.
Differential Revision: https://reviews.freebsd.org/D24575
Contrary to the kevent man page, EV_EOF on a fifo is not cleared by
EV_CLEAR. Modify the read and write filters to clear EV_EOF when the
fifo's PIPE_EOF flag is clear, and update the man page to document the
new behaviour.
Modify the write filter to return the amount of buffer space available
even if no readers are present. This matches the behaviour for sockets.
When reading from a pipe, only call pipeselwakeup() if some data was
actually read. This prevents the continuous re-triggering of a
EVFILT_READ event on EOF when in edge-triggered mode.
PR: 203366, 224615
Submitted by: Jan Kokemüller <jan.kokemueller@gmail.com>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24528
This ensures that pipe_poll() and the pipe kqueue filters observe
PIPE_EOF and set EV_EOF accordingly. As a result an extra call to
knote() after setting PIPE_EOF is unnecessary.
Submitted by: Jan Kokemüller <jan.kokemueller@gmail.com>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24528
Previously we allocated a separate VM object for each kernel stack.
However, fully constructed kernel stacks are cached by UMA, so there is
no harm in using a single global object for all stacks. This reduces
memory consumption and makes it easier to define a memory allocation
policy for kernel stack pages, with the aim of reducing physical memory
fragmentation.
Add a global kstack_object, and use the stack KVA address to index into
the object like we do with kernel_object.
Reviewed by: kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24473
Some models of laptops e.g. "X1 Carbon 3rd Gen Thinkpad" have LRM buttons
wired as so called "Synaptic touchpads extended buttons" rather thah real
trackpoint buttons. Handle this case with merging of events from both
sources.
PR: 245877
Reported by: Raichoo <raichoo@googlemail.com>
MFC after: 1 week
Introduce new fib[46]_lookup_debugnet() functions serving as a
special interface for the crash-time operations. Underlying
implementation will try to return lookup result if
datastructures are not corrupted, avoding locking.
Convert debugnet to use fib4_lookup_debugnet() and switch it
to use nexthops instead of rtentries.
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D24555
It was broken by r360292 as fib6_lookup() assumes de-embedded addresses
while rtalloc_mpath_fib() requires sockaddr with embedded ones.
New fib6_lookup() transparently supports multipath, hence
remove old RADIX_MPATH condition.
The pf_frag_mtx mutex protects the fragments queue. The fragments queue
is virtualised already (i.e. per-vnet) so it makes no sense to block
jail A from accessing its fragments queue while jail B is accessing its
own fragments queue.
Virtualise the lock for improved concurrency.
Differential Revision: https://reviews.freebsd.org/D24504
Run the bridge datapath under epoch, rather than under the
BRIDGE_LOCK().
We still take the BRIDGE_LOCK() whenever we insert or delete items in
the relevant lists, but we use epoch callbacks to free items so that
it's safe to iterate the lists without the BRIDGE_LOCK.
Tests on mercat5/6 shows this increases bridge throughput significantly,
from 3.7Mpps to 18.6Mpps.
Reviewed by: emaste, philip, melifaro
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24250
If we pass an anchor name which doesn't exist pfr_table_count() returns
-1, which leads to an overflow in mallocarray() and thus a panic.
Explicitly check that pfr_table_count() does not return an error.
Reported-by: syzbot+bd09d55d897d63d5f4f4@syzkaller.appspotmail.com
Reviewed by: melifaro
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24539
r360292 introduced the wrong order, resulting in returned
nhops not being referenced, despite the fact that references
were requested. That lead to random GPF after using SCTP sockets.
Special defined macro like IPV[46]_SCOPE_GLOBAL will be introduced
soon to reduce the chance of putting arguments in wrong order.
Reported-by: syzbot+5c813c01096363174684@syzkaller.appspotmail.com
Release kernels have no KDB backends enabled, so they discard an NMI
if it is not due to a hardware failure. This includes NMIs from
IPMI BMCs and hypervisors.
Furthermore, the interaction of panic_on_nmi, kdb_on_nmi, and
debugger_on_panic is confusing.
Respond to all NMIs according to panic_on_nmi and debugger_on_panic.
Remove kdb_on_nmi. Expand the meaning of panic_on_nmi by making
it a bitfield. There are currently two bits: one for NMIs due to
hardware failure, and one for all others. Leave room for more.
If panic_on_nmi and debugger_on_panic are both true, don't actually panic,
but directly enter the debugger, to allow someone to leave the debugger
and [hopefully] resume normal execution.
Reviewed by: kib
MFC after: 2 weeks
Relnotes: yes: machdep.kdb_on_nmi is gone; machdep.panic_on_nmi changed
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D24558
Store the attached regulator in a tailq to later find them in ofw_map.
While here, do not attempt to attach a regulator without a name, a node
might exists but if it doesn't have a name the regulator is unused.
MFC after: 1 month
If pin is switched from fixed function to GPIO, it should have prepared
direction, pull-up/down and default value before function gets switched.
Otherwise we may produce unwanted glitch on output pin.
Right order of drive strength settings is questionable, but I think that
is slightly safer to do it also before function switch.
This fixes serial port corruption observed after DT 5.6 import.
MFC after: 1 week
This change is build on top of nexthop objects introduced in r359823.
Nexthops are separate datastructures, containing all necessary information
to perform packet forwarding such as gateway interface and mtu. Nexthops
are shared among the routes, providing more pre-computed cache-efficient
data while requiring less memory. Splitting the LPM code and the attached
data solves multiple long-standing problems in the routing layer,
drastically reduces the coupling with outher parts of the stack and allows
to transparently introduce faster lookup algorithms.
Route caching was (re)introduced to minimise (slow) routing lookups, allowing
for notably better performance for large TCP senders. Caching works by
acquiring rtentry reference, which is protected by per-rtentry mutex.
If the routing table is changed (checked by comparing the rtable generation id)
or link goes down, cache record gets withdrawn.
Nexthops have the same reference counting interface, backed by refcount(9).
This change merely replaces rtentry with the actual forwarding nextop as a
cached object, which is mostly mechanical. Other moving parts like cache
cleanup on rtable change remains the same.
Differential Revision: https://reviews.freebsd.org/D24340
The macros CAST_USER_ADDR_T() and CAST_DOWN() were used for the Mac OS/X
port. The first of these macros was a no-op for FreeBSD and the second
is no longer used.
This patch gets rid of them. It also deletes the "mbuf_t" typedef which
is no longer used in the FreeBSD code from nfskpiport.h
This patch should not change semantics.
IEEE80211_MESH_RTCMD_ADD was invoking memcmp() to validate the
supplied address directly on the user pointer rather than first doing
a copyin() and validating the copied value.
IEEE80211_MESH_RTCMD_DELETE was passing the user pointer directly to
ieee80211_mesh_rt_del() rather than copying the user buffer into a
temporary kernel buffer.
Reviewed by: brooks, kib
Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24562
pmap_emulate_modify() was assuming that no changes to the pmap could
take place between the TLB signaling the fault and
pmap_emulate_modify()'s acquisition of the pmap lock, but that's clearly
not even true in the uniprocessor case, nevermind the SMP case.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24523
MipsDoTLBMiss() will load a segmap entry or pde, check that it isn't
zero, and then chase that pointer to a physical page. If that page has
been freed in the interim, it will read garbage and go on to populate
the TLB with it.
This can happen because pmap_unwire_ptp zeros out the pde and
vm_page_free_zero()s the ptp (or, recursively, zeros out the segmap
entry and vm_page_free_zero()s the pdp) without interlocking against
MipsDoTLBMiss(). The pmap is locked, and pvh_global_lock may or may not
be held, but this is not enough. Solve this issue by inserting TLB
shootdowns within _pmap_unwire_ptp(); as MipsDoTLBMiss() runs with IRQs
deferred, the IPIs involved in TLB shootdown are sufficient to ensure
that MipsDoTLBMiss() sees either a zero segmap entry / pde or a non-zero
entry and the pointed-to page still not freed.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24491
For such mappings we need to dump 512 page table pages, not one, and
they need to be included in the pmap size recorded in the minidump
header.
MFC after: 2 weeks
Sponsored by: Juniper Networks, Klara Inc.
The comment referenced a non-existent function, and these minidump
implementations already buffer discontiguous physical data pages by
mapping them into a single VA range that gets passed to the dump device,
so there is no real advantage in batching calls to blk_write().
The RISC-V and MIPS minidump implementations still write a page at a
time and so would benefit from some form of batching.
MFC after: 2 weeks
Sponsored by: Juniper Networks, Klara Inc.
Kernel prints the device announcement before the attach method is
called, so if the correct description is not set by the probe method,
then the announcement would have an incorrect one.
MFC after: 1 week
Use DRIVER_MODULE_ORDERED(SI_ORDER_ANY) so that ig4's ACPI attachment
happens after iicbus and acpi_iicbus drivers are registered.
I have seen a problem where iicbus attached under ig4 instead of
acpi_iicbus when ig4.ko was loaded with kldload. I believe that that
happened because ig4 driver was a first driver to register, it attached
and created an iicbus child. Then iicbus driver was registered and,
since it was the only driver that could attach to the iicbus child
device, it did exactly that. After that acpi_iicbus driver was
registered. It would be able to attach to the iicbus device, but it was
already attached, so nothing happened.
MFC after: 2 weeks
The TSADC familiy is a little bit more complex than V2 and V3.
Early revision do not use syscon and do not use qsel (RK3288).
Next revision still do not use syscon but uses qsel (RK3328).
Final revision use both.
Submitted by: peterj
MFC after: 1 month
In r360131, acpi_ec probe was changed to not clobber an error status prior to
several error cases that did not explicitly set the error variable before
goto'ing the exit path. However, I did not notice that the error variable was
not set to success in the success path. That caused all successful probes to
fail, which is obviously undesirable.
PR: 245778
Reported by: Neel Chauhan <neel AT neelc.org>, Evilham <contact AT evilham.com>
Tested by: Evilham
X-MFC-With: r360131
One of the goals of the new routing KPI defined in r359823 is to entirely
hide`struct rtentry` from the consumers. It will allow to improve routing
subsystem internals and deliver more features much faster.
This commit is mostly mechanical change to eliminate direct struct rtentry
field accesses.
The only notable difference is AF_LINK gateway encoding.
AF_LINK gw is used in routing stack for operations with interface routes
and host loopback routes.
In the former case it indicates _some_ non-NULL gateway, as the interface
is the same as in rt_ifp in kernel and rtm_ifindex in rtsock reporting.
In the latter case the interface index inside gateway was used by the IPv6
datapath to verify address scope for link-local interfaces.
Kernel uses struct sockaddr_dl for this type of gateway. This structure
allows for specifying rich interface data, such as mac address and interface
name. However, this results in relatively large structure size - 52 bytes.
Routing stack fils in only 2 fields - sdl_index and sdl_type, which reside
in the first 8 bytes of the structure.
In the new KPI, struct nhop_object tries to be cache-efficient, hence
embodies gateway address inside the structure. In the AF_LINK case it
stores stortened version of the structure - struct sockaddr_dl_short,
which occupies 16 bytes. After D24340 changes, the data inside AF_LINK
gateway will not be used in the kernel at all, leaving rtsock as the only
potential concern.
The difference in rtsock reporting:
(old)
got message of size 240 on Thu Apr 16 03:12:13 2020
RTM_ADD: Add Route: len 240, pid: 0, seq 0, errno 0, flags:<UP,DONE,PINNED>
locks: inits:
sockaddrs: <DST,GATEWAY,NETMASK>
10.0.0.0 link#5 255.255.255.0
(new)
got message of size 200 on Sun Apr 19 09:46:32 2020
RTM_ADD: Add Route: len 200, pid: 0, seq 0, errno 0, flags:<UP,DONE,PINNED>
locks: inits:
sockaddrs: <DST,GATEWAY,NETMASK>
10.0.0.0 link#5 255.255.255.0
Note 40 bytes different (52-16 + alignment).
However, gateway is still a valid AF_LINK gateway with proper data filled in.
It is worth noting that these particular messages (interface routes) are mostly
ignored by routing daemons:
* bird/quagga/frr uses RTM_NEWADDR and ignores prefix route addition messages.
* quagga/frr ignores routes without gateway
More detailed overview on how rtsock messages are used by the
routing daemons to reconstruct the kernel view, can be found in D22974.
Differential Revision: https://reviews.freebsd.org/D24519
as the dma_device during RDMA registration.
cxgbe's struct device cannot be used as-is because it's a native FreeBSD
driver and ibcore is LinuxKPI based.
MFC after: 1 week
MFC after: r360196
Thanks to Natalie Silvanovich from Google for finding and reporting the
issue found by her in the SCTP userland stack.
MFC after: 3 days
X-MFC with: https://svnweb.freebsd.org/changeset/base/360193
RFC5661 specifies that a client's recovery upon receipt of NFSERR_BADSESSION
should first consist of a CreateSession operation using the extant ClientID.
If that fails, then a full recovery beginning with the ExchangeID operation
is to be done.
Without this patch, the FreeBSD client did not attempt the CreateSession
operation with the extant ClientID and went directly to a full recovery
beginning with ExchangeID. I have had this patch several years, but since
no extant NFSv4.n server required the CreateSession with extant ClientID,
I have never committed it.
I an committing it now, since I suspect some future NFSv4.n server will
require this and it should not negatively impact recovery for extant NFSv4.n
servers, since they should all return NFSERR_STATECLIENTID for this first
CreateSession.
The patched client has been tested for recovery against both the FreeBSD
and Linux NFSv4.n servers and no problems have been observed.
MFC after: 1 month
RFC 8221 does not outright ban 3des as the algorithms deprecated for
13 in r348205, but it is listed as a SHOULD NOT and will likely be a
MUST NOT by the time 13 ships.
Discussed with: bjk
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24341
Add driver for Broadcom "GENET" version 5, as found in BCM-2711 on
Raspberry Pi 4B. The driver is derived in part from the bcmgenet.c
driver in NetBSD, along with bcmgenetreg.h.
Reviewed by: manu
Obtained from: in part from NetBSD
Relnotes: yes, note addition
Differential Revision: https://reviews.freebsd.org/D24436
Allocate a temporary buffer in the kernel to serve as the CCB data
pointer for a pass-through transaction and use copyin/copyout to
shuffle the data to/from the user buffer.
Reviewed by: scottl, brooks
Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24489
The handle_string callback for the ENCIOC_SETSTRING ioctl was passing
a user pointer to memcpy(). Fix by using copyin() instead.
For ENCIOC_GETSTRING ioctls, the handler was storing the user pointer
in a CCB's data_ptr field where it was indirected by other code. Fix
this by allocating a temporary buffer (which ENCIOC_SETSTRING already
did) and copying the result out to the user buffer after the CCB has
been processed.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24487
These two sysctls were added to support UFS softupdates journalling
with snapshots. However, the changes to fsck to use them were never
committed and there have never been any in-tree uses of these sysctls.
More details from Kirk:
When journalling got added to soft updates, its journal rollback freed
blocks that it thought were no longer in use. But it does not take
snapshots into account (i.e., if a snapshot is still using it, then it
cannot be freed). So I added the needed logic to fsck by having the
free go through the kernel's blkfree code so it could grab blocks that
were still needed by snapshots. That is done using the setbufoutput
hack. I never got that code working reliably, so it is still sitting
in my work directory. Which also explains why you still cannot take
snapshots on filesystems running with journalling...
In looking over my use of this feature, and in particular the troubles
I was having with it, I conclude that it may be better to extract the
code from the kernel that handles freeing blocks claimed by snapshots
and putting it into fsck directly. My original intent was that it is
complex and at the time changing, so only having to maintain it in one
place was appealing. But at this point it has not changed in years and
the hacks like setinode and setbufoutput to be able to use the kernel
code is sufficiently ugly, that I am leaning towards just extracting
it.
Reviewed by: mckusick
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24484
If DTRACE is enabled at compile time, all kernel breakpoint traps are
first given to dtrace to see if they are triggered by a FBT probe.
Previously if dtrace didn't recognize the trap, it was silently
ignored breaking the handling of other kernel breakpoint traps such as
the debug.kdb.enter sysctl. This only returns early from the trap
handler if dtrace recognizes the trap and handles it.
Submitted by: Nicolò Mazzucato <nicomazz97@gmail.com>
Reviewed by: markj
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D24478
blockcount_wait() still unconditionally waits for the count to reach
zero before returning.
Tested by: pho (a larger patch)
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24513
The current situation results in intermittent breakage if data gets split up
with the sign bit set on the data1 half of it, as PAIR32TO64 will then:
data1 | (data2 << 32) -> resulting in data1 getting sign-extended when it's
implicitly widened and clobbering the result. AFAICT, there's no compelling
reason for these to be signed.
This was most exposed by flakiness in the kqueue timer tests under compat32
after the ABSTIME test got switched over to using a better clock and
microseconds.
Reviewed by: kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24518
kmem_alloc_attr_domain() and kmem_alloc_contig_domain() duplicated each
other's page allocation and reclamation logic. Place it in a single
function to make it easier to add additional consumers. No functional
change intended.
Reviewed by: jeff, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24475
This simplifies some planned changes. No functional change intended.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24474
in all cases, by adjust snd_una right after the
connection initialization, to include the one byte
in sequence space occupied by the SYN bit.
This does not change the regular ACK processing,
while making the BYTES_THIS_ACK macro to work properly.
PR: 235256
Reviewed by: tuexen (mentor), rgrimes (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D19000
This unbreaks the i386 kqueue timer tests after a recent change switched
NOTE_ABSTIME over to using microseconds. Notably, the data argument (which
holds useconds) is an int64_t, but we were passing it to timer2sbintime
which takes an intptr_t. Perhaps in a previous incarnation, intptr_t would
have made sense, but now it just leads to the timestamp getting truncated
and subsequently rejected when it no longer fits in an intptr_t.
PR: 245768
Reported by: lwhsu / CI
MFC after: 1 week
Add some prose and a diagram describing the layout of the cipher IV
for AES-CTR and AES-GCM and how it relates to the ESP IV stored in the
packet after the ESP header. Also, remove an XXX comment about the
initial block counter value used for AES-CTR in esp_output as the
current code matches the RFC (and the equivalent code in esp_input
didn't have the XXX comment).
Discussed with: cem
The sole in-tree user of this flag has been retired, so remove this
complexity from all drivers. While here, add a helper routine drivers
can use to read the current request's IV into a local buffer. Use
this routine to replace duplicated code in nearly all drivers.
Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24450
This is the only place that uses CRYPTO_F_IV_GENERATE. All crypto
drivers currently duplicate the same boilerplate code to handle this
case. Doing the generation directly removes complexity from drivers.
It also simplifies support for separate input and output buffers.
Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24449
In r360126, I meant to have a different mask only on powerpc, not powerpc64.
Update the check to check that we're not compiling for powerpc64.
Reported by: jhibbits
Approved by: wulf (implicit)
MFC after: 2 weeks
X-MFC-Note: 12 only
X-MFC-With: r360126
Differential Revision: D24370 (followup)
All of the 'goto out;' cases in this probe routine without explicit
initialization of 'ret' indicate error cases and were clearly intended
to use the initial definition of 'ret' with ENXIO. However, 'ret' was
accidentally squashed by reuse for a subroutine call near the beginning
of probe.
Use a different variable for the subroutine status to preserve ENXIO ret
for the 'goto out's as a minimal solution to the panic reported at attach
for now.
PR: 245757
Change kern.evdev.rcpt_mask from 3 to 12 by default. This makes us much
more evdev-friendly, and will prevent everyone using xorg and wayland with
evdev devices (the default) from needing to change this locally.
powerpc32 still uses the old value for the keyboard part, becaues the adb
keyboard driver used there is not evdev compatible.
Reviewed by: wulf
Approved by: wulf
MFC after: 2 weeks
X-MFC-Note: 12 only
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D24370
vm_page_acquire_unlocked() relies on type-stability of vm_page
structures and assumes that the listq linkage pointers always point to a
vm_page or are NULL. QUEUE_MACRO_DEBUG_TRASH breaks that assumption, so
add an explicit check for a trashed queue pointer before dereferencing.
Reported and tested by: pho
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24472
Refer to bluetooth core v5.2 specifications Vol4. Part E. 7.8.27.
PR: 245763
Submitted by: Marc Veldman <marc@bumblingdork.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
As a short term solution for the problem reported by Shawn Webb re: r359950,
bump the maximum number of memmaps per VM. This structure is 40 bytes, and the
additional four (fixed array embedded in the struct vm) members increase the
size of struct vm by 3%.
(The vast majority of struct vm is the embedded struct vcpu array, which
accounts for 84% of the size -- over 4 kB.)
Reported by: Shawn Webb <shawn.webb AT hardenedbsd.org>
Reviewed by: grehan
X-MFC-With: r359950
Differential Revision: https://reviews.freebsd.org/D24507
elements in the middle.
This fixes a panic when detaching USB mouse.
PR: 245732
Reviewed by: wulf
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24500
Both DIOCCHANGEADDR and DIOCADDADDR take a struct pf_pooladdr from
userspace. They failed to validate the dyn pointer contained in its
struct pf_addr_wrap member structure.
This triggered assertion failures under fuzz testing in
pfi_dynaddr_setup(). Happily the dyn variable was overruled there, but
we should verify that it's set to NULL anyway.
Reported-by: syzbot+93e93150bc29f9b4b85f@syzkaller.appspotmail.com
Reviewed by: emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24431
pmap_bootstrap() expects the kernel's physical load address, but we have
been providing the start of physical memory. This had the nice effect of
protecting the memory used by the SBI runtime firmware, but now that we
have alternate means of achieving that, we should provide the correct
value. This will free up any memory between the SBI firmware and the
kernel for allocation.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D24156
The device tree may contain a "reserved-memory" node, whose purpose is
to communicate sections of physical memory that should not be used for
general allocations. Add the logic to parse and exclude these regions.
The particular motivation for this is protection of the SBI runtime
firmware. Currently, there is no mechanism through which the SBI
can communicate the details of its reserved memory region(s) to
a supervisor payload. There has been some discussion recently on how
this can be achieved [1], and it seems that the path going forward
will be to add an entry to the reserved-memory node.
This hasn't caused any issues for us yet, since we exclude all physical
memory below the kernel's load address from being allocated, and on all
currently supported platforms this covers the SBI firmware region. This
will change in another commit, so as a safety measure, ensure that the
lowest 2MB of memory is excluded if this region has not been reported.
[1] https://github.com/riscv/riscv-sbi-doc/pull/37
Reviewed by: markj, nick (older version)
Differential Revision: https://reviews.freebsd.org/D24155
Replace our hand-rolled functions with the generic ones provided by
kern/subr_physmem.c. This greatly simplifies the initialization of
physical memory regions and kernel globals.
Tested by: nick
Differential Revision: https://reviews.freebsd.org/D24154
The arm_physmem interface found in arm's MD code provides a convenient
set of routines for adding/excluding physical memory regions and
initializing important kernel globals such as Maxmem, realmem,
phys_avail[], and dump_avail[]. It is especially convenient for FDT
systems, since we can use FDT parsing functions and pass the result
directly to one of these physmem routines. This interface is already in
use on arm and arm64, and can be used to simplify this early
initialization on RISC-V as well.
This requires only a couple trivial changes:
- Move arm_physmem_kernel_addr to arm/machdep.c. It is unused on arm64,
and manipulated entirely in arm MD code.
- Convert arm32_btop/arm64_btop to atop. This is equivalently defined
on all architectures.
- Drop the "arm" prefix.
Reviewed by: manu, emaste ("looks reasonable")
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24153
don't implement link power management, LPM.
This fixes error code XHCI_TRB_ERROR_BANDWIDTH for isochronous USB 3.0
transactions.
Submitted by: Horse Ma <Shichun.Ma@dell.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Unconditionally use ether_gen_addr() to generate bridge mac addresses. This
function is now less likely to generate duplicate mac addresses across jails.
The old hand rolled hostid based code adds no value.
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D24432
If we create two (vnet) jails and create a bridge interface in each we end up
with the same mac address on both bridge interfaces.
These very often conflicts, resulting in same mac address in both jails.
Mitigate this problem by including the jail name in the mac address.
Reviewed by: kevans, melifaro
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24383
It must contain fully restored contigous run of the wired pages from
the object, except possible trimmed tail.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
It is true for zfs, but it is not for e.g. vnode or buffer pagers.
When fixing bogus pages, fix them in both places. Rely on the fact
that pa[0] must have been invalid so it cannot be bogus.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
LLVM-9.0 clang++ throws an error for enum defined within
an anonymous struct.
Reviewed by: jtl, rpokala
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org//D24477
The typedef mbuf_t was used for the Mac OS/X port of the code long ago.
Since this port is no longer used and the use of mbuf_t obscures what
the code does (and is not consistent with style(9)), it is no longer needed.
This patch replaces all instances of mbuf_t with "struct mbuf *", so that
it is no longer used.
This patch should not result in any semantic change.
Silence the "DWARF2 can only represent one section per compilation unit"
warning in amd64 GENERIC builds by disabling Clang's debuginfo generation for
this assembler file (-g0). The message is replaced by a warning from
ctfconvert that there is no debuginfo to convert (future work).
The file contains some metadata (several ELF notes) and some code. The code
does not appear to have anything that debuginfo would aid.
I looked at the generated debuginfo (readelf -w xen-locore.o) prior to this
change, and the metadata that would be disabled are things like associated
between binary offset and code line number (not especially useful with a
disassembler), and label metadata for the entry points (not especially useful
as this is already in the symbol table).
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D24384
snd_hda was rewritten in r230130; one function retained a comment
referencing the previous name.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
A later change, currently being iterated on in D24459, will in-fact change
the lock type to an sx so that TTY drivers can sleep on it if they need to.
Committing this ahead of time to make the review in question a little more
palatable.
tty_lock_assert() is unfortunately still needed for now in two places to
make sure that the tty lock has not been recursed upon, for those scenarios
where it's supplied by the TTY driver and possibly a mutex that is allowed
to recurse.
Suggested by: markj
The use of "int" here caused the compiler to believe that it needs to
insert a "sll $n, $n, 0" to sign extend as part of the implicit cast
to uint64_t.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: brooks, arichardson
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24457
The handle_string callback for the ENCIOC_GET_ENCNAME and
ENCIOC_GETENCID ioctls tries to copy the size of the generated string
out to userland. However, the callback only has access to the kernel
copy of the structure populated by copyin(). The copyout() call
simply overwrites the value in the kernel's copy preventing the
subsequent overflow prevention logic from working.
Fix this by instead doing a copyout() of the updated length in the
caller after the callback returns.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24456
Generic if_output() callback signature was modified to use struct route
instead of struct rtentry in r191148, back in 2009.
Quoting commit message:
Change if_output to take a struct route as its fourth argument in order
to allow passing a cached struct llentry * down to L2
Fix bridge_output() to match this signature and update the remaining
comment in if_var.h.
Reviewed by: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24394
It is not logically dependent on "device mem", and an arm kernel
compiled without that device fails to link since the minidumpsys()
symbol is referenced by kern_dump.c.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
The static assertions were added (with size and offsets from gdb) and verified
with a build prior to marking the holes explicitly.
This is in preparation for a subsequent revision, pending in phabricator, that
makes use of some of these unused bits without impacting the ABI.
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D24461
this type, but since RPC depends on XDR (not vice versa) we need
it defined in XDR to make the module loadable without RPC.
Reviewed by: rmacklem
Differential Revision: https://reviews.freebsd.org/D24408
Ryan Moeller reported crashes in the NFS server that appear to be
caused by stack corruption in nfsrv_compound(). It appears that
the stack got corrupted just after a NFSv4.1 Lookup that crosses
a server mount point.
Although it is just a "theory" at this point, the most obvious way
the stack could get corrupted would be if nfsvno_checkexp() somehow
acquires an export with a bogus nes_numsecflavor value. This would
cause the copying of the secflavors to run off the end of the array,
which is allocated on the stack below where the corruption occurs.
This sanity check is simple to do and would stop the stack corruption
if the theory is correct. Otherwise, doing the sanity check seems to
be a reasonable safety belt to add to the code.
Reported by: freqlabs
MFC after: 2 weeks
cdir may have simply failed to resolve (e.g. fget_cap failure in namei
leading to NULL dp passed to AUDIT_ARG_UPATH*_VP); restore the pre-rS358191
behavior of setting cpath[0] = '\0' and bailing out instead of panicking.
This was found by inadvertently running the libc/c063 tests with auditing
enabled, resulting in a panic.
Reviewed by: mjg (committed version actually his)
Differential Revision: https://reviews.freebsd.org/D24445
On my Dell Latitude 7390 laptop, the brightness hotkeys
(Fn+<up/down arrow>) send ACPI notifications which acpi_video
handles by adjusting its brightness setting; but ACPI does not
actually do anything with the backlight.
Announcing brightness changes via devd makes it possible to close
the loop by triggering the intel_backlight utility to perform the
required backlight adjustment.
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24424
Use AUXARGS_ENTRY_PTR to export these pointers. This is a followup to
r359987 and r359988.
Reviewed by: jhb
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24446
mb_reclaim() calls the protocol drain routines for each protocol in each
domain. Some protocols exist in more than one domain and share drain
routines. In the case of SCTP, it also uses the same drain routine for
its SOCK_SEQPACKET and SOCK_STREAM entries in the same domain.
On systems with INET, INET6, and SCTP all defined, mb_reclaim() calls
sctp_drain() four times. On systems with INET and INET6 defined,
mb_reclaim() calls tcp_drain() twice. mb_reclaim() is the only in-tree
caller of the pr_drain protocol entry.
Eliminate this duplication by ensuring that each pr_drain routine is only
specified for one protocol entry in one domain.
Reviewed by: tuexen
MFC after: 2 weeks
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D24418
Name each pcib driver uniquely.
This remove the warning printed at each arm boot :
module_register: cannot register simplebus/pcib from kernel; already loaded from kernel
One of the goals of the new routing KPI defined in r359823 is to
entirely hide`struct rtentry` from the consumers. It will allow to
improve routing subsystem internals and deliver more features much faster.
This change is one of the ongoing changes to eliminate direct
struct rtentry field accesses.
Additionally, with the followup multipath changes, single rtentry can point
to multiple nexthops.
With that in mind, convert rti_filter callback used when traversing the
routing table to accept pair (rt, nhop) instead of nexthop.
Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D24440
Name each ehci driver uniquely.
This remove the warning printed at each arm boot :
module_register: cannot register simplebus/ehci from kernel; already loaded from kernel
A similar fix was done in r333074 but imx_ehci wasn't renamed and generic_ehci wasn't
present at that time.
Permit instruction decoding logic to be compiled outside of the kernel for
rapid iteration and validation.
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D24439
While here, add a makefile in sys/modules/allwinner so it is built.
Also add the PNP info so devmatch will load this module automatically.
MFC after: 1 month
prison0's hostuuid will get set by the hostid rc script, either after
generating it and saving it to /etc/hostid or by simply reading /etc/hostid.
Some things (e.g. arbitrary MAC address generation) may use the hostuuid as
a factor in early boot, so providing a way to read /etc/hostid (if it's
available) and using it before userland starts up is desirable. The code is
written such that the preload doesn't *have* to be /etc/hostid, thus not
assuming that there will be newline at the end of the buffer or even the
exact shape of the newline. White trailing whitespace/non-printables
trimmed, the result will be validated as a valid uuid before it's used for
early boot purposes.
The preload can be turned off with hostuuid_load="NO" in /boot/loader.conf,
just as other preloads; it's worth noting that this is a 37-byte file, the
overhead is believed to be generally minimal.
It doesn't seem necessary at this time to be concerned with kern.hostid.
One does wonder if we should consider validating hostuuids coming in
via jail_set(2); some bits seem to care about uuid form and we bother
validating format of smbios-provided uuid and in-fact whatever uuid comes
from /etc/hostid.
Reviewed by: karels, delphij, jamie
MFC after: 1 week (don't preload by default, probably)
Differential Revision: https://reviews.freebsd.org/D24288
Stop attempting to use FADT legacy hardware flag, it is almost always
a lie.
Instead, unless user explicitly disabled the calibration, calibrate
against 8254 ISA clock. Then, obtain the rough value of the expected
TSC frequency from CPUID leafs 0x15/0x16 or even from the CPU
marketing name string. If calibration results look unbelievably bogus
comparing with CPUID leafs report, use CPUID one.
Intel does not recommend to use CPUID leaf 0x16 for the value of the
system time frequency, indeed the error there might be up to 1% which
e.g. makes ntpd give up. If ISA clock is present, we win, if not, we
get some frequency that allows the machine to boot without enormous
delay.
Next improvement would be to use HPET for re-calibration if we decided
that ISA clock gives bogus results, after HPETs are enumerated. This
is a much bigger change since we probably would need to re-evaluate
some constants depending on TSC frequency.
Reviewed by: emaste, jhb, scottl
Tested by: scottl
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D24426
I missed the "atomic" field of the RemoveExtendedAttribute operation's
reply when I implemented it. It worked between FreeBSD client and server,
since it was missed for both, but it did not conform to RFC 8276.
This patch adds the field for both client and server.
Thanks go to Frank for doing interoperability testing of the extended
attribute support against patches for Linux.
Submitted by: Frank van der Linden <fllinden@amazon.com>
Reported by: Frank van der Linden <fllinden@amazon.com>
Default moea64_bpvo_pool_size 327680 was insufficient for initial
memory mapping at boot time on systems with, for example, 64G and
no huge pages enabled.
Submitted by: Andre Silva <afscoelho@gmail.com>
Reviewed by: jhibbits, alfredo
Approved by: jhibbits (mentor)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D24102