Commit Graph

141351 Commits

Author SHA1 Message Date
Oscar Zhao
49b76286b6 support thread migration and refactor kqdom locks 2021-05-21 14:38:08 -04:00
Oscar Zhao
0aa6e4d988 add last_nkev 2020-05-20 04:25:02 -04:00
Oscar Zhao
2e383660dd add kn_proc_count to best of two 2020-05-20 04:11:34 -04:00
Oscar Zhao
3ce2173be6 add cap to workstealing controlled by sysctl and its stat dumps 2020-05-09 20:52:39 -04:00
Oscar Zhao
96330609b7 1. get rid of the usage of next_kn in kqueue_scan. The latter is hell to maintain and lead to countless bugs.
2. slightly reduce lock contention in the knote_drop path.
2020-05-06 03:42:29 -04:00
Oscar Zhao
61067bc214 release processing KN only when kq_scan has >0 max events. Because calling kevent with =0 max events doesn't mean all events are processed. This results in a subtle race of pre-releasing unprocessed knotes. 2020-05-05 04:14:30 -04:00
Charlie Root
d535232e92 fix workstealing > 1 and possible divide by 0 2020-05-05 01:22:57 -04:00
Oscar Zhao
29df71b972 merge with head 2020-04-30 02:15:26 -04:00
imp
b8ce7e8464 Make sure that we get the sbuf resources we need.
Since we're calling sbuf_new with NOWAIT, make sure it can allocate a
buffer to use. Don't print anything if we can't get it.

Noticed by: rpokala
2020-04-30 00:43:11 +00:00
imp
1c63fddbd7 Implement the NVME_GET_NSID and NVME_PASSTHROUGH_CMD ioctls
With these two ioctls implemented in the nda driver, nvmecontrol now
works with nda just like it does with nvd. It eliminates the need to
jump through odd hoops to get this data.
2020-04-30 00:43:07 +00:00
imp
18a1b03238 Return the nvmeX device associated with the ndaX device.
Add the nvmeX device to the XPT_PATH_INQ nvme specific
information. while one could figure this out by looking up the
domain🚌slot:function, it's a lot easier to have the SIM set it
directly since the sim knows this.
2020-04-30 00:43:02 +00:00
imp
07ca42c0db Generate a devctl event for interesting events
When we reset the controller, and when the controller tells us about a
critical warning, send an event.
2020-04-30 00:27:19 +00:00
rscheff
9ff3e71cd0 Prevent premature shrinking of the scaled receive window
which can cause a TCP client to use invalid or stale TCP sequence numbers for ACK packets.

Packets with old sequence numbers are ignored and not used to update the send window size.
This might cause the TCP session to hang indefinitely under some circumstances.

Reported by:	Cui Cheng
Reviewed by:	tuexen (mentor), rgrimes (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D24515
2020-04-29 22:01:33 +00:00
melifaro
57588206fe Convert more rtentry field accesses into nhop fields accesses.
Continue routing subsystem conversion to nhop objects defined in r359823.
Use fields from nhop structure instead of "struct rtentry" fields.
This is one of the last changes prior to removing rt_ifp, rt_ifa,
 rt_gateway and rt_mtu from struct rtentry.

Differential Revision:	https://reviews.freebsd.org/D24609
2020-04-29 21:54:09 +00:00
rscheff
f8d5656acd Correctly set up the initial TCP congestion window in all cases,
by not including the SYN bit sequence space in cwnd related calculations.
Snd_und is adjusted explicitly in all cases, outside the cwnd update, instead.

This fixes an off-by-one conformance issue with regular TCP sessions not
using Appropriate Byte Counting (RFC3465), sending one more packet during
the initial window than expected.

PR:		235256
Reviewed by:	tuexen (mentor), rgrimes (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D19000
2020-04-29 21:48:52 +00:00
melifaro
2338a28d9e Add nhop to the ifa_rtrequest() callback.
With the upcoming multipath changes described in D24141,
 rt->rt_nhop can potentially point to a nexthop group instead of
 an individual nhop.
To simplify caller handling of such cases, change ifa_rtrequest() callback
 to pass changed nhop directly.

Differential Revision:	https://reviews.freebsd.org/D24604
2020-04-29 19:28:56 +00:00
mmel
dd3a1b93be Fix style(9). Strip write only variables.
Not a functional change.

MFC after:	1 week
2020-04-29 14:36:50 +00:00
mmel
8dfd2d1457 Export tracing facility of GIC500 ITS block.
Possibility of tracing of processing message based interrupts is very
useful for debugging of PCIe driver, mainly for its MSI part.

MFC after: 1 week
2020-04-29 14:31:25 +00:00
mmel
43798da41a Don't try to set frequendcy for enumerated but newer started CPUs.
Openfirmare enumerates and installs the driver for all processors,
regardless of whether they will be started later (because of power
constrains for example).

MFC after: 3 weeks
2020-04-29 14:14:15 +00:00
mmel
c14a375467 Don't try to re-initialize already preseted regulator.
Don't set initial voltage for regulators having their voltage already
in allowed range. As side effect of this change, we don't try to set
initial voltage for fixed voltage regulators - these don't have impemented
voltage set  method so their initialization has always failed.

MFC after:	3 weeks
2020-04-29 13:45:21 +00:00
mmel
db4e94860d Multiple fixes for rockchip iodomain driver:
- always initialize selector of voltage signaling standard.
  Various versions of U-boot leaves voltage signaling standard settings
  for PMUIO2 domain in different state.  Always initialize it
  into expected state.
- start the driver as early as possible, the IO domains should be
  initialized before other drivers are attached.
- rename RK3399 register to its name founds in TRM.

This is the second part of fixes for serial port corruption observed after
DT 5.6 import.

Reviewed by:	manu
MFC after:	1 week
2020-04-29 13:43:15 +00:00
melifaro
ba5c497845 Move route-specific ddb commands to route/route_ddb.c
Currently functionality resides in rtsock.c, which is a controlling
 interface, partially external to the routing subsystem.
Additionally, DDB-supporting functionality is > 100SLOC, which deserves
 a separate file.

Given that, move this functionality to a newly-created net/route/ subdir.

Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D24561
2020-04-28 20:00:17 +00:00
melifaro
7c1be85b66 Move route_temporal.c and route_var.h to net/route.
Nexthop objects implementation, defined in r359823,
 introduced sys/net/route directory intended to hold all
 routing-related code. Move recently-introduced route_temporal.c and
 private route_var.h header there.

Differential Revision:	https://reviews.freebsd.org/D24597
2020-04-28 19:14:09 +00:00
melifaro
1d7f87c6b2 Move struct rtentry definition to nhop_var.h.
One of the goals of the new routing KPI defined in r359823
 is to entirely hide`struct rtentry` from the consumers.
It will allow to improve routing subsystem internals and deliver
 features much faster.

This is one of the last changes, effectively moving struct rtentry
 definition to a net/route_var.h header, internal to the routing subsystem.

Differential Revision:	https://reviews.freebsd.org/D24580
2020-04-28 18:42:30 +00:00
bdrewery
4cb95a1ad5 Don't try ctfconvert on file without debug info.
This was currently an ignored error but will change to a hard error
eventually.

Differential Revision:	https://reviews.freebsd.org/D24536
2020-04-28 16:09:25 +00:00
bdrewery
fcdcf2a876 None of these use opt_sched.h 2020-04-28 16:09:18 +00:00
takawata
44b4ea5525 Add le_read_buffer_size command and manpage.
It supports both v1 and v2 command.

PR:245964
Submitted by:	Marc Veldman <marc@bumblingdork.com>
2020-04-28 16:00:34 +00:00
markj
7d7c4f74f0 Make sendfile(SF_SYNC)'s CV wait interruptible.
Otherwise, since the CV is not signalled until data is drained from the
socket, it is trivial to create an unkillable process using
sendfile(SF_SYNC) and a process-private PF_LOCAL socket pair.  In
particular, the cv_wait() in sendfile() does not get interrupted until
data is drained from the receiving socket buffer.

Reported by:	pho
Discussed with:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-04-28 15:02:44 +00:00
markj
afa3646dbf Re-check for wirings after busying the page in vm_page_release_locked().
A concurrent unlocked lookup can wire the page after
vm_page_release_locked() releases the last wiring, in which case
vm_page_release_locked() must not free the page.  Once the xbusy lock is
acquired, that, the object lock and the fact that the page is unmapped
ensure that the wire count cannot increase, so re-check for new wirings
after the page is xbusied.

Update the comment above vm_page_wired() to reflect the new
synchronization rules.

Reported by:	glebius
Reviewed by:	alc, jeff, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24592
2020-04-28 13:51:41 +00:00
melifaro
192053a93a Convert rtalloc_mpath_fib() users to the new KPI.
New fib[46]_lookup() functions support multipath transparently.
Given that, switch the last rtalloc_mpath_fib() calls to
 dib4_lookup() and eliminate the function itself.

Note: proper flowid generation (especially for the outbound traffic) is a
 bigger topic and will be handled in a separate review.
This change leaves flowid generation intact.

Differential Revision:	https://reviews.freebsd.org/D24595
2020-04-28 08:06:56 +00:00
melifaro
4a05b1819a Eliminate now-unused parts of old routing KPI.
r360292 switched most of the remaining routing customers to a new KPI,
 leaving a bunch of wrappers for old routing lookup functions unused.

Remove them from the tree as a part of routing cleanup.

Differential Revision:	https://reviews.freebsd.org/D24569
2020-04-28 07:25:34 +00:00
melifaro
eeadd16587 Remove rtable dumping code from bootp.
This debugging code printing routing table data was introduced in rS25723,
 22+ years ago. The last functional commit to this code was rS67534, 19 years ago.
The code has been turned off by default all this time.
Lastly, this code directly iterates radix tree and rtentries, which is not
 not a proper interaction with routing system.

Differential Revision:	https://reviews.freebsd.org/D24554
2020-04-28 07:23:41 +00:00
rmacklem
3663733adc Get rid of uio_XXX macros used for the Mac OS/X port.
The NFS code had a bunch of Mac OS/X accessor functions named uio_XXX
left over from the port to Mac OS/X. Since that port is long forgotten,
replace the calls with the code generated by the FreeBSD macros for these
in nfskpiport.h. This allows the macros to be deleted from nfskpiport.h
and I think makes the code more readable.

This patch should not result in any semantic change.
2020-04-28 02:11:02 +00:00
jhb
ec845f5c7c Bump __FreeBSD_version for KTLS RX support. 2020-04-28 00:06:49 +00:00
jhb
2419779c0b Add support for KTLS RX over TOE to T6.
This largely reuses the TLS TOE support added in r330884.  However,
this uses the KTLS framework in upstream OpenSSL rather than requiring
Chelsio-specific patches to OpenSSL.  As with the existing TLS TOE
support, use of RX offload requires setting the tls_rx_ports sysctl.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24453
2020-04-27 23:59:42 +00:00
rmacklem
89bb24d90c Fix sosend_generic() so that it can handle a list of ext_pgs mbufs.
Without this patch, sosend_generic() will try to use top->m_pkthdr.len,
assuming that the first mbuf has a pkthdr.
When a list of ext_pgs mbufs is passed in, the first mbuf is not a
pkthdr and cannot be post-r359919.  As such, the value of top->m_pkthdr.len
is bogus (0 for my testing).
This patch fixes sosend_generic() to handle this case, calculating the
total length via m_length() for this case.

There is currently nothing that hands a list of ext_pgs mbufs to
sosend_generic(), but the nfs-over-tls kernel RPC code in
projects/nfs-over-tls will do that and was used to test this patch.

Reviewed by:	gallatin
Differential Revision:	https://reviews.freebsd.org/D24568
2020-04-27 23:55:09 +00:00
imp
65f15e1fdf Export the nda device's flags as a sysctl. 2020-04-27 23:43:17 +00:00
imp
502289dcd5 Convert rotating to a flag bit.
Move rotating to a flag bit. Add bit definitions for it. Create a
compat sysctl for it.
2020-04-27 23:43:12 +00:00
imp
d5c11c2480 Convert unmappedio over to a flag.
Make unmappedio a flag. Move it to the flags definition. Add compat
sysctl for it.
2020-04-27 23:43:08 +00:00
imp
c5149ad592 Add flags sysctl to ada
Report the ada device flags like we do the da devices. No booleans
have (yet) been converted, but iomapped and rotating are planned.
2020-04-27 23:43:04 +00:00
imp
43ef45384e Change the flags back to an enum
This was changed in the review process for the flags sysctl. The
reasons for the change are no longer valid as the code changed after
that. Cast the one place where it might make a difference (but I don't
think it does).  This restores the ability to see flags for softc in
gdb.
2020-04-27 23:39:32 +00:00
jhb
d223bc14de Initial support for kernel offload of TLS receive.
- Add a new TCP_RXTLS_ENABLE socket option to set the encryption and
  authentication algorithms and keys as well as the initial sequence
  number.

- When reading from a socket using KTLS receive, applications must use
  recvmsg().  Each successful call to recvmsg() will return a single
  TLS record.  A new TCP control message, TLS_GET_RECORD, will contain
  the TLS record header of the decrypted record.  The regular message
  buffer passed to recvmsg() will receive the decrypted payload.  This
  is similar to the interface used by Linux's KTLS RX except that
  Linux does not return the full TLS header in the control message.

- Add plumbing to the TOE KTLS interface to request either transmit
  or receive KTLS sessions.

- When a socket is using receive KTLS, redirect reads from
  soreceive_stream() into soreceive_generic().

- Note that this interface is currently only defined for TLS 1.1 and
  1.2, though I believe we will be able to reuse the same interface
  and structures for 1.3.
2020-04-27 23:17:19 +00:00
jhb
a19547aa99 Add the initial sequence number to the TLS enable socket option.
This will be needed for KTLS RX.

Reviewed by:	gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24451
2020-04-27 22:31:42 +00:00
erj
83723aa6b0 iflib: Stop interface before (un)registering VLAN
This patch is intended to solve a specific problem that iavf(4)
encounters, but what it does can be extended to solve other issues.

To summarize the iavf(4) issue, if the PF driver configures VLAN
anti-spoof, then the VF driver needs to make sure no untagged traffic is
sent if a VLAN is configured, and vice-versa. This can be an issue when
a VLAN is being registered or unregistered, e.g. when a packet may be on
the ring with a VLAN in it, but the VLANs are being unregistered. This
can cause that tagged packet to go out and cause an MDD event.

To fix this, include a new interface-dependent function that drivers can
implement named IFDI_NEEDS_RESTART(). Right now, this function is called
in iflib_vlan_unregister/register() to determine whether the interface
needs to be stopped and started when a VLAN is registered or
unregistered. The default return value of IFDI_NEEDS_RESTART() is true,
so this fixes the MDD problem that iavf(4) encounters, since the
interface rings are flushed during a stop/init.

A future change to iavf(4) will implement that function just in case the
default value changes, and to make it explicit that this interface reset
is required when a VLAN is added or removed.

Reviewed by:	gallatin@
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D22086
2020-04-27 22:02:44 +00:00
jhb
7cae950029 Retire the GENERICSF kernel config.
Now that hw.machine_arch handles soft-float vs hard-float there is no
longer a reason for this config.

Submitted by:	mhorne (kern.mk hunk)
Reviewed by:	imp (earlier version), kp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24544
2020-04-27 21:51:22 +00:00
jhb
7cd0c41207 Don't run strcmp() against strings stored in user memory.
Instead, copy the strings into a temporary buffer on the stack and
run strcmp on the copies.

Reviewed by:	brooks, kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24567
2020-04-27 18:04:42 +00:00
jhb
ca9eb9ea4d Improve MACHINE_ARCH handling for hard vs soft-float on RISC-V.
For userland, MACHINE_ARCH reflects the current ABI via preprocessor
directives.  For the kernel, the hw.machine_arch sysctl uses the ELF
header flags of the current process to select the correct MACHINE_ARCH
value.

Reviewed by:	imp, kp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24543
2020-04-27 17:55:40 +00:00
jhb
03fe59bbd8 Extend support in sysctls for supporting multiple native ABIs.
This extends some of the changes in place to support reporting support
for 32-bit ABIs to permit reporting hard-float vs soft-float ABIs.

Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24542
2020-04-27 17:53:38 +00:00
rrs
d5486433ca This change does a small prepratory step in getting the
latest rack and bbr in from the NF repo. When those come
in the OOB data handling will be fixed where Skyzaller crashes.

Differential Revision:	https://reviews.freebsd.org/D24575
2020-04-27 16:30:29 +00:00
markj
3cbb3f5c2b Fix handling of EV_EOF for named pipes.
Contrary to the kevent man page, EV_EOF on a fifo is not cleared by
EV_CLEAR.  Modify the read and write filters to clear EV_EOF when the
fifo's PIPE_EOF flag is clear, and update the man page to document the
new behaviour.

Modify the write filter to return the amount of buffer space available
even if no readers are present.  This matches the behaviour for sockets.

When reading from a pipe, only call pipeselwakeup() if some data was
actually read.  This prevents the continuous re-triggering of a
EVFILT_READ event on EOF when in edge-triggered mode.

PR:		203366, 224615
Submitted by:	Jan Kokemüller <jan.kokemueller@gmail.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24528
2020-04-27 15:59:19 +00:00