freebsd-dev

Author	SHA1	Message	Date
Gleb Smirnoff	53af690381	tcp: remove INP_TIMEWAIT flag Mechanically clean up INP_TIMEWAIT from the kernel sources. After `0d7445193a`, this commit shall not cause any functional changes. Note: this flag was very often checked together with INP_DROPPED. If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we will be able to remove most of these checks and turn them into assertions. Some of them can be turned into assertions right now, but that should be carefully done on a case by case basis. Differential revision: https://reviews.freebsd.org/D36400	2022-10-06 19:24:37 -07:00
Hans Petter Selasky	0e391a3197	ktls: Add missing NULL pointer check for TLS RX hardware offload. The send tag pointer may be NULL when the ktls_reset_receive_tag() function is invoked. Add check for this. Reviewed by: gallatin@ Sponsored by: NVIDIA Networking	2022-09-06 13:49:23 +02:00
Gleb Smirnoff	e7d02be19d	protosw: refactor protosw and domain static declaration and load o Assert that every protosw has pr_attach. Now this structure is only for socket protocols declarations and nothing else. o Merge struct pr_usrreqs into struct protosw. This was suggested in 1996 by wollman@ (see `7b187005d1`), and later reiterated in 2006 by rwatson@ (see `6fbb9cf860`). o Make struct domain hold a variable sized array of protosw pointers. For most protocols these pointers are initialized statically. Those domains that may have loadable protocols have spacers. IPv4 and IPv6 have 8 spacers each (andre@ `dff3237ee5`). o For inetsw and inet6sw leave a comment noting that many protosw entries very likely are dead code. o Refactor pf_proto_[un]register() into protosw_[un]register(). o Isolate pr_*_notsupp() methods into uipc_domain.c Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D36232	2022-08-17 11:50:32 -07:00
Hans Petter Selasky	fe8c78f0d2	ktls: Add full support for TLS RX offloading via network interface. Basic TLS RX offloading uses the "csum_flags" field in the mbuf packet header to figure out if an incoming mbuf has been fully offloaded or not. This information follows the packet stream via the LRO engine, IP stack and finally to the TCP stack. The TCP stack preserves the mbuf packet header also when re-assembling packets after packet loss. When the mbuf goes into the socket buffer the packet header is demoted and the offload information is transferred to "m_flags" . Later on a worker thread will analyze the mbuf flags and decide if the mbufs making up a TLS record indicate a fully-, partially- or not decrypted TLS record. Based on these three cases the worker thread will either pass the packet on as-is or recrypt the decrypted bits, if any, or decrypt the packet as usual. During packet loss the kernel TLS code will call back into the network driver using the send tag, informing about the TCP starting sequence number of every TLS record that is not fully decrypted by the network interface. The network interface then stores this information in a compressed table and starts asking the hardware if it has found a valid TLS header in the TCP data payload. If the hardware has found a valid TLS header and the referred TLS header is at a valid TCP sequence number according to the TCP sequence numbers provided by the kernel TLS code, the network driver then informs the hardware that it can resume decryption. Care has been taken to not merge encrypted and decrypted mbuf chains, in the LRO engine and when appending mbufs to the socket buffer. The mbuf's leaf network interface pointer is used to figure out from which network interface the offloading rule should be allocated. Also this pointer is used to track route changes. 
Currently mbuf send tags are used in both transmit and receive direction, due to convenience, but may get a new name in the future to better reflect their usage. Reviewed by: jhb@ and gallatin@ Differential revision: https://reviews.freebsd.org/D32356 Sponsored by: NVIDIA Networking	2022-06-07 12:58:09 +02:00
Hans Petter Selasky	f0fca64618	ktls: Refer send tag pointer once. So that the asserts and the actual code see the same values. Differential revision: https://reviews.freebsd.org/D32356 MFC after: 1 week Sponsored by: NVIDIA Networking	2022-06-07 12:57:03 +02:00
Gleb Smirnoff	b46667c63e	sockbuf: merge two versions of sbcreatecontrol() into one No functional change.	2022-05-17 10:10:42 -07:00
John Baldwin	a4c5d490f6	KTLS: Move OCF function pointers out of ktls_session. Instead, create a switch structure private to ktls_ocf.c and store a pointer to the switch in the ocf_session. This will permit adding an additional function pointer needed for NIC TLS RX without further bloating ktls_session. Reviewed by: hselasky Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D35011	2022-04-22 15:52:12 -07:00
John Baldwin	cd0525f615	ktls: Write-lock the INP when changing a transmit TLS session. The TCP rate pacing code relies on being able to read this pointer safely while holding an INP lock. The initial TLS session pointer is set while holding the write lock already. Reviewed by: gallatin, hselasky Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D34086	2022-02-11 15:16:25 -08:00
Mark Johnston	5de79eeddb	ktls: Disallow transmitting empty frames outside of TLS 1.0/CBC mode There was nothing preventing one from sending an empty fragment on an arbitrary KTLS TX-enabled socket, but ktls_frame() asserts that this could not happen. Though the transmit path handles this case for TLS 1.0 with AES-CBC, we should be strict and allow empty fragments only in modes where it is explicitly allowed. Modify sosend_generic() to reject writes to a KTLS-enabled socket if the number of data bytes is zero, so that userspace cannot trigger the aforementioned assertion. Add regression tests to exercise this case. Reported by: syzkaller Reviewed by: gallatin, jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34195	2022-02-08 12:40:41 -05:00
John Baldwin	d958bc7963	ktls: Try to enable TOE TLS after marking existing data not ready. At the moment this is mostly a no-op but in the future there will be in-flight encrypted data which requires software decryption. This same setup is also needed for NIC TLS RX. Note that this does break TOE TLS RX for AES-CBC ciphers since there is no software fallback for AES-CBC receive. This will be resolved one way or another before 14.0 is released. Reviewed by: hselasky Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D34082	2022-01-31 16:39:21 -08:00
Hans Petter Selasky	9e2cce7e6a	Implement a function to get the next TCP- and TLS- receive sequence number. This function will be used by coming TLS hardware receive offload support. Differential Revision: https://reviews.freebsd.org/D32356 Discussed with: jhb@ MFC after: 1 week Sponsored by: NVIDIA Networking	2022-01-26 12:55:00 +01:00
Mark Johnston	6be8944d96	ktls: Zero out TLS_GET_RECORD control messages Otherwise we end up copying one uninitialized byte into the socket buffer. Reported by: KMSAN Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33953	2022-01-20 15:42:46 -05:00
John Baldwin	05a1d0f5d7	ktls: Support for TLS 1.3 receive offload. Note that support for TLS 1.3 receive offload in OpenSSL is still an open pull request in active development. However, potential changes to that pull request should not affect the kernel interface. Reviewed by: hselasky Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D33007	2021-12-14 11:01:05 -08:00
Mateusz Guzik	a90b85dd5a	ktls: plug set-but-not-used vars Sponsored by: Rubicon Communications, LLC ("Netgate")	2021-12-14 14:44:37 +00:00
Cy Schubert	db0ac6ded6	Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816" This reverts commit `266f97b5e9`, reversing changes made to `a10253cffe`. A mismerge of a merge to catch up to main resulted in files being committed which should not have been.	2021-12-02 14:45:04 -08:00
Cy Schubert	266f97b5e9	wpa: Import wpa_supplicant/hostapd commit 14ab4a816 This is the November update to vendor/wpa committed upstream 2021-11-26. MFC after: 1 month	2021-12-02 13:35:14 -08:00
Gleb Smirnoff	de2d47842e	SMR protection for inpcbs With introduction of epoch(9) synchronization to network stack the inpcb database became protected by the network epoch together with static network data (interfaces, addresses, etc). However, inpcbs aren't static in nature, they are created and destroyed all the time, which creates some traffic on the epoch(9) garbage collector. A fairly new feature of uma(9), Safe Memory Reclamation, allows memory to be freed safely in page-sized batches, with virtually zero overhead compared to uma_zfree(). However, unlike epoch(9), it puts a stricter requirement on access to the protected memory, needing a critical(9) section to access it. Details: - The database is already built on CK lists, thanks to epoch(9). - For write access nothing is changed. - For a lookup in the database an SMR section is now required. Once the desired inpcb is found we need to transition from the SMR section to the r/w lock on the inpcb itself, with a check that the inpcb isn't yet freed. This requires some complexity, since the SMR section itself is a critical(9) section. The complexity is hidden from KPI users in inp_smr_lock(). - For an inpcb list traversal (a pcblist sysctl, or broadcast notification) a new KPI is also provided, that hides internals of the database - inp_next(struct inp_iterator *). Reviewed by: rrs Differential revision: https://reviews.freebsd.org/D33022	2021-12-02 10:48:48 -08:00
John Baldwin	900a28fe33	ktls: Reject some invalid cipher suites. - Reject AES-CBC cipher suites for TLS 1.0 and TLS 1.1 using auth algorithms other than SHA1-HMAC. - Reject AES-GCM cipher suites for TLS versions older than 1.2. Reviewed by: markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D32842	2021-11-15 11:30:12 -08:00
John Baldwin	e3ba94d4f3	Don't require the socket lock for sorele(). Previously, sorele() always required the socket lock and dropped the lock if the released reference was not the last reference. Many callers locked the socket lock just before calling sorele() resulting in a wasted lock/unlock when not dropping the last reference. Move the previous implementation of sorele() into a new sorele_locked() function and use it instead of sorele() for various places in uipc_socket.c that called sorele() while already holding the socket lock. The sorele() macro now uses refcount_release_if_not_last() to try to drop the socket reference without locking the socket. If that shortcut fails, it locks the socket and calls sorele_locked(). Reviewed by: kib, markj Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D32741	2021-11-09 10:50:12 -08:00
John Baldwin	96668a81ae	ktls: Always create a software backend for receive sessions. A future change to TOE TLS will require a software fallback for the first few TLS records received. Future support for NIC TLS on receive will also require a software fallback for certain cases. Reviewed by: gallatin, hselasky Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D32566	2021-10-21 09:37:17 -07:00
John Baldwin	c57dbec69a	ktls: Add a routine to query information in a receive socket buffer. In particular, ktls_pending_rx_info() determines which TLS record is at the end of the current receive socket buffer (including not-yet-decrypted data) along with how much data in that TLS record is not yet present in the socket buffer. This is useful for future changes to support NIC TLS receive offload and enhancements to TOE TLS receive offload. Those use cases need a way to synchronize a state machine on the NIC with the TLS record boundaries in the TCP stream. Reviewed by: gallatin, hselasky Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D32564	2021-10-21 09:36:29 -07:00
Mark Johnston	84c3922243	Convert consumers to vm_page_alloc_noobj_contig() Remove now-unneeded page zeroing. No functional change intended. Reviewed by: alc, hselasky, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D32006	2021-10-19 21:22:56 -04:00
Mark Johnston	a4667e09e6	Convert vm_page_alloc() callers to use vm_page_alloc_noobj(). Remove page zeroing code from consumers and stop specifying VM_ALLOC_NOOBJ. In a few places, also convert an allocation loop to simply use VM_ALLOC_WAITOK. Similarly, convert vm_page_alloc_domain() callers. Note that callers are now responsible for assigning the pindex. Reviewed by: alc, hselasky, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31986	2021-10-19 21:22:56 -04:00
John Baldwin	a72ee35564	ktls: Defer creation of threads and zones until first use. Run ktls_init() when the first KTLS session is created rather than unconditionally during boot. This avoids creating unused threads and allocating unused resources on systems which do not use KTLS. Reviewed by: gallatin, markj Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D32487	2021-10-14 15:48:34 -07:00
John Baldwin	9f03d2c001	ktls: Ensure FIFO encryption order for TLS 1.0. TLS 1.0 records are encrypted as one continuous CBC chain where the last block of the previous record is used as the IV for the next record. As a result, TLS 1.0 records cannot be encrypted out of order but must be encrypted as a FIFO. If the later pages of a sendfile(2) request complete before the first pages, then TLS records can be encrypted out of order. For TLS 1.1 and later this is fine, but this can break for TLS 1.0. To cope, add a queue in each TLS session to hold TLS records that contain valid unencrypted data but are waiting for an earlier TLS record to be encrypted first. - In ktls_enqueue(), check if a TLS record being queued is the next record expected for a TLS 1.0 session. If not, it is placed in sorted order in the pending_records queue in the TLS session. If it is the next expected record, queue it for SW encryption like normal. In addition, check if this new record (really a potential batch of records) was holding up any previously queued records in the pending_records queue. Any of those records that are now in order are also placed on the queue for SW encryption. - In ktls_destroy(), free any TLS records on the pending_records queue. These mbufs are marked M_NOTREADY so were not freed when the socket buffer was purged in sbdestroy(). Instead, they must be freed explicitly. Reviewed by: gallatin, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D32381	2021-10-13 12:30:15 -07:00
John Baldwin	a63752cce6	ktls: Reject attempts to enable AES-CBC with TLS 1.3. AES-CBC cipher suites are not supported in TLS 1.3. Reported by: syzbot+ab501c50033ec01d53c6@syzkaller.appspotmail.com Reviewed by: tuexen, markj Differential Revision: https://reviews.freebsd.org/D32404	2021-10-13 12:12:58 -07:00
Mark Johnston	bf25678226	ktls: Fix error/mode confusion in TCP_*TLS_MODE getsockopt handlers ktls_get_(rx\|tx)_mode() can return an errno value or a TLS mode, so errors are effectively hidden. Fix this by using a separate output parameter. Convert to the new socket buffer locking macros while here. Note that the socket buffer lock is not needed to synchronize the SOLISTENING check here, we can rely on the PCB lock. Reviewed by: jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31977	2021-09-17 14:19:05 -04:00
John Baldwin	c782ea8bb5	Add a switch structure for send tags. Move the type and function pointers for operations on existing send tags (modify, query, next, free) out of 'struct ifnet' and into a new 'struct if_snd_tag_sw'. A pointer to this structure is added to the generic part of send tags and is initialized by m_snd_tag_init() (which now accepts a switch structure as a new argument in place of the type). Previously, device driver ifnet methods switched on the type to call type-specific functions. Now, those type-specific functions are saved in the switch structure and invoked directly. In addition, this more gracefully permits multiple implementations of the same tag within a driver. In particular, NIC TLS for future Chelsio adapters will use a different implementation than the existing NIC TLS support for T6 adapters. Reviewed by: gallatin, hselasky, kib (older version) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D31572	2021-09-14 11:43:41 -07:00
Mark Johnston	f94acf52a4	socket: Rename sb(un)lock() and interlock with listen(2) In preparation for moving sockbuf locks into the containing socket, provide alternative macros for the sockbuf I/O locks: SOCK_IO_SEND_(UN)LOCK() and SOCK_IO_RECV_(UN)LOCK(). These operate on a socket rather than a socket buffer. Note that these locks are used only to prevent concurrent readers and writers from interleaving I/O. When locking for I/O, return an error if the socket is a listening socket. Currently the check is racy since the sockbuf sx locks are destroyed during the transition to a listening socket, but that will no longer be true after some follow-up changes. Modify a few places to check for errors from sblock()/SOCK_IO_(SEND\|RECV)_LOCK() where they were not before. In particular, add checks to sendfile() and sorflush(). Reviewed by: tuexen, gallatin MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31657	2021-09-07 15:06:48 -04:00
John Baldwin	470e851c4b	ktls: Support asynchronous dispatch of AEAD ciphers. KTLS OCF support was originally targeted at software backends that used host CPU cycles to encrypt TLS records. As a result, each KTLS worker thread queued a single TLS record at a time and waited for it to be encrypted before processing another TLS record. This works well for software backends but limits throughput on OCF drivers for coprocessors that support asynchronous operation such as qat(4) or ccr(4). This change uses an alternate function (ktls_encrypt_async) when encrypting TLS records via a coprocessor. This function queues TLS records for encryption and returns. It defers the work done after a TLS record has been encrypted (such as marking the mbufs ready) to a callback invoked asynchronously by the coprocessor driver when a record has been encrypted. - Add a struct ktls_ocf_state that holds the per-request state stored on the stack for synchronous requests. Asynchronous requests malloc this structure while synchronous requests continue to allocate this structure on the stack. - Add a ktls_encrypt_async() variant of ktls_encrypt() which does not perform request completion after dispatching a request to OCF. Instead, the ktls_ocf backends invoke ktls_encrypt_cb() when a TLS record request completes for an asynchronous request. - Flag AEAD software TLS sessions as async if the backend driver selected by OCF is an async driver. - Pull code to create and dispatch an OCF request out of ktls_encrypt() into a new ktls_encrypt_one() function used by both ktls_encrypt() and ktls_encrypt_async(). - Pull code to "finish" the VM page shuffling for a file-backed TLS record into a helper function ktls_finish_noanon() used by both ktls_encrypt() and ktls_encrypt_cb(). Reviewed by: markj Tested on: ccr(4) (jhb), qat(4) (markj) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D31665	2021-08-30 13:11:52 -07:00
John Baldwin	d16cb228c1	ktls: Fix accounting for TLS 1.0 empty fragments. TLS 1.0 empty fragment mbufs have no payload and thus m_epg_npgs is zero. However, these mbufs need to occupy a "unit" of space for the purposes of M_NOTREADY tracking similar to regular mbufs. Previously this was done for the page count returned from ktls_frame() and passed to ktls_enqueue() as well as the page count passed to pru_ready(). However, sbready() and mb_free_notready() only use m_epg_nrdy to determine the number of "units" of space in an M_EXT mbuf, so when a TLS 1.0 fragment was marked ready it would mark one unit of the next mbuf in the socket buffer as ready as well. To fix, set m_epg_nrdy to 1 for empty fragments. This actually simplifies the code as now only ktls_frame() has to handle TLS 1.0 fragments explicitly and the rest of the KTLS functions can just use m_epg_nrdy. Reviewed by: gallatin MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D31536	2021-08-16 10:42:46 -07:00
Andrew Gallatin	95c51fafa4	ktls: Init reset tag task for cloned sessions When cloning a ktls session (which is needed when we need to switch output NICs for a NIC TLS session), we need to also init the reset task, like we do when creating a new tls session. Reviewed by: jhb Sponsored by: Netflix	2021-08-11 14:06:43 -04:00
Andrew Gallatin	09066b9866	ktls: Use the new PNOLOCK flag Use the new PNOLOCK flag to tsleep() to indicate that we are managing potential races, and don't need to sleep with a lock, or have a backstop timeout. Reviewed by: jhb Sponsored by: Netflix	2021-08-05 17:19:12 -04:00
Andrew Gallatin	2694c869ff	ktls: fix a panic with INVARIANTS `98215005b7` introduced a new thread that uses tsleep(..0) to sleep forever. This hit an assert due to sleeping with a 0 timeout. So spell "forever" using SBT_MAX instead, which does not trigger the assert. Pointy hat to: gallatin Pointed out by: emaste Sponsored by: Netflix	2021-08-05 13:09:06 -04:00
Andrew Gallatin	98215005b7	ktls: start a thread to keep the 16k ktls buffer zone populated Ktls recently received an optimization where we allocate 16k physically contiguous crypto destination buffers. This provides a large (more than 5%) reduction in CPU use in our workload. However, after several days of uptime, the performance benefit disappears because we have frequent allocation failures from the ktls buffer zone. It turns out that when load drops off, the ktls buffer zone is trimmed, and some 16k buffers are freed back to the OS. When load picks back up again, re-allocating those 16k buffers fails after some number of days of uptime because physical memory has become fragmented. This causes allocations to fail, because they are intentionally done with M_NORECLAIM, so as to avoid pausing the ktls crypto work thread while the VM system defragments memory. To work around this, this change starts one thread per VM domain to allocate ktls buffers without M_NORECLAIM, as we don't care if this thread is paused while memory is defragged. The thread then frees the buffers back into the ktls buffer zone, thus allowing future allocations to succeed. Note that waking up the thread is intentionally racy, but neither of the races really matters. In the worst case, we could have either spurious wakeups or we could have to wait 1 second until the next rate-limited allocation failure to wake up the thread. This patch has been in use at Netflix on a handful of servers, and seems to fix the issue. Differential Revision: https://reviews.freebsd.org/D31260 Reviewed by: jhb, markj, (jtl, rrs, and dhw reviewed earlier version) Sponsored by: Netflix	2021-08-05 10:19:12 -04:00
Andrew Gallatin	4150a5a87e	ktls: fix NOINET build Reported by: mjguzik Sponsored by: Netflix	2021-07-07 10:40:02 -04:00
Andrew Gallatin	28d0a740dd	ktls: auto-disable ifnet (inline hw) kTLS Ifnet (inline) hw kTLS NICs typically keep state within a TLS record, so that when transmitting in-order, they can continue encryption on each segment sent without DMA'ing extra state from the host. This breaks down when transmits are out of order (e.g., TCP retransmits). In this case, the NIC must re-DMA the entire TLS record up to and including the segment being retransmitted. This means that when re-transmitting the last 1448 byte segment of a TLS record, the NIC will have to re-DMA the entire 16KB TLS record. This can lead to the NIC running out of PCIe bus bandwidth well before it saturates the network link if a lot of TCP connections have a high retransmit rate. This change introduces a new sysctl (kern.ipc.tls.ifnet_max_rexmit_pct); TCP connections whose retransmit rate exceeds it will be switched to SW kTLS so as to conserve PCIe bandwidth. Reviewed by: hselasky, markj, rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30908	2021-07-06 10:28:32 -04:00
Mateusz Guzik	904a08f342	ktls: switch bare zone_mbuf use to m_free_raw Reviewed by: gallatin Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D30955	2021-07-02 08:30:22 +00:00
John Baldwin	faf0224ff2	ktls: Don't mark existing received mbufs notready for TOE TLS. The TOE driver might receive decrypted TLS records that are enqueued to the socket buffer after ktls_try_toe() returns and before ktls_enable_rx() locks the receive buffer to call sb_mark_notready(). In that case, sb_mark_notready() would incorrectly treat the decrypted TLS record as an encrypted record and schedule it for decryption. This always resulted in the connection being dropped as the data in the control message did not look like a valid TLS header. To fix, don't try to handle software decryption of existing buffers in the socket buffer for TOE TLS in ktls_enable_rx(). If a TOE TLS driver needs to decrypt existing data in the socket buffer, the driver will need to manage that in its tod_alloc_tls_session method. Sponsored by: Chelsio Communications	2021-06-15 17:45:21 -07:00
Andrew Gallatin	ed5e13cfc2	ktls: Fix interaction with RATELIMIT uipc_ktls.c was missing opt_ratelimit.h, so it was never noticing that RATELIMIT was enabled. Once it was enabled, it failed to compile as ktls_modify_txrtlmt() had accrued a compilation error when it was not being compiled in. Sponsored by: Netflix	2021-06-14 10:51:16 -04:00
John Baldwin	6b313a3a60	Include the trailer in the original dst_iov. This avoids creating a duplicate copy on the stack just to append the trailer. Reviewed by: gallatin, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30139	2021-05-25 16:59:19 -07:00
John Baldwin	21e3c1fbe2	Assume OCF is the only KTLS software backend. This removes support for loadable software backends. The KTLS OCF support is now always included in kernels with KERN_TLS and the ktls_ocf.ko module has been removed. The software encryption routines now take an mbuf directly and use the TLS mbuf as the crypto buffer when possible. Bump __FreeBSD_version for software backends in ports. Reviewed by: gallatin, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D30138	2021-05-25 16:59:19 -07:00
Mark Johnston	89b650872b	ktls: Hide initialization message behind bootverbose We don't typically print anything when a subsystem initializes itself, and KTLS is currently disabled by default anyway. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29097	2021-03-05 13:11:02 -05:00
Mark Johnston	49f6925ca3	ktls: Cache output buffers for software encryption Maintain a cache of physically contiguous runs of pages for use as output buffers when software encryption is configured and in-place encryption is not possible. This makes allocation and free cheaper since in the common case we avoid touching the vm_page structures for the buffer, and fewer calls into UMA are needed. gallatin@ reports a ~10% absolute decrease in CPU usage with sendfile/KTLS on a Xeon after this change. It is possible that we will not be able to allocate these buffers if physical memory is fragmented. To avoid frequently calling into the physical memory allocator in this scenario, rate-limit allocation attempts after a failure. In the failure case we fall back to the old behaviour of allocating a page at a time. N.B.: this scheme could be simplified, either by simply using malloc() and looking up the PAs of the pages backing the buffer, or by falling back to page by page allocation and creating a mapping in the cache zone. This requires some way to save a mapping of an M_EXTPG page array in the mbuf, though. m_data is not really appropriate. The second approach may be possible by saving the mapping in the plinks union of the first vm_page structure of the array, but this would force a vm_page access when freeing an mbuf. Reviewed by: gallatin, jhb Tested by: gallatin Sponsored by: Ampere Computing Submitted by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D28556	2021-03-03 17:34:01 -05:00
John Baldwin	90972f0402	ktls: Use COUNTER_U64_DEFINE_EARLY for the ktls_toe_chacha20 counter. I missed updating this counter when rebasing the changes in `9c64fc4029` after the switch to COUNTER_U64_DEFINE_EARLY in `1755b2b989`. Fixes: `9c64fc4029` Add Chacha20-Poly1305 as a KTLS cipher suite. Sponsored by: Netflix	2021-02-25 15:00:13 -08:00
John Baldwin	9c64fc4029	Add Chacha20-Poly1305 as a KTLS cipher suite. Chacha20-Poly1305 for TLS is an AEAD cipher suite for both TLS 1.2 and TLS 1.3 (RFCs 7905 and 8446). For both versions, Chacha20 uses the server and client IVs as implicit nonces xored with the record sequence number to generate the per-record nonce matching the construction used with AES-GCM for TLS 1.3. Reviewed by: gallatin Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D27839	2021-02-18 09:26:32 -08:00
Mark Johnston	b5aa9ad43a	ktls: Make configuration sysctls available as tunables Reviewed by: gallatin, jhb Sponsored by: Ampere Computing Submitted by: Klara, Inc. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28499	2021-02-08 09:19:02 -05:00
Mark Johnston	1755b2b989	ktls: Use COUNTER_U64_DEFINE_EARLY This makes it a bit more straightforward to add new counters when debugging. No functional change intended. Reviewed by: jhb Sponsored by: Ampere Computing Submitted by: Klara, Inc. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28498	2021-02-08 09:18:51 -05:00
Gleb Smirnoff	3f43ada98c	Catch up with `6edfd179c8`: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG. Originally IFCAP_NOMAP meant that the mbuf has external storage pointer that points to unmapped address. Then, this was extended to array of such pointers. Then, such mbufs were augmented with header/trailer. Basically, extended mbufs are extended, and set of features is subject to change. The new name should be generic enough to avoid further renaming.	2021-01-29 11:46:24 -08:00
Mark Johnston	4dc1b17dbb	ktls: Improve handling of the bind_threads tunable a bit - Only check for empty domains if we actually tried to configure domain affinity in the first place. Otherwise setting bind_threads=1 will always cause the sysctl value to be reported as zero. This is harmless since the threads end up being bound, but it's confusing. - Try to improve the sysctl description a bit. Reviewed by: gallatin, jhb Submitted by: Klara, Inc. Sponsored by: Ampere Computing Differential Revision: https://reviews.freebsd.org/D28161	2021-01-19 21:32:33 -05:00
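
Note on `9c64fc4029` (Chacha20-Poly1305): the message says the per-record nonce is the implicit IV xored with the record sequence number, the same construction TLS 1.3 uses for AES-GCM. A minimal user-space sketch of that construction follows, assuming a 12-byte IV and a 64-bit sequence number that is encoded big-endian and left-padded with zeros as in RFC 8446 section 5.3. The function name and constant are hypothetical and chosen for illustration; this is not the code from uipc_ktls.c or ktls_ocf.c.

```c
#include <stdint.h>
#include <string.h>

#define TLS_AEAD_NONCE_LEN 12	/* assumed nonce/IV length in bytes */

/*
 * Hypothetical helper (not a kernel function): derive the per-record
 * nonce by XORing the implicit IV with the record sequence number.
 * The sequence number is treated as big-endian and left-padded with
 * zeros, so it only affects the last 8 bytes of the nonce.
 */
static void
tls_record_nonce(const uint8_t iv[TLS_AEAD_NONCE_LEN], uint64_t seqno,
    uint8_t nonce[TLS_AEAD_NONCE_LEN])
{
	int i;

	/* Start from the implicit IV. */
	memcpy(nonce, iv, TLS_AEAD_NONCE_LEN);

	/* XOR in the sequence number, least significant byte last. */
	for (i = 0; i < 8; i++)
		nonce[TLS_AEAD_NONCE_LEN - 1 - i] ^= (uint8_t)(seqno >> (8 * i));
}
```

Because the nonce is derived from the sequence number, nothing per-record needs to be carried on the wire, and each record under a given key gets a distinct nonce as long as sequence numbers do not repeat.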

89 Commits