freebsd-dev

Author	SHA1	Message	Date
Randall Stewart	945f9a7cc9	tcp: misc cleanup of options for rack as well as socket option logging. Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental SaD (Sack Attack Detection) algorithm that should be made available. Also there is a t_maxpeak_rate that needs to be removed (its un-used). Reviewed by: tuexen, cc Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39427	2023-04-07 10:15:29 -04:00
Randall Stewart	73ee5756de	Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features. So stack switching as always been a bit of a issue. We currently use a break before make setup which means that if something goes wrong you have to try to get back to a stack. This patch among a lot of other things changes that so that it is a make before break. We also expand some of the function blocks in prep for new features in rack that will allow more controlled pacing. We also add other abilities such as the pathway for a stack to query a previous stack to acquire from it critical state information so things in flight don't get dropped or mis-handled when switching stacks. We also add the concept of a timer granularity. This allows an alternate stack to change from the old ticks granularity to microseconds and of course this even gives us a pathway to go to nanosecond timekeeping if we need to (something for the data center to consider for sure). Once all this lands I will then update rack to begin using all these new features. Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39210	2023-04-01 01:46:38 -04:00
Alexander V. Chernikov	19e43c163c	netlink: add netlink KPI to the kernel by default This change does the following: Base Netlink KPIs (ability to register the family, parse and/or write a Netlink message) are always present in the kernel. Specifically, * Implementation of genetlink family/group registration/removal, some base accessors (netlink_generic_kpi.c, 260 LoC) are compiled in unconditionally. * Basic TLV parser functions (netlink_message_parser.c, 507 LoC) are compiled in unconditionally. * Glue functions (netlink<>rtsock), malloc/core sysctl definitions (netlink_glue.c, 259 LoC) are compiled in unconditionally. * The rest of the KPI _functions_ are defined in the netlink_glue.c, but their implementation calls a pointer to either the stub function or the actual function, depending on whether the module is loaded or not. This approach allows to have only 1k LoC out of ~3.7k LoC (current sys/netlink implementation) in the kernel, which will not grow further. It also allows for the generic netlink kernel customers to load successfully without requiring Netlink module and operate correctly once Netlink module is loaded. Reviewed by: imp MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D39269	2023-03-27 13:55:44 +00:00
Randall Stewart	69c7c81190	Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities. The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints we need to move to using the new inline functions. This adds them and moves rack to now use the tcp_tracepoints. Reviewed by: tuexen, gallatin Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D38831	2023-03-16 11:43:16 -04:00
Brooks Davis	105a4f7b3c	ng_atmllc: remove This standalone module is the last vestage of ATM support in the tree so send it on its way. Reviewed by: manu, emaste Relnotes: yes Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D38880	2023-03-09 18:04:21 +00:00
Brooks Davis	af0cc0b223	NgATM: Remove netgraph ATM support Most ATM support was removed prior to FreeBSD 12. The netgraph support was kept as it was less intrusive, but it is presumed to be unused. Reviewed by: manu Relnotes: yes Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D38879	2023-03-09 18:04:02 +00:00
Brooks Davis	bd32aedeee	NgATM: Remove useless NGATM_ATM option MFC after: 3 days Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D38875	2023-03-02 23:40:05 +00:00
Brooks Davis	3746e90118	NATM: Remove useless NETGRAPH_ATM_ATMPIF option This code was removed as part of the NATM removal in 2017 and somehow this option was missed. MFC after: 3 days Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D38874	2023-03-02 23:40:05 +00:00
Michael Paepcke	c51978f4b2	kbd: add KBD_DELAY1 and KBD_DELAY2 Allow to configure KBD_DELAY* via KERNCONF for user-land less embedded and security appliances Reviewed by: imp (folded) Pull Request: https://github.com/freebsd/freebsd-src/pull/649	2023-02-24 23:19:05 -07:00
Dmitry Chagin	06c07e1203	Complete removal of opt_compat.h Since Linux emulation layer build options was removed there is no reason to keep opt_compat.h. Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D38548 MFC after: 2 weeks	2023-02-13 19:07:38 +03:00
Dag-Erling Smørgrav	69d94f4c76	Add tarfs, a filesystem backed by tarballs. Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Reviewed by: pauamma, imp Differential Revision: https://reviews.freebsd.org/D37753	2023-02-02 18:19:29 +01:00
Alexander V. Chernikov	c9313a0bad	netlink: allow netlink to be build in the kernel Differential Revision: https://reviews.freebsd.org/D37781	2022-12-23 15:24:44 +00:00
Gleb Smirnoff	eaabc93764	tcp: retire TCPDEBUG This subsystem is superseded by modern debugging facilities, e.g. DTrace probes and TCP black box logging. We intentionally leave SO_DEBUG in place, as many utilities may set it on a socket. Also the tcp::debug DTrace probes look at this flag on a socket. Reviewed by: gnn, tuexen Discussed with: rscheff, rrs, jtl Differential revision: https://reviews.freebsd.org/D37694	2022-12-14 09:54:06 -08:00
Gleb Smirnoff	e68b379244	tcp: embed inpcb into tcpcb For the TCP protocol inpcb storage specify allocation size that would provide space to most of the data a TCP connection needs, embedding into struct tcpcb several structures, that previously were allocated separately. The most import one is the inpcb itself. With embedding we can provide strong guarantee that with a valid TCP inpcb the tcpcb is always valid and vice versa. Also we reduce number of allocs/frees per connection. The embedded inpcb is placed in the beginning of the struct tcpcb, since in_pcballoc() requires that. However, later we may want to move it around for cache line efficiency, and this can be done with a little effort. The new intotcpcb() macro is ready for such move. The congestion algorithm data, the TCP timers and osd(9) data are also embedded into tcpcb, and temprorary struct tcpcb_mem goes away. There was no extra allocation here, but we went through extra pointer every time we accessed this data. One interesting side effect is that now TCP data is allocated from SMR-protected zone. Potentially this allows the TCP stacks or other TCP related modules to utilize that for their own synchronization. Large part of the change was done with sed script: s/tp->ccv->/tp->t_ccv./g s/tp->ccv/\&tp->t_ccv/g s/tp->cc_algo/tp->t_cc/g s/tp->t_timers->tt_/tp->tt_/g s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g Dependency side effect is that code that needs to know struct tcpcb should also know struct inpcb, that added several <netinet/in_pcb.h>. Differential revision: https://reviews.freebsd.org/D37127	2022-12-07 09:00:48 -08:00
Doug Rabson	eb6f48854d	Fix a typo in the binmisc option name This should be spelt IMGACT_BINMISC to match the filename. The option name does not appear outside of sys/conf and this module is typically used via the kernel module imgact_binmisc.ko. MFC After: 2 weeks	2022-12-07 13:51:34 +00:00
Robert Wing	dacbe4d533	config: remove LOCK_PROFILING_FAST This option lived a short life in 2007. last used `eea4f254fe` Reviewed by: mjg Differential Revision: https://reviews.freebsd.org/D37600	2022-12-05 08:03:52 -09:00
Warner Losh	d6f1e6aa11	config: Make ISAPNP be in opt_dontuse.h Nothing uses ISAPNP today, apart from bringing in files or not. There's really no need to ever do #ifdef ISAPNP in drivers and such. It means use the ISA bus plug and play isolation protocol to enumerate the bus, not the more useful 'you might have devices with isa pnp ids' which all drivers hide behind DEV_ISA and/or an isa clause in the files files. Sponsored by: Netflix Reviewed by: kevans, emaste Differential Revision: https://reviews.freebsd.org/D37109	2022-10-24 12:13:03 -06:00
Mitchell Horne	287d467c5d	mac: add new mac_ddb(4) policy Generally, access to the kernel debugger is considered to be unsafe from a security perspective since it presents an unrestricted interface to inspect or modify the system state, including sensitive data such as signing keys. However, having some access to debugger functionality on production systems may be useful in determining the cause of a panic or hang. Therefore, it is desirable to have an optional policy which allows limited use of ddb(4) while disabling the functionality which could reveal system secrets. This loadable MAC module allows for the use of some ddb(4) commands while preventing the execution of others. The commands have been broadly grouped into three categories: - Those which are 'safe' and will not emit sensitive data (e.g. trace). Generally, these commands are deterministic and don't accept arguments. - Those which are definitively unsafe (e.g. examine <addr>, search <addr> <value>) - Commands which may be safe to execute depending on the arguments provided (e.g. show thread <addr>). Safe commands have been flagged as such with the DB_CMD_MEMSAFE flag. Commands requiring extra validation can provide a function to do so. For example, 'show thread <addr>' can be used as long as addr can be checked against the system's list of process structures. The policy also prevents debugger backends other than ddb(4) from executing, for example gdb(4). Reviewed by: markj, pauamma_gundo.com (manpages) Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D35371	2022-07-18 22:06:15 +00:00
John Baldwin	0b377a49fa	FB_INSTALL_CDEV: Remove this option and related code. This option was never enabled in GENERIC and does not appear to work (the cdevsw is stored in a global array but never passed to make_dev to be associated with a character device). Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D35008	2022-04-21 10:29:14 -07:00
Emmanuel Vadot	8b39d2fe35	options: Remove EXT_RESOURCES It is now unused in kernel code. MFC after: Never Differential Revision: https://reviews.freebsd.org/D33839	2022-02-21 17:29:15 +01:00
Warner Losh	21e22be91a	ed: Remove options ed(4) was removed some time ago, but these options relevant to only it weren't GC'd at the time. Remove them. Sponsored by: Netflix	2021-12-09 17:41:39 -07:00
Florian Walpen	bf2fa8d9d1	MAC/priority module for realtime privilege group This is a MAC policy module that grants scheduling privileges based on group membership. Users or processes in the group realtime (gid 47) are allowed to run threads and processes with realtime scheduling priority. For timing-sensitive, low-latency software like audio/jack, running with realtime priority helps to avoid stutter and gaps. PR: 239125 MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D33191	2021-12-04 20:19:25 +02:00
Cy Schubert	db0ac6ded6	Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816" This reverts commit `266f97b5e9`, reversing changes made to `a10253cffe`. A mismerge of a merge to catch up to main resulted in files being committed which should not have been.	2021-12-02 14:45:04 -08:00
Cy Schubert	266f97b5e9	wpa: Import wpa_supplicant/hostapd commit 14ab4a816 This is the November update to vendor/wpa committed upstream 2021-11-26. MFC after: 1 month	2021-12-02 13:35:14 -08:00
Gleb Smirnoff	93c67567e0	Remove "options PCBGROUP" With upcoming changes to the inpcb synchronisation it is going to be broken. Even its current status after the move of PCB synchronization to the network epoch is very questionable. This experimental feature was sponsored by Juniper but ended never to be used in Juniper and doesn't exist in their source tree [sjg@, stevek@, jtl@]. In the past (AFAIK, pre-epoch times) it was tried out at Netflix [gallatin@, rrs@] with no positive result and at Yandex [ae@, melifaro@]. I'm up to resurrecting it back if there is any interest from anybody. Reviewed by: rrs Differential revision: https://reviews.freebsd.org/D33020	2021-12-02 10:48:48 -08:00
Warner Losh	8722e05ae1	twa: Remove Belatedly remove twa(4). It was supposed to go before 13.0, but was overlooked. Sponsored by: Netflix Relnotes: yes Reviewed by: scottl Differential Revision: https://reviews.freebsd.org/D33114	2021-11-25 00:45:13 -07:00
Kristof Provost	4e85b64890	Add a COMPAT_FREEBSD13 kernel option Use it wherever COMPAT_FREEBSD11 is currently specified. Reviewed by: jhb (previous version) Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D33005	2021-11-17 03:08:40 +01:00
Randall Stewart	b8d60729de	tcp: Congestion control cleanup. NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!! This patch does a bit of cleanup on TCP congestion control modules. There were some rather interesting surprises that one could get i.e. where you use a socket option to change from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get a memory failure and end up on cc_newreno. This is not what one would expect. The new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK and pass in to the init function preallocated memory. The CC init is expected in this case not to fail but if it does and a module does break the "no fail with memory given" contract we do fall back to the CC that was in place at the time. This also fixes up a set of common newreno utilities that can be shared amongst other CC modules instead of the other CC modules reaching into newreno and executing what they think is a "common and understood" function. Lets put these functions in cc.c and that way we have a common place that is easily findable by future developers or bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE and HYSTART++ without having to dance through hoops for other CC modules, instead both newreno and the other modules just call into the common functions if they desire that behavior or roll there own if that makes more sense. Note: This commit changes the kernel configuration!! If you are not using GENERIC in some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC, CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined as well if you desire. Note that if you create a kernel configuration that does not define a congestion control module and includes INET or INET6 the kernel compile will break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\" but you can specify any string that represents the name of the CC module (same names that show up in the CC module list under net.inet.tcp.cc). If you fail to add the options CC_DEFAULT in your kernel configuration the kernel build will also break. Reviewed by: Michael Tuexen Sponsored by: Netflix Inc. RELNOTES:YES Differential Revision: https://reviews.freebsd.org/D32693	2021-11-11 06:28:18 -05:00
Warner Losh	4b3da659bf	nvme: Only reset once on attach. The FreeBSD nvme driver has reset the nvme controller twice on attach to address a theoretical issue assuring the hardware is in a known state. However, exierence has shown the second reset is unnecessary and increases the time to boot. Eliminate the second reset. Should there be a situation when you need a second reset (for buggy or at least somewhat out of the mainstream hardware), the hardware option NVME_2X_RESET will restore the old behavior. Document this in nvme(4). If there's any trouble at all with this, I'll add a sysctl tunable to control it. Sponsored by: Netflix Reviewed by: cperciva, mav Differential Revision: https://reviews.freebsd.org/D32241	2021-10-01 11:09:34 -06:00
Konstantin Belousov	cf0ee8738e	Drop cloudabi According to https://github.com/NuxiNL/cloudlibc: CloudABI is no longer being maintained. It was an awesome experiment, but it never got enough traction to be sustainable. There is no reason to keep it in FreeBSD. Approved by: ed (private mail) Reviewed by: emaste Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D31923	2021-09-22 00:18:44 +03:00
Mark Johnston	30d00832d7	conf: Add a KMSAN kernel option Sponsored by: The FreeBSD Foundation	2021-08-10 21:22:12 -04:00
Kyle Evans	7a129c973b	kern: add an option for preserving the early kenv Some downstream configurations do not store secrets in the early (loader/static) environments and desire a way to preserve these for diagnostic reasons. Provide an option to do so. Reviewed by: imp, jhb (earlier version) Differential Revision: https://reviews.freebsd.org/D30834	2021-07-18 23:05:48 -05:00
Mark Johnston	01028c736c	Add a KASAN option to the kernel build LLVM support for enabling KASAN has not yet landed so the option is not yet usable, but hopefully this will change soon. Reviewed by: imp, andrew MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29454	2021-04-13 17:42:20 -04:00
Konstantin Belousov	750ea20d3f	Delete dead CLUSTERDEBUG config option. Reviewed by: mckusick Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Warner Losh	35af933173	acpi: limit the AMDI0020/AMDI0010 workaround to an option It appears that production versions of EPYC firmware get the _STA method right for these nodes. In fact, this workaround breaks on production hardware by including too many uart nodes. This work around was for pre-release hardware that wound up not having a large deployment. Move this work around to a kernel option since the machines that needed it have been powered off and are difficult to resurrect. Should there be a more significant deployment than is understood, we can restrict it based on smbios strings. Discussed with: mmacy@, seanc@, jhb@ MFC After: 3 days	2021-02-08 14:47:49 -07:00
Vladimir Kondratyev	b62f6dfaed	hid: Replace USBHID_ENABLED kernel config option with loader tunable usbhid(4) is disabled by default to avoid conflicts with existing USB HID drivers. To enable it place following lines to /boot/loader.conf: hw.usb.usbhid.enable=1 usbhid_load="YES" Suggested by: jhb Reviewed by: hselasky Differential revision: https://reviews.freebsd.org/D28124	2021-01-14 23:04:47 +03:00
Vladimir Kondratyev	9be6b22da9	hidraw(4): Add HIDRAW_MAKE_UHID_ALIAS kernel option which installs /dev/uhid# alias to hidraw character device for compatibility with some existing uhid(4) users like Firefox. As side effect it renames traditional uhid(4) driver to hidraw to make possible using of common unit number allocator. Requested by: Greg V <greg_unrelenting.technology> Reviewed by: hselasky (as part of D27992)	2021-01-08 02:18:44 +03:00
Vladimir Kondratyev	b93f6bfca3	hid: Port ukbd to HID and attach to build Reviewed by: hselasky Differential revision: https://reviews.freebsd.org/D27991	2021-01-08 02:18:43 +03:00
Vladimir Kondratyev	01f2e864f7	hid: Import usbhid - USB transport backend for HID subsystem. This change implements hid_if.m methods for HID-over-USB protocol [1]. Also, this change adds USBHID_ENABLED kernel option which changes device_probe() priority and adds/removes PnP records to prefer usbhid over ums, ukbd, wmt and other USB HID device drivers and vice-versa. The module is based on uhid(4) driver. It is disabled by default for now due to conflicts with existing USB HID drivers. [1] https://www.usb.org/sites/default/files/hid1_11.pdf Reviewed by: hselasky Differential revision: https://reviews.freebsd.org/D27893	2021-01-08 02:18:43 +03:00
Vladimir Kondratyev	b1f1b07f6d	hid: Import iichid - I2C transport backend for HID subsystem This implements hid_if.m methods for HID-over-I2C protocol [1]. Following kernel options are added: IICHID_SAMPLING - Enable support for a sampling mode as interrupt resource acquisition is not always possible in a case of GPIO interrupts. IICHID_DEBUG - Enable debug output. The module is based on prior Marc Priggemeyer work (D16698). [1] http://download.microsoft.com/download/7/d/d/7dd44bb7-2a7a-4505-ac1c-7227d3d96d5b/hid-over-i2c-protocol-spec-v1-0.docx Differential revision: https://reviews.freebsd.org/D27892	2021-01-08 02:18:43 +03:00
Vladimir Kondratyev	1975878673	hid: Import functions and constants required by new subsystem This does an import of quirk stubs, debugging macros from USB code and numerous usage constants used by dependent drivers. Besides, this change renames some functions to get a better matching with userland library and NetBSD/OpenBSD HID code. Namely: - Old hid_report_size() renamed to hid_report_size_max() - New hid_report_size() calculates size of given report rather than maximum size of all reports. - hid_get_data_unsigned() renamed to hid_get_udata() - hid_put_data_unsigned() renamed to hid_put_udata() Compat shim functions are provided in usbhid.h to make possible compile of legacy code unmodified after this change. Reviewed by: manu, hselasky Differential revision: https://reviews.freebsd.org/D27887	2021-01-08 02:18:42 +03:00
Alexander V. Chernikov	f5baf8bb12	Add modular fib lookup framework. This change introduces framework that allows to dynamically attach or detach longest prefix match (lpm) lookup algorithms to speed up datapath route tables lookups. Framework takes care of handling initial synchronisation, route subscription, nhop/nhop groups reference and indexing, dataplane attachments and fib instance algorithm setup/teardown. Framework features automatic algorithm selection, allowing for picking the best matching algorithm on-the-fly based on the amount of routes in the routing table. Currently framework code is guarded under FIB_ALGO config option. An idea is to enable it by default in the next couple of weeks. The following algorithms are provided by default: IPv4: * bsearch4 (lockless binary search in a special IP array), tailored for small-fib (<16 routes) * radix4_lockless (lockless immutable radix, re-created on every rtable change), tailored for small-fib (<1000 routes) * radix4 (base system radix backend) * dpdk_lpm4 (DPDK DIR24-8-based lookups), lockless datastrucure, optimized for large-fib (D27412) IPv6: * radix6_lockless (lockless immutable radix, re-created on every rtable change), tailed for small-fib (<1000 routes) * radix6 (base system radix backend) * dpdk_lpm6 (DPDK DIR24-8-based lookups), lockless datastrucure, optimized for large-fib (D27412) Performance changes: Micro benchmarks (I7-7660U, single-core lookups, 2048k dst, code in D27604): IPv4: 8 routes: radix4: ~20mpps radix4_lockless: ~24.8mpps bsearch4: ~69mpps dpdk_lpm4: ~67 mpps 700k routes: radix4_lockless: 3.3mpps dpdk_lpm4: 46mpps IPv6: 8 routes: radix6_lockless: ~20mpps dpdk_lpm6: ~70mpps 100k routes: radix6_lockless: 13.9mpps dpdk_lpm6: 57mpps Forwarding benchmarks: + 10-15% IPv4 forwarding performance (small-fib, bsearch4) + 25% IPv4 forwarding performance (full-view, dpdk_lpm4) + 20% IPv6 forwarding performance (full-view, dpdk_lpm6) Control: Framwork adds the following runtime sysctls: List algos * net.route.algo.inet.algo_list: bsearch4, radix4_lockless, radix4 * net.route.algo.inet6.algo_list: radix6_lockless, radix6, dpdk_lpm6 Debug level (7=LOG_DEBUG, per-route) net.route.algo.debug_level: 5 Algo selection (currently only for fib 0): net.route.algo.inet.algo: bsearch4 net.route.algo.inet6.algo: radix6_lockless Support for manually changing algos in non-default fib will be added soon. Some sysctl names will be changed in the near future. Differential Revision: https://reviews.freebsd.org/D27401	2020-12-25 11:33:17 +00:00
Alexander V. Chernikov	d1d941c5b9	Remove RADIX_MPATH config option. ROUTE_MPATH is the new config option controlling new multipath routing implementation. Remove the last pieces of RADIX_MPATH-related code and the config option. Reviewed by: glebius Differential Revision: https://reviews.freebsd.org/D27244	2020-11-29 19:43:33 +00:00
Konstantin Belousov	cd85379104	Make MAXPHYS tunable. Bump MAXPHYS to 1M. Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allow several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initialize maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embeded structures, get private MAXPHYS-like constant; their convertion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, where either submitted by, or based on changes by mav. Suggested by: mav () Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225	2020-11-28 12:12:51 +00:00
Mark Johnston	a28c28e6ef	Remove NO_EVENTTIMERS support The arm configs that required it have been removed from the tree. Removing this option makes the callout code easier to read and discourages developers from adding new configs without eventtimer drivers. Reviewed by: ian, imp, mav Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27270	2020-11-19 02:50:48 +00:00
Conrad Meyer	a3c41f8bfb	Add "Fenestras X" alternative /dev/random implementation Fortuna remains the default; no functional change to GENERIC. Big picture: - Scalable entropy generation with per-CPU, buffered local generators. - "Push" system for reseeding child generators when root PRNG is reseeded. (Design can be extended to arc4random(9) and userspace generators.) - Similar entropy pooling system to Fortuna, but starts with a single pool to quickly bootstrap as much entropy as possible early on. - Reseeding from pooled entropy based on time schedule. The time interval starts small and grows exponentially until reaching a cap. Again, the goal is to have the RNG state depend on as much entropy as possible quickly, but still periodically incorporate new entropy for the same reasons as Fortuna. Notable design choices in this implementation that differ from those specified in the whitepaper: - Blake2B instead of SHA-2 512 for entropy pooling - Chacha20 instead of AES-CTR DRBG - Initial seeding. We support more platforms and not all of them use loader(8). So we have to grab the initial entropy sources in kernel mode instead, as much as possible. Fortuna didn't have any mechanism for this aside from the special case of loader-provided previous-boot entropy, so most of these sources remain TODO after this commit. Reviewed by: markm Approved by: csprng (markm) Differential Revision: https://reviews.freebsd.org/D22837	2020-10-10 21:45:59 +00:00
Alexander V. Chernikov	fedeb08b6a	Introduce scalable route multipath. This change is based on the nexthop objects landed in D24232. The change introduces the concept of nexthop groups. Each group contains the collection of nexthops with their relative weights and a dataplane-optimized structure to enable efficient nexthop selection. Simular to the nexthops, nexthop groups are immutable. Dataplane part gets compiled during group creation and is basically an array of nexthop pointers, compiled w.r.t their weights. With this change, `rt_nhop` field of `struct rtentry` contains either nexthop or nexthop group. They are distinguished by the presense of NHF_MULTIPATH flag. All dataplane lookup functions returns pointer to the nexthop object, leaving nexhop groups details inside routing subsystem. User-visible changes: The change is intended to be backward-compatible: all non-mpath operations should work as before with ROUTE_MPATH and net.route.multipath=1. All routes now comes with weight, default weight is 1, maximum is 2^24-1. Current maximum multipath group width is statically set to 64. This will become sysctl-tunable in the followup changes. Using functionality: * Recompile kernel with ROUTE_MPATH * set net.route.multipath to 1 route add -6 2001:db8::/32 2001:db8::2 -weight 10 route add -6 2001:db8::/32 2001:db8::3 -weight 20 netstat -6On Nexthop groups data Internet6: GrpIdx NhIdx Weight Slots Gateway Netif Refcnt 1 ------- ------- ------- --------------------------------------- --------- 1 13 10 1 2001:db8::2 vlan2 14 20 2 2001:db8::3 vlan2 Next steps: * Land outbound hashing for locally-originated routes ( D26523 ). * Fix net/bird multipath (net/frr seems to work fine) * Add ROUTE_MPATH to GENERIC * Set net.route.multipath=1 by default Tested by: olivier Reviewed by: glebius Relnotes: yes Differential Revision: https://reviews.freebsd.org/D26449	2020-10-03 10:47:17 +00:00
Ruslan Bukin	6186bfbd18	Rename kernel option ACPI_DMAR to IOMMU. This is mostly needed for a common arm64/amd64 iommu code. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D26587	2020-09-29 20:29:07 +00:00
Conrad Meyer	64612d4e44	geom(4): Kill GEOM_PART_EBR_COMPAT option Take advantage of Warner's nice new real GEOM aliasing system and use it for aliased partition names that actually work. Our canonical EBR partition name is the weird, not-default-on-x86-prior-to- this-revision "da1p4+00001234." However, if compatibility mode (tunable kern.geom.part.ebr.compat_aliases) is enabled (1, default), we continue to provide the alias names like "da1p5" in addition to the weird canonical names. Naming partition providers was just one aspect of the COMPAT knob; in addition it limited mutability, in part because it did not preserve existing EBR header content aside from that of LBA 0. This change saves the EBR header for LBA 0, as well as for every EBR partition encountered. That way, when we write out the EBR partition table on modification, we can restore any bootloader or other metadata in both LBA0 (the first data-containing EBR may start after 0) as well as every logical EBR we read from the disk, and only update the geometry metadata and linked list pointers that describe the actual partitioning. (This change does not add support for the 'bootcode' verb to EBR.) PR: 232463 Reported by: Manish Jain <bourne.identity AT hotmail.com> Discussed with: ae (no objection) Relnotes: maybe Differential Revision: https://reviews.freebsd.org/D24939	2020-07-01 02:16:36 +00:00
Mark Johnston	95033af923	Add the SCTP_SUPPORT kernel option. This is in preparation for enabling a loadable SCTP stack. Analogous to IPSEC/IPSEC_SUPPORT, the SCTP_SUPPORT kernel option must be configured in order to support a loadable SCTP implementation. Discussed with: tuexen MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2020-06-18 19:32:34 +00:00

1 2 3 4 5 ...

1044 Commits