freebsd-skq

Author	SHA1	Message	Date
shurd	99c641b97c	Roll up iflib commits from github. This pulls in most of the work done by Matt Macy as well as other changes which he has accepted via pull request to his github repo at https://github.com/mattmacy/networking/ This should bring -CURRENT and the github repo into close enough sync to allow small feature branches rather than a large chain of interdependent patches being developed out of tree. The rest of the synchronization should be able to be completed on github by splitting the remaining changes that are not yet ready into short feature branches for later review as smaller commits. Here is a summary of changes included in this patch: 1) More checks when INVARIANTS are enabled for earlier problem detection 2) Group Task Queue cleanups - Fix use of duplicate shortdesc for gtaskqueue malloc type. Some interfaces such as memguard(9) use the short description to identify malloc types, so duplicates should be avoided. 3) Allow gtaskqueues to use ithreads in addition to taskqueues - In some cases, this can improve performance 4) Better logging when taskqgroup_attach*() fails to set interrupt affinity. 5) Do not start gtaskqueues until they're needed 6) Have mp_ring enqueue function enter the ABDICATED rather than BUSY state. This moves the TX to the gtaskq and allows processing to continue faster as well as make TX batching more likely. 7) Add an ift_txd_errata function to struct if_txrx. This allows drivers to inspect/modify mbufs before transmission. 8) Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need checksums zeroed for checksum offload to work. This avoids modifying packet data in the TX path when possible. 9) Use ithreads for iflib I/O instead of taskqueues 10) Clean up ioctl and support async ioctl functions 11) Prefetch two cache lines from each mbuf instead of one up to 128B. We often need to parse packet header info beyond 64B. 12) Fix potential memory corruption due to fence post error in bit_nclear() usage. 
13) Improved hang detection and handling 14) If the packet is smaller than MTU, disable the TSO flags. This avoids extra packet parsing when not needed. 15) Move TCP header parsing inside the IS_TSO?() test. This avoids extra packet parsing when not needed. 16) Pass chains of mbufs that are not consumed by lro to if_input() rather than calling if_input() for each mbuf. 17) Re-arrange packet header loads to get as much work as possible done before a cache stall. 18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/ IFDI_DETACH(); 19) Attempt to distribute RX/TX tasks across cores more sensibly, especially when RX and TX share an interrupt. RX will attempt to take the first threads on a core, and TX will attempt to take successive threads. 20) Allow iflib_softirq_alloc_generic() to request affinity to the same cpus an interrupt has affinity with. This allows TX queues to ensure they are serviced by the socket the device is on. 21) Add new iflib sysctls to net.iflib: - timer_int - interval at which to run per-queue timers in ticks - force_busdma 22) Add new per-device iflib sysctls to dev.X.Y.iflib - rx_budget allows tuning the batch size on the RX path - watchdog_events Count of watchdog events seen since load 23) Fix error where netmap_rxq_init() could get called before IFDI_INIT() 24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY when waiting for firmware - After interrupts are enabled, convert all waits to sleeps - Eliminates e1000 software/firmware synchronization busy waits after startup 25) e1000: Remove special case for budget=1 in em_txrx.c - Premature optimization which may actually be incorrect with multi-segment packets 26) e1000: Split out TX interrupt rather than share an interrupt for RX and TX. 
- Allows better performance by keeping RX and TX paths separate 27) e1000: Separate igb from em code where suitable Much easier to understand separate functions and "if (is_igb)" than previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))" #blamebruno Reviewed by: sbruno Approved by: sbruno (mentor) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12235	2017-09-13 01:18:42 +00:00
mjoras	ad8bdad7b2	Allow vlan interfaces to rx through netmap(4). Normally after receiving a packet, a vlan(4) interface sends the packet back through its parent interface's rx routine so that it can be processed as an untagged frame. It does this by using the parent's ifp->if_input. This is incompatible with netmap(4), which replaces the vlan(4) interface's if_input with a netmap(4) hook. Fix this by using the vlan(4) interface's ifp instead of the parent's directly. Reported by: Harry Schmalzbauer <freebsd@omnilan.de> Reviewed by: rstone Approved by: rstone (mentor) MFC after: 3 days Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12191	2017-09-13 00:25:09 +00:00
sbruno	2b52ff4766	Leave the Cavium Liquid IO driver exist in files, not files.amd64 Submitted by: imp	2017-09-12 23:58:38 +00:00
imp	7ad3bd1293	cam iosched: Limit the quanta default to hz if it's below 200 The cam_iosched_ticker() can't be scheduled more than once per tick. Some limiters depend on quanta matching the number of calls per second to enforce the proper limits. Limit the quanta to no faster than 1 per clock tick. This fixes some features when running in VMs where the default HZ is 100. PR: 221953 Obtained from: ElectroBSD Differential Revision: https://reviews.freebsd.org/D12337 Submitted by: Fabian Keil	2017-09-12 23:46:33 +00:00
sbruno	069127d189	Do not try to build the Cavium Liquidio driver on all architectures. For now, limit to amd64 only.	2017-09-12 23:42:52 +00:00
sbruno	5543e587c7	The diff is the initial submission of Cavium Liquidio 2350/2360 10/25G Intelligent NIC driver. The submission consists of firmware binary file and driver sources. Submitted by: pkanneganti@cavium.com (Prasad V Kanneganti) Relnotes: Yes Sponsored by: Cavium Networks Differential Revision: https://reviews.freebsd.org/D11927	2017-09-12 23:36:58 +00:00
tuexen	7af7d26d49	Export the UDP encapsulation port and the path state.	2017-09-12 21:08:50 +00:00
asomers	c08529fe2a	Remove spaces from CTL devices' default serial numbers It's awkward to have spaces in CAM device serial numbers. That leads to such things as device nodes named "/dev/diskid/MYSERIAL%20%20%201". Better to replace the spaces with "0"s. This change only affects the default serial numbers for users who don't provide their own. Reviewed by: ken, mav MFC after: Never Relnotes: Yes Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D12263	2017-09-12 19:36:24 +00:00
jhb	3783785a2f	Handle relocations for newer non-PIC MIPS ABI. Newer binutils supports extensions to the MIPS ABI for non-PIC code that is used when compiling O32 binaries with clang 5 (but not used for N64 oddly enough). These extensions require support for R_MIPS_COPY relocations as well as a second PLT GOT using R_MIPS_JUMP_SLOT relocations. For R_MIPS_COPY, use the same approach as on other architectures where fixups are deferred to the MD do_copy_relocations. The additional PLT GOT for jump slots is located in a .got.plt section which is identified by a DT_MIPS_PLTGOT dynamic entry. This GOT also requires fixups for the first two GOT entries just as the normal GOT. However, the entry point for this second GOT uses a different calling convention. Rather than passing an offset into the GOT, it passes an offset into the .rel.plt section. This requires a second entry point (_rtld_pltbind_start) which calls the normal _rtld_bind() rather than _mips_rtld_bind(). This also means providing a real version of reloc_jmpslot() which is used by _rtld_bind(). In addition, add real implementations of reloc_plt() and reloc_jmpslots() which walk .rel.plt handling R_MIPS_JUMP_SLOT relocations. Reviewed by: kib Sponsored by: DARPA / AFRL Differential Revision: https://reviews.freebsd.org/D12326	2017-09-12 17:46:30 +00:00
tsoome	5210116662	libefi: efipart_open should check the status from disk_open In case of error from disk_open(), we should clean up properly. Reviewed by: allanjude, imp Differential Revision: https://reviews.freebsd.org/D12340	2017-09-12 14:18:45 +00:00
tsoome	2abc976028	loader should support large_dnode The zfsonlinux feature large_dnode is not yet supported by the loader. Reviewed by: avg, allanjude Differential Revision: https://reviews.freebsd.org/D12288	2017-09-12 13:45:04 +00:00
tuexen	56b9f343a0	Add support to print the TCP stack being used. Sponsored by: Netflix, Inc.	2017-09-12 13:34:43 +00:00
avg	cd9a347ef1	fix a fallout from the ZTOV tightening, r323479 MFC after: 13 days X-MFC with: r323479	2017-09-12 13:21:14 +00:00
cognet	5d0707d532	Some devices come with the same name as TI devices, so we can't rely on the "probe" method of those drivers to mean we're on a TI SoC. Introduce a new function, ti_soc_is_supported(), and use it to be sure we're really a TI system. PR: 222250	2017-09-12 10:43:02 +00:00
avg	3be8b9ac62	zfsctl_snapdir_lookup should be able to handle an uncovered vnode The uncovered vnode is possible because there is no guarantee that its hold count would go to zero (and it would be inactivated and reclaimed) immediately after a covering filesystem is unmounted. So, such a vnode should be expected and it is possible to re-use it without any trouble. MFC after: 3 weeks Sponsored by: Panzura	2017-09-12 06:06:58 +00:00
avg	aaa65e29f5	zfs_ctldir: remove obsolete / bogus ARGSUSED lint directives None of the tagged functions had unused parameters. MFC after: 1 week	2017-09-12 06:05:30 +00:00
avg	e212e901f3	zfsvfs_hold: assert that the busied filesystem can not be unmounted This is a FreeBSD specific feature. MFC after: 3 weeks Sponsored by: Panzura	2017-09-12 06:04:50 +00:00
avg	5fca2ddf8b	zfs_get_vfs: reference a requested filesystem instead of vfs_busy-ing it The only consumer of zfs_get_vfs, zfs_unmount_snap, does not need the filesystem to be busy, it just needs a reference that it can pass to dounmount. Also, previously the code was racy as it unbusied the filesystem before taking a reference on it. Now the code should be simpler and safer. MFC after: 2 weeks Sponsored by: Panzura	2017-09-12 06:04:01 +00:00
avg	ab9c296692	zfs: tighten debug versions of ZTOV and VTOZ MFC after: 2 weeks Sponsored by: Panzura	2017-09-12 06:02:21 +00:00
cy	6d4e7c8ca3	Improve the wording of a comment describing why EAGAIN is the error code. MFC after: 3 days	2017-09-12 04:21:04 +00:00
ian	5623dee899	Add a default implementation that returns ENODEV for start, repeat_start, stop, read, and write methods. Some controllers don't implement these individual operations and have only a transfer method. In that case, we should return an indication that the device is present but doesn't support the method, as opposed to the kobj default error ENXIO which makes it look like the whole device is missing. Userland tools such as i2c(8) can use the differing return values to switch between the two different i2c IO mechanisms.	2017-09-11 23:47:49 +00:00
cem	94488dae4e	MCA: Rename AMD MISC bits/masks They apply to all AMD MCAi_MISC0 registers, not just MCA4 (NB). No functional change. Sponsored by: Dell EMC Isilon	2017-09-11 20:42:07 +00:00
cem	8fed2c5f64	x86 MCA: Extract CMCI support predicate into function On AMD, the MCG_CAP feature bit is reserved -- not explicitly zero. Do not use it to determine CMCI support. Reviewed by: avg, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12320	2017-09-11 20:41:25 +00:00
mw	b17f3d6ae3	Restore alphabetical order in UART Makefile Commit r323359 introduced new Marvell UART controller driver and by mistake it broke correct order in the Makefile. Fix this. Reported by: emaste	2017-09-11 19:07:53 +00:00
kibab	a4e7c08871	Add MMCCAM-enabled kernel config for arm64 Approved by: imp (mentor) Differential Revision: https://reviews.freebsd.org/D12114	2017-09-11 19:07:42 +00:00
mw	87549eced1	Expand Marvell NIC description in arm64 GENERIC config Suggested by: emaste	2017-09-11 19:00:53 +00:00
kib	fa64065f8a	Fix ioapic acpi id matching on PCI attach and rid calculation. Sponsored by: The FreeBSD Foundation MFC after: 11 days	2017-09-11 18:29:09 +00:00
cem	42d7ded221	Decode new AMD SVM feature bits on family 17h Sponsored by: Dell EMC Isilon	2017-09-11 18:11:53 +00:00
emaste	2961888e6f	boot1: remove BOOT1_MAXSIZE default value This Makefile relies on Makefile.fat providing the correct value for BOOT1_MAXSIZE and BOOT1_OFFSET. Since BOOT1_OFFSET had no default value here the build would already fail if Makefile.fat did not provide correct values. Sponsored by: The FreeBSD Foundation	2017-09-11 14:33:04 +00:00
avg	26d4f92783	MFV r323111: 8569 problem with inline functions in abd.h illumos/illumos-gate@37e84ab74e `37e84ab74e` https://www.illumos.org/issues/8569 C [C99] has peculiar rules for inline functions that are different from the C++ rules. Unlike C++ where inline is "fire and forget", in C a programmer must pay attention to the function's storage class / visibility. The main problem is with the case where a compiler decides to not inline a call to the function declared as inline. Some relevant links: - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15831.html - http://www.drdobbs.com/the-new-c-inline-functions/184401540 The summary is that either the inline functions should be declared 'static inline' or one of the compilation units (.c files) must provide a callable externally visible function definition. In the former case, the compiler would automatically create a local non-inlined function instance in every compilation unit where it's needed. In the latter case the single external definition is used to satisfy any non-inlined calls in all compilation units. As things stand right now, we can get an undefined reference error under certain combinations of compilers and compiler options. For example, this is what I get on FreeBSD when compiling with clang 4.0.0 and -O1: In function `abd_free': /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:385: undefined reference to `abd_is_linear' Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 1 week	2017-09-11 12:15:49 +00:00
avg	88adc9fbd7	Revert r322601, Mark ZFS ABD inline functions static An alternative fix is to be merged from illumos shortly.	2017-09-11 12:08:20 +00:00
avg	1393620686	MFV r323110: 8558 lwp_create() returns EAGAIN on system with more than 80K ZFS filesystems illumos/illumos-gate@216d7723a1 `216d7723a1` https://www.illumos.org/issues/8558 On a system with more than 80K ZFS filesystems, we've seen cases where lwp_create() will start to fail by returning EAGAIN. The problem being, for each of those 80K ZFS filesystems, a taskq will be created for each dataset as part of the ZIL for each dataset. For each of these taskq's, a kernel thread will be created which results in 24KB being allocated for each thread. With enough of these 24KB allocations, we eventually exhaust the memory region set aside for these allocations. Currently, segkpsize is set to a value of 2GB, which means we can only support about 80K filesystems; 2GB / 24KB = ~80K. The lwp_create() failure comes into play due to the fact that LWP creation also allocates 24KB from this same region of memory. Thus, if we've exhausted this region of memory due to the number of ZIL taskq's, there won't be any memory available to allow the call to lwp_create() to succeed. FreeBSD note: I haven't created sysctl-s for the new ZIL clean parameters. Let's add them if anyone requires to tune them. Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Prakash Surya <prakash.surya@delphix.com> MFC after: 3 weeks	2017-09-11 11:31:43 +00:00
mw	355f88634d	Improve HW type checking in mv_ehci driver This patch adds hwtype parameter which keeps information about hardware revision of Marvell EHCI controller. It allows to replace multiple calls to ofw_bus_is_compatible with comparing hwtype value during driver initialization. Submitted by: Patryk Duda <pdk@semihalf.com> Suggested by: ian Obtained from: Semihalf Sponsored by: Semihalf	2017-09-11 10:41:42 +00:00
tsoome	15abbf36c6	r323389 breaks the kernel build when WITHOUT_ZFS is defined in src.conf Need to add #ifdef EFI_ZFS_BOOT guard into efi/loader/main.c PR: 222215 Reported by: Sylvain Garrigues	2017-09-11 07:38:53 +00:00
scottl	e90eee487b	Add infrastructure for allocating multiple MSI-X interrupts. Also add more fine-tuned controls for allocating requests and replies. Sponsored by: Netflix	2017-09-11 01:51:27 +00:00
emaste	4923fe7ccc	boot1 generate-fat: generate all templates at once In advance of other changes to the fat template generation process, have generate-fat.sh create all template files at the same time so that they cannot get out of sync. Also correct a longstanding bug where BOOT1_OFFSET was overwritten on each invocation. A previous version of this patch stored a per-arch offset (e.g. BOOT1_arm64_OFFSET) but that was deemed unnecessary. Instead just hardcode the known offset that applies to all archs (0x2d) and fail if the offset happens to be different. Ongoing work (using newfs_msdos in bsdinstall and adding msdosfs support to makefs) will eventually allow us to do away with this fat template hack altogether, but in the near term we have a few improvements that will build on this. Reviewed by: allanjude, imp, Eric McCorkle MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D10931	2017-09-11 00:37:00 +00:00
emaste	87b953ecdb	newvers.sh: speed up failing git-svn revision search In the case of running newvers.sh on a git tree w/o git-svn-id notes we previously piped the entire 'git log' to grep. Add --grep to the log invocation to avoid processing log entries of no interest. This saves about 2-3 seconds of newvers.sh run time on my SSD laptop. Later changes will bring further speedups. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2017-09-11 00:14:04 +00:00
emaste	2c5b42c776	newvers.sh: accept "git-svn-id:" at the start of a line only This prevents incorrect subversion revision detection when "git svn" is not being used to get the sources but git is available. Previously old subversion revisions included in commit messages were favoured over the more recent and correct revisions in git notes. For example `cf1f355747` represents r315395 but was treated as r313908 which is referenced in the commit message. Commits following r315395/cf1f35574722 but before another commit with a git-svn-id reference in the commit message would be treated as r313908 as well. Patch from PR updated to accommodate the initial four space indent in `git log` output. PR: 221848 Submitted by: Fabian Keil Obtained from: ElectroBSD MFC after: 2 weeks	2017-09-10 19:12:01 +00:00
mjg	7495dfa6d7	Move vmmeter atomic counters into dedicated cache lines Prior to the change they were subject to extreme false sharing. In particular this change shaves about 3 seconds real time of -j 80 buildkernel. Reviewed by: alc, markj Differential Revision: https://reviews.freebsd.org/D12281	2017-09-10 19:00:38 +00:00
ian	39e8e58c20	Add gpio methods to read/write/configure up to 32 pins simultaneously. Sometimes it is necessary to combine several gpio pins into an ad-hoc bus and manipulate the pins as a group. In such cases manipulating the pins individually is not an option, because the value on the "bus" assumes potentially-invalid intermediate values as each pin is changed in turn. Note that the "bus" may be something as simple as a bi-color LED where changing colors requires changing both gpio pins at once, or something as complex as a bitbanged multiplexed address/data bus connected to a microcontroller. In addition to the absolute requirement of simultaneously changing the output values of driven pins, a desirable feature of these new methods is to provide a higher-performance mechanism for reading and writing multiple pins, especially from userland where pin-at-a-time access incurs a noticeable syscall time penalty. These new interfaces are NOT intended to abstract away all the ugly details of how gpio is implemented on any given platform. In fact, to use these properly you absolutely must know something about how the gpio hardware is organized. Typically there are "banks" of gpio pins controlled by registers which group several pins together. A bank may be as small as 2 pins or as big as "all the pins on the device, hundreds of them." In the latter case, a driver might support this interface by allowing access to any 32 adjacent pins within the overall collection. Or, more likely, any 32 adjacent pins starting at any multiple of 32. Whatever the hardware restrictions may be, you would need to understand them to use this interface. In addition to defining the interfaces, two example implementations are included here, for imx5/6, and allwinner. These represent the two primary types of gpio hardware drivers. imx6 has multiple gpio devices, each implementing a single bank of 32 pins. 
Allwinner implements a single large gpio number space from 1-n pins, and the driver internally translates that linear number space to a bank+pin scheme based on how the pins are grouped into control registers. The allwinner implementation imposes the restriction that the first_pin argument to the new functions must always be pin 0 of a bank. Differential Revision: https://reviews.freebsd.org/D11810	2017-09-10 18:08:25 +00:00
alc	e7430120e9	To analyze the allocation of swap blocks by blist functions, add a method for analyzing the radix tree structures and reporting on the number, and sizes, of maximal intervals of free blocks. The report includes the number of maximal intervals, and also the number of them in each of several size ranges, from small (size 1, or 3 to 4) to large (28657 to 46367) with size boundaries defined by Fibonacci numbers. The report is written in the test tool with the 's' command, or in a running kernel by sysctl. The analysis of the radix tree frequently computes the position of the lone bit set in a u_daddr_t, a computation that also appears in leaf allocation. That computation has been moved into a function of its own, and optimized for cases where an inlined machine instruction can replace the usual binary search. Submitted by: Doug Moore <dougm@rice.edu> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11906	2017-09-10 17:46:03 +00:00
des	64e97ceba3	If the user tries to set kern.randompid to 1 (which is meaningless), set it to a random value between 100 and 1123, rather than 0 as before. Submitted by: Marie Helene Kvello-Aune <marieheleneka@gmail.com> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D5336	2017-09-10 15:01:29 +00:00
tsoome	5ce667b569	loader.efi: chain loader should provide proper device handle Since the efipart rewrite, the chain command was looking for device handle using interface applicable only for net devices. Disk partitions and zfs pools need their own approach to find the proper handle. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D12287	2017-09-10 13:53:42 +00:00
kib	452c04519c	Fix typo, TC0->TCO. Submitted by: jhb MFC after: 1 week	2017-09-10 13:21:54 +00:00
kib	861f0ba4d4	Add definitions of (new) bits for TCO registers from the Lewisburg/Sunrise Point documentation. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-09-10 12:10:27 +00:00
kib	f1910b8d27	Style: tab after #define. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-09-10 11:57:02 +00:00
mjg	4161a7cc9b	namecache: clean up struct namecache_ts handling namecache_ts differs from mere namecache by few fields placed mid struct. The access to the last element (the name) is thus special-cased. The standard solution is to put new fields at the very beginning and embed the original struct. The pointer shuffled around points to the embedded part. If needed, access to new fields can be gained through __containerof. MFC after: 1 week	2017-09-10 11:17:32 +00:00
scottl	ce44045fac	Fix intrhook release in MPR and MPS for EARLY_AP_STARTUP. Reported by: Limelight Sponsored by: Netflix	2017-09-10 07:10:40 +00:00
scottl	a2aed52cc9	More code refactoring in preparation for enabling multiqueue. Sponsored by: Netflix	2017-09-10 04:09:18 +00:00
scottl	8fef16b01d	Convert some in-line printing of diagnostic into tables. Sponsored by: Netflix	2017-09-09 22:02:36 +00:00
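
Note on 4161a7cc9b (namecache_ts): the layout that commit message describes, new fields first with the original struct embedded and a container-of lookup to get back to the outer struct, can be sketched roughly as below. Every name here (demo_nc, demo_nc_ts, DEMO_CONTAINEROF and the helper functions) is a hypothetical stand-in invented for illustration; none of it is taken from the kernel's namecache code, and the macro is a local equivalent built on standard offsetof() rather than the kernel's __containerof.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical base entry; the name stays the last member. */
struct demo_nc {
	unsigned char	nc_nlen;
	char		nc_name[16];
};

/* Hypothetical timestamped variant: new field first, base struct embedded last. */
struct demo_nc_ts {
	long		nc_time;
	struct demo_nc	nc_nc;
};

/* Local container-of equivalent built on standard offsetof(). */
#define	DEMO_CONTAINEROF(ptr, type, member) \
	((type *)(void *)((char *)(ptr) - offsetof(type, member)))

/* Name access works on the embedded part, so it needs no special case. */
static size_t
demo_name_len(const struct demo_nc *ncp)
{
	assert(ncp != NULL);
	return (strlen(ncp->nc_name));
}

/*
 * Timestamp access steps back from the embedded part to the outer struct.
 * The caller must know that ncp really is embedded in a demo_nc_ts.
 */
static long
demo_timestamp(struct demo_nc *ncp)
{
	assert(ncp != NULL);
	return (DEMO_CONTAINEROF(ncp, struct demo_nc_ts, nc_nc)->nc_time);
}
```

With this layout the pointer that gets passed around always refers to the embedded demo_nc, so code that only cares about the name never has to know which variant it was handed.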

127653 Commits
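
Note on 26d4f92783 (inline functions in abd.h): the commit message names two ways to keep a non-inlined call from becoming an undefined reference. The sketch below illustrates only the first one, declaring the helper 'static inline'. The type, flag and function names are hypothetical and are not the ABD code; the log itself shows the earlier "static" change being reverted in 88adc9fbd7 in favour of the fix merged from illumos, so this is not a rendering of what was committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a buffer descriptor and its flag. */
struct demo_abd {
	unsigned int	flags;
};
#define	DEMO_ABD_FLAG_LINEAR	0x1u

/*
 * 'static inline': if the compiler decides not to inline a call, it emits
 * a local copy in this compilation unit, so the linker never has to find
 * an external definition.
 */
static inline int
demo_abd_is_linear(const struct demo_abd *abd)
{
	assert(abd != NULL);
	return ((abd->flags & DEMO_ABD_FLAG_LINEAR) != 0);
}

/* An ordinary external function that calls the helper. */
int
demo_abd_check(const struct demo_abd *abd)
{
	return (demo_abd_is_linear(abd) ? 1 : 0);
}
```

The second option in the commit message, keeping a plain 'inline' definition in the header and providing one externally visible definition in a single .c file, avoids the per-unit copies at the cost of having to maintain that extra definition.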