freebsd-skq

Author	SHA1	Message	Date
Vladimir Kondratyev	461120b834	Linux epoll: Check both read and write kqueue events existence in EPOLL_CTL_ADD Linux epoll EPOLL_CTL_ADD op handler should always check registration of both EVFILT_READ and EVFILT_WRITE kevents to decide if supplied file descriptor fd is already registered with epoll instance. Reviewed by: emaste MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D22515	2019-11-24 20:44:14 +00:00
Vladimir Kondratyev	896a4c279d	Linux epoll: Don't deregister file descriptor after EPOLLONESHOT is fired Linux epoll does not remove descriptor after one-shot event has been triggered. Set EV_DISPATCH kqueue flag rather than EV_ONESHOT to get the same behavior. Required by Linux Steam client. PR: 240590 Reported by: Alex S <iwtcex@gmail.com> Reviewed by: emaste, imp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D22513	2019-11-24 20:41:47 +00:00
Konstantin Belousov	3236244936	Ignore object->handle for OBJ_ANON objects. Note that the change in vm_object_collapse() is arguably a correctness fix. We must not collapse into content-identity carrying objects. Reviewed by: jeff Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D22467	2019-11-24 19:18:12 +00:00
Eitan Adler	2e3fa849dd	bsd-family-tree: correct macOS release date Reported by: Herbert J. Skuhra <herbert@gojira.at> Reported by: Maxim Konovalov <maxim.konovalov@gmail.com>	2019-11-24 19:16:57 +00:00
Konstantin Belousov	b631c36f0d	Record part of the owner struct thread pointer into busy_lock. Record as many bits from curthread into busy_lock as fit. Low bits for struct thread * representation are zero due to struct and zone alignment, and they leave space for busy flags (perhaps except statically allocated thread0). Upper bits are not very interesting for assert, and in most practical situations recorded value should allow to manually identify the owner with certainty. Assert that unbusy is performed by the owner, except in a few places where unbusy is done in io completion handler. For this case, add _unchecked variants of asserts and unbusy primitives. Reviewed by: markj (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D22298	2019-11-24 19:12:23 +00:00
Konstantin Belousov	dbe257d253	tmpfs: resolve deadlock between rename and unmount. Top-level kern_renameat() increases the writecount on the mount point, which, together with tmpfs unmount suspending the mount, already ensures that unmount cannot proceed while rename unlocks and relocks all operated vnodes. Remove vfs_busy() call from tmpfs_rename() which was done while holding a vnode lock, creating the deadlock. The only intent of the busy operation seems to be the prevention of unmount, which is already ensured. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-24 19:06:38 +00:00
Konstantin Belousov	13189065cb	amd64: assert that EARLY_COUNTER does not corrupt memory. Reviewed by: imp Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22514	2019-11-24 19:02:13 +00:00
Navdeep Parhar	515a40d5d9	cxgbe(4): sysctl to reset the temperature/voltage sensor. # sysctl dev.<nexus>.<inst>.reset_sensor=1 # sysctl dev.t6nex.0.reset_sensor=1 MFC after: 1 week Sponsored by: Chelsio Communications	2019-11-24 16:40:54 +00:00
Warner Losh	dfdbb32093	Don't need giant for these drivers dev nodes. Also, Giant isn't required to busy / unbusy a device, so drop that too while I'm here. It's not done elsewhere in the tree and in the future will likely be handled by a node lock to ensure consistency. Leave Giant in place for attach and removing children, as that's actually still needed, even if imperfect. Remove stale comment about contigmalloc taking Giant and calling w/o the lock held. Neither of these is still true.	2019-11-24 15:37:19 +00:00
Warner Losh	96b506a57c	Hoist locking giant back up into the ioctl handler Move the locking back into the ioctl handler. This "fixes" the race where we have a hot plug event just after the dropping of Giant in pci_find_dbsf, assuming the driver doesn't then call anything that drops and picks up Giant again... It's a little safer since don't think it doesn't, but we lack the tools to know for sure.	2019-11-24 15:37:14 +00:00
Warner Losh	57aa9163fd	Fix leak in state machine for commands. When we get a device departed message from the firmware, we send a TARGET_REST to the device to let the firmware know we're done and as part of the recovery process. This will abort all the commands. While the documentation says the IOC is responsible for writing the completion message for all the commands pending with an aborted status, we sometimes have queued commands for the target that haven't been completed so are in the INQUEUE state. So, when we later complete the pending CCB as aborted, these commands are freed and we hit the "state not busy" panic. Elsewhere where we dequeue commands, we move the state to BUSY from INQUEUE. Do that here as well. In talking to Ken, Scott and Justin, they recommended a series of tests to see if this is 100% safe. Those tests are ongoing, but preliminary tests suggest this is safe as we see no duplicate completions when we hit this case at work. We have a machine that has a dodgy power supply which usually doesn't apply power to a few drives, but sometimes does when the machine is under heavy load so we get a rash of the connect / disconnect messages over half an hour. Without this change, we'd see state not busy panic. With this change, the drives just annoyingly come and go without affecting the rest of the machine, but without a complete error injection test suite, it's hard to know if all edge cases are now covered or not. Discussed with: scottl, ken, gibbs	2019-11-24 15:24:05 +00:00
Li-Wen Hsu	b9e5cb0580	Fix gcc build We have -Werror=strict-overflow so gcc complains: In file included from /tmp/obj/workspace/src/amd64.amd64/tmp/usr/include/bitstring.h:36:0, from /workspace/src/tests/sys/sys/bitstring_test.c:34: /workspace/src/tests/sys/sys/bitstring_test.c: In function 'bit_ffc_at_test': /workspace/src/sys/sys/bitstring.h:239:5: error: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Werror=strict-overflow] if (_start >= _nbits) { ^ Disable assuming overflow of signed integer will never happen by specifying -fno-strict-overflow Sponsored by: The FreeBSD Foundation	2019-11-24 15:03:35 +00:00
Kristof Provost	492f3a312a	pf: Add endline to all DPFPRINTF() DPFPRINTF() doesn't automatically add an endline, so be consistent and always add it.	2019-11-24 13:53:36 +00:00
Eitan Adler	71fb754d11	bsd-family-tree: add several new entries Reviewed by: imp, scottl Differential Revision: https://reviews.freebsd.org/D22529	2019-11-24 07:52:35 +00:00
Brandon Bergren	e58d379587	[PowerPC] Fix stack padding issue on ppc32. Four bytes of padding are needed in the regular powerpc case to bring the stack frame size up to a multiple of 16 bytes to meet ABI requirements. Fixes odd hangs I was encountering during testing.	2019-11-24 06:43:03 +00:00
Navdeep Parhar	e56d731b7d	cxgbe(4): Update the firmware interface header. This allows the driver to be updated for the next firmware without waiting for it to be released. MFC after: 2 weeks Sponsored by: Chelsio Communications	2019-11-24 05:37:28 +00:00
Justin Hibbits	7511645efa	rtld/powerpc: Fix _rtld_bind_start for powerpcspe Summary: We need to save off the full 64-bit register, not just the low 32 bits, of all registers getting saved off in _rtld_bind_start. Additionally, we need to save off the other SPE registers (SPEFSCR and accumulator), so that their program state is not affected by the PLT resolver. Reviewed by: bdragon Differential Revision: https://reviews.freebsd.org/D22520	2019-11-24 04:35:29 +00:00
Warner Losh	a921c2003f	Add a warning about Giant Locked devices Add a warning when a device registers with devfs and requests D_NEEDGIANT. The warning says the device will go away before 13.0. This is needed to flush out the devices in the tree that are still Giant locked. This warning, or some variant of it, should have gone into the tree a long time ago... The intention is to require all devices be converted to not use automatic giant in this way, or remove any such devices that remain that we don't have the hardware to test a conversion of. kbd so far is the only device that can't leave the tree, yet needs something sensible done to avoid the auto giant lock (even if it is just doing the wrapping itself). There may be others added to this list... Any discussions of this topic will take place on arch@.	2019-11-23 23:57:26 +00:00
Warner Losh	283a5a3796	We don't even need Giant here. It isn't protecting anything internal to geom, and nothing we call requires it to be held. It's left over from a time when the latter wasn't the case. Retire it. Reviewed in concept: scottl@	2019-11-23 23:44:00 +00:00
Warner Losh	dd615d09c4	Push Giant down one layer The /dev/pci device doesn't need GIANT, per se. However, one routine that it calls, pci_find_dbsf implicitly does. It walks a list that can change when PCI scans a new bus. With hotplug, this means we could have a race with that scanning. To prevent that, take out Giant around scanning the list. However, given that we have places in the tree that drop giant, if held when we call into them, the whole use of Giant to protect newbus may be less effective than we desire, so add a comment about why we're taking it out, and we'll address the issue when we lock newbus with something other than Giant.	2019-11-23 23:43:52 +00:00
Brandon Bergren	0ee420b608	[PowerPC] Fix typo in _ctx_start on ppc32 Theoretically, this was breaking the size calculation for the symbol. Noticed when doing a readthrough. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D22525	2019-11-23 23:41:21 +00:00
Brandon Bergren	a638bf2a76	[PowerPC] Use QEMU-compatible version of SPE accumulator save Switch from "evaddumiaaw 0,0" to "evmwumiaa 0,0,0" when persisting the accumulator. This has the benefit of actually being implemented in QEMU as it is the form Linux uses for the same task. Both instructions are functionally equivalent, as we are using them for their side effect of copying the accumulator to GPRs rather than for the actual math operation that they are performing. Reviewed by: jhibbits	2019-11-23 21:18:55 +00:00
Dimitry Andric	1bb8eb56ef	libclang_rt: enable on powerpc* Summary: Enable on powerpc64 and in lib/libclang_rt/Makefile change MACHINE_CPUARCH to MACHINE_ARCH because on powerpc64 MACHINE_ARCH==MACHINE_CPUARCH so the 32-bit library overwrites 64-bit library during installworld. This patch doesn't enable any other libclang_rt libraries because they need to be separately ported. I have verified that games/julius (which fails on powerpc64 elfv2 without this change because of no libclang_rt profiling library) builds. Test Plan: Ship it, test on powerpc and powerpcspe Submitted by: pkubaj Reviewed by: dim, jhibbits Differential Revision: https://reviews.freebsd.org/D22425 MFC after: 1 month X-MFC-With: r353358	2019-11-23 19:35:09 +00:00
Doug Moore	be252a414f	The error messages that indicate bugs in 'area' bitstring functions should identify accurately which function exhibited the bug. Reviewed by: asomers MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D22519	2019-11-23 17:22:36 +00:00
Kyle Evans	8922c2ca03	bcm2835_sdhci: fix non-INVARIANTS build sc is now only used to make sure we're not re-entering the data handling path erroneously. Reported by: Mark Millard	2019-11-23 13:39:47 +00:00
Kyle Evans	0227a14997	arm64/NOTES: add SOC_BRCM_BCM2838 This should have been done back when it was added, but it was not. It only really adds an extra entry for memory mapping bits in bcm2835_vcbus.c, so nothing too extensive yet.	2019-11-23 03:38:26 +00:00
Kyle Evans	d5f1d33c67	bcm2835_dma: rip out the "use_dma" flag, make it non-optional Now that it works for the Raspberry Pi 4, we can discontinue our workarounds that were put in place to at least get a bootable kernel for other testing.	2019-11-23 01:47:17 +00:00
Kyle Evans	d7399dfdba	bcm2835_sdhci: "fix" DMA on the RPi 4 According to the documentation I have, DREQ pacing should be required here. The DREQ# hasn't changed since the BCM2835. As soon as we attempt to setup DREQ, DMA stalls and there's no clear reason why as of yet. Setting this back to NONE seems to work just as well, though it's yet to be determined if this is a sustainable model in high-throughput scenarios.	2019-11-23 01:46:02 +00:00
Conrad Meyer	7993a104a1	Add explicit SI_SUB_EPOCH Add explicit SI_SUB_EPOCH, after SI_SUB_TASKQ and before SI_SUB_SMP (EARLY_AP_STARTUP). Rename existing "SI_SUB_TASKQ + 1" to SI_SUB_EPOCH. epoch(9) consumers cannot epoch_alloc() before SI_SUB_EPOCH:SI_ORDER_SECOND, but likely should allocate before SI_SUB_SMP. Prior to this change, consumers (well, epoch itself, and net/if.c) just open-coded the SI_SUB_TASKQ + 1 order to match epoch.c, but this was fragile. Reviewed by: mmacy Differential Revision: https://reviews.freebsd.org/D22503	2019-11-22 23:23:40 +00:00
Alexander Motin	bae3729be4	Do not retry long ready waits if previous gave nothing. I have some disks reporting "Logical unit is in process of becoming ready" for about half an hour before finally reporting failure. During that time CAM waits for readiness for ~2 minutes on each request, which makes system boot take a very long time. This change reduces wait times for the following requests to ~1 second if previously long wait for that device has timed out. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2019-11-22 21:31:59 +00:00
Conrad Meyer	b6db1cc710	random(4): De-export random_sources list The internal datastructures do not need to be visible outside of random_harvestq, and this helps ensure they are not misused. No functional change. Approved by: csprng(delphij, markm) Differential Revision: https://reviews.freebsd.org/D22485	2019-11-22 20:24:15 +00:00
Scott Long	02d4535d2d	Mark hpt27xx for removal in 13.0; all CAM drivers will be Giant-free by then. Relnotes: yes	2019-11-22 20:23:22 +00:00
Conrad Meyer	d7a23f9f6b	random(4): Use ordinary sysctl definitions There's no need to dynamically populate them; the SYSCTL_ macros take care of load/unload appropriately already (and random_harvestq is 'standard' and cannot be unloaded anyway). Approved by: csprng(delphij, markm) Differential Revision: https://reviews.freebsd.org/D22484	2019-11-22 20:22:29 +00:00
Dave Cottlehuber	130cfcf3fc	dhclient: support option 114, default-url ascii This will enable further automation of HTTP UEFI boot loader support by providing a specific option for providing the boot URL to FreeBSD. Documented in: https://www.iana.org/assignments/bootp-dhcp-parameters/bootp-dhcp-parameters.xhtml https://kb.isc.org/docs/isc-dhcp-44-manual-pages-dhcp-options https://tools.ietf.org/html/rfc3679 Approved by: emaste MFC after: 2 weeks Sponsored by: SkunkWerks, GmbH Differential Revision: https://reviews.freebsd.org/D22475	2019-11-22 20:22:16 +00:00
Conrad Meyer	f19de0a945	random(4): Abstract loader entropy injection Break random_harvestq_prime up into some logical subroutines. The goal is that it becomes easier to add other early entropy sources. While here, drop pre-12.0 compatibility logic. loader default configuration should preload the file as expected since 12.0. Approved by: csprng(delphij, markm) Differential Revision: https://reviews.freebsd.org/D22482	2019-11-22 20:20:37 +00:00
Conrad Meyer	92ebf15da5	random(4): Remove unused definitions Approved by: csprng(gordon, markm) Differential Revision: https://reviews.freebsd.org/D22481	2019-11-22 20:18:07 +00:00
Kyle Evans	ba78f78f44	bcm2835_vcbus: add the other rpi4 compat string The DTS I used initially had brcm,bcm2838; the new one uses brcm,bcm2711. Add that one as well.	2019-11-22 19:56:52 +00:00
Kyle Evans	5b0a8ee218	MMCCAM: defer release of ccb until we're done with it If we've found a device, we attempt to call xpt_action() on a ccb that's already been released. Simply defer release until after we're done with it. Reviewed by: imp, scottl MFC after: 1 week	2019-11-22 19:54:14 +00:00
Conrad Meyer	cb285f7c7c	random/ivy: Provide mechanism to read independent seed values from rdrand On x86 platforms with the intrinsic, rdrand is a deterministic bit generator (AES-CTR) seeded from an entropic source. On x86 platforms with rdseed, it is something closer to the upstream entropic source. (There is more nuance; a block diagram is provided in [1].) On devices with rdrand and without rdseed, there is no good intrinsic for accessing the good entropic source directly. However, the DRBG is guaranteed to reseed every 8 kB on these platforms. As a conservative option, on such hardware we can read an extra 7.99kB samples every time we want a sample from an independent seed. As one can imagine, this drastically slows the effective read rate of RDRAND (a factor of 1024 on amd64 and 2048 on ia32). Microbenchmarks on AMD Zen (has RDSEED) show an RDRAND rate of 25 MB/s and Intel Haswell (no RDSEED) show RDRAND of 170 MB/s. This would reduce the read rate on Haswell to ~170 kB/s (at 100% CPU). random(4)'s harvestq thread periodically "feeds" from pure sources in amounts of 128-1024 bytes. On Haswell, enabling this feature increases the CPU time of RDRAND in each "feed" from approximately 0.7-6 µs to 0.7-6 ms. Because there is some performance penalty to this more conservative option, a knob is provided to enable the change. The change does not affect platforms with RDSEED. [1]: https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide#inpage-nav-4-2 Approved by: csprng(delphij, markm) Differential Revision: https://reviews.freebsd.org/D22455	2019-11-22 19:30:31 +00:00
Alexander Motin	7e8baf37e0	Remove xpt_lock mutex. CAM does not require SIM locks for years, and obviously does not require it for completely virtual XPT SIM. MFC after: 2 weeks	2019-11-22 18:55:27 +00:00
Scott Long	8823960b8d	Schedule the trm(4) driver for removal. It relies on Giant and thus has required compat shims in CAM for 12 years. Relnotes: yes	2019-11-22 18:50:53 +00:00
Brooks Davis	edb0ec001e	Revert r354909: Make the warning for deprecated NO_ variables an error. An unexpectedly large number of ports define NO_MAN (and sometimes the long-dead NOMAN). I'll fix ports and then re-commit.	2019-11-22 18:41:09 +00:00
Alexander Motin	a4876fbfc3	Make CAM use root_mount_hold_token() to delay boot. Before this change CAM used config_intrhook_establish() for this purpose, but that approach does not allow to delay it again after releasing once. USB stack uses root_mount_hold() to delay boot until bus scan is complete. But once it is, CAM had no time to scan SCSI bus, registered by umass(4), if it had already done other scans and called config_intrhook_disestablish(). The new approach makes it work smoothly, assuming the USB device is found during the initial bus scan. Devices appearing on USB bus later may still require setting kern.cam.boot_delay, but hopefully those are a minority. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2019-11-22 18:39:51 +00:00
Scott Long	f0d6f5774a	Remove NEEDGIANT from the scsi_sg /dev node. It likely has not been needed for many years. Reported by: imp	2019-11-22 18:18:36 +00:00
Ravi Pokala	90e43b446d	Add and document options to allow rpc.lockd and rpc.statd to run in the foreground. This allows a separate process to monitor when and how those programs exit. That process can then restart them if needed. Submitted by: Alex Burlyga Reviewed by: bcr, imp MFC after: 1 week Sponsored by: Panasas Differential Revision: https://reviews.freebsd.org/D22474	2019-11-22 16:53:30 +00:00
Mark Johnston	9c770a27ce	Simplify vm_pageout_init_domain() and add a "big picture" comment. Stop subtracting 1024/200 from vmd_page_count/200. I cannot see how such precise accounting can make a difference on modern systems. Add some explanation of what the page daemon does and how it handles memory shortages. Reviewed by: dougm Discussed with: jeff, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22396	2019-11-22 16:31:43 +00:00
Mark Johnston	8fc2550837	Reclaim memory from UMA if the page daemon is struggling. Use the UMA reclaim thread to asynchronously drain all caches if there is a severe shortage in a domain. Otherwise we only trigger UMA reclamation every 10s even when the system has completely run out of memory. Stop entirely draining the caches when one domain falls below its min threshold. In some workloads it is normal for one NUMA domain to end up being nearly depleted by kernel memory allocations, for example for the ZFS ARC. The domainset iterators skip domains below the vmd_min_free threshold on the first iteration, so we should allow that mechanism to limit further depletion of the domain's free pages before taking the extreme step of calling uma_reclaim(UMA_RECLAIM_DRAIN_CPU). Discussed with: jeff MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22395	2019-11-22 16:31:30 +00:00
Mark Johnston	bf0d60af92	Update the checks in vm_page_zone_import(). - Remove the cnt == 1 check. UMA passes cnt == 1 when it has disabled per-CPU caching. In this case we might as well just allocate a single page and return it to the caller, since the caller is going to do exactly that anyway if the UMA cache allocation attempt fails. - Don't replenish caches if the domain is severely short on free pages. With large buckets we may otherwise quickly exacerbate a situation where the page daemon is failing to keep up. - Don't replenish caches if the calling thread belongs to the page daemon, which should avoid creating extra memory pressure when it is trying to free memory. Virtually all such allocations will occur in the context of laundering, where the laundry thread must allocate slabs for various swap and I/O-related UMA zones. Reviewed by: kib Discussed with: alc, jeff MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22394	2019-11-22 16:31:10 +00:00
Mark Johnston	003cf08ba9	Revise the page cache size policy. In r353734 the use of the page caches was limited to systems with a relatively large amount of RAM per CPU. This was to mitigate some issues reported with the system not able to keep up with memory pressure in cases where it had been able to do so prior to the addition of the direct free pool cache. This change re-enables those caches. The change modifies uma_zone_set_maxcache(), which was introduced specifically for the page cache zones. Rather than using it to limit only the full bucket cache, have it also set uz_count_max to provide an upper bound on the per-CPU cache size that is consistent with the number of items requested. Remove its return value since it has no use. Enable the page cache zones unconditionally, and limit them to 0.1% of the domain's pages. The limit can be overridden by the vm.pgcache_zone_max tunable as before. Change the item size parameter passed to uma_zcache_create() to the correct size, and stop setting UMA_ZONE_MAXBUCKET. This allows the page cache buckets to be adaptively sized, like the rest of UMA's caches. This also causes the initial bucket size to be small, so only systems which benefit from large caches will get them. Reviewed by: gallatin, jeff MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22393	2019-11-22 16:30:47 +00:00
Mark Johnston	b378d29687	Fix locking in vm_reserv_reclaim_contig(). We were not properly handling the case where the trylock of the reservation fails, in which case we could leak the reservation lock. Introduce a marker reservation to implement precise scanning in vm_reserv_reclaim_contig(). Before, a race could result in early termination of the scan in rare situations. Use the marker's lock to serialize scans of the partpop queue so that a global marker structure can be used. Modify vm_reserv_reclaim_inactive() to handle the presence of a marker while minimizing the hold time of domain-global locks. Reviewed by: alc, jeff, kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22392	2019-11-22 16:28:52 +00:00
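
The throughput figures quoted in commit cb285f7c7c (random/ivy) can be checked with a little arithmetic. The sketch below is only an illustration, not code from the tree: it assumes one RDRAND word is 8 bytes on amd64 and 4 bytes on ia32 (inferred from the quoted factors of 1024 and 2048), that "MB" means 10^6 bytes, and that only one word is kept per 8 kB reseed interval. All function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the RDRAND numbers in commit cb285f7c7c.
# Assumptions (not stated in the commit): decimal megabytes, and one
# 8-byte (amd64) or 4-byte (ia32) word kept per reseed interval.

RESEED_INTERVAL_BYTES = 8 * 1024  # DRBG reseeds every 8 kB (from the commit)
HASWELL_RATE = 170e6              # 170 MB/s RDRAND rate quoted for Haswell


def slowdown_factor(word_bytes: int) -> int:
    """Words drawn per kept word when one word is kept per reseed interval."""
    return RESEED_INTERVAL_BYTES // word_bytes


def feed_time_us(nbytes: int, rate_bytes_per_s: float) -> float:
    """Microseconds needed to read nbytes at the given byte rate."""
    return nbytes / rate_bytes_per_s * 1e6


amd64_factor = slowdown_factor(8)  # 64-bit RDRAND word (assumed)
ia32_factor = slowdown_factor(4)   # 32-bit RDRAND word (assumed)

# Effective rate on Haswell once every sample costs a full reseed interval.
effective_rate = HASWELL_RATE / amd64_factor

if __name__ == "__main__":
    print("amd64 slowdown factor:", amd64_factor)
    print("ia32 slowdown factor:", ia32_factor)
    print("effective Haswell rate (kB/s):", round(effective_rate / 1e3))
    for nbytes in (128, 1024):  # harvestq feed sizes quoted in the commit
        fast = feed_time_us(nbytes, HASWELL_RATE)
        slow = feed_time_us(nbytes, effective_rate)
        print(f"{nbytes} bytes: {fast:.2f} us plain, {slow / 1e3:.2f} ms with reseed reads")
```

Under these assumptions the results line up with the commit message: the slowdown is three orders of magnitude, so feed times move from the microsecond range into the millisecond range.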


246188 Commits