freebsd-dev

Author	SHA1	Message	Date
Philip Paeps	7f0b970948	QEMU: use default HZ. HZ=100 is the default on riscv since r351918.	2019-09-06 01:22:16 +00:00
Philip Paeps	7f0851ab19	riscv: default to HZ=100 Most current RISC-V development platforms are not fast enough to benefit from the increased granularity provided by HZ=1000. Sponsored by: Axiado	2019-09-06 01:19:31 +00:00
Rick Macklem	4ce21f37fd	Delete the unused "nd" argument for nfsrv_proxyds(). The "nd" argument for nfsrv_proxyds() is no longer used by the function. This patch deletes it. This allows a subsequent patch to delete the "nd" argument from nfsvno_getattr(), since its only use of "nd" was to pass it to nfsrv_proxyds(). Getting rid of the "nd" argument from nfsvno_getattr() avoids confusion over why it might need "nd". This patch is trivial and does not have any semantic effect.	2019-09-05 22:25:19 +00:00
Conrad Meyer	a6935d085c	Remove long-dead BUF_ASSERT_{,UN}HELD assertions These were fully neutered in r177676 (2008), but not removed at the time for unclear reasons. They're totally dead code, so go ahead and yank them now. No functional change.	2019-09-05 21:43:33 +00:00
Conrad Meyer	f80cbeb292	msdosfs: Drop an unneeded brelse in bread error condition After r294954, it is an invariant that bread returns non-NULL bp if and only if the routine succeeded. On error, it handles any buffer cleanup internally. So the brelse(NULL) here was just redundant. No functional change. Discussed with: kib (extracted from a larger differential)	2019-09-05 21:30:52 +00:00
Ian Lepore	acce2d7606	Use a single write of 3 bytes instead of iicdev_writeto() in ads111x. The iicdev_writeto() function basically does scatter-gather IO by filling in a pair of iic_msg structs to write the register address then the data from different locations but with a single bus START/xfer/STOP sequence. It turns out several low-level i2c controller drivers do not honor the IIC_NOSTART flag, so the second piece of the write gets a new START on the bus, and that confuses the ads111x chips which expect a continuous write of 3 bytes to set a register. A proper fix for this is to track down all the misbehaving controller drivers and fix them. For now this change makes this driver work again.	2019-09-05 19:17:53 +00:00
Ian Lepore	c56cf3d276	Ensure a measurement is complete before reading the result in ads111x. Also, disable the comparator by default; it's not used for anything. The previous logic would start a measurement, and then pause_sbt() for the averaging time currently configured in the chip. After waiting that long, the code would blindly read the measurement register and return its value. The problem is that the chip's idea of averaging time is based on its internal free-running 1MHz oscillator, which may be running at a wildly different rate than the kernel clock. If the chip's internal timer was running slower than the kernel clock, we'd end up grabbing a stale result from an old measurement. The driver now still uses pause_sbt() to yield the cpu while waiting for the measurement to complete, but after sleeping it checks the chip's status register to ensure the measurement engine is idle. If it's not, the driver uses a retry loop to wait a bit (5% of the original wait time) then check again for completion.	2019-09-05 19:07:48 +00:00
Mateusz Guzik	68c3c1abe1	vfs: temporarily revert r351825 There are 2 problems: - it introduces a funny bug where it can end up trylocking the same vnode [1] - it exposes a pre-existing softdep deadlock [2] Both are easier to run into than the bug which got fixed, so revert until a complete solution is worked out. Reported by: cy [1], pho [2] Sponsored by: The FreeBSD Foundation	2019-09-05 18:19:51 +00:00
Toomas Soome	8262585607	Adjust teken to allow build as part of loader Building for loader needs specific headers.	2019-09-05 18:07:40 +00:00
Ruslan Bukin	446e035c30	Add dwgpio to NOTES so it gets built in LINT kernels. Sponsored by: DARPA, AFRL	2019-09-05 17:54:57 +00:00
Stephen J. Kiernan	d57cd5ccd3	Bump up the low range of cpuset numbers to account for the kernel cpuset. Reviewed by: jeff Obtained from: Juniper Networks, Inc.	2019-09-05 17:48:39 +00:00
Ed Maste	aa91d4b3a9	pcie: return an error if a matching resource is not found Submitted by: markj Reviewed by: manu Event: vBSDCon FreeBSD hackathon Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20884	2019-09-05 15:45:21 +00:00
Hans Petter Selasky	a48a37bee2	Decrease the default audio playback latency to a maximum of 21.3ms. This significantly improves the audio playback response time. Discussed with: mav@ MFC after: 1 week Sponsored by: Mellanox Technologies	2019-09-05 10:49:12 +00:00
Conrad Meyer	58aa4dbf4a	sys/mount.h: Comment on distinction between vfs_{c,}mount Hope to save someone else a little future effort in ugly duplicated code. No functional change.	2019-09-05 00:56:37 +00:00
Rick Macklem	2e67077700	Delete the unused "nd" argument for nfsrv_checkdsattr(). The "nd" argument for nfsrv_checkdsattr() is no longer used by the function. This patch deletes it. This allows subsequent patches to delete the "nd" argument from nfsrv_proxyds(), since its only use of "nd" was to pass it to nfsrv_checkdsattr(). The same will then be true for nfsvno_getattr(), which passes "nd" to nfsrv_proxyds(). Getting rid of the "nd" argument from nfsvno_getattr() avoids confusion over why it might need "nd". This patch is trivial and does not have any semantic effect. Found by inspection while working on the NFSv4.2 server.	2019-09-04 22:37:28 +00:00
Konstantin Belousov	bf5661f4a1	madvise(MADV_FREE): Quick fix to time rewind. Don't free pages in a shadowing object. While this degrades MADV_FREE to a no-op (and we could, instead, choose to fall back to MADV_DONTNEED, at the cost of changing pmap_madvise), this is presently considered a temporary fix. We may prefer to risk a little fragmentation of the map by creating a zero/OBJT_DEFAULT entry over top of the existing object and, simultaneously, revert to the existing marking any pages in the former shadowing object in the advised region as reclaimable. At least one consumer of MADV_FREE (snmalloc) may use mmap() to construct zeroed pages "eventually" here anyway, so the fragmentation may be coming anyway. Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk> PR: 240061 Reviewed by: markj MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21517	2019-09-04 20:28:16 +00:00
Warner Losh	f93b7f954e	Support doorbell strides != 0. The NVMe standard (1.4) states >>> 8.6 Doorbell Stride for Software Emulation >>> The doorbell stride,...is useful in software emulation of an NVM >>> Express controller. ... For hardware implementations of the NVM >>> Express interface, the expected doorbell stride value is 0h. However, hardware in the wild exists with a doorbell stride of 1 (meaning 8 byte separation). This change supports that hardware, as well as software emulators as envisioned in Section 8.6. Since this is the fast path, care has been taken to make this computation efficient. The bit of math to compute an offset for each is replaced by a memory load from cache of a pre-computed value. MFC After: 3 days Reviewed by: scottl@ Differential Revision: https://reviews.freebsd.org/D21514	2019-09-04 20:08:36 +00:00
Mateusz Guzik	c07d4a0a68	vfs: fully hold vnodes in vnlru_free_locked Currently the code only bumps holdcnt and clears the VI_FREE flag, not performing actual vhold. Since the vnode is still visible elsewhere, a potential new user can find it and incorrectly assume it is properly held. Use vholdl instead to correctly hold the vnode. Another place recycling (vlrureclaim) does this already. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21522	2019-09-04 19:23:18 +00:00
Edward Tomasz Napierala	e55366be83	Fix /proc/mounts for autofs(5) mounts. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2019-09-04 18:00:54 +00:00
Edward Tomasz Napierala	36c03d045a	Improve debugging output. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2019-09-04 18:00:03 +00:00
Ruslan Bukin	50c365c49a	Include dwgpio in the build. Sponsored by: DARPA, AFRL	2019-09-04 15:55:44 +00:00
Ruslan Bukin	564e82561b	o Add support for multi-port instances of Synopsys DesignWare APB GPIO Controller. o Rename the driver to dwgpio. Sponsored by: DARPA, AFRL	2019-09-04 15:37:24 +00:00
Kyle Evans	1f6453b126	Back out r351799. empty does not appear to work like I thought it did and it actively breaks real LOCAL_MODULES usage, of which I have none at the moment...	2019-09-04 14:32:04 +00:00
Kyle Evans	f99c5e8d28	pseudofs: make readdir work without a pid again Specifically, the following was broken: $ mount -t procfs procfs /proc $ ls -l /proc r351741 reworked readdir slightly to avoid pfs_node/pidhash LOR, but inadvertently regressed pid == NO_PID; new pfs_lookup_proc() fails for the obvious reasons, and later pfs_visible_proc doesn't capture the pid == NO_PID -> return 1 aspect of pfs_visible. We can in fact skip this whole block if we're operating on a directory w/ NO_PID, as it's always visible. Reported by: trasz Reviewed by: mjg Differential Revision: https://reviews.freebsd.org/D21518	2019-09-04 14:20:39 +00:00
Andriy Gapon	387df3b805	shutdown_halt: make sure that watchdog timer is stopped The point of halt is to keep the machine in limbo. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D21222	2019-09-04 13:26:59 +00:00
Andriy Gapon	b539c9bfbd	ZFS: Always refuse receiving non-resume stream when resume state exists This fixes a hole in the situation where the resume state is left from receiving a new dataset and, so, the state is set on the dataset itself (as opposed to %recv child). Additionally, distinguish incremental and resume streams in error messages. This was also committed to ZoL: zfsonlinux/zfs@ebeb6f23bf MFC after: 2 weeks Sponsored by: CyberSecure	2019-09-04 07:33:22 +00:00
Michael Tuexen	ecc5b1d156	Fix the SACK block generation in the base TCP stack by bringing it in sync with the RACK stack. Reviewed by: rrs@ MFC after: 5 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D21513	2019-09-04 04:38:31 +00:00
Mark Johnston	f97bf60469	Fix some nits in pmap_page_array_startup(). - Use ptoa() instead of the archaic ctob(). - Use pagezero() to zero a PDP page. - Remove PA_MIN_ADDRESS, orphaned by r351742. - Remove unneeded parens and an unnecessary control flow statement. Reported by: alc Reviewed by: alc, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21495	2019-09-03 22:26:01 +00:00
Kyle Evans	a3f59fe262	LOCAL_MODULES: Allow LOCAL_MODULES="" in src.conf to work Currently LOCAL_MODULES= works, but LOCAL_MODULES="" causes build errors as .for still has the empty string to loop over. An .if empty prior to the loop was considered, but LOCAL_MODULES has empty quotes at that point and thus, isn't empty. A better solution likely exists, but this floats us by for now...	2019-09-03 22:01:12 +00:00
Kyle Evans	ef03f57dd2	Allow more nesting of GEOM partitioning schemes GEOM is supposed to be topology-agnostic, but the GPT and BSD partition code has arbitrary restrictions on nesting that are annoying in cases such as running VMs on raw partitions (since the VM's partitioning scheme is not visible to the host). This patch adds sysctls to disable the restrictions except in the case of BSD label (and similar) partitions with offset 0 (where we need to avoid recursively recognizing the label). Submitted by: Andrew Gierth MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21350	2019-09-03 20:57:20 +00:00
Kyle Evans	dca52ab480	posixshm: start counting writeable mappings r351650 switched posixshm to using OBJT_SWAP for shm_object; r351795 added support to the swap_pager for tracking writeable mappings. Take advantage of this and start tracking writeable mappings; fd sealing will use this to reject a seal on writing with EBUSY if any such mapping exists. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21456	2019-09-03 20:33:38 +00:00
Kyle Evans	fe7bcbaf50	vm pager: writemapping accounting for OBJT_SWAP Currently writemapping accounting is only done for vnode_pager which does some accounting on the underlying vnode. Extend this to allow accounting to be possible for any of the pager types. New pageops are added to update/release writecount that need to be implemented for any pager wishing to do said accounting, and we implement these methods now for both vnode_pager (unchanged) and swap_pager. The primary motivation for this is to allow other systems with OBJT_SWAP objects to check if their objects have any write mappings and reject operations with EBUSY if so. posixshm will be the first to do so in order to reject adding write seals to the shmfd if any writable mappings exist. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D21456	2019-09-03 20:31:48 +00:00
Edward Tomasz Napierala	ee6da5cee7	Unbreak Linux binaries linked against new glibc, such as the ones from recent Ubuntu versions. Without it they segfault on startup. Reviewed by: emaste MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20687	2019-09-03 19:48:23 +00:00
Michael Tuexen	191ae5bfa9	Fix two TCP RACK issues: * Convert the TCP delayed ACK timer from ms to ticks as required. This fixes the timer on platforms with hz != 1000. * Don't delay acknowledgements which report duplicate data using DSACKs. Reviewed by: rrs@ MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D21512	2019-09-03 19:48:02 +00:00
Konstantin Belousov	fe69291ff4	Add procctl(PROC_STACKGAP_CTL) It allows a process to request that the stack gap not be applied to its stacks, retroactively. Also it is possible to control the gaps in the process after exec. PR: 239894 Reviewed by: alc Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21352	2019-09-03 18:56:25 +00:00
Edward Tomasz Napierala	bb3c7a5440	Make linprocfs(4) report Tgid, Linux ltrace(1) needs it. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2019-09-03 16:33:02 +00:00
Mateusz Guzik	e3c3248cc7	vfs: implement usecount implying holdcnt vnodes have 2 reference counts - holdcnt to keep the vnode itself from getting freed and usecount to denote it is actively used. Previously all operations bumping usecount would also bump holdcnt, which is not necessary. We can detect if usecount is already > 1 (in which case holdcnt is also > 1) and utilize it to avoid bumping holdcnt on our own. This saves on atomic ops. Reviewed by: kib Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21471	2019-09-03 15:42:11 +00:00
Warner Losh	4d5475613e	Implement nvme suspend / resume for pci attachment When we suspend, we need to properly shut down the NVMe controller. The controller may go into D3 state (or may have the power removed), and to properly flush the metadata to non-volatile RAM, we must complete a normal shutdown. This consists of deleting the I/O queues and setting the shutdown bit. We have to do some extra stuff to make sure we reset the software state of the queues as well. On resume, we have to reset the card twice, for reasons described in the attach function. Once we've done that, we can restart the card. If any of this fails, we'll fail the NVMe card, just like we do when a reset fails. Set is_resetting for the duration of the suspend / resume. This keeps the reset taskqueue from running a concurrent reset, and also is needed to prevent any hw completions from queueing more I/O to the card. Pass resetting flag to nvme_ctrlr_start. It doesn't need to get that from the global state of the ctrlr. Wait for any pending reset to finish. All queued I/O will get sent to the hardware as part of nvme_ctrlr_start(), though the upper layers shouldn't send any down. Disabling the qpairs is the other failsafe to ensure all I/O is queued. Rename nvme_ctrlr_destory_qpairs to nvme_ctrlr_delete_qpairs to avoid confusion with all the other destroy functions. It just removes the queues in hardware, while the other _destroy_ functions tear down driver data structures. Split parts of the hardware reset function up so that I can do part of the reset in suspend. Split out the software disabling of the qpairs into nvme_ctrlr_disable_qpairs. Finally, fix a couple of spelling errors in comments related to this. Relnotes: Yes MFC After: 1 week Reviewed by: scottl@ (prior version) Differential Revision: https://reviews.freebsd.org/D21493	2019-09-03 15:26:11 +00:00
Mark Johnston	7cdeaf3309	Add preliminary support for atomic updates of per-page queue state. Queue operations on a page use the page lock when updating the page to reflect the desired queue state, and the page queue lock when physically enqueuing or dequeuing a page. Multiple pages share a given page lock, but queue state is per-page; this false sharing results in heavy lock contention. Take a small step towards the use of atomic_cmpset to synchronize updates to per-page queue state by introducing vm_page_pqstate_cmpset() and using it in the page daemon. In the longer term the plan is to stop using the page lock to protect page identity and rely only on the object and page busy locks. However, since the page daemon avoids acquiring the object lock except when necessary, some synchronization with a concurrent free of the page is required. vm_page_pqstate_cmpset() can be used to ensure that queue state updates are successful only if the page is not scheduled for a dequeue, which is sufficient for the page daemon. Add vm_page_swapqueue(), which moves a page from one queue to another using vm_page_pqstate_cmpset(). Use it in the active queue scan, which does not use the object lock. Modify vm_page_dequeue_deferred() to use vm_page_pqstate_cmpset() as well. Reviewed by: kib Discussed with: jeff Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21257	2019-09-03 14:29:58 +00:00
Mark Johnston	9d75f0dc75	Map the vm_page array into KVA on amd64. r351198 allows the kernel to use domain-local memory to back the vm_page array (up to 2MB boundaries) and reserves a separate PML4 entry for that purpose. One consequence of that change is that the vm_page array is no longer present in minidumps, which only adds pages mapped above VM_MIN_KERNEL_ADDRESS. To avoid the friction caused by having kernel data structures mapped below VM_MIN_KERNEL_ADDRESS, map the vm_page array starting at VM_MIN_KERNEL_ADDRESS instead of using a dedicated PML4 entry. Reviewed by: kib Discussed with: jeff Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21491	2019-09-03 13:18:51 +00:00
Mateusz Guzik	f5791174df	pseudofs: fix a LOR pfs_node vs pidhash (sleepable after non-sleepable) Sponsored by: The FreeBSD Foundation	2019-09-03 12:54:51 +00:00
Andriy Gapon	50f14c4f68	superio: fix the copyright block and update the year MFC after: 2 weeks	2019-09-03 12:40:58 +00:00
Mateusz Guzik	d05b53e0ba	Add sysctlbyname system call Previously userspace would issue one syscall to resolve the sysctl and then another one to actually use it. Do it all in one trip. Fallback is provided in case newer libc happens to be running on an older kernel. Submitted by: Pawel Biernacki Reported by: kib, brooks Differential Revision: https://reviews.freebsd.org/D17282	2019-09-03 04:16:30 +00:00
Mark Johnston	209f2e9838	Add a sysctl to dump kernel mappings and their properties on amd64. The sysctl is called vm.pmap.kernel_maps. It dumps address ranges and their corresponding protection and mapping mode, as well as counts of 2MB and 1GB pages in the range. Reviewed by: kib MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21380	2019-09-02 21:57:57 +00:00
Mark Johnston	87044fca73	Replace PMAP_LARGEMAP_MAX_ADDRESS() with a more general predicate. No functional change intended. Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-09-02 21:54:08 +00:00
Michael Tuexen	fe5dee73f7	This patch improves the DSACK handling to conform with RFC 2883. The lowest SACK block is used when multiple blocks would be eligible as DSACK blocks. ACK blocks get reordered, while the ordering of SACK blocks not relevant in the DSACK context is maintained. Reviewed by: rrs@, tuexen@ Obtained from: Richard Scheffenegger MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21038	2019-09-02 19:04:02 +00:00
Edward Tomasz Napierala	1d3a302b4a	Bump Linux version to 3.2.0. Without it, binaries linked against glibc 2.24 and up (eg Ubuntu 19.04) fail with "FATAL: kernel too old". This alone is not enough to make newer binaries actually work; fix/hack/workaround is pending review at https://reviews.freebsd.org/D20687. Reviewed by: emaste MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20757	2019-09-02 18:10:35 +00:00
Warner Losh	31b11bb3f2	In nvme_completion_poll, add a sanity check to make sure that we complete the polling within a second. Panic if we don't. All the commands that use this interface should typically complete within a few tens to hundreds of microseconds. Panic rather than return ETIMEDOUT because if the command somehow does later complete, it will randomly corrupt memory. Also, it helps to get a traceback from where the unexpected failure happens, rather than an infinite loop.	2019-09-02 17:11:32 +00:00
Warner Losh	ab0681aac9	In all the places that we use the polled for completion interface, except crash dump support code, move the while loop into an inline function. These aren't done in the fast path, so if the compiler chooses to not inline, any performance hit is tiny.	2019-09-02 17:11:27 +00:00
Warner Losh	fc68da4b4d	Add a brief comment explaining why we can return ETIMEDOUT from the call to the polled interface. Normally this would have the potential to corrupt stack memory because the completion routines would run after we return. In this case, however, we're doing a dump so it's safe for reasons explained in the comment.	2019-09-02 17:10:46 +00:00
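
Note on f93b7f954e (doorbell strides != 0): the message describes replacing a per-doorbell offset computation with a load of a pre-computed value. The following is a minimal sketch of that idea, not the driver's actual code. The struct and function names are hypothetical; the offset formula is the one given in the NVMe specification, where doorbell registers start at 0x1000 and are spaced (4 << CAP.DSTRD) bytes apart.

```c
#include <assert.h>
#include <stdint.h>

/* Doorbell registers begin at this offset in the controller registers. */
#define	DOORBELL_BASE	0x1000u

/* Hypothetical per-queue-pair cache, filled in once at queue setup. */
struct qpair_doorbells {
	uint64_t sq_tdbl_off;	/* submission queue tail doorbell offset */
	uint64_t cq_hdbl_off;	/* completion queue head doorbell offset */
};

/*
 * Compute both doorbell offsets for queue pair 'qid' given the doorbell
 * stride field 'dstrd' (CAP.DSTRD).  The I/O path can then read the
 * cached values instead of redoing the shift and multiply each time.
 */
static void
qpair_doorbells_init(struct qpair_doorbells *db, uint32_t qid, uint32_t dstrd)
{
	uint64_t stride;

	assert(dstrd <= 15);		/* CAP.DSTRD is a 4-bit field */
	stride = (uint64_t)4 << dstrd;	/* 4 bytes at 0, 8 bytes at 1, ... */
	db->sq_tdbl_off = DOORBELL_BASE + (2 * (uint64_t)qid) * stride;
	db->cq_hdbl_off = DOORBELL_BASE + (2 * (uint64_t)qid + 1) * stride;
}
```

With a stride value of 0 the two doorbells of a queue pair sit 4 bytes apart; with a stride value of 1 they sit 8 bytes apart, which matches the "8 byte separation" mentioned in the commit message.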

128528 Commits