freebsd-nq

Author	SHA1	Message	Date
Ian Lepore	53f93ed3ff	Fix an alignment check that is wrong in half the busdma implementations. This will enable the elimination of a workaround in the USB driver that artifically allocates buffers twice as big as they need to be (which actually saves memory for very small buffers on the buggy platforms). When deciding how to allocate a dma buffer, armv4, armv6, mips, and x86/iommu all correctly check for the tag alignment <= maxsize as enabling simple uma/malloc based allocation. Powerpc, sparc64, x86/bounce, and arm64/bounce were all checking for alignment < maxsize; on those platforms when alignment was equal to the max size it would fall back to page-based allocators even for very small buffers. This change makes all platforms use the <= check. It should be noted that on all platforms other than arm[v6] and mips, this check is relying on undocumented behavior in malloc(9) that if you allocate a block of a given size it will be aligned to the next larger power-of-2 boundary. There is nothing in the malloc(9) man page that makes that explicit promise (but the busdma code has been relying on this behavior all along so I guess it works). Arm and mips code uses the allocator in kern/subr_busdma_buffalloc.c, which does explicitly implement this promise about size and alignment. Other platforms probably should switch to the aligned allocator.	2015-11-02 23:37:19 +00:00
Konstantin Belousov	cff8c6f2d1	Add support for weak symbols to the kernel linkers. It means that linkers no longer raise an error when undefined weak symbols are found, but relocate as if the symbol value was 0. Note that we do not repeat the mistake of userspace dynamic linker of making the symbol lookup prefer non-weak symbol definition over the weak one, if both are available. In fact, kernel linker uses the first definition found, and ignores duplicates. Signature of the elf_lookup() and elf_obj_lookup() functions changed to split result/error code and the symbol address returned. Otherwise, it is impossible to return zero address as the symbol value, to MD relocation code. This explains the mechanical changes in elf_machdep.c sources. The powerpc64 R_PPC_JMP_SLOT handler did not checked error from the lookup() call, the patch leaves the code as is (untested). Reported by: glebius Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-09-20 01:27:59 +00:00
Marius Strobl	100f3d3da6	- Sanity check that the parent ranges given in the "ranges" property of PCI-EBus-bridges actually match the BARs as specified in and required by [1, p. 113 f.]. Doing so earlier would have simplified diagnosing a bug in QEMU/OpenBIOS getting the mapping of child addresses wrong, which still needs to be fixed there. In theory, we could try to change the BARs accordingly if we hit this problem. However, at least with real machines changing the decoding likely won't work, especially if the PCI-EBus-bridge is beneath an APB one. So implementing such functionality generally is rather pointless. - Actually change the allocation type of EBus resources if they change from SYS_RES_MEMORY to SYS_RES_IOPORT when mapping them to PCI ranges in ebus_alloc_resource() and passing them up to bus_activate_resource(9). This may happen with the QEMU/OpenBIOS PCI-EBus-bridge but not real ones. Still, this is only cleans up the code and the result of resource allocation and activation is unchanged. - Change the remainder of printf(9) to device_printf(9) calls and canonicalize their wording. MFC after: 1 week Peripheral Component Interconnect Input Output Controller, Part No.: 802-7837-01, Sun Microelectronics, March 1997 [1]	2015-09-13 21:59:56 +00:00
Marius Strobl	1ed1ef30ee	Merge r286374 from x86: Formally pair store_rel(&smp_started) with load_acq(&smp_started). Similarly to x86, this change is mostly a NOP due to the kernel being run in total store order. MFC after: 1 week	2015-09-13 00:08:04 +00:00
Marius Strobl	9888f86a16	- Factor out the common and generic parts of the sparc64 host-PCI-bridge drivers into the revived sys/sparc64/pci/ofw_pci.c, previously already serving a similar purpose. This has been done with sun4v in mind, which explains a) the otherwise not that obvious scheme employed and b) why reusing sys/powerpc/ofw/ofw_pci.c was even lesser an option. - Add a workaround for QEMU once again not emulating real machines, in this case by not providing the OFW_PCI_CS_MEM64 range. [1] Submitted by: jhb [1] MFC after: 1 week	2015-09-12 22:49:32 +00:00
Mark Johnston	610141cebb	Add stack_save_td_running(), a function to trace the kernel stack of a running thread. It is currently implemented only on amd64 and i386; on these architectures, it is implemented by raising an NMI on the CPU on which the target thread is currently running. Unlike stack_save_td(), it may fail, for example if the thread is running in user mode. This change also modifies the kern.proc.kstack sysctl to use this function, so that stacks of running threads are shown in the output of "procstat -kk". This is handy for debugging threads that are stuck in a busy loop. Reviewed by: bdrewery, jhb, kib Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3256	2015-09-11 03:54:37 +00:00
Konstantin Belousov	1fa6712471	Do not hold the process around the vm_fault() call from the trap()s. The only operation which is prevented by the hold is the kernel stack swapout for the faulted thread, which should be fine to allow. Remove useless checks for NULL curproc or curproc->p_vmspace from the trap_pfault() wrappers on x86 and powerpc. Reviewed by: alc (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-09-10 17:46:48 +00:00
Ed Maste	fc8c856029	Rationalize BSD license on sys/*/include/in_cksum.h Remove the advertising clause from the Regents of the University of California's license, per the letter dated July 22, 1999. Update clause numbering.	2015-08-05 19:05:12 +00:00
Ed Maste	96226a9aa7	Rationalize BSD license on sys/*/include/float.h Remove the advertising clause from the Regents of the University of California's license, per the letter dated July 22, 1999. Update clause numbering.	2015-08-05 17:05:35 +00:00
Jason A. Harmening	713841afb2	Add two new pmap functions: vm_offset_t pmap_quick_enter_page(vm_page_t m) void pmap_quick_remove_page(vm_offset_t kva) These will create and destroy a temporary, CPU-local KVA mapping of a specified page. Guarantees: --Will not sleep and will not fail. --Safe to call under a non-sleepable lock or from an ithread Restrictions: --Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms --Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page. --MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm. NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing. Reviewed by: kib Approved by: kib (mentor) Differential Revision: http://reviews.freebsd.org/D3013	2015-08-04 19:46:13 +00:00
Marius Strobl	7815d3948c	o Revert the other functional half of r239864, i. e. the merge of r134227 from x86 to use smp_ipi_mtx spin lock not only for smp_rendezvous_cpus() but also for the MD cache invalidation, TLB demapping and remote register reading IPIs due to the following reasons: - The cross-IPI SMP deadlock x86 otherwise is subject to can't happen on sparc64. That's because on sparc64, spin locks don't disable interrupts completely but only raise the processor interrupt level to PIL_TICK. This means that IPIs still get delivered and direct dispatch IPIs such as the cache invalidation etc. IPIs in question are still executed. - In smp_rendezvous_cpus(), smp_ipi_mtx is held not only while sending an IPI_RENDEZVOUS, but until all CPUs have processed smp_rendezvous_action(). Consequently, smp_ipi_mtx may be locked for an extended amount of time as queued IPIs (as opposed to the direct ones) such as IPI_RENDEZVOUS are scheduled via a soft interrupt. Moreover, given that this soft interrupt is only delivered at PIL_RENDEZVOUS, processing of smp_rendezvous_action() on a target may be interrupted by f. e. a tick interrupt at PIL_TICK, in turn leading to the target in question trying to send an IPI by itself while IPI_RENDEZVOUS isn't fully handled, yet, and, thus, resulting in a deadlock. o As mentioned in the commit message of r245850, on least some sun4u platforms concurrent sending of IPIs by different CPUs is fatal. Therefore, hold the reintroduced MD ipi_mtx also while delivering cross-traps via MI helpers, i. e. ipi_{all_but_self,cpu,selected}(). o Akin to x86, let the last CPU to process cpu_mp_bootstrap() set smp_started instead of the BSP in cpu_mp_unleash(). This ensures that all APs actually are started, when smp_started is no longer 0. o In all MD and MI IPI helpers, check for smp_started == 1 rather than for smp_cpus > 1 or nothing at all. This avoids races during boot causing IPIs trying to be delivered to APs that in fact aren't up and running, yet. While at it, move setting of the cpu_ipi_{selected,single}() pointers to the appropriate delivery functions from mp_init() to cpu_mp_start() where it's better suited and allows to get rid of the global isjbus variable. o Given that now concurrent IPI delivery no longer is possible, also nuke the delays before completely disabling interrupts again in the CPU-specific cross-trap delivery functions, previously giving other CPUs a window for sending IPIs on their part. Actually, we now should be able to entirely get rid of completely disabling interrupts in these functions. Such a change needs more testing, though. o In {s,}tick_get_timecount_mp(), make the {s,}tick variable static. While not necessary for correctness, this avoids page faults when accessing the stack of a foreign CPU as {s,}tick now is locked into the TLBs as part of static kernel data. Hence, {s,}tick_get_timecount_mp() always execute as fast as possible, avoiding jitter. PR: 201245 MFC after: 3 days	2015-07-24 15:13:21 +00:00
Zbigniew Bodek	721555e7ee	Fix KSTACK_PAGES issue when the default value was changed in KERNCONF If KSTACK_PAGES was changed to anything alse than the default, the value from param.h was taken instead in some places and the value from KENRCONF in some others. This resulted in inconsistency which caused corruption in SMP envorinment. Ensure all places where KSTACK_PAGES are used the opt_kstack_pages.h is included. The file opt_kstack_pages.h could not be included in param.h because was breaking the toolchain compilation. Reviewed by: kib Obtained from: Semihalf Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D3094	2015-07-16 10:46:52 +00:00
Christian Brueffer	f4c1eac7cd	Spell crypto correctly.	2015-07-14 10:47:56 +00:00
Konstantin Belousov	8954a9a4e6	Add the atomic_thread_fence() family of functions with intent to provide a semantic defined by the C11 fences with corresponding memory_order. atomic_thread_fence_acq() gives r \| r, w, where r and w are read and write accesses, and \| denotes the fence itself. atomic_thread_fence_rel() is r, w \| w. atomic_thread_fence_acq_rel() is the combination of the acquire and release in single operation. Note that reads after the acq+rel fence could be made visible before writes preceeding the fence. atomic_thread_fence_seq_cst() orders all accesses before/after the fence, and the fence itself is globally ordered against other sequentially consistent atomic operations. Reviewed by: alc Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-07-08 18:12:24 +00:00
George V. Neville-Neil	0661a7c224	Fix up tabs vs. spaces	2015-07-04 20:31:06 +00:00
George V. Neville-Neil	3839369c03	Enable IPSEC in all GENERIC kernels. Universe and kernel build tests passed 4 July 2015 PR: 128030 Sponsored by: Rubicon Communications (Netgate)	2015-07-04 17:37:00 +00:00
Mateusz Guzik	4ea6a9a28f	Generalised support for copy-on-write structures shared by threads. Thread credentials are maintained as follows: each thread has a pointer to creds and a reference on them. The pointer is compared with proc's creds on userspace<->kernel boundary and updated if needed. This patch introduces a counter which can be compared instead, so that more structures can use this scheme without adding more comparisons on the boundary.	2015-06-10 10:43:59 +00:00
Alan Cox	966272ca33	Retire VM_FREEPOOL_CACHE as the next step in eliminating PG_CACHE pages. Differential Revision: https://reviews.freebsd.org/D2712 Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2015-06-08 04:59:32 +00:00
Dmitry Chagin	0c38abc250	The kernel sends signals to the processes via ABI specific sv_sendsig method. Native ABI do not need signal conversion, only emulators may want this. Usually emulators implements its own sv_sendsig method. For now only ibcs2 emulator does not have own sv_sendsig implementation and depends on native sendsig() method. So, remove any extra attempts to convert signal numbers from native sendsig() methods except from i386 where ibsc2 is living.	2015-05-24 17:56:02 +00:00
Dmitry Chagin	91d1786f65	In preparation for switching linuxulator to the use the native 1:1 threads add a hook for cleaning thread resources before the thread die. Differential Revision: https://reviews.freebsd.org/D1038	2015-05-24 14:51:29 +00:00
Pedro F. Giffuni	cd508278c1	ddb: finish converting boolean values. The replacement started at r283088 was necessarily incomplete without replacing boolean_t with bool. This also involved cleaning some type mismatches and ansifying old C function declarations. Pointed out by: bde Discussed with: bde, ian, jhb	2015-05-21 15:16:18 +00:00
Edward Tomasz Napierala	ba8f0eb8fc	Build GENERIC with RACCT/RCTL support by default. Note that it still needs to be enabled by adding "kern.racct.enable=1" to /boot/loader.conf. Differential Revision: https://reviews.freebsd.org/D2407 Reviewed by: emaste@, wblock@ MFC after: 1 month Relnotes: yes Sponsored by: The FreeBSD Foundation	2015-05-14 14:03:55 +00:00
John Baldwin	5abf821641	Update this driver to not save copies of registers that are no longer used after r281874. While here, also update it to always write the parent's PCI bus number to the primary bus register.	2015-04-24 13:12:04 +00:00
Andrew Turner	405ada37fb	Add support for the uart classes to set their default register shift value. This is needed with the pl011 driver. Before this change it would default to a shift of 0, however the hardware places the registers at 4-byte addresses meaning the value should be 2. This patch fixes this for the pl011 when configured using the fdt. The other drivers have a default value of 0 to keep this a no-op. MFC after: 1 week	2015-04-11 17:16:23 +00:00
John Baldwin	dbee5c671a	Move the 32-bit compatible procfs types from freebsd32.h to <sys/procfs.h> and export them to userland. - Define __HAVE_REG32 on platforms that define a reg32 structure and check for this in <sys/procfs.h> to control when to export prstatus32, etc. - Add prstatus32_t and prpsinfo32_t typedefs for the 32-bit structures. libbfd looks for these types, and having them fixes 'gcore' in gdb of a 32-bit process on a 64-bit platform. - Use the structure definitions from <sys/procfs.h> in gcore's elf32 core dump code instead of duplicating the definitions. Differential Revision: https://reviews.freebsd.org/D2142 Reviewed by: kib, nathanw (powerpc bits) MFC after: 1 week	2015-04-08 16:30:45 +00:00
Ryan Stone	f2c2231e0c	Fix integer truncation bug in malloc(9) A couple of internal functions used by malloc(9) and uma truncated a size_t down to an int. This could cause any number of issues (e.g. indefinite sleeps, memory corruption) if any kernel subsystem tried to allocate 2GB or more through malloc. zfs would attempt such an allocation when run on a system with 2TB or more of RAM. Note to self: When this is MFCed, sparc64 needs the same fix. Differential revision: https://reviews.freebsd.org/D2106 Reviewed by: kib Reported by: Michael Fuckner <michael@fuckner.net> Tested by: Michael Fuckner <michael@fuckner.net> MFC after: 2 weeks	2015-04-01 12:42:26 +00:00
John Baldwin	86750039c6	Apply r276208 to non-amd64 NOTES files as well to fix tinderbox builds run under a system using vt(4) instead of syscons(4): Use compiled in default keymaps which are available both in syscons and vt.	2015-03-25 15:51:41 +00:00
Marius Strobl	aed116911d	Unbreak sparc64 after r276630 by calling __sparc_sigtramp_setup signal trampoline as part of the MD __sys_sigaction again. Submitted by: kib (initial versions) MFC after: 3 days	2015-02-16 22:13:03 +00:00
Konstantin Belousov	206f09eb46	Do not qualify the mcontext_t mcp argument for set_mcontext(9) as const. On x86, even after the machine context is supposedly read into the struct ucontext, lazy FPU state save code might only mark the FPU data as hardware-owned. Later, set_fpcontext() needs to fetch the state from hardware, modifying the mcp. The set_mcontext(9) is called from sigreturn(2) and setcontext(2) implementations and old create_thread(2) interface, which throw the *mcp out after the set_mcontext() call. Reported by: dim Discussed with: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-01-31 21:43:46 +00:00
Konstantin Belousov	8f4548ff25	Remove Giant from /dev/mem and /dev/kmem. It is definitely not needed for i386, and from the code inspection, nothing in the arm/mips/sparc64 implementations depends on it. Discussed with: imp, nwhitehorn Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-01-24 12:51:15 +00:00
Mark Johnston	bdb9ab0dd9	Factor out duplicated code from dumpsys() on each architecture into generic code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly identical and simply redefine a number of constants and helper subroutines; a generic implementation will make it easier to implement features around kernel core dumps. This change does not alter any minidump code and should have no functional impact. PR: 193873 Differential Revision: https://reviews.freebsd.org/D904 Submitted by: Conrad Meyer <conrad.meyer@isilon.com> Reviewed by: jhibbits (earlier version) Sponsored by: EMC / Isilon Storage Division	2015-01-07 01:01:39 +00:00
John Baldwin	3e32dff52c	Remove "New" label from NFSCL/NFSD now that they are the only NFS client/server. While here, remove duplicate NFSCL from sys/conf/NOTES. Approved by: rmacklem	2015-01-06 16:15:57 +00:00
George V. Neville-Neil	bd19924f6b	This configuration file removes several debugging options, including WITNESS and INVARIANTS checking, which are known to have significant performance impact on running systems. When benchmarking new features this kernel should be used instead of the standard GENERIC. This kernel configuration should never appear outside of the HEAD of the FreeBSD tree.	2014-12-02 19:55:43 +00:00
Ed Maste	294246bb7d	Revert r274772: it is not valid on MIPS Reported by: sbruno	2014-11-25 03:50:31 +00:00
Ed Maste	688fd61ae8	Use canonical __PIC__ flag It is automatically set when -fPIC is passed to the compiler. Reviewed by: dim, kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D1179	2014-11-21 02:05:48 +00:00
Alexander V. Chernikov	603eaf792b	Renove faith(4) and faithd(8) from base. It looks like industry have chosen different (and more traditional) stateless/statuful NAT64 as translation mechanism. Last non-trivial commits to both faith(4) and faithd(8) happened more than 12 years ago, so I assume it is time to drop RFC3142 in FreeBSD. No objections from: net@	2014-11-09 21:33:01 +00:00
Konstantin Belousov	4f3dc90023	Add fueword(9) and casueword(9) functions. They are like fuword(9) and casuword(9), but do not mix value read and indication of fault. I know (or remember) enough assembly to handle x86 and powerpc. For arm, mips and sparc64, implement fueword() and casueword() as wrappers around fuword() and casuword(), which means that the functions cannot distinguish between -1 and fault. On architectures where fueword() and casueword() are native, implement fuword() and casuword() using fueword() and casuword(), to reduce assembly code duplication. Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks (ia64 needs treating)	2014-10-28 15:22:13 +00:00
Alan Cox	b17af71943	Simplify memrw(). MFC after: 10 days	2014-10-27 01:10:40 +00:00
John Baldwin	7d313e7bdb	Add COMPAT_FREEBSD9 and COMPAT_FREEBSD10 options to wrap code that provides compatability for FreeBSD 9.x and 10.x binaries. Enable these options in kernel configs that enable other COMPAT_FREEBSD<n> options.	2014-10-24 19:58:24 +00:00
Davide Italiano	2be111bf7d	Follow up to r225617. In order to maximize the re-usability of kernel code in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv(). This fixes a namespace collision with libc symbols. Submitted by: kmacy Tested by: make universe	2014-10-16 18:04:43 +00:00
Baptiste Daroussin	dbacd66928	Fix typo revealed by using newer binutils Differential Revision: https://reviews.freebsd.org/D933 Reviewed by: marius	2014-10-10 14:18:33 +00:00
Roger Pau Monné	c98a2727cc	ddb: allow specifying the exact address of the symtab and strtab When the FreeBSD kernel is loaded from Xen the symtab and strtab are not loaded the same way as the native boot loader. This patch adds three new global variables to ddb that can be used to specify the exact position and size of those tables, so they can be directly used as parameters to db_add_symbol_table. A new helper is introduced, so callers that used to set ksym_start and ksym_end can use this helper to set the new variables. It also adds support for loading them from the Xen PVH port, that was previously missing those tables. Sponsored by: Citrix Systems R&D Reviewed by: kib ddb/db_main.c: - Add three new global variables: ksymtab, kstrtab, ksymtab_size that can be used to specify the position and size of the symtab and strtab. - Use those new variables in db_init in order to call db_add_symbol_table. - Move the logic in db_init to db_fetch_symtab in order to set ksymtab, kstrtab, ksymtab_size from ksym_start and ksym_end. ddb/ddb.h: - Add prototype for db_fetch_ksymtab. - Declate the extern variables ksymtab, kstrtab and ksymtab_size. x86/xen/pv.c: - Add support for finding the symtab and strtab when booted as a Xen PVH guest. Since Xen loads the symtab and strtab as NetBSD expects to find them we have to adapt and use the same method. amd64/amd64/machdep.c: arm/arm/machdep.c: i386/i386/machdep.c: mips/mips/machdep.c: pc98/pc98/machdep.c: powerpc/aim/machdep.c: powerpc/booke/machdep.c: sparc64/sparc64/machdep.c: - Use the newly introduced db_fetch_ksymtab in order to set ksymtab, kstrtab and ksymtab_size.	2014-09-25 08:28:10 +00:00
John Baldwin	1f895058e4	Revert unrelated changes accidentally committed in r271192.	2014-09-17 18:55:39 +00:00
Adrian Chadd	066da8050b	Migrate ie->ie_assign_cpu and associated code to use an int for CPU rather than u_char. Migrate post_filter to use an int for a CPU rather than u_char. Change intr_event_bind() to use an int for CPU rather than u_char. It touches the ppc, sparc64, arm and mips machdep code but it should (hah!) be a no-op. Tested: * i386, AMD64 laptops Reviewed by: jhb	2014-09-17 17:33:22 +00:00
John Baldwin	b1d735ba4c	Create a separate structure for per-CPU state saved across suspend and resume that is a superset of a pcb. Move the FPU state out of the pcb and into this new structure. As part of this, move the FPU resume code on amd64 into a C function. This allows resumectx() to still operate only on a pcb and more closely mirrors the i386 code. Reviewed by: kib (earlier version)	2014-09-06 15:23:28 +00:00
Konstantin Belousov	8158a32900	For CPUs which do hardware cache line unaliasing, use direct map to access sfbufs. Suggested and reviewed by: alc Tested by: Michael Moll <kvedulv@kvedulv.de> Sponsored by: The FreeBSD Foundation	2014-08-23 18:11:54 +00:00
Konstantin Belousov	5e1d15a879	Complete r254667, do not destroy pmap lock if KVA allocation failed. Submitted by: Svatopluk Kraus <onwahe@gmail.com> MFC after: 1 week	2014-08-16 08:31:25 +00:00
Konstantin Belousov	bb0a8f248d	On sparc64, do not keep mappings for the destroyed sf_bufs. Sparc64 pmap, unlike i386, and similar to i386/xen pv, does not tolerate abandoned mappings for the freed pages. Reported and tested by: dumbbell Diagnosed and reviewed by: alc Sponsored by: The FreeBSD Foundation	2014-08-10 16:59:39 +00:00
Konstantin Belousov	39ffa8c138	Change pmap_enter(9) interface to take flags parameter and superpage mapping size (currently unused). The flags includes the fault access bits, wired flag as PMAP_ENTER_WIRED, and a new flag PMAP_ENTER_NOSLEEP to indicate that pmap should not sleep. For powerpc aim both 32 and 64 bit, fix implementation to ensure that the requested mapping is created when PMAP_ENTER_NOSLEEP is not specified, in particular, wait for the available memory required to proceed. In collaboration with: alc Tested by: nwhitehorn (ppc aim32 and booke) Sponsored by: The FreeBSD Foundation and EMC / Isilon Storage Division MFC after: 2 weeks	2014-08-08 17:12:03 +00:00
Gleb Smirnoff	c8d2ffd6a7	Merge all MD sf_buf allocators into one MI, residing in kern/subr_sfbuf.c The MD allocators were very common, however there were some minor differencies. These differencies were all consolidated in the MI allocator, under ifdefs. The defines from machine/vmparam.h turn on features required for a particular machine. For details look in the comment in sys/sf_buf.h. As result no MD code left in sys///vm_machdep.c. Some arches still have machine/sf_buf.h, which is usually quite small. Tested by: glebius (i386), tuexen (arm32), kevlo (arm32) Reviewed by: kib Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-08-05 09:44:10 +00:00
Alan Cox	a695d9b25b	Retire pmap_change_wiring(). We have never used it to wire virtual pages. We continue to use pmap_enter() for that. For unwiring virtual pages, we now use pmap_unwire(), which unwires a range of virtual addresses instead of a single virtual page. Sponsored by: EMC / Isilon Storage Division	2014-08-03 20:40:51 +00:00
Nathan Whitehorn	190d685037	Add vt(4) support to sparc64. The only driver currently present (ofwfb) provides support for a variety of low-end graphics hardware (SBus adapters, Mach64, QEMU's framebuffer, XVR-100). A driver for at least the Creator3D cards will have to be present before this can become the default console driver. To test vt(4) on sparc64, set kern.vty=vt at the loader prompt.	2014-08-02 03:48:16 +00:00
Gavin Atkinson	f6b4f5ca21	Add error return to dumpsys(), and use it in doadump(). This commit does not add error returns to minidumpsys() or textdump_dumpsys(); those can also be added later. Submitted by: Conrad Meyer (EMC / Isilon storage division)	2014-07-25 23:52:53 +00:00
Alan Cox	09132ba6ac	Introduce pmap_unwire(). It will replace pmap_change_wiring(). There are several reasons for this change: pmap_change_wiring() has never (in my memory) been used to set the wired attribute on a virtual page. We have always used pmap_enter() to do that. Moreover, it is not really safe to use pmap_change_wiring() to set the wired attribute on a virtual page. The description of pmap_change_wiring() says that it assumes the existence of a mapping in the pmap. However, non-wired mappings may be reclaimed by the pmap at any time. (See pmap_collect().) Many implementations of pmap_change_wiring() will crash if the mapping does not exist. pmap_unwire() accepts a range of virtual addresses, whereas pmap_change_wiring() acts upon a single virtual page. Since we are typically unwiring a range of virtual addresses, pmap_unwire() will be more efficient. Moreover, pmap_unwire() allows us to unwire superpage mappings. Previously, we were forced to demote the superpage mapping, because pmap_change_wiring() only allowed us to express the unwiring of a single base page mapping at a time. This added to the overhead of unwiring for large ranges of addresses, including the implicit unwiring that occurs at process termination. Implementations for arm and powerpc will follow. Discussed with: jeff, marcel Reviewed by: kib Sponsored by: EMC / Isilon Storage Division	2014-07-06 17:42:38 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Warner Losh	3f1afabf09	Restore comments accidentally removed. MFC after: 3 days	2014-06-06 04:08:55 +00:00
Kenneth D. Merry	991554f2c4	Bring in the mpr(4) driver for LSI's MPT3 12Gb SAS controllers. This is derived from the mps(4) driver, but it supports only the 12Gb IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108. Some notes about this driver: o The 12Gb hardware can do "FastPath" I/O, and that capability is included in this driver. o WarpDrive functionality has been removed, since it isn't supported in the 12Gb driver interface. o The Scatter/Gather list handling code is significantly different between the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather lists. Thanks to LSI for developing and testing this driver for FreeBSD. share/man/man4/mpr.4: mpr(4) man page. sys/dev/mpr/*: mpr(4) driver files. sys/modules/Makefile, sys/modules/mpr/Makefile: Add a module Makefile for the mpr(4) driver. sys/conf/files: Add the mpr(4) driver. sys/amd64/conf/GENERIC, sys/i386/conf/GENERIC, sys/mips/conf/OCTEON1, sys/sparc64/conf/GENERIC: Add the mpr(4) driver to all config files that currently have the mps(4) driver. sys/ia64/conf/GENERIC: Add the mps(4) and mpr(4) drivers to the ia64 GENERIC config file. sys/i386/conf/XEN: Exclude the mpr module from building here. Submitted by: Steve McConnell <Stephen.McConnell@lsi.com> MFC after: 3 days Tested by: Chris Reeves <chrisr@spectralogic.com> Sponsored by: LSI, Spectra Logic Relnotes: LSI 12Gb SAS driver mpr(4) added	2014-05-02 20:25:09 +00:00
Scott Long	60ad8150c7	Retire smp_active. It was racey and caused demonstrated problems with the cpufreq code. Replace its use with smp_started. There's at least one userland tool that still looks at the kern.smp.active sysctl, so preserve it but point it to smp_started as well. Discussed with: peter, jhb MFC after: 3 days Obtained from: Netflix	2014-04-26 20:27:54 +00:00
Tijl Coosemans	0a4c54d606	Rename __wchar_t so it no longer conflicts with __wchar_t from clang 3.4 -fms-extensions. MFC after: 2 weeks	2014-04-01 14:46:11 +00:00
Warner Losh	3ad1a09169	Rather than require a makeoptions DEBUG to get debug correct, add it in kern.mk, but only if we're using clang. While this option is supported by both clang and gcc, in the future there may be changes to clang which change the defaults that require a tweak to build our kernel such that other tools in our tree will work. Set a good example by forcing -gdwarf-2 only for clang builds, and only if the user hasn't specified another dwarf level already. Update UPDATING to reflect the changed state of affairs. This also keeps us from having to update all the ARM kernels to add this, and also keeps us from in the future having to update all the MIPS kernels and is one less place the user will have to know to do something special for clang and one less thing developers will need to do when moving an architecture to clang. Reviewed by: ian@ MFC after: 1 week	2014-03-25 22:08:31 +00:00
Bryan Drewery	44f1c91610	Rename global cnt to vm_cnt to avoid shadowing. To reduce the diff struct pcu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI. Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the the global cnt variable. Exp-run revealed no ports using it directly. No objection from: arch@ Sponsored by: EMC / Isilon Storage Division	2014-03-22 10:26:09 +00:00
Robert Watson	4a14441044	Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks	2014-03-16 10:55:57 +00:00
Warner Losh	22b8ff24b5	Delete stray clause 3 (Advertising clause) and renumber while i'm here. Approved by: alc@	2014-03-11 23:41:35 +00:00
Dimitry Andric	c3bb517174	Merge from head up to r262472.	2014-02-25 07:40:37 +00:00
Dimitry Andric	362a42b214	Make sure a for loop in fire_alloc_msix() terminates, by making the loop counter signed. Reviewed by: marius MFC after: 3 days	2014-02-25 07:33:28 +00:00
Dimitry Andric	ad3d02dd2a	In sys/sparc64/sparc64/spitfire.c, prevent signed shift overflow by casting to the appropriate type. (Note this fix cannot be done in sys/sparc64/sparc64/spitfire.c, since that file is also included by assembly source files.) Reviewed by: marius MFC after: 3 days	2014-02-25 07:28:51 +00:00
Dimitry Andric	defb3353ee	Similar to r261991, for compiling the GENERIC kernel on sparc64, explicitly use -gdwarf-2 for the debug symbols.	2014-02-23 19:18:04 +00:00
Dimitry Andric	2994da7d3c	Remove more superfluous const specifiers.	2014-02-23 18:36:45 +00:00
John Baldwin	4edef187b8	Add support for managing PCI bus numbers. As with BARs and PCI-PCI bridge I/O windows, the default is to preserve the firmware-assigned resources. PCI bus numbers are only managed if NEW_PCIB is enabled and the architecture defines a PCI_RES_BUS resource type. - Add a helper API to create top-level PCI bus resource managers for each PCI domain/segment. Host-PCI bridge drivers use this API to allocate bus numbers from their associated domain. - Change the PCI bus and CardBus drivers to allocate a bus resource for their bus number from the parent PCI bridge device. - Change the PCI-PCI and PCI-CardBus bridge drivers to allocate the full range of bus numbers from secbus to subbus from their parent bridge. The drivers also always program their primary bus register. The bridge drivers also support growing their bus range by extending the bus resource and updating subbus to match the larger range. - Add support for managing PCI bus resources to the Host-PCI bridge drivers used for amd64 and i386 (acpi_pcib, mptable_pcib, legacy_pcib, and qpi_pcib). - Define a PCI_RES_BUS resource type for amd64 and i386. Reviewed by: imp MFC after: 1 month	2014-02-12 04:30:37 +00:00
Nathan Whitehorn	95e3bfe889	Simplify the ofw_bus_lookup_imap() API slightly: make it allocate maskbuf internally instead of requiring the caller to allocate it.	2013-12-17 15:11:24 +00:00
Marius Strobl	820677f94a	Restore a vital comment nuked in r259016.	2013-12-08 15:25:19 +00:00
Aleksandr Rybalko	27cf7d04ef	Merge VT(9) project (a.k.a. newcons). Reviewed by: nwhitehorn MFC_to_10_after: re approval Sponsored by: The FreeBSD Foundation	2013-12-05 22:38:53 +00:00
Pawel Jakub Dawidek	f2b525e6b9	Make process descriptors standard part of the kernel. rwhod(8) already requires process descriptors to work and having PROCDESC in GENERIC seems not enough, especially that we hope to have more and more consumers in the base. MFC after: 3 days	2013-11-30 15:08:35 +00:00
Alan Cox	c70af4875e	As of r257209, all architectures have defined VM_KMEM_SIZE_SCALE. In other words, every architecture is now auto-sizing the kmem arena. This revision changes kmeminit() so that the definition of VM_KMEM_SIZE_SCALE becomes mandatory and the definition of VM_KMEM_SIZE becomes optional. Replace or eliminate all existing definitions of VM_KMEM_SIZE. With auto-sizing enabled, VM_KMEM_SIZE effectively became an alternate spelling for VM_KMEM_SIZE_MIN on most architectures. Use VM_KMEM_SIZE_MIN for clarity. Change kmeminit() so that the effect of defining VM_KMEM_SIZE is similar to that of setting the tunable vm.kmem_size. Whereas the macros VM_KMEM_SIZE_{MAX,MIN,SCALE} have had the same effect as the tunables vm.kmem_size_{max,min,scale}, the effects of VM_KMEM_SIZE and vm.kmem_size have been distinct. In particular, whereas VM_KMEM_SIZE was overridden by VM_KMEM_SIZE_{MAX,MIN,SCALE} and vm.kmem_size_{max,min,scale}, vm.kmem_size was not. Remedy this inconsistency. Now, VM_KMEM_SIZE can be used to set the size of the kmem arena at compile-time without that value being overridden by auto-sizing. Update the nearby comments to reflect the kmem submap being replaced by the kmem arena. Stop duplicating the auto-sizing formula in every machine- dependent vmparam.h and place it in kmeminit() where auto-sizing takes place. Reviewed by: kib (an earlier version) Sponsored by: EMC / Isilon Storage Division	2013-11-08 16:25:00 +00:00
Konstantin Belousov	80938e75f0	Add bus_dmamap_load_ma() function to load map with the array of vm_pages. Provide trivial implementation which forwards the load to _bus_dmamap_load_phys() page by page. Right now all architectures use bus_dmamap_load_ma_triv(). Tested by: pho (as part of the functional patch) Sponsored by: The FreeBSD Foundation MFC after: 1 month	2013-10-27 21:39:16 +00:00
Marius Strobl	2ce472e736	Move the implementation of bus_space_barrier(9) to the inline function in the header. Actually, there's only one version for all types of busses, so it doesn't make sense to walk up the hierarchy.	2013-10-24 17:06:41 +00:00
Marius Strobl	d1b111bea8	Implement GET_STACK_USAGE. Discussed with: mav Approved by: re (kib) MFC after: 1 week	2013-09-29 13:09:25 +00:00
Gleb Smirnoff	255c1caae3	- Create kern.ipc.sendfile namespace, and put the new "readhead" OID there as "kern.ipc.sendfile.readahead". - Push all nsfbuf related tunables into MD code. Don't move them to new namespace in favor of POLA. Reviewed by: scottl Approved by: re (gjb)	2013-09-22 13:36:52 +00:00
Alan Cox	deb179bb4c	The pmap function pmap_clear_reference() is no longer used. Remove it. pmap_clear_reference() has had exactly one caller in the kernel for several years, more precisely, since FreeBSD 8. Now, that call no longer exists. Approved by: re (kib) Sponsored by: EMC / Isilon Storage Division	2013-09-20 04:30:18 +00:00
Pawel Jakub Dawidek	3fded357af	Fix panic in ktrcapfail() when no capability rights are passed. While here, correct all consumers to pass NULL instead of 0 as we pass capability rights as pointers now, not uint64_t. Reported by: Daniel Peyrolon Tested by: Daniel Peyrolon Approved by: re (marius)	2013-09-18 19:26:08 +00:00
John Baldwin	edb572a38c	Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use an address in the first 2GB of the process's address space. This flag should have the same semantics as the same flag on Linux. To facilitate this, add a new parameter to vm_map_find() that specifies an optional maximum virtual address. While here, fix several callers of vm_map_find() to use a VMFS_* constant for the findspace argument instead of TRUE and FALSE. Reviewed by: alc Approved by: re (kib)	2013-09-09 18:11:59 +00:00
Gleb Smirnoff	2ee9b44cae	Fix build with gcc. Move sf_buf_alloc()/sf_buf_free() declarations to MD headers.	2013-09-06 17:44:13 +00:00
Alan Cox	51321f7c31	Significantly reduce the cost, i.e., run time, of calls to madvise(..., MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new pmap function, pmap_advise(), that operates on a range of virtual addresses within the specified pmap, allowing for a more efficient implementation of MADV_DONTNEED and MADV_FREE. Previously, the implementation of MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as pmap_clear_reference(). Intuitively, the problem with this implementation is that the pmap-level locks are acquired and released and the page table traversed repeatedly, once for each resident page in the range that was specified to madvise(2). A more subtle flaw with the previous implementation is that pmap_clear_reference() would clear the reference bit on all mappings to the specified page, not just the mapping in the range specified to madvise(2). Since our malloc(3) makes heavy use of madvise(2), this change can have a measureable impact. For example, the system time for completing a parallel "buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%. Note: This change only contains pmap_advise() implementations for a subset of our supported architectures. I will commit implementations for the remaining architectures after further testing. For now, a stub function is sufficient because of the advisory nature of pmap_advise(). Discussed with: jeff, jhb, kib Tested by: pho (i386), marcel (ia64) Sponsored by: EMC / Isilon Storage Division	2013-08-29 15:49:05 +00:00
Konstantin Belousov	e68c64f0ba	Revert r254501. Instead, reuse the type stability of the struct pmap which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE zone. Initialize the pmap lock in the vmspace zone init function, and remove pmap lock initialization and destruction from pmap_pinit() and pmap_release(). Suggested and reviewed by: alc (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-22 18:12:24 +00:00
Konstantin Belousov	5944de8ecd	Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9). The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic. Suggested and reviewed by: alc Sponsored by: The FreeBSD Foundation	2013-08-22 07:39:53 +00:00
Pawel Jakub Dawidek	417ffc66fa	Add process descriptors support to the GENERIC kernel. It is already being used by the tools in base systems and with sandboxing more and more tools the usage should only increase. Submitted by: Mariusz Zaborski <oshogbo@FreeBSD.org> Sponsored by: Google Summer of Code 2013 MFC after: 1 month	2013-08-18 10:21:29 +00:00
Attilio Rao	c7aebda8a1	The soft and hard busy mechanism rely on the vm object lock to work. Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl	2013-08-09 11:11:11 +00:00
Andriy Gapon	9ba0691bdd	follow up to r254051 - update powerpc/GENERIC64 as well, suggested by mdf - update comments so that they make sense after the change, suggested by jhb X-MFC after: never (change specific to head)	2013-08-09 08:11:09 +00:00
Konstantin Belousov	449c2e92c9	Split the pagequeues per NUMA domains, and split pageademon process into threads each processing queue in a single domain. The structure of the pagedaemons and queues is kept intact, most of the changes come from the need for code to find an owning page queue for given page, calculated from the segment containing the page. The tie between NUMA domain and pagedaemon thread/pagequeue split is rather arbitrary, the multithreaded daemon could be allowed for the single-domain machines, or one domain might be split into several page domains, to further increase concurrency. Right now, each pagedaemon thread tries to reach the global target, precalculated at the start of the pass. This is not optimal, since it could cause excessive page deactivation and freeing. The code should be changed to re-check the global page deficit state in the loop after some number of iterations. The pagedaemons reach the quorum before starting the OOM, since one thread inability to meet the target is normal for split queues. Only when all pagedaemons fail to produce enough reusable pages, OOM is started by single selected thread. Launder is modified to take into account the segments layout with regard to the region for which cleaning is performed. Based on the preliminary patch by jeff, sponsored by EMC / Isilon Storage Division. Reviewed by: alc Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-07 16:36:38 +00:00
Andriy Gapon	818d282e7b	enable KDB_TRACE in GENERICs KDB_TRACE is not an alternative to DDB/etc, they are complementary. So I do not see any reason to not enable KDB_TRACE by default. X-MFC after: never (change specific to head)	2013-08-07 08:03:50 +00:00
Jeff Roberson	5df87b21d3	Replace kernel virtual address space allocation with vmem. This provides transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division	2013-08-07 06:21:20 +00:00
Marius Strobl	4af8466d8b	Add MD (for now) atomic_store_acq_<type>() and use it in pmap_activate() to get the semantics when setting the PMAP right. Prior to r251782, the latter already used implicit acquire semantics, which - currently - means to not employ additional explicit memory barriers under the hood (see also r225889).	2013-08-06 15:34:11 +00:00
Attilio Rao	66bacd7e17	Remove unused member. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-08-04 21:17:05 +00:00
David E. O'Brien	0e6a0799a9	Back out r253779 & r253786.	2013-07-31 17:21:18 +00:00
David E. O'Brien	99ff83da74	Decouple yarrow from random(4) device. * Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option. The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow. * random(4) device doesn't really depend on rijndael-. Yarrow, however, does. Add random_adaptors.[ch] which is basically a store of random_adaptor's. random_adaptor is basically an adapter that plugs in to random(4). random_adaptor can only be plugged in to random(4) very early in bootup. Unplugging random_adaptor from random(4) is not supported, and is probably a bad idea anyway, due to potential loss of entropy pools. We currently have 3 random_adaptors: + yarrow + rdrand (ivy.c) + nehemeiah * Remove platform dependent logic from probe.c, and move it into corresponding registration routines of each random_adaptor provider. probe.c doesn't do anything other than picking a specific random_adaptor from a list of registered ones. * If the kernel doesn't have any random_adaptor adapters present then the creation of /dev/random is postponed until next random_adaptor is kldload'ed. * Fix randomdev_soft.c to refer to its own random_adaptor, instead of a system wide one. Submitted by: arthurmesh@gmail.com, obrien Obtained from: Juniper Networks Reviewed by: obrien	2013-07-29 20:26:27 +00:00
Andriy Gapon	a29cc9a34b	Revert r253748,253749 This WIP should not have been committed yet. Pointyhat to: avg	2013-07-28 18:44:17 +00:00
Andriy Gapon	366d8bfb7b	put contents of cpu.h under _KERNEL no userland-serviceable parts inside MFC after: 20 days	2013-07-28 18:32:27 +00:00
Andrey V. Elsukov	dbd4437b06	Include sys/systm.h after sys/param.h. Suggested by: pluknet	2013-07-15 15:40:57 +00:00

1 2 3 4 5 ...

2150 Commits