freebsd-dev

Author	SHA1	Message	Date
Attilio Rao	81c02539f1	MFC	2011-06-06 21:38:39 +00:00
Marcel Moolenaar	e726a6b70c	Improve cpu_idle(): o cpu_idle_hook is expected to be called with interrupts disabled and re-enables interrupts on return. o sync with x86: don't idle when the CPU has runnable tasks o have callers of ia64_call_pal_static() disable interrupts and re-enable interrupts. o add, but compile-out, support for idle mode. This will be enabled at some later time, after proper testing.	2011-06-06 19:06:15 +00:00
Attilio Rao	61b926921f	MFC	2011-05-31 21:22:44 +00:00
Nathan Whitehorn	d098f93019	On multi-core, multi-threaded PPC systems, it is important that the threads be brought up in the order they are enumerated in the device tree (in particular, that thread 0 on each core be brought up first). The SLIST through which we loop to start the CPUs has all of its entries added with SLIST_INSERT_HEAD(), which means it is in reverse order of enumeration and so AP startup would always fail in such situations (causing a machine check or RTAS failure). Fix this by changing the SLIST into an STAILQ, and inserting new CPUs at the end. Reviewed by: jhb	2011-05-31 15:11:43 +00:00
Attilio Rao	c02f1527a9	MFC	2011-05-14 19:20:13 +00:00
Marcel Moolenaar	767ca6ed1a	Prefer switching the memory stack from user to kernel before switching the register stack. While the ordering doesn't matter, it creates an invariant not previously there: the memory stack pointer will always be larger than the register stack pointer. With this invariant in place, it's easier to add instrumentation code that detects a stack overflow because in such a scenario the memory stack pointer and register stack pointers have crossed each other. Aside: basic kernel operation needs about half the stack size (~16K) at most. We have plenty of head room on the kernel stack...	2011-05-14 14:55:15 +00:00
Marcel Moolenaar	65385d6d79	Sharpening the saw: o Clobber the register that holds the restart token immediately after crossing the restart point. This prevents false positives (i.e. a nested exception that we don't know can happen and that is being treated as one we know by virtue of a lingering restart token). o Now that the bootstrap kernel stack is free, switch onto it and call trap() for nested traps that we don't know about. In trap we panic() so that we can analyze the condition.	2011-05-14 14:47:19 +00:00
Marcel Moolenaar	7fb64531d3	Be pedantic: mark the pcpu pointer (= register r13) itself as volatile.	2011-05-14 14:40:24 +00:00
Marcel Moolenaar	dc03be9d67	Turn ia64_srlz() and ia64_srlz_i() into defines so that the code is still correct when inlining is disabled.	2011-05-14 14:36:08 +00:00
Attilio Rao	b2aa562e7b	MFC	2011-05-13 20:58:48 +00:00
Matthew D Fleming	cfb00e5aa7	Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853	2011-05-13 19:35:01 +00:00
Attilio Rao	bceb221f78	Fix remaining bits that actually weren't converted by mistake. Reported by: sbruno	2011-05-13 14:57:20 +00:00
Attilio Rao	b9f714be9f	MFC	2011-05-07 23:34:14 +00:00
Marcel Moolenaar	a63f3da5b2	In pmap_kextract(), return the physical address for PBVM virtual addresses as well (incl. the PBVM page table).	2011-05-07 17:23:13 +00:00
Attilio Rao	aa8b9e0706	MFC	2011-05-06 22:45:33 +00:00
John Baldwin	f9a9473702	Retire isa_setup_intr() and isa_teardown_intr() and use the generic bus versions instead. They were never needed as bus_generic_intr() and bus_teardown_intr() had been changed to pass the original child device up in 42734, but the ISA bus was not converted to new-bus until 45720.	2011-05-06 13:48:53 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
Marcel Moolenaar	6dfe4f958f	Don't use the whole region 5 for KVA, because the CPU may not implement all of the 61 bits available within the region for virtual addressing. Since there's no good way for us to map out the gap in the virtual address space, limit KVA to the architectural minimum implemented address bits. This still gives us 1 petabyte of KVA, so no worries.	2011-05-02 17:49:05 +00:00
Marcel Moolenaar	7df304f3e0	Stop linking against a direct-mapped virtual address and instead use the PBVM. This eliminates the implied hardcoding of the physical address at which the kernel needs to be loaded. Using the PBVM makes it possible to load the kernel irrespective of the physical memory organization and allows us to replicate kernel text on NUMA machines. While here, reduce the direct-mapped page size to the kernel's page size so that we can support memory attributes better.	2011-04-30 20:49:00 +00:00
John Baldwin	b67d11bbcc	Change rman_manage_region() to actually honor the rm_start and rm_end constraints on the rman and reject attempts to manage a region that is out of range. - Fix various places that set rm_end incorrectly (to ~0 or ~0u instead of ~0ul). - To preserve existing behavior, change rman_init() to set rm_start and rm_end to allow managing the full range (0 to ~0ul) if they are not set by the caller when rman_init() is called.	2011-04-29 18:41:21 +00:00
Attilio Rao	2be767e069	Add the watchdogs patting during the (shutdown time) disk syncing and disk dumping. With the option SW_WATCHDOG on, these operations are doomed to let watchdog fire, fi they take too long. I implemented the stubs this way because I really want wdog_kern_* KPI to not be dependant by SW_WATCHDOG being on (and really, the option only enables watchdog activation in hardclock) and also avoid to call them when not necessary (avoiding not-volountary watchdog activations). Sponsored by: Sandvine Incorporated Discussed with: emaste, des MFC after: 2 weeks	2011-04-28 16:02:05 +00:00
Rick Macklem	4309e17add	This patch changes head so that the default NFS client is now the new NFS client (which I guess is no longer experimental). The fstype "newnfs" is now "nfs" and the regular/old NFS client is now fstype "oldnfs". Although mounts via fstype "nfs" will usually work without userland changes, an updated mount_nfs(8) binary is needed for kernels built with "options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and mount(8) binaries are needed to do mounts for fstype "oldnfs". The GENERIC kernel configs have been changed to use options NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER. For kernels being used on diskless NFS root systems, "options NFSCL" must be in the kernel config. Discussed on freebsd-fs@.	2011-04-27 17:51:51 +00:00
Marcel Moolenaar	682bf0a7e1	Remove prototypes of non-existent functions.	2011-04-25 22:38:09 +00:00
Alexander Motin	97b53e3634	Switch the GENERIC kernels for all architectures to the new CAM-based ATA stack. It means that all legacy ATA drivers are disabled and replaced by respective CAM drivers. If you are using ATA device names in /etc/fstab or other places, make sure to update them respectively (adX -> adaY, acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential numbers for each type in order of detection, unless configured otherwise with tunables, see cam(4)). ataraid(4) functionality is now supported by the RAID GEOM class. To use it you can load geom_raid kernel module and use graid(8) tool for management. Instead of /dev/arX device names, use /dev/raid/rX.	2011-04-24 08:58:58 +00:00
Marcel Moolenaar	76ceb3c6ee	Use the new arch_loadaddr I/F to align ELF objects to PBVM page boundaries. For good measure, align all other objects to cache lines boundaries. Use the new arch_loadseg I/F to keep track of kernel text and data so that we can wire as much of it as is possible. It is the responsibility of the kernel to link critical (read IVT related) code and data at the front of the respective segment so that it's covered by TRs before the kernel has a chance to add more translations. Use a better way of determining whether we're loading a legacy kernel or not. We can't check for the presence of the PBVM page table, because we may have unloaded that kernel and loaded an older (legacy) kernel after that. Simply use the latest load address for it.	2011-04-03 23:49:20 +00:00
Konstantin Belousov	7332c129e0	Add support for executing the FreeBSD 1/i386 a.out binaries on amd64. In particular: - implement compat shims for old stat(2) variants and ogetdirentries(2); - implement delivery of signals with ancient stack frame layout and corresponding sigreturn(2); - implement old getpagesize(2); - provide a user-mode trampoline and LDT call gate for lcall $7,$0; - port a.out image activator and connect it to the build as a module on amd64. The changes are hidden under COMPAT_43. MFC after: 1 month	2011-04-01 11:16:29 +00:00
Alan Cox	5adad80656	Eliminate an unused definition. Reviewed by: marcel	2011-03-26 20:40:33 +00:00
Marcel Moolenaar	0355a8b24b	Fix switching to physical mode as part of calling into EFI runtime services or PAL procedures. The new implementation is based on specific functions that are known to be called in certain scenarios only. This in particular fixes the PAL call to obtain information about translation registers. In general, the new implementation does not bank on virtual addresses being direct-mapped and will work when the kernel uses PBVM. When new scenarios need to be supported, new functions are added if the existing functions cannot be changed to handle the new scenario. If a single generic implementation is possible, it will become clear in due time. While here, change bootinfo to a pointer type in anticipation of future development.	2011-03-21 18:20:53 +00:00
Marcel Moolenaar	7c9eed5c4e	Change region 4 to be part of the kernel. This serves 2 purposes: 1. The PBVM is in region 4, so if we want to make use of it, we need region 4 freed up. 2. Region 4 and above cannot be represented by an off_t by virtue of that type being signed. This is problematic for truss(1), ktrace(1) and other such programs.	2011-03-21 01:09:50 +00:00
Bjoern A. Zeeb	d2b74735b8	For now remove options FLOWTABLE from the remaining GENERIC kernel configurations and make it opt-in for those who want it. LINT will still build it. While it may be a perfect win in some scenarios, it still troubles users (see PRs) in general cases. In addition we are still allocating resources even if disabled by sysctl and still leak arp/nd6 entries in case of interface destruction. Discussed with: qingli (2010-11-24, just never executed) Discussed with: juli (OCTEON1) PR: kern/148018, kern/155604, kern/144917, kern/146792 MFC after: 2 weeks	2011-03-19 15:50:34 +00:00
Marcel Moolenaar	9b0a458bcc	o Move the IVT and supporting functions to the front of the text segment so that it's always mapped by the loader. o Change the alternate fault handlers to account for PBVM. Since currently the region is handled by the VHPT, no alternate faults will be generated for it.	2011-03-18 22:45:43 +00:00
Marcel Moolenaar	7f806fe14e	Remove inclusion of unneeded bootinfo.h header.	2011-03-18 22:33:19 +00:00
Marcel Moolenaar	45c0ab27b1	Use VM_MAXUSER_ADDRESS rather than VM_MAX_ADDRESS when we talk about the bounds of user space. Redefine VM_MAX_ADDRESS as ~0UL, even though it's not used anywhere in the source tree.	2011-03-18 15:36:28 +00:00
Marcel Moolenaar	18d9407a9f	MFaltix: Add support for Pre-Boot Virtual Memory (PBVM) to the loader. PBVM allows us to link the kernel at a fixed virtual address without having to make any assumptions about the physical memory layout. On the SGI Altix 350 for example, there's no usuable physical memory below 192GB. Also, the PBVM allows us to control better where we're going to physically load the kernel and its modules so that we can make sure we load the kernel in memory that's close to the BSP. The PBVM is managed by a simple page table. The minimum size of the page table is 4KB (EFI page size) and the maximum is currently set to 1MB. A page in the PBVM is 64KB, as that's the maximum alignment one can specify in a linker script. The bottom line is that PBVM is between 64KB and 8GB in size. The loader maps the PBVM page table at a fixed virtual address and using a single translations. The PBVM itself is also mapped using a single translation for a maximum of 32MB. While here, increase the heap in the EFI loader from 512KB to 2MB and set the stage for supporting relocatable modules.	2011-03-16 03:53:18 +00:00
Marcel Moolenaar	23ca55ccf8	Reserve 24KB for the static kernel stack. We use this stack to call into the firmware for physical mode calls and the Altix doesn't work when it's 16KB.	2011-03-15 16:50:17 +00:00
Marcel Moolenaar	2133803023	Re-order the TRs so that there're less gaps. This could have a performance impact.	2011-03-15 06:07:02 +00:00
Marcel Moolenaar	6f181a80f8	Don't define IA64_PBVM_PAGE_SIZE as 1U shifted to the left by some amount. The compiler seems to assume it's a 32-bit integral and rounding to the page size using the standard expression (((u_long)(x) + mask) & ~mask), results in a 32-bit value. Dropping the 'U' suffix is enough to have the compiler treat the expression as a 64-bit integral.	2011-03-14 23:49:41 +00:00
Marcel Moolenaar	c94018bcd8	o Deal with unmapped PBVM in the alternate instruction and data TLB fault handlers. o Put the IVT in its own section and keep the supporting code close. o Make sure the VHPT is sized so that it can be mapped using a single translation. o Map the PAL code and VHPT with a translation that has the right size. Assume the platform has a PAL code size that can be mapped with a single translations. o Pass the pointer to the bootinfo structure as an argument to ia64_init(). o Get rid of LOG2_ID_PAGE_SIZE and IA64_ID_PAGE_SIZE. It was used to map the regions 6 & 7 and was as large as possible. The problem is that we can't support memory attributes easily if the granuratity is not a page. We need to support memory attributes because the new USB stack violates the BUS_DMA(9) interface. o Update some comments... NOTE: this is broken for SMP kernels, because the AP startup code hasn't been updated yet.	2011-03-14 05:29:45 +00:00
Marcel Moolenaar	5feac8d228	Expose the PBVM constants to assembly.	2011-03-14 05:18:27 +00:00
Marcel Moolenaar	7bd6af277d	First cut at having the kernel run within the PBVM: o The bootinfo structure is now a virtual pointer. o Replace VM_MAX_ADDRESS with VM_MAXUSER_ADDRESS and redefine VM_MAX_ADDRESS as the maximum address possible (~0UL). o Since we're not using direct-mapped translations, switching to physical addressing is less trivial. Reserve the boot stack for running in physical mode and special-case the EFI call, as we're still on the boot stack. o Region 4 belongs to the kernel now, not process space.	2011-03-12 02:00:28 +00:00
Marcel Moolenaar	db06a6f4ef	Merge svn+ssh://svn.freebsd.org/base/head@219553	2011-03-12 01:26:04 +00:00
Marcel Moolenaar	dd01d03463	o Add defines for Pre-Boot Virtual Memory (PBVM) o Move the backing store in the top half of region 0 now that region 4 is re-assigned to be part of the kernel. o De-emphasize VM_MAX_ADDRESS. It's really not used anywhere and probably means something different than the limit for process address space (we have VM_MAXUSER_ADDRESS for that). o Exclude the gateway page from VM_MAXUSER_ADDRESS (i.e. make it the same as VM_MAX_ADDRESS).	2011-03-11 22:00:45 +00:00
Marcel Moolenaar	9706a84ba0	Add fields for the PBVM page table address and size.	2011-03-11 21:54:45 +00:00
Matthew D Fleming	c77715ef6c	Mostly revert r219468, as I had misremembered the C standard regarding the size of an extern array. Keep one change from strncpy to strlcpy.	2011-03-11 18:56:55 +00:00
Matthew D Fleming	cd67ac41ae	Use MAXPATHLEN rather than the size of an extern array when copying the kernel name. Also consistenly use strlcpy(). Suggested by: Warner Losh	2011-03-10 22:56:00 +00:00
Dmitry Chagin	e5d81ef1b5	Extend struct sysvec with new method sv_schedtail, which is used for an explicit process at fork trampoline path instead of eventhadler(schedtail) invocation for each child process. Remove eventhandler(schedtail) code and change linux ABI to use newly added sysvec method. While here replace explicit comparing of module sysentvec structure with the newly created process sysentvec to detect the linux ABI. Discussed with: kib MFC after: 2 Week	2011-03-08 19:01:45 +00:00
Marcel Moolenaar	9b4fcf851a	Merge svn+ssh://svn.freebsd.org/base/head@218816	2011-02-18 21:39:09 +00:00
Alan Cox	e6ffa21488	Remove pmap fields that are either unused or not fully implemented. Discussed with: kib	2011-02-17 15:36:29 +00:00
Marcel Moolenaar	9818a9f26c	Comment-out FLOWTABLE. It causes a kernel panic due to a misaligned memory access related to an IPv6 route update. PR: kern/148018	2011-02-06 22:18:37 +00:00
Matthew D Fleming	08b163fa51	Put the general logic for being a CPU hog into a new function should_yield(). Use this in various places. Encapsulate the common case of check-and-yield into a new function maybe_yield(). Change several checks for a magic number of iterations to use should_yield() instead. MFC after: 1 week	2011-02-02 16:35:10 +00:00
Sergey Kandaurov	4053b05b91	Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize. Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe	2011-01-21 10:26:26 +00:00
Jung-uk Kim	bc35e60ec0	Remove empty dev_mem_md_init() stubs.	2011-01-17 23:06:47 +00:00
Jung-uk Kim	2fea643112	Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set(). Compile sys/dev/mem/memutil.c for all supported platforms and remove now unnecessary dev_mem_md_init(). Consistently define mem_range_softc from mem.c for all platforms. Add missing #include guards for machine/memdev.h and sys/memrange.h. Clean up some nearby style(9) nits. MFC after: 1 month	2011-01-17 22:58:28 +00:00
John Baldwin	58ccf5b41c	Remove unneeded includes of <sys/linker_set.h>. Other headers that use it internally contain nested includes. Reviewed by: bde	2011-01-11 13:59:06 +00:00
Konstantin Belousov	50a57dfbec	Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h. Update the outdated comments describing MAXSLP and the process selection algorithm for swap out. Comments wording and reviewed by: alc	2011-01-09 12:50:44 +00:00
David Schultz	9719c5a6ab	The highest-precision floating point type on ia64 has 64 bits of precision, so DECIMAL_DIG should be 21, as on i386/amd64.	2011-01-09 06:05:02 +00:00
Tijl Coosemans	a56e818f29	On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and corresponding macros) are different from 32 bit. [1] Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX. Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition for (u)intmax_t. Do this on all architectures for consistency. Suggested by: bde [1] Approved by: kib (mentor)	2011-01-08 12:43:05 +00:00
Tijl Coosemans	9858863cd4	Fix types of some values in machine/_limits.h. On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int. However, lacking integer suffixes for types smaller than int, their type should correspond to that of an object of type unsigned char (or short) when used in an expression with objects of type int. In that case unsigned char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and USHRT_MAX should also be int. Where MIN/MAX constants implicitly have the correct type the suffix has been removed. While here, correct some comments. Reviewed by: bde Approved by: kib (mentor)	2011-01-08 11:13:34 +00:00
Konstantin Belousov	39198f15ee	Add AT_STACKPROT elf aux vector. Will be used to inform rtld about the initial stack protection set by the kernel image activator.	2011-01-07 14:22:34 +00:00
Marcel Moolenaar	0c21a60cf6	svn+ssh://svn.freebsd.org/base/head@216199	2010-12-05 20:47:36 +00:00
Rebecca Cran	c90f7d9b44	Revert r216134. This checkin broke platforms where bus_space are macros: they need to be a single statement, and do { } while (0) doesn't work in this situation so revert until a solution can be devised.	2010-12-03 07:09:23 +00:00
Rebecca Cran	15b4888a24	Disallow passing in a count of zero bytes to the bus_space(9) functions. Passing a count of zero on i386 and amd64 for [I386\|AMD64]_BUS_SPACE_MEM causes a crash/hang since the 'loop' instruction decrements the counter before checking if it's zero. PR: kern/80980 Discussed with: jhb	2010-12-02 22:19:30 +00:00
Alan Cox	b9895f9add	phys_avail[] is correctly defined as an array of vm_paddr_t's in machdep.c. Use that same type, and not vm_offset_t, in this include file.	2010-12-01 05:52:27 +00:00
John Baldwin	961135ead8	- Remove <machine/mutex.h>. Most of the headers were empty, and the contents of the ones that were not empty were stale and unused. - Now that <machine/mutex.h> no longer exists, there is no need to allow it to override various helper macros in <sys/mutex.h>. - Rename various helper macros for low-level operations on mutexes to live in the _mtx_* or __mtx_* namespaces. While here, change the names to more closely match the real API functions they are backing. - Drop support for including <sys/mutex.h> in assembly source files. Suggested by: bde (1, 2)	2010-11-09 20:46:41 +00:00
John Baldwin	b3e3402d3a	Remove unused includes of <sys/mutex.h> and <machine/mutex.h>.	2010-11-09 20:41:10 +00:00
Jung-uk Kim	2473325fa8	Reduce diff between platforms and fix style(9) bugs.	2010-11-09 00:14:39 +00:00
John Baldwin	0108cce0a4	Adjust the order of operations in spinlock_enter() and spinlock_exit() to work properly with single-stepping in a kernel debugger. Specifically, these routines have always disabled interrupts before increasing the nesting count and restored the prior state of interrupts after decreasing the nesting count to avoid problems with a nested interrupt not disabling interrupts when acquiring a spin lock. However, trap interrupts for single-stepping can still occur even when interrupts are disabled. Now the saved state of interrupts is not saved in the thread until after interrupts have been disabled and the nesting count has been increased. Similarly, the saved state from the thread cannot be read once the nesting count has been decreased to zero. To fix this, use temporary variables to store interrupt state and shuffle it between the thread's MD area and the appropriate registers. In cooperation with: bde MFC after: 1 month	2010-11-05 13:42:58 +00:00
Marcel Moolenaar	6f3544cd70	Merge svn+ssh://svn.freebsd.org/base/head@214309	2010-10-26 02:34:47 +00:00
Neel Natu	5c1a8dc028	Fix bogus error message from bus_dmamem_alloc() about incorrect alignment. The check for alignment should be made against the physical address and not the virtual address that maps it. Sponsored by: NetApp Submitted by: Will McGovern (will at netapp dot com) Reviewed by: mjacob, jhb	2010-09-29 21:53:11 +00:00
Warner Losh	01b5c01cae	Remove clauses 3 and 4, per changes to NetBSD versions of these files.	2010-09-25 04:41:42 +00:00
Andriy Gapon	3d844eddb7	bus_add_child: change type of order parameter to u_int This reflects actual type used to store and compare child device orders. Change is mostly done via a Coccinelle (soon to be devel/coccinelle) semantic patch. Verified by LINT+modules kernel builds. Followup to: r212213 MFC after: 10 days	2010-09-10 11:19:03 +00:00
John Baldwin	8c7a92bd4a	Remove unused KTRACE includes.	2010-08-19 16:41:27 +00:00
Konstantin Belousov	ee235befcb	Supply some useful information to the started image using ELF aux vectors. In particular, provide pagesize and pagesizes array, the canary value for SSP use, number of host CPUs and osreldate. Tested by: marius (sparc64) MFC after: 1 month	2010-08-17 08:55:45 +00:00
Marcel Moolenaar	b17f9ad2c9	Merge svn+ssh://svn.freebsd.org/base/head@211344	2010-08-15 22:09:43 +00:00
Konstantin Belousov	1757d9699d	Prefer struct sysentvec sv_psstrings to hardcoding FREEBSD32_PS_STRINGS in the compat32 code. Use sv_usrstack instead of FREEBSD32_USRSTACK as well. MFC after: 1 week	2010-08-07 11:57:13 +00:00
John Baldwin	d9d8d1449d	Add a new ipi_cpu() function to the MI IPI API that can be used to send an IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive. Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month	2010-08-06 15:36:59 +00:00
John Baldwin	536af0d751	Mark the __curthread() functions as __pure2 and remove the volatile keyword from the inline assembly. This allows the compiler to cache invocations of curthread since it's value does not change within a thread context. Submitted by: zec (i386) MFC after: 1 week	2010-07-29 18:44:10 +00:00
Matthew D Fleming	d7854da193	Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple uma zones for each malloc bucket size. The purpose is to isolate different malloc types into hash classes, so that any buffer overruns or use-after-free will usually only affect memory from malloc types in that hash class. This is purely a debugging tool; by varying the hash function and tracking which hash class was corrupted, the intersection of the hash classes from each instance will point to a single malloc type that is being misused. At this point inspection or memguard(9) can be used to catch the offending code. Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files. The suggestion to have this on by default came from Kostik Belousov on -arch. This code is based on work by Ron Steinke at Isilon Systems. Reviewed by: -arch (mostly silence) Reviewed by: zml Approved by: zml (mentor)	2010-07-28 15:36:12 +00:00
John Baldwin	a3870a1826	Very rough first cut at NUMA support for the physical page allocator. For now it uses a very dumb first-touch allocation policy. This will change in the future. - Each architecture indicates the maximum number of supported memory domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>. - Each cpu now has a PCPU_GET(domain) member to indicate the memory domain a CPU belongs to. Domain values are dense and numbered from 0. - When a platform supports multiple domains, the default freelist (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain. The MD code is required to populate an array of mem_affinity structures. Each entry in the array defines a range of memory (start and end) and a domain for the range. Multiple entries may be present for a single domain. The list is terminated by an entry where all fields are zero. This array of structures is used to split up phys_avail[] regions that fall in VM_FREELIST_DEFAULT into per-domain freelists. - Each memory domain has a separate lookup-array of freelists that is used when fulfulling a physical memory allocation. Right now the per-domain freelists are listed in a round-robin order for each domain. In the future a table such as the ACPI SLIT table may be used to order the per-domain lookup lists based on the penalty for each memory domain relative to a specific domain. The lookup lists may be examined via a new vm.phys.lookup_lists sysctl. - The first-touch policy is implemented by using PCPU_GET(domain) to pick a lookup list when allocating memory. Reviewed by: alc	2010-07-27 20:33:50 +00:00
Konstantin Belousov	87d45a0392	When compat32 binary asks for the value of hw.machine_arch, report the name of 32bit sibling architecture instead of the host one. Do the same for hw.machine on amd64. Add a safety belt debug.adaptive_machine_arch sysctl, to turn the substitution off. Reviewed by: jhb, nwhitehorn MFC after: 2 weeks	2010-07-22 09:13:49 +00:00
Marcel Moolenaar	092b5c88c5	Add acpi_find_table() -- a convenience function for looking up an ACPI table given the signature.	2010-07-07 20:07:33 +00:00
Marcel Moolenaar	1a6c2ccd5e	Remove pointless BOOTP conditional.	2010-07-07 19:34:48 +00:00
Marcel Moolenaar	23a2665a7e	Use an unbuffered transmit function for low-level console output.	2010-07-07 04:06:38 +00:00
Marcel Moolenaar	fdf49ca923	Switch ia64 to the unified busdma implementation.	2010-07-07 02:16:47 +00:00
Marcel Moolenaar	d6c180505a	Merge svn+ssh://svn.freebsd.org/base/head@209749	2010-07-06 23:20:43 +00:00
Marcel Moolenaar	b1b6c03e3d	Provide more examples for error injection.	2010-07-06 23:13:21 +00:00
Marcel Moolenaar	e987ee58d9	Allocate and setup an interrupt vector for corrected machine checks. For now, just print when we get the interrupt, but eventually we need to collect the details and provide a more useful report.	2010-07-03 20:19:20 +00:00
Marcel Moolenaar	57764700bc	When compiling with profiling, we define PROF for userspace and GPROF for the kernel.	2010-07-01 00:30:35 +00:00
Marcel Moolenaar	2c9459d167	While functions are ideally aligned to a 32-byte boundary, don't assume this to be the case.	2010-06-30 22:29:02 +00:00
John Baldwin	fc0de8f0b6	Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to <sys/syscallsubr.h> where all other kern_<syscall> prototypes live.	2010-06-30 18:03:42 +00:00
Marcel Moolenaar	95bf65307a	Merge svn+ssh://svn.freebsd.org/base/head@209086	2010-06-12 04:41:13 +00:00
Marcel Moolenaar	d87d5bbf82	The ptc.g operation for the Mckinley and Madison processors has the side-effect of purging more than the requested translation. While this is not a problem in general, it invalidates the assumption made during constructing the trapframe on entry into the kernel in SMP configurations. The assumption is that only the first store to the stack will possibly cause a TLB miss. Since the ptc.g purges the translation caches of all CPUs in the coherency domain, a ptc.g executed on one CPU can cause a purge on another CPU that is currently running the critical code that saves the state to the trapframe. This can cause an unexpected TLB miss and with interrupt collection disabled this means an unexpected data nested TLB fault. A data nested TLB fault will not save any context, nor provide a way for software to determine what caused the TLB miss nor where it occured. Careful construction of the kernel entry and exit code allows us to handle a TLB miss in precisely orchastrated points and thereby avoiding the need to wire the kernel stack, but the unexpected TLB miss caused by the ptc.g instructution resulted in an unrecoverable condition and resulting in machine checks. The solution to this problem is to synchronize the kernel entry on all CPUs with the use of the ptc.g instruction on a single CPU by implementing a bare-bones readers-writer lock that allows N readers (= N CPUs entering the kernel) and 1 writer (= execution of the ptc.g instruction on some CPU). This solution wins over a rendez-vous approach by not interrupting CPUs with an IPI. This problem has not been observed on the Montecito. PR: ia64/147772 MFC after: 6 days	2010-06-12 01:45:29 +00:00
Alan Cox	9124d0d6a3	Relax one of the new assertions in pmap_enter() a little. Specifically, allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)	2010-06-11 15:49:39 +00:00
Marcel Moolenaar	f635c047c5	Bump MAX_BPAGES from 256 to 1024. It seems that a few drivers, bge(4) in particular, do not handle deferred DMA map load operations at all. Any error, and especially EINPROGRESS, is treated as a hard error and typically abort the current operation. The fact that the busdma code queues the load operation for when resources (i.e. bounce buffers in this particular case) are available makes this especially problematic. Bounce buffering, unlike what the PR synopsis would suggest, works fine. While on the subject, properly implement swi_vm(). PR: 147502 MFC after: 1 week	2010-06-11 03:00:32 +00:00
Alan Cox	ce18658792	Reduce the scope of the page queues lock and the number of PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment. Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this. Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue. Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.) Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively. ARM: Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH(). Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases. PowerPC/AIM: moea_page_exits_quick() and moea_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated. Push down the page queues lock inside of moea_clear_bit(), simplifying moea_clear_modify() and moea_clear_reference(). The last parameter to moea_clear_bit() is never used. Eliminate it. PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control flow. Reviewed by: kib@	2010-06-10 16:56:35 +00:00
Marcel Moolenaar	970c23b2e6	Merge svn+ssh://svn.freebsd.org/base/head@208879	2010-06-06 21:19:04 +00:00
Alan Cox	c68c71f9b8	Simplify the inner loop of get_pv_entry(): While iterating over the page's pv list, there is no point in checking whether or not the pv list is empty, wait instead until the loop completes.	2010-05-30 20:31:12 +00:00
Alan Cox	ff8ffaf43a	Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.	2010-05-29 18:26:44 +00:00
Alan Cox	c46b90e90a	Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea_is_modified() and moea_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib	2010-05-26 18:00:44 +00:00
Marcel Moolenaar	7708106a08	Merge svn+ssh://svn.freebsd.org/base/head@208557	2010-05-26 04:14:29 +00:00

1 2 3 4 5 ...

1912 Commits