freebsd-skq

Author	SHA1	Message	Date
Marcel Moolenaar	e920e3978e	Switch to the event timers infrastructure. This includes: o Setting td_intr_frame to the XIVs trap frame because it's referenced by the ET event handler. o Signal EOI to the CPU before calling the registered XIV handlers. This prevents lost ITC interrupts, which cause starvation in one-shot mode. o Adding support for IPI_HARDCLOCK with corresponding per-CPU counters. o Have the APs call cpu_initclocks() so as to limit the scattering of clock related initialization. cpu_initclocks() calls the <self>_bsp() or <self>_ap() version accordingly. o Uncomment the ET clock handling in cpu_idle(). o Update the DDB 'show pcpu' output for the new MD fields. o Entirely rewritten ia64_ih_clock(). Note that we don't create as many clock XIVs as we have CPUs, as is done on PowerPC. It doesn't scale. We can only have 240 XIVs and we can have more CPUs than that. There's a single intrcnt index for the cumulative clock ticks and we keep per CPU counts in the PCPU stats structure. o Register the ITC by hooking SI_SUB_CONFIGURE (2nd order). Open issues: o Clock interrupts can still be lost. Some tweaking is still necessary. Thanks to: mav@ for his support, feedback and explanations. ET stats while committing: eris% sysctl machdep.cpu | grep nclks machdep.cpu.0.nclks: 24007 machdep.cpu.1.nclks: 22895 machdep.cpu.2.nclks: 13523 machdep.cpu.3.nclks: 9342 machdep.cpu.4.nclks: 9103 machdep.cpu.5.nclks: 9298 machdep.cpu.6.nclks: 10039 machdep.cpu.7.nclks: 9479 eris% vmstat -i | grep clock clock 108599 50	2011-06-25 02:15:14 +00:00
Marcel Moolenaar	acd1d4d28e	Properly serialize the global shootdown with the instruction stream of the local processor. Also explicitly invalidate the ALAT. This is done on the other CPUs in the coherence domain by virtue of the ptc.ga instruction, but does not apply to the local CPU.	2011-06-17 04:26:03 +00:00
Attilio Rao	c02f1527a9	MFC	2011-05-14 19:20:13 +00:00
Marcel Moolenaar	7fb64531d3	Be pedantic: mark the pcpu pointer (= register r13) itself as volatile.	2011-05-14 14:40:24 +00:00
Marcel Moolenaar	dc03be9d67	Turn ia64_srlz() and ia64_srlz_i() into defines so that the code is still correct when inlining is disabled.	2011-05-14 14:36:08 +00:00
Attilio Rao	b2aa562e7b	MFC	2011-05-13 20:58:48 +00:00
Matthew D Fleming	cfb00e5aa7	Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853	2011-05-13 19:35:01 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t, on the other hand, is implemented as an array of longs, and easily extensible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, need to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
Marcel Moolenaar	6dfe4f958f	Don't use the whole region 5 for KVA, because the CPU may not implement all of the 61 bits available within the region for virtual addressing. Since there's no good way for us to map out the gap in the virtual address space, limit KVA to the architectural minimum implemented address bits. This still gives us 1 petabyte of KVA, so no worries.	2011-05-02 17:49:05 +00:00
Marcel Moolenaar	7df304f3e0	Stop linking against a direct-mapped virtual address and instead use the PBVM. This eliminates the implied hardcoding of the physical address at which the kernel needs to be loaded. Using the PBVM makes it possible to load the kernel irrespective of the physical memory organization and allows us to replicate kernel text on NUMA machines. While here, reduce the direct-mapped page size to the kernel's page size so that we can support memory attributes better.	2011-04-30 20:49:00 +00:00
Marcel Moolenaar	682bf0a7e1	Remove prototypes of non-existent functions.	2011-04-25 22:38:09 +00:00
Marcel Moolenaar	76ceb3c6ee	Use the new arch_loadaddr I/F to align ELF objects to PBVM page boundaries. For good measure, align all other objects to cache line boundaries. Use the new arch_loadseg I/F to keep track of kernel text and data so that we can wire as much of it as is possible. It is the responsibility of the kernel to link critical (read IVT related) code and data at the front of the respective segment so that it's covered by TRs before the kernel has a chance to add more translations. Use a better way of determining whether we're loading a legacy kernel or not. We can't check for the presence of the PBVM page table, because we may have unloaded that kernel and loaded an older (legacy) kernel after that. Simply use the latest load address for it.	2011-04-03 23:49:20 +00:00
Alan Cox	5adad80656	Eliminate an unused definition. Reviewed by: marcel	2011-03-26 20:40:33 +00:00
Marcel Moolenaar	0355a8b24b	Fix switching to physical mode as part of calling into EFI runtime services or PAL procedures. The new implementation is based on specific functions that are known to be called in certain scenarios only. This in particular fixes the PAL call to obtain information about translation registers. In general, the new implementation does not bank on virtual addresses being direct-mapped and will work when the kernel uses PBVM. When new scenarios need to be supported, new functions are added if the existing functions cannot be changed to handle the new scenario. If a single generic implementation is possible, it will become clear in due time. While here, change bootinfo to a pointer type in anticipation of future development.	2011-03-21 18:20:53 +00:00
Marcel Moolenaar	7c9eed5c4e	Change region 4 to be part of the kernel. This serves 2 purposes: 1. The PBVM is in region 4, so if we want to make use of it, we need region 4 freed up. 2. Region 4 and above cannot be represented by an off_t by virtue of that type being signed. This is problematic for truss(1), ktrace(1) and other such programs.	2011-03-21 01:09:50 +00:00
Marcel Moolenaar	45c0ab27b1	Use VM_MAXUSER_ADDRESS rather than VM_MAX_ADDRESS when we talk about the bounds of user space. Redefine VM_MAX_ADDRESS as ~0UL, even though it's not used anywhere in the source tree.	2011-03-18 15:36:28 +00:00
Marcel Moolenaar	18d9407a9f	MFaltix: Add support for Pre-Boot Virtual Memory (PBVM) to the loader. PBVM allows us to link the kernel at a fixed virtual address without having to make any assumptions about the physical memory layout. On the SGI Altix 350 for example, there's no usable physical memory below 192GB. Also, the PBVM allows us to control better where we're going to physically load the kernel and its modules so that we can make sure we load the kernel in memory that's close to the BSP. The PBVM is managed by a simple page table. The minimum size of the page table is 4KB (EFI page size) and the maximum is currently set to 1MB. A page in the PBVM is 64KB, as that's the maximum alignment one can specify in a linker script. The bottom line is that PBVM is between 64KB and 8GB in size. The loader maps the PBVM page table at a fixed virtual address and using a single translation. The PBVM itself is also mapped using a single translation for a maximum of 32MB. While here, increase the heap in the EFI loader from 512KB to 2MB and set the stage for supporting relocatable modules.	2011-03-16 03:53:18 +00:00
Marcel Moolenaar	6f181a80f8	Don't define IA64_PBVM_PAGE_SIZE as 1U shifted to the left by some amount. The compiler seems to assume it's a 32-bit integral and rounding to the page size using the standard expression (((u_long)(x) + mask) & ~mask), results in a 32-bit value. Dropping the 'U' suffix is enough to have the compiler treat the expression as a 64-bit integral.	2011-03-14 23:49:41 +00:00
Marcel Moolenaar	c94018bcd8	o Deal with unmapped PBVM in the alternate instruction and data TLB fault handlers. o Put the IVT in its own section and keep the supporting code close. o Make sure the VHPT is sized so that it can be mapped using a single translation. o Map the PAL code and VHPT with a translation that has the right size. Assume the platform has a PAL code size that can be mapped with a single translation. o Pass the pointer to the bootinfo structure as an argument to ia64_init(). o Get rid of LOG2_ID_PAGE_SIZE and IA64_ID_PAGE_SIZE. It was used to map the regions 6 & 7 and was as large as possible. The problem is that we can't support memory attributes easily if the granularity is not a page. We need to support memory attributes because the new USB stack violates the BUS_DMA(9) interface. o Update some comments... NOTE: this is broken for SMP kernels, because the AP startup code hasn't been updated yet.	2011-03-14 05:29:45 +00:00
Marcel Moolenaar	7bd6af277d	First cut at having the kernel run within the PBVM: o The bootinfo structure is now a virtual pointer. o Replace VM_MAX_ADDRESS with VM_MAXUSER_ADDRESS and redefine VM_MAX_ADDRESS as the maximum address possible (~0UL). o Since we're not using direct-mapped translations, switching to physical addressing is less trivial. Reserve the boot stack for running in physical mode and special-case the EFI call, as we're still on the boot stack. o Region 4 belongs to the kernel now, not process space.	2011-03-12 02:00:28 +00:00
Marcel Moolenaar	dd01d03463	o Add defines for Pre-Boot Virtual Memory (PBVM) o Move the backing store in the top half of region 0 now that region 4 is re-assigned to be part of the kernel. o De-emphasize VM_MAX_ADDRESS. It's really not used anywhere and probably means something different than the limit for process address space (we have VM_MAXUSER_ADDRESS for that). o Exclude the gateway page from VM_MAXUSER_ADDRESS (i.e. make it the same as VM_MAX_ADDRESS).	2011-03-11 22:00:45 +00:00
Marcel Moolenaar	9706a84ba0	Add fields for the PBVM page table address and size.	2011-03-11 21:54:45 +00:00
Marcel Moolenaar	9b4fcf851a	Merge svn+ssh://svn.freebsd.org/base/head@218816	2011-02-18 21:39:09 +00:00
Alan Cox	e6ffa21488	Remove pmap fields that are either unused or not fully implemented. Discussed with: kib	2011-02-17 15:36:29 +00:00
Jung-uk Kim	2fea643112	Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set(). Compile sys/dev/mem/memutil.c for all supported platforms and remove now unnecessary dev_mem_md_init(). Consistently define mem_range_softc from mem.c for all platforms. Add missing #include guards for machine/memdev.h and sys/memrange.h. Clean up some nearby style(9) nits. MFC after: 1 month	2011-01-17 22:58:28 +00:00
Konstantin Belousov	50a57dfbec	Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h. Update the outdated comments describing MAXSLP and the process selection algorithm for swap out. Comments wording and reviewed by: alc	2011-01-09 12:50:44 +00:00
David Schultz	9719c5a6ab	The highest-precision floating point type on ia64 has 64 bits of precision, so DECIMAL_DIG should be 21, as on i386/amd64.	2011-01-09 06:05:02 +00:00
Tijl Coosemans	a56e818f29	On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and corresponding macros) are different from 32 bit. [1] Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX. Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition for (u)intmax_t. Do this on all architectures for consistency. Suggested by: bde [1] Approved by: kib (mentor)	2011-01-08 12:43:05 +00:00
Tijl Coosemans	9858863cd4	Fix types of some values in machine/_limits.h. On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int. However, lacking integer suffixes for types smaller than int, their type should correspond to that of an object of type unsigned char (or short) when used in an expression with objects of type int. In that case unsigned char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and USHRT_MAX should also be int. Where MIN/MAX constants implicitly have the correct type the suffix has been removed. While here, correct some comments. Reviewed by: bde Approved by: kib (mentor)	2011-01-08 11:13:34 +00:00
Konstantin Belousov	39198f15ee	Add AT_STACKPROT elf aux vector. Will be used to inform rtld about the initial stack protection set by the kernel image activator.	2011-01-07 14:22:34 +00:00
Marcel Moolenaar	0c21a60cf6	svn+ssh://svn.freebsd.org/base/head@216199	2010-12-05 20:47:36 +00:00
Rebecca Cran	c90f7d9b44	Revert r216134. This checkin broke platforms where bus_space are macros: they need to be a single statement, and do { } while (0) doesn't work in this situation so revert until a solution can be devised.	2010-12-03 07:09:23 +00:00
Rebecca Cran	15b4888a24	Disallow passing in a count of zero bytes to the bus_space(9) functions. Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM causes a crash/hang since the 'loop' instruction decrements the counter before checking if it's zero. PR: kern/80980 Discussed with: jhb	2010-12-02 22:19:30 +00:00
Alan Cox	b9895f9add	phys_avail[] is correctly defined as an array of vm_paddr_t's in machdep.c. Use that same type, and not vm_offset_t, in this include file.	2010-12-01 05:52:27 +00:00
John Baldwin	961135ead8	- Remove <machine/mutex.h>. Most of the headers were empty, and the contents of the ones that were not empty were stale and unused. - Now that <machine/mutex.h> no longer exists, there is no need to allow it to override various helper macros in <sys/mutex.h>. - Rename various helper macros for low-level operations on mutexes to live in the _mtx_* or __mtx_* namespaces. While here, change the names to more closely match the real API functions they are backing. - Drop support for including <sys/mutex.h> in assembly source files. Suggested by: bde (1, 2)	2010-11-09 20:46:41 +00:00
Marcel Moolenaar	6f3544cd70	Merge svn+ssh://svn.freebsd.org/base/head@214309	2010-10-26 02:34:47 +00:00
Warner Losh	01b5c01cae	Remove clauses 3 and 4, per changes to NetBSD versions of these files.	2010-09-25 04:41:42 +00:00
Konstantin Belousov	ee235befcb	Supply some useful information to the started image using ELF aux vectors. In particular, provide pagesize and pagesizes array, the canary value for SSP use, number of host CPUs and osreldate. Tested by: marius (sparc64) MFC after: 1 month	2010-08-17 08:55:45 +00:00
Marcel Moolenaar	b17f9ad2c9	Merge svn+ssh://svn.freebsd.org/base/head@211344	2010-08-15 22:09:43 +00:00
John Baldwin	d9d8d1449d	Add a new ipi_cpu() function to the MI IPI API that can be used to send an IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive. Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month	2010-08-06 15:36:59 +00:00
John Baldwin	536af0d751	Mark the __curthread() functions as __pure2 and remove the volatile keyword from the inline assembly. This allows the compiler to cache invocations of curthread since its value does not change within a thread context. Submitted by: zec (i386) MFC after: 1 week	2010-07-29 18:44:10 +00:00
John Baldwin	a3870a1826	Very rough first cut at NUMA support for the physical page allocator. For now it uses a very dumb first-touch allocation policy. This will change in the future. - Each architecture indicates the maximum number of supported memory domains via a new VM_NDOMAIN parameter in <machine/vmparam.h>. - Each cpu now has a PCPU_GET(domain) member to indicate the memory domain a CPU belongs to. Domain values are dense and numbered from 0. - When a platform supports multiple domains, the default freelist (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain. The MD code is required to populate an array of mem_affinity structures. Each entry in the array defines a range of memory (start and end) and a domain for the range. Multiple entries may be present for a single domain. The list is terminated by an entry where all fields are zero. This array of structures is used to split up phys_avail[] regions that fall in VM_FREELIST_DEFAULT into per-domain freelists. - Each memory domain has a separate lookup-array of freelists that is used when fulfilling a physical memory allocation. Right now the per-domain freelists are listed in a round-robin order for each domain. In the future a table such as the ACPI SLIT table may be used to order the per-domain lookup lists based on the penalty for each memory domain relative to a specific domain. The lookup lists may be examined via a new vm.phys.lookup_lists sysctl. - The first-touch policy is implemented by using PCPU_GET(domain) to pick a lookup list when allocating memory. Reviewed by: alc	2010-07-27 20:33:50 +00:00
Konstantin Belousov	87d45a0392	When compat32 binary asks for the value of hw.machine_arch, report the name of 32bit sibling architecture instead of the host one. Do the same for hw.machine on amd64. Add a safety belt debug.adaptive_machine_arch sysctl, to turn the substitution off. Reviewed by: jhb, nwhitehorn MFC after: 2 weeks	2010-07-22 09:13:49 +00:00
Marcel Moolenaar	092b5c88c5	Add acpi_find_table() -- a convenience function for looking up an ACPI table given the signature.	2010-07-07 20:07:33 +00:00
Marcel Moolenaar	23a2665a7e	Use an unbuffered transmit function for low-level console output.	2010-07-07 04:06:38 +00:00
Marcel Moolenaar	fdf49ca923	Switch ia64 to the unified busdma implementation.	2010-07-07 02:16:47 +00:00
Marcel Moolenaar	d6c180505a	Merge svn+ssh://svn.freebsd.org/base/head@209749	2010-07-06 23:20:43 +00:00
Marcel Moolenaar	e987ee58d9	Allocate and setup an interrupt vector for corrected machine checks. For now, just print when we get the interrupt, but eventually we need to collect the details and provide a more useful report.	2010-07-03 20:19:20 +00:00
Marcel Moolenaar	57764700bc	When compiling with profiling, we define PROF for userspace and GPROF for the kernel.	2010-07-01 00:30:35 +00:00
Marcel Moolenaar	2c9459d167	While functions are ideally aligned to a 32-byte boundary, don't assume this to be the case.	2010-06-30 22:29:02 +00:00
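
The rounding pitfall described in commit 6f181a80f8 (IA64_PBVM_PAGE_SIZE defined with a 'U' suffix) can be reproduced outside the kernel. The following is a minimal sketch, not the kernel's code: the macro and function names are hypothetical, the 64KB page size is taken from the PBVM commit message, and a 32-bit int with a 64-bit uint64_t is assumed.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two ways of defining a 64KB page size. */
#define PAGE_SIZE_UNSIGNED (1U << 16)   /* type: unsigned int */
#define PAGE_SIZE_SIGNED   (1 << 16)    /* type: int */

/*
 * The mask is an unsigned int, so ~mask is the 32-bit value 0xFFFF0000.
 * Converting it to 64 bits zero-extends it, and the AND then clears the
 * upper 32 bits of the address.
 */
static uint64_t
round_up_unsigned(uint64_t x)
{
	return ((x + (PAGE_SIZE_UNSIGNED - 1)) & ~(PAGE_SIZE_UNSIGNED - 1));
}

/*
 * The mask is an int, so ~mask is negative (-65536). Converting it to
 * 64 bits sign-extends it to 0xFFFFFFFFFFFF0000, which preserves the
 * upper 32 bits of the address.
 */
static uint64_t
round_up_signed(uint64_t x)
{
	return ((x + (PAGE_SIZE_SIGNED - 1)) & ~(PAGE_SIZE_SIGNED - 1));
}
```

Under those assumptions, rounding the address 0x100000001 yields 0x10000 with the unsigned definition and 0x100010000 with the signed one, which matches the commit's observation that dropping the 'U' suffix is enough.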

610 Commits