freebsd-nq

Author	SHA1	Message	Date
Jung-uk Kim	3bd12ca8f1	Tidy up global locks for ACPICA. There is no functional change.	2013-08-13 21:34:03 +00:00
Andriy Gapon	a29cc9a34b	Revert r253748,253749 This WIP should not have been committed yet. Pointyhat to: avg	2013-07-28 18:44:17 +00:00
Andriy Gapon	366d8bfb7b	put contents of cpu.h under _KERNEL no userland-serviceable parts inside MFC after: 20 days	2013-07-28 18:32:27 +00:00
Konstantin Belousov	70a7dd5d5b	Fix issues with zeroing and fetching the counters, on x86 and ppc64. Issues were noted by Bruce Evans and are present on all architectures. On i386, a counter fetch should use atomic read of 64bit value, otherwise carry from the increment on other CPU could be lost for the given fetch, making error of 2^32. If 64bit read (cmpxchg8b) is not available on the machine, it cannot be SMP and it is enough to disable preemption around read to avoid the split read. On x86 the counter increment is not atomic on purpose, which makes it possible for the store of the incremented result to override just zeroed per-cpu slot. The effect would be a counter going off by arbitrary value after zeroing. Perform the counter zeroing on the same processor which does the increments, making the operations mutually exclusive. On i386, same as for the fetching, if the cmpxchg8b is not available, machine is not SMP and we disable preemption for zeroing. PowerPC64 is treated the same as amd64. For other architectures, the changes made to allow the compilation to succeed, without fixing the issues with zeroing or fetching. It should be possible to handle them by using the 64bit loads and stores atomic WRT preemption (assuming the architectures also converted from using critical sections to proper asm). If architecture does not provide the facility, using global (spin) mutex would be non-optimal but working solution. Noted by: bde Sponsored by: The FreeBSD Foundation	2013-07-01 02:48:27 +00:00
Jung-uk Kim	b1ddd13145	Move definitions required by userland applications out of acpica_machdep.h.	2013-06-27 00:22:40 +00:00
Attilio Rao	941646f5ec	Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in order to match the MAXCPU concept. The change should also be useful for consolidation and consistency. Sponsored by: EMC / Isilon storage division Obtained from: jeff Reviewed by: alc	2013-05-07 22:46:24 +00:00
Gleb Smirnoff	4e76af6a41	Merge from projects/counters: counter(9). Introduce counter(9) API, that implements fast and raceless counters, provided (but not limited to) for gathering of statistical data. See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html for more details. In collaboration with: kib Reviewed by: luigi Tested by: ae, ray Sponsored by: Nginx, Inc.	2013-04-08 19:40:53 +00:00
Gleb Smirnoff	17dece86fe	Merge from projects/counters: Pad struct pcpu so that its size is denominator of PAGE_SIZE. This is done to reduce memory waste in UMA_PCPU_ZONE zones. Sponsored by: Nginx, Inc.	2013-04-08 19:19:10 +00:00
Marcel Moolenaar	c78a079c40	kernacc() expects all KVAs to be covered in the kernel map. With the introduction of the PBVM, this stopped being the case. Redefine the VM parameters so that the PBVM is included in the kernel map. In particular this introduces VM_INIT_KERNEL_ADDRESS to point to the base of region 5 now that VM_MIN_KERNEL_ADDRESS points to the base of region 4 to include the PBVM. While here define KERNBASE to the actual link address of the kernel as is intended. PR: 169926	2013-02-25 02:41:38 +00:00
Marcel Moolenaar	22aa5c496b	Eliminate padding by moving 'narg' next to 'code'. Both are 32-bit entities in the syscall_args structure that otherwise has 64-bit only fields.	2013-02-12 17:24:41 +00:00
Konstantin Belousov	28854834f4	Fix compilation on ia64 when page size is configured for 16KB. Reviewed by: alc, marcel	2012-10-28 11:53:54 +00:00
Alan Cox	cdd7357cc5	Port the new PV entry allocator from amd64/i386. This allocator has two advantages. First, PV entries are roughly half the size. Second, this allocator doesn't access the paging queues, and thus it allows for the removal of the page queues lock from this pmap. Replace all uses of the page queues lock by a R/W lock that is private to this pmap. Tested by: marcel	2012-10-26 03:02:39 +00:00
Marcel Moolenaar	0aa31a3393	Move PCPU initialization to a new function called cpu_pcpu_setup(). This makes it easier to add additional CPU or platform information to the per-CPU structure without duplicated code.	2012-07-08 18:00:22 +00:00
Marcel Moolenaar	373ca4ed22	Implement ia64_physmem_alloc() and use it consistently to get memory before VM has been initialized. This includes: 1. Replacing pmap_steal_memory(), 2. Replace the handcrafted logic to allocate a naturally aligned VHPT, 3. Properly allocate the DPCPU for the BSP. Ad 3: Appending the DPCPU to kernend worked as long as we wouldn't cross into the next PBVM page. If we were to cross into the next page, then there wouldn't be a PTE entry on the page table for it and we would end up with a MCA following a page fault. As such, this commit fixes MCAs occasionally seen.	2012-07-07 05:17:43 +00:00
Marcel Moolenaar	662d34aff2	Hide the creation of phys_avail behind an API to make it easier to do it correctly. We now iterate the EFI memory descriptors once and collect all the information in a single pass. This includes: 1. The I/O port base address, 2. The PAL memory region. Have the physmem API track this. 3. Memory descriptors of memory we can't use, like bad memory, runtime services code & data, etc. Have the physmem API track these. 4. memory descriptors of memory we can use or re-use, such as free memory, boot time services code & data, loader code & data, etc. These are added by the physmem API. Since the PBVM page table and pages are in memory described as loader data, inform the physmem API of chunks that need to be delated from the available physical memory. While here, remove Maxmem and replace it with the better named paddr_max. Maxmem was defined as physmem, which is generally wrong. Now, paddr_max is properly defined as the largesty physical address. The upshot of all this is that: 1. We properly determine realmem. 2. We maximize physmem by re-using memory where possible. 3. We remove complexity from ia64_init() in machdep.c. 4. Remove confusion about realmem, physmem & Maxmem. The new ia64_physmem_alloc() is to replace pmap_steal_memory() in pmap.c, as well as replace the handcrafted allocation of the VHPT for the BSP in pmap_bootstrap() in pmap.c. This is step 2 and addresses the manipulation of phys_avail after it is being created.	2012-07-07 00:25:17 +00:00
Andrew Turner	74dc547e24	Make the wchar_t type machine dependent. This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an unsigned short with the former preferred. Because of this requirement we need to move the definition of __wchar_t to a machine dependent header. It also cleans up the macros defining the limits of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine dependent header then using them to define WCHAR_MIN and WCHAR_MAX respectively. Discussed with: bde	2012-06-24 04:15:58 +00:00
Konstantin Belousov	aea810386d	Implement mechanism to export some kernel timekeeping data to usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future. The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code. The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows to turn off the mechanism. Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed. Minimal stubs neccessary for non-x86 architectures to still compile are provided. Discussed with: bde Reviewed by: jhb Tested by: flo MFC after: 1 month	2012-06-22 07:06:40 +00:00
Konstantin Belousov	232aa31fb9	Reserve AT_TIMEKEEP auxv entry for providing usermode the pointer to timekeeping information. MFC after: 1 week	2012-06-22 06:38:31 +00:00
Alan Cox	6031c68de4	The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer. Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer. As an added bonus, tidy up some nearby comments concerning page flags. Reviewed by: kib MFC after: 6 weeks	2012-06-16 18:56:19 +00:00
Bjoern A. Zeeb	920b965865	MFp4 bz_ipv6_fast: in_cksum.h required ip.h to be included for struct ip. To be able to use some general checksum functions like in_addword() in a non-IPv4 context, limit the (also exported to user space) IPv4 specific functions to the times, when the ip.h header is present and IPVERSION is defined (to 4). We should consider more general checksum (updating) functions to also allow easier incremental checksum updates in the L3/4 stack and firewalls, as well as ponder further requirements by certain NIC drivers needing slightly different pseudo values in offloading cases. Thinking in terms of a better "library". Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems Reviewed by: gnn (as part of the whole) MFC After: 3 days	2012-05-24 22:00:48 +00:00
Dimitry Andric	460378bf13	Add a convenience macro for the returns_twice attribute, and apply it to the prototypes of the appropriate functions (getcontext, savectx, setjmp, sigsetjmp and vfork). MFC after: 2 weeks	2012-04-29 11:04:31 +00:00
Tijl Coosemans	6e310b206f	Eliminate ia32_reg.h by moving its contents to x86 and ia64 reg.h. Reviewed by: kib	2012-03-18 19:12:11 +00:00
David Schultz	2ee7b1d4ae	Add C11 macros describing subnormal numbers to float.h. Reviewed by: bde	2012-01-23 06:36:41 +00:00
David Schultz	9fa03ecd01	Add parentheses where required. Without them, `sizeof LDBL_MAX' is a syntax error and shouldn't be, while `1 FLT_ROUNDS' isn't a syntax error and should be. Thanks to bde for the examples.	2012-01-20 06:51:41 +00:00
Ed Schouten	53627e400f	Replace __signed by signed. The signed keyword is an integral part of the C syntax. There's no need to use __signed.	2011-12-13 13:38:03 +00:00
David Schultz	a50079b7ff	People porting FreeBSD to new architectures ought not have to implement a deprecated FPU control interface in addition to the standard one. To make this clearer, further deprecate ieeefp.h by not declaring the function prototypes except on architectures that implement them already. Currently i386 and amd64 implement the ieeefp.h interface for compatibility, and for fp[gs]etprec(), which doesn't exist on most other hardware. Powerpc, sparc64, and ia64 partially implement it and probably shouldn't, and other architectures don't implement it at all.	2011-10-21 06:41:46 +00:00
Konstantin Belousov	6bfe4c78c8	Remove unused define. MFC after: 1 month	2011-10-07 16:09:44 +00:00
Attilio Rao	786ef92b7b	Bump MAXCPU for amd64, ia64 and XLP mips appropriately. From now on, default values for FreeBSD will be 64 maxiumum supported CPUs on amd64 and ia64 and 128 for XLP. All the other architectures seem already capped appropriately (with the exception of sparc64 which needs further support on jalapeno flavour). Bump __FreeBSD_version in order to reflect KBI/KPI brekage introduced during the infrastructure cleanup for supporting MAXCPU > 32. This covers cpumask_t retiral too. The switch is considered completed at the present time, so for whatever bug you may experience that is reconducible to that area, please report immediately. Requested by: marcel, jchandra Tested by: pluknet, sbruno Approved by: re (kib)	2011-07-19 13:00:30 +00:00
Attilio Rao	68b739cd6f	Add the possibility to specify from kernel configs MAXCPU value. This patch is going to help in cases like mips flavours where you want a more granular support on MAXCPU. No MFC is previewed for this patch. Tested by: pluknet Approved by: re (kib)	2011-07-19 00:37:24 +00:00
Marcel Moolenaar	303c22c53b	Add a few more helper functions for working with memory descriptors: o efi_md_find() - returns the md that covers the given address o efi_md_last() - returns the last md in the list o efi_md_prev() - returns the md that preceeds the given md.	2011-07-16 19:56:07 +00:00
Marcel Moolenaar	0eb8c1410a	Implement basic support for memory attributes. At this time we only distinguish between UC and WB memory so that we can map the page to either a region 6 address (for UC) or a region 7 address (for WB). This change is only now possible, because previously we would map regions 6 and 7 with 256MB translations and on top of that had the kernel mapped in region 7 using a wired translation. The introduction of the PBVM moved the kernel into its own region and freed up region 7 and allowed us to revert to standard page-sized translations. This commit inroduces pmap_page_to_va() that respects the attribute.	2011-07-08 16:30:54 +00:00
Marcel Moolenaar	e920e3978e	Switch to the event timers infrastructure. This includes: o Setting td_intr_frame to the XIVs trap frame because it's referenced by the ET event handler. o Signal EOI to the CPU before calling the registered XIV handlers. This prevents lost ITC interrupts, which cause starvation in one-shot mode. o Adding support for IPI_HARDCLOCK with corresponding per-CPU counters. o Have the APs call cpu_initclocks() so as to limited the scattering of clock related initialization. cpu_initclocks() calls the <self>_bsp() or <self>_ap() version accordingly. o Uncomment the ET clock handling in cpu_idle(). o Update the DDB 'show pcpu' output for the new MD fields. o Entirely rewritten ia64_ih_clock(). Note that we don't create as many clock XIVs as we have CPUs, as is done on PowerPC. It doesn't scale. We can only have 240 XIVs and we can have more CPUs than that. There's a single intrcnt index for the cumulative clock ticks and we keep per CPU counts in the PCPU stats structure. o Register the ITC by hooking SI_SUB_CONFIGURE (2nd order). Open issues: o Clock interrupts can still be lost. Some tweaking is still necessary. Thanks to: mav@ for his support, feedback and explanations. ET stats while committing: eris% sysctl machdep.cpu \| grep nclks machdep.cpu.0.nclks: 24007 machdep.cpu.1.nclks: 22895 machdep.cpu.2.nclks: 13523 machdep.cpu.3.nclks: 9342 machdep.cpu.4.nclks: 9103 machdep.cpu.5.nclks: 9298 machdep.cpu.6.nclks: 10039 machdep.cpu.7.nclks: 9479 eris% vmstat -i \| grep clock clock 108599 50	2011-06-25 02:15:14 +00:00
Marcel Moolenaar	acd1d4d28e	Properly serialize the global shootdown with the instruction stream of the local processor. Also explicitly invalidate the ALAT. This is done on the other CPUs in the coherence domain by virtue of the ptc.ga instruction, but does not apply to the local CPU.	2011-06-17 04:26:03 +00:00
Attilio Rao	c02f1527a9	MFC	2011-05-14 19:20:13 +00:00
Marcel Moolenaar	7fb64531d3	Be pedantic: mark the pcpu pointer (= register r13) itself as volatile.	2011-05-14 14:40:24 +00:00
Marcel Moolenaar	dc03be9d67	Turn ia64_srlz() and ia64_srlz_i() into defines so that the code is still correct when inlining is disabled.	2011-05-14 14:36:08 +00:00
Attilio Rao	b2aa562e7b	MFC	2011-05-13 20:58:48 +00:00
Matthew D Fleming	cfb00e5aa7	Move the ZERO_REGION_SIZE to a machine-dependent file, as on many architectures (i386, for example) the virtual memory space may be constrained enough that 2MB is a large chunk. Use 64K for arches other than amd64 and ia64, with special handling for sparc64 due to differing hardware. Also commit the comment changes to kmem_init_zero_region() that I missed due to not saving the file. (Darn the unfamiliar development environment). Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you see fit. Requested by: alc MFC after: 1 week MFC with: r221853	2011-05-13 19:35:01 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
Marcel Moolenaar	6dfe4f958f	Don't use the whole region 5 for KVA, because the CPU may not implement all of the 61 bits available within the region for virtual addressing. Since there's no good way for us to map out the gap in the virtual address space, limit KVA to the architectural minimum implemented address bits. This still gives us 1 petabyte of KVA, so no worries.	2011-05-02 17:49:05 +00:00
Marcel Moolenaar	7df304f3e0	Stop linking against a direct-mapped virtual address and instead use the PBVM. This eliminates the implied hardcoding of the physical address at which the kernel needs to be loaded. Using the PBVM makes it possible to load the kernel irrespective of the physical memory organization and allows us to replicate kernel text on NUMA machines. While here, reduce the direct-mapped page size to the kernel's page size so that we can support memory attributes better.	2011-04-30 20:49:00 +00:00
Marcel Moolenaar	682bf0a7e1	Remove prototypes of non-existent functions.	2011-04-25 22:38:09 +00:00
Marcel Moolenaar	76ceb3c6ee	Use the new arch_loadaddr I/F to align ELF objects to PBVM page boundaries. For good measure, align all other objects to cache lines boundaries. Use the new arch_loadseg I/F to keep track of kernel text and data so that we can wire as much of it as is possible. It is the responsibility of the kernel to link critical (read IVT related) code and data at the front of the respective segment so that it's covered by TRs before the kernel has a chance to add more translations. Use a better way of determining whether we're loading a legacy kernel or not. We can't check for the presence of the PBVM page table, because we may have unloaded that kernel and loaded an older (legacy) kernel after that. Simply use the latest load address for it.	2011-04-03 23:49:20 +00:00
Alan Cox	5adad80656	Eliminate an unused definition. Reviewed by: marcel	2011-03-26 20:40:33 +00:00
Marcel Moolenaar	0355a8b24b	Fix switching to physical mode as part of calling into EFI runtime services or PAL procedures. The new implementation is based on specific functions that are known to be called in certain scenarios only. This in particular fixes the PAL call to obtain information about translation registers. In general, the new implementation does not bank on virtual addresses being direct-mapped and will work when the kernel uses PBVM. When new scenarios need to be supported, new functions are added if the existing functions cannot be changed to handle the new scenario. If a single generic implementation is possible, it will become clear in due time. While here, change bootinfo to a pointer type in anticipation of future development.	2011-03-21 18:20:53 +00:00
Marcel Moolenaar	7c9eed5c4e	Change region 4 to be part of the kernel. This serves 2 purposes: 1. The PBVM is in region 4, so if we want to make use of it, we need region 4 freed up. 2. Region 4 and above cannot be represented by an off_t by virtue of that type being signed. This is problematic for truss(1), ktrace(1) and other such programs.	2011-03-21 01:09:50 +00:00
Marcel Moolenaar	45c0ab27b1	Use VM_MAXUSER_ADDRESS rather than VM_MAX_ADDRESS when we talk about the bounds of user space. Redefine VM_MAX_ADDRESS as ~0UL, even though it's not used anywhere in the source tree.	2011-03-18 15:36:28 +00:00
Marcel Moolenaar	18d9407a9f	MFaltix: Add support for Pre-Boot Virtual Memory (PBVM) to the loader. PBVM allows us to link the kernel at a fixed virtual address without having to make any assumptions about the physical memory layout. On the SGI Altix 350 for example, there's no usuable physical memory below 192GB. Also, the PBVM allows us to control better where we're going to physically load the kernel and its modules so that we can make sure we load the kernel in memory that's close to the BSP. The PBVM is managed by a simple page table. The minimum size of the page table is 4KB (EFI page size) and the maximum is currently set to 1MB. A page in the PBVM is 64KB, as that's the maximum alignment one can specify in a linker script. The bottom line is that PBVM is between 64KB and 8GB in size. The loader maps the PBVM page table at a fixed virtual address and using a single translations. The PBVM itself is also mapped using a single translation for a maximum of 32MB. While here, increase the heap in the EFI loader from 512KB to 2MB and set the stage for supporting relocatable modules.	2011-03-16 03:53:18 +00:00
Marcel Moolenaar	6f181a80f8	Don't define IA64_PBVM_PAGE_SIZE as 1U shifted to the left by some amount. The compiler seems to assume it's a 32-bit integral and rounding to the page size using the standard expression (((u_long)(x) + mask) & ~mask), results in a 32-bit value. Dropping the 'U' suffix is enough to have the compiler treat the expression as a 64-bit integral.	2011-03-14 23:49:41 +00:00
Marcel Moolenaar	c94018bcd8	o Deal with unmapped PBVM in the alternate instruction and data TLB fault handlers. o Put the IVT in its own section and keep the supporting code close. o Make sure the VHPT is sized so that it can be mapped using a single translation. o Map the PAL code and VHPT with a translation that has the right size. Assume the platform has a PAL code size that can be mapped with a single translations. o Pass the pointer to the bootinfo structure as an argument to ia64_init(). o Get rid of LOG2_ID_PAGE_SIZE and IA64_ID_PAGE_SIZE. It was used to map the regions 6 & 7 and was as large as possible. The problem is that we can't support memory attributes easily if the granuratity is not a page. We need to support memory attributes because the new USB stack violates the BUS_DMA(9) interface. o Update some comments... NOTE: this is broken for SMP kernels, because the AP startup code hasn't been updated yet.	2011-03-14 05:29:45 +00:00

1 2 3 4 5 ...

641 Commits