freebsd-skq

Author	SHA1	Message	Date
jkim	a62bf181b5	Add AMD Family 0Fh, Model 6Bh, Stepping 2 to the list of invariant TSCs and fix i386 test.	2008-10-22 17:30:37 +00:00
jkim	d0d7e3dcb3	Set kern.timecounter.invariant_tsc to 1 for AMD CPU family 10h and higher even if BIOS does not advertise it.	2008-10-22 00:01:53 +00:00
jkim	0b6f0646df	Turn off CPU frequency change notifiers when the TSC is P-state invariant or it is forced by setting 'kern.timecounter.invariant_tsc' tunable to non-zero.	2008-10-21 00:38:00 +00:00
jkim	d4d906ba86	Detect Advanced Power Management Information for AMD CPUs.	2008-10-21 00:17:55 +00:00
kib	29ccf7d166	Correctly fill siginfo for the signals delivered by linux tkill/tgkill. It is required for async cancellation to work. Fix PROC_LOCK leak in linux_tgkill when signal delivery attempt is made to not linux process. Do not call em_find(p, ...) with p unlocked. Move common code for linux_tkill() and linux_tgkill() into linux_do_tkill(). Change linux siginfo_t definition to match actual linux one. Extend uid fields to 4 bytes from 2. The extension does not change structure layout and is binary compatible with previous definition, because i386 is little endian, and each uid field has 2 byte padding after it. Reported by: Nicolas Joly <njoly pasteur fr> Submitted by: dchagin MFC after: 1 month	2008-10-19 10:02:26 +00:00
kib	98d25e143a	Set PCB_32BIT and clear PCB_GS32BIT for linux32 binaries. Tested by: dchagin MFC after: 3 days	2008-10-18 13:39:22 +00:00
kib	faae1c0f2f	Make robust futexes work on linux32/amd64. Use PTRIN to read user-mode pointers. Change types used in the structures definitions to properly-sized architecture-specific types. Submitted by: dchagin MFC after: 1 week	2008-10-14 07:59:23 +00:00
davidxu	8a22cb4e57	If the current thread has the trap bit set (i.e. a debugger had single stepped the process to the system call), we need to clear the trap flag from the new frame. Otherwise, the new thread will receive a (likely unexpected) SIGTRAP when it executes the first instruction after returning to userland.	2008-10-05 02:03:54 +00:00
stas	2a2b3ea928	- Add driver for Attansic L2 FastEthernet controller found on Asus EeePC and some Asus mainboards. Reviewed by: yongari, rpaulo, jhb Tested by: many Approved by: kib (mentor) MFC after: 1 week	2008-10-03 10:31:31 +00:00
peter	ed8d07f232	Collect N identical (or near identical) mkdumpheader() implementations into one, as threatened in the comment. Textdump magic can be passed in.	2008-10-01 22:08:53 +00:00
jhb	593ef76d8c	Bump MAXCPU to 32 now that 32 CPU x86 systems exist. Tested by: rwatson, mdtansca Approved by: peter	2008-10-01 21:59:04 +00:00
marius	a1ec700ce8	Remove ipi_all() and ipi_self() as the former hasn't been used at all to date and the latter also is only used in ia64 and powerpc code which no longer serves a real purpose after bring-up and just can be removed as well. Note that architectures like sun4u also provide no means of implementing IPI'ing a CPU itself natively in the first place. Suggested by: jhb Reviewed by: arch, grehan, jhb	2008-09-28 18:34:14 +00:00
ed	4efdef565f	Replace all calls to minor() with dev2unit(). After I removed all the unit2minor()/minor2unit() calls from the kernel yesterday, I realised calling minor() everywhere is quite confusing. Character devices now only have the ability to store a unit number, not a minor number. Remove the confusion by using dev2unit() everywhere. This commit could also be considered as a bug fix. A lot of drivers call minor(), while they should actually be calling dev2unit(). In -CURRENT this isn't a problem, but it turns out we never had any problem reports related to that issue in the past. I suspect not many people connect more than 256 pieces of the same hardware. Reviewed by: kib	2008-09-27 08:51:18 +00:00
kib	c500808674	Change the static struct sysentvec and struct Elf_Brandinfo initializers to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc. Explicitly initialize sysentvec.sv_maxssiz that was missed in most sysvecs. No objection from: jhb MFC after: 1 month	2008-09-24 10:14:37 +00:00
jhb	7dfce995e4	MFC: Update comments about the 0xcf9 register reset method. Approved by: re (kensmith, kib)	2008-09-18 20:20:28 +00:00
stas	f6aef556ae	- Recognize SAVE and OSXSAVE extended processor features. Approved by: kib (mentor) MFC after: 1 month	2008-09-18 18:51:32 +00:00
jkoshy	a9cbfb55cd	Correct a callchain capture bug on the i386. On the i386 architecture, the processor only saves the current value of `%esp' on stack if a privilege switch is necessary when entering the interrupt handler. Thus, `frame->tf_esp' is only valid for an entry from user mode. For interrupts taken in kernel mode, we need to determine the top-of-stack for the interrupted kernel procedure by adding the appropriate offset to the current frame pointer. Reported by: kris, Fabien Thomas Tested by: Fabien Thomas <fabien.thomas at netasq dot com>	2008-09-15 06:47:52 +00:00
jhb	0ae31bdab7	MFC: Adjust the handling of the various timer frequencies when using the lapic timer to use dynamic divisors and try to keep stathz around 128. Approved by: re (kib)	2008-09-12 21:41:21 +00:00
jhb	1d4a561a28	Add a 'hw.pci.mcfg' tunable. It can be set to 0 to disable memory-mapped PCI config access.	2008-09-11 21:42:11 +00:00
jhb	8470014e06	Update the comments above the 0xcf9 register reset attempt to match the code. We only attempt a single reset using this method (a "hard" reset), and we use two writes to ensure there is a 0 -> 1 transition in bit 2 to force a reset. MFC after: 1 week	2008-09-11 18:33:57 +00:00
jhb	ddef310111	Some K8 chipsets don't expose all of the PCI devices on bus 0 via PCIe memory-mapped config access. Add a workaround for these systems by checking the first function of each slot on bus 0 using both the memory-mapped config access and the older type 1 I/O port config access. If we find a slot that is only visible via the type 1 I/O port config access, we flag that slot. Future PCI config transactions to flagged slots on bus 0 use type 1 I/O port config access rather than memory mapped config access.	2008-09-10 18:06:08 +00:00
kib	f444f744fa	The pcb_gs32p should be per-cpu, not per-thread pointer. This is location in GDT where the segment descriptor from pcb_gs32sd is copied, and the location is in GDT local to CPU. Noted and reviewed by: peter MFC after: 1 week	2008-09-08 09:59:05 +00:00
kib	73306a5435	Provide private per-CPU GDTs on amd64. This is required at least for the linux PCB_GS32BIT to work. Noted by: nox Reviewed by: peter MFC after: 1 week	2008-09-08 09:55:51 +00:00
kib	396f47bf91	In linux_set_thread_area(), mark pcb as PCB_GS32BIT. This was missed when r180992 was committed. Reviewed by: peter MFC after: 1 week	2008-09-08 09:09:23 +00:00
kib	c593a271c4	Fix inconsistencies in the comments. MFC after: 1 week	2008-09-08 08:58:29 +00:00
kib	a568a3185e	Segment registers are stored in the uc_mcontext member of the struct l_ucontext. To restore the registers content, trampoline needs to dereference uc_mcontext instead of taking some undefined values from l_ucontext. Submitted by: Dmitry Chagin <dchagin@> MFC after: 1 week	2008-09-07 16:39:21 +00:00
simon	f095193224	- Fix amd64 local privilege escalation. [08:07] - Fix nmount(2) local privilege escalation. [08:08] - Fix IPv6 remote kernel panics. [08:09] Fix for [08:07] is merge of r181823. Submitted by: kib [08:07], csjp [08:08], bz [08:09] Reviewed by: peter [08:07], jhb [08:07] Reviewed by: jinmei [08:09], rwatson [08:09] Approved by: re (SA blanket) Approved by: so (simon) Security: FreeBSD-SA-08:07.amd64 Security: FreeBSD-SA-08:08.nmount Security: FreeBSD-SA-08:09.icmp6	2008-09-03 19:09:47 +00:00
kib	59a00054ac	- When executing FreeBSD/amd64 binaries from FreeBSD/i386 or Linux/i386 processes, clear PCB_32BIT and PCB_GS32BIT bits [1]. - Reread the fs and gs bases from the msr unconditionally, not believing the values in pcb_fsbase and pcb_gsbase, since usermode may reload segment registers, invalidating the cache. [2]. Both problems resulted in the wrong fs base, causing wrong tls pointer be dereferenced in the usermode. Reported and tested by: Vyacheslav Bocharov <adeepv at gmail com> [1] Reported by: Bernd Walter <ticsoat cicely7 cicely de>, Artem Belevich <fbsdlist at src cx>[2] Reviewed by: peter MFC after: 3 days	2008-09-02 17:52:11 +00:00
jkim	e41f677c9f	Move empty filter handling to MI source. MFC after: 3 days	2008-08-26 21:06:31 +00:00
jkim	e21d933237	Fix a typo in copyrights.	2008-08-25 20:43:13 +00:00
jhb	346004ece8	Adjust the handling of the various timer frequencies when using the lapic timer. Previously, the various divisors were fixed which meant that while it gave somewhat reasonable stathz, etc. at hz=1000, it went off the rails with any other hz value. With these changes, we now pick a lapic timer hz based on the value of hz. If hz is >= 1500, then the lapic timer runs at hz. If 1500 > hz >= 750, we run the lapic timer at hz * 2. If hz < 750, we run at hz * 4. We compute a divider at runtime to make stathz run as close to 128 as we can since stathz really wants to be run at something close to that frequency. Profiling just runs on every clock tick. So some examples: With hz = 100, the lapic timer now runs at 400 instead of 2000. stathz will be 133, and profhz = 400. With hz = 1000 (default), the lapic timer is still at 2000 (as it is now), stathz is at 133 (as it is now), and profhz will be 2000 (previously 666). MFC after: 2 weeks	2008-08-23 12:35:43 +00:00
jhb	2a48176eba	Extend the support for PCI-e memory mapped configuration space access: - Rename pciereg_cfgopen() to pcie_cfgregopen() and expose it to the rest of the kernel. It now also accepts parameters via function arguments rather than global variables. - Add a notion of minimum and maximum bus numbers and reject requests for an out of range bus. - Add more range checks on slot/func/reg/bytes parameters to the cfg reg read/write routines. Don't panic on any invalid parameters, just fail the request (writes do nothing, reads return -1). This matches the behavior of the other cfg mechanisms. - Port the memory mapped configuration space access to amd64. On amd64 we simply use the direct map (via pmap_mapdev()) for the memory mapped window. - During acpi_attach() just after loading the ACPI tables, check for a MCFG table. If it exists, call pcie_cfgregopen() on each subtable (memory mapped window). For now we only support windows for domain 0 that start with bus 0. This removes the need for more chipset-specific quirks in the MD code. - Remove the chipset-specific quirks for the Intel 5000P/V/Z chipsets since these machines should all have MCFG tables via ACPI. - Updated pci_cfgregopen() to DTRT if ACPI had invoked pcie_cfgregopen() earlier. MFC after: 2 weeks	2008-08-22 02:14:23 +00:00
jhb	cb1f3b0dc8	MFC: Decode "exotic" instructions such as pause as well as "cmov*" on i386.	2008-08-20 18:01:59 +00:00
ed	cc3116a938	Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan	2008-08-20 08:31:58 +00:00
jhb	d90774443d	Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports already define _KERNEL to get to this and I'm about to add hooks to libkvm to access per-CPU data. MFC after: 1 week	2008-08-19 19:53:52 +00:00
jkim	9847f32c4e	Correctly check unsignedness of all BPF_LD|BPF_IND instructions. This is roughly from sys/net/bpf_filter.c r1.12 and r1.14.	2008-08-18 19:14:26 +00:00
jkim	137ba6a238	- Make these files compilable on user land. - Update copyrights and fix style(9).	2008-08-18 18:59:33 +00:00
kib	0d74400a62	The doreti_iret_fault code is always called with gs base MSR containing kernel gs base, because %rip is adjusted only on kernel-mode trap caused by iretq execution. On the other hand, the stack contains (hardware part of) trap frame from the usermode. As a consequence, checking for frame mode and doing swapgs causes the kernel to enter trap() with usermode gs base. Remove the check for mode and conditional swapgs, we already have right gs base in the MSR. Submitted by: Nate Eldredge <neldredge math ucsd edu> MFC after: 3 days	2008-08-18 08:47:27 +00:00
bz	1021d43b56	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
jkim	4385fbc2f6	Use int32_t/int16_t instead of int/short as sys/net/bpf_filter.c does.	2008-08-13 19:52:00 +00:00
jkim	39084a201d	- Remove unnecessary jump instruction(s) when offset(s) is/are zero(s). - Constantly use conditional jumps for unsigned integers.	2008-08-13 19:25:09 +00:00
jkim	ad7ec74ae4	Update copyrights and fix style(9).	2008-08-12 21:31:31 +00:00
jkim	5bf34e3e87	Replace all stack usages with registers and remove unused macros.	2008-08-12 20:10:45 +00:00
jhb	7109016dfe	Decode some more "exotic" instructions including: fxsave, fxrstor, ldmxcsr, stmxcsr, clflush, lfence, mfence, sfence, syscall, sysret, sysenter, sysexit, pause, monitor, mwait, and swapgs (amd64 only). MFC after: 1 week	2008-08-11 20:19:42 +00:00
alc	d53364aaab	Intel describes the behavior of their processors as "undefined" if two or more mappings to the same physical page have different memory types, i.e., PAT settings. Consequently, if pmap_change_attr() is applied to a virtual address range within the kernel map, then the corresponding ranges of the direct map also need to be changed. Enhance pmap_change_attr() to handle this case automatically. Add a comment describing what pmap_change_attr() does. Discussed with: jhb	2008-08-09 05:46:13 +00:00
stas	a782fc10fe	- Add cpuctl(4) pseudo-device driver to provide access to some low-level features of CPUs like reading/writing machine-specific registers, retrieving cpuid data, and updating microcode. - Add cpucontrol(8) utility, that provides userland access to the features of cpuctl(4). - Add subsequent manpages. The cpuctl(4) device operates as follows. The pseudo-device node cpuctlX is created for each cpu present in the system. The pseudo-device minor number corresponds to the cpu number in the system. The cpuctl(4) pseudo-device allows a number of ioctls to be performed, namely RDMSR/WRMSR/CPUID and UPDATE. The first pair allows the caller to read/write machine-specific registers from the corresponding CPU. cpuid data could be retrieved using the CPUID call, and microcode updates are applied via UPDATE. The permissions are enforced based on the pseudo-device file permissions. RDMSR/CPUID will be allowed when the caller has read access to the device node, while WRMSR/UPDATE will be granted only when the node is opened for writing. There're also a number of priv(9) checks. The cpucontrol(8) utility is intended to provide userland access to the cpuctl(4) device features. The utility also allows one to apply cpu microcode updates. Currently only Intel and AMD cpus are supported and were tested. Approved by: kib Reviewed by: rpaulo, cokane, Peter Jeremy MFC after: 1 month	2008-08-08 16:26:53 +00:00
alc	a08d055d40	Introduce pmap_change_attr_locked().	2008-08-07 04:56:29 +00:00
alc	016300862b	Make pmap_kenter_attr() static.	2008-08-04 08:04:09 +00:00
ed	7237d2d9a2	Disconnect drivers that haven't been ported to MPSAFE TTY yet. As clearly mentioned on the mailing lists, there is a list of drivers that have not been ported to the MPSAFE TTY layer yet. Remove them from the kernel configuration files. This means people can now still use these drivers if they explicitly put them in their kernel configuration file, which is good. People should keep in mind that after August 10, these drivers will not work anymore. Even though owners of the hardware are capable of getting these drivers working again, I will see if I can at least get them to a compilable state (if time permits).	2008-08-03 10:32:17 +00:00
alc	a59aadb8d6	Enhance pmap_mapdev_attr(). Take advantage of recent enhancements to pmap_change_attr() in order to use the direct map for any cache mode, not just write-back mode. It is worth noting that this change also eliminates a situation in which we have two mappings to the same physical memory with different cache modes. Submitted by: Magesh Dhasayyan (with some changes by me) Discussed with: jhb	2008-08-02 03:43:54 +00:00
jhb	972469c120	MFC: Add the optional nvram(4) device. As with 7.x, this device is off by default but can be enabled via 'device nvram' or loading the nvram.ko module on amd64 and i386.	2008-08-01 21:24:17 +00:00
alc	2496625e17	Enhance pmap_change_attr() with the ability to demote 1GB page mappings.	2008-08-01 04:55:38 +00:00
alc	ba620c5434	Enhance pmap_change_attr(). Specifically, avoid 2MB page demotions, cache mode changes, and cache and TLB invalidation when some or all of the specified range is already mapped with the specified cache mode. Submitted by: Magesh Dhasayyan	2008-07-31 22:45:28 +00:00
alc	9a91b0b82b	Eliminate recomputation of the PDE by pmap_pde_attr().	2008-07-31 04:42:42 +00:00
jfv	528c59435f	Add igb to the default kernel MFC after: ASAP	2008-07-30 22:27:38 +00:00
kib	19bf5e2807	Bring back the save/restore of the %ds, %es, %fs and %gs registers for the 32bit images on amd64. Change the semantic of the PCB_32BIT pcb flag to request the context switch code to operate on the segment registers. Its previous meaning of saving or restoring the %gs base offset is assigned to the new PCB_GS32BIT flag. FreeBSD 32bit image activator sets the PCB_32BIT flag, while Linux 32bit emulation sets PCB_32BIT | PCB_GS32BIT. Reviewed by: peter MFC after: 2 weeks	2008-07-30 11:30:55 +00:00
alc	93ecf09bb7	Don't allow pmap_change_attr() to be applied to the recursive mapping.	2008-07-28 04:59:48 +00:00
alc	8c160a5c01	Add a check for 1GB page mappings to pmap_change_attr() so that it fails gracefully. (On K10 family processors the direct map is implemented using 1GB page mappings.)	2008-07-28 03:58:49 +00:00
yongari	7e604e1299	MFC r179347. Add jme(4) to the list of drivers supported by GENERIC kernel.	2008-07-28 02:20:29 +00:00
alc	369f479043	Style fixes to several function definitions.	2008-07-27 18:18:50 +00:00
alc	4455d56f31	Enhance pmap_change_attr(). Use pmap_demote_pde() to demote a 2MB page mapping to 4KB page mappings when the specified attribute change only applies to a portion of the 2MB page. Previously, in such cases, pmap_change_attr() gave up and returned an error. Submitted by: Magesh Dhasayyan	2008-07-27 17:32:36 +00:00
alc	86bb1c4f2b	Increase the ceiling on the size of the buffer map.	2008-07-19 23:42:38 +00:00
alc	493038bc03	Correct an error in pmap_change_attr()'s initial loop that verifies that the given range of addresses are mapped. Previously, the loop was testing the same address every time. Submitted by: Magesh Dhasayyan	2008-07-18 22:05:51 +00:00
alc	07ead18715	Simplify pmap_extract()'s control flow, making it more like the related functions pmap_extract_and_hold() and pmap_kextract().	2008-07-18 20:07:50 +00:00
alc	d4de04e9b1	Update bus_dmamem_alloc()'s first call to malloc() such that M_WAITOK is specified when appropriate. Reviewed by: scottl	2008-07-15 03:34:49 +00:00
alc	181ba3c627	Handle a race between pmap_kextract() and pmap_promote_pde(). This race caused ZFS to crash when restoring a snapshot with superpage promotion enabled. Reported by: kris	2008-07-13 18:19:53 +00:00
ed	a8f4e95b68	Make uart(4) the default serial port driver on i386 and amd64. The uart(4) driver has the advantage of supporting a wider variety of hardware on a greater amount of platforms. This driver has already been the standard on platforms such as ia64, powerpc and sparc64. I've decided not to change anything on pc98. I'd rather let people from the pc98 team look at this. Approved by: philip (mentor), marcel	2008-07-13 07:20:14 +00:00
alc	a761fe4f10	Refine the changes made in SVN rev 180430. Specifically, instantiate a new page table page only if the 2MB page mapping has been used. Also, refactor some assertions.	2008-07-12 21:24:42 +00:00
alc	6dd633c3c8	In order to apply pmap_demote_pde() to a page directory entry (PDE) from the direct map, the PDE must have PG_M and PG_A preset. Noticed by: Magesh Dhasayyan	2008-07-12 18:43:57 +00:00
alc	bb93e65e81	Extend pmap_demote_pde() to include the ability to instantiate a new page table page where none existed before.	2008-07-10 16:22:24 +00:00
peter	383e07b996	Band-aid a problem with 32 bit selector setup. Initialize %ds, %es, and %fs during CPU startup. Otherwise a garbage value could leak to a 32-bit process if a process migrated to a different CPU after exec and the new CPU had never exec'd a 32-bit process. A more complete fix is needed, but this mitigates the most frequent manifestations. Obtained from: ups	2008-07-09 19:44:37 +00:00
alc	1f7be5a00f	Fix lines that are too long in pmap_growkernel() by substituting shorter but equivalent expressions.	2008-07-09 06:04:10 +00:00
alc	12ebfa7cd9	Eliminate pmap_growkernel()'s dependence on create_pagetables() preallocating page directory pages from VM_MIN_KERNEL_ADDRESS through the end of the kernel's bss. Specifically, the dependence was in pmap_growkernel()'s one-time initialization of kernel_vm_end, not in its main body. (I could not, however, resist the urge to optimize the main body.) Reduce the number of preallocated page directory pages to just those needed to support NKPT page table pages. (In fact, this allows me to revert a couple of my earlier changes to create_pagetables().)	2008-07-08 22:59:17 +00:00
alc	88ebbadb2e	Rev 180333, ``Change create_pagetables() and pmap_init() so that many fewer page table pages have to be preallocated ...'', violates an assumption made by minidumpsys(): kernel_vm_end is the highest virtual address that has ever been used by the kernel. Now, however, the kernel code, data, and bss may reside at addresses beyond kernel_vm_end. This revision modifies the upper bound on minidumpsys()'s two page table traversals to account for this possibility.	2008-07-08 04:00:22 +00:00
delphij	cb283fcdf7	Add HWPMC_HOOKS to GENERIC kernels, this makes hwpmc.ko work out of the box.	2008-07-07 22:55:11 +00:00
alc	64ef3ee8e5	In FreeBSD 7.0 and beyond, pmap_growkernel() should pass VM_ALLOC_INTERRUPT to vm_page_alloc() instead of VM_ALLOC_SYSTEM. VM_ALLOC_SYSTEM was the logical choice before FreeBSD 7.0 because VM_ALLOC_INTERRUPT could not reclaim a cached page. Simply put, there was no ordering between VM_ALLOC_INTERRUPT and VM_ALLOC_SYSTEM as to which "dug deeper" into the cache and free queues. Now, there is; VM_ALLOC_INTERRUPT dominates VM_ALLOC_SYSTEM. While I'm here, teach pmap_growkernel() to request a prezeroed page. MFC after: 1 week	2008-07-07 17:25:09 +00:00
alc	2e2599b5d0	Change create_pagetables() and pmap_init() so that many fewer page table pages have to be preallocated by create_pagetables().	2008-07-06 22:36:28 +00:00
alc	12adeb8575	Increase the kernel map's size to 7GB, making room for a kmem map of size greater than 4GB. (Auto-sizing will set the ceiling on the kmem map size to 4.2GB.)	2008-07-05 20:44:55 +00:00
alc	3f80ced8f0	Eliminate an unused declaration. (In fact, the declaration is bogus because the variable is defined static to pmap.c on i386.) Found by: CScout	2008-07-04 17:36:12 +00:00
alc	4ef7d0cd62	Increase the ceiling on the kmem map's size to 3.6GB. Also, define the ceiling as a fraction of the kernel map's size rather than an absolute quantity. Thus, scaling of the kmem map's size will be automatic with changes to the kernel map's size.	2008-07-03 04:53:14 +00:00
alc	04dc8c2e9c	Eliminate an unnecessary static variable: nkpt.	2008-07-02 05:41:23 +00:00
obrien	e34ad39247	MFC: 180149 / 1.49.2.1 (which was MFC of r180109 / rev 1.53) Document the layout of the address space.	2008-07-01 21:10:38 +00:00
alc	c5776c1b86	Document the layout of the address space, borrowing heavily from http://lists.freebsd.org/pipermail/freebsd-amd64/2005-July/005578.html	2008-06-30 03:14:39 +00:00
alc	2ea095c0e8	Compute NKPDPE from NKPT. This reduces the number of knobs that must be turned in order to change the size of the kernel virtual address space.	2008-06-30 02:35:55 +00:00
alc	828df4dc3d	Strictly speaking, the definition of VM_MAX_KERNEL_ADDRESS is wrong. However, in practice, the error (currently) makes no difference because the computation performed by KVADDR() hides the error. This revision fixes the error. Also, eliminate a (now) unused definition.	2008-06-29 19:13:27 +00:00
alc	4a9078c9b0	Increase the size of the kernel virtual address space to 6GB. Until the maximum size of the kmem map can be greater than 4GB, there is little point in making the kernel virtual address space larger than 6GB. Tested by: kris@	2008-06-29 18:35:00 +00:00
ed	4d6a9685e8	Remove the unused major/minor numbers from iodev and memdev. Now that st_rdev is being automatically generated by the kernel, there is no need to define static major/minor numbers for the iodev and memdev. We still need the minor numbers for the memdev, however, to distinguish between /dev/mem and /dev/kmem. Approved by: philip (mentor)	2008-06-25 07:45:31 +00:00
jkim	4cc7195a06	Emit opcodes closer to GNU as(1) generated codes and micro-optimize.	2008-06-24 20:12:12 +00:00
jkim	6aed2b5388	Rehash and clean up BPF JIT compiler macros to match AT&T notations.	2008-06-23 23:09:52 +00:00
alc	13b5266a78	Ensure that KERNBASE is no less than the virtual address -2GB.	2008-06-23 15:22:53 +00:00
alc	2d03a1918b	Prepare for a larger kernel virtual address space. Specifically, once KERNBASE and VM_MIN_KERNEL_ADDRESS are no longer the same, the physical memory allocated during bootstrap will be offset from the low-end of the kernel's page table.	2008-06-21 19:19:09 +00:00
alc	8c14570d5e	Make preparations for increasing the size of the kernel virtual address space on the amd64 architecture. The amd64 architecture requires kernel code and global variables to reside in the highest 2GB of the 64-bit virtual address space. Thus, KERNBASE cannot change. However, KERNBASE is sometimes used as the start of the kernel virtual address space. Henceforth, VM_MIN_KERNEL_ADDRESS should be used instead. Since KERNBASE and VM_MIN_KERNEL_ADDRESS are still the same address, there should be no visible effect from this change (yet). That said, kris@ has tested crash dumps under the full patch that increases the kernel virtual address space on amd64 to 6GB. Tested by: kris@	2008-06-20 20:59:31 +00:00
delphij	ee624c02de	Add et(4), a port of DragonFly's Agere ET1310 10/100/Gigabit Ethernet device driver, written by sephe@ Obtained from: DragonFly Sponsored by: iXsystems MFC after: 2 weeks	2008-06-20 19:28:33 +00:00
alc	2fc7871f11	Make preparations for increasing the size of the kernel virtual address space on the amd64 architecture. The amd64 architecture requires kernel code and global variables to reside in the highest 2GB of the 64-bit virtual address space. Thus, KERNBASE cannot change. However, KERNBASE is sometimes used as the start of the kernel virtual address space. Henceforth, VM_MIN_KERNEL_ADDRESS should be used instead. Since KERNBASE and VM_MIN_KERNEL_ADDRESS are still the same address, there should be no visible effect from this change (yet).	2008-06-20 05:22:09 +00:00
alc	25d85147e9	Tweak the promotion test in pmap_promote_pde(). Specifically, test PG_A before PG_M. This sometimes prevents unnecessary removal of write access from a PTE. Overall, the net result is fewer demotions and promotion failures.	2008-06-13 19:33:56 +00:00
alc	8ef896c2f8	Reverse the direction of pmap_promote_pde()'s traversal over the specified page table page. The direction of the traversal can matter if pmap_promote_pde() has to remove write access (PG_RW) from a PTE that hasn't been modified (PG_M). In general, if there are two or more such PTEs to choose among, it is better to write protect the one nearer the high end of the page table page rather than the low end. This is because most programs access memory in an ascending direction. The net result of this change is a sometimes significant reduction in the number of failed promotion attempts and the number of pages that are write protected by pmap_promote_pde().	2008-06-12 05:18:09 +00:00
alc	061cc6f3ee	Correct an error in pmap_promote_pde() that may result in an errant promotion within the kernel's address space. Specifically, pmap_promote_pde() is only called when the page table page (PTP) that is referenced by the given PDE has a full "use count", i.e., its wire_count is 512. Although this guarantees for a user address space that all 512 PTEs in the PTP hold valid mappings, the same is not true of the kernel's address space. A kernel PTP always has a use count of 512 regardless of the state of the PTEs. Therefore, pmap_promote_pde() should not assume (or assert) that the first PTE in the PTP is valid.	2008-06-01 07:36:59 +00:00
yongari	2643dd65cd	Add jme(4) to the list of drivers supported by GENERIC kernel.	2008-05-27 02:22:32 +00:00
bz	6bba9b4244	Remove ISDN4BSD (I4B) from HEAD as it is not MPSAFE and parts relied on the now removed NET_NEEDS_GIANT. Most of I4B has been disconnected from the build since July 2007 in HEAD/RELENG_7. This is what was removed: - configuration in /etc/isdn - examples - man pages - kernel configuration - sys/i4b (drivers, layers, include files) - user space tools - i4b support from ppp - further documentation Discussed with: rwatson, re	2008-05-26 10:40:09 +00:00
jb	3c417aadf3	Add the DTrace hooks for exception handling (Function boundary trace -fbt- provider), cyclic clock and syscalls.	2008-05-24 06:32:26 +00:00
alc	964def13e2	The VM system no longer uses setPQL2(). Remove it and its helpers.	2008-05-23 04:03:54 +00:00
yongari	11daf29917	Add age(4) to the list of drivers supported by GENERIC kernel.	2008-05-19 02:30:27 +00:00
alc	a8f81206ad	Retire pmap_addr_hint(). It is no longer used.	2008-05-18 04:16:57 +00:00
remko	91e9f2c6be	Resort the if_ti driver to match the PCI Network cards instead of placing it under the mii devices list. PR: kern/123147 Submitted by: gavin Approved by: imp (mentor, implicit) MFC after: 3 days	2008-05-17 23:50:00 +00:00
attilio	bc6974a05b	Removed unused assembly offsets for structures digging.	2008-05-16 13:23:47 +00:00
rdivacky	faae559cb1	Regen. Approved by: kib (mentor)	2008-05-13 20:02:26 +00:00
rdivacky	13cbd9c97e	Implement robust futexes. Most of the code is modelled after what Linux does. This is because robust futexes are mostly a userspace thing which we cannot alter. Two syscalls maintain a pointer to a userspace list, and when a process exits a routine walks this list, waking up processes sleeping on futexes from that list. Reviewed by: kib (mentor) MFC after: 1 month	2008-05-13 20:01:27 +00:00
alc	9fed63a445	Correct an error in pmap_align_superpage(). Specifically, correctly handle the case where the mapping is greater than a superpage in size but the alignment of the physical pages spans a superpage boundary.	2008-05-11 20:33:47 +00:00
alc	9e8bccea75	Introduce pmap_align_superpage(). It increases the starting virtual address of the given mapping if a different alignment might result in more superpage mappings.	2008-05-09 16:48:07 +00:00
sam	39c0719a2e	enable IEEE80211_DEBUG and IEEE80211_AMPDU_AGE by default	2008-05-03 17:05:38 +00:00
sam	8bf6f34fe9	Intel 4965 wireless driver (derived from openbsd driver of the same name)	2008-04-29 21:36:17 +00:00
alc	fc2ca88c2c	Always use PG_PS_FRAME to extract the physical address of a 2/4MB page from a PDE.	2008-04-25 16:00:39 +00:00
jeff	14b586bf96	- Add an integer argument to idle to indicate how likely we are to wake from idle over the next tick. - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are suspended in cpu specific states. This function can fail and cause the scheduler to fall back to another mechanism (ipi). - Implement support for mwait in cpu_idle() on i386/amd64 machines that support it. mwait is a higher performance way to synchronize cpus as compared to hlt & ipis. - Allow selecting the idle routine by name via sysctl machdep.idle. This replaces machdep.cpu_idle_hlt. Only idle routines supported by the current machine are permitted. Sponsored by: Nokia	2008-04-25 05:18:50 +00:00
dfr	0536363c85	MFC: kernel-mode NFS lock manager.	2008-04-24 10:46:25 +00:00
rdivacky	dd1e82ea4d	Implement linux_truncate64() syscall. Tested by: Aline de Freitas <aline@riseup.net> Approved by: kib (mentor)	2008-04-23 15:56:33 +00:00
phk	8d647da1ed	Now that all platforms use genclock, shuffle things around slightly for better structure. Much of this is related to <sys/clock.h>, which should really have been called <sys/calendar.h>, but unless and until we need the name, the repocopy can wait. In general the kernel does not know about minutes, hours, days, timezones, daylight savings time, leap-years and such. All that is theoretically a matter for userland only. Parts of the kernel code do however care: badly designed filesystems store timestamps in local time and RTC chips almost universally track time in a YY-MM-DD HH:MM:SS format, and sometimes in local timezone instead of UTC. For this we have <sys/clock.h>. <sys/time.h>, on the other hand, deals with time_t, timeval, timespec and so on. These know only seconds and fractions thereof. Move inittodr() and resettodr() prototypes to <sys/time.h>. Retain the names as it is one of the few surviving PDP/VAX references. Move startrtclock() to <machine/clock.h> on relevant platforms, it is a MD call between machdep.c/clock.c. Remove references to it elsewhere. Remove a lot of unnecessary <sys/clock.h> includes. Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs. XXX: should be kern.disable_rtc_set really, it's not MD.	2008-04-22 19:38:30 +00:00
sam	3569e353ca	Multi-bss (aka vap) support for 802.11 devices. Note this includes changes to all drivers and moves some device firmware loading to use firmware(9) and a separate module (e.g. ral). Also there no longer are separate wlan_scan* modules; this functionality is now bundled into the wlan module. Supported by: Hobnob and Marvell Reviewed by: many Obtained from: Atheros (some bits)	2008-04-20 20:35:46 +00:00
sam	682b4ae9be	move awi to the Attic; it will not make the jump to the new world order Reviewed by: imp	2008-04-20 19:20:39 +00:00
peter	a2a93ed9df	Put in a real isa_irq_pending() stub in order to remove two lines of dmesg noise from sio per unit. sio likes to probe if interrupts are configured correctly by looking at the pending bits of the atpic in order to put a non-fatal warning on the console. I think I'd rather read the pending bits from the apics, but I'm not sure it's worth the hassle.	2008-04-19 07:25:57 +00:00
jeff	b82673ba40	- Add inlines for the monitor and mwait instructions. Sponsored by: Nokia	2008-04-18 05:47:56 +00:00
jkim	e0c673b7e2	Regenerate.	2008-04-16 19:27:36 +00:00
jkim	513781a1c1	Add stubs for syscalls introduced in Linux 2.6.17 kernel. Some GNU libc version started using them before 2.6.17 was officially out. MFC after: 3 days	2008-04-16 19:25:39 +00:00
imp	1bfc42b341	This file is unused on amd64.	2008-04-15 02:10:14 +00:00
phk	bd75233fb4	Convert amd64 and i386 to share the atrtc device driver.	2008-04-14 08:00:00 +00:00
rpaulo	07a3d6df55	Connect k8temp(4) to the build.	2008-04-12 14:20:22 +00:00
jeff	8efb03d60e	- Add the interrupt vector number to intr_event_create so MI code can lookup hard interrupt events by number. Ignore the irq# for soft intrs. - Add support to cpuset for binding hardware interrupts. This has the side effect of binding any ithread associated with the hard interrupt. As per restrictions imposed by MD code we can only bind interrupts to a single cpu presently. Interrupts can be 'unbound' by binding them to all cpus. Reviewed by: jhb Sponsored by: Nokia	2008-04-11 03:26:41 +00:00
alc	062b34d1be	Correct pmap_copy()'s method for extracting the physical address of a 2/4MB page from a PDE. Specifically, change it to use PG_PS_FRAME, not PG_FRAME, to extract the physical address of a 2/4MB page from a PDE. Change the last argument passed to pmap_pv_insert_pde() from a vm_page_t representing the first 4KB page of a 2/4MB page to the vm_paddr_t of the 2/4MB page. This avoids an otherwise unnecessary conversion from a vm_paddr_t to a vm_page_t in pmap_copy().	2008-04-10 16:04:50 +00:00
kib	133f8f7798	Regenerate	2008-04-08 09:51:19 +00:00
kib	eb77b477b4	Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat. Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho	2008-04-08 09:45:49 +00:00
alc	f2ea5d4883	Update pmap_page_wired_mappings() so that it counts 2/4MB page mappings.	2008-04-07 07:38:02 +00:00
jhb	79918c45a6	Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt code. - Rename the intr_event 'eoi', 'disable', and 'enable' hooks to 'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric. Also, add a comment describing what the MI code expects them to do. - On amd64, i386, and powerpc this is effectively a NOP. - On arm, don't bother masking the interrupt unless the ithread is scheduled in the non-INTR_FILTER case to match what INTR_FILTER did. Also, don't bother unmasking the interrupt in the post_filter case if we never masked it. The INTR_FILTER case had been doing this by having arm_unmask_irq for the post_filter (formerly 'eoi') hook. - On ia64, stray interrupts are now masked for the non-INTR_FILTER case. They were already masked in the INTR_FILTER case. - On sparc64, use a NULL pre_ithread hook and use intr_enable_eoi() for both the 'post_filter' and 'post_ithread' hooks to match what the non-INTR_FILTER code did. - On sun4v, retire the ithread wrapper hack by using an appropriate 'post_ithread' hook instead (it's what 'post_ithread'/'enable' was designed to do even in 5.x). Glanced at by: piso Reviewed by: marius Requested by: marius [1], [5] Tested on: amd64, i386, arm, sparc64	2008-04-05 19:58:30 +00:00
alc	b1ed68eb9c	Eliminate an unnecessary test and its misleading comment from pmap_enter().	2008-04-04 18:00:22 +00:00
alc	18c266c7a7	Optimize pmap_pml4e() and pmap_pdpe() based upon two observations: The given pmap is never NULL, and therefore pmap_pml4e() can never return NULL. The pervasive use of these inline functions throughout the pmap makes these simple changes worthwhile.	2008-04-02 04:39:47 +00:00
ps	41d5b26ff8	Add support to mincore for detecting whether a page is part of a "super" page or not. Reviewed by: alc, ups	2008-03-28 04:29:27 +00:00
kib	6d49f1490b	MFC rev. 1.682 of sys/amd64/amd64/machdep.c rev. 1.16 of sys/amd64/ia32/ia32_signal.c rev. 1.33 of sys/amd64/linux32/linux32_sysvec.c rev. 1.666 of sys/i386/i386/machdep.c rev. 1.152 of sys/i386/linux/linux_sysvec.c rev. 1.39 of sys/i386/svr4/svr4_machdep.c rev. 1.402 of sys/pc98/pc98/machdep.c Modify the signal handler frame setup code to clear the DF {e,r}flags bit on the amd64/i386 for the signal handlers.	2008-03-27 13:53:52 +00:00
dfr	dc98ee4196	Add kernel module support for nfslockd and krpc. Use the module system to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k' option to rpc.lockd and make kernel NLM the default. A user can still force the use of the old user NLM by building a kernel without NFSLOCKD and/or removing the nfslockd.ko module.	2008-03-27 11:54:20 +00:00
jb	34e730ca27	When building a kernel module, define MAXCPU the same as SMP so that modules work with and without SMP.	2008-03-27 05:03:26 +00:00
phk	c763b22a79	Back in the good old days, PC's had random pieces of rock for frequency generation and what frequency they generated was anyone's guess. In general the 32.768kHz RTC clock x-tal was the best, because that was a regular wrist-watch Xtal, whereas the X-tal generating the ISA bus frequency was much lower quality, often costing as much as several cents a piece, so it made good sense to check the ISA bus frequency against the RTC clock. The other relevant property of those machines, is that they typically had no more than 16MB RAM. These days, CPU chips croak if their clocks are not tightly within specs and all necessary frequencies are derived from the master crystal by means of PLL's. Considering that it takes on average 1.5 second to calibrate the frequency of the i8254 counter, that more likely than not, we will not actually use the result of the calibration, and as the final clincher, we seldom use the i8254 for anything besides BEL in syscons anyway, it has become time to drop the calibration code. If you need to tell the system what frequency your i8254 runs, you can do so from the loader using hw.i8254.freq or using the sysctl kern.timecounter.tc.i8254.frequency.	2008-03-26 22:12:00 +00:00
phk	168398fe50	Eliminate unnecessary #includes	2008-03-26 20:26:12 +00:00
phk	fa71439e44	The "free-lance" timer in the i8254 is only used for the speaker these days, so de-generalize the acquire_timer/release_timer api to just deal with speakers. The new (optional) MD functions are: timer_spkr_acquire(), timer_spkr_release() and timer_spkr_setfreq(), the last of which configures the timer to generate a tone of a given frequency, in Hz instead of 1/1193182th of seconds. Drop entirely timer2 on pc98, it is not used anywhere at all. Move sysbeep() to kern/tty_cons.c and use the timer_spkr() if they exist, and do nothing otherwise. Remove prototypes and empty acquire-/release-timer() and sysbeep() functions from the non-beeping archs. This eliminates the need for the speaker driver to know about i8254frequency at all. In theory this makes the speaker driver MI, contingent on the timer_spkr_() functions existing but the driver does not know this yet and still attaches to the ISA bus. Syscons is more tricky, in one function, sc_tone(), it knows the hz and things are just fine. In the other function, sc_bell() it seems to get the period from the KDMKTONE ioctl in terms of 1/1193182th second, so we hardcode the 1193182 and leave it at that. It's probably not important. Change a few other sysbeep() uses which obviously knew that the argument was in terms of i8254 frequency, and leave alone those that look like people thought sysbeep() took frequency in hertz. This eliminates the knowledge of i8254_freq from all but the actual clock.c code and the prof_machdep.c on amd64 and i386, where I think it would be smart to ask for help from the timecounters anyway [TBD].	2008-03-26 20:09:21 +00:00
phk	632e5d39f7	Rename timer0_max_count to i8254_max_count. Rename timer0_real_max_count to i8254_real_max_count and make it static. Rename timer_freq to i8254_freq and make it a loader tunable.	2008-03-26 15:03:24 +00:00
phk	44bfb30efd	The RTC related pscnt and psdiv variables have no business being public.	2008-03-26 13:25:27 +00:00
jkim	3e99f5d364	Belatedly add BPF_JITTER in NOTES for supported architectures.	2008-03-24 22:23:22 +00:00
peter	112e790f78	First pass at (possibly futile) microoptimizing of cpu_switch. Results are mixed. Some pure context switch microbenchmarks show up to 29% improvement. Pipe based context switch microbenchmarks show up to 7% improvement. Real world tests are far less impressive as they are dominated more by actual work than switch overheads, but depending on the machine in question, workload, kernel options, phase of moon, etc, a few percent gain might be seen. Summary of changes: - don't reload MSR_[FG]SBASE registers when context switching between non-threaded userland apps. These typically cost 120 clock cycles each on an AMD cpu (less on Barcelona/Phenom). Intel cores are probably no faster on this. - The above change only helps unthreaded userland apps that tend to use the same value for gsbase. Threaded apps will get no benefit from this. - reorder things like accessing the pcb to be in memory order, to give prefetching a better chance of working. Operations are now in increasing memory address order, rather than reverse or random. - Push some lesser used code out of the main code paths. Hopefully allowing better code density in cache lines. This is probably futile. - (part 2 of previous item) Reorder code so that branches have a more realistic static branch prediction hint. Both Intel and AMD cpus default to predicting branches to lower memory addresses as being taken, and to higher memory addresses as not being taken. This is overridden by the limited dynamic branch prediction subsystem. A trip through userland might overflow this. - Futile attempt at spreading the use of the results of previous operations in new operations. Hopefully this will allow the cpus to execute in parallel better. - stop wasting 16 bytes at the top of kernel stack, below the PCB. - Never load the userland fs/gsbase registers for kthreads, but preserve curpcb->pcb_[fg]sbase as caches for the cpu. (Thanks Jeff!) Microbenchmarking this code seems to be really sensitive to things like scheduling luck, timing, cache behavior, tlb behavior, kernel options, other random code changes, etc. While it doesn't help heavy userland workloads much, it does help high context switch loads a little, and should help those that involve switching via kthreads a bit more. A special thanks to Kris for the testing and reality checks, and Jeff for tormenting me into doing this. :) This is still work-in-progress.	2008-03-23 23:09:06 +00:00
alc	f9d9755304	Correct an error in pmap_mincore() when applied to a 2MB page mapping: Use PG_PS_FRAME, not PG_FRAME, to obtain the physical address of the 2MB physical page from the PDE.	2008-03-23 23:04:09 +00:00
peter	b238ee1007	Export TDP_KTHREAD to asm files.	2008-03-23 22:46:37 +00:00
peter	1f7e9770bb	Move pcb_flags to make trivially better use of cache lines.	2008-03-23 22:45:51 +00:00
peter	075e9da352	Protect the setting of the fsbase/gsbase MSR registers and the pcb_[fg]sbase values with a critical section, like the rest of the kernel.	2008-03-23 22:44:56 +00:00
alc	e702727e2c	To date, we have assumed that the TLB will only set the PG_M bit in a PTE if that PTE has the PG_RW bit set. However, this assumption does not hold on recent processors from Intel. For example, consider a PTE that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE is cached in the TLB and later the PG_RW bit is cleared in the PTE, but the corresponding TLB entry is not (yet) invalidated. Historically, upon a write access using this (stale) TLB entry, the TLB would observe that the PG_RW bit had been cleared and initiate a page fault, aborting the setting of the PG_M bit in the PTE. Now, however, P4- and Core2-family processors will set the PG_M bit before observing that the PG_RW bit is clear and initiating a page fault. In other words, the write does not occur but the PG_M bit is still set. The real impact of this difference is not that great. Specifically, we should no longer assert that any PTE with the PG_M bit set must also have the PG_RW bit set, and we should ignore the state of the PG_M bit unless the PG_RW bit is set. However, these changes enable me to remove a work-around from pmap_promote_pde(), the superpage promotion procedure. (Note: The AMD processors that we have tested, including the latest, the Phenom, still exhibit the historical behavior.) Acknowledgments: After I observed the problem, Stephan (ups) was instrumental in characterizing the exact behavior of Intel's recent TLBs. Tested by: Peter Holm	2008-03-23 20:38:01 +00:00
kib	53a15ee1ea	Prevent the overflow in the calculation of the next page directory. The overflow causes the wraparound with consequent corruption of the (almost) whole address space mapping. As Alan noted, pmap_copy() does not require the wrap-around checks because it cannot be applied to the kernel's pmap. The checks there are included for consistency. Reported and tested by: kris (i386/pmap.c:pmap_remove() part) Reviewed by: alc MFC after: 1 week	2008-03-23 07:07:27 +00:00

5464 Commits