freebsd-skq

Author	SHA1	Message	Date
rdivacky	f446f9def7	Make isa_dma functions MPSAFE by introducing its own private lock. These functions are selfcontained (ie. they touch only isa_dma.c static variables and hardware) so a private lock is sufficient to prevent races. This changes only i386/amd64 while there are also isa_dma functions for ia64/sparc64. Sparc64 are ones empty stubs and ia64 ones are unused as ia64 does not have isa (says marcel). This patch removes explicit locking of Giant from a few drivers (there are some that requires this but lack ones - this patch fixes this) and also removes the need for implicit locking of Giant from attach routines where it's provided by newbus. Approved by: ed (mentor, implicit) Reviewed by: jhb, attilio (glanced by) Tested by: Giovanni Trematerra <giovanni.trematerra gmail com> IA64 clue: marcel	2009-11-09 20:29:10 +00:00
kuriyama	6c9aae7fc5	- Add hw.clflush_disable loader tunable to avoid panic (trap 9) at map_invalidate_cache_range() even if CPU is not Intel. - This tunable can be set to -1 (default), 0 and 1. -1 is same as current behavior, which automatically disable CLFLUSH on Intel CPUs without CPUID_SS (should be occured on Xen only). You can specify 1 when this panic happened on non-Intel CPUs (such as AMD's). Because disabling CLFLUSH may reduce performance, you can try with setting 0 on Intel CPUs without SS to use CLFLUSH feature. Reviewed by: kib Reported by: karl, kuriyama Related to: kern/138863	2009-11-09 02:54:16 +00:00
attilio	e0f4684a1b	Strip from messages for users external URLs the project cannot directly control. Requested by: kib, rwatson	2009-11-05 14:34:38 +00:00
attilio	68e40aae5f	Opteron rev E family of processor expose a bug where, in very rare ocassions, memory barriers semantic is not honoured by the hardware itself. As a result, some random breakage can happen in uninvestigable ways (for further explanation see at the content of the commit itself). As long as just a specific familly is bugged of an entire architecture is broken, a complete fix-up is impratical without harming to some extents the other correct cases. Considering that (and considering the frequency of the bug exposure) just print out a warning message if the affected machine is identified. Pointed out by: Samy Al Bahra <sbahra at repnop dot org> Help on wordings by: jeff MFC: 3 days	2009-11-04 01:32:59 +00:00
ed	9ade11517c	Unobfuscate unit number handling in apm(4). There is no need to use the lower 4 bits of the unit number to store the device type number. Just use 0 and 1 to distinguish them. devfs also guarantees that there can never be an open call on a device that has a unit number different to 0 and 1, so there is no need to check for this in open().	2009-10-31 10:38:30 +00:00
jhb	b13876f956	Fix some problems with effective mmap() offsets > 32 bits. This was partially fixed on amd64 earlier. Rather than forcing linux_mmap_common() to use a 32-bit offset, have it accept a 64-bit file offset. This offset is then passed to the real mmap() call. Rather than inventing a structure to hold the normal linux_mmap args that has a 64-bit offset, just pass each of the arguments individually to linux_mmap_common() since that more closes matches the existing style of various kern_foo() functions. Submitted by: Christian Zander @ Nvidia MFC after: 1 week	2009-10-28 20:17:54 +00:00
kib	ce081b037e	In r197963, a race with thread being selected for signal delivery while in kernel mode, and later changing signal mask to block the signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls. Use kern_sigprocmask() instead of direct manipulation of td_sigmask to reschedule newly blocked signals, closing the race. Reviewed by: davidxu Tested by: pho MFC after: 1 month	2009-10-27 10:47:58 +00:00
marcel	51bb720939	o Introduce vm_sync_icache() for making the I-cache coherent with the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache(), that translates the vm_map_t argumument to pmap_t. o Introduce pmap_sync_icache() to all PMAP implementation. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys(). o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint. o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked). The key property of this change is that the I-cache is made coherent after writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent before any writes happen. The difference is key when the I-cache prefetches.	2009-10-21 18:38:02 +00:00
avg	be62c97d20	add amdtemp to i386 NOTES essentially this is a MFamd64 Nod from: rpaulo	2009-10-20 09:31:57 +00:00
kib	2892f80896	Move intr_describe() out of #ifdef SMP; the function is always required. Reviewed by: jhb	2009-10-16 12:00:59 +00:00
jhb	45688ed39d	Add a facility for associating optional descriptions with active interrupt handlers. This is primarily intended as a way to allow devices that use multiple interrupts (e.g. MSI) to meaningfully distinguish the various interrupt handlers. - Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate a description with an active interrupt handler setup by BUS_SETUP_INTR. It has a default method (bus_generic_describe_intr()) which simply passes the request up to the parent device. - Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports printf(9) style formatting using var args. - Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of an interrupt handler and copy the name passed to intr_event_add_handler() into that buffer instead of just saving the pointer to the name. - Add a new intr_event_describe_handler() which appends a description string to an interrupt handler's name. - Implement support for interrupt descriptions on amd64 and i386 by having the nexus(4) driver supply a custom bus_describe_intr method that invokes a new intr_describe() MD routine which in turn looks up the associated interrupt event and invokes intr_event_describe_handler(). Requested by: many Reviewed by: scottl MFC after: 2 weeks	2009-10-15 14:54:35 +00:00
jhb	04d5ac98a1	Move the USB wireless drivers down into their own section next to the USB ethernet drivers. Submitted by: Glen Barber glen.j.barber @ gmail MFC after: 1 month	2009-10-13 19:02:03 +00:00
kib	3547dab066	Define architectural load bases for PIE binaries. Addresses were selected by looking at the bases used for non-relocatable executables by gnu ld(1), and adjusting it slightly. Discussed with: bz Reviewed by: kan Tested by: bz (i386, amd64), bsam (linux) MFC after: some time	2009-10-10 15:31:24 +00:00
attilio	615a58802f	atomic_cmpset_barr_* was added in order to cope with compilers willing to specify their own version of atomic_cmpset_* which could have been different than the membar version. Right now, however, FreeBSD is bound mostly to GCC-like compilers and it is desired to add new support and compat shim mostly when there is a real necessity, in order to avoid too much compatibility bloats. In this optic, bring back atomic_cmpset_{acq, rel}_* to be the same as atomic_cmpset_* and unwind the atomic_cmpset_barr_* introduction. Requested by: jhb Reviewed by: jhb Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-10-09 15:51:40 +00:00
attilio	d6f29069b6	- All the functions in atomic.h needs to be in "physical" form (like not defined through macros or similar) in order to be later compiled in the kernel and offer this way the support for modules (and compatibility among the UP case and SMP case). Fix this for the newly introduced atomic_cmpset_barr_* cases by defining and specifying a template. Note that the new DEFINE_CMPSET_GEN() template save more typing on amd64 than the current code. [1] - Fix the style for memory barriers on amd64. [1] Reported by: Paul B. Mahol <onemda at gmail dot com>	2009-10-06 23:48:28 +00:00
attilio	b1ce942125	Per their definition, atomic instructions used in conjuction with memory barriers should also ensure that the compiler doesn't reorder paths where they are used. GCC, however, does that aggressively, even in presence of volatile operands. The most reliable way GCC offers for avoid instructions reordering is clobbering "memory" even if that is theoretically an heavy-weight operation, flushing the content of all the registers and forcing reload of them (We could rely, however, on gcc DTRT by just understanding the purpose as this is a well-known pattern for many modern operating-systems). Not all our memory barriers, right now, clobber memory for GCC-like compilers. The most notable cases are IA32 and amd64 where the memory barrier are treacted the same as normal atomic instructions. Fix this by offering the possibility to implement atomic instructions with memory barriers separately from the normal version and implement the GCC-like specific one using memory clobbering. Thanks to Chris Lattner (@apple) for his discussion on llvm specifics. Reported by: jhb Reviewed by: jhb Tested by: rdivacky, Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-10-06 13:45:49 +00:00
bz	8e183cd852	Make sure that the primary native brandinfo always gets added first and the native ia32 compat as middle (before other things). o(ld)brandinfo as well as third party like linux, kfreebsd, etc. stays on SI_ORDER_ANY coming last. The reason for this is only to make sure that even in case we would overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo would still be there and the system would be operational. Reviewed by: kib MFC after: 1 month	2009-10-03 11:57:21 +00:00
kmacy	46de945b60	make read_eflags and write_eflags accomplish the same effect on PVM as native, simplifying interrupt handling	2009-10-01 22:05:38 +00:00
kib	a73674620c	As a workaround, for Intel CPUs, do not use CLFLUSH in pmap_invalidate_cache_range() when self-snoop is apparently not reported in cpu features. We get a reserved trap when clflushing APIC registers window. XEN in full system virtualization mode removes self-snoop from CPU features, making this a problem. Tested by: csjp Reviewed by: alc MFC after: 3 days	2009-10-01 12:52:48 +00:00
rpaulo	fa8d81a34c	Improve 802.11s comment. Spotted by: dougb MFC after: 1 day	2009-10-01 02:08:42 +00:00
avg	b6e8843767	cpufunc.h: unify/correct style of c extension names i386 and amd64 archs only. inline => __inline. [1] __asm__ => __asm. [2] Reviewed by: kib, jhb [1] Suggested by: kib [2] MFC after: 1 week	2009-09-30 16:34:50 +00:00
jkim	6f26f72e08	Copy apm(4) emulation from sys/i386/acpica/acpi_machdep.c and install apm(8) and apm_bios.h on amd64.	2009-09-27 14:00:16 +00:00
bz	4c721eede8	lindev(4) [1] is supposed to be a collection of linux-specific pseudo devices that we also support, just not by default (thus only LINT or module builds by default). While currently there is only "/dev/full" [2], we are planning to see more in the future. We may decide to change the module/dependency logic in the future should the list grow too long. This is not part of linux.ko as also non-linux binaries like kFreeBSD userland or ports can make use of this as well. Suggested by: rwatson [1] (name) Submitted by: ed [2] Discussed with: markm, ed, rwatson, kib (weeks ago) Reviewed by: rwatson, brueffer (prev. version) PR: kern/68961 MFC after: 6 weeks	2009-09-26 12:45:28 +00:00
avg	96f7c01c8c	number of cleanups in i386 and amd64 pci md code o introduce PCIE_REGMAX and use it instead of ad-hoc constant o where 'reg' parameter/variable is not already unsigned, cast it to unsigned before comparison with maximum value to cut off negative values o use PCI_SLOTMAX in several places where 31 or 32 were explicitly used o drop redundant check of 'bytes' in i386 pciereg_cfgread() - valid values are already checked in the subsequent switch Reviewed by: jhb MFC after: 1 week	2009-09-24 07:11:23 +00:00
jhb	3f9fa059d7	Extract the code to find and map the MADT ACPI table during early kernel startup and genericize it so it can be reused to map other tables as well: - Add a routine to walk a list of ACPI subtables such as those used in the APIC and SRAT tables in the MI acpi(4) driver. - Move the routines for mapping and unmapping an ACPI table as well as mapping the RSDT or XSDT and searching for a table with a given signature out into acpica_machdep.c for both amd64 and i386.	2009-09-23 15:42:35 +00:00
jhb	8f447f623d	- Split the logic to parse an SMAP entry out into a separate function on amd64 similar to i386. This fixes a bug on amd64 where overlapping entries would not cause the SMAP parsing to stop. - Change the SMAP parsing code to do a sorted insertion into physmap[] instead of an append to support systems with out-of-order SMAP entries. PR: amd64/138220 Reported by: James R. Van Artsdalen james of jrv org MFC after: 3 days	2009-09-22 16:51:00 +00:00
delphij	5cab0133e2	Build x86bios only for i386/amd64 for now. More work is required to make these functional on other architectures, and the current code breaks sparc64 and powerpc. Spotted by: tinderbox via des	2009-09-21 23:58:29 +00:00
kib	611929da1a	If CPU happens to be in usermode when a T_RESERVED trap occured, then trapsignal is called with ksi.ksi_signo = 0. For debugging kernels, that should end up in panic, for non-debugging kernels behaviour is undefined. Do panic regardeless of execution mode at the moment of trap. Reviewed by: jhb MFC after: 1 month	2009-09-21 09:41:51 +00:00
delphij	07c93a91b6	Automatically depend on x86emu when vesa or dpms is being built into kernel. With this change the user no longer need to remember building this option. Submitted by: swell.k at gmail.com	2009-09-21 07:08:20 +00:00
delphij	928fee6a3a	Enable s3pci on amd64 which works on top of VESA, and allow static building it into kernel on i386 and amd64. Submitted by: swell.k at gmail.com	2009-09-21 07:05:48 +00:00
alc	497f06f54e	When superpages are enabled, add the 2 or 4MB page size to the array of supported page sizes. Reviewed by: jhb MFC after: 3 weeks	2009-09-18 17:09:33 +00:00
alc	309c5ab06f	Add a new sysctl for reporting all of the supported page sizes. Reviewed by: jhb MFC after: 3 weeks	2009-09-18 17:04:57 +00:00
rwatson	53eaed07bc	Use C99 initialization for struct filterops. Obtained from: Mac OS X Sponsored by: Apple Inc. MFC after: 3 weeks	2009-09-12 20:03:45 +00:00
kmacy	54935c59e4	fix UP compilation	2009-09-11 23:41:11 +00:00
jkim	be05b1436b	Consolidate CPUID to CPU family/model macros for amd64 and i386 to reduce unnecessary #ifdef's for shared code between them.	2009-09-10 17:27:36 +00:00
des	e4fc8331e6	As jhb@ pointed out to me, r197057 was incorrect, not least because these are generated files.	2009-09-10 13:20:27 +00:00
kib	91e4a6b834	As was done in r196643 for i386 and amd64, swap the start/end virtual addresses in pmap_invalidate_cache_range(). Reported by: Vincent Hoffman <vince unsane co uk> Reviewed by: jhb MFC after: 3 days	2009-09-09 19:40:54 +00:00
delphij	49000890a8	- Teach vesa(4) and dpms(4) about x86emu. [1] - Add vesa kernel options for amd64. - Connect libvgl library and splash kernel modules to amd64 build. - Connect manual page dpms(4) to amd64 build. - Remove old vesa/dpms files. Submitted by: paradox <ddkprog yahoo com> [1], swell k at gmail.com (with some minor tweaks)	2009-09-09 09:50:31 +00:00
phk	e645b495ed	Get rid of the _NO_NAMESPACE_POLLUTION kludge by creating an architecture specific include file containing the _ALIGN* stuff which <sys/socket.h> needs.	2009-09-08 20:45:40 +00:00
kib	942ebf5dc6	Add missing ';'.	2009-09-04 14:53:12 +00:00
julian	48d3390a19	whitespace commit Submitted by: bde@	2009-09-04 07:29:24 +00:00
julian	5ae8d08f60	Bring i386 up to date with amd64 and others. The macros for PCPU can be slightly simplified, which makes the resulting tangle qa lot easier to understand when trying to read them. MFC after: 4 weeks	2009-09-04 05:40:06 +00:00
jkim	7cb9716a59	Fix confusing comments about default PAT entries.	2009-09-02 16:47:10 +00:00
jkim	eba9c85a4e	- Work around ACPI mode transition problem for recent NVIDIA 9400M chipset based Intel Macs. Since r189055, these platforms started freezing when ACPI is being initialized for unknown reason. For these platforms, we just use the old PAT layout. Note this change is not enough to boot fully on these platforms because of other problems but it makes debugging possible. Note MacBook5,2 may be affected as well but it was not added here because of lack of hardware to test. - Initialize PAT MSR fully instead of reading and modifying it for safety. Reported by: rpaulo, hps, Eygene Ryabinkin (rea-fbsd at codelabs dot ru) Reviewed by: jhb	2009-09-02 16:02:48 +00:00
jhb	0c6bbf27ef	Don't attempt to bind the current thread to the CPU an IRQ is bound to when removing an interrupt handler from an IRQ during shutdown. During shutdown we are already bound to CPU 0 and this was triggering a panic. MFC after: 3 days	2009-09-02 00:39:59 +00:00
adrian	92e0abec60	Delete whitespace not in i386/pmap.c	2009-09-01 12:17:47 +00:00
adrian	ce36458bc8	Migrate to use cpuset_t.	2009-09-01 06:15:50 +00:00
adrian	a935e0180a	Merge in the pat_works work from sys/i386/i386/pmap.c - primarily to reduce diff size.	2009-09-01 05:15:45 +00:00
adrian	5d7c160965	Fix broken build.	2009-09-01 03:44:25 +00:00
adrian	b1e7b6996c	Revert previous commit; that was left-over junk in the tree.	2009-08-31 23:35:59 +00:00
adrian	08a823972b	Shuffle pagezero() into the same location as in sys/i386/i386/pmap.c.	2009-08-31 23:30:39 +00:00
jhb	19a52c95a5	Simplify pmap_change_attr() a bit: - Always calculate the cache bits instead of doing it on-demand. - Always set changed to TRUE rather than only doing it if it is false. Discussed with: alc MFC after: 3 days	2009-08-31 18:41:13 +00:00
jhb	d76071806c	Improve pmap_change_attr() so that it is able to demote a large (2/4MB) page into 4KB pages as needed. This should be fairly rare in practice on i386. This includes merging the following changes from the amd64 pmap: 180430, 180485, 180845, 181043, 181077, and 196318. - Add basic support for changing attributes on PDEs to pmap_change_attr() similar to the support in the initial version of pmap_change_attr() on amd64 including inlines for pmap_pde_attr() and pmap_pte_attr(). - Extend pmap_demote_pde() to include the ability to instantiate a new page table page where none existed before. - Enhance pmap_change_attr(). Use pmap_demote_pde() to demote a 2/4MB page mapping to 4KB page mappings when the specified attribute change only applies to a portion of the 2/4MB page. Previously, in such cases, pmap_change_attr() gave up and returned an error. - Correct a critical accounting error in pmap_demote_pde(). Reviewed by: alc MFC after: 3 days	2009-08-31 17:42:52 +00:00
delphij	ff163fe63b	Partially revert 196524: this part of change should not be committed as part of the changeset - it's an unrelated one. Reported by: danfe	2009-08-31 17:34:11 +00:00
bz	840afe36da	Make sure FreeBSD binaries without .note.ABI-tag section work correctly and do not match a colliding Debian GNU/kFreeBSD brandinfo statements. For this mark the Debian GNU/kFreeBSD brandinfo that it must have an .note.ABI-tag section and ignore the old EI_OSABI brandinfo when comparing a possibly colliding set of options. Due to SYSINIT we add the brandinfo in a non-deterministic order, so native FreeBSD is not always first. We may want to consider to force native FreeBSD to come first as well. The only way a problem could currently be noticed is when running an i386 binary without the .note.ABI-tag on amd64 and the Debian GNU/kFreeBSD brandinfo was matched first, as the fallback to ld-elf32.so.1 does not exist in that case. Reported and tested by: ticso In collaboration with: kib MFC after: 3 days	2009-08-30 14:38:17 +00:00
rnoland	acf2898b52	Swap the start/end virtual addresses in pmap_invalidate_cache_range(). This fixes the functionality on non SelfSnoop hardware. Found by: rnoland Submitted by: alc Reviewed by: kib MFC after: 3 days	2009-08-29 16:01:21 +00:00
glebius	c24080cf00	Fix build broken in r196524.	2009-08-25 14:08:33 +00:00
delphij	61c6f92db7	Fix VESA modes and allow 8bit depth modes. PR: i386/124902 Submitted by: paradox <ddkprog yahoo com> MFC after: 2 months	2009-08-24 22:35:53 +00:00
bz	ba7b3afabc	Fix handling of .note.ABI-tag section for GNU systems [1]. Handle GNU/Linux according to LSB Core Specification 4.0, Chapter 11. Object Format, 11.8. ABI note tag. Also check the first word of desc, not only name, according to glibc abi-tags specification to distinguish between Linux and kFreeBSD. Add explicit handling for Debian GNU/kFreeBSD, which runs on our kernels as well [2]. In {amd64,i386}/trap.c, when checking osrel of the current process, also check the ABI to not change the signal behaviour for Linux binary processes, now that we save an osrel version for all three from the lists above in struct proc [2]. These changes make it possible to run FreeBSD, Debian GNU/kFreeBSD and Linux binaries on the same machine again for at least i386 and amd64, and no longer break kFreeBSD which was detected as GNU(/Linux). PR: kern/135468 Submitted by: dchagin [1] (initial patch) Suggested by: kib [2] Tested by: Petr Salinger (Petr.Salinger seznam.cz) for kFreeBSD Reviewed by: kib MFC after: 3 days	2009-08-24 16:19:47 +00:00
jkim	f00e60c6b2	Check whether the SMBIOS reports reasonable amount of memory. If it is less than "avail memory", fall back to Maxmem to avoid user confusion. We use SMBIOS information to display "real memory" since r190599 but some broken SMBIOS implementation reported only half of actual memory. Tested by: bz Approved by: re (kib)	2009-08-20 22:58:05 +00:00
jhb	9b0755de9f	Temporarily revert the new-bus locking for 8.0 release. It will be reintroduced after HEAD is reopened for commits by re@. Approved by: re (kib), attilio	2009-08-20 19:17:53 +00:00
ed	0c0a18c1ff	Make the MacBookPro3,1 hardware boot again. Tested by: Patrick Lamaiziere <patfbsd davenulle org> Approved by: re (kib)	2009-08-19 20:39:33 +00:00
attilio	38b1922900	Port recent IPI enhachements to en: * Introduce the ipi_nmi_handler() function for the Xen infrastructure * Fixup adeguately the ipi sender functions Approved by: re (kib)	2009-08-15 18:37:06 +00:00
jhb	d51166f15e	Adjust the handling of the local APIC PMC interrupt vector: - Provide lapic_disable_pmc(), lapic_enable_pmc(), and lapic_reenable_pmc() routines in the local APIC code that the hwpmc(4) driver can use to manage the local APIC PMC interrupt vector. - Do not enable the local APIC PMC interrupt vector by default when HWPMC_HOOKS is enabled. Instead, the hwpmc(4) driver explicitly enables the interrupt when it is succesfully initialized and disables the interrupt when it is unloaded. This avoids enabling the interrupt on unsupported CPUs which may result in spurious NMIs. Reported by: rnoland Reviewed by: jkoshy Approved by: re (kib) MFC after: 2 weeks	2009-08-14 21:05:08 +00:00
attilio	e85ca71aad	* Completely Remove the option STOP_NMI from the kernel. This option has proven to have a good effect when entering KDB by using a NMI, but it completely violates all the good rules about interrupts disabled while holding a spinlock in other occasions. This can be the cause of deadlocks on events where a normal IPI_STOP is expected. * Adds an new IPI called IPI_STOP_HARD on all the supported architectures. This IPI is responsible for sending a stop message among CPUs using a privileged channel when disponible. In other cases it just does match a normal IPI_STOP. Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64 architectures, while on the other has a normal IPI_STOP effect. It is responsibility of maintainers to eventually implement an hard stop when necessary and possible. * Use the new IPI facility in order to implement a new userend SMP kernel function called stop_cpus_hard(). That is specular to stop_cpu() but it does use the privileged channel for the stopping facility. * Let KDB use the newly introduced function stop_cpus_hard() and leave stop_cpus() for all the other cases * Disable interrupts on CPU0 when starting the process of APs suspension. * Style cleanup and comments adding This patch should fix the reboot/shutdown deadlocks many users are constantly reporting on mailing lists. Please don't forget to update your config file with the STOP_NMI option removal Reviewed by: jhb Tested by: pho, bz, rink Approved by: re (kib)	2009-08-13 17:09:45 +00:00
attilio	7f42e47a67	Make the newbus subsystem Giant free by adding the new newbus sxlock. The newbus lock is responsible for protecting newbus internIal structures, device states and devclass flags. It is necessary to hold it when all such datas are accessed. For the other operations, softc locking should ensure enough protection to avoid races. Newbus lock is automatically held when virtual operations on the device and bus are invoked when loading the driver or when the suspend/resume take place. For other 'spourious' operations trying to access/modify the newbus topology, newbus lock needs to be automatically acquired and dropped. For the moment Giant is also acquired in some key point (modules subsystem) in order to avoid problems before the 8.0 release as module handlers could make assumptions about it. This Giant locking should go just after the release happens. Please keep in mind that the public interface can be expanded in order to provide more support, if there are really necessities at some point and also some bugs could arise as long as the patch needs a bit of further testing. Bump __FreeBSD_version in order to reflect the newbus lock introduction. Reviewed by: ed, hps, jhb, imp, mav, scottl No answer by: ariff, thompsa, yongari Tested by: pho, G. Trematerra <giovanni dot trematerra at gmail dot com>, Brandon Gooch <jamesbrandongooch at gmail dot com> Sponsored by: Yahoo! Incorporated Approved by: re (ksmith)	2009-08-02 14:28:40 +00:00
ed	c5b1acbb23	Make the MacBook3,1 boot again. Approved by: re (kib)	2009-08-02 11:26:23 +00:00
kib	eafde3dc55	Fix XEN build breakage, by implementing pmap_invalidate_cache_range() and using it when appropriate. Merge analogue of the r195836 optimization to XEN. Approved by: re (kensmith)	2009-07-29 19:38:33 +00:00
kib	7b17971146	As was done in r195820 for amd64, use clflush for flushing cache lines when memory page caching attributes changed, and CPU does not support self-snoop, but implemented clflush, for i386. Take care of possible mappings of the page by sf buffer by utilizing the mapping for clflush, otherwise map the page transiently. Amd64 used direct map. Proposed and reviewed by: alc Approved by: re (kensmith)	2009-07-29 08:49:58 +00:00
rpaulo	9b50a8b4b6	Refine the MacBook hack to only match early models that have Intel ICH. Discussed with: kjim Approved by: re (kib)	2009-07-27 13:51:55 +00:00
jhb	44220d7e1e	Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver. Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks	2009-07-24 13:50:29 +00:00
alc	fc8defe5dd	Eliminate unnecessary cache and TLB flushes by pmap_change_attr(). (This optimization was implemented in the amd64 version roughly 1 year ago.) Approved by: re (kensmith)	2009-07-23 19:43:23 +00:00
alc	6e8105e16a	Change the handling of fictitious pages by pmap_page_set_memattr() on amd64 and i386. Essentially, fictitious pages provide a mechanism for creating aliases for either normal or device-backed pages. Therefore, pmap_page_set_memattr() on a fictitious page needn't update the direct map or flush the cache. Such actions are the responsibility of the "primary" instance of the page or the device driver that "owns" the physical address. For example, these actions are already performed by pmap_mapdev(). The device pager needn't restore the memory attributes on a fictitious page before releasing it. It's now pointless. Add pmap_page_set_memattr() to the Xen pmap. Approved by: re (kib)	2009-07-19 21:40:19 +00:00
alc	40432bac3b	An addendum to r195649, "Add support to the virtual memory system for configuring machine-dependent memory attributes...": Don't set the memory attribute for a "real" page that is allocated to a device object in vm_page_alloc(). It is a pointless act, because the device pager replaces this "real" page with a "fake" page and sets the memory attribute on that "fake" page. Eliminate pointless code from pmap_cache_bits() on amd64. Employ the "Self Snoop" feature supported by some x86 processors to avoid cache flushes in the pmap. Approved by: re (kib)	2009-07-18 01:50:05 +00:00
jkim	f54960f3a9	Match PCI Express root bridge _HID directly instead of relying on _CID. Reviewed by: jhb Approved by: re (kib)	2009-07-13 21:36:31 +00:00
alc	ea60573817	Add support to the virtual memory system for configuring machine- dependent memory attributes: Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior. Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages. Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager: kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386. vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes. Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping. Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7. In collaboration with: jhb Approved by: re (kib)	2009-07-12 23:31:20 +00:00
rpaulo	8424d74020	Implementation of the upcoming Wireless Mesh standard, 802.11s, on the net80211 wireless stack. This work is based on the March 2009 D3.0 draft standard. This standard is expected to become final next year. This includes two main net80211 modules, ieee80211_mesh.c which deals with peer link management, link metric calculation, routing table control and mesh configuration and ieee80211_hwmp.c which deals with the actually routing process on the mesh network. HWMP is the mandatory routing protocol on by the mesh standard, but others, such as RA-OLSR, can be implemented. Authentication and encryption are not implemented. There are several scripts under tools/tools/net80211/scripts that can be used to test different mesh network topologies and they also teach you how to setup a mesh vap (for the impatient: ifconfig wlan0 create wlandev ... wlanmode mesh). A new build option is available: IEEE80211_SUPPORT_MESH and it's enabled by default on GENERIC kernels for i386, amd64, sparc64 and pc98. Drivers that support mesh networks right now are: ath, ral and mwl. More information at: http://wiki.freebsd.org/WifiMesh Please note that this work is experimental. Also, please note that bridging a mesh vap with another network interface is not yet supported. Many thanks to the FreeBSD Foundation for sponsoring this project and to Sam Leffler for his support. Also, I would like to thank Gateworks Corporation for sending me a Cambria board which was used during the development of this project. Reviewed by: sam Approved by: re (kensmith) Obtained from: projects/mesh11s	2009-07-11 15:02:45 +00:00
trasz	09784497a2	There is an optimization in chmod(1), that makes it not to call chmod(2) if the new file mode is the same as it was before; however, this optimization must be disabled for filesystems that support NFSv4 ACLs. Chmod uses pathconf(2) to determine whether this is the case - however, pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't. This change adds lpathconf(3) to make it possible to solve that problem in a clean way. Reviewed by: rwatson (earlier version) Approved by: re (kib)	2009-07-08 15:23:18 +00:00
jhb	19e5186727	After the per-CPU IDT changes, the IDT vector of an interrupt could change when the interrupt was moved from one CPU to another. If the interrupt was enabled, then the old IDT vector needs to be disabled and the new IDT vector needs to be enabled. This was mostly masked prior to the recent MSI changes since in the older code almost all allocated IDT vectors were already enabled and the enabled vectors on the BSP during boot covered enough of the IDT range. However, after the MSI changes, MSI interrupts that were allocated but not enabled (e.g. DRM with MSI) during boot could result in an allocated IDT vector that wasn't enabled. The round-robin at the end of boot could place another interrupt at the same IDT vector without enabling the IDT vector causing trap 30 faults. Fix this by explicitly disabling/enabling the old and new IDT vectors for enabled interrupt sources when moving an interrupt between CPUs via the pic_assign_cpu() method. While here, fix a bug in my earlier changes so that an I/O APIC interrupt pin is left unchanged if ioapic_assign_cpu() fails to allocate a new IDT vector and returns ENOSPC. Approved by: re (kensmith)	2009-07-06 18:23:00 +00:00
alc	8c4633ab62	PAE adds another level to the i386 page table. This level is a small 4-entry table that must be located within the first 4GB of RAM. This requirement is met by defining an UMA zone with a custom back-end allocator function. This revision makes two changes to this back-end allocator function: (1) It replaces the use of contigmalloc() with the use of kmem_alloc_contig(). This eliminates "double accounting", i.e., accounting by both the UMA zone and malloc tags. (I made the same change for the same reason to the zones supporting jumbo frames a week ago.) (2) It passes through the "wait" parameter, i.e., M_WAITOK, M_ZERO, etc. to kmem_alloc_contig() rather than ignoring it. pmap_init() calls uma_zalloc() with both M_WAITOK and M_ZERO. At the moment, this is harmless only because the default behavior of contigmalloc()/kmem_alloc_contig() is to wait and because pmap_init() doesn't really depend on the memory being zeroed. The back-end allocator function in the Xen pmap is dead code. I am changing it nonetheless because I don't want to leave any "bad examples" in the source tree for someone to copy at a later date. Approved by: re (kib)	2009-07-05 21:40:21 +00:00
sam	c67dff7aca	Cleanup ALIGNED_POINTER: o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v) o define as "1" on amd64 and i386 where there is no restriction o make the type returned consistent with ALIGN o remove _ALIGNED_POINTER o make associated comments consistent Reviewed by: bde, imp, marcel Approved by: re (kensmith)	2009-07-05 17:45:48 +00:00
ed	f11b84cef6	Enable POSIX semaphores on all non-embedded architectures by default. More applications (including Firefox) seem to depend on this nowadays, so not having this enabled by default is a bad idea. Proposed by: miwi Patch by: Florian Smeets <flo kasimir com> Approved by: re (kib)	2009-07-02 18:24:37 +00:00
jhb	76256698a1	Improve the handling of cpuset with interrupts. - For x86, change the interrupt source method to assign an interrupt source to a specific CPU to return an error value instead of void, thus allowing it to fail. - If moving an interrupt to a CPU fails due to a lack of IDT vectors in the destination CPU, fail the request with ENOSPC rather than panicing. - For MSI interrupts on x86 (but not MSI-X), only allow cpuset to be used on the first interrupt in a group. Moving the first interrupt in a group moves the entire group. - Use the icu_lock to protect intr_next_cpu() on x86 instead of the intr_table_lock to fix a LOR introduced in the last set of MSI changes. - Add a new privilege PRIV_SCHED_CPUSET_INTR for using cpuset with interrupts. Previously, binding an interrupt to a CPU only performed a privilege check if the interrupt had an interrupt thread. Interrupts without a thread could be bound by non-root users as a result. - If an interrupt event's assign_cpu method fails, then restore the original cpuset mask for the associated interrupt thread. Approved by: re (kib)	2009-07-01 17:20:07 +00:00
dfr	5d248bb05f	Remove the old kernel RPC implementation and the NFS_LEGACYRPC option. Approved by: re	2009-06-30 19:03:27 +00:00
rwatson	da78c9e4a2	Replace AUDIT_ARG() with variable argument macros with a set more more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week	2009-06-27 13:58:44 +00:00
jhb	12b7caa1dd	Return ENOSYS instead of EINVAL for invalid function codes to match the behavior of Linux. Reported by: Alexander Best alexbestms of math.uni-muenster.de Approved by: re (kib)	2009-06-26 19:39:33 +00:00
alc	1ce12d013e	Correct the #endif comment. Noticed by: jmallett Approved by: re (kib)	2009-06-26 16:22:24 +00:00
alc	91cafd48b1	This change is the next step in implementing the cache control functionality required by video card drivers. Specifically, this change introduces vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all architectures. In addition, this changes adds a vm_cache_mode_t parameter to kmem_alloc_contig() and vm_phys_alloc_contig(). These will be the interfaces for allocating mapped kernel memory and physical memory, respectively, with non-default cache modes. In collaboration with: jhb	2009-06-26 04:47:43 +00:00
jhb	6d5618d67c	Fix kernels compiled without SMP support. Make intr_next_cpu() available for UP kernels but as a stub that always returns the single CPU's local APIC ID. Reported by: kib	2009-06-25 20:35:46 +00:00
jhb	7f94b48606	- Restore the behavior of pre-allocating IDT vectors for MSI interrupts. This is mostly important for the multiple MSI message case where the IDT vectors for the entire group need to be allocated together. This also restores the assumptions made by the PCI bus code that it could invoke PCIB_MAP_MSI() once MSI vectors were allocated. - To avoid whiplash with CPU assignments, change the way that CPUs are assigned to interrupt sources on activation. Instead of assigning the CPU via pic_assign_cpu() before calling enable_intr(), allow the different interrupt source drivers to ask the MD interrupt code which CPU to use when they allocate an IDT vector. I/O APIC interrupt pins do this in their pic_enable_intr() routines giving the same behavior as before. MSI sources do it when the IDT vectors are allocated during msi_alloc() and msix_alloc(). - Change the intr_table_lock from an sx lock to a mutex. Tested by: rnoland	2009-06-25 18:13:46 +00:00
rwatson	3ef00864f7	Fix ibcs2_ipc.c build by adding missing limits.h include. Submitted by: keramida	2009-06-25 07:25:39 +00:00
jhb	6f52fe78fb	Change the ABI of some of the structures used by the SYSV IPC API: - The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned short. - The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned short. - The mode member of struct ipc_perm is now mode_t instead of unsigned short (this is merely a style bug). - The rather dubious padding fields for ABI compat with SV/I386 have been removed from struct msqid_ds and struct semid_ds. - The shm_segsz member of struct shmid_ds is now a size_t instead of an int. This removes the need for the shm_bsegsz member in struct shmid_kernel and should allow for complete support of SYSV SHM regions >= 2GB. - The shm_nattch member of struct shmid_ds is now an int instead of a short. - The shm_internal member of struct shmid_ds is now gone. The internal VM object pointer for SHM regions has been moved into struct shmid_kernel. - The existing __semctl(), msgctl(), and shmctl() system call entries are now marked COMPAT7 and new versions of those system calls which support the new ABI are now present. - The new system calls are assigned to the FBSD-1.1 version in libc. The FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls. - A simplistic framework for tagging system calls with compatibility symbol versions has been added to libc. Version tags are added to system calls by adding an appropriate __sym_compat() entry to src/lib/libc/incldue/compat.h. [1] PR: kern/16195 kern/113218 bin/129855 Reviewed by: arch@, rwatson Discussed with: kan, kib [1]	2009-06-24 21:10:52 +00:00
jhb	b4f97709ae	Whitespace fix.	2009-06-24 19:16:48 +00:00
mav	d34f6ea286	Make algorithm a bit more bulletproof.	2009-06-23 23:16:37 +00:00
jeff	5bc3a65e40	Implement a facility for dynamic per-cpu variables. - Modules and kernel code alike may use DPCPU_DEFINE(), DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined PCPU_. Requires only one extra instruction more than PCPU_ and is virtually the same as __thread for builtin and much faster for shared objects. DPCPU variables can be initialized when defined. - Modules are supported by relocating the module's per-cpu linker set over space reserved in the kernel. Modules may fail to load if there is insufficient space available. - Track space available for modules with a one-off extent allocator. Free may block for memory to allocate space for an extent. Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas	2009-06-23 22:42:39 +00:00
mav	4528681d83	Rework r193814: While general idea of patch was good, it was not working properly due the way it was implemented. When we are using same timer interrupt for several of hard/prof/stat purposes we should not send several IPIs same time to other CPUs. Sending several IPIs same time leads to terrible accounting/profiling results due to strong synchronization effect, when the second interrupt handler accounts processing of the first one. Interlink timer events in a such way, that no more then one IPI is sent for any original timer interrupt.	2009-06-23 21:45:33 +00:00
rpaulo	792757454a	* Driver for ACPI WMI (Windows Management Instrumentation) * Driver for ACPI HP extra functionations, which required ACPI WMI driver. Submitted by: Michael <freebsdusb at bindone.de> Approved by: re MFC after: 2 weeks	2009-06-23 13:17:25 +00:00
alc	4dc742f489	Eliminate dead code. These definitions should have been deleted with the introduction of i686_mem.c in r45405. Merge adjacent #ifdef _KERNEL/#endif blocks.	2009-06-22 04:21:02 +00:00
brooks	4cdb86f203	Use NGROUPS instead of NGROUPS_MAX as the limits on setgroups and getgroups for ibcs emulation. It seems vanishingly likely any programs will actually be affected since they probably assume a much lower value and use a static array size.	2009-06-20 18:52:02 +00:00
brooks	f53c1c309d	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
jhb	6bbcbf2460	Regen for added flags field.	2009-06-17 19:53:20 +00:00
jhb	04e0bdb73d	Move (read\|write)_cyrix_reg() inlines from specialreg.h to cpufunc.h. specialreg.h now consists solely of register-related macros.	2009-06-16 15:13:18 +00:00
mav	c4234dfc7e	Forbid multi-vector MSI interrupt vectors migration to another CPU once allocated. MSI have strict vectors allocation requirements, which are not satisfied now during reallocation. This is not the best possible solution, but better then just broken, as it was. No objections: current@, arch@, jhb@	2009-06-15 13:47:49 +00:00
alc	07cfd3813e	Long, long ago in r27464 special case code for mapping device-backed memory with 4MB pages was added to pmap_object_init_pt(). This code assumes that the pages of a OBJT_DEVICE object are always physically contiguous. Unfortunately, this is not always the case. For example, jhb@ informs me that the recently introduced /dev/ksyms driver creates a OBJT_DEVICE object that violates this assumption. Thus, this revision modifies pmap_object_init_pt() to abort the mapping if the OBJT_DEVICE object's pages are not physically contiguous. This revision also changes some inconsistent if not buggy behavior. For example, the i386 version aborts if the first 4MB virtual page that would be mapped is already valid. However, it incorrectly replaces any subsequent 4MB virtual page mappings that it encounters, potentially leaking a page table page. The amd64 version has a bug of my own creation. It potentially busies the wrong page and always an insufficent number of pages if it blocks allocating a page table page. To my knowledge, there have been no reports of these bugs, hence, their persistance. I suspect that the existing restrictions that pmap_object_init_pt() placed on the OBJT_DEVICE objects that it would choose to map, for example, that the first page must be aligned on a 2 or 4MB physical boundary and that the size of the mapping must be a multiple of the large page size, were enough to avoid triggering the bug for drivers like ksyms. However, one side effect of testing the OBJT_DEVICE object's pages for physical contiguity is that a dubious difference between pmap_object_init_pt() and the standard path for mapping devices pages, i.e., vm_fault(), has been eliminated. Previously, pmap_object_init_pt() would only instantiate the first PG_FICTITOUS page being mapped because it never examined the rest. Now, however, pmap_object_init_pt() uses the new function vm_object_populate() to instantiate them all (in order to support testing their physical contiguity). These pages need to be instantiated for the mechanism that I have prototyped for automatically maintaining the consistency of the PAT settings across multiple mappings, particularly, amd64's direct mapping, to work. (Translation: This change is also being made to support jhb@'s work on the Nvidia feature requests.) Discussed with: jhb@	2009-06-14 19:51:43 +00:00
ed	07b720e0fe	Enable PRINTF_BUFR_SIZE on i386 and amd64 by default. In the past there have been some reports of PRINTF_BUFR_SIZE not functioning correctly. Instead of having garbled console messages, we should just see whether the issues are still there and analyze them. Approved by: re	2009-06-14 18:01:35 +00:00
ed	4798e07722	Clobber "cc" instead of using volatile. Submitted by: Christoph Mallon	2009-06-13 14:30:08 +00:00
ed	ac0e32dbc7	Clobber "cc" instead of using volatile; remove obsolete register keyword. Submitted by: Christoph Mallon	2009-06-13 14:00:10 +00:00
ed	c8fca13ecc	Simplify the inline assembler (and correct potential error) of pte_load_store(). Submitted by: Christoph Mallon	2009-06-13 13:56:06 +00:00
avg	2511f4d43b	strict kobj signatures: fix legacy i386 pcib_write_config impl Reviewed by: imp, current@ Approved by: jhb (mentor)	2009-06-11 17:06:31 +00:00
kib	e1cb2941d4	Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock. Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock. Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly. Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization. Submitted by: jhb [1] Noted by: jhb [2] Reviewed by: jhb Tested by: rnoland	2009-06-10 20:59:32 +00:00
yongari	c9be81a520	Add alc(4), a driver for Atheros AR8131/AR8132 PCIe ethernet controller. These controllers are also known as L1C(AR8131) and L2C(AR8132) respectively. These controllers resembles the first generation controller L1 but usage of different descriptor format and new register mappings over L1 register space requires a new driver. There are a couple of registers I still don't understand but the driver seems to have no critical issues for performance and stability. Currently alc(4) supports the following hardware features. o MSI o TCP Segmentation offload o Hardware VLAN tag insertion/stripping o Tx/Rx interrupt moderation o Hardware statistics counters(dev.alc.%d.stats) o Jumbo frame o WOL AR8131/AR8132 also supports Tx checksum offloading but I disabled it due to stability issues. I'm not sure this comes from broken sample boards or hardware bugs. If you know your controller works without problems you can still enable it. The controller has a silicon bug for Rx checksum offloading, so the feature was not implemented. I'd like to say big thanks to Atheros. Atheros kindly sent sample boards to me and answered several questions I had. HW donated by: Atheros Communications, Inc.	2009-06-10 02:07:58 +00:00
kmacy	b4caedd487	opt in to flowtable on i386/amd64	2009-06-09 21:58:14 +00:00
kmacy	bdcfd6610c	remove flowtable from DEFAULTS	2009-06-09 20:26:52 +00:00
ariff	ee44c932b2	When using i8254 as the only kernel timer source: - Interpolate stat/prof clock using clkintr() in a similar fashion to local APIC timer, since statclock usually run slower. - Liberate hardclockintr() from taking the burden of handling both stat and prof clock interrupt. Instead, send IPIs within clkintr() to handle those.	2009-06-09 07:26:52 +00:00
ariff	9df3e8cc3a	Move C1E workaround into its own idle function. Previous workaround works only during initial booting process, while there are laptops/BIOSes that tend to act 'smarter' by force enabling C1E if the main power adapter being pulled out, rendering previous workaround ineffective. Given the fact that we still rely on local APIC to drive timer interrupt, this workaround should keep all Turion (probably Phenom too) X\d+ alive whether its on battery power or not. URL: http://lists.freebsd.org/pipermail/freebsd-acpi/2008-April/004858.html http://lists.freebsd.org/pipermail/freebsd-acpi/2008-May/004888.html Tested by: Peter Jeremy <peterjeremy at optushome d com d au>	2009-06-09 04:17:36 +00:00
delphij	cfc92528a5	Add line width calculations for 15/16 and 24/32 bit modes in case the "Get Scan Line Length" function fails, as it does in Parallels (in Version 2.2, Build 2112 at least). PR: i386/127367 Obtained from: DragonFly Submitted by: Pedro Giffuni MFC after: 1 month	2009-06-09 00:54:57 +00:00
jkim	8ede8714ca	Rewrite OsdSynch.c to reflect the latest ACPICA more closely: - Implement ACPI semaphore (ACPI_SEMAPHORE) with condvar(9) and mutex(9). - Implement ACPI mutex (ACPI_MUTEX) with mutex(9). - Implement ACPI lock (ACPI_SPINLOCK) with spin mutex(9).	2009-06-08 20:07:16 +00:00
ed	ca6dbec08b	Revert my change; reintroduce __gnu89_inline. It turns out our compiler in stable/7 can't build this code anymore. Even though my opinion is that those people should just run `make kernel-toolchain' before building a kernel, I am willing to wait and commit this after we've branched stable/8. Requested by: rwatson	2009-06-08 18:23:43 +00:00
ed	70abc0766a	Remove __gnu89_inline. Now that we use C99 almost everywhere, just use C99-style in the pmap code. Since the pmap code is the only consumer of __gnu89_inline, remove it from cdefs.h as well. Because the flag was only introduced 17 months ago, I don't expect any problems. Reviewed by: alc	2009-06-08 17:27:25 +00:00
adrian	92f847edaf	Decouple the i386 native and i386 Xen APIC definitions a little further. I'm experimenting locally with xen APIC emulation a bit and this makes it easier to migrate APIC entries between being bitmapped and not being bitmapped.	2009-06-07 22:52:48 +00:00
jkim	6d358bddff	Import ACPICA 20090521.	2009-06-05 18:44:36 +00:00
rwatson	f4934662e5	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
rwatson	14f4a9dd42	Remove MAC kernel config files and add "options MAC" to GENERIC, with the goal of shipping 8.0 with MAC support in the default kernel. No policies will be compiled in or enabled by default, but it will now be possible to load them at boot or runtime without a kernel recompile. While the framework is not believed to impose measurable overhead when no policies are loaded (a result of optimization over the past few months in HEAD), we'll continue to benchmark and optimize as the release approaches. Please keep an eye out for performance or functionality regressions that could be a result of this change. Approved by: re (kensmith) Obtained from: TrustedBSD Project	2009-06-02 18:31:08 +00:00
dchagin	bb8f1f3e67	Implement accept4 syscall. Approved by: kib (mentor) MFC after: 1 month	2009-06-01 20:48:39 +00:00
rwatson	b850b85534	Regenerate generated syscall files following changes to struct sysent in r193234.	2009-06-01 16:14:38 +00:00
adrian	30e73eac6d	Fix the MP IPI code to differentiate between bitmapped IPIs and function IPIs. This attempts to fix the IPI handling code to correctly differentiate between bitmapped IPIs and function IPIs. The Xen IPIs were on low numbers which clashed with the bitmapped IPIs. This commit bumps those IPI numbers up to 240 and above (just like in the i386 code) and fiddles with the ipi_vectors[] logic to call the correct function. This still isn't "right". Specifically, the IPI code may work fine for TLB shootdown events but the rendezvous/lazypmap IPIs are thrown by calling ipi_*() routines which don't set the call_func stuff (function id, addr1, addr2) that the TLB shootdown events are. So the Xen SMP support is still broken. PR: 135069	2009-05-31 08:11:39 +00:00
adrian	ff7c20c8cb	Remove some unused code in ipi_selected() . The code path this was copied from (sys/i386/i386/mp_machdep.c:ipi_selected()) handles bitmap'ed IPIs and normal IPIs via separate notification paths. Xen SMP handles them the same way.	2009-05-31 07:25:24 +00:00
adrian	e0e83e1ad2	Even though I'm not quite sure that the call_func stuff will work properly in all the places/cases IPI messages will be generated, at least be consistent with how the call_data pointer is assigned and cleared (ie, all done inside the spinlock. Ensure that its NULL before continuing, just to try and identify situations where things are going horribly wrong.	2009-05-30 15:20:25 +00:00
adrian	d03ca3acc6	Don't schedule a CALL_FUNCTION_VECTOR software IPI if the IPI was signaled via the bitmap (and thus sent via RESCHEDULE_VECTOR.)	2009-05-30 14:59:08 +00:00
adrian	9849e67ed2	Correctly report the IPI IRQs being created; make it clear what vectors they are for.	2009-05-30 06:37:03 +00:00
jamie	572db1408a	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
adrian	9217f2901b	Revert to 2-clause.	2009-05-29 13:48:42 +00:00
adrian	cf3db32dfd	Fix the Xen TOD update when the hypervisor wall clock is nudged. The "wall clock" in the current code is actually the hypervisor start time. The time of day is the "start time" plus the hypervisor "uptime". Large enough bumps in the dom0 clock lead to a hypervisor "bump" which is implemented as a bump in the start time, not the uptime. The clock.c routines were reading in the hypervisor start time and then using this as the TOD. This meant that any hypervisor time bump would cause the FreeBSD DomU to set its TOD to the hypervisor start time, rather than the actual TOD. This fix is a bit hacky and some reshuffling should be done later on to clarify what is going on. I've left the wall clock code alone. (The code which updates shadow_tv and shadow_tv_version.) A new routine adds the uptime to the shadow_tv, which is then used to update the TOD. I've included some debugging so it is obvious when the clock is nudged. PR: 135008	2009-05-29 13:43:21 +00:00
adrian	a5c69d4a68	Migrate the Xen hypervisor clock reading routines into something sharable.	2009-05-29 13:36:06 +00:00
adrian	f733124272	Say hello to a very basic, read-only, Xen Hypervisor RTC. The hypervisor doesn't provide a single "TOD" - it instead provides a "start time" and a "running time". These are added together to form the current TOD. The TOD is in UTC. This RTC is only (initially) designed to be read at startup. There's some further poking that needs to happen to pick up hypervisor time changes (ie, by the Dom0 time being adjusted by something). This time adjustment currently can cause "weird stuff" in the DomU clock; I'll begin investigating and repairing that in subsequent commits. PR: 135008	2009-05-28 04:17:05 +00:00
imp	c67e70f989	We don't need d_thread_t for cross-branch portability here anymore. Move do struct thread * instead.	2009-05-20 16:47:40 +00:00
imp	bc7a515649	Some minor style changes: o Convert K&R function definitions to ANSI o Eliminate spaces/tabs that should have been deleted as part of the de__P efforts o Use struct thread * in preference to d_thread_t *.	2009-05-20 16:29:22 +00:00
jhb	e7900df956	Don't bother reading the initial value of the machine check banks during startup on Pentium 4 CPUs. This wasn't safe to do on APs during AP startup, was of limited value, and won't be used for future processors.	2009-05-20 16:11:22 +00:00
jhb	6048eaa315	- Add a tunable 'hw.mca.enabled' that can be used to enable/disable the machine check code. Disable it by default for now. - When computing the mask of bits that determines a non-restartable event during a machine check exception, or-in the overflow flag rather than replacing the other flags. PR: i386/134586 [2] Submitted by: Andi Kleen andi-fbsd firstfloor.org	2009-05-18 21:50:06 +00:00
jhb	b80bf31073	Add a read-only sysctl hw.pci.mcfg to mirror the tunable by the same name. MFC after: 1 week	2009-05-18 21:47:32 +00:00
jhb	f5760f10df	Bump CACHE_LINE_SIZE to 128 for x86. Intel's manuals explicitly recommend using 128 byte alignment for locks. (See IA-32 SDM Vol 3A 7.11.6.7)	2009-05-18 19:33:59 +00:00
marcel	8b09116a5a	Add cpu_flush_dcache() for use after non-DMA based I/O so that a possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet. Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which D-cache are already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chips DMA engines, by-passing the D-cache altogether.	2009-05-18 18:37:18 +00:00
dchagin	5351e06699	Somewhere between 2.6.23 and 2.6.27, Linux added SOCK_CLOEXEC and SOCK_NONBLOCK flags, that allow to save fcntl() calls. Implement a variation of the socket() syscall which takes a flags in addition to the type argument. Approved by: kib (mentor) MFC after: 1 month	2009-05-16 18:48:41 +00:00
jhb	526729c1b6	Trim the default set of device hints on i386 and amd64: - Remove vga0 and the disabled uart2/uart3 hints from both platforms. - Remove hints for ISA adv0, bt0, aha0, aic0, ed0, cs0, sn0, ie0, fe0, and le0 from i386. All these hints were marked 'disabled' and thus already did not work "out of the box". Discussed with: imp	2009-05-14 21:53:35 +00:00
attilio	902219327c	FreeBSD right now support 32 CPUs on all the architectures at least. With the arrival of 128+ cores it is necessary to handle more than that. One of the first thing to change is the support for cpumask_t that needs to handle more than 32 bits masking (which happens now). Some places, however, still assume that cpumask_t is a 32 bits mask. Fix that situation by using always correctly cpumask_t when needed. While here, remove the part under STOP_NMI for the Xen support as it is broken in any case. Additively make ipi_nmi_pending as static. Reviewed by: jhb, kmacy Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-05-14 17:43:00 +00:00
jhb	370298a108	Implement simple machine check support for amd64 and i386. - For CPUs that only support MCE (the machine check exception) but not MCA (i.e. Pentium), all this does is print out the value of the machine check registers and then panic when a machine check exception occurs. - For CPUs that support MCA (the machine check architecture), the support is a bit more involved. - First, there is limited support for decoding the CPU-independent MCA error codes in the kernel, and the kernel uses this to output a short description of any machine check events that occur. - When a machine check exception occurs, all of the MCx banks on the current CPU are scanned and any events are reported to the console before panic'ing. - To catch events for correctable errors, a periodic timer kicks off a task which scans the MCx banks on all CPUs. The frequency of these checks is controlled via the "hw.mca.interval" sysctl. - Userland can request an immediate scan of the MCx banks by writing a non-zero value to "hw.mca.force_scan". - If any correctable events are encountered, the appropriate details are stored in a 'struct mca_record' (defined in <machine/mca.h>). The "hw.mca.count" is a count of such records and each record may be queried via the "hw.mca.records" tree by specifying the record index (0 .. count - 1) as the next name in the MIB similar to using PIDs with the kern.proc.* sysctls. The idea is to export machine check events to userland for more detailed processing. - The periodic timer and hw.mca sysctls are only present if the CPU supports MCA. Discussed with: emaste (briefly) MFC after: 1 month	2009-05-13 17:53:04 +00:00
alc	2553ba9c64	Correct a rare use-after-free error in pmap_copy(). This error was introduced in amd64 revision 1.540 and i386 revision 1.547. However, it had no harmful effects until after a recent change, r189698, on amd64. (In other words, the error is harmless in RELENG_7.) The error is triggered by the failure to allocate a pv entry for the one and only mapping in a page table page. I am addressing the error by changing pmap_copy() to abort if either pv entry allocation or page table page allocation fails. This is appropriate because the creation of mappings by pmap_copy() is optional. They are a (possible) optimization, and not a requirement. Correct a nearby whitespace error in the i386 pmap_copy(). Crash reported by: jeff@ MFC after: 6 weeks	2009-05-13 07:42:53 +00:00
brueffer	1e50e07e85	Remove unused variables. Found with: Coverity Prevent(tm) CID: 4285, 4286	2009-05-12 22:11:02 +00:00
dchagin	51f122997d	Do not export AT_CLKTCK when emulating Linux kernel prior to 2.4.0, as it has appeared in the 2.4.0-rc7 first time. Being exported, AT_CLKTCK is returned by sysconf(_SC_CLK_TCK), glibc falls back to the hard-coded CLK_TCK value when aux entry is not present. Glibc versions prior to 2.2.1 always use hard-coded CLK_TCK value. For older applications/libc's which depends on hard-coded CLK_TCK value user should set compat.linux.osrelease less than 2.4.0. Approved by: kib (mentor)	2009-05-10 18:43:43 +00:00
dchagin	ab5a6b0d18	Rework r189362, r191883. The frequency of the statistics clock is given by stathz. Use stathz if it is available, otherwise use hz. Pointed out by: bde Approved by: kib (mentor)	2009-05-10 18:16:07 +00:00
kuriyama	9913dad783	- Use "device\t" and "options \t" for consistency.	2009-05-10 00:00:25 +00:00
ed	8dbae36d2b	Regenerate system call tables to use SVN ids.	2009-05-08 20:16:04 +00:00
ed	b2bb829ec0	Regenerate ibcs2 system call table.	2009-05-08 20:08:43 +00:00
ed	59fb74ae92	Burn TTY ioctl bridges in compat layers. I really don't want any pieces of code to include ioctl_compat.h, so let the ibcs2 and svr4 compat leave sgtty alone. If they want to support sgtty, they should emulate it on top of termios, not sgtty. The code has been marked with BURN_BRIDGES for a long time. ibcs2 and svr4 are not really popular pieces of code anyway.	2009-05-08 20:06:37 +00:00
zec	639797b2e6	Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import. As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet. Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch. Permit kldload / kldunload operations to be executed only from the default vimage context. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz Approved by: julian (mentor)	2009-05-08 14:11:06 +00:00
jamie	267ea54b44	Move the per-prison Linux MIB from a private one-off pointer to the new OSD-based jail extensions. This allows the Linux MIB to accessed via jail_set and jail_get, and serves as a demonstration of adding jail support to a module. Reviewed by: dchagin, kib Approved by: bz (mentor)	2009-05-07 18:36:47 +00:00
dchagin	010f4da5f8	To avoid excessive code duplication move MI definitions to the MI header file. As it is defined in Linux. Approved by: kib (mentor) MFC after: 1 month	2009-05-07 09:39:20 +00:00
mav	4ba113b178	Do not try to initialize LAPIC timer if we are not going to use it. It solves assertion, when kernel built with INVARIANTS configured to use i8254 timer.	2009-05-05 01:13:20 +00:00
jkim	3b0819d0af	Unlock the largest standard CPUID on Intel CPUs for both amd64 and i386 and fix SMP topology detection. On i386, we extend it to cover Core, Core 2, and Core i7 processors, not just Pentium 4 family, and move it to better place. On amd64, all supported Intel CPUs should have this MSR.	2009-05-04 18:05:27 +00:00
mav	114e5b7de0	Oops, sorry. Fix for fix.	2009-05-04 08:41:54 +00:00
mav	92d6358872	There is no atrtc driver in pc98, so hide atrtcclock_disable variable usage in APM driver for this platform. This should fix pc98 build.	2009-05-04 08:36:47 +00:00
mav	98565a1214	Rename statclock_disable variable to atrtcclock_disable that it actually is, and hide it inside of atrtc driver. Add new tunable hint.atrtc.0.clock controlling it. Setting it to 0 disables using RTC clock as stat-/ profclock sources. Teach i386 and amd64 SMP platforms to emulate stat-/profclocks using i8254 hardclock, when LAPIC and RTC clocks are disabled. This allows to reduce global interrupt rate of idle system down to about 100 interrupts per core, permitting C3 and deeper C-states provide maximum CPU power efficiency.	2009-05-03 17:47:21 +00:00
kmacy	7066780d7a	fix XEN compilation	2009-05-02 22:22:00 +00:00
mav	b704e6092a	Add support for using i8254 and rtc timers as event sources for i386 SMP system. Redistribute hard-/stat-/profclock events to other CPUs using IPI.	2009-05-02 12:59:47 +00:00
dchagin	32b5830d97	Move extern variable definitions to the header file. Approved by: kib (mentor) MFC after: 1 month	2009-05-02 10:06:49 +00:00
mav	50b57c0fb5	Small addition to r191720. Restore previous behaviour for the case of unknown interrupt. Invocation of IRQ -1 crashes my system on resume. Returning 0, as it was, is not perfect also, but at least not so dangerous.	2009-05-01 20:53:37 +00:00
sam	c0a4585083	o add uath o sort usb wireless drivers	2009-05-01 17:20:16 +00:00
mav	364f1b1af0	Use value -1 instead of 0 for marking unused APIC vectors. This fixes IRQ0 routing on LAPIC-enabled systems. Add hint.apic.0.clock tunable. Setting it 0 disables using LAPIC timers as hard-/stat-/profclock sources falling back to using i8254 and rtc timers. On modern CPUs LAPIC is a part of CPU core which is shutting down when CPU enters C3 or deeper power state. It makes no problems for interrupt processing, as chipset wakes up CPU on interrupt triggering. But entering C3 state kills LAPIC timer and freezes system time, making C3 and deeper states practically unusable. Using i8254 timer allows to avoid this problem. By using i8254 timer my T7700 C2D CPU with UP kernel successfully enters C3 state, saving more then a Watt of total idle power (>10%) in addition to all other power-saving techniques. This technique is not working for SMP yet, as only one CPU receives timer interrupts. But I think that problem could be fixed by forwarding interrupts to other CPUs with IPI.	2009-05-01 17:05:49 +00:00
dchagin	dca50049ce	Reimplement futexes. Old implemention used Giant to protect the kernel data structures, but at the same time called malloc(M_WAITOK), that could cause the calling thread to sleep and lost Giant protection. User-visible result was the missed wakeup. New implementation uses one sx lock per futex. The sx protects the futex structures and allows to sleep while copyin or copyout are performed. Unlike linux, we return EINVAL when FUTEX_CMP_REQUEUE operation is requested and either caller specified futexes are equial or second futex already exists. This is acceptable since the situation can only occur from the application error, and glibc falls back to old FUTEX_WAKE operation when FUTEX_CMP_REQUEUE returns an error. Approved by: kib (mentor) MFC after: 1 month	2009-05-01 15:36:02 +00:00
jkim	680fc00a3e	- Fix divide-by-zero panic when SMP kernel is used on UP system[1]. - Avoid possible divide-by-zero panic on SMP system when the CPUID is disabled, unsupported, or buggy. Submitted by: pluknet (pluknet at gmail dot com)[1]	2009-04-30 22:10:04 +00:00
jeff	9339d50dc3	- Add support for cpuid leaf 0xb. This allows us to determine the topology of nehalem/corei7 based systems. - Remove the cpu_cores/cpu_logical detection from identcpu. - Describe the layout of the system in cpu_mp_announce(). Sponsored by: Nokia	2009-04-29 06:54:40 +00:00
jhb	3ca2800ae4	Reduce the number of bounce zones (and thus the number of bounce pages used in some cases): - Ignore DMA tag boundaries when allocating bounce pages. The boundaries don't determine whether or not parts of a DMA request bounce. Instead, they are just used to carve up segments. - Allow tags with sub-page alignment to share bounce pages since bounce pages are always page aligned. Reviewed by: scottl (amd64) MFC after: 1 month	2009-04-23 20:24:19 +00:00
jhb	5ff418d071	Adjust the way we number CPUs on x86 so that we attempt to "group" all logical CPUs in a package. We do this by numbering the non-boot CPUs by starting with the first CPU whose APIC ID is after the boot CPU and wrapping back around to APIC ID 0 if needed rather than always starting at APIC ID 0. While here, adjust the cpu_mp_announce() routine to list CPUs based on the mapping established by assign_cpu_ids() rather than making assumptions about the algorithm assign_cpu_ids() uses. MFC after: 1 month	2009-04-22 21:40:37 +00:00
rwatson	21a8b350dc	Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing a fair number of static data structures, making this an unlikely option to try to change without also changing source code. [1] Change default cache line size on ia64, sparc64, and sun4v to 128 bytes, as this was what rtld-elf was already using on those platforms. [2] Suggested by: bde [1], jhb [2] MFC after: 2 weeks	2009-04-20 12:59:23 +00:00
rwatson	ab17fac487	Add description and cautionary note regarding CACHE_LINE_SIZE. MFC after: 2 weeks Suggested by: alc	2009-04-19 21:26:36 +00:00
rwatson	8df790f38f	For each architecture, define CACHE_LINE_SHIFT and a derived CACHE_LINE_SIZE constant. These constants are intended to over-estimate the cache line size, and be used at compile-time when a run-time tuning alternative isn't appropriate or available. Defaults for all architectures are 64 bytes, except powerpc where it is 128 bytes (used on G5 systems). MFC after: 2 weeks Discussed on: arch@	2009-04-19 20:19:13 +00:00
kmacy	1aef8359b1	- Import infrastructure for caching flows as a means of accelerating L3 and L2 lookups as well as providing stateful load balancing when used with RADIX_MPATH. - Currently compiled in to i386 and amd64 but disabled by default, it can be enabled at runtime with 'sysctl net.inet.flowtable.enable=1'. - Embedded users can remove it entirely from the kernel by adding 'nooption FLOWTABLE' to their kernel config files. - A minimal hookup will be added to ip_output in a subsequent commit. I would like to see more review before bringing in changes that require more churn. Supported by: Bitgravity Inc.	2009-04-19 00:16:04 +00:00
jhb	360bcf2161	Restore bus DMA bounce pages to an offset of 0 when they are released by a tag that has BUS_DMA_KEEP_PG_OFFSET set. Otherwise the page could be reused with a non-zero offset by a tag that doesn't have BUS_DMA_KEEP_PG_OFFSET leading to data corruption. Sleuthing by: avg Reviewed by: scottl	2009-04-17 13:22:18 +00:00
marcel	cf8f14f029	Add a compat option to the EBR scheme that controls the naming of the partitions (GEOM_PART_EBR_COMPAT). When compatibility is enabled, changes to the partitioning are disallowed. Remove the device name aliasing added previously to provide backward compatibility, but which in practice doesn't give us anything. Enable compatibility on amd64 and i386.	2009-04-15 22:38:22 +00:00
jkim	e8cee11d8d	A simple rewrite of biossmap.c: - Do not iterate int 15h, function e820h twice. Instead, we use STAILQ to store each return buffer and copy all at once. - Export optional extended attributes defined in ACPI 3.0 as separate metadata. Currently, there are only two bits defined in the specification. For example, if the descriptor has extended attributes and it is not enabled, it has to be ignored by OS. We may implement it in the kernel later if it is necessary and proven correct in reality. - Check return buffer size strictly as suggested in ACPI 3.0. Reviewed by: jhb	2009-04-15 17:31:22 +00:00
kib	9c0149c147	The bus_dmamap_load_uio(9) shall use pmap of the thread recorded in the uio_td to extract pages from, instead of unconditionally use kernel pmap. Submitted by: Jason Harmening <jason.harmening gmail com> (amd64 version) PR: amd64/133592 Reviewed by: scottl (original patch), jhb MFC after: 2 weeks	2009-04-13 19:20:32 +00:00
ed	a0f5dad6a9	Simplify in/out functions (for i386 and AMD64). Remove a hack to generate more efficient code for port numbers below 0x100, which has been obsolete for at least ten years, because GCC has an asm constraint to specify that. Submitted by: Christoph Mallon <christoph mallon gmx de>	2009-04-11 14:01:01 +00:00
ed	93a9ed75b4	Also remove the unused __word_swap_int*() macros. Submitted by: Christoph Mallon <christoph.mallon@gmx.de>	2009-04-08 19:10:20 +00:00
ed	9141c649b8	Implement __bswap16() without using inline assembly. Most compilers nowadays (including GCC) are smart enough to know what's going on and generate more efficient code anyway. Submitted by: Christoph Mallon <christoph.mallon@gmx.de>	2009-04-08 19:06:47 +00:00
dchagin	01bf63c9fb	Fix KBI breakage by r190520 which affects older linux.ko binaries: 1) Move the new field (brand_note) to the end of the Brandinfo structure. 2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer is valid. 3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old modules won't have the flag set, so the new field brand_note would be ignored. Suggested by: jhb Reviewed by: jhb Approved by: kib (mentor) MFC after: 6 days	2009-04-05 09:27:19 +00:00
alc	85b0c58343	Retire VM_PROT_READ_IS_EXEC. It was intended to be a micro-optimization, but I see no benefit from it today. VM_PROT_READ_IS_EXEC was only intended for use on processors that do not distinguish between read and execute permission. On an mmap(2) or mprotect(2), it automatically added execute permission if the caller specified permissions included read permission. The hope was that this would reduce the number of vm map entries needed to implement an address space because there would be fewer neighboring vm map entries that differed only in the presence or absence of VM_PROT_EXECUTE. (See vm/vm_mmap.c revision 1.56.) Today, I don't see any real applications that benefit from VM_PROT_READ_IS_EXEC. In any case, vm map entries are now organized as a self-adjusting binary search tree instead of an ordered list. So, the need for coalescing vm map entries is not as great as it once was.	2009-04-04 23:12:14 +00:00
dfr	df0ed71781	Fix the Xen build for i386 PV mode.	2009-04-01 17:06:28 +00:00
kib	a2d099881f	Sync definitions for struct sigcontext for i386 and amd64 architectures to struct mcontext.	2009-04-01 13:44:28 +00:00
kib	aaeb4d6376	Fill the fsbase and gsbase fields of the mcontext structure on i386. In collaboration with: pho Discussed with: peter Reviewed by: jhb	2009-04-01 12:46:05 +00:00
kib	8e7a736a88	Add all segment registers for the amd64 CPU to struct reg and mcontext. To keep these structures ABI-compatible, half the size of r_trapno, r_err, mc_trapno, mc_flags. Add fsbase and gsbase to mcontext on both amd64 and i386. Add flags to amd64 mcontext to indicate that it contains valid segments or bases. In collaboration with: pho Discussed with: peter Reviewed by: jhb	2009-04-01 12:44:17 +00:00
jkim	49cb67d70e	Fix an uninitialized variable from the previous commit.	2009-03-31 21:14:05 +00:00
jkim	01c7b1ae8d	Probe size of installed memory modules from loader and display it as 'real memory' instead of Maxmem if the value is available. Note amd64 displayed physmem as 'usable memory' since machdep.c r1.640 to unconfuse users. Now it is consistent across amd64 and i386 again. While I am here, clean up smbios.c a bit and update copyright date. Reviewed by: jhb	2009-03-31 21:02:55 +00:00
mr	b6886fc07b	Extend comment in copyright notice as requested by author. Submitted by: G.Otsuji	2009-03-29 13:35:20 +00:00
mr	705fd8f6f9	Add support for Phenom (Family 10h) to cpufreq. Its a newer version provided by the author than in the PR. PR: kern/128575 Submitted by: Gen Otsuji annona2 [at] gmail.com	2009-03-28 08:54:47 +00:00
kib	a7383c9a55	Convert gdt_segs and ldt_segs initialization to C99 style. Reviewed by: jhb	2009-03-26 18:07:13 +00:00
jhb	afc2ecb61b	Fix a few nits in the earlier changes to prevent local information leakage in AMD FPUs: - Do not clear the affected state in the case that the FPU registers for the thread that already owns the FPU are changed via fpu_setregs(). The only local information the thread would see is its own state in that case. - Fix a type mismatch for the dummy variable used in a "fld". It accepts a float, not a double. Reviewed by: bde Approved by: so (cperciva) MFC after: 1 month	2009-03-25 22:08:30 +00:00
jhb	6cd843f315	Rename (fpu\|npx)_cleanstate to (fpu\|npx)_initialstate to better reflect their purpose. Inspired by: bde MFC after: 1 month	2009-03-25 14:17:08 +00:00
jhb	ac5c4c1c38	Fall back to using configuration type 1 accesses for PCI config requests if the requested PCI bus falls outside of the bus range given in the ACPI MCFG table. Several BIOSes seem to not include all of the PCI busses in systems in their MCFG tables. It maybe that the BIOS is simply buggy and does support all the busses, but it is more conservative to just fall back to the old method unless it is certain that memory accesses will work.	2009-03-24 18:10:22 +00:00
alc	1623994337	Update stale comments. The alternate address space mapping was eliminated when PAE support was added to i386. The direct mapping exists on amd64.	2009-03-22 18:56:26 +00:00
alc	e68330f894	Eliminate the recomputation of pcb_cr3 from cpu_set_upcall(). The bcopy()ed value from the old thread is the correct value because the new thread and the old thread will share a page table.	2009-03-22 02:33:48 +00:00
thompsa	11f8f68779	Remove the uscanner(4) driver, this follows the removal of the kernel scanner driver in Linux 2.6. uscanner was just a simple wrapper around a fifo and contained no logic, the default interface is now libusb (supported by sane). Reviewed by: HPS	2009-03-19 20:33:26 +00:00
kib	7695aca762	Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer to the full path of the image that is being executed. Increase AT_COUNT. Remove no longer true comment about types used in Linux ELF binaries, listed types contain FreeBSD-specific entries. Reviewed by: kan	2009-03-17 12:50:16 +00:00
jkim	3eda4741da	Initial suspend/resume support for amd64. This code is heavily inspired by Takanori Watanabe's experimental SMP patch for i386 and large portion was shamelessly cut and pasted from Peter Wemm's AP boot code.	2009-03-17 00:48:11 +00:00
rwatson	70b6a8119c	Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced in FreeBSD 5.x to allow network device drivers to run with Giant despite the network stack being Giant-free. This significantly simplifies calls into ioctl() on network interfaces, especially in the multicast code, as well as eliminates deferred invocation of interface if_start routines. Disable the build on device drivers still depending on IFF_NEEDSGIANT as they no longer compile. They will be removed in a few weeks if they haven't been made MPSAFE in that time. Disabled drivers: if_ar if_axe if_aue if_cdce if_cue if_kue if_ray if_rue if_rum if_sr if_udav if_ural if_zyd Drivers that were already disabled because of tty changes: if_ppp if_sl Discussed on: arch@	2009-03-15 14:21:05 +00:00
alc	7f1b26ac0c	MFamd64 r189785 Update the pmap's resident page count when a page table page is freed in pmap_remove_pde() and pmap_remove_pages(). MFC after: 6 weeks	2009-03-14 15:37:19 +00:00
dchagin	2408b715a0	Implement new way of branding ELF binaries by looking to a ".note.ABI-tag" section. The search order of a brand is changed, now first of all the ".note.ABI-tag" is looked through. Move code which fetch osreldate for ELF binary to check_note() handler. PR: 118473 Approved by: kib (mentor)	2009-03-13 16:40:51 +00:00
dfr	598fb4217f	Merge in support for Xen HVM on amd64 architecture.	2009-03-11 15:30:12 +00:00
rwatson	5277812531	Trim comments about the MP-safety of various bits of the amd64/i386 system call entry path and i386 IP checksum generation: we now assume all code is MPSAFE unless explicitly marked otherwise. Remove XXX Giant comments along similar lines: the code by the comments either doesn't need or doesn't want Giant (especially the NMI handler). MFC after: 3 days	2009-03-09 13:11:16 +00:00
sobomax	82279c3ff2	Small comment nit: "run time" -> "run-time". Submitted by: rwatson	2009-03-08 05:01:39 +00:00
thompsa	4ab7fdce63	Reenable ndis in the LINT build now that it has been updated for USB. Thanks to HPS and Weongyo.	2009-03-07 19:54:30 +00:00
jhb	e1b708897e	A better fix for handling different FPU initial control words for different ABIs: - Store the FPU initial control word in the pcb for each thread. - When first using the FPU, load the initial control word after restoring the clean state if it is not the standard control word. - Provide a correct control word for Linux/i386 binaries under FreeBSD/amd64. - Adjust the control word returned for fpugetregs()/npxgetregs() when a thread hasn't used the FPU yet to reflect the real initial control word for the current ABI. - The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control word instead of trashing whatever the current state of the FPU is. Reviewed by: bde	2009-03-05 19:42:11 +00:00
jhb	b4cf24773d	Remove unused arg from npxinit(). Forgot to commit this file in the last i386 FPU change.	2009-03-05 18:43:54 +00:00
jhb	b2f198587d	Some cleanups to the i386 FPU support: - Remove the control word parameter to npxinit(). It was always set to __INITIAL_NPXCW__. - Remove npx_cleanstate_ready as the cleanstate is always initalized when it is used. - Improve the handling of the case when the FPU isn't present. Now the npx0 device no longer succeeds in its probe so all of npx_attach() is skipped. Also, we allow this case with SMP (though that shouldn't actually occur as all i386 systems that support SMP have FPUs) now. SMP was only an issue back when we had an FPU emulator which was not per-CPU. - MFamd64: Clear some of the state in npx_cleanstate rather than leaving it as garbage. - MFamd64: When a user thread first uses the FPU, use npx_cleanstate for the initial FPU state. Reviewed by: bde	2009-03-05 18:32:43 +00:00
jhb	95e639db1b	At least one BIOS bogusly includes duplicate entries for I/O APICs. The bogus entries have a starting IRQ that is invalid (> 255, so won't fit into a PCI intline config register). It had the side effect of breaking MSI by "claiming" several IRQs in the MSI range. Fix this by ignoring such I/O APICs. MFC after: 2 weeks	2009-03-05 16:03:44 +00:00
dchagin	45cda70b8f	Add AT_PLATFORM, AT_HWCAP and AT_CLKTCK auxiliary vector entries which are used by glibc. This silents the message "2.4+ kernel w/o ELF notes?" from some programs at start, among them are top and pkill. Do the assignment of the vector entries in elf_linux_fixup() as it is done in glibc. Fix some minor style issues. Submitted by: Marcin Cieslak <saper at SYSTEM PL> Approved by: kib (mentor) MFC after: 1 week	2009-03-04 12:14:33 +00:00
sobomax	1b8152b75c	Fix typo in comments in r189023.	2009-02-25 22:24:56 +00:00
jkim	c7f643835f	Enable support for PAT_WRITE_PROTECTED and PAT_UNCACHED cache modes unconditionally on amd64. On i386, we assume PAT is usable if the CPU vendor is not Intel or CPU model is newer than Pentium IV. Reviewed by: alc, jhb	2009-02-25 20:26:48 +00:00
sobomax	ba8a8daf8d	Make machdep.hyperthreading_enabled tunable working with the SCHED_ULE. Unlike with SCHED_BSD, however, it can only be set to 0 at boot time, it's not possible to change it at runtime. Reviewed by: jhb MFC after: 1 month	2009-02-25 01:49:01 +00:00
thompsa	d6d9119787	These are no longer needed.	2009-02-24 23:27:59 +00:00
rdivacky	e5bfcba080	Change the functions to ANSI in those cases where it breaks promotion to int rule. See ISO C Standard: SS6.7.5.3:15. Approved by: kib (mentor) Reviewed by: warner Tested by: silence on -current	2009-02-24 18:09:31 +00:00
thompsa	cf8a92987a	Exclude ndis from the LINT build as it currently breaks the build, patches to move to the new usb stack are in progress.	2009-02-24 00:39:48 +00:00
thompsa	6b0018e885	Change over the usb kernel options to the new stack (retaining existing naming). The old usb stack can be compiled in my prefixing the name with 'o'.	2009-02-23 18:34:56 +00:00
jhb	ffd00ec82d	Some whitespace and style fixes. Submitted by: bde (partly)	2009-02-23 15:39:24 +00:00
jhb	b38fc7d456	FreeBSD/i386 doesn't include a software FPU emulator anymore, so adjust an iBCS2 syscall to indicate that there is no FPU support at all rather than emulated support if an FPU is not present.	2009-02-23 15:38:35 +00:00
alc	e58241dc4a	Optimize free_pv_entry(); specifically, avoid repeated TAILQ_REMOVE()s. MFC after: 1 week	2009-02-23 06:00:24 +00:00
jeff	f9e60653c3	- Resolve an issue where we may clear an idt while an interrupt on a different cpu is still assigned to that vector by never clearing idt entries. This was only provided as a debugging feature and the bugs are caught by other means. - Drop the sched lock when rebinding to reassign an interrupt vector to a new cpu so that pending interrupts have a chance to be delivered before removing the old vector. Discussed with: tegge, jhb	2009-02-21 23:15:34 +00:00
rdivacky	ce66ebaba5	Mark these variables as __used too. Fix a style of previous commit. Noticed by: Christoph Mallon Approved by: kib (mentor)	2009-02-18 22:44:55 +00:00
rdivacky	9a1e2de526	Mark these variables as __used as those are used in the asm block. Approved by: kib (mentor)	2009-02-18 18:25:16 +00:00
kib	021a7529ae	Adapt linux emulation to use cv for vfork wait. Submitted by: Takahiro Kurosawa <takahiro.kurosawa gmail com> PR: kern/131506	2009-02-18 16:11:39 +00:00
thompsa	c24b826e84	Add uslcom to the build too. Reminded by: Michael Butler	2009-02-15 23:40:29 +00:00
thompsa	15cccb8286	Switch over GENERIC kernels to USB2 by default. Tested by: make universe	2009-02-15 22:33:44 +00:00
kmacy	efa1327aee	- fix formatting - fix types in ticks_to_system_time	2009-02-15 06:36:02 +00:00
alc	783a479d0f	Remove unnecessary page queues locking around vm_page_wakeup(). Approved by: kmacy	2009-02-14 22:07:22 +00:00
alc	1ecfdf51e9	Remove unnecessary page queues locking around vm_page_busy() and vm_page_wakeup(). (This change is applicable to RELENG_7 but not RELENG_6.) MFC after: 1 week	2009-02-14 18:23:52 +00:00
jhb	26e338d6fc	Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month	2009-02-13 18:18:14 +00:00
marcel	8a9f5896ce	Add option GEOM_PART_EBR by default on amd64 and i386.	2009-02-10 00:08:39 +00:00
cognet	80c343b215	The bounce zone sees its page number increased if multiple dma maps use it in the same dma tag. However, it can happen multiple dma tags share the same bounce zone too, so add a per-bounce zone map counter, and check it instead of the dma tag map counter, to know if we have to alloc more pages. Reported by: miwi Reviewed by: scottl	2009-02-09 18:03:31 +00:00
imp	719ba982f2	When bouncing pages, allow a new option to preserve the intra-page offset. This is needed for the ehci hardware buffer rings that assume this behavior. This is an interim solution, and a more general one is being worked on. This solution doesn't break anything that doesn't ask for it directly. The mbuf and uio variants with this flag likely don't work and haven't been tested. Universe builds with these changes. I don't have a huge-memory machine to test these changes with, but will be happy to work with folks that do and hps if this changes turns out not to be sufficient. Submitted by: alfred@ from Hans Peter Selasky's original	2009-02-08 22:54:58 +00:00
kmacy	90862b22bb	Don't try to directly update page tables	2009-02-08 21:54:51 +00:00
wkoszek	28797e9f9e	si(4) seems to build without a problem. However, since noone noticed lack of this driver, put it in a comment.	2009-02-08 12:40:33 +00:00
wkoszek	d6333de741	Tidy NOTES a bit: - leave pmtimer comment that is common to other architectures. - bring pbio explanation to the block comment relating to other drivers in the same block.	2009-02-07 00:06:13 +00:00
wkoszek	c9a51b4782	Comment about ural(4) isn't approprate here, since the driver is present in global NOTES file. cx(4) driver isn't present in this file, though it could be. However, cx(4) seems to be more or less dead -- it hasn't been linked to the modules build, and after TTY-ng transformations it doesn't compile. Remove it until cx(4) is broken.	2009-02-06 22:22:08 +00:00
wkoszek	a6c32bcda6	Fix AGP debugging code: - correct format strings - fill opt_agp.h if AGP_DEBUG is defined - bring AGP_DEBUG to LINT by mentioning it in NOTES This should hopefully fix a warning that was... Found by: Coverity Prevent(tm) CID: 3676 Tested on: amd64, i386	2009-02-06 20:57:10 +00:00
kmacy	ce680c3b4b	halt APs on reboot	2009-02-05 21:41:27 +00:00
kmacy	49b89940c8	reboot instance on reset	2009-02-05 21:35:40 +00:00
kmacy	d0a09ee159	pass in smp_processor_id to identify the cpu in use	2009-02-05 04:00:55 +00:00
kmacy	e967ffe3bb	adjust the way that idle happens so as to avoid missing timer interrupts	2009-02-05 02:01:18 +00:00
kmacy	a814c36367	make sure that interrupts are disabled when handling page faults et al	2009-02-03 03:43:00 +00:00
bz	6d6ef97dba	Bring over the code from sys/i386/i386/mp_machdep.c, r187880 to make XEN config compile again: - Allocate apic vectors on a per-cpu basis.	2009-01-31 21:40:27 +00:00
obrien	7a153194ec	Change some movl's to mov's. Newer GAS no longer accept 'movl' instructions for moving between a segment register and a 32-bit memory location. Looked at by: jhb	2009-01-31 11:37:21 +00:00

... 3 4 5 6 7 ...

12135 Commits