freebsd-dev

Author	SHA1	Message	Date
Robert Watson	74b5505e5d	Continue to introduce Capsicum capability mode: White list sysarch calls allowed in capability mode; arguably, there should be some link between the capability mode model and the privilege model here. Sysarch is a morass similar to ioctl, in many senses. Submitted by: anderson Discussed with: benl, kris, pjd Sponsored by: Google, Inc. Obtained from: Capsicum Project MFC after: 3 months	2011-03-01 13:35:48 +00:00
John Baldwin	4cef260f42	Fix whitespace nit.	2011-02-22 14:58:14 +00:00
Rebecca Cran	6bccea7c2b	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
Alan Cox	e6ffa21488	Remove pmap fields that are either unused or not fully implemented. Discussed with: kib	2011-02-17 15:36:29 +00:00
Dmitry Chagin	dc4f0a9e11	To avoid excessive code duplication create wrapper for fill regs from stack frame. Change the trap() code to use newly created function instead of explicit regs assignment.	2011-02-16 17:50:21 +00:00
Dmitry Chagin	09d6cb0a23	For realtime signals fill the sigval value.	2011-02-15 21:46:36 +00:00
Dmitry Chagin	fde6316272	Sort include files in the alphabetical order.	2011-02-13 19:07:48 +00:00
Dmitry Chagin	222198ab0b	Move linux_clone(), linux_fork(), linux_vfork() to a MI path.	2011-02-12 18:17:12 +00:00
Dmitry Chagin	c8d6845e9e	In preparation for moving linux_clone() to a MI path introduce linux_set_upcall_kse().	2011-02-12 16:33:00 +00:00
Dmitry Chagin	2c7660ba3e	In preparation for moving linux_clone () to a MI path move the TLS code in a separate function. Use function parameter instead of direct using register.	2011-02-12 15:50:21 +00:00
Dmitry Chagin	9bd9b52478	Regen for r218610.	2011-02-12 15:36:25 +00:00
Dmitry Chagin	f91ea2518b	The fourth argument of linux_clone is a pointer to the TLS. Change clone syscall definition to match actual linux one.	2011-02-12 15:33:25 +00:00
Alan Cox	02e5228ca0	Setting VV_TEXT here is redundant. It is already set by do_execve(). Reviewed by: kib	2011-02-09 18:45:33 +00:00
Konstantin Belousov	b17ef03604	Fix linking of the kernel without device npx. MFC after: 2 weeks	2011-02-05 15:37:10 +00:00
Konstantin Belousov	6f9ec5aab0	Clear the padding when returning context to the usermode, for MI ucontext_t and x86 MD parts. Kernel allocates the structures on the stack, and not clearing reserved fields and paddings causes leakage. Noted and discussed with: bde MFC after: 2 weeks	2011-02-05 15:10:27 +00:00
Matthew D Fleming	08b163fa51	Put the general logic for being a CPU hog into a new function should_yield(). Use this in various places. Encapsulate the common case of check-and-yield into a new function maybe_yield(). Change several checks for a magic number of iterations to use should_yield() instead. MFC after: 1 week	2011-02-02 16:35:10 +00:00
Dmitry Chagin	77192fddeb	Regen for r218101. MFC after: 1 Month.	2011-01-30 20:38:26 +00:00
Dmitry Chagin	8d73c2bfd1	Change linux futex syscall definition to match actual linux one. MFC after: 1 Month.	2011-01-30 20:31:43 +00:00
Dmitry Chagin	9adaae9403	The kern_wait() code already removes the SIGCHLD signal for the waited process. Removing other SIGCHLD signals is not needed and may cause problems. Pointed out by: jilles MFC after: 1 Month.	2011-01-30 18:17:38 +00:00
Dmitry Chagin	adc7ece00a	Implement a variation of the linux_common_wait() which should be used by linuxolator itself. Move linux_wait4() to MD path as it requires native struct rusage translation to struct l_rusage on linux32/amd64. MFC after: 1 Month.	2011-01-28 18:47:07 +00:00
Dmitry Chagin	a5c1afadeb	Add macro to test the sv_flags of any process. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures. MFC after: 1 month	2011-01-26 20:03:58 +00:00
Matthew D Fleming	f89f7ada8d	Set td_kstack_pages for thread0. This was already being done for most architectures, but i386 and amd64 were missing it. Submitted by: Mohd Fahadullah <mfahadullah AT isilon DOT com>	2011-01-26 17:06:13 +00:00
Sergey Kandaurov	4053b05b91	Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize. Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe	2011-01-21 10:26:26 +00:00
Jung-uk Kim	fd240d6d9f	Fix yet another fallout from r208833. VM86 BIOS call may cause page fault when FPU is in use. Reported by: Marc UBM Bocklet (ubm dot freebsd at googlemail dot com) Tested by: b. f. (bf1783 at googlemail dot com) MFC after: 3 days	2011-01-19 17:09:07 +00:00
Konstantin Belousov	55aabb7fd1	For architectures not using direct map , and requiring real KVA page for sf buf allocation, use wakeup() instead of wakeup_one() to notify sf buffer waiters about free buffer. sf_buf_alloc() calls msleep(PCATCH) when SFB_CATCH flag was given, and for simultaneous wakeup and signal delivery, msleep() returns EINTR/ERESTART despite the thread was selected for wakeup_one(). As result, we loose a wakeup, and some other waiter will not be woken up. Reported and tested by: az Reviewed by: alc, jhb MFC after: 1 week	2011-01-18 21:57:02 +00:00
John Baldwin	6bd823f334	- Remove some always-true checks (checking for unsigned < 0). - Only check largs->num against max_ldt_segment on amd64 for I386_SET_LDT when descriptors are provided. Specifically, allow the 'start == 0' and 'num == 0' special case used to free all LDT entries that previously failed with EINVAL. Submitted by: clang via rdivacky (some of 1) Reviewed by: kib	2011-01-18 16:43:01 +00:00
Jung-uk Kim	2fea643112	Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set(). Compile sys/dev/mem/memutil.c for all supported platforms and remove now unnecessary dev_mem_md_init(). Consistently define mem_range_softc from mem.c for all platforms. Add missing #include guards for machine/memdev.h and sys/memrange.h. Clean up some nearby style(9) nits. MFC after: 1 month	2011-01-17 22:58:28 +00:00
Jung-uk Kim	df74996c3d	Avoid preemption while manipulating CRs and MTRRs. Tested by: ariff	2011-01-17 17:30:35 +00:00
John Baldwin	072e9838e2	If an interrupt on an I/O APIC is moved to a different CPU after it has started to execute, it seems that the corresponding ISR bit in the "old" local APIC can be cleared. This causes the local APIC interrupt routine to fail to find an interrupt to service. Rather than panic'ing in this case, simply return from the interrupt without sending an EOI to the local APIC. If there are any other pending interrupts in other ISR registers, the local APIC will assert a new interrupt. Tested by: steve	2011-01-13 17:00:22 +00:00
Konstantin Belousov	50a57dfbec	Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h. Update the outdated comments describing MAXSLP and the process selection algorithm for swap out. Comments wording and reviewed by: alc	2011-01-09 12:50:44 +00:00
Tijl Coosemans	d22e78d6b9	Copy powerpc/include/_inttypes.h to x86 and replace i386/amd64/pc98 headers with stubs. Approved by: kib (mentor)	2011-01-08 18:09:48 +00:00
Tijl Coosemans	a56e818f29	On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and corresponding macros) are different from 32 bit. [1] Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX. Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition for (u)intmax_t. Do this on all architectures for consistency. Suggested by: bde [1] Approved by: kib (mentor)	2011-01-08 12:43:05 +00:00
Tijl Coosemans	d942996baf	On 32 bit architectures define (u)int64_t as (unsigned) long long instead of (unsigned) int __attribute__((__mode__(__DI__))). This aligns better with macros such as (U)INT64_C, (U)INT64_MAX, etc. which assume (u)int64_t has type (unsigned) long long. The mode attribute was used because long long wasn't standardised until C99. Nowadays compilers should support long long and use of the mode attribute is discouraged according to GCC Internals documentation. The type definition has to be marked with __extension__ to support compilation with "-std=c89 -pedantic". Discussed with: bde Approved by: kib (mentor)	2011-01-08 11:47:55 +00:00
Tijl Coosemans	9858863cd4	Fix types of some values in machine/_limits.h. On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int. However, lacking integer suffixes for types smaller than int, their type should correspond to that of an object of type unsigned char (or short) when used in an expression with objects of type int. In that case unsigned char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and USHRT_MAX should also be int. Where MIN/MAX constants implicitly have the correct type the suffix has been removed. While here, correct some comments. Reviewed by: bde Approved by: kib (mentor)	2011-01-08 11:13:34 +00:00
Tijl Coosemans	911127a0d6	Remove unused support for 64 bit long on 32 bit architectures. It was used mainly to discover and fix some 64-bit portability problems before 64-bit arches were widely available. Discussed with: bde Approved by: kib (mentor)	2011-01-07 22:57:31 +00:00
Konstantin Belousov	39198f15ee	Add AT_STACKPROT elf aux vector. Will be used to inform rtld about the initial stack protection set by the kernel image activator.	2011-01-07 14:22:34 +00:00
John Baldwin	c305730dc0	Remove bogus usage of INTR_FAST. "Fast" interrupts are now indicated by registering a filter handler rather than a threaded handler. Also remove a bogus use of INTR_MPSAFE for a filter.	2011-01-06 21:08:06 +00:00
Colin Percival	b5e61aab00	Spell CRITICAL_ASSERT correctly. Submitted by: jhb MFC with: r216944	2011-01-04 16:29:07 +00:00
Colin Percival	aa829cef6a	Add hamfisted locking to the Xen/PV pmap code: Only allow one thread to be in {pmap_pinit, pmap_copy, pmap_release} at a time. This reduces the rate of panics when running 'make index' from ~0.6/hour to ~0.02/hour (p < 10^-30). At a later date this locking will be removed, and for this reason, it is wrapped in #ifdef HAMFISTED_LOCKING; this temporary hack is being put in place with the intention of shipping somewhat-stable Xen bits in FreeBSD 8.2-RELEASE. PR: kern/153672 MFC after: 3 days	2011-01-04 15:55:15 +00:00
Robert Watson	2913e88c91	Make "options XENHVM" compile for i386, not just amd64 -- a largely mechanical change. This opens the door for using PV device drivers under Xen HVM on i386, as well as more general harmonisation of i386 and amd64 Xen support in FreeBSD. Reviewed by: cperciva MFC after: 3 weeks	2011-01-04 14:49:54 +00:00
Colin Percival	d01b2dad73	Adjust the critical section protecting _xen_flush_queue to cover the entire range where the page mapping request queue needs to be atomically examined and modified. Oddly, while this doesn't seem to affect the overall rate of panics (running 'make index' on EC2 t1.micro instances, there are 0.6 +/- 0.1 panics per hour, both before and after this change), it eliminates vm_fault from panic backtraces, leaving only backtraces going through vmspace_fork.	2011-01-04 00:16:38 +00:00
Colin Percival	aaaf607148	Make i386_set_ldt work on i386/XEN, step 5/5. When cleaning up a thread, reset its LDT to the default LDT. Note: Casting the LDT pointer to an int and storing it in pc_currentldt is wildly bogus, but is harmless since pc_currentldt is a write-only variable. MFC after: 3 days	2010-12-31 17:42:25 +00:00
Colin Percival	698cc19d6b	Make i386_set_ldt work on i386/XEN, step 4/5. Use xen_update_descriptor to update the LDT rather than bcopy. Under Xen, pages used for holding LDTs must be read-only, so we can't make the change ourselves. Ths obvious alternative of "remap the page read-write, make the change, then map it read-only again" doesn't work since Xen won't allow an LDT page to be remapped as R/W. An arguably better solution is used by NetBSD: They don't modify LDTs in-place at all, but instead copy the entire LDT, modify the new version, then atomically swap. MFC after: 3 days	2010-12-31 17:41:14 +00:00
Colin Percival	de187b8df2	Make i386_set_ldt work on i386/XEN, step 3/5. Synchronize reality with comment: The user_ldt_alloc function is supposed to return with dt_lock held. Due to broken locking in i386/xen/pmap.c, we drop dt_lock during the call to pmap_map_readonly and then pick it up again; this can be removed once the Xen pmap locking is fixed. MFC after: 3 days	2010-12-31 17:40:30 +00:00
Colin Percival	90b7d33458	Make i386_set_ldt work on i386/XEN, step 2/5. Don't map physical to machine page numbers in pte_load_store, since it uses PT_SET_VA (which takes a physical page number and converts it to a machine page number). MFC after: 3 days	2010-12-31 17:39:58 +00:00
Colin Percival	d262f2dcfc	Make i386_set_ldt work on i386/XEN, step 1/5. Lock the vm page queue mutex around calls to pte_store. As with many other uses of the vm page queue mutex in i386/xen/pmap.c, this is bogus and needs to be replaced at some future date by a spin lock dedicated to protecting the queue of pending xen page mapping hypervisor calls. (But for now, bogus locking is better than a panic.) MFC after: 3 days	2010-12-31 17:39:31 +00:00
Pyun YongHyeon	2608aefc0b	Add driver for DM&P Vortex86 RDC R6040 Fast Ethernet. The controller is commonly found on DM&P Vortex86 x86 SoC. The driver supports all hardware features except flow control. The flow control was intentionally disabled due to silicon bug. DM&P Electronics, Inc. provided all necessary information including sample board to write driver and answered many questions I had. Many thanks for their support of FreeBSD. H/W donated by: DM&P Electronics, Inc.	2010-12-31 00:21:41 +00:00
Warner Losh	714cf6c0df	Revert r216777, per jhb@	2010-12-28 22:45:29 +00:00
Warner Losh	1977f3f168	Comment out npx and isa from NOTES file. We don't need them here since DEFAULTS already pulls them in.	2010-12-28 21:22:08 +00:00
Warner Losh	78b92d19e0	Remove mem, io, isa and npx since they are duplicative of the entries in DEFAULTS. Saves 8 lines of warnings when we build XBOX.	2010-12-28 21:20:58 +00:00
Colin Percival	4a416f8375	Remove a "not strictly correct" (and panic-inducing) workaround for a bug which doesn't seem to exist. PR: kern/141328 MFC after: 3 days	2010-12-28 14:36:32 +00:00
Colin Percival	76c9650713	Build the modules which can be built. The excluded modules fall into two categories: Those which can't build with PAE because they attempt to cast a pointer to a bus_addr_t (mostly scsi drivers); and those which can't be built with XEN because they conflict with something in xen-os.h (e.g., in cxgb there is a conflicting definition of test_and_clear_bit). MFC after: 1 week	2010-12-27 23:59:27 +00:00
Colin Percival	8ea0b3bb2f	Lock the vm page queue mutex in pmap_pte_release around the call to PMAP_SET_VA; this fixes a mutex-not-held panic when a process which called mlock(2) exits, and parallels a change made in pmap_pte 10 months ago (svn r204160). Note: The locking in this code is utterly broken. We should not be using the VM page queue mutex to protect the queue of pending Xen page mapping hypervisor calls. Even if it made sense to do so, this commit and r204160 introduce LORs between the vm page queue mutex and PMAP2mutex. (However, a possible deadlock is better than a guaranteed panic, and this change will hopefully make life easier for whoever fixes the Xen pmap locking in the future.) PR: kern/140313 MFC after: 3 days	2010-12-26 13:05:43 +00:00
Tijl Coosemans	81bd5041a2	Merge amd64 and i386 bus.h and move the resulting header to x86. Replace the original amd64 and i386 headers with stubs. Rename (AMD64\|I386)_BUS_SPACE_* to X86_BUS_SPACE_* everywhere. Reviewed by: imp (previous version), jhb Approved by: kib (mentor)	2010-12-20 16:39:43 +00:00
Alan Cox	9d555e459c	Redo some parts of r216333, specifically, the locking changes to pmap_extract_and_hold(), and undo the rest. In particular, I forgot that PG_PS and PG_PTE_PAT are the same bit.	2010-12-19 07:31:56 +00:00
Konstantin Belousov	7222d2fbee	Inform a compiler which asm statements in the x86 implementation of atomics change eflags. Reviewed by: jhb MFC after: 2 weeks	2010-12-18 16:41:11 +00:00
Konstantin Belousov	a9b31c256e	In pmap_extract(), unlock pmap lock earlier. The calculation does not need the lock when operating on local variables. Reviewed by: alc	2010-12-18 11:31:32 +00:00
Jung-uk Kim	e1c9d39ebe	Stop lying about supporting cpu_est_clockrate() when TSC is invariant. This function always returned the nominal frequency instead of current frequency because we use RDTSC instruction to calculate difference in CPU ticks, which is supposedly constant for the case. Now we support cpu_get_nominal_mhz() for the case, instead. Note it should be just enough for most usage cases because cpu_est_clockrate() is often times abused to find maximum frequency of the processor.	2010-12-14 20:07:51 +00:00
Konstantin Belousov	60c7c84e85	In fpudna()/npxdna(), mark FPU context initialized and optionally mark user FPU context initialized, if current context is user context. It was reversed in r215865, by inadequate change of this code fragment to a call to fpuuserinited()/npxuserinited(). The issue is only relevant for in-kernel users of FPU. Reported by: Jan Henrik Sylvester <me janh de>, Mike Tancsa <mike sentex net> Tested by: Mike Tancsa MFC after: 3 days	2010-12-12 16:16:39 +00:00
Colin Percival	20d1a304b3	Reduce the Xen timecounter from 1GHz to 2^-9 GHz, thereby increasing the timecounter period from 2^32 ns (~4.3s) to 2^41 ns (~36m39s). Some time sharing systems can skip clock interrupts for a few seconds when under load (e.g., if we've recently used more than our fair share of CPU and someone else wants a burst of CPU) and we were losing time in quanta of 2^32 ns due to timecounter wrapping. Increasing the timecounter period up to 2^41 ns is definitely overkill, but we still have microsecond timecounter precision, and anyone using paravirtualized hardware when they need submicrosecond timing is crazy.	2010-12-11 22:33:33 +00:00
Colin Percival	0f30ed5bc6	Make the machdep.independent_wallclock sysctl do what it says on the box.	2010-12-11 20:12:42 +00:00
Alan Cox	d1cf854b5d	When r207410 eliminated the acquisition and release of the page queues lock from pmap_extract_and_hold(), it didn't take into account that pmap_pte_quick() sometimes requires the page queues lock to be held. This change reimplements pmap_extract_and_hold() such that it no longer uses pmap_pte_quick(), and thus never requires the page queues lock. For consistency, adopt the same idiom as used by the new implementation of pmap_extract_and_hold() in pmap_extract() and pmap_mincore(). It also happens to make these functions shorter. Fix a style error in pmap_pte(). Reviewed by: kib@	2010-12-09 20:16:00 +00:00
Colin Percival	91ff9dc058	Replace i386/i386/busdma_machdep.c and amd64/amd64/busdma_machdep.c (which are identical) with a single x86/x86/busdma_machdep.c.	2010-12-09 06:41:50 +00:00
Jung-uk Kim	71e0b05797	Do not subtract 0.5% from estimated frequency if DELAY(9) is driven by TSC. Remove a confusing comment about converting to MHz as we never did.	2010-12-08 23:40:41 +00:00
Colin Percival	af60888734	On amd64, we have (since r1.72, in December 2005) MAX_BPAGES=8192, while on i386 we have MAX_BPAGES=512. Implement this difference via '#ifdef __i386__'. With this commit, the i386 and amd64 busdma_machdep.c files become identical; they will soon be replaced by a single file under sys/x86.	2010-12-08 20:20:10 +00:00
Jung-uk Kim	dd7d207dcb	Merge sys/amd64/amd64/tsc.c and sys/i386/i386/tsc.c and move to sys/x86/x86. Discussed with: avg	2010-12-08 00:09:24 +00:00
Jung-uk Kim	61d14101dd	Use int for 'tsc_present' instead of u_int. It is just a boolean.	2010-12-07 23:19:49 +00:00
Jung-uk Kim	7214d5d75b	Remove stale comments about P-state invariant TSC and fix style(9) nits.	2010-12-07 22:43:25 +00:00
Jung-uk Kim	1bcc28295b	Do not register a event handler for CPU freqency changes when it is found P-state invariant. This is continuation of r216274.	2010-12-07 22:34:51 +00:00
Jung-uk Kim	4a9c4056dc	Now the P-state invariant TSC is probed early enough, do not register event handlers for CPU freqency changes when it is found P-state invariant. Adjust a comment about non-existent tsc_freq_max() while I am here.	2010-12-07 22:23:26 +00:00
Jung-uk Kim	78a661bbaa	Probe P-state invariant TSC from rightful place.	2010-12-07 22:12:02 +00:00
Colin Percival	716d203d6b	MFamd64 r204214: Enforce stronger alignment semantics (require that the end of segments be aligned, not just the start of segments) in order to allow Xen's blkfront driver to operate correctly. PR: kern/152818 MFC after: 3 days	2010-12-05 03:20:55 +00:00
Colin Percival	a39dc31fca	Remove gratuitous i386/amd64 inconsistency in favour of the less verbose version of declaring a variable initialized to zero.	2010-12-04 23:36:40 +00:00
Colin Percival	5c5590862f	Remove unnecessary #includes which seem to have been accidentally added as part of CVS r1.76 (in January 2006).	2010-12-04 23:24:35 +00:00
Jung-uk Kim	2f7ab7e85d	Revert r216161. It is not necessary because we zero-fill BSS anyway. Requested by: jhb	2010-12-03 22:27:51 +00:00
Jung-uk Kim	b14fe63392	Explicitly initialize TSC frequency. To calibrate TSC frequency, we use DELAY(9) and it may use TSC in turn if TSC frequency is non-zero. MFC after: 3 days	2010-12-03 21:54:10 +00:00
Jung-uk Kim	e391a266ed	Do not change CPU ticker frequency if TSC is P-state invariant. Note this change was meant to be committed with r184102 (and its subsequent MFCs) but it fell off somehow. Pointyhat to: jkim MFC after: 3 days	2010-12-03 21:06:30 +00:00
Rebecca Cran	c90f7d9b44	Revert r216134. This checkin broke platforms where bus_space are macros: they need to be a single statement, and do { } while (0) doesn't work in this situation so revert until a solution can be devised.	2010-12-03 07:09:23 +00:00
Rebecca Cran	15b4888a24	Disallow passing in a count of zero bytes to the bus_space(9) functions. Passing a count of zero on i386 and amd64 for [I386\|AMD64]_BUS_SPACE_MEM causes a crash/hang since the 'loop' instruction decrements the counter before checking if it's zero. PR: kern/80980 Discussed with: jhb	2010-12-02 22:19:30 +00:00
Colin Percival	d42446149f	Fix bug introduced by r194784: Under XEN, the page(s) allocated to dpcpu for CPU #0 weren't being properly reserved. Under VM pressure this would cause problems when the dpcpu structures were overwritten by arbitrary data; the most common symptom was a panic when netisr attempted to lock a mutex. For some reason the XEN code keeps track of the start of available memory in the variables 'first', 'physfree', and 'init_first'; as far as I can tell, we always have first == physfree == init_first * PAGE_SIZE. The earlier commit adjusted 'first' (which, on !XEN, is the only variable which tracks this value) but not the other two variables. Exercise for reader: Eliminate two of these three variables.	2010-11-29 06:50:30 +00:00
Konstantin Belousov	c6fb218c3c	Calling fill_fpregs() for curthread is legitimate, and ELF coredump does this. Reported and tested by: pho MFC after: 5 days	2010-11-28 17:56:34 +00:00
Konstantin Belousov	5c6eb03790	Remove npxgetregs(), npxsetregs(), fpugetregs() and fpusetregs() functions, they are unused. Remove 'user' from npxgetuserregs() etc. names. For {npx,fpu}{get,set}regs(), always use pcb->pcb_user_save for FPU context storage. This eliminates the need for ugly copying with overwrite of the newly added and reserved fields in ucontext on i386 to satisfy alignment requirements for fpusave() and fpurstor(). pc98 version was copied from i386. Suggested and reviewed by: bde Tested by: pho (i386 and amd64) MFC after: 1 week	2010-11-26 14:50:42 +00:00
Tijl Coosemans	ce4ec51dbe	Merge amd64/i386 _align.h by aligning on the size of register_t (copied from powerpc). Reviewed by: imp, jhb Approved by: kib (mentor)	2010-11-26 10:59:20 +00:00
Ulrich Spörlein	02604cd4f4	Remove kernel support for BB profiling, now that kernbb(8) is gone, too. PR: bin/83558 Reviewed by: jkim	2010-11-26 08:11:43 +00:00
Colin Percival	5c0ab2fa8b	Revert r215819 and fix the bug properly. In pmap_qremove, paging table updates were being queued by pmap_kremove, but the queue wasn't being flushed; as a result, the updates didn't happen until after the call to pmap_invalidate_range, and old entries could stick around in the TLB. Adding a PT_UPDATES_FLUSH() call immediately before pmap_invalidate_range ensures that after the invalidation the TLB will be repopulated with the correct new entries. Thanks to: kib, avg, alc	2010-11-25 22:06:07 +00:00
Dimitry Andric	079d7e43ca	Use unambiguous inline assembly to load a float variable. GNU as silently converts 'fld' to 'flds', without taking the actual variable type into account (!), but clang's integrated assembler rightfully complains about it. Discussed with: cperciva	2010-11-25 18:14:18 +00:00
John Baldwin	9d76324839	Add device IDs for two more ServerWorks Host-PCI bridges so that we can read their starting PCI bus number for older systems that do not support ACPI (or have a broken _BBN method). PR: kern/148108 MFC after: 1 week	2010-11-25 15:42:33 +00:00
Colin Percival	98702b3990	Work around paging bug. Somehow we seem to be ending up with entries in the TLB which don't correspond to ptes with PG_V set; prior to this commit I'm sometimes getting the wrong data when pages are loaded into the buffer cache (they're being loaded, but the missing TLB invalidation is causing the wrong data to be visible).	2010-11-25 15:41:34 +00:00
Colin Percival	1a3b2b87de	Rename HYPERVISOR_multicall (which performs the multicall hypercall) to _HYPERVISOR_multicall, and create a new HYPERVISOR_multicall function which invokes _HYPERVISOR_multicall and checks that the individual hypercalls all succeeded.	2010-11-25 15:05:21 +00:00
Colin Percival	0bd7a92067	Remove vestigal debugging code which, in fork-heavy workloads, can cause a 30x slowdown.	2010-11-25 04:45:31 +00:00
Jung-uk Kim	d2d0fda841	Remove a stale tunable introduced in r215703.	2010-11-23 17:28:23 +00:00
Andriy Gapon	9b984feb3d	specialreg.h: add definitions for some useful bits found in CPUID.6 EAX and ECX CPUID.6 is defined as Thermal and Power Management Leaf by both Intel and AMD. Reviewed by: jhb MFC after: 7 days	2010-11-23 13:55:30 +00:00
Jung-uk Kim	7dd052c1d9	- Disable caches and flush caches/TLBs when we update PAT as we do for MTRR. Flushing TLBs is required to ensure cache coherency according to the AMD64 architecture manual. Flushing caches is only required when changing from a cacheable memory type (WB, WP, or WT) to an uncacheable type (WC, UC, or UC-). Since this function is only used once per processor during startup, there is no need to take any shortcuts. - Leave PAT indices 0-3 at the default of WB, WT, UC-, and UC. Program 5 as WP (from default WT) and 6 as WC (from default UC-). Leave 4 and 7 at the default of WB and UC. This is to avoid transition from a cacheable memory type to an uncacheable type to minimize possible cache incoherency. Since we perform flushing caches and TLBs now, this change may not be necessary any more but we do not want to take any chances. - Remove Apple hardware specific quirks. With the above changes, it seems this hack is no longer needed. - Improve pmap_cache_bits() with an array to map PAT memory type to index. This array is initialized early from pmap_init_pat(), so that we do not need to handle special cases in the function any more. Now this function is identical on both amd64 and i386. Reviewed by: jhb Tested by: RM (reuf_m at hotmail dot com) Ryszard Czekaj (rychoo at freeshell dot net) army.of.root (army dot of dot root at googlemail dot com) MFC after: 3 days	2010-11-22 19:52:44 +00:00
Colin Percival	c3f128981e	In xen_get_timecount, return the full ns-precision time rather than rounding to 1/HZ precision. I have no idea why the rounding was introduced in the first place, but it makes FreeBSD unhappy.	2010-11-22 09:04:29 +00:00
Colin Percival	61381fcf2d	Unifdef XEN. This file is only compiled with the XEN kernel option set, and the !XEN bits get in the way of understanding the code.	2010-11-20 21:36:12 +00:00
Colin Percival	8cdabbaf32	Add VTOM(va) macro as xpmap_ptom(VTOP(va)) to convert to machine addresses. Clean up the code by converting xpmap_ptom(VTOP(...)) to VTOM(...) and converting xpmap_ptom(VM_PAGE_TO_PHYS(...)) to VM_PAGE_TO_MACH(...). In a few places we take advantage of the fact that xpmap_ptom can commute with setting PG_* flags. This commit should have no net effect save to improve the readability of this code.	2010-11-20 20:04:29 +00:00
Colin Percival	ad520892d7	Make pmap_release consistent with pmap_pinit with respect to unpinning pages. The pinning of NPGPTD pages is #if 0ed out in pmap_pinit (I'm not quite sure why...) and this commit adds a corresponding #if 0 in pmap_release to avoid unpinning those pages. Some versions of Xen seem to silently ignore requests to unpin pages which were never pinned in the first place, but some return an error (causing FreeBSD to panic) prior to this commit.	2010-11-19 15:12:19 +00:00
Andriy Gapon	b43d292565	specialreg.h: add definitions for MPERF/APERF pair of MSRs These MSRs can be used to determine actual (average) performance as compared to a maximum defined performance. Availability of these MSRs is indicated by bit0 in CPUID.6.ECX on both Intel and AMD processors. MFC after: 5 days	2010-11-19 15:07:36 +00:00
Andriy Gapon	7af7c7624a	specialreg.h: add AMD-specific "Hardware Configuration Register" MSR It seems that this MSR has been available in a range of AMD processors families for quite a while now. Note1: not all AMD MSRs that are found in amd64 specialreg.h are also in the i386 version. Note2: perhaps some additional name component is needed to distinguish AMD-specific MSRs. MFC after: 5 days	2010-11-19 15:00:20 +00:00
Andriy Gapon	8fd6d51347	specialreg.h: add definition for AMD Core Performance Boost bit This bit indicates availability of the feature. MFC after: 4 days	2010-11-19 14:46:17 +00:00
Colin Percival	f4c884f95a	Make pmap_release match pmap_pinit by invoking pmap_qremove(pmap->pm_pdpt) to match pmap_pinit's pmap_qenter(pmap->pm_pdpt) call in the case of PAE.	2010-11-18 21:29:43 +00:00
Colin Percival	f86f965ef8	Don't KASSERT in pmap_release that xpmap_ptom(VM_PAGE_TO_PHYS(m)) == (pmap->pm_pdpt[i] & PG_FRAME) for i = NPGPTD, since pmap->pm_pdpt[i] is only initialized for 0 <= i < NPGPTD. This fixes an inevitable panic with XEN && PAE && INVARIANTS when pmap_release is called (e.g., when /sbin/init is launched).	2010-11-18 21:02:40 +00:00
Jung-uk Kim	816b3bd1b0	Restore CR0 after MTRR initialization for correctness sakes. There will be no noticeable change because we enable caches before we enter here for both BSP and AP cases. Remove another pointless optimization for CR4.PGE bit while I am here.	2010-11-16 23:26:02 +00:00
Jung-uk Kim	50083a5624	Invalidate TLBs explicitly. r1.4 of sys/i386/i386/i686_mem.c removed this code but probably it only worked by chance because modifying CR4.PGE bit causes invlidation of entire TLBs. Since these are very rare events, this micro-optimization seems useless. Reviewed by: jhb	2010-11-16 22:44:58 +00:00
Konstantin Belousov	7022f954c3	Do not use __FreeBSD_version prefix for the special osrel version. The ports/Mk/bsd.port.mk uses sys/param.h to fetch osrel, and cannot grok several constants with the prefix. Reported and tested by: swell.k gmail com MFC after: 1 week	2010-11-14 21:59:11 +00:00
Konstantin Belousov	94bce4535d	Use symbolic names instead of hardcoding values for magic p_osrel constants. MFC after: 1 week	2010-11-14 18:24:12 +00:00
Jung-uk Kim	a3c464fb3c	MFamd64: (based on) r209957 Move logic of building ACPI headers for acpi_wakeup.c into better places, remove intermediate makefile and shell script, and reduce diff between i386 and amd64.	2010-11-12 20:55:14 +00:00
Jung-uk Kim	19da400c64	Move identical copies of apm_bios.h to sys/x86/include, replace them with stubs, and adjust PC98 stub accordingly. Reviewed by: imp, nyan	2010-11-11 19:36:21 +00:00
Jung-uk Kim	926ad40ff9	Add compat shim for apm(4) to translate APM BIOS function numbers from i386 to PC98-specific ones. Any binaries using apm ioctl(4) commands but built for i386 should also work on PC98 now. Reviewed by: imp, nyan	2010-11-11 19:20:33 +00:00
Jung-uk Kim	93a8847473	Make APM emulation look more closer to its origin. Use device_get_softc(9) instead of hardcoding acpi(4) unit number as we have device_t for it.	2010-11-10 18:50:12 +00:00
Jung-uk Kim	7c2bf852d7	Refactor acpi_machdep.c for amd64 and i386, move APM emulation into a new file acpi_apm.c, and place it on sys/x86/acpica.	2010-11-10 01:29:56 +00:00
John Baldwin	961135ead8	- Remove <machine/mutex.h>. Most of the headers were empty, and the contents of the ones that were not empty were stale and unused. - Now that <machine/mutex.h> no longer exists, there is no need to allow it to override various helper macros in <sys/mutex.h>. - Rename various helper macros for low-level operations on mutexes to live in the _mtx_* or __mtx_* namespaces. While here, change the names to more closely match the real API functions they are backing. - Drop support for including <sys/mutex.h> in assembly source files. Suggested by: bde (1, 2)	2010-11-09 20:46:41 +00:00
Attilio Rao	fcb250f392	Move the mptable.h under x86/include/. Sponsored by: Sandvine Incorporated MFC after: 14 days	2010-11-09 20:28:09 +00:00
Jung-uk Kim	cedd86cafa	Now OsdEnvironment.c is identical on amd64 and i386. Move it to a new home.	2010-11-09 00:27:18 +00:00
Jung-uk Kim	2473325fa8	Reduce diff between platforms and fix style(9) bugs.	2010-11-09 00:14:39 +00:00
John Baldwin	13e25cb7a5	Move the MADT parser for amd64 and i386 to sys/x86/acpica now that it is identical on both platforms.	2010-11-08 20:57:02 +00:00
John Baldwin	c5b0b5fc6b	Sync the APIC startup sequence with amd64: - Register APIC enumerators at SI_SUB_TUNABLES - 1 instead of SI_SUB_CPU - 1. - Probe CPUs at SI_SUB_TUNABLES - 1. This allows i386 to set a truly accurate mp_maxid value rather than always setting it to MAXCPU - 1.	2010-11-08 20:35:09 +00:00
John Baldwin	6c3c2b8b87	Remove stub symbols for APIC-related functions when 'device apic' is not included in a kernel config. These stubs had existed previously so that acpi.ko could always include the MADT parsing code and still link with a kernel that did not include 'device apic'.	2010-11-08 20:32:35 +00:00
John Baldwin	f67b4bd367	A few small style and whitespace fixes.	2010-11-08 20:05:22 +00:00
Alan Cox	228a253795	Eliminate a possible race between pmap_pinit() and pmap_kenter_pde() on superpage promotion or demotion. Micro-optimize pmap_kenter_pde(). Reviewed by: kib, jhb (an earlier version) MFC after: 1 week	2010-11-07 18:42:37 +00:00
John Baldwin	0108cce0a4	Adjust the order of operations in spinlock_enter() and spinlock_exit() to work properly with single-stepping in a kernel debugger. Specifically, these routines have always disabled interrupts before increasing the nesting count and restored the prior state of interrupts after decreasing the nesting count to avoid problems with a nested interrupt not disabling interrupts when acquiring a spin lock. However, trap interrupts for single-stepping can still occur even when interrupts are disabled. Now the saved state of interrupts is not saved in the thread until after interrupts have been disabled and the nesting count has been increased. Similarly, the saved state from the thread cannot be read once the nesting count has been decreased to zero. To fix this, use temporary variables to store interrupt state and shuffle it between the thread's MD area and the appropriate registers. In cooperation with: bde MFC after: 1 month	2010-11-05 13:42:58 +00:00
Andriy Gapon	3b50d59fef	x86 topo_probe: do not probe smp topology if only one cpu is visible This could lead to a division by zero if hardware is multi-core and/or multi-threaded, but for some (quite unusual) reason FreeBSD sees only one logical processor. This could happen, for example, if neither MADT nor MP Table are presented by BIOS. Also: - assert in topo_probe_0x4 that BSP is accounted for - neither cpu_cores nor cpu_logical should be zero after successful probing, so either being zero is an indication of failed probing Reported by: vwe, Dan Allen <danallen46@airwired.net> Tested by: Dan Allen <danallen46@airwired.net> MFC after: 3 days	2010-11-04 08:51:45 +00:00
John Baldwin	239da85bbc	Further tweaks to the ram_attach() routine: - Use > 2^32 - 1 instead of >= when checking for memory regions above 4G. - Skip SMAP entries > 4G on i386 rather than breaking out of the loop since SMAP entries are not guaranteed to be in order. - Remove 'i' and loop over 'rid' directly in the dump_avail[] case. - Only check for 4G regions in the dump_avail[] case on i386 if PAE is enabled since vm_paddr_t is 32-bit in the !PAE case. Submitted by: alc	2010-11-02 17:56:16 +00:00
John Baldwin	32c3d3b6e6	Move <machine/apicreg.h> to <x86/apicreg.h>.	2010-11-01 18:18:46 +00:00
John Baldwin	5ecdb3c46b	Move the <machine/mca.h> header to <x86/mca.h>.	2010-11-01 17:40:35 +00:00
Attilio Rao	ba2a27351b	Merge nexus.c from amd64 and i386 to x86 subtree. Sponsored by: Sandvine Incorporated Tested by: gianni	2010-10-28 16:31:39 +00:00
John Baldwin	89d84a4055	Use 'PCPU_GET(apic_id)' to determine the BSP's APIC ID on a UP machine when routing interrupts instead of cpu_apic_ids[0] since cpu_apic_ids[] is only populated for multiple-CPU machines. This also matches what the code does when SMP is not enabled. PR: bin/151616 Tested by: "Damian S. Kolodziejczyk" damkol \| gmail Submitted by: avg MFC after: 1 week	2010-10-28 13:44:19 +00:00
Attilio Rao	a3da97926d	Merge the mptable support from MD bits to x86 subtree. Sponsored by: Sandvine Incorporated Discussed with: jhb	2010-10-28 07:58:06 +00:00
Attilio Rao	256439c972	Merge dump_machdep.c i386/amd64 under the x86 subtree. Sponsored by: Sandvine Incorporated Tested by: gianni	2010-10-26 12:46:26 +00:00
John Baldwin	0689bdcc19	Use 'saveintr' instead of 'savecrit' or 'eflags' to hold the state returned by intr_disable(). Requested by: bde	2010-10-25 15:31:13 +00:00
John Baldwin	c6390f7ac5	Use intr_disable() and intr_restore() instead of frobbing the flags register directly to disable interrupts. Reviewed by: bde (earlier version) MFC after: 2 weeks	2010-10-25 15:28:03 +00:00
Justin T. Gibbs	ff662b5c98	Improve the Xen para-virtualized device infrastructure of FreeBSD: o Add support for backend devices (e.g. blkback) o Implement extensions to the Xen para-virtualized block API to allow for larger and more outstanding I/Os. o Import a completely rewritten block back driver with support for fronting I/O to both raw devices and files. o General cleanup and documentation of the XenBus and XenStore support code. o Robustness and performance updates for the block front driver. o Fixes to the netfront driver. Sponsored by: Spectra Logic Corporation sys/xen/xenbus/init.txt: Deleted: This file explains the Linux method for XenBus device enumeration and thus does not apply to FreeBSD's NewBus approach. sys/xen/xenbus/xenbus_probe_backend.c: Deleted: Linux version of backend XenBus service routines. It was never ported to FreeBSD. See xenbusb.c, xenbusb_if.m, xenbusb_front.c xenbusb_back.c for details of FreeBSD's XenBus support. sys/xen/xenbus/xenbusvar.h: sys/xen/xenbus/xenbus_xs.c: sys/xen/xenbus/xenbus_comms.c: sys/xen/xenbus/xenbus_comms.h: sys/xen/xenstore/xenstorevar.h: sys/xen/xenstore/xenstore.c: Split XenStore into its own tree. XenBus is a software layer built on top of XenStore. The old arrangement and the naming of some structures and functions blurred these lines making it difficult to discern what services are provided by which layer and at what times these services are available (e.g. during system startup and shutdown). sys/xen/xenbus/xenbus_client.c: sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbus_probe.c: sys/xen/xenbus/xenbusb.c: sys/xen/xenbus/xenbusb.h: Split up XenBus code into methods available for use by client drivers (xenbus.c) and code used by the XenBus "bus code" to enumerate, attach, detach, and service bus drivers. sys/xen/reboot.c: sys/dev/xen/control/control.c: Add a XenBus front driver for handling shutdown, reboot, suspend, and resume events published in the XenStore. Move all PV suspend/reboot support from reboot.c into this driver. sys/xen/blkif.h: New file from Xen vendor with macros and structures used by a block back driver to service requests from a VM running a different ABI (e.g. amd64 back with i386 front). sys/conf/files: Adjust kernel build spec for new XenBus/XenStore layout and added Xen functionality. sys/dev/xen/balloon/balloon.c: sys/dev/xen/netfront/netfront.c: sys/dev/xen/blkfront/blkfront.c: sys/xen/xenbus/... sys/xen/xenstore/... o Rename XenStore APIs and structures from xenbus_* to xs_. o Adjust to use of M_XENBUS and M_XENSTORE malloc types for allocation of objects returned by these APIs. o Adjust for changes in the bus interface for Xen drivers. sys/xen/xenbus/... sys/xen/xenstore/... Add Doxygen comments for these interfaces and the code that implements them. sys/dev/xen/blkback/blkback.c: o Rewrite the Block Back driver to attach properly via newbus, operate correctly in both PV and HVM mode regardless of domain (e.g. can be in a DOM other than 0), and to deal with the latest metadata available in XenStore for block devices. o Allow users to specify a file as a backend to blkback, in addition to character devices. Use the namei lookup of the backend path to automatically configure, based on file type, the appropriate backend method. The current implementation is limited to a single outstanding I/O at a time to file backed storage. sys/dev/xen/blkback/blkback.c: sys/xen/interface/io/blkif.h: sys/xen/blkif.h: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkfront/block.h: Extend the Xen blkif API: Negotiable request size and number of requests. This change extends the information recorded in the XenStore allowing block front/back devices to negotiate for optimal I/O parameters. This has been achieved without sacrificing backward compatibility with drivers that are unaware of these protocol enhancements. The extensions center around the connection protocol which now includes these additions: o The back-end device publishes its maximum supported values for, request I/O size, the number of page segments that can be associated with a request, the maximum number of requests that can be concurrently active, and the maximum number of pages that can be in the shared request ring. These values are published before the back-end enters the XenbusStateInitWait state. o The front-end waits for the back-end to enter either the InitWait or Initialize state. At this point, the front end limits it's own capabilities to the lesser of the values it finds published by the backend, it's own maximums, or, should any back-end data be missing in the store, the values supported by the original protocol. It then initializes it's internal data structures including allocation of the shared ring, publishes its maximum capabilities to the XenStore and transitions to the Initialized state. o The back-end waits for the front-end to enter the Initalized state. At this point, the back end limits it's own capabilities to the lesser of the values it finds published by the frontend, it's own maximums, or, should any front-end data be missing in the store, the values supported by the original protocol. It then initializes it's internal data structures, attaches to the shared ring and transitions to the Connected state. o The front-end waits for the back-end to enter the Connnected state, transitions itself to the connected state, and can commence I/O. Although an updated front-end driver must be aware of the back-end's InitWait state, the back-end has been coded such that it can tolerate a front-end that skips this step and transitions directly to the Initialized state without waiting for the back-end. sys/xen/interface/io/blkif.h: o Increase BLKIF_MAX_SEGMENTS_PER_REQUEST to 255. This is the maximum number possible without changing the blkif request header structure (nr_segs is a uint8_t). o Add two new constants: BLKIF_MAX_SEGMENTS_PER_HEADER_BLOCK, and BLKIF_MAX_SEGMENTS_PER_SEGMENT_BLOCK. These respectively indicate the number of segments that can fit in the first ring-buffer entry of a request, and for each subsequent (sg element only) ring-buffer entry associated with the "header" ring-buffer entry of the request. o Add the blkif_request_segment_t typedef for segment elements. o Add the BLKRING_GET_SG_REQUEST() macro which wraps the RING_GET_REQUEST() macro and returns a properly cast pointer to an array of blkif_request_segment_ts. o Add the BLKIF_SEGS_TO_BLOCKS() macro which calculates the number of ring entries that will be consumed by a blkif request with the given number of segments. sys/xen/blkif.h: o Update for changes in interface/io/blkif.h macros. o Update the BLKIF_MAX_RING_REQUESTS() macro to take the ring size as an argument to allow this calculation on multi-page rings. o Add a companion macro to BLKIF_MAX_RING_REQUESTS(), BLKIF_RING_PAGES(). This macro determines the number of ring pages required in order to support a ring with the supplied number of request blocks. sys/dev/xen/blkback/blkback.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkfront/block.h: o Negotiate with the other-end with the following limits: Reqeust Size: MAXPHYS Max Segments: (MAXPHYS/PAGE_SIZE) + 1 Max Requests: 256 Max Ring Pages: Sufficient to support Max Requests with Max Segments. o Dynamically allocate request pools and segemnts-per-request. o Update ring allocation/attachment code to support a multi-page shared ring. o Update routines that access the shared ring to handle multi-block requests. sys/dev/xen/blkfront/blkfront.c: o Track blkfront allocations in a blkfront driver specific malloc pool. o Strip out XenStore transaction retry logic in the connection code. Transactions only need to be used when the update to multiple XenStore nodes must be atomic. That is not the case here. o Fully disable blkif_resume() until it can be fixed properly (it didn't work before this change). o Destroy bus-dma objects during device instance tear-down. o Properly handle backend devices with powef-of-2 sector sizes larger than 512b. sys/dev/xen/blkback/blkback.c: Advertise support for and implement the BLKIF_OP_WRITE_BARRIER and BLKIF_OP_FLUSH_DISKCACHE blkif opcodes using BIO_FLUSH and the BIO_ORDERED attribute of bios. sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkfront/block.h: Fix various bugs in blkfront. o gnttab_alloc_grant_references() returns 0 for success and non-zero for failure. The check for < 0 is a leftover Linuxism. o When we negotiate with blkback and have to reduce some of our capabilities, print out the original and reduced capability before changing the local capability. So the user now gets the correct information. o Fix blkif_restart_queue_callback() formatting. Make sure we hold the mutex in that function before calling xb_startio(). o Fix a couple of KASSERT()s. o Fix a check in the xb_remove_ macro to be a little more specific. sys/xen/gnttab.h: sys/xen/gnttab.c: Define GNTTAB_LIST_END publicly as GRANT_REF_INVALID. sys/dev/xen/netfront/netfront.c: Use GRANT_REF_INVALID instead of driver private definitions of the same constant. sys/xen/gnttab.h: sys/xen/gnttab.c: Add the gnttab_end_foreign_access_references() API. This API allows a client to batch the release of an array of grant references, instead of coding a private for loop. The implementation takes advantage of this batching to reduce lock overhead to one acquisition and release per-batch instead of per-freed grant reference. While here, reduce the duration the gnttab_list_lock is held during gnttab_free_grant_references() operations. The search to find the tail of the incoming free list does not rely on global state and so can be performed without holding the lock. sys/dev/xen/xenpci/evtchn.c: sys/dev/xen/evtchn/evtchn.c: sys/xen/xen_intr.h: o Implement the bind_interdomain_evtchn_to_irqhandler API for HVM mode. This allows an HVM domain to serve back end devices to other domains. This API is already implemented for PV mode. o Synchronize the API between HVM and PV. sys/dev/xen/xenpci/xenpci.c: o Scan the full region of CPUID space in which the Xen VMM interface may be implemented. On systems using SuSE as a Dom0 where the Viridian API is also exported, the VMM interface is above the region we used to search. o Pass through bus_alloc_resource() calls so that XenBus drivers attaching on an HVM system can allocate unused physical address space from the nexus. The block back driver makes use of this facility. sys/i386/xen/xen_machdep.c: Use the correct type for accessing the statically mapped xenstore metadata. sys/xen/interface/hvm/params.h: sys/xen/xenstore/xenstore.c: Move hvm_get_parameter() to the correct global header file instead of as a private method to the XenStore. sys/xen/interface/io/protocols.h: Sync with vendor. sys/xeninterface/io/ring.h: Add macro for calculating the number of ring pages needed for an N deep ring. To avoid duplication within the macros, create and use the new __RING_HEADER_SIZE() macro. This macro calculates the size of the ring book keeping struct (producer/consumer indexes, etc.) that resides at the head of the ring. Add the __RING_PAGES() macro which calculates the number of shared ring pages required to support a ring with the given number of requests. These APIs are used to support the multi-page ring version of the Xen block API. sys/xeninterface/io/xenbus.h: Add Comments. sys/xen/xenbus/... o Refactor the FreeBSD XenBus support code to allow for both front and backend device attachments. o Make use of new config_intr_hook capabilities to allow front and back devices to be probed/attached in parallel. o Fix bugs in probe/attach state machine that could cause the system to hang when confronted with a failure either in the local domain or in a remote domain to which one of our driver instances is attaching. o Publish all required state to the XenStore on device detach and failure. The majority of the missing functionality was for serving as a back end since the typical "hot-plug" scripts in Dom0 don't handle the case of cleaning up for a "service domain" that is not itself. o Add dynamic sysctl nodes exposing the generic ivars of XenBus devices. o Add doxygen style comments to the majority of the code. o Cleanup types, formatting, etc. sys/xen/xenbus/xenbusb.c: Common code used by both front and back XenBus busses. sys/xen/xenbus/xenbusb_if.m: Method definitions for a XenBus bus. sys/xen/xenbus/xenbusb_front.c: sys/xen/xenbus/xenbusb_back.c: XenBus bus specialization for front and back devices. MFC after: 1 month	2010-10-19 20:53:30 +00:00
Jung-uk Kim	56b11f84a7	Remove trailing ", " from `sysctl machdep.idle_available' output.	2010-10-12 20:53:12 +00:00
Konstantin Belousov	78ae4338a2	Add macro DECLARE_MODULE_TIED to denote a module as requiring the kernel of exactly the same __FreeBSD_version as the headers module was compiled against. Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules use kernel interfaces that the Release Engineering Team feel are not stable enough to guarantee they will not change during the life cycle of a STABLE branch. In particular, the layout of struct sysentvec is declared to be not part of the STABLE KBI. Discussed with: bz, rwatson Approved by: re (bz, kensmith) MFC after: 2 weeks	2010-10-12 09:18:17 +00:00
Alan Cox	fb4c8540b2	Initialize KPTmap in locore so that vm86.c can call vtophys() (or really pmap_kextract()) before pmap_bootstrap() is called. Document the set of pmap functions that may be called before pmap_bootstrap() is called. Tested by: bde@ Reviewed by: kib@ Discussed with: jhb@ MFC after: 6 weeks	2010-10-05 17:06:51 +00:00
Konstantin Belousov	3f506a78ce	Display PCID capability of CPU and add CPUID define for it. MFC after: 1 week	2010-10-05 15:31:56 +00:00
Andriy Gapon	d443a96ffb	i386 and amd64 mp_machdep: improve topology detection for Intel CPUs This patch is significantly based on previous work by jkim. List of changes: - added comments that describe topology uniformity assumption - added reference to Intel Processor Topology Enumeration article - documented a few global variables that describe topology - retired weirdly set and used logical_cpus variable - changed fallback code for mp_ncpus > 0 case, so that CPUs are treated as being different packages rather than cores in a single package - moved AMD-specific code to topo_probe_amd [jkim] - in topo_probe_0x4() follow Intel-prescribed procedure of deriving SMT and core masks and match APIC IDs against those masks [started by jkim] - in topo_probe_0x4() drop code for double-checking topology parameters by looking at L1 cache properties [jkim] - in topo_probe_0xb() add fallback path to topo_probe_0x4() as prescribed by Intel [jkim] Still to do: - prepare for upcoming AMD CPUs by using new mechanism of uniform topology description [pointed by jkim] - probe cache topology in addition to CPU topology and probably use that for scheduler affinity topology; e.g. Core2 Duo and Athlon II X2 have the same CPU topology, but Athlon cores do not share L2 cache while Core2's do (no L3 cache in both cases) - think of supporting non-uniform topologies if they are ever implemented for platforms in question - think how to better described old HTT vs new HTT distinction, HTT vs SMT can be confusing as SMT is a generic term - more robust code for marking CPUs as "logical" and/or "hyperthreaded", use HTT mask instead of modulo operation - correct support for halting logical and/or hyperthreaded CPUs, let scheduler know that it shouldn't schedule any threads on those CPUs PR: kern/145385 (related) In collaboration with: jkim Tested by: Sergey Kandaurov <pluknet@gmail.com>, Jeremy Chadwick <freebsd@jdc.parodius.com>, Chip Camden <sterling@camdensoftware.com>, Steve Wills <steve@mouf.net>, Olivier Smedts <olivier@gid0.org>, Florian Smeets <flo@smeets.im> MFC after: 1 month	2010-10-01 10:32:54 +00:00
Neel Natu	5c1a8dc028	Fix bogus error message from bus_dmamem_alloc() about incorrect alignment. The check for alignment should be made against the physical address and not the virtual address that maps it. Sponsored by: NetApp Submitted by: Will McGovern (will at netapp dot com) Reviewed by: mjacob, jhb	2010-09-29 21:53:11 +00:00
David Xu	8d2a935e45	Remove a redundant instruction for casuword.	2010-09-29 02:36:58 +00:00
John Baldwin	1691b62202	Rewrite the i386 memory probe: - Check for SMAP data from the loader first. If it exists, don't bother doing any VM86 calls at all. This will be more friendly for non-BIOS boot environments such as EFI, etc. - Move the base memory setup into a new basemem_setup() routine instead of duplicating it. - Simplify the XEN case by removing all of the VM86/SMAP parsing code rather than just jumping over it. - Adjust some comments to better explain the code flow. MFC after: 2 weeks	2010-09-27 19:36:15 +00:00
David Xu	295fbd498e	Now userland POSIX semaphore is based on umtx. The kernel module is only used to support binary compatible, if want to run old binary, you need to kldload the module.	2010-09-24 09:04:16 +00:00
Norikatsu Shigemura	cbf4dac64f	Add support 'device tpm' for amd64. Add tpm(4)'s default setting to /boot/defaults/loader.conf. Add 'device tpm' to NOTES for amd64 and i386. Discussed with: takawata Approved by: imp (mentor)	2010-09-19 14:40:37 +00:00
Alexander Motin	a157e42516	Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generating at full rate of hz + stathz to fullfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle. There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends of chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how much times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they busy or not. By default this options is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generating. As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads. Tested by: many (on i386, amd64, sparc64 and powerc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.	2010-09-13 07:25:35 +00:00
Andriy Gapon	3d844eddb7	bus_add_child: change type of order parameter to u_int This reflects actual type used to store and compare child device orders. Change is mostly done via a Coccinelle (soon to be devel/coccinelle) semantic patch. Verified by LINT+modules kernel builds. Followup to: r212213 MFC after: 10 days	2010-09-10 11:19:03 +00:00
Roman Divacky	27d4fea6c5	Change the parameter passed to the inline assembly to u_short as we are dealing with 16bit segment registers. Change mov to movw. Approved by: rpaulo (mentor) Reviewed by: kib, rink	2010-09-03 14:25:17 +00:00
Rui Paulo	cba3269417	Register an interrupt vector for DTrace return probes. There is some code missing in lapic to make sure that we don't overwrite this entry, but this will be done on a sequent commit. Sponsored by: The FreeBSD Foundation	2010-08-28 08:03:29 +00:00
Rui Paulo	6bf9fb35e5	Sync DTrace bits with amd64 and fix the build. Sponsored by: The FreeBSD Foundation	2010-08-26 11:22:12 +00:00
Jung-uk Kim	db1cea00ad	Increase maximum number of page table entries per VM86 context from 8 to 24 pages, yet again. Now we can allocate a whole segment, which is required for shadowing option ROM images, for example.	2010-08-25 21:13:23 +00:00
Rui Paulo	0bc1991a4a	Call the necessary DTrace function pointers when we have different kinds of traps. Sponsored by: The FreeBSD Foundation	2010-08-25 09:10:32 +00:00
Rui Paulo	8a8d8fa3d1	Add two DTrace trap type values. Used by fasttrap. Sponsored by: The FreeBSD Foundation	2010-08-24 13:13:24 +00:00

1 2 3 4 5 ...

12117 Commits