freebsd-skq

Author	SHA1	Message	Date
marcel	1c7681de49	Drop the high FP state of an exiting thread in cpu_thread_exit() and not in cpu_exit(). The latter is called after td_md.md_highfp_mtx has been destroyed, which results in a race condition when another thread wants to use the high FP registers on the CPU that still has the high FP registers in question.	2009-06-20 05:36:53 +00:00
jkim	6d358bddff	Import ACPICA 20090521.	2009-06-05 18:44:36 +00:00
rwatson	14f4a9dd42	Remove MAC kernel config files and add "options MAC" to GENERIC, with the goal of shipping 8.0 with MAC support in the default kernel. No policies will be compiled in or enabled by default, but it will now be possible to load them at boot or runtime without a kernel recompile. While the framework is not believed to impose measurable overhead when no policies are loaded (a result of optimization over the past few months in HEAD), we'll continue to benchmark and optimize as the release approaches. Please keep an eye out for performance or functionality regressions that could be a result of this change. Approved by: re (kensmith) Obtained from: TrustedBSD Project	2009-06-02 18:31:08 +00:00
jamie	572db1408a	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
ed	8d73adc757	Last minute TTY API change: remove mutex argument from tty_alloc(). I don't want people to override the mutex when allocating a TTY. It has to be there, to keep drivers like syscons happy. So I'm creating a tty_alloc_mutex() which can be used in those cases. tty_alloc_mutex() should eventually be removed. The advantage of this approach, is that we can just remove a function, without breaking the regular API in the future.	2009-05-29 06:41:23 +00:00
rink	1843497c73	ia64: Move MCA information retrieval to a per-CPU kthread Once AP's are launched, their MCA state information is stored and later obtainable using a sysctl. Since the size of the MCA state information is unknown, it will be malloc'ed as needed. However, when 'ia64_ap_startup' runs, it's not yet safe to call malloc and this may cause 'panic: blockable sleep lock (sleep mutex) 8192 @ /usr/src/sys/vm/uma_core.c'. This commit avoids this issue by scheduling a separate kthread to obtain this information, which immediately terminates afterwards.	2009-05-27 18:12:27 +00:00
marcel	29c02c1386	Rename ia64_invalidate_icache() to ia64_sync_icache(). We're not invalidating anything.	2009-05-18 18:44:54 +00:00
marcel	8b09116a5a	Add cpu_flush_dcache() for use after non-DMA based I/O so that a possible future I-cache coherency operation can succeed. On ARM for example the L1 cache can be (is) virtually mapped, which means that any I/O that uses temporary mappings will not see the I-cache made coherent. On ia64 a similar behaviour has been observed. By flushing the D-cache, execution of binaries backed by md(4) and/or NFS work reliably. For Book-E (powerpc), execution over NFS exhibits SIGILL once in a while as well, though cpu_flush_dcache() hasn't been implemented yet. Doing an explicit D-cache flush as part of the non-DMA based I/O read operation eliminates the need to do it as part of the I-cache coherency operation itself and as such avoids pessimizing the DMA-based I/O read operations for which D-cache are already flushed/invalidated. It also allows future optimizations whereby the bcopy() followed by the D-cache flush can be integrated in a single operation, which could be implemented using on-chips DMA engines, by-passing the D-cache altogether.	2009-05-18 18:37:18 +00:00
kuriyama	9913dad783	- Use "device\t" and "options \t" for consistency.	2009-05-10 00:00:25 +00:00
marcel	8f8a26f716	Remove isa_irq_pending(). It's not used.	2009-04-24 03:43:20 +00:00
rwatson	21a8b350dc	Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing a fair number of static data structures, making this an unlikely option to try to change without also changing source code. [1] Change default cache line size on ia64, sparc64, and sun4v to 128 bytes, as this was what rtld-elf was already using on those platforms. [2] Suggested by: bde [1], jhb [2] MFC after: 2 weeks	2009-04-20 12:59:23 +00:00
rwatson	ab17fac487	Add description and cautionary note regarding CACHE_LINE_SIZE. MFC after: 2 weeks Suggested by: alc	2009-04-19 21:26:36 +00:00
rwatson	8df790f38f	For each architecture, define CACHE_LINE_SHIFT and a derived CACHE_LINE_SIZE constant. These constants are intended to over-estimate the cache line size, and be used at compile-time when a run-time tuning alternative isn't appropriate or available. Defaults for all architectures are 64 bytes, except powerpc where it is 128 bytes (used on G5 systems). MFC after: 2 weeks Discussed on: arch@	2009-04-19 20:19:13 +00:00
jhb	360bcf2161	Restore bus DMA bounce pages to an offset of 0 when they are released by a tag that has BUS_DMA_KEEP_PG_OFFSET set. Otherwise the page could be reused with a non-zero offset by a tag that doesn't have BUS_DMA_KEEP_PG_OFFSET leading to data corruption. Sleuthing by: avg Reviewed by: scottl	2009-04-17 13:22:18 +00:00
kib	9c0149c147	The bus_dmamap_load_uio(9) shall use pmap of the thread recorded in the uio_td to extract pages from, instead of unconditionally use kernel pmap. Submitted by: Jason Harmening <jason.harmening gmail com> (amd64 version) PR: amd64/133592 Reviewed by: scottl (original patch), jhb MFC after: 2 weeks	2009-04-13 19:20:32 +00:00
dchagin	01bf63c9fb	Fix KBI breakage by r190520 which affects older linux.ko binaries: 1) Move the new field (brand_note) to the end of the Brandinfo structure. 2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer is valid. 3) Use the brand_note field only if the flag BI_BRAND_NOTE is set; old modules won't have the flag set, so the new brand_note field is ignored for them. Suggested by: jhb Reviewed by: jhb Approved by: kib (mentor) MFC after: 6 days	2009-04-05 09:27:19 +00:00
kib	1fca0aa454	Add trivial implementation for the freebsd32_sysarch on ia64. Fix compat32 and LINT build on ia64. Discussed with: jhb	2009-04-01 19:23:07 +00:00
kib	7695aca762	Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer to the full path of the image that is being executed. Increase AT_COUNT. Remove no longer true comment about types used in Linux ELF binaries, listed types contain FreeBSD-specific entries. Reviewed by: kan	2009-03-17 12:50:16 +00:00
dchagin	2408b715a0	Implement new way of branding ELF binaries by looking at a ".note.ABI-tag" section. The search order of a brand is changed; the ".note.ABI-tag" section is now searched first. Move the code which fetches osreldate for an ELF binary to the check_note() handler. PR: 118473 Approved by: kib (mentor)	2009-03-13 16:40:51 +00:00
thompsa	6b0018e885	Change over the usb kernel options to the new stack (retaining existing naming). The old usb stack can be compiled in by prefixing the name with 'o'.	2009-02-23 18:34:56 +00:00
thompsa	c24b826e84	Add uslcom to the build too. Reminded by: Michael Butler	2009-02-15 23:40:29 +00:00
thompsa	15cccb8286	Switch over GENERIC kernels to USB2 by default. Tested by: make universe	2009-02-15 22:33:44 +00:00
marcel	59864a1e04	Mark the BSP as being awake. This suppresses the message that not all usable CPUs could be woken up...	2009-02-10 20:29:57 +00:00
imp	719ba982f2	When bouncing pages, allow a new option to preserve the intra-page offset. This is needed for the ehci hardware buffer rings that assume this behavior. This is an interim solution, and a more general one is being worked on. This solution doesn't break anything that doesn't ask for it directly. The mbuf and uio variants with this flag likely don't work and haven't been tested. Universe builds with these changes. I don't have a huge-memory machine to test these changes with, but will be happy to work with folks that do and hps if this change turns out not to be sufficient. Submitted by: alfred@ from Hans Peter Selasky's original	2009-02-08 22:54:58 +00:00
wkoszek	10be92c87c	Don't forget to create opt_agp.h on ia64, which also uses agp(4).	2009-02-07 09:57:14 +00:00
jhb	91ab06bc89	Tweak the ia64 machine check handling code to not register new sysctl nodes while holding a spin mutex. Instead, it now shoves the machine check records onto a queue that is later drained to add sysctl nodes for each record. While a routine to drain the queue is present, it is not currently called. Reviewed by: marcel	2009-02-04 18:44:29 +00:00
alc	7a8370bbd4	Correct an error in revision 1.170 of this file. When get_pv_entry() is forced to reclaim pv entries, the one pv entry that it returns should not be freed.	2009-01-18 08:00:55 +00:00
imp	39a3668dcc	AT_DEBUG and AT_BRK were OBE like 10 years ago, so retire them. Reviewed by: peter	2008-12-17 06:56:58 +00:00
ed	9286c815e8	Remove "[KEEP THIS!]" from COMPAT_43TTY. It's not really that important. Sgtty is a programming interface that has been replaced by termios over the years. In June we already removed <sgtty.h>, which exposes the ioctl()'s that are implemented by this interface. The importance of this flag is overrated right now.	2008-12-02 19:09:08 +00:00
kib	8fad2283b3	Add sv_flags field to struct sysentvec with intention to provide description of the ABI of the currently executing image. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures to determine ABI features. Discussed with: dchagin, imp, jhb, peter	2008-11-22 12:36:15 +00:00
marcel	07d364adf0	Define mb(), rmb() and wmb() for real.	2008-11-22 06:56:49 +00:00
kmacy	9d3bb599b1	- bump __FreeBSD version to reflect added buf_ring, memory barriers, and ifnet functions - add memory barriers to <machine/atomic.h> - update drivers to only conditionally define their own - add lockless producer / consumer ring buffer - remove ring buffer implementation from cxgb and update its callers - add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to allow drivers to efficiently manage multiple hardware queues (i.e. not serialize all packets through one ifq) - expose if_qflush to allow drivers to flush any driver managed queues This work was supported by Bitgravity Inc. and Chelsio Inc.	2008-11-22 05:55:56 +00:00
des	66f807ed8b	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
marcel	2f8a2668f4	Atomically increment the number of awoken APs as all APs will be unleashed here. Pointed out by: christian.kandeler@hob.de	2008-10-19 20:14:48 +00:00
peter	ed8d07f232	Collect N identical (or near identical) mkdumpheader() implementations into one, as threatened in the comment. Textdump magic can be passed in.	2008-10-01 22:08:53 +00:00
marius	a1ec700ce8	Remove ipi_all() and ipi_self() as the former hasn't been used at all to date and the latter also is only used in ia64 and powerpc code which no longer serves a real purpose after bring-up and just can be removed as well. Note that architectures like sun4u also provide no means of implementing IPI'ing a CPU itself natively in the first place. Suggested by: jhb Reviewed by: arch, grehan, jhb	2008-09-28 18:34:14 +00:00
ed	4efdef565f	Replace all calls to minor() with dev2unit(). After I removed all the unit2minor()/minor2unit() calls from the kernel yesterday, I realised calling minor() everywhere is quite confusing. Character devices now only have the ability to store a unit number, not a minor number. Remove the confusion by using dev2unit() everywhere. This commit could also be considered as a bug fix. A lot of drivers call minor(), while they should actually be calling dev2unit(). In -CURRENT this isn't a problem, but it turns out we never had any problem reports related to that issue in the past. I suspect not many people connect more than 256 pieces of the same hardware. Reviewed by: kib	2008-09-27 08:51:18 +00:00
kib	c500808674	Change the static struct sysentvec and struct Elf_Brandinfo initializers to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc. Explicitly initialize sysentvec.sv_maxssiz that was missed in most sysvecs. No objection from: jhb MFC after: 1 month	2008-09-24 10:14:37 +00:00
obrien	d31fa36475	The kernel implemented 'memcmp' is an alias for 'bcmp'. However, memcmp and bcmp are not the same thing. 'man bcmp' states that the return is "non-zero" if the two byte strings are not identical, whereas 'man memcmp' states that the return is the "difference between the first two differing bytes (treated as unsigned char values)" if the two byte strings are not identical. So provide a proper memcmp(9), but it is a C implementation, not a tuned assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).	2008-09-23 14:45:10 +00:00
ed	cc3116a938	Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan	2008-08-20 08:31:58 +00:00
jhb	d90774443d	Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports already define _KERNEL to get to this and I'm about to add hooks to libkvm to access per-CPU data. MFC after: 1 week	2008-08-19 19:53:52 +00:00
bz	1021d43b56	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
alc	d4de04e9b1	Update bus_dmamem_alloc()'s first call to malloc() such that M_WAITOK is specified when appropriate. Reviewed by: scottl	2008-07-15 03:34:49 +00:00
delphij	cb283fcdf7	Add HWPMC_HOOKS to GENERIC kernels, this makes hwpmc.ko work out of the box.	2008-07-07 22:55:11 +00:00
marcel	1c800dbdb5	Add inline function ia64_fc_i() to abstract inline assembly. Use the new inline function in ia64_invalidate_icache(). While there, add proper synchronization so that we know the fc.i instructions have taken effect when we return.	2008-07-07 17:43:56 +00:00
ed	4d6a9685e8	Remove the unused major/minor numbers from iodev and memdev. Now that st_rdev is being automatically generated by the kernel, there is no need to define static major/minor numbers for the iodev and memdev. We still need the minor numbers for the memdev, however, to distinguish between /dev/mem and /dev/kmem. Approved by: philip (mentor)	2008-06-25 07:45:31 +00:00
marcel	c628123952	Work-around a compiler optimization bug, that broke libthr. Massive inlining resulted in constant propagation to the extent that cmpval was known to the compiler to be URWLOCK_WRITE_OWNER (= 0x80000000U). Unfortunately, instead of zero-extending the unsigned constant, it was sign-extended. As such, the cmpxchg instruction was comparing 0x0000000080000000LU to 0xffffffff80000000LU and obviously didn't perform the exchange. But, since the value returned by cmpxchg equalled cmpval (when zero-extended), the _thr_rtld_lock_release() function thought the exchange did happen and as such returned as if having released the lock. This was not the case. Subsequent locking requests found rw_state non-zero and the thread in question entered the kernel and blocked indefinitely. The work-around is to zero-extend by casting to uint64_t.	2008-05-28 16:41:02 +00:00
marcel	431e157a7b	Account for IPI_PREEMPT. We don't want to call sched_preempt() with interrupts disabled or with td_intr_nesting_level non-zero.	2008-05-23 19:53:50 +00:00
alc	964def13e2	The VM system no longer uses setPQL2(). Remove it and its helpers.	2008-05-23 04:03:54 +00:00
marcel	ae2d712eed	Create the bucket mutexes with MTX_NOWITNESS. There's now a hard limit of 512 pending mutexes in the witness code and we can easily have 1 million bucket mutexes initialized before witness is up and running. Bumping the limit from 512 to 1M is not really an option here...	2008-05-22 06:27:46 +00:00
marcel	04349514e9	We can call ia64_flush_dirty() when the corresponding process is locked or not. As such, use PROC_LOCKED() to determine which case it is and lock the process when not.	2008-05-21 05:15:27 +00:00
alc	a8f81206ad	Retire pmap_addr_hint(). It is no longer used.	2008-05-18 04:16:57 +00:00
alc	783a45362f	Add a stub for pmap_align_superpage() on machines that don't (yet) implement pmap-level support for superpages.	2008-05-09 23:31:42 +00:00
marcel	b14c656c51	Unbreak previous commit. While here, refactor the code a bit.	2008-04-25 16:09:03 +00:00
jeff	14b586bf96	- Add an integer argument to idle to indicate how likely we are to wake from idle over the next tick. - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are suspended in cpu specific states. This function can fail and cause the scheduler to fall back to another mechanism (ipi). - Implement support for mwait in cpu_idle() on i386/amd64 machines that support it. mwait is a higher performance way to synchronize cpus as compared to hlt & ipis. - Allow selecting the idle routine by name via sysctl machdep.idle. This replaces machdep.cpu_idle_hlt. Only idle routines supported by the current machine are permitted. Sponsored by: Nokia	2008-04-25 05:18:50 +00:00
phk	8d647da1ed	Now that all platforms use genclock, shuffle things around slightly for better structure. Much of this is related to <sys/clock.h>, which should really have been called <sys/calendar.h>, but unless and until we need the name, the repocopy can wait. In general the kernel does not know about minutes, hours, days, timezones, daylight savings time, leap-years and such. All that is theoretically a matter for userland only. Parts of the kernel code do, however, care: badly designed filesystems store timestamps in local time and RTC chips almost universally track time in a YY-MM-DD HH:MM:SS format, and sometimes in local timezone instead of UTC. For this we have <sys/clock.h>. <sys/time.h>, on the other hand, deals with time_t, timeval, timespec and so on. These know only seconds and fractions thereof. Move inittodr() and resettodr() prototypes to <sys/time.h>. Retain the names as it is one of the few surviving PDP/VAX references. Move startrtclock() to <machine/clock.h> on relevant platforms, it is a MD call between machdep.c/clock.c. Remove references to it elsewhere. Remove a lot of unnecessary <sys/clock.h> includes. Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs. XXX: should be kern.disable_rtc_set really, it's not MD.	2008-04-22 19:38:30 +00:00
phk	bbf813673e	Make genclock standard on all platforms. Thanks to: grehan & marcel for platform support on ia64 and ppc.	2008-04-21 10:09:55 +00:00
marcel	f4fb7d0eb3	Sanitize the malloc types: M_PMAP is not used in pmap.c, so don't define it there. Don't use M_PMAP in mp_machdep.c; define M_SMP instead.	2008-04-19 04:56:16 +00:00
marcel	710f14abdd	Remove cruft we got from Alpha, which was probably inherited from NetBSD. I.e. make it more like a FreeBSD header.	2008-04-18 02:21:11 +00:00
marcel	e920d11c2c	Use genclock for RTC handling. This eliminates the MD versions for inittodr() and resettodr(). Have nexus double as the clock device, because it's the firmware that provides RTC services. We could create a special (pseudo-) device for it, but that wasn't superior enough to actually do it. Maybe later... Requested by: phk	2008-04-15 17:02:23 +00:00
marcel	da8b8894d6	Support and switch to the ULE scheduler: o Implement IPI_PREEMPT, o Set td_lock for the thread being switched out, o For ULE & SMP, loop while td_lock points to blocked_lock for the thread being switched in, o Enable ULE by default in GENERIC and SKI,	2008-04-15 05:02:42 +00:00
marcel	d8346deefc	Revision 1.9 changes the delivery mode from the magic constant 0 (i.e. fixed delivery) to SAPIC_DELMODE_LOWPRI. While the commit log doesn't mention the change in behaviour, it is believed to be deliberate. In the last 5.5 years this hasn't been a problem. Nor do I think did it make any difference, but who knows. However, I do know that it breaks SMP support for Montecito-based machines. Switch back to fixed-CPU delivery so that SMP works again. This gives me some time to look more closely at the problem, as well as make sure the I-cache validation as it's implemented currently is sufficient in SMP configurations...	2008-04-14 20:34:45 +00:00
jeff	335d8ade9b	- Pass the irq and not the vector to intr_event_create(). Reviewed by: marcel	2008-04-11 23:10:39 +00:00
jeff	8efb03d60e	- Add the interrupt vector number to intr_event_create so MI code can lookup hard interrupt events by number. Ignore the irq# for soft intrs. - Add support to cpuset for binding hardware interrupts. This has the side effect of binding any ithread associated with the hard interrupt. As per restrictions imposed by MD code we can only bind interrupts to a single cpu presently. Interrupts can be 'unbound' by binding them to all cpus. Reviewed by: jhb Sponsored by: Nokia	2008-04-11 03:26:41 +00:00
marcel	015b99f5de	Unbreak after removal of SI_SUB_MOUNT_ROOT.	2008-04-09 03:32:48 +00:00
jhb	79918c45a6	Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt code. - Rename the intr_event 'eoi', 'disable', and 'enable' hooks to 'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric. Also, add a comment describing what the MI code expects them to do. - On amd64, i386, and powerpc this is effectively a NOP. - On arm, don't bother masking the interrupt unless the ithread is scheduled in the non-INTR_FILTER case to match what INTR_FILTER did. Also, don't bother unmasking the interrupt in the post_filter case if we never masked it. The INTR_FILTER case had been doing this by having arm_unmask_irq for the post_filter (formerly 'eoi') hook. - On ia64, stray interrupts are now masked for the non-INTR_FILTER case. They were already masked in the INTR_FILTER case. - On sparc64, use a NULL pre_ithread hook and use intr_enable_eoi() for both the 'post_filter' and 'post_ithread' hooks to match what the non-INTR_FILTER code did. - On sun4v, retire the ithread wrapper hack by using an appropriate 'post_ithread' hook instead (it's what 'post_ithread'/'enable' was designed to do even in 5.x). Glanced at by: piso Reviewed by: marius Requested by: marius [1], [5] Tested on: amd64, i386, arm, sparc64	2008-04-05 19:58:30 +00:00
marcel	f4a93f0828	Better implement I-cache invalidation. The previous implementation was a kluge. This implementation matches the behaviour on powerpc and sparc64. While on the subject, make sure to invalidate the I-cache after loading a kernel module. MFC after: 2 weeks	2008-03-30 23:09:14 +00:00
dfr	dc98ee4196	Add kernel module support for nfslockd and krpc. Use the module system to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k' option to rpc.lockd and make kernel NLM the default. A user can still force the use of the old user NLM by building a kernel without NFSLOCKD and/or removing the nfslockd.ko module.	2008-03-27 11:54:20 +00:00
jb	34e730ca27	When building a kernel module, define MAXCPU the same as SMP so that modules work with and without SMP.	2008-03-27 05:03:26 +00:00
phk	fa71439e44	The "free-lance" timer in the i8254 is only used for the speaker these days, so de-generalize the acquire_timer/release_timer api to just deal with speakers. The new (optional) MD functions are: timer_spkr_acquire() timer_spkr_release() and timer_spkr_setfreq() the last of which configures the timer to generate a tone of a given frequency, in Hz instead of 1/1193182th of seconds. Drop entirely timer2 on pc98, it is not used anywhere at all. Move sysbeep() to kern/tty_cons.c and use the timer_spkr() if they exist, and do nothing otherwise. Remove prototypes and empty acquire-/release-timer() and sysbeep() functions from the non-beeping archs. This eliminates the need for the speaker driver to know about i8254frequency at all. In theory this makes the speaker driver MI, contingent on the timer_spkr_() functions existing but the driver does not know this yet and still attaches to the ISA bus. Syscons is more tricky, in one function, sc_tone(), it knows the hz and things are just fine. In the other function, sc_bell() it seems to get the period from the KDMKTONE ioctl in terms of 1/1193182th second, so we hardcode the 1193182 and leave it at that. It's probably not important. Change a few other sysbeep() uses which obviously knew that the argument was in terms of i8254 frequency, and leave alone those that look like people thought sysbeep() took frequency in hertz. This eliminates the knowledge of i8254_freq from all but the actual clock.c code and the prof_machdep.c on amd64 and i386, where I think it would be smart to ask for help from the timecounters anyway [TBD].	2008-03-26 20:09:21 +00:00
jhb	c04bb048f6	Simplify the interrupt code a bit: - Always include the ie_disable and ie_eoi methods in 'struct intr_event' and collapse down to one intr_event_create() routine. The disable and eoi hooks simply aren't used currently in the !INTR_FILTER case. - Expand 'disab' to 'disable' in a few places. - Use function casts for arm and i386:intr_eoi_src() instead of wrapper routines to trim one extra indirection. Compiled on: {arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER} Tested on: {amd64,i386} x {FILTER, !FILTER}	2008-03-17 22:42:01 +00:00
pjd	ea49d310bf	Implement atomic_fetchadd_long() for all architectures and document it. Reviewed by: attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)	2008-03-16 21:20:50 +00:00
rwatson	877d7c65ba	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
imp	be829c21fb	BUS_DMA_ISA is left over from Alpha, and is not used in the tree at all. The reference in ia64 code is due to cutNpaste in its history and can safely be removed. Reviewed by: cognet, raj, marcel, jhb and maybe one other whom I'm forgetting	2008-03-15 06:44:45 +00:00
jhb	9c113163fb	Add preliminary support for binding interrupts to CPUs: - Add a new intr_event method ie_assign_cpu() that is invoked when the MI code wishes to bind an interrupt source to an individual CPU. The MD code may reject the binding with an error. If an assign_cpu function is not provided, then the kernel assumes the platform does not support binding interrupts to CPUs and fails all requests to do so. - Bind ithreads to CPUs on their next execution loop once an interrupt event is bound to a CPU. Only shared ithreads are bound. We currently leave private ithreads for drivers using filters + ithreads in the INTR_FILTER case unbound. - A new intr_event_bind() routine is used to bind an interrupt event to a CPU. - Implement binding on amd64 and i386 by way of the existing pic_assign_cpu PIC method. - For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up an interrupt source and binds its interrupt event to the specified CPU. MI code can currently (ab)use this by doing: intr_bind(rman_get_start(irq_res), cpu); however, I plan to add a truly MI interface (probably a bus_bind_intr(9)) where the implementation in the x86 nexus(4) driver would end up calling intr_bind() internally. Requested by: kmacy, gallatin, jeff Tested on: {amd64, i386} x {regular, INTR_FILTER}	2008-03-14 19:41:48 +00:00
jhb	64ab71ccbd	Rework how the nexus(4) device works on x86 to better handle the idea of different "platforms" on x86 machines. The existing code already handles having two platforms: ACPI and legacy. However, the existing approach was rather hardcoded and difficult to extend. These changes take the approach that each x86 hardware platform should provide its own nexus(4) driver (it can inherit most of its behavior from the default legacy nexus(4) driver) which is responsible for probing for the platform and performing appropriate platform-specific setup during attach (such as adding a platform-specific bus device). This does mean changing the x86 platform busses to no longer use an identify routine for probing, but to move that logic into their matching nexus(4) driver instead. - Make the default nexus(4) driver in nexus.c on i386 and amd64 handle the legacy platform. Its probe routine now returns BUS_PROBE_GENERIC so it can be overridden. - Expose a nexus_init_resources() routine which initializes the various resource managers so that subclassed nexus(4) drivers can invoke it from their attach routine. - The legacy nexus(4) driver explicitly adds a legacy0 device in its attach routine. - The ACPI driver no longer contains a new-bus identify method. Instead it exposes a public function (acpi_identify()) which is a probe routine that the MD nexus(4) drivers can use to probe for ACPI. All of the probe logic in acpi_probe() is now moved into acpi_identify() and acpi_probe() is just a stub. - On i386 and amd64, an ACPI-specific nexus(4) driver checks for ACPI via acpi_identify() and claims the nexus0 device if the probe succeeds. It then explicitly adds an acpi0 device in its attach routine. - The legacy(4) driver no longer knows anything about the acpi0 device. - On ia64 if acpi_identify() fails you basically end up with no devices. This matches the previous behavior where the old acpi_identify() would fail to add an acpi0 device again leaving you with no devices. Discussed with: imp Silence on: arch@	2008-03-13 20:39:04 +00:00
jeff	bf86f0992a	- Fix build breakage; there was a reference to a removed syscall in a KASSERT(). Attempt to cleanup the comment to reflect reality.	2008-03-12 22:14:14 +00:00
jeff	acb93d599c	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
jeff	ad2a31513f	- Replace the old smp cpu topology specification with a new, more flexible tree structure that encodes the level of cache sharing and other properties. - Provide several convenience functions for creating one and two level cpu trees as well as a default flat topology. The system now always has some topology. - On i386 and amd64 create a separate level in the hierarchy for HTT and multi-core cpus. This will allow the scheduler to intelligently load balance non-uniform cores. Presently we don't detect what level of the cache hierarchy is shared at each level in the topology. - Add a mechanism for testing common topologies that have more information than the MD code is able to provide via the kern.smp.topology tunable. This should be considered a debugging tool only and not a stable api. Sponsored by: Nokia	2008-03-02 07:58:42 +00:00
marcel	f051ca1feb	Re-sort options. While here: o remove COMPAT_FREEBSD5 o add INVARIANTS o add WITNESS	2008-02-16 18:30:58 +00:00
marcel	257f2d8fc2	On Montecito processors, the instruction cache is in fact not coherent with the data caches. Implement a quick fix to allow us to boot on Montecito, while I'm working on a better fix in the mean time. Commit made on Montecito-based Itanium...	2008-02-14 18:46:50 +00:00
marcel	c1a1c62b2a	Allocate a stack for thread0 and switch to it before calling mi_startup(). This frees up kstack for static PAL/SAL calls and double-fault handling.	2008-02-04 02:21:33 +00:00
ru	910410640b	Add a wrapper function that bound checks writes to the dump device.	2008-01-28 19:04:07 +00:00
jhb	c7e0e41f73	Add COMPAT_FREEBSD7 and enable it in configs that have COMPAT_FREEBSD6.	2008-01-07 21:40:11 +00:00
alc	545d26e30b	Add an access type parameter to pmap_enter(). It will be used to implement superpage promotion. Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is a Boolean.	2008-01-03 07:34:34 +00:00
imp	9699ca07d2	Use correct function name in panic message	2008-01-03 06:44:12 +00:00
imp	a3fe3266a0	Fix obsolete comment. pmap_remove_all is the function we're in.	2008-01-03 06:35:04 +00:00
alc	37cdbd87f5	Add configuration knobs for the superpage reservation system. Initially, the reservation will only be enabled on amd64.	2007-12-27 16:45:39 +00:00
rwatson	bdee30611d	Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.	2007-12-25 17:52:02 +00:00
jkoshy	39d4b4accf	Add stubs to unbreak LINT.	2007-12-07 13:45:47 +00:00
marcel	5f073f2789	Add a BSD disklabel backend to g_part: o Disklabels can have between 8 and 20 partitions (inclusive). o No device special file is created for the raw partition. o Switch ia64 to use this backend. o No support for boot code yet.	2007-12-06 02:32:42 +00:00
rwatson	99285f7544	Break out stack(9) from ddb(4): - Introduce per-architecture stack_machdep.c to hold stack_save(9). - Introduce per-architecture machine/stack.h to capture any common definitions required between db_trace.c and stack_machdep.c. - Add new kernel option "options STACK"; we will build in stack(9) if it is defined, or also if "options DDB" is defined to provide compatibility with existing users of stack(9). Add new stack_save_td(9) function, which allows the capture of a stacktrace of another thread rather than the current thread, which the existing stack_save(9) was limited to. It requires that the thread be neither swapped out nor running, which is the responsibility of the consumer to enforce. Update stack(9) man page. Build tested: amd64, arm, i386, ia64, powerpc, sparc64, sun4v Runtime tested: amd64 (rwatson), arm (cognet), i386 (rwatson)	2007-12-02 20:40:35 +00:00
jhb	7fe785218b	Remove the 'needbounce' variable from the _bus_dmamap_load_buffer() routine. It is not needed as the existing tests for segment coalescing already handle bounced addresses and it prevents legal segment coalescing in certain edge cases. MFC after: 1 week Reviewed by: scottl	2007-11-27 17:28:12 +00:00
jasone	607f2953c0	Define atomic_readandclear_ptr.	2007-11-27 06:34:15 +00:00
scottl	b607c8d8ad	Extend critical section coverage in the low-level interrupt handlers to include the ithread scheduling step. Without this, a preemption might occur in between the interrupt getting masked and the ithread getting scheduled. Since the interrupt handler runs in the context of curthread, the scheduler might see it as having such a low priority on a busy system that it doesn't get to run for a _long_ time, leaving the interrupt stranded in a disabled state. The only way that the preemption can happen is by a fast/filter handler triggering a scheduling event earlier in the handler, so this problem can only happen for cases where an interrupt is being shared by both a fast/filter handler and an ithread handler. Unfortunately, it seems to be common for this sharing to happen with network and USB devices, for example. This fixes many of the mysterious TCP session timeouts and NIC watchdogs that were being reported. Many thanks to Sam Lefler for getting to the bottom of this problem. Reviewed by: jhb, jeff, silby	2007-11-21 04:03:51 +00:00
alc	d1ab859bdc	Prevent the leakage of wired pages in the following circumstances: First, a file is mmap(2)ed and then mlock(2)ed. Later, it is truncated. Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the pages beyond the EOF are unmapped and freed. However, when the file is mlock(2)ed, the pages beyond the EOF are unmapped but not freed because they have a non-zero wire count. This can be a mistake. Specifically, it is a mistake if the sole reason why the pages are wired is because of wired, managed mappings. Previously, unmapping the pages destroys these wired, managed mappings, but does not reduce the pages' wire count. Consequently, when the file is unmapped, the pages are not unwired because the wired mapping has been destroyed. Moreover, when the vm object is finally destroyed, the pages are leaked because they are still wired. The fix is to reduce the pages' wired count by the number of wired, managed mappings destroyed. To do this, I introduce a new pmap function pmap_page_wired_mappings() that returns the number of managed mappings to the given physical page that are wired, and I use this function in vm_object_page_remove(). Reviewed by: tegge MFC after: 6 weeks	2007-11-17 22:52:29 +00:00
marcel	1e7c4f0a3f	o Rename cpu_thread_setup() to cpu_thread_alloc() to better communicate that it relates to (is called by) thread_alloc() o Add cpu_thread_free() which is called from thread_free() to counter-act cpu_thread_alloc(). i386: Have cpu_thread_free() call cpu_thread_clean() to preserve behaviour. ia64: Have cpu_thread_free() call mtx_destroy() for the mutex initialized in cpu_thread_alloc(). PR: ia64/118024	2007-11-14 20:21:54 +00:00
julian	b2732e0c22	Generally we are interested in what thread did something as opposed to what process. Since threads by default have the name of the process unless overwritten with more useful information, just print the thread name instead.	2007-11-14 06:21:24 +00:00
kib	9ae733819b	Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when kmem_alloc_nofault() failed to allocate address space. Both functions now return an error instead of panicking or dereferencing NULL. As a consequence, vmspace_exec() and vmspace_unshare() return an errno int. A struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done. The kernel stack for the thread is now set up in thread_alloc(), which itself may return NULL. Also, allocation of the first process thread is performed in fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup(), called from fork1(), and proc_linkup0(), which is used to set up the kernel process (was known as swapper). In collaboration with: Peter Holm Reviewed by: jhb	2007-11-05 11:36:16 +00:00
marcel	28e8a1b2ed	Set PTE_ACCESSED in the PTE and before inserting it in the VHPT. This avoids back-to-back faults for all TLB misses. This can be improved further in the future by also setting PTE_DIRTY for TLB misses for write accesses. MFC after: 1 week	2007-10-16 03:20:32 +00:00
marcel	70374bf52e	The flushrs instruction must be the first in an instruction group. GNU as(1) already made sure of that, but it's better to actually have the code right. MFC after: 1 week	2007-10-16 03:07:56 +00:00
marcel	0e76c44417	Print instruction stops to improve analysis of dependency violations. MFC after: 1 week	2007-10-16 02:59:03 +00:00
marcel	a1840b78b2	Fix disassembly of the invala, itc, itr and hint instructions by fixing the opcode ordering. MFC after: 1 week	2007-10-16 02:49:40 +00:00
brueffer	26461bf019	Use the correct expanded name for SCTP. PR: 116496 Submitted by: koitsu Reviewed by: rrs Approved by: re (kensmith)	2007-09-26 20:05:07 +00:00
alc	d1bce06c64	Change the management of cached pages (PQ_CACHE) in two fundamental ways: (1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock. This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock. Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq. (2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed. Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail. Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@ Tested by: an earlier version by kris@ Approved by: re (kensmith)	2007-09-25 06:25:06 +00:00
alc	20b10da706	It has been observed on the mailing lists that the different categories of pages don't sum to anywhere near the total number of pages on amd64. This is for the most part because uma_small_alloc() pages have never been counted as wired pages, like their kmem_malloc() brethren. They should be. This change fixes that. It is no longer necessary for the page queues lock to be held to free pages allocated by uma_small_alloc(). I removed the acquisition and release of the page queues lock from uma_small_free() on amd64 and ia64 weeks ago. This patch updates the other architectures that have uma_small_alloc() and uma_small_free(). Approved by: re (kensmith)	2007-09-15 18:47:02 +00:00
marcel	5f4b9d20fc	Clear pending interrupts before we enable external interrupts. Recently the AP in my Merced box seems to have grown a habit of getting unexpected interrupts, such as redundant wake-ups and legacy interrupts that require an INTA cycle. While here, replace DELAY(0) with cpu_spinwait() so that it's clear what we're doing as well as enable the code to take advantage of cpu_spinwait() when it gets implemented. Approved by: re (blanket)	2007-08-06 05:15:57 +00:00
marcel	48dc5445bf	Keep interrupts disabled while handling external interrupts. There's no advantage in allowing nested external interrupts. In fact, it leads to a potential stack overrun. While here, put the interrupt vector in the trapframe, so as to compensate for the 36 cycle latency of reading cr.ivr. Further simplify assembly code by dealing with ASTs from C. Approved by: re (blanket)	2007-08-06 05:11:01 +00:00
marcel	4ee3bb0e27	In ia64_set_rr(), don't perform data serialization. This allows us to do the data serializations once after writing multiple region registers, as is done in pmap_switch(). All existing calls to ia64_set_rr() are followed with calls to ia64_srlz_d(). Approved by: re (blanket)	2007-08-05 18:19:38 +00:00
marcel	8e03a67f7b	Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add instruction-serialization after writing to cr.pta. Delay enabling interrupts until after we setup the clocks and after we program the task priority register. Approved by: re (blanket)	2007-08-04 19:52:10 +00:00
marcel	aaa09dca3c	Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add data-serialization after writing to the region registers and add instruction-serialization after writing to cr.pta. Approved by: re (blanket)	2007-08-04 19:36:14 +00:00
marcel	67e460fdcb	Replace "__asm __volatile()" by equivalent support functions from ia64_cpu.h. This improves readability and consistency and aids in auditing the code. Add data-serialization after writing to cr.tpr. Approved by: re (blanket)	2007-08-04 19:33:27 +00:00
marcel	57b4ebc17e	Add required data-serialization after writing to cr.itm and cr.itv. Approved by: re (blanket)	2007-08-04 19:28:19 +00:00
marcel	391597776c	Add ia64_srlz_d() and ia64_srlz_i() functions to aid in serialization. Approved by: re (blanket)	2007-08-04 19:26:42 +00:00
marcel	d4ec5356ec	o Switch to physical addressing before dereferencing the VHPT bucket pointer. The virtual mapping may not be present in the translation cache. This will result in a nested TLB fault at a place we don't handle (and don't want to handle). o Make sure there's a stop after the rfi instruction, otherwise its behaviour is undefined. o Make sure we switch back to virtual addressing before doing a rfi. Behaviour is undefined otherwise. Approved by: re (blanket)	2007-07-30 22:52:52 +00:00
marcel	a5fceab6ad	Add option EXCEPTION_TRACING, which enables KTR-like functionality for processor interruptions. This is especially useful to track unexpected nested TLB faults. Approved by: re (blanket)	2007-07-30 22:42:33 +00:00
marcel	7878602389	Rework the interrupt code and add support for interrupt filtering (INTR_FILTER). This includes: o Save a pointer to the sapic structure and IRQ for every vector, so that we can quickly EOI, mask and unmask the interrupt. o Add locking to the sapic code now that we can reprogram a sapic on multiple CPUs at the same time. o Use u_int for the vector and IRQ. We only have 256 vectors, so using a 64-bit type for it is rather excessive. o Properly handle concurrent registration of a handler for the same vector. Since vectors have a corresponding priority, we should not map IRQs to vectors in a linear fashion, but rather pick a vector that has a priority in line with the interrupt type. This is left for later. The vector/IRQ interchange has been untangled as much as possible to make this easier. Approved by: re (blanket)	2007-07-30 22:29:33 +00:00
marcel	8fddc91c70	Explicitly map the VHPT on all processors. Previously we were merely lucky that the VHPT was mapped as a side-effect of mapping the kernel, but when there's enough physical memory, this may not at all be the case. Approved by: re (blanket)	2007-07-30 22:12:53 +00:00
marcel	68c4f43232	Add casts to some of the more commonly used pointer-type atomic operations. We really should be able to make those inline functions, but this would break its use for sx_locks. Approved by: re (blanket)	2007-07-30 22:07:01 +00:00
dwmalone	e8276674f3	If clock_ct_to_ts fails to convert the time from the real time clock, print a one line error message. Add some comments on not being able to trust the day of week field (I'll act on these comments in a follow up commit). Approved by: re MFC after: 3 weeks	2007-07-23 09:42:32 +00:00
marcel	0ea7715cdf	Restore the value of ar.rnat after the assignment to ar.bspstore. The SDM states that writing to ar.bspstore invalidates the ar.rnat register as a side-effect. This was interpreted as "bits in the ar.rnat register that correspond to registers whose value is on the stack are undefined". Since we keep the kernel stack NaT-aligned with the user stack (i.e. the lower 9 bits of the backing store pointer remain unchanged when we switch to the kernel stack) bits that need preserving would be preserved. That interpretation is questionable. So, now, the interpretation is more absolute: ar.rnat is undefined after writing to ar.bspstore. As such, we write the saved value of ar.rnat back to ar.rnat after writing to ar.bspstore. Discussed with: christian.kandeler@hob.de Approved by: re (kensmith)	2007-07-16 16:47:35 +00:00
marcel	15e9a30c60	dma_tag is a static structure. Testing for it being a NULL pointer doesn't make sense. Rewrite to what was intended. Correctly warned about by: GCC Approved by: re (bmah)	2007-07-09 04:58:16 +00:00
delphij	c990e91fd1	Enable SCTP by default for GENERIC kernels in order to give it more exposure. The current state of SCTP implementation is considered to be ready for 32-bit platforms, but still need some work/testing on 64-bit platforms. Approved by: re (kensmith) Discussed with: rrs	2007-06-14 17:14:27 +00:00
marcel	0d73aaee3a	Enable GEOM_PART_MBR by default. On ia64 this replaces GEOM_MBR.	2007-06-13 05:07:42 +00:00
alc	173b3d6d03	Add the machine-specific definitions for configuring the new physical memory allocator. Set the size of phys_avail[] using one of these definitions. Approved by: re	2007-06-10 23:39:07 +00:00
marcel	2a881a553e	Work around a firmware bug in the HP rx2660, where in ACPI an I/O port is really a memory mapped I/O address. The bug is in the GAS that describes the address and in particular the SpaceId field. The field should not say the address is an I/O port when it clearly is not. With an additional check for the IA64_BUS_SPACE_IO case in the bus access functions, and the fact that I/O ports are pretty much not used in general on ia64, make the calculation of the I/O port address a function. This avoids inlining the work-around into every driver, and also helps reduce overall code bloat.	2007-06-10 16:53:01 +00:00
marcel	41b2f34ed7	Synchronize the instruction cache after writing to memory. This is needed for breakpoints to work.	2007-06-09 22:15:13 +00:00
marcel	75588c5a15	Add kdb_cpu_sync_icache(), intended to synchronize instruction caches with data caches after writing to memory. This typically is required to make breakpoints work on ia64 and powerpc. For those architectures the function is implemented.	2007-06-09 21:55:17 +00:00
marcel	c6fba5a928	Physical memory regions can be larger than INT_MAX. Change size1 from an int to a long to avoid printing negative byte and page counts.	2007-06-09 01:19:08 +00:00
rwatson	4f2da72e8b	Enable AUDIT by default in the GENERIC kernel, allowing security event auditing to be turned on without a kernel recompile, just an rc.conf option. Approved by: re (kensmith) Obtained from: TrustedBSD Project	2007-06-08 20:29:07 +00:00
marcel	dc29a4eb64	Remove remaining references to pc_curtid missed in previous commit.	2007-06-07 18:36:58 +00:00
marcel	a137153bff	Eliminate pmap_install(), which was used to wrap pmap_switch() and grab sched_lock. This would serialize calls to pmap_switch from cpu_switch(). With the introduction of thread_lock, this is not possible anymore, because thread_lock is not a single lock. It varies. Secondly and most importantly, it's not needed at all. The only requirement for pmap_switch() is that it's not preempted while in the middle of updating the CPU and PCPU. In other words, it's a critical region. No locking required.	2007-06-07 16:04:23 +00:00
davidxu	17322fae92	Fix compiling error.	2007-06-07 01:53:29 +00:00
marcel	11092d3888	Include <sys/sched.h> for sched_throw().	2007-06-06 04:44:19 +00:00
jeff	91d1501790	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
jeff	8297f778b9	Commit 13/14 of sched_lock decomposition. - Add a new parameter to cpu_switch() that is used to release the lock on the outgoing thread and properly acquire the lock on the incoming thread. This parameter is not required for schedulers that don't do per-cpu locking and architectures which do not support it may continue to use the 4BSD scheduler. This feature is presently not supported on ia64. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-04 23:58:47 +00:00
jeff	be3241715a	- Change comments and asserts to reflect the removal of the global scheduler lock. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-04 23:57:32 +00:00
jeff	20e0f793e8	Commit 10/14 of sched_lock decomposition. - Use sched_throw() rather than replicating the same cpu_throw() code for each architecture. This also allows the scheduler to use any locking it may want to. - Use the thread_lock() rather than sched_lock when preempting. - The scheduler lock is not required to synchronize release_aps. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-04 23:56:08 +00:00
attilio	e333d0ff0e	Rework the PCPU_* (MD) interface: - Rename PCPU_LAZY_INC into PCPU_INC - Add the PCPU_ADD interface which just does an add on the pcpu member given a specific value. Note that for most architectures PCPU_INC and PCPU_ADD are not safe. This is a point that needs some discussions/work in the next days. Reviewed by: alc, bde Approved by: jeff (mentor)	2007-06-04 21:38:48 +00:00
attilio	7dd8ed88a9	Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
piso	42dfc78150	In some particular cases (like in pccard and pccbb), the real device handler is wrapped in a couple of functions - a filter wrapper and an ithread wrapper. In this case (and just in this case), the filter wrapper could ask the system to schedule the ithread and mask the interrupt source if the wrapped handler is composed of just an ithread handler: modify the "old" interrupt code to make it support this situation, while the "new" interrupt code is already ok. Discussed with: jhb	2007-05-31 19:25:35 +00:00
yongari	f53195d29a	Honor maxsegsz of less than a page size in a DMA tag. Previously it used to return PAGE_SIZE without respect to restrictions of a DMA tag. This affected all of the busdma load functions that use _bus_dmamap_loader_buffer() as their back-end. Reviewed by: scottl	2007-05-29 06:30:26 +00:00
alc	a530caef2a	Eliminate an unused definition.	2007-05-27 20:34:26 +00:00
marcel	df27a8ac99	Have the processor defer all faults and exceptions for control speculative loads. This at least makes control speculative loads work. In the future we should analyze which faults/exceptions we want to handle rather than defer to avoid having to call the recovery code when it's not strictly necessary.	2007-05-27 19:02:47 +00:00
kan	4c2d706212	Allow FreeBSD's native ELF image activators to execute shared libraries the same way it was enabled for Linux binaries in the linuxulator. This allows binaries built with -pie. Many ports auto-detect -fPIE support in GCC 4.2 and build binaries FreeBSD was unable to run.	2007-05-22 02:22:58 +00:00
marcel	4e45a73058	When speculation fails (as determined by the chk instruction) the processor is to jump to recovery code. This branching behaviour may not be implemented by the processor and a Speculative Operation fault is raised. The OS is responsible to emulate the branch. Implement this, because GCC 4.2 uses advanced loads regularly.	2007-05-21 05:11:43 +00:00
marcel	2a4c24267a	Fix GCC warning: va = va += PAGE_SIZE contains pointless operation va = va. Fix white space in nearby lines.	2007-05-19 18:25:14 +00:00
marcel	797bdcc549	Add a level of indirection to the kernel PTE table. The old scheme allowed for 1024 PTE pages, each containing 256 PTEs. This yielded 2GB of KVA. This is not enough to boot a kernel on a 16GB box and in general too low for a 64-bit machine. By adding a level of indirection we now have 1024 2nd-level directory pages, each capable of supporting 2GB of KVA. This brings the grand total to 2TB of KVA.	2007-05-19 13:11:27 +00:00
marcel	a94f1d6237	Account for the fact that contigmalloc(9) can return a NULL pointer. Fix the flags argument: M_WAITOK is not a valid flag. Its presence leaves the indication that contigmalloc(9) will not return a NULL pointer. The use of contigmalloc(9) in this place is probably not a good idea given the constraints. It's probably better to lift the constraints and instead add a permanent mapping to the ITR. It's possible that the first 256MB of memory is exhausted when we get here. This fixes a kernel panic on a 16GB rx3600.	2007-05-19 12:50:12 +00:00
jeff	e1996cb960	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
alc	b34f6f7ab1	Define every architecture as either VM_PHYSSEG_DENSE or VM_PHYSSEG_SPARSE depending on whether the physical address space is densely or sparsely populated with memory. The effect of this definition is to determine which of two implementations of vm_page_array and PHYS_TO_VM_PAGE() is used. The legacy implementation is obtained by defining VM_PHYSSEG_DENSE, and a new implementation that trades off time for space is obtained by defining VM_PHYSSEG_SPARSE. For now, all architectures except for ia64 and sparc64 define VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64 allows the entirety of my Itanium 2's memory to be used. Previously, only the first 1 GB could be used. Defining VM_PHYSSEG_SPARSE on sparc64 allows USIIIi-based systems to boot without crashing. This change is a combination of Nathan Whitehorn's patch and my own work in perforce. Discussed with: kmacy, marius, Nathan Whitehorn PR: 112194	2007-05-05 19:50:28 +00:00
sepotvin	a1e73b1eaf	Add support for specifying a minimal size for vm.kmem_size in the loader via vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will be at least 256mb (for example) without forcing a particular value via vm.kmem_size. Approved by: njl (mentor) Reviewed by: alc	2007-04-21 01:14:48 +00:00
pjd	f4e110ebf2	Remove trailing '.' for consistency!	2007-04-10 21:40:13 +00:00
pjd	b159725895	Add UFS_GJOURNAL options to the GENERIC kernel. Approved by: re (kensmith)	2007-04-10 16:49:41 +00:00
jkim	c06098a406	Catch up with ACPI-CA 20070320 import.	2007-03-22 18:16:43 +00:00
jhb	8b3222b80b	Change the amd64, i386, and ia64 nexus drivers to setup bus space tags and handles when activating a resource via bus_activate_resource() rather than doing some of the work in bus_alloc_resource() and some of it in bus_activate_resource(). One note is that when using isa_alloc_resourcev() on PC-98, drivers now need to just use bus_release_resource() without explicitly calling bus_deactivate_resource() first. nyan@ has already fixed all of the PC-98 drivers.	2007-03-21 15:36:38 +00:00
alc	b03ddb707b	Push down the implementation of PCPU_LAZY_INC() into the machine-dependent header file. Reimplement PCPU_LAZY_INC() on amd64 and i386 making it atomic with respect to interrupts. Reviewed by: bde, jhb	2007-03-11 05:54:29 +00:00
mohans	a332cb00d5	Over NFS, an open() call could result in multiple over-the-wire GETATTRs being generated - one from lookup()/namei() and the other from nfs_open() (for cto consistency). This change eliminates the GETATTR in nfs_open() if an otw GETATTR was done from the namei() path. Instead of extending the vop interface, we timestamp each attr load, and use this to detect whether a GETATTR was done from namei() for this syscall. Introduces a thread-local variable that counts the syscalls made by the thread and uses <pid, tid, thread syscalls> as the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on thread state that could be used as the timestamp with minimal overhead.	2007-03-09 04:02:38 +00:00
scottl	32acf7e446	Don't increment total_bounced when doing no-op dmamap_sync ops.	2007-03-06 18:28:43 +00:00
piso	b44ce1c4db	Updated ia64 isa support with the new bus_setup_intr() syntax. Approved by: re (implicit?)	2007-02-24 16:56:22 +00:00
piso	6a2ffa86e5	o break newbus api: add a new argument of type driver_filter_t to bus_setup_intr() o add an int return code to all fast handlers o retire INTR_FAST/IH_FAST For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current Reviewed by: many Approved by: re@	2007-02-23 12:19:07 +00:00
alc	989e3abb2c	Change pmap_protect() so that execute access can be removed without simultaneously removing write access.	2007-02-21 06:00:46 +00:00
alc	cc7fb68847	Eliminate some acquisitions and releases of the page queues lock that are no longer necessary.	2007-02-18 06:33:02 +00:00
marcel	e43343208a	Now that the free page queue mutex is a sleep mutex, we cannot call vm_page_alloc() from within a critical section in pmap_growkernel(). Since the need for a critical section may never have existed in the first place, simply get rid of it. Discussed with: alc@	2007-02-11 02:52:54 +00:00
brooks	beaea8e48e	Include GEOM_LABEL in GENERIC. It's very useful and not well publicized enough. Approved by: pjd	2007-02-09 19:03:18 +00:00
marcel	0245423ad8	Evolve the ctlreq interface added to geom_gpt into a generic partitioning class that supports multiple schemes. Current schemes supported are APM (Apple Partition Map) and GPT. Change all GEOM_APPLE and GEOM_GPT options into GEOM_PART_APM and GEOM_PART_GPT (resp). The ctlreq interface supports verbs to create and destroy partitioning schemes on a disk; to add, delete and modify partitions; and to commit or undo changes made.	2007-02-07 18:55:31 +00:00
imp	9109b1ceb8	Remove 3rd clause, renumber, ok per email	2007-01-12 07:26:21 +00:00
davidxu	5a984630fa	Add a lwpid field into per-cpu structure, the lwpid represents current running thread's id on each cpu. This allows us to add in-kernel adaptive spin for user level mutex. While spinning in user space is possible, without correct thread running state exported from kernel, it hardly can be implemented efficiently without wasting cpu cycles, however exporting thread running state unlikely will be implemented soon as it has to design and stabilize interfaces. This implementation is transparent to user space, it can be disabled dynamically. With this change, mutex ping-pong program's performance is improved massively on SMP machine. performance of mysql super-smack select benchmark is increased about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems which have bunch of cpus and system-call overhead is low (athlon64, opteron, and core-2 are known to be fast), the adaptive spin does help performance. Added sysctls: kern.threads.umtx_dflt_spins if the sysctl value is non-zero, a zero umutex.m_spincount will cause the sysctl value to be used as a spin cycle count. kern.threads.umtx_max_spins the sysctl sets upper limit of spin cycle count. Tested on: Athlon64 X2 3800+, Dual Xeon 5130	2006-12-20 04:40:39 +00:00
julian	396ed947f6	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
marcel	aee58e2da0	Since printf also has at least one critical section, we need to initialize pc_curthread. While here, rename early_pcpu to pcpu0 to be consistent (compare thread0 and proc0).	2006-11-18 23:15:25 +00:00
marcel	a76d88c55c	Now that printf() needs the PCPU, set it up before we call printf(). Change the pc_pcb field from a pointer to struct pcb to struct pcb so that sizeof(struct pcb) includes the PCB we use for IPI_STOP. Statically declare early_pcb so that we don't have to allocate the PCB for thread0. This way we can setup the PCPU before cninit() and thus before we use printf().	2006-11-18 21:52:26 +00:00
marcel	08b7cebe65	Revert previous commit. PC_CONS_BUFR is not used nor needed by assembly.	2006-11-18 21:48:13 +00:00
ru	8beeb4382c	Fix a comment.	2006-11-13 06:26:57 +00:00
alc	6093953d36	Make pmap_enter() responsible for setting PG_WRITEABLE instead of its caller. (As a beneficial side-effect, a high-contention acquisition of the page queues lock in vm_fault() is eliminated.)	2006-11-12 21:48:34 +00:00
rwatson	2e2a6d6a66	Add missing includes of priv.h.	2006-11-06 17:43:10 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
jb	0ba5da19a5	Remove the KDTRACE option again because of the complaints about having it as a default. For the record, the KDTRACE option caused _no_ additional source files to be compiled in; certainly no CDDL source files. All it did was to allow existing BSD licensed kernel files to include one or more CDDL header files. By removing this from DEFAULTS, the onus is on a kernel builder to add the option to the kernel config, possibly by including GENERIC and customising from there. It means that DTrace won't be a feature available in FreeBSD by default, which is the way I intended it to be. Without this option, you can't load the dtrace module (which contains the dtrace device and the DTrace framework). This is equivalent to requiring an option in a kernel config before you can load the linux emulation module, for example. I think it is a mistake to have DTrace ported to FreeBSD, but not to have it available to everyone, all the time. The only exception to this is the companies which distribute systems with FreeBSD embedded. Those companies will customise their systems anyway. The KDTRACE option was intended for them, and only them.	2006-11-04 23:50:12 +00:00
jb	f7bc0a87d6	Build in kernel support for loading DTrace modules by default. This adds the hooks that DTrace modules register with, and adds a few functions which have the dtrace_ prefix to allow the DTrace FBT (function boundary trace) provider to avoid tracing because they are called from the DTrace probe context. Unlike other forms of tracing and debug, DTrace support in the kernel incurs negligible run-time cost. I think the only reason why anyone wouldn't want to have kernel support enabled for DTrace would be due to the license (CDDL) under which DTrace is released.	2006-11-04 04:58:10 +00:00
marcel	c4a6a4c4a7	Make sure kern_envp is never NULL. If we don't get a pointer to the environment from the loader, use the static environment.	2006-11-03 04:06:17 +00:00
jb	d2bd807356	Add a cnputs() function to write a string to the console with a lock to prevent interspersed strings written from different CPUs at the same time. To avoid putting a buffer on the stack or having to malloc one, space is incorporated in the per-cpu structure. The buffer size is 128 bytes; chosen because it's the next power of 2 size up from 80 characters. String writes to the console are buffered up to the end of the line or until the buffer fills. Then the buffer is flushed to all console devices. Existing low level console output via cnputc() is unaffected by this change. ithread calls to log() are also unaffected to avoid blocking those threads. A minor change to the behaviour in a panic situation is that console output will still be buffered, but won't be written to a tty as before. This should prevent interspersed panic output as a number of CPUs panic before we end up single threaded running ddb. Reviewed by: scottl, jhb MFC after: 2 weeks	2006-11-01 04:54:51 +00:00
jb	d20db01f97	Remove the KSE option now that it's in DEFAULTS on these arches/machines. The 'nooption' kernel config entry has to be used to turn KSE off now. This isn't my preferred way of dealing with this, but I'll defer to scottl's experience with the io/mem kernel option change and the grief experienced over that. Submitted by: scottl@	2006-10-26 22:11:35 +00:00
jb	3a250c3e16	Add 'options KSE' to the kernel config DEFAULTS on all arches/machines except sun4v. This change makes the transition from a default to an option more transparent and is an attempt to head off all the complaints that are likely from people who don't read UPDATING, based on experience with the io/mem change. Submitted by: scottl@	2006-10-26 22:05:25 +00:00
jb	f82c799735	Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE). Reviewed by: davidxu@	2006-10-26 21:42:22 +00:00
ru	dcc4e06e70	Move "device splash" back to MI NOTES and "files", it's MI.	2006-10-23 13:23:14 +00:00
marcel	9f9cb15f4e	o Eliminate nexus_print_resources(). Use resource_list_print_type() instead. o Eliminate nexus_print_all_resources(). Inline the function body in nexus_print_child().	2006-10-23 00:38:58 +00:00
alc	ab1a7ca9a2	Eliminate unnecessary PG_BUSY tests.	2006-10-22 04:18:01 +00:00
des	a3f4000fda	Move more MD devices and options out of MI NOTES.	2006-10-20 09:52:27 +00:00
des	5658cd6451	The VGA_DEBUG option only exists on {amd64,i386,ia64}. Also remove 'device io' from amd64 NOTES; DEFAULTS takes care of it.	2006-10-20 08:56:26 +00:00
marcel	82d986f8fd	Fix previous revision: o day and mday are the same. No need to subtract 1 from mday. o Set dow to -1 as clock_ct_to_ts() checks this field and returns EINVAL on any day of the week but Sunday.	2006-10-19 00:53:35 +00:00
davidxu	bb5a3880aa	o Add keyword volatile for user mutex owner field. o Fix type consistent problem by using type long for old umtx and wait channel. o Rename casuptr to casuword.	2006-10-17 02:24:47 +00:00
hrs	51beaea0b5	Add a newline to the printf(). Spotted by: Peter Carah <pete@altadena.net> MFC after: 3 days	2006-10-15 16:52:59 +00:00
marcel	a0da43241f	Include freebsd32_signal.h now that signal-related definitions are moved there. Found by: ia64 tinderbox.	2006-10-06 19:33:44 +00:00
simon	440a3866d6	- Remove SCHED_ULE from GENERIC to better avoid foot-shooting by unsuspecting users. - Add a comment in NOTES about experimental status of SCHED_ULE. - Make warning about experimental status in sched_ule(4) a bit stronger. Suggested and reviewed by: dougb Discussed on: developers MFC after: 3 days	2006-10-05 20:31:58 +00:00
jb	bf543444cf	Move the relocation definitions to the common elf header so that DTrace can use them on one architecture targeted to a different one. Add the additional ELF types defines in Sun's "Linker and Libraries" manual.	2006-10-04 21:37:10 +00:00
phk	f245b890e0	Use calendrical calculations from subr_clock.c instead of home-rolled.	2006-10-02 16:32:36 +00:00
phk	84cc0f277f	Second part of a little cleanup in the calendar/timezone/RTC handling. Split subr_clock.c in two parts (by repo-copy): subr_clock.c contains generic RTC and calendaric stuff, etc. subr_rtc.c contains the newbus'ified RTC interface. Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock} sysctls and associated variables into subr_clock.c. They are not machine dependent and we have generic code that relies on them being present so they are not even optional.	2006-10-02 15:42:02 +00:00
phk	50c81b8a9a	First part of a little cleanup in the calendar/timezone/RTC handling. Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & space-efficient BCD routines.	2006-10-02 12:59:59 +00:00
ru	d4abbeb0fe	Added COMPAT_FREEBSD6 option.	2006-09-26 12:36:34 +00:00
kan	c9b2659ee8	Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes the former and __builtin_va_start was present in all GCC version 3.1 and later.	2006-09-21 01:37:02 +00:00
rwatson	76eda1318a	Add audit hooks for ppc, ia64 system call paths. Reviewed by: marcel (ia64) Obtained from: TrustedBSD Project MFC after: 3 days	2006-09-16 17:03:02 +00:00
davidxu	87b5aa08ee	Implement casuword32, compare and set user integer, thanks to Marcel Moolenaar who wrote the IA64 version of casuword32.	2006-08-28 02:28:15 +00:00
alc	72ff1a9186	Eliminate unused definitions. (They came from NetBSD.) Discussed with: cognet, grehan, marcel	2006-08-25 23:51:11 +00:00
jhb	ce9f8963fd	First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include: - A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping. - pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example. - A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before. - A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA. Reviewed by: alc	2006-08-11 19:22:57 +00:00
alc	a152234cf9	Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change. The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access. Reviewed by: marcel@ (ia64)	2006-08-01 19:06:06 +00:00
marcel	7067faff16	Remove sio(4) and related options from MI files to amd64, i386 and pc98 MD files. Remove nodevice and nooption lines specific to sio(4) from ia64, powerpc and sparc64 NOTES. There were no such lines for arm yet. sio(4) is usable on less than half the platforms, not counting a future mips platform. Its presence in MI files is therefore increasingly becoming a burden.	2006-07-29 18:38:54 +00:00
jhb	3a707d012d	Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is now back to just being an argument count.	2006-07-28 20:22:58 +00:00
jhb	c62c38439f	Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.	2006-07-28 19:05:28 +00:00
jhb	12302c47d0	Unify the checking for lock misbehavior in the various syscall() implementations and adjust some of the checks while I'm here: - Add a new check to make sure we don't return from a syscall in a critical section. - Add a new explicit check before userret() to make sure we don't return with any locks held. The advantage here is that we can include the syscall number and name in syscall() whereas that info is not available in userret(). - Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by the more general checks just added. MFC after: 2 weeks	2006-07-27 22:32:30 +00:00
jhb	d832089727	Add KTR_SYSC tracing to the syscall() implementations that didn't have it yet. MFC after: 1 week	2006-07-27 21:25:50 +00:00
jhb	39705fd8c6	Add missing ptrace(2) system-call stops to various syscall() implementations. MFC after: 1 week	2006-07-27 19:50:16 +00:00
marcel	444a271f09	Move default GEOM classes from files.ia64, where they were marked standard, to the DEFAULTS file.	2006-07-17 20:02:51 +00:00
jhb	a72b0bcd7f	Simplify the pager support in DDB. Allowing different db commands to install custom pager functions didn't actually happen in practice (they all just used the simple pager and passed in a local quit pointer). So, just hardcode the simple pager as the only pager and make it set a global db_pager_quit flag that db commands can check when the user hits 'q' (or a suitable variant) at the pager prompt. Also, now that it's easy to do so, enable paging by default for all ddb commands. Any command that wishes to honor the quit flag can do so by checking db_pager_quit. Note that the pager can also be effectively disabled by setting $lines to 0. Other fixes: - 'show idt' on i386 and pc98 now actually checks the quit flag and terminates early. - 'show intr' now actually checks the quit flag and terminates early.	2006-07-12 21:22:44 +00:00
mjacob	672bdf0dbc	Make the firmware assist driver resident in preparation for isp using it.	2006-07-09 16:40:31 +00:00
bde	c3398d0c39	Fixed FP_R*. fp{get_set}round() apparently never worked on ia64, since the alpha values were used and are quite different. Fixed some style bugs by copying from the i386 version where it is better.	2006-07-05 06:10:21 +00:00
marcel	d7fe1ba45f	Partial support for branch long emulation. This only emulates the branch long jump and not the branch long call. Support for that is forthcoming.	2006-06-29 19:59:18 +00:00
alc	e05c27b796	Make several changes to pmap_enter_quick_locked(): 1. Make the caller responsible for performing pmap_install(). This reduces the number of times that pmap_install() is performed by pmap_enter_object() from twice per page to twice overall. 2. Don't block if pmap_find_pte() is unable to allocate a PTE. If it did block, then it might wind up mapping a cache page. Specifically, if pmap_enter_quick_locked() slept when called from pmap_enter_object(), the page daemon could change an active or inactive page into a cache page just before it was to be mapped. 3. Bail out of pmap_enter_quick_locked() if pv entries aren't plentiful. In other words, don't force the allocation of a pv entry if they aren't readily available. Reviewed by: marcel@	2006-06-27 05:05:05 +00:00
babkin	f0555f2de9	Backed out the change by request from rwatson. PR: kern/14584	2006-06-26 22:03:22 +00:00
babkin	3d8be823b0	The common UID/GID space implementation. It has been discussed on -arch in 1999, and there are changes to the sysctl names compared to PR, according to that discussion. The description is in sys/conf/NOTES. Lines in the GENERIC files are added in commented-out form. I'll attach the test script I've used to PR. PR: kern/14584 Submitted by: babkin	2006-06-25 18:37:44 +00:00
marcel	90ebd15455	Update to SDM 2.2: o Add tf (test feature) instruction, o Add vmsw (VM switch) instruction. While here, update copyright. MFC after: 1 week	2006-06-24 19:21:11 +00:00
marcel	f5d673ba84	Sync up with SDM 2.1: o Add nop/hint formats F16, I18, M48 and X5, o Add format M47 for ptc.e, o Add hint instruction, o Fix decoding of cmp8xchg16.	2006-06-24 01:19:52 +00:00
marcel	27e6a73de1	Identify the dual-core Montecito. MFC after: 3 days	2006-06-22 00:56:58 +00:00
netchild	11681ee0b5	Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's an explicit comment that it's needed for the linuxolator. This is not the case anymore. For all other architectures there was only a "KEEP THIS". I'm (and other people too) running a COMPAT_43-less kernel since it's not necessary anymore for the linuxolator. Roman is running such a kernel for a longer time. No problems so far. And I doubt other (newer than ia32 or alpha) architectures really depend on it. This may result in a small performance increase for some workloads. If the removal of COMPAT_43 results in a not working program, please recompile it and all dependencies and try again before reporting a problem. The only place where COMPAT_43 is needed (as in: does not compile without it) is in the (outdated/not usable since too old) svr4 code. Note: this does not remove the COMPAT_43TTY option. Nagging by: rdivacky	2006-06-15 19:58:53 +00:00
ups	b3a7439a45	Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386. Reviewed by: alc,jhb,peter,ps	2006-06-15 01:01:06 +00:00
imp	038d1db25e	Add the ability to subset the devices that UART pulls in. This allows the arm to compile without all the extras that don't appear, at least not in the flavors of ARM I deal with. This helps us save about 100k. If I've botched the available devices on a platform, please let me know and I'll correct ASAP.	2006-06-12 04:21:50 +00:00
alc	ff4adb11fe	Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects. Reviewed by: tegge@	2006-06-05 20:35:27 +00:00
imp	7a24ed8d3d	EISA bus ia64 systems don't exist in reality. I'm told they may exist in theory, but that it was OK to remove from NOTES. OK'd by: marcel	2006-06-02 04:46:26 +00:00
alc	af1bb99b4d	Correct a syntax error in the previous revision.	2006-06-01 19:23:45 +00:00
silby	89bd691dee	After much discussion with mjacob and scottl, change bus_dmamem_alloc so that it just warns the user with a printf when it misaligns a piece of memory that was requested through a busdma tag. Some drivers (such as mpt, and probably others) were asking for alignments that could not be satisfied, but as far as driver operation was concerned, that did not matter. In the theory that other drivers will fall into this same category, we agreed that panicking or making the allocation fail will cause more hardship than is necessary. The printf should be sufficient motivation to get the driver glitch fixed.	2006-06-01 04:49:29 +00:00
mjacob	51170c3bdd	Since it's to all intents and purposes identical code to amd64 && i386, match the recent changes to bus_dmamem_alloc here.	2006-05-31 00:38:53 +00:00
marcel	efa2062722	Unbreak after previous commit. While here, improve function naming consistency by s/ssc/ssc_/g.	2006-05-27 17:52:08 +00:00
phk	a45f741c1f	Update to new console api.	2006-05-26 18:25:34 +00:00
marius	1a141a2cee	Add le(4). I could actually only test it on alpha, i386 and sparc64 but given that this includes the more problematic platforms I see no reason why it shouldn't also work on amd64 and ia64.	2006-05-17 20:45:45 +00:00
phk	ef310efff8	Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.	2006-05-16 14:37:58 +00:00
marcel	10f0f16487	Fix braino in previous commit: Don't redefine OID_AUTO to something not equal to -1, or at all for that matter.	2006-05-11 22:49:31 +00:00
phk	d12dd358d4	Remove more straggling CPU_ macro references	2006-05-11 17:53:26 +00:00
phk	5d8c57a08b	Clean out sysctl machdep.* related defines. The cmos clock related stuff should really be in MI code.	2006-05-11 17:29:25 +00:00
marcel	193a6144b9	Rewrite of puc(4). Significant changes are: o Properly use rman(9) to manage resources. This eliminates the need for puc-specific hacks to rman. It also allows devinfo(8) to be used to find out the specific assignment of resources to serial/parallel ports. o Compress the PCI device "database" by optimizing for the common case and to use a procedural interface to handle the exceptions. The procedural interface also generalizes the need to setup the hardware (program chipsets, program clock frequencies). o Eliminate the need for PUC_FASTINTR. Serdev devices are fast by default and non-serdev devices are handled by the bus. o Use the serdev I/F to collect interrupt status and to handle interrupts across ports in priority order. o Sync the PCI device configuration to include devices found in NetBSD and not yet merged to FreeBSD. o Add support for Quatech 2, 4 and 8 port UARTs. o Add support for a couple dozen Timedia serial cards as found in Linux.	2006-04-28 21:21:53 +00:00
marcel	35fe8d3d11	In nexus_teardown_intr(), actually remove the handler. MFC after: 1 day	2006-04-21 16:12:28 +00:00
imp	34755358c7	Set the rid of the resource obtained from rman_reserve_resource.	2006-04-20 04:18:30 +00:00
alc	a7e3d6f83b	Retire pmap_track_modified(). We no longer need it because we do not create managed mappings within the clean submap. To prevent regressions, add assertions blocking the creation of managed mappings within the clean submap. Reviewed by: tegge	2006-04-12 04:22:52 +00:00
marcel	d28296b199	Improve handling of IPI_STOP: o use atomic operations to fiddle with stopped_cpus and started_cpus. o disable interrupts while we're waiting to be started. o remove logic relating to cpustop_restartfunc as it's not used.	2006-04-03 23:56:40 +00:00
marcel	8278e2d5fb	Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the PCB in which the context of stopped CPUs is stored. To access this PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The definition, when present, lives in <machine/kdb.h> and abstracts where MD code saves the context. Define KDB_STOPPEDPCB on i386, amd64, alpha and sparc64 in accordance to previous code.	2006-04-03 22:51:47 +00:00
peter	0f363b7d24	Remove the unused sva and eva arguments from pmap_remove_pages().	2006-04-03 21:16:10 +00:00
jhb	ff9c76bccd	Close some races between procfs/ptrace and exit(2): - Reorder the events in exit(2) slightly so that we trigger the S_EXIT stop event earlier. After we have signalled that, we set P_WEXIT and then wait for any processes with a hold on the vmspace via PHOLD to release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops to zero. - Change proc_rwmem() to require that the process being read from has its vmspace held via PHOLD by the caller and get rid of all the junk to screw around with the vmspace reference count as we no longer need it. - In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it doesn't exist. - Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem() to clear an earlier single-step simulated via a breakpoint). We only do one to avoid races. Also, by making the EINVAL error for unknown requests be part of the default: case in the switch, the various switch cases can now just break out to return which removes a _lot_ of duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug where a LWP ptrace command could return EINVAL with the proc lock still held. - Changed the locking for ptrace_single_step(), ptrace_set_pc(), and ptrace_clear_single_step() to always be called with the proc lock held (it was a mixed bag previously). Alpha and arm have to drop the lock while they mess around with breakpoints, but other archs avoid extra lock release/acquires in ptrace(). I did have to fix a couple of other consumers in kern_kse and a few other places to hold the proc lock and PHOLD. Tested by: ps (1 mostly, but some bits of 2-4 as well) MFC after: 1 week	2006-02-22 18:57:50 +00:00
jhb	a3a3e58c13	Fix the hw.realmem sysctl. The global realmem variable is a count of pages, not a count of bytes. The sysctl handler for hw.realmem already uses ctob() to convert realmem from pages to bytes. Thus, on archs that were storing a byte count in the realmem variable, hw.realmem was inflated. Reported by: Valerio daelli valerio dot daelli at gmail dot com (alpha) MFC after: 3 days	2006-02-14 14:50:11 +00:00
marcel	ac97d94d79	Correct the spinlock nesting of the idle thread of the APs before we save the MCA state of the AP. Saving the MCA state of the AP requires us to allocate memory, which uses sleep locks. Now that we correct the spinlock nesting of the AP without having schedlock, avoid calling spinlock_exit(). Instead call critical_exit() and manually clear the MD spinlock count. MFC after: 3 days	2006-02-11 19:55:18 +00:00
phk	74f8e63a10	Simplify system time accounting for profiling. Rename struct thread's td_sticks to td_pticks, we will need the other name for more appropriately named use shortly. Reduce it from uint64_t to u_int. Clear td_pticks whenever we enter the kernel instead of recording its value as reference for userret(). Use the absolute value of td->pticks in userret() and eliminate third argument.	2006-02-08 08:09:17 +00:00
phk	bb2f62f536	Modify the way we account for CPU time spent (step 1) Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers. For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.) On slower machines, the avoided multiplications to normalize timestamps at every context switch, comes out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.	2006-02-07 21:22:02 +00:00
marcel	a331877ad8	Allocate memory for the MCA state information with M_NOWAIT. We can get a MCA event at any moment and it may not be safe to sleep. MFC after: 3 days	2006-02-07 02:02:14 +00:00
marcel	894653aa65	Remove devices acpi & mem, as they are in defaults already.	2006-02-02 23:41:08 +00:00
marcel	597a7332d8	s/DT_IA64_PLT_RESERVE/DT_IA_64_PLT_RESERVE/	2006-01-28 17:58:22 +00:00
marcel	1317d57d23	o Add missing relocations. o Minor white-space fixups.	2006-01-18 01:45:57 +00:00
marcel	408ca433c5	s/R_IA64_/R_IA_64_/g as per the ia64 psABI.	2006-01-17 21:03:22 +00:00
phk	57be8af642	Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43) to COMPAT_43TTY. Add COMPAT_43TTY to NOTES and */conf/GENERIC Compile tty_compat.c only under the new option. Spit out #warning "Old BSD tty API used, please upgrade." if ioctl_compat.h gets #included from userland.	2006-01-10 09:19:10 +00:00
imp	c2b2965b6a	By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into param.h. Per request, I've placed these just after the _NO_NAMESPACE_POLLUTION ifndef. I've not renamed anything yet, but may since we don't need the __. Submitted by: bde, jhb, scottl, many others.	2006-01-09 06:05:57 +00:00
phk	44d6de75f9	Use ttyalloc() instead of ttymalloc()	2006-01-04 09:46:20 +00:00
imp	8d9b67a0e3	Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for each platform. These will be used in the pci code in preference to the complicated #ifdefs we have there now.	2006-01-01 20:59:28 +00:00
netchild	507a9b3e93	MI changes: - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)	2005-12-31 14:39:20 +00:00
sobomax	34fa5a81a5	Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so. PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>	2005-12-26 21:23:57 +00:00
jhb	cb0d490ebe	Tweak how the MD code calls the fooclock() methods some. Instead of passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that effect and remove a condition to slightly optimize the non-lapic case. - Change the prototype of arm_handler_execute() so that its first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket. Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)	2005-12-22 22:16:09 +00:00
marcel	0a081d09f4	Make our ELF64 type definitions match standards. In particular this means: o Remove Elf64_Quarter, o Redefine Elf64_Half to be 16-bit, o Redefine Elf64_Word to be 32-bit, o Add Elf64_Xword and Elf64_Sxword for 64-bit entities, o Use Elf_Size in MI code to abstract the difference between Elf32_Word and Elf64_Word. o Add Elf_Ssize as the signed counterpart of Elf_Size. MFC after: 2 weeks	2005-12-18 04:52:37 +00:00
jhb	0b37b8af54	- Cleanup whitespace and extra ()s in vtophys() macros. - Move vtophys() macros next to vtopte() where vtopte() exists to match comments above vtopte(). - Remove references to the alternate address space in the comment above vtopte(). amd64 never had the alternate address space, and i386 lost it prior to PAE support being added. - s/entires/entries/ in comments. Reviewed by: alc	2005-12-06 21:09:01 +00:00
ru	f9739084f5	Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with MACHINE_ARCH and MACHINE). Their purpose was to be able to test in cpp(1), but cpp(1) only understands integer type expressions. Using such unsupported expressions introduced a number of subtle bugs, which were discovered by compiling with -Wundef.	2005-12-06 13:27:21 +00:00
ru	3db1ffb040	Fix -Wundef warnings from compiling GENERIC and LINT kernels of all architectures.	2005-12-06 11:19:37 +00:00
ru	9fa3a162bc	- Allow duplicate "machine" directives with the same arguments. - Move existing "machine" directives to DEFAULTS.	2005-11-27 23:17:00 +00:00
jhb	80adaaedab	Don't enable PUC_FASTINTR by default in the source. Instead, enable it via the DEFAULTS kernel configs. This allows folks to turn that option off in the kernel configs if desired without having to hack the source. This is especially useful since PUC_FASTINTR hangs the kernel boot on my ultra60 which has two uart(4) devices hung off of a puc(4) device. I did not enable PUC_FASTINTR by default on powerpc since powerpc does not currently allow sharing of INTR_FAST with non-INTR_FAST like the other archs.	2005-11-21 20:22:35 +00:00
jhb	23a1490fe0	Create DEFAULTS files for alpha, ia64, powerpc, and sparc64 and move 'device mem' over from GENERIC to DEFAULTS to be consistent with i386 and amd64. Additionally, on ia64 enable ACPI by default since ia64 requires acpi.	2005-11-21 20:17:46 +00:00
alc	b77df1e33a	Eliminate pmap_init2(). It's no longer used.	2005-11-20 06:09:49 +00:00
alc	29e067429c	In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock cannot possibly occur.	2005-11-13 02:17:05 +00:00
alc	8852c8f9e2	Reimplement the reclamation of PV entries. Specifically, perform reclamation synchronously from get_pv_entry() instead of asynchronously as part of the page daemon. Additionally, limit the reclamation to inactive pages unless allocation from the PV entry zone or reclamation from the inactive queue fails. Previously, reclamation destroyed mappings to both inactive and active pages. get_pv_entry() still, however, wakes up the page daemon when reclamation occurs. The reason being that the page daemon may move some pages from the active queue to the inactive queue, making some new pages available to future reclamations. Print the "reclaiming PV entries" message at most once per minute, but don't stop printing it after the fifth time. This way, we do not give the impression that the problem has gone away. Reviewed by: tegge	2005-11-09 08:19:21 +00:00
alc	796bccfcad	Begin and end the initialization of pvzone in pmap_init(). Previously, pvzone's initialization was split between pmap_init() and pmap_init2(). This split initialization was the underlying cause of some UMA panics during initialization. Specifically, if the UMA boot pages were exhausted before the pvzone was fully initialized, then UMA, through no fault of its own, would use an inappropriate back-end allocator leading to a panic. (Previously, as a workaround, we have increased the UMA boot pages.) Fortunately, there is no longer any reason that pvzone's initialization cannot be completed in pmap_init(). Eliminate a check for whether pv_entry_high_water has been initialized or not from get_pv_entry(). Since pvzone's initialization is completed in pmap_init(), this check is no longer needed. Use cnt.v_page_count, the actual count of available physical pages, instead of vm_page_array_size to compute the maximum number of pv entries. Introduce the vm.pmap.pv_entries tunable on alpha and ia64. Eliminate some unnecessary white space. Discussed with: tegge (item #1) Tested by: marcel (ia64)	2005-11-04 18:03:24 +00:00
alc	f044c16556	Remove the remaining spl*() calls. Add some assertions. Eliminate some excessive white space.	2005-11-03 07:51:02 +00:00
rwatson	be4f357149	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
marcel	7c66d4d570	Remove a stray return statement in the interrupt dispatch function that caused a premature exit after calling a fast interrupt handler and bypassing a much needed critical_exit() and the scheduling of the interrupt thread for non-fast handlers. In short: unbreak :-)	2005-10-30 17:23:01 +00:00
jhb	e20e5c07ce	Reorganize the interrupt handling code a bit to make a few things cleaner and increase flexibility to allow various different approaches to be tried in the future. - Split struct ithd up into two pieces. struct intr_event holds the list of interrupt handlers associated with interrupt sources. struct intr_thread contains the data relative to an interrupt thread. Currently we still provide a 1:1 relationship of events to threads with the exception that events only have an associated thread if there is at least one threaded interrupt handler attached to the event. This means that on x86 we no longer have 4 bazillion interrupt threads with no handlers. It also means that interrupt events with only INTR_FAST handlers no longer have an associated thread either. - Renamed struct intrhand to struct intr_handler to follow the struct intr_foo naming convention. This did require renaming the powerpc MD struct intr_handler to struct ppc_intr_handler. - INTR_FAST no longer implies INTR_EXCL on all architectures except for powerpc. This means that multiple INTR_FAST handlers can attach to the same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach to the same interrupt. Sharing INTR_FAST handlers may not always be desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun either. Drivers can always still use INTR_EXCL to ask for an interrupt exclusively. The way this sharing works is that when an interrupt comes in, all the INTR_FAST handlers are executed first, and if any threaded handlers exist, the interrupt thread is scheduled afterwards. This type of layout also makes it possible to investigate using interrupt filters ala OS X where the filter determines whether or not its companion threaded handler should run. - Aside from the INTR_FAST changes above, the impact on MD interrupt code is mostly just 's/ithread/intr_event/'. - A new MI ddb command 'show intrs' walks the list of interrupt events dumping their state. 
It also has a '/v' verbose switch which dumps info about all of the handlers attached to each event. - We currently don't destroy an interrupt thread when the last threaded handler is removed because it would suck for things like ppbus(8)'s braindead behavior. The code is present, though, it is just under #if 0 for now. - Move the code to actually execute the threaded handlers for an interrupt event into a separate function so that ithread_loop() becomes more readable. Previously this code was all in the middle of ithread_loop() and indented halfway across the screen. - Made struct intr_thread private to kern_intr.c and replaced td_ithd with a thread private flag TDP_ITHREAD. - In statclock, check curthread against idlethread directly rather than curthread's proc against idlethread's proc. (Not really related to intr changes) Tested on: alpha, amd64, i386, sparc64 Tested on: arm, ia64 (older version of patch by cognet and marcel)	2005-10-25 19:48:48 +00:00
ade	25703033ca	Specifically panic() in the case where pmap_insert_entry() fails to get a new pv under high system load where the available pv entries have been exhausted before the pagedaemon has a chance to wake up to reclaim some. Prior to this, the NULL pointer dereference ended up causing secondary panics with rather less than useful resulting tracebacks. Reviewed by: alc, jhb MFC after: 1 week	2005-10-21 19:42:43 +00:00
phk	10f4dbda56	Make ttyconsolemode() call ttsetwater() so that drivers don't have to.	2005-10-16 20:58:22 +00:00
davidxu	3fbdb3c215	1. Change the prototypes of trapsignal and sendsig to use ksiginfo_t *; most changes in MD code are trivial. Before this change, trapsignal and sendsig used discrete parameters; now they use member fields of the ksiginfo_t structure. For sendsig, this change allows us to pass a POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo; it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to the proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to the thread structure to hold all signals delivered to the thread. 5. i386 and amd64 now return POSIX standard si_code; other arches will be fixed. 6. In this sigqueue implementation, the pending signal set is kept as before, and an extra siginfo list holds additional siginfo_t data for signals. Kernel code that uses psignal() still behaves as before and won't fail even under memory pressure. The only exception is when deleting a signal: we should call sigqueue_delete to remove the signal from the sigqueue rather than SIGDELSET. Currently no kernel code delivers a signal with additional data, so the kernel should be as stable as before. A ksiginfo can carry more information, for example, allowing a signal to be delivered but throwing away the siginfo data if there is not enough memory. SIGKILL and SIGSTOP have a fast path in sigqueue_add because they cannot be caught or masked. The sigqueue() syscall allows user code to queue a signal to a target process; if resources are unavailable, EAGAIN is returned, as the specification says. Just before a thread exits, signal queue memory is freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
phk	84cf732734	Eliminate need for __RMAN_RESOURCE_VISIBLE Reviewed by: marcel@	2005-10-06 17:39:18 +00:00
rwatson	2b01dbdaa0	Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57, osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60, svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81, svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55, svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10, ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58, unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133: Now that Giant is acquired in uprintf() and tprintf(), the caller no longer needs to acquire Giant unless it also holds another mutex that would generate a lock order reversal when calling into these functions. Specifically not backed out is the acquisition of Giant in nfs_socket.c and rpcclnt.c, where local mutexes are held and would otherwise violate the lock order with Giant. This aligns this code more with the eventual locking of ttys. Suggested by: bde	2005-09-28 07:03:03 +00:00
peter	fe69f6532f	Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added stubs for ia64 to keep it compiling. These are used by 32 bit apps such as gdb.	2005-09-27 18:04:20 +00:00
jhb	89caa56972	Add a new atomic_fetchadd() primitive that atomically adds a value to a variable and returns the previous value of the variable. Tested on: i386, alpha, sparc64, arm (cognet) Reviewed by: arch@ Submitted by: cognet (arm) MFC after: 1 week	2005-09-27 17:39:11 +00:00
rwatson	c479a90eb8	Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(), as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment). Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout. With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant. NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable. NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change. NB: uprintf()/tprintf() sleeping is generally a bad idea, as is having non-MPSAFE tty code. MFC after: 1 week	2005-09-19 16:51:43 +00:00
csjp	6216087b31	Introduce a kernel config for the Mandatory Access Control framework. This kernel config briefly describes some of the major MAC policies available on FreeBSD. The hope is that this will raise the awareness about MAC and get more people interested. Discussed with: scottl	2005-09-18 03:15:36 +00:00
alc	4cfa27e2c0	Eliminate unused definitions.	2005-09-11 20:51:15 +00:00
obrien	5a7994d2cd	Canonize the include of acpi.h.	2005-09-11 18:39:03 +00:00
marcel	0257724685	Merge db_interface.c and db_trace.c into db_machdep.c.	2005-09-10 03:18:51 +00:00
marcel	5c8a9dbf0f	Move the prototypes of db_md_set_watchpoint(), db_md_clr_watchpoint() and db_md_list_watchpoints() to ddb/ddb.h.	2005-09-10 03:01:25 +00:00
marcel	6798cfc6b5	Move the ia32_sigcode structure from ia32_sigtramp.c to ia32_signal.c. It's a bit excessive to have it in a file of its own.	2005-09-10 02:12:49 +00:00
marcel	219ddde5ec	Remove redundant $FreeBSD$	2005-09-10 01:13:33 +00:00
marcel	e087d7c1f4	Change the High FP lock from a sleep lock to a spin lock. We can take the lock from interrupt context, which causes an implicit lock order reversal. We've been using the lock carefully enough that making it a spin lock should not be harmful.	2005-09-09 19:18:36 +00:00
marcel	9a432bff04	Milestone: enable SMP by default.	2005-09-05 21:36:28 +00:00
marcel	b58a5b21d1	o In pmap_remove_pte: always invalidate the page. Previously the page was not invalidated if the PTE was not actually being removed. In an UP kernel this didn't cause problems, because the new mapping would preempt the old one. In an SMP kernel this could lead to the use of stale translations when processes move between CPUs at the "right" moment. This fixes the last of the obvious SMP problems and it should be safe to enable SMP by default now. o In pmap_remove_pte: minor code refactoring to avoid duplication. o Test all PTE pointers against NULL. Don't use implicit boolean tests.	2005-09-05 21:32:02 +00:00
marcel	1a76dde0ef	o s/vhpt_size/pmap_vhpt_log2size/g o s/vhpt_base/pmap_vhpt_base/g o s/vhpt_bucket/pmap_vhpt_bucket/g o Declare the above in <machine/pmap.h> o Move the vm.stats.vhpt.* sysctls to machdep.vhpt.* o Create a tunable machdep.vhpt.log2size, with corresponding sysctl. The tunable allows the user to specify the VHPT size from the loader. o Don't keep track of the number of PTEs in the VHPT. Calculate the population when necessary by iterating the buckets and summing up the length of the buckets. o Don't perform the tpa instruction with a bucket lock held. The instruction can (theoretically) fault and locking is not needed.	2005-09-03 23:53:50 +00:00
marcel	516e4da61f	Fix collision chain termination checks. The result of IA64_PHYS_TO_RR7 is never 0, so one cannot test for a NULL pointer after a physical address is translated into a virtual pointer with said macro. Instead, keep the physical address around and test it against 0. Note that this obviously implies that a PTE can never be allocated at physical address 0. This isn't exactly guaranteed, but hasn't been a problem so far. We test the physical address against 0 for as long as the ia64 port exists...	2005-09-03 19:43:15 +00:00
alc	39788de49e	Pass a value of type vm_prot_t to pmap_enter_quick() so that it can determine whether the mapping should permit execute access.	2005-09-03 18:20:20 +00:00
stefanf	78a1b1beb4	Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename it to __MINSIGSTKSZ. Define MINSIGSTKSZ in <sys/signal.h>. This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN in <pthread.h> (soon <limits.h>) without having to include the whole <sys/signal.h> header. Discussed with: bde	2005-08-20 16:44:41 +00:00
marcel	3b0e65cd6d	Remove the execute permission for stacks.	2005-08-14 23:17:59 +00:00
marcel	05cc3bf3e7	o s/pmap_lpte_/pmap_/g o Remove pmap_is_referenced(). It was already compiled-out.	2005-08-13 21:16:38 +00:00
marcel	0bc686305b	Fix the problem with the IPI for the lazy context switching of the high FP registers. It was not that the IPI got lost due to the perceived unreliability of the IPI delivery, but rather that the IPI was not assigned a vector (ugh). Sending a 0 vector to a CPU results in a stray external interrupt. Add a KASSERT to ipi_send() to catch this. The initialization of the IPIs could be better, but it's not at all sure what the future of the code is. Avoid wasting a lot of time on something that is going to be rewritten anyway.	2005-08-13 21:08:32 +00:00
marcel	c96864a4b2	Improve SMP support: o Allocate a VHPT per CPU. The VHPT is a hash table that the CPU uses to look up translations it can't find in the TLB. As such, the VHPT serves as a level 1 cache (the TLB being a level 0 cache) and best results are obtained when it's not shared between CPUs. The collision chain (i.e. the hash bucket) is shared between CPUs, as all buckets together constitute our collection of PTEs. To achieve this, the collision chain does not point to the first PTE in the list anymore, but to a hash bucket head structure. The head structure contains the pointer to the first PTE in the list, as well as a mutex to lock the bucket. Thus, each bucket is locked independently of each other. With at least 1024 buckets in the VHPT, this provides for sufficiently fine-grained locking to make the solution scalable to large SMP machines. o Add synchronisation to the lazy FP context switching. We do this with a separate per-thread lock. On SMP machines the lazy high FP context switching without synchronisation caused inconsistent state, which resulted in a panic. Since the use of the high FP registers is not common, it's possible that races exist. The ia64 package build has proven to be a good stress test, so this will get plenty of exercise in the near future. o Don't use the local ID of the processor we want to send the IPI to as the argument to ipi_send(). Use the struct pcpu pointer instead. The reason for this is that IPI delivery is unreliable. It has been observed that sending an IPI to a CPU causes it to receive a stray external interrupt. As such, we need a way to make the delivery reliable. The intended solution is to queue requests in the target CPU's per-CPU structure and use a single IPI to inform the CPU that there's a new entry in the queue. If that IPI gets lost, the CPU can check its queue at any convenient time (such as for each clock interrupt). 
This also allows us to send requests to a CPU without interrupting it, if such would be beneficial. With these changes SMP is almost working. There are still some random process crashes and the machine can hang due to having the IPI lost that deals with the high FP context switch. The overhead of introducing the hash bucket head structure results in a performance degradation of about 1% for UP (extra pointer indirection). This is surprisingly small and is offset by gaining reasonably good, scalable SMP support.	2005-08-06 20:28:19 +00:00
marcel	6d02e606c3	Reduce the default MAXCPU from 16 to 4. This is in preparation of allocating a VHPT per CPU. Since we don't yet know how many CPUs are actually in the system at the time we need to allocate the VHPTs, we allocate for MAXCPU processors. This can result in a lot of wasted space for 2-way machines. So, for now, limit MAXCPU to something smaller until we have something more dynamic.	2005-08-06 19:59:23 +00:00
marcel	540bfa469b	For ia64_ptc_{e,g,ga,l}(), use instruction serialization. We typically don't know what the TLB described and need to assume that it affects the fetching of instructions.	2005-08-06 19:54:31 +00:00
jeff	4a761caec7	- Add support for saving stack traces and displaying them via printf(9) and KTR. Contributed by: Antoine Brodin <antoine.brodin@laposte.net> Concept code from: Neal Fachan <neal@isilon.com>	2005-08-03 04:27:40 +00:00
jhb	c7383aebd6	Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct. MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm	2005-07-15 18:17:59 +00:00
kensmith	2674ece2c5	Add recently invented COMPAT_FREEBSD5 option. MFC after: 3 days	2005-07-14 15:39:06 +00:00
davidxu	bc8b519d0f	Validate that the value written into {FS,GS}.base is a canonical address. Writing a non-canonical address can cause a kernel panic; restricting base values to 0..VM_MAXUSER_ADDRESS ensures that only canonical values get written to the registers. Reviewed by: peter, Joseph Koshy < joseph.koshy at gmail dot com > Approved by: re (scottl)	2005-07-10 23:31:11 +00:00
marcel	9c552f7c01	Enhance ia64_flush_dirty() to handle the case in which td != curthread. This case is triggered with ptrace(2) and the PT_SETREGS function. Change the return type of the function to int so that errors can be passed on to the caller. Approved by: re (scottl)	2005-07-05 17:12:18 +00:00
marcel	9e64e57e54	Implement function calls from within DDB on ia64. On ia64 a function pointer doesn't point to the first instruction of that function, but rather to a descriptor. The descriptor has the address of the first instruction, as well as the value of the global pointer. The symbol table doesn't know anything about descriptors, so if you lookup the name of a function you get the address of the first instruction. The cast from the address, which is the result of the symbol lookup, to a function pointer as is done in db_fncall is therefore invalid. Abstract this detail behind the DB_CALL macro. By default DB_CALL is defined as db_fncall_generic, which yields the old behaviour. On ia64 the macro is defined as db_fncall_ia64, in which a descriptor is constructed to yield a valid function pointer. While here, introduce DB_MAXARGS. DB_MAXARGS replaces the existing (local) MAXARGS. The DB_MAXARGS macro can be defined by platforms to create a convenient maximum. By default this will be the legacy 10. On ia64 we define this macro to be 8, for 8 is the maximum number of arguments that can be passed in registers. This avoids having to implement spilling of arguments on the memory stack. Approved by: re (dwhite)	2005-07-02 23:52:37 +00:00
marcel	696ddfbe75	Fix a buglet that was present in the ia64 code and that got inherited by amd64 and i386: For buffered writes we collect data and write it out a ${DEV_BSIZE}-sized block at a time. The fragsz variable is used to keep track of how much data we have collected in the buffer so far and it's reset to zero immediately after writing a block to the dump device. When the last, possibly partially filled buffer is flushed, we didn't reset fragsz to 0 and as such it would stop reflecting reality. Since we currently only need to do buffered writes once, this isn't a problem. However, when kernel dumps are made by hand (say by calling doadump from within DDB), the improperly cleared state from the first call to dumpsys causes the next call to dumpsys to create an invalid core file. This change resets fragsz after flushing the partially filled buffer so that it fixes the two problems at once. Approved by: re (scottl)	2005-07-02 19:57:31 +00:00
peter	921b3c5ee4	Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work. ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_regs.c: vary the format of proc/XXX/regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets. IA64 has got stubs for ia32_reg.c. Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb to pacify it over conflicting formats of ld-elf.so.1. Approved by: re	2005-06-30 07:49:22 +00:00
marcel	b3e8712f74	Handle B-unit break instructions. The break.b is unique in that the immediate is not saved by the architecture. Any of the break.{mifx} instructions have their immediate saved in cr.iim on interruption. Consequently, when we handle the break interrupt, we end up with a break value of 0 when it was a break.b. The immediate is important because it distinguishes between different uses of the break, which are defined by the runtime specification. The bottom line is that when the GNU debugger replaces a B-unit instruction with a break instruction in the inferior, we would not send the process a SIGTRAP when we encounter it, because the value is not one we recognize as a debugger breakpoint. This change adds logic to decode the bundle in which the break instruction lives whenever the break value is 0. The assumption is that it's a break.b and we fetch the immediate directly out of the instruction. If the break instruction was not a break.b, but any of break.{mifx} with an immediate of 0, we would be doing unnecessary work. But since a break 0 is invalid, this is not a problem and it will still result in a SIGILL being sent to the process. Approved by: re (scottl)	2005-06-27 23:51:38 +00:00
marcel	d34460ded9	Replace the existing copyright notice with my own. Over the years I've changed this file so much that it's equivalent to a rewrite, and I'm not talking about any of the cosmetic changes of course. Approved by: re (scottl)	2005-06-27 23:34:35 +00:00
marcel	d51343ed23	Cosmetic: s/u_int64_t/uint64_t/g Approved by: re (scottl)	2005-06-27 23:29:06 +00:00
obrien	7af4f5af38	Add .cvsignore files just like in sys/<arch>/compiled; this keeps CVS from questioning kernel config files not in CVS. Approved by: re (kensmith)	2005-06-20 16:52:59 +00:00
marcel	433bf57177	Define IPI_PREEMPT. Update a nearby comment while I'm here.	2005-06-12 19:03:01 +00:00
alc	2d109601cb	Introduce a procedure, pmap_page_init(), that initializes the vm_page's machine-dependent fields. Use this function in vm_pageq_add_new_page() so that the vm_page's machine-dependent and machine-independent fields are initialized at the same time. Remove code from pmap_init() for initializing the vm_page's machine-dependent fields. Remove stale comments from pmap_init(). Eliminate the Boolean variable pmap_initialized from the alpha, amd64, i386, and ia64 pmap implementations. Its use is no longer required because of the above changes and earlier changes that result in physical memory that is being mapped at initialization time being mapped without pv entries. Tested by: cognet, kensmith, marcel	2005-06-10 03:33:36 +00:00
jkoshy	1d3209ab83	MFP4: - Implement sampling modes and logging support in hwpmc(4). - Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code. - New pmcstat(8) options: -E (exit time counts) -W (counts every context switch), -R (print log file). - pmc(3) API changes, improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events. - bug fixes & documentation.	2005-06-09 19:45:09 +00:00
marcel	bb14b518f7	Create nexus in configure_first() instead of in configure(). This makes sure that sysinit tasks that run after configure_first(), but before configure() have a nexus to hang devices off.	2005-05-29 23:44:22 +00:00
marcel	1ca3e5dd43	Call cninit_finish() in configure_final().	2005-05-29 22:48:41 +00:00
nyan	0fce92f5c4	Remove bus_{mem,p}io.h and related code for a micro-optimization on i386 and amd64. The optimization is trivial on recent machines. Reviewed by: -arch (imp, marcel, dfr)	2005-05-29 04:42:30 +00:00
nyan	7d8da118c1	- Move bus dependent defines to {isa,cbus}_dmareg.h. - Use isa/isareg.h rather than <arch>/isa/isa.h. Tested on: i386, pc98	2005-05-14 10:14:56 +00:00
marcel	6352eca8c5	Don't define _MACHINE_BUS_MEMIO_H_ nor _MACHINE_BUS_PIO_H_.	2005-05-10 02:59:24 +00:00
davidxu	2155a04472	Change cpu_set_kse_upcall to a more generic style so we can reuse it in other code. Add cpu_set_user_tls and use it to tweak the user register and set up user TLS. I once wanted to merge it into cpu_set_kse_upcall, but since cpu_set_kse_upcall is also used by M:N threads, which may not need this feature, I wrote a separate cpu_set_user_tls.	2005-04-23 02:32:32 +00:00
marcel	2bd4b7b50d	Sanity the RTC code: o Remove the clock interface. Not only does it conflict with the MI version when device genclock is added to the kernel, it was also not possible to have more than 1 clock device. This of course would have been a problem if we actually had more than 1 clock device. In short: we don't need a clock interface and if we do eventually, we should be using the MI one. o Rewrite inittodr() and resettodr() to take into account that: 1) We use the EFI interface directly. 2) time_t is 64-bit and we do need to make sure we can determine leap years from year 2100 and on. Add a nice explanation of where leap years come from and why. 3) This rewrite happened in 2005 so any date prior to 1/1/2005 (either M/D/Y or D/M/Y) is bogus. Reprogram the EFI clock with 1/1/2005 in that case. 4) The EFI clock has a high probability of being correct, so only (further) correct the EFI clock when the file system time is larger. That should never happen in a time-synchronised world. Complain when EFI lost 2 days or more. Replace the copyright notice now that I (pretty much) rewrote all of this file.	2005-04-22 05:04:58 +00:00
marcel	4dd49b3b66	Add empty header (except for the multiple-inclusion protection) to get hwpmc(4) to compile on this platform.	2005-04-20 18:44:53 +00:00
imp	b1662f9d0f	Break out the definition of bus_space_{tag,handle}_t and a few other types into _bus.h to help with name space pollution from including all of bus.h. In a few days, I'll commit changes to the MI code to take advantage of this separation (after I've made sure that these changes don't break anything in the main tree; I've tested in my trees, but you never know...). Suggested by: bde (in 2002 or 2003 I think) Reviewed in principle by: jhb	2005-04-18 21:45:34 +00:00
marcel	691c2c574b	Add a kpte command to DDB. It dumps the PTE of a KVA. This helps to analyze faults and TLB/VHPT inconsistencies.	2005-04-16 23:38:32 +00:00
marcel	5175738443	Return better "error" values for UWX_BOTTOM and UWX_ABI_FRAME in unw_step(). Both errors denote the end of a stack trace (i.e. no prior frame), but are otherwise not error conditions. Have db_trace() return 0 when the trace ends due to one of these return codes as they are really normal termination conditions. This change especially improves the output of the "show thread" command in DDB when there are threads in fork_trampoline() and previously db_trace() would return an error, causing the show command to emit '***'.	2005-04-16 05:38:59 +00:00
marcel	81de31b855	Initialize curthread before we save the APs MCA state. Saving the MCA state requires a spin lock, which requires a valid curthread. This change allows SMP kernels to boot into multi-user again. While here, update the copyright notice and use __FBSDID for the revision string.	2005-04-15 00:21:23 +00:00
jhb	f9da7305b5	Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic operations in some places and simple non-per CPU math in others.	2005-04-12 23:18:54 +00:00
marcel	d752e216d4	Dot the i's: 1 Move the debug.clock_adjust_* sysctls to debug.clock.adjust_* to make it easier to get only the clock statistics. 2 Make the sysctls read-only [suggested by Marius]. 3 When determining the new clock adjustment, we checked for an error either larger than 12.5% or smaller than 12.5%. We left out an error of exactly 12.5%. For errors larger than 12.5% we adjust the clock reload value in such a way that the next clock interrupt would be early (as in premature). For errors less than 12.5% we stopped the adjustment. The current algorithm doesn't benefit from excluding an error of exactly 12.5%. Change the code to stop adjusting the clock if the error is not larger than 12.5% [suggested by Marius]. Discussed with: marius@	2005-04-12 18:50:57 +00:00
jhb	41cadaa11e	Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any effect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case. Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fixup the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch. This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example). Reviewed by: grehan, cognet, arch@, others Tested on: i386, alpha, sparc64, powerpc, arm, possibly more	2005-04-04 21:53:56 +00:00
sobomax	cf0b6b591e	Add USB Communication Device Class Ethernet driver. Originally written for FreeBSD and based on aue(4), it was picked up by OpenBSD, then ported from OpenBSD to NetBSD; finally, the NetBSD version, merged with the original one, goes into FreeBSD. Obtained from: http://www.gank.org/freebsd/cdce/ NetBSD OpenBSD	2005-03-22 14:52:40 +00:00
njl	d4583f618f	s/SLIST/STAILQ to catch up with changes to resource lists. Missed by: imp	2005-03-20 06:55:49 +00:00
murray	f7c0ee4068	Add a comment to note that pseudo-device bpf is required for DHCP. This is mentioned in the Handbook but it is not as obvious to new users why bpf is needed compared to the other largely self-explanatory items in GENERIC. PR: conf/40855 MFC after: 1 week	2005-03-18 15:24:00 +00:00
iedowse	3c4f225bf8	Split configure() into 3 separate steps like we do on other architectures. This makes it possible to insert hooks before and after the device attachment step. Tested thanks to: marcel	2005-03-18 09:45:43 +00:00
scottl	7be505a035	Refactor the bus_dma header files so that the interface is described in sys/bus_dma.h instead of being copied in every single arch. This slightly reorders a flag that was specific to AXP and thus changes the ABI there. The interface still relies on bus_space definitions found in <machine/bus.h> so it cannot be included on its own yet, but that will be fixed at a later date. Add an MD <machine/bus_dma.h> for every arch for consistency and to allow for future MD augmentation of the API. sparc64 makes heavy use of this right now due to its different bus_dma implementation.	2005-03-14 16:46:28 +00:00
scottl	af7da441d2	Remove dead code.	2005-03-07 02:18:52 +00:00
joerg	c85a3e95f7	netchild's mega-patch to isolate compiler dependencies into a central place. This moves the dependency on GCC's and other compiler's features into the central sys/cdefs.h file, while the individual source files can then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42. By now, GCC and ICC (the Intel compiler) have been actively tested on IA32 platforms by netchild. Extension to other compilers is supposed to be possible, of course. Submitted by: netchild Reviewed by: various developers on arch@, some time ago	2005-03-02 21:33:29 +00:00
marcel	2307c11d0f	Make sure fpswa_iface equals NULL when bootinfo.bi_fpswa equals 0. We need to be able to test for the (possible) non-existence of the FPSWA code. PR: ia64/77591 Submitted by: Christian Kandeler (christian dot kandeler at hob dot de) MFC after: 1 day	2005-03-02 20:29:04 +00:00
wes	92310fbdd7	Attempt to doff the pointy hat: implement 'hw.realmem' on remaining architectures. Pointed out by O'Brien, ScottL via email. Reviewed by: obrien (various)	2005-03-01 21:55:27 +00:00
delphij	8f28e827c6	Remove acpi_perf from {ARCH}/conf/NOTES, to make tinderbox happy. Reported by: tinderbox Inspired by: acpi_perf build structure removal commit	2005-02-25 07:10:37 +00:00
ru	6cc6926066	Use a common multi-inclusion protection, and add such a protection to alpha/include/exec.h.	2005-02-19 21:16:48 +00:00
marcel	43ebe5600f	s/descr/oid_descr/	2005-02-09 04:48:23 +00:00
phk	749e4957d9	Since we are quite unlikely to ever face another platform which uses the i8237 without trying to emulate the PC architecture move the register definitions for the i8237 chip into the central include file for the chip, except for the PC98 case which is magic. Add new isa_dmatc() function which tells us as cheaply as possible if the terminal count has been reached for a given channel.	2005-02-06 13:46:39 +00:00
njl	2958530007	Finish the job of sorting all includes and fix the build by including malloc.h before proc.h on sparc64. Noticed by das@ Compiled on: alpha, amd64, i386, pc98, sparc64	2005-02-06 01:55:08 +00:00
njl	7721d27e92	Build cpufreq and acpi_perf on platforms that are likely to be able to use them.	2005-02-05 21:01:09 +00:00
marcel	f79e8556cb	Include sys/bus.h before sys/cpu.h. The latter needs device_t.	2005-02-04 06:38:58 +00:00
njl	54a88fdbee	Add an implementation of cpu_est_clockrate(9). This function estimates the current clock frequency for the given CPU id in units of Hz.	2005-02-04 05:32:56 +00:00

1953 Commits