freebsd-dev

Author	SHA1	Message	Date
David Xu	66e1c26dba	Implement casuword32, compare and set user integer. Thanks to Marcel Moolenaar, who wrote the IA64 version of casuword32.	2006-08-28 02:28:15 +00:00
Alan Cox	b554f899bd	Eliminate unused definitions. (They came from NetBSD.) Discussed with: cognet, grehan, marcel	2006-08-25 23:51:11 +00:00
John Baldwin	7e9f73f3ed	First pass at allowing memory to be mapped using cache modes other than WB (write-back) on x86 via control bits in PTEs and PDEs (including making use of the PAT MSR). Changes include: - A new pmap_mapdev_attr() function for amd64 and i386 which takes an additional parameter (relative to pmap_mapdev()) specifying the cache mode for this mapping. Note that on amd64 only WB mappings are done with the direct map, all other modes result in a private mapping. - pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached) mappings rather than WB. Previously we relied on the BIOS setting up MTRR's to enforce memio regions being treated as UC. This might make hw.cbb_start_memory unnecessary in some cases now for example. - A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places that used pmap_mapdev() to map non-device memory (such as ACPI tables) to do so using WB as before. - A new pmap_change_attr() function for amd64 and i386 that changes the caching mode for a range of KVA. Reviewed by: alc	2006-08-11 19:22:57 +00:00
Alan Cox	78985e424a	Complete the transition from pmap_page_protect() to pmap_remove_write(). Originally, I had adopted sparc64's name, pmap_clear_write(), for the function that is now pmap_remove_write(). However, this function is more like pmap_remove_all() than like pmap_clear_modify() or pmap_clear_reference(), hence, the name change. The higher-level rationale behind this change is described in src/sys/amd64/amd64/pmap.c revision 1.567. The short version is that I'm trying to clean up and fix our support for execute access. Reviewed by: marcel@ (ia64)	2006-08-01 19:06:06 +00:00
Marcel Moolenaar	302981e72a	Remove sio(4) and related options from MI files to amd64, i386 and pc98 MD files. Remove nodevice and nooption lines specific to sio(4) from ia64, powerpc and sparc64 NOTES. There were no such lines for arm yet. sio(4) is usable on less than half the platforms, not counting a future mips platform. Its presence in MI files is therefore increasingly becoming a burden.	2006-07-29 18:38:54 +00:00
John Baldwin	cb76d9b05c	Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is now back to just being an argument count.	2006-07-28 20:22:58 +00:00
John Baldwin	af5bf12239	Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.	2006-07-28 19:05:28 +00:00
John Baldwin	22ea1bc57a	Unify the checking for lock misbehavior in the various syscall() implementations and adjust some of the checks while I'm here: - Add a new check to make sure we don't return from a syscall in a critical section. - Add a new explicit check before userret() to make sure we don't return with any locks held. The advantage here is that we can include the syscall number and name in syscall() whereas that info is not available in userret(). - Drop the mtx_assert()'s of sched_lock and Giant. They are replaced by the more general checks just added. MFC after: 2 weeks	2006-07-27 22:32:30 +00:00
John Baldwin	0c5d1dbd43	Add KTR_SYSC tracing to the syscall() implementations that didn't have it yet. MFC after: 1 week	2006-07-27 21:25:50 +00:00
John Baldwin	00f1856905	Add missing ptrace(2) system-call stops to various syscall() implementations. MFC after: 1 week	2006-07-27 19:50:16 +00:00
Marcel Moolenaar	4b8d8ccc23	Move default GEOM classes from files.ia64, where they were marked standard, to the DEFAULTS file.	2006-07-17 20:02:51 +00:00
John Baldwin	19e9205a23	Simplify the pager support in DDB. Allowing different db commands to install custom pager functions didn't actually happen in practice (they all just used the simple pager and passed in a local quit pointer). So, just hardcode the simple pager as the only pager and make it set a global db_pager_quit flag that db commands can check when the user hits 'q' (or a suitable variant) at the pager prompt. Also, now that it's easy to do so, enable paging by default for all ddb commands. Any command that wishes to honor the quit flag can do so by checking db_pager_quit. Note that the pager can also be effectively disabled by setting $lines to 0. Other fixes: - 'show idt' on i386 and pc98 now actually checks the quit flag and terminates early. - 'show intr' now actually checks the quit flag and terminates early.	2006-07-12 21:22:44 +00:00
Matt Jacob	086ba9f74f	Make the firmware assist driver resident in preparation for isp using it.	2006-07-09 16:40:31 +00:00
Bruce Evans	9366bb574f	Fixed FP_R*. fp{get_set}round() apparently never worked on ia64, since the alpha values were used and are quite different. Fixed some style bugs by copying from the i386 version where it is better.	2006-07-05 06:10:21 +00:00
Marcel Moolenaar	559adb10ad	Partial support for branch long emulation. This only emulates the branch long jump and not the branch long call. Support for that is forthcoming.	2006-06-29 19:59:18 +00:00
Alan Cox	9b123ca12a	Make several changes to pmap_enter_quick_locked(): 1. Make the caller responsible for performing pmap_install(). This reduces the number of times that pmap_install() is performed by pmap_enter_object() from twice per page to twice overall. 2. Don't block if pmap_find_pte() is unable to allocate a PTE. If it did block, then it might wind up mapping a cache page. Specifically, if pmap_enter_quick_locked() slept when called from pmap_enter_object(), the page daemon could change an active or inactive page into a cache page just before it was to be mapped. 3. Bail out of pmap_enter_quick_locked() if pv entries aren't plentiful. In other words, don't force the allocation of a pv entry if they aren't readily available. Reviewed by: marcel@	2006-06-27 05:05:05 +00:00
Sergey Babkin	d81175c738	Backed out the change by request from rwatson. PR: kern/14584	2006-06-26 22:03:22 +00:00
Sergey Babkin	7a799f1ef0	The common UID/GID space implementation. It has been discussed on -arch in 1999, and there are changes to the sysctl names compared to PR, according to that discussion. The description is in sys/conf/NOTES. Lines in the GENERIC files are added in commented-out form. I'll attach the test script I've used to PR. PR: kern/14584 Submitted by: babkin	2006-06-25 18:37:44 +00:00
Marcel Moolenaar	0705941372	Update to SDM 2.2: o Add tf (test feature) instruction, o Add vmsw (VM switch) instruction. While here, update copyright. MFC after: 1 week	2006-06-24 19:21:11 +00:00
Marcel Moolenaar	a4b606cd18	Sync up with SDM 2.1: o Add nop/hint formats F16, I18, M48 and X5, o Add format M47 for ptc.e, o Add hint instruction, o Fix decoding of cmp8xchg16.	2006-06-24 01:19:52 +00:00
Marcel Moolenaar	99ab1812b9	Identify the dual-core Montecito. MFC after: 3 days	2006-06-22 00:56:58 +00:00
Alexander Leidinger	28a3ae7f88	Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's an explicit comment that it's needed for the linuxolator. This is not the case anymore. For all other architectures there was only a "KEEP THIS". I'm (and other people too) running a COMPAT_43-less kernel since it's not necessary anymore for the linuxolator. Roman is running such a kernel for a longer time. No problems so far. And I doubt other (newer than ia32 or alpha) architectures really depend on it. This may result in a small performance increase for some workloads. If the removal of COMPAT_43 results in a not working program, please recompile it and all dependencies and try again before reporting a problem. The only place where COMPAT_43 is needed (as in: does not compile without it) is in the (outdated/not usable since too old) svr4 code. Note: this does not remove the COMPAT_43TTY option. Nagging by: rdivacky	2006-06-15 19:58:53 +00:00
Stephan Uphoff	2053c12705	Remove mpte optimization from pmap_enter_quick(). There is a race with the current locking scheme and removing it should have no measurable performance impact. This fixes page faults leading to panics in pmap_enter_quick_locked() on amd64/i386. Reviewed by: alc,jhb,peter,ps	2006-06-15 01:01:06 +00:00
Warner Losh	78878cef94	Add the ability to subset the devices that UART pulls in. This allows the arm to compile without all the extras that don't appear, at least not in the flavors of ARM I deal with. This helps us save about 100k. If I've botched the available devices on a platform, please let me know and I'll correct ASAP.	2006-06-12 04:21:50 +00:00
Alan Cox	ce142d9ec0	Introduce the function pmap_enter_object(). It maps a sequence of resident pages from the same object. Use it in vm_map_pmap_enter() to reduce the locking overhead of premapping objects. Reviewed by: tegge@	2006-06-05 20:35:27 +00:00
Warner Losh	35988e2d46	EISA bus ia64 systems don't exist in reality. I'm told they may exist in theory, but that it was OK to remove from NOTES. OK'd by: marcel	2006-06-02 04:46:26 +00:00
Alan Cox	98c8f52baf	Correct a syntax error in the previous revision.	2006-06-01 19:23:45 +00:00
Mike Silbersack	f25d341cfb	After much discussion with mjacob and scottl, change bus_dmamem_alloc so that it just warns the user with a printf when it misaligns a piece of memory that was requested through a busdma tag. Some drivers (such as mpt, and probably others) were asking for alignments that could not be satisfied, but as far as driver operation was concerned, that did not matter. In the theory that other drivers will fall into this same category, we agreed that panicking or making the allocation fail will cause more hardship than is necessary. The printf should be sufficient motivation to get the driver glitch fixed.	2006-06-01 04:49:29 +00:00
Matt Jacob	87ba3ce6f4	Since it's to all intents and purposes identical code to amd64 && i386, match the recent changes to bus_dmamem_alloc here.	2006-05-31 00:38:53 +00:00
Marcel Moolenaar	4f0289b073	Unbreak after previous commit. While here, improve function naming consistency by s/ssc/ssc_/g.	2006-05-27 17:52:08 +00:00
Poul-Henning Kamp	05c3592e13	Update to new console api.	2006-05-26 18:25:34 +00:00
Marius Strobl	8df071afd9	Add le(4). I could actually only test it on alpha, i386 and sparc64 but given that this includes the more problematic platforms I see no reason why it shouldn't also work on amd64 and ia64.	2006-05-17 20:45:45 +00:00
Poul-Henning Kamp	c40da00ca3	Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.	2006-05-16 14:37:58 +00:00
Marcel Moolenaar	d793542b06	Fix braino in previous commit: Don't redefine OID_AUTO to something not equal to -1, or at all for that matter.	2006-05-11 22:49:31 +00:00
Poul-Henning Kamp	99ab8292c7	Remove more straggling CPU_ macro references	2006-05-11 17:53:26 +00:00
Poul-Henning Kamp	5405ab4889	Clean out sysctl machdep.* related defines. The cmos clock related stuff should really be in MI code.	2006-05-11 17:29:25 +00:00
Marcel Moolenaar	64220a7e28	Rewrite of puc(4). Significant changes are: o Properly use rman(9) to manage resources. This eliminates the need for puc-specific hacks to rman. It also allows devinfo(8) to be used to find out the specific assignment of resources to serial/parallel ports. o Compress the PCI device "database" by optimizing for the common case and to use a procedural interface to handle the exceptions. The procedural interface also generalizes the need to setup the hardware (program chipsets, program clock frequencies). o Eliminate the need for PUC_FASTINTR. Serdev devices are fast by default and non-serdev devices are handled by the bus. o Use the serdev I/F to collect interrupt status and to handle interrupts across ports in priority order. o Sync the PCI device configuration to include devices found in NetBSD and not yet merged to FreeBSD. o Add support for Quatech 2, 4 and 8 port UARTs. o Add support for a couple dozen Timedia serial cards as found in Linux.	2006-04-28 21:21:53 +00:00
Marcel Moolenaar	c4830f0477	In nexus_teardown_intr(), actually remove the handler. MFC after: 1 day	2006-04-21 16:12:28 +00:00
Warner Losh	80837e3924	Set the rid of the resource obtained from rman_reserve_resource.	2006-04-20 04:18:30 +00:00
Alan Cox	826c207263	Retire pmap_track_modified(). We no longer need it because we do not create managed mappings within the clean submap. To prevent regressions, add assertions blocking the creation of managed mappings within the clean submap. Reviewed by: tegge	2006-04-12 04:22:52 +00:00
Marcel Moolenaar	d6d31e0900	Improve handling of IPI_STOP: o use atomic operations to fiddle with stopped_cpus and started_cpus. o disable interrupts while we're waiting to be started. o remove logic relating to cpustop_restartfunc as it's not used.	2006-04-03 23:56:40 +00:00
Marcel Moolenaar	bfcdefd8aa	Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the PCB in which the context of stopped CPUs is stored. To access this PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The definition, when present, lives in <machine/kdb.h> and abstracts where MD code saves the context. Define KDB_STOPPEDPCB on i386, amd64, alpha and sparc64 in accordance to previous code.	2006-04-03 22:51:47 +00:00
Peter Wemm	b9eee07e36	Remove the unused sva and eva arguments from pmap_remove_pages().	2006-04-03 21:16:10 +00:00
John Baldwin	06ad42b2f7	Close some races between procfs/ptrace and exit(2): - Reorder the events in exit(2) slightly so that we trigger the S_EXIT stop event earlier. After we have signalled that, we set P_WEXIT and then wait for any processes with a hold on the vmspace via PHOLD to release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops to zero. - Change proc_rwmem() to require that the process being read from has its vmspace held via PHOLD by the caller and get rid of all the junk to screw around with the vmspace reference count as we no longer need it. - In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it doesn't exist. - Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem() to clear an earlier single-step simulated via a breakpoint). We only do one to avoid races. Also, by making the EINVAL error for unknown requests be part of the default: case in the switch, the various switch cases can now just break out to return which removes a _lot_ of duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug where a LWP ptrace command could return EINVAL with the proc lock still held. - Changed the locking for ptrace_single_step(), ptrace_set_pc(), and ptrace_clear_single_step() to always be called with the proc lock held (it was a mixed bag previously). Alpha and arm have to drop the lock while they mess around with breakpoints, but other archs avoid extra lock release/acquires in ptrace(). I did have to fix a couple of other consumers in kern_kse and a few other places to hold the proc lock and PHOLD. Tested by: ps (1 mostly, but some bits of 2-4 as well) MFC after: 1 week	2006-02-22 18:57:50 +00:00
John Baldwin	414c4ab4c5	Fix the hw.realmem sysctl. The global realmem variable is a count of pages, not a count of bytes. The sysctl handler for hw.realmem already uses ctob() to convert realmem from pages to bytes. Thus, on archs that were storing a byte count in the realmem variable, hw.realmem was inflated. Reported by: Valerio daelli valerio dot daelli at gmail dot com (alpha) MFC after: 3 days	2006-02-14 14:50:11 +00:00
Marcel Moolenaar	e13946c127	Correct the spinlock nesting of the idle thread of the APs before we save the MCA state of the AP. Saving the MCA state of the AP requires us to allocate memory, which uses sleep locks. Now that we correct the spinlock nesting of the AP without having schedlock, avoid calling spinlock_exit(). Instead call critical_exit() and manually clear the MD spinlock count. MFC after: 3 days	2006-02-11 19:55:18 +00:00
Poul-Henning Kamp	eb2da9a51f	Simplify system time accounting for profiling. Rename struct thread's td_sticks to td_pticks, we will need the other name for more appropriately named use shortly. Reduce it from uint64_t to u_int. Clear td_pticks whenever we enter the kernel instead of recording its value as reference for userret(). Use the absolute value of td->pticks in userret() and eliminate third argument.	2006-02-08 08:09:17 +00:00
Poul-Henning Kamp	5b1a8eb397	Modify the way we account for CPU time spent (step 1) Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers. For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.) On slower machines, the avoided multiplications to normalize timestamps at every context switch come out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.	2006-02-07 21:22:02 +00:00
Marcel Moolenaar	f9d7b4d515	Allocate memory for the MCA state information with M_NOWAIT. We can get a MCA event at any moment and it may not be safe to sleep. MFC after: 3 days	2006-02-07 02:02:14 +00:00
Marcel Moolenaar	8c4f6925c4	Remove devices acpi & mem, as they are in defaults already.	2006-02-02 23:41:08 +00:00
Marcel Moolenaar	05157fa0a1	s/DT_IA64_PLT_RESERVE/DT_IA_64_PLT_RESERVE/	2006-01-28 17:58:22 +00:00
Marcel Moolenaar	7ee3d29ed6	o Add missing relocations. o Minor white-space fixups.	2006-01-18 01:45:57 +00:00
Marcel Moolenaar	853b7411b6	s/R_IA64_/R_IA_64_/g as per the ia64 psABI.	2006-01-17 21:03:22 +00:00
Poul-Henning Kamp	d3e64681d6	Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43) to COMPAT_43TTY. Add COMPAT_43TTY to NOTES and */conf/GENERIC Compile tty_compat.c only under the new option. Spit out #warning "Old BSD tty API used, please upgrade." if ioctl_compat.h gets #included from userland.	2006-01-10 09:19:10 +00:00
Warner Losh	d5e61c97a6	By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into param.h. Per request, I've placed these just after the _NO_NAMESPACE_POLLUTION ifndef. I've not renamed anything yet, but may since we don't need the __. Submitted by: bde, jhb, scottl, many others.	2006-01-09 06:05:57 +00:00
Poul-Henning Kamp	8c92c2096d	Use ttyalloc() instead of ttymalloc()	2006-01-04 09:46:20 +00:00
Warner Losh	501755f4f6	Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for each platform. These will be used in the pci code in preference to the complicated #ifdefs we have there now.	2006-01-01 20:59:28 +00:00
Alexander Leidinger	ef39c05baa	MI changes: - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)	2005-12-31 14:39:20 +00:00
Maxim Sobolev	900b28f9f6	Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so. PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>	2005-12-26 21:23:57 +00:00
John Baldwin	b439e431bf	Tweak how the MD code calls the fooclock() methods some. Instead of passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that effect and remove a condition to slightly optimize the non-lapic case. - Change prototype of arm_handler_execute() so that its first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket. Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)	2005-12-22 22:16:09 +00:00
Marcel Moolenaar	757686b115	Make our ELF64 type definitions match standards. In particular this means: o Remove Elf64_Quarter, o Redefine Elf64_Half to be 16-bit, o Redefine Elf64_Word to be 32-bit, o Add Elf64_Xword and Elf64_Sxword for 64-bit entities, o Use Elf_Size in MI code to abstract the difference between Elf32_Word and Elf64_Word. o Add Elf_Ssize as the signed counterpart of Elf_Size. MFC after: 2 weeks	2005-12-18 04:52:37 +00:00
John Baldwin	696effb697	- Cleanup whitespace and extra ()s in vtophys() macros. - Move vtophys() macros next to vtopte() where vtopte() exists to match comments above vtopte(). - Remove references to the alternate address space in the comment above vtopte(). amd64 never had the alternate address space, and i386 lost it prior to PAE support being added. - s/entires/entries/ in comments. Reviewed by: alc	2005-12-06 21:09:01 +00:00
Ruslan Ermilov	224d140293	Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with MACHINE_ARCH and MACHINE). Their purpose was to be able to test in cpp(1), but cpp(1) only understands integer type expressions. Using such unsupported expressions introduced a number of subtle bugs, which were discovered by compiling with -Wundef.	2005-12-06 13:27:21 +00:00
Ruslan Ermilov	44e09d2fa2	Fix -Wundef warnings from compiling GENERIC and LINT kernels of all architectures.	2005-12-06 11:19:37 +00:00
Ruslan Ermilov	6646524f34	- Allow duplicate "machine" directives with the same arguments. - Move existing "machine" directives to DEFAULTS.	2005-11-27 23:17:00 +00:00
John Baldwin	7417e80b4e	Don't enable PUC_FASTINTR by default in the source. Instead, enable it via the DEFAULTS kernel configs. This allows folks to turn that option off in the kernel configs if desired without having to hack the source. This is especially useful since PUC_FASTINTR hangs the kernel boot on my ultra60 which has two uart(4) devices hung off of a puc(4) device. I did not enable PUC_FASTINTR by default on powerpc since powerpc does not currently allow sharing of INTR_FAST with non-INTR_FAST like the other archs.	2005-11-21 20:22:35 +00:00
John Baldwin	d0750fb9b0	Create DEFAULTS files for alpha, ia64, powerpc, and sparc64 and move 'device mem' over from GENERIC to DEFAULTS to be consistent with i386 and amd64. Additionally, on ia64 enable ACPI by default since ia64 requires acpi.	2005-11-21 20:17:46 +00:00
Alan Cox	97a0c226d6	Eliminate pmap_init2(). It's no longer used.	2005-11-20 06:09:49 +00:00
Alan Cox	65336314cf	In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock cannot possibly occur.	2005-11-13 02:17:05 +00:00
Alan Cox	7a35a21e7b	Reimplement the reclamation of PV entries. Specifically, perform reclamation synchronously from get_pv_entry() instead of asynchronously as part of the page daemon. Additionally, limit the reclamation to inactive pages unless allocation from the PV entry zone or reclamation from the inactive queue fails. Previously, reclamation destroyed mappings to both inactive and active pages. get_pv_entry() still, however, wakes up the page daemon when reclamation occurs. The reason being that the page daemon may move some pages from the active queue to the inactive queue, making some new pages available to future reclamations. Print the "reclaiming PV entries" message at most once per minute, but don't stop printing it after the fifth time. This way, we do not give the impression that the problem has gone away. Reviewed by: tegge	2005-11-09 08:19:21 +00:00
Alan Cox	e9cb1037da	Begin and end the initialization of pvzone in pmap_init(). Previously, pvzone's initialization was split between pmap_init() and pmap_init2(). This split initialization was the underlying cause of some UMA panics during initialization. Specifically, if the UMA boot pages were exhausted before the pvzone was fully initialized, then UMA, through no fault of its own, would use an inappropriate back-end allocator leading to a panic. (Previously, as a workaround, we have increased the UMA boot pages.) Fortunately, there is no longer any reason that pvzone's initialization cannot be completed in pmap_init(). Eliminate a check for whether pv_entry_high_water has been initialized or not from get_pv_entry(). Since pvzone's initialization is completed in pmap_init(), this check is no longer needed. Use cnt.v_page_count, the actual count of available physical pages, instead of vm_page_array_size to compute the maximum number of pv entries. Introduce the vm.pmap.pv_entries tunable on alpha and ia64. Eliminate some unnecessary white space. Discussed with: tegge (item #1) Tested by: marcel (ia64)	2005-11-04 18:03:24 +00:00
Alan Cox	fcf67b0496	Remove the remaining spl*() calls. Add some assertions. Eliminate some excessive white space.	2005-11-03 07:51:02 +00:00
Robert Watson	5bb84bc84b	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
Marcel Moolenaar	6739824a02	Remove a stray return statement in the interrupt dispatch function that caused a premature exit after calling a fast interrupt handler and bypassing a much needed critical_exit() and the scheduling of the interrupt thread for non-fast handlers. In short: unbreak :-)	2005-10-30 17:23:01 +00:00
John Baldwin	e0f66ef861	Reorganize the interrupt handling code a bit to make a few things cleaner and increase flexibility to allow various different approaches to be tried in the future. - Split struct ithd up into two pieces. struct intr_event holds the list of interrupt handlers associated with interrupt sources. struct intr_thread contains the data relative to an interrupt thread. Currently we still provide a 1:1 relationship of events to threads with the exception that events only have an associated thread if there is at least one threaded interrupt handler attached to the event. This means that on x86 we no longer have 4 bazillion interrupt threads with no handlers. It also means that interrupt events with only INTR_FAST handlers no longer have an associated thread either. - Renamed struct intrhand to struct intr_handler to follow the struct intr_foo naming convention. This did require renaming the powerpc MD struct intr_handler to struct ppc_intr_handler. - INTR_FAST no longer implies INTR_EXCL on all architectures except for powerpc. This means that multiple INTR_FAST handlers can attach to the same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach to the same interrupt. Sharing INTR_FAST handlers may not always be desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun either. Drivers can always still use INTR_EXCL to ask for an interrupt exclusively. The way this sharing works is that when an interrupt comes in, all the INTR_FAST handlers are executed first, and if any threaded handlers exist, the interrupt thread is scheduled afterwards. This type of layout also makes it possible to investigate using interrupt filters ala OS X where the filter determines whether or not its companion threaded handler should run. - Aside from the INTR_FAST changes above, the impact on MD interrupt code is mostly just 's/ithread/intr_event/'. - A new MI ddb command 'show intrs' walks the list of interrupt events dumping their state. It also has a '/v' verbose switch which dumps info about all of the handlers attached to each event. - We currently don't destroy an interrupt thread when the last threaded handler is removed because it would suck for things like ppbus(8)'s braindead behavior. The code is present, though, it is just under #if 0 for now. - Move the code to actually execute the threaded handlers for an interrupt event into a separate function so that ithread_loop() becomes more readable. Previously this code was all in the middle of ithread_loop() and indented halfway across the screen. - Made struct intr_thread private to kern_intr.c and replaced td_ithd with a thread private flag TDP_ITHREAD. - In statclock, check curthread against idlethread directly rather than curthread's proc against idlethread's proc. (Not really related to intr changes) Tested on: alpha, amd64, i386, sparc64 Tested on: arm, ia64 (older version of patch by cognet and marcel)	2005-10-25 19:48:48 +00:00
Ade Lovett	8d228514fb	Specifically panic() in the case where pmap_insert_entry() fails to get a new pv under high system load where the available pv entries have been exhausted before the pagedaemon has a chance to wake up to reclaim some. Prior to this, the NULL pointer dereference ended up causing secondary panics with rather less than useful resulting tracebacks. Reviewed by: alc, jhb MFC after: 1 week	2005-10-21 19:42:43 +00:00
Poul-Henning Kamp	7423b2b40c	Make ttyconsolemode() call ttsetwater() so that drivers don't have to.	2005-10-16 20:58:22 +00:00
David Xu	9104847f21	1. Change the prototypes of trapsignal and sendsig to use ksiginfo_t *; most changes in MD code are trivial. Before this change, trapsignal and sendsig used discrete parameters; now they use member fields of the ksiginfo_t structure. For sendsig, this change allows us to pass a POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo; it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to the proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to the thread structure to hold all signals delivered to the thread. 5. i386 and amd64 now return POSIX standard si_code; other arches will be fixed. 6. In this sigqueue implementation, the pending signal set is kept as before, and an extra siginfo list holds additional siginfo_t data for signals. Kernel code that uses psignal() still behaves as before: it won't fail even under memory pressure. The only exception is when deleting a signal: we should call sigqueue_delete to remove the signal from the sigqueue rather than SIGDELSET. Currently no kernel code delivers a signal with additional data, so the kernel should be as stable as before. A ksiginfo can carry more information, for example, allowing a signal to be delivered but throwing away siginfo data if there is not enough memory. SIGKILL and SIGSTOP have a fast path in sigqueue_add, because they cannot be caught or masked. The sigqueue() syscall allows user code to queue a signal to a target process; if a resource is unavailable, EAGAIN is returned as the specification says. Just before a thread exits, signal queue memory is freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
Poul-Henning Kamp	2628fdabad	Eliminate need for __RMAN_RESOURCE_VISIBLE Reviewed by: marcel@	2005-10-06 17:39:18 +00:00
Robert Watson	5f419982c2	Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57, osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60, svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81, svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55, svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10, ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58, unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133: Now that Giant is acquired in uprintf() and tprintf(), the caller no longer needs to acquire Giant unless it also holds another mutex that would generate a lock order reversal when calling into these functions. Specifically not backed out is the acquisition of Giant in nfs_socket.c and rpcclnt.c, where local mutexes are held and would otherwise violate the lock order with Giant. This aligns this code more with the eventual locking of ttys. Suggested by: bde	2005-09-28 07:03:03 +00:00
Peter Wemm	add121a476	Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added stubs for ia64 to keep it compiling. These are used by 32 bit apps such as gdb.	2005-09-27 18:04:20 +00:00
John Baldwin	3c2bc2bf26	Add a new atomic_fetchadd() primitive that atomically adds a value to a variable and returns the previous value of the variable. Tested on: i386, alpha, sparc64, arm (cognet) Reviewed by: arch@ Submitted by: cognet (arm) MFC after: 1 week	2005-09-27 17:39:11 +00:00
Robert Watson	84d2b7df26	Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(), as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment). Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout. With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant. NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable. NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change. NB: uprintf()/tprintf() sleeping is generally a bad idea, as is having non-MPSAFE tty code. MFC after: 1 week	2005-09-19 16:51:43 +00:00
Christian S.J. Peron	33cdc78d01	Introduce a kernel config for the Mandatory Access Control framework. This kernel config briefly describes some of the major MAC policies available on FreeBSD. The hope is that this will raise the awareness about MAC and get more people interested. Discussed with: scottl	2005-09-18 03:15:36 +00:00
Alan Cox	ac31d065a6	Eliminate unused definitions.	2005-09-11 20:51:15 +00:00
David E. O'Brien	2a191126de	Canonize the include of acpi.h.	2005-09-11 18:39:03 +00:00
Marcel Moolenaar	8115693121	Merge db_interface.c and db_trace.c into db_machdep.c.	2005-09-10 03:18:51 +00:00
Marcel Moolenaar	216e80c2ba	Move the prototypes of db_md_set_watchpoint(), db_md_clr_watchpoint() and db_md_list_watchpoints() to ddb/ddb.h.	2005-09-10 03:01:25 +00:00
Marcel Moolenaar	464d16ddf0	Move the ia32_sigcode structure from ia32_sigtramp.c to ia32_signal.c. It's a bit excessive to have it in a file of its own.	2005-09-10 02:12:49 +00:00
Marcel Moolenaar	0522a40412	Remove redundant $FreeBSD$	2005-09-10 01:13:33 +00:00
Marcel Moolenaar	87a59250b5	Change the High FP lock from a sleep lock to a spin lock. We can take the lock from interrupt context, which causes an implicit lock order reversal. We've been using the lock carefully enough that making it a spin lock should not be harmful.	2005-09-09 19:18:36 +00:00
Marcel Moolenaar	cca2e0f1cc	Milestone: enable SMP by default.	2005-09-05 21:36:28 +00:00
Marcel Moolenaar	ab870058d7	o In pmap_remove_pte: always invalidate the page. Previously the page was not invalidated if the PTE was not actually being removed. In an UP kernel this didn't cause problems, because the new mapping would preempt the old one. In an SMP kernel this could lead to the use of stale translations when processes move between CPUs at the "right" moment. This fixes the last of the obvious SMP problems and it should be safe to enable SMP by default now. o In pmap_remove_pte: minor code refactoring to avoid duplication. o Test all PTE pointers against NULL. Don't use implicit boolean tests.	2005-09-05 21:32:02 +00:00
Marcel Moolenaar	5280c8c2ab	o s/vhpt_size/pmap_vhpt_log2size/g o s/vhpt_base/pmap_vhpt_base/g o s/vhpt_bucket/pmap_vhpt_bucket/g o Declare the above in <machine/pmap.h> o Move the vm.stats.vhpt.* sysctls to machdep.vhpt.* o Create a tunable machdep.vhpt.log2size, with corresponding sysctl. The tunable allows the user to specify the VHPT size from the loader. o Don't keep track of the number of PTEs in the VHPT. Calculate the population when necessary by iterating the buckets and summing up the length of the buckets. o Don't perform the tpa instruction with a bucket lock held. The instruction can (theoretically) fault and locking is not needed.	2005-09-03 23:53:50 +00:00
Marcel Moolenaar	43be3aac7a	Fix collision chain termination checks. The result of IA64_PHYS_TO_RR7 is never 0, so one cannot test for a NULL pointer after a physical address is translated into a virtual pointer with said macro. Instead, keep the physical address around and test it against 0. Note that this obviously implies that a PTE can never be allocated at physical address 0. This isn't exactly guaranteed, but hasn't been a problem so far. We test the physical address against 0 for as long as the ia64 port exists...	2005-09-03 19:43:15 +00:00
Alan Cox	ba8bca610c	Pass a value of type vm_prot_t to pmap_enter_quick() so that it can determine whether the mapping should permit execute access.	2005-09-03 18:20:20 +00:00
Stefan Farfeleder	a1f85d7f83	Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename it to __MINSIGSTKSZ. Define MINSIGSTKSZ in <sys/signal.h>. This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN in <pthread.h> (soon <limits.h>) without having to include the whole <sys/signal.h> header. Discussed with: bde	2005-08-20 16:44:41 +00:00
Marcel Moolenaar	d41a7ed490	Remove the execute permission for stacks.	2005-08-14 23:17:59 +00:00
Marcel Moolenaar	a812f8435a	o s/pmap_lpte_/pmap_/g o Remove pmap_is_referenced(). It was already compiled-out.	2005-08-13 21:16:38 +00:00
Marcel Moolenaar	86257f240a	Fix the problem with the IPI for the lazy context switching of the high FP registers. It was not that the IPI got lost due to the perceived unreliability of the IPI delivery, but rather that the IPI was not assigned a vector (ugh). Sending a 0 vector to a CPU results in a stray external interrupt. Add a KASSERT to ipi_send() to catch this. The initialization of the IPIs could be better, but it's not at all sure what the future of the code is. Avoid wasting a lot of time on something that is going to be rewritten anyway.	2005-08-13 21:08:32 +00:00
Marcel Moolenaar	4630415a47	Improve SMP support: o Allocate a VHPT per CPU. The VHPT is a hash table that the CPU uses to look up translations it can't find in the TLB. As such, the VHPT serves as a level 1 cache (the TLB being a level 0 cache) and best results are obtained when it's not shared between CPUs. The collision chain (i.e. the hash bucket) is shared between CPUs, as all buckets together constitute our collection of PTEs. To achieve this, the collision chain does not point to the first PTE in the list anymore, but to a hash bucket head structure. The head structure contains the pointer to the first PTE in the list, as well as a mutex to lock the bucket. Thus, each bucket is locked independently of each other. With at least 1024 buckets in the VHPT, this provides for sufficiently fine-grained locking to make the solution scalable to large SMP machines. o Add synchronisation to the lazy FP context switching. We do this with a separate per-thread lock. On SMP machines the lazy high FP context switching without synchronisation caused inconsistent state, which resulted in a panic. Since the use of the high FP registers is not common, it's possible that races exist. The ia64 package build has proven to be a good stress test, so this will get plenty of exercise in the near future. o Don't use the local ID of the processor we want to send the IPI to as the argument to ipi_send(). Use the struct pcpu pointer instead. The reason for this is that IPI delivery is unreliable. It has been observed that sending an IPI to a CPU causes it to receive a stray external interrupt. As such, we need a way to make the delivery reliable. The intended solution is to queue requests in the target CPU's per-CPU structure and use a single IPI to inform the CPU that there's a new entry in the queue. If that IPI gets lost, the CPU can check its queue at any convenient time (such as for each clock interrupt).
This also allows us to send requests to a CPU without interrupting it, if such would be beneficial. With these changes SMP is almost working. There are still some random process crashes and the machine can hang due to having the IPI lost that deals with the high FP context switch. The overhead of introducing the hash bucket head structure results in a performance degradation of about 1% for UP (extra pointer indirection). This is surprisingly small and is offset by gaining reasonably/good scalable SMP support.	2005-08-06 20:28:19 +00:00
Marcel Moolenaar	045f23cd0d	Reduce the default MAXCPU from 16 to 4. This is in preparation of allocating a VHPT per CPU. Since we don't yet know how many CPUs are actually in the system at the time we need to allocate the VHPTs, we allocate for MAXCPU processors. This can result in a lot of wasted space for 2-way machines. So, for now, limit MAXCPU to something smaller until we have something more dynamic.	2005-08-06 19:59:23 +00:00
Marcel Moolenaar	cbef4d0edc	For ia64_ptc_{e,g,ga,l}(), use instruction serialization. We typically don't know what the TLB described and need to assume that it affects the fetching of instructions.	2005-08-06 19:54:31 +00:00
Jeff Roberson	8d511e2a05	- Add support for saving stack traces and displaying them via printf(9) and KTR. Contributed by: Antoine Brodin <antoine.brodin@laposte.net> Concept code from: Neal Fachan <neal@isilon.com>	2005-08-03 04:27:40 +00:00
John Baldwin	122eceef61	Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct. MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm	2005-07-15 18:17:59 +00:00
Ken Smith	22e59cec3b	Add recently invented COMPAT_FREEBSD5 option. MFC after: 3 days	2005-07-14 15:39:06 +00:00
David Xu	740fd64d65	Validate that the value written into {FS,GS}.base is a canonical address; writing a non-canonical address can cause a kernel panic. Restricting base values to 0..VM_MAXUSER_ADDRESS ensures that only canonical values get written to the registers. Reviewed by: peter, Joseph Koshy < joseph.koshy at gmail dot com > Approved by: re (scottl)	2005-07-10 23:31:11 +00:00
Marcel Moolenaar	7906787a5f	Enhance ia64_flush_dirty() to handle the case in which td != curthread. This case is triggered with ptrace(2) and the PT_SETREGS function. Change the return type of the function to int so that errors can be passed on to the caller. Approved by: re (scottl)	2005-07-05 17:12:18 +00:00
Marcel Moolenaar	a2aeb24eff	Implement function calls from within DDB on ia64. On ia64 a function pointer doesn't point to the first instruction of that function, but rather to a descriptor. The descriptor has the address of the first instruction, as well as the value of the global pointer. The symbol table doesn't know anything about descriptors, so if you look up the name of a function you get the address of the first instruction. The cast from the address, which is the result of the symbol lookup, to a function pointer as is done in db_fncall is therefore invalid. Abstract this detail behind the DB_CALL macro. By default DB_CALL is defined as db_fncall_generic, which yields the old behaviour. On ia64 the macro is defined as db_fncall_ia64, in which a descriptor is constructed to yield a valid function pointer. While here, introduce DB_MAXARGS. DB_MAXARGS replaces the existing (local) MAXARGS. The DB_MAXARGS macro can be defined by platforms to create a convenient maximum. By default this will be the legacy 10. On ia64 we define this macro to be 8, for 8 is the maximum number of arguments that can be passed in registers. This avoids having to implement spilling of arguments on the memory stack. Approved by: re (dwhite)	2005-07-02 23:52:37 +00:00
Marcel Moolenaar	5116398a06	Fix a buglet that was present in the ia64 code and that got inherited by amd64 and i386: For buffered writes we collect data and write it out a ${DEV_BSIZE}-sized block at a time. The fragsz variable is used to keep track of how much data we have collected in the buffer so far and it's reset to zero immediately after writing a block to the dump device. When the last, possibly partially filled buffer is flushed, we didn't reset fragsz to 0 and as such would stop reflecting reality. Since we currently only need to do buffered writes once, this isn't a problem. However, when kernel dumps are made by hand (say by calling doadump from within DDB), the improperly cleared state from the first call to dumpsys causes the next call to dumpsys to create an invalid code file. This change resets fragsz after flushing the partially filled buffer so that it fixes the two problems at once. Approved by: re (scottl)	2005-07-02 19:57:31 +00:00
Peter Wemm	62919d788b	Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work. ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_regs.c: vary the format of proc/XXX/regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets. IA64 has got stubs for ia32_reg.c. Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb to pacify it over conflicting formats of ld-elf.so.1. Approved by: re	2005-06-30 07:49:22 +00:00
Marcel Moolenaar	c31450b00d	Handle B-unit break instructions. The break.b is unique in that the immediate is not saved by the architecture. Any of the break.{mifx} instructions have their immediate saved in cr.iim on interruption. Consequently, when we handle the break interrupt, we end up with a break value of 0 when it was a break.b. The immediate is important because it distinguishes between different uses of the break and which are defined by the runtime specification. The bottomline is that when the GNU debugger replaces a B-unit instruction with a break instruction in the inferior, we would not send the process a SIGTRAP when we encounter it, because the value is not one we recognize as a debugger breakpoint. This change adds logic to decode the bundle in which the break instruction lives whenever the break value is 0. The assumption being that it's a break.b and we fetch the immediate directly out of the instruction. If the break instruction was not a break.b, but any of break.{mifx} with an immediate of 0, we would be doing unnecessary work. But since a break 0 is invalid, this is not a problem and it will still result in a SIGILL being sent to the process. Approved by: re (scottl)	2005-06-27 23:51:38 +00:00
Marcel Moolenaar	fc37111e5d	Replace the existing copyright notice with my own. Over the years I've changed this file so much that it's equivalent to a rewrite, and I'm not talking about any of the cosmetic changes of course. Approved by: re (scottl)	2005-06-27 23:34:35 +00:00
Marcel Moolenaar	9701d67eb8	Cosmetic: s/u_int64_t/uint64_t/g Approved by: re (scottl)	2005-06-27 23:29:06 +00:00
David E. O'Brien	c3e0dfa1f8	Add .cvsignore files just like in sys/<arch>/compiled, this keeps CVS from questing kernel config files not in CVS. Approved by: re(kensmith)	2005-06-20 16:52:59 +00:00
Marcel Moolenaar	442add308f	Define IPI_PREEMPT. Update a nearby comment while I'm here.	2005-06-12 19:03:01 +00:00
Alan Cox	1c245ae7d1	Introduce a procedure, pmap_page_init(), that initializes the vm_page's machine-dependent fields. Use this function in vm_pageq_add_new_page() so that the vm_page's machine-dependent and machine-independent fields are initialized at the same time. Remove code from pmap_init() for initializing the vm_page's machine-dependent fields. Remove stale comments from pmap_init(). Eliminate the Boolean variable pmap_initialized from the alpha, amd64, i386, and ia64 pmap implementations. Its use is no longer required because of the above changes and earlier changes that result in physical memory that is being mapped at initialization time being mapped without pv entries. Tested by: cognet, kensmith, marcel	2005-06-10 03:33:36 +00:00
Joseph Koshy	f263522a45	MFP4: - Implement sampling modes and logging support in hwpmc(4). - Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code. - New pmcstat(8) options: -E (exit time counts) -W (counts every context switch), -R (print log file). - pmc(3) API changes, improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events. - bug fixes & documentation.	2005-06-09 19:45:09 +00:00
Marcel Moolenaar	470cd51ee6	Create nexus in configure_first() instead of in configure(). This makes sure that sysinit tasks that run after configure_first(), but before configure() have a nexus to hang devices off.	2005-05-29 23:44:22 +00:00
Marcel Moolenaar	a0c51afb16	Call cninit_finish() in configure_final().	2005-05-29 22:48:41 +00:00
Yoshihiro Takahashi	d4fcf3cba5	Remove bus_{mem,p}io.h and related code for a micro-optimization on i386 and amd64. The optimization is trivial on recent machines. Reviewed by: -arch (imp, marcel, dfr)	2005-05-29 04:42:30 +00:00
Yoshihiro Takahashi	b22bf66063	- Move bus dependent defines to {isa,cbus}_dmareg.h. - Use isa/isareg.h rather than <arch>/isa/isa.h. Tested on: i386, pc98	2005-05-14 10:14:56 +00:00
Marcel Moolenaar	6fab4fece2	Don't define _MACHINE_BUS_MEMIO_H_ nor _MACHINE_BUS_PIO_H_.	2005-05-10 02:59:24 +00:00
David Xu	21fc316430	Change cpu_set_kse_upcall to a more generic style, so we can reuse it in other code. Add cpu_set_user_tls, and use it to tweak the user register and set up user TLS. I originally wanted to merge it into cpu_set_kse_upcall, but since cpu_set_kse_upcall is also used by M:N threads which may not need this feature, I wrote a separate cpu_set_user_tls.	2005-04-23 02:32:32 +00:00
Marcel Moolenaar	8773a80baf	Sanity the RTC code: o Remove the clock interface. Not only does it conflict with the MI version when device genclock is added to the kernel, it was also not possible to have more than 1 clock device. This of course would have been a problem if we actually had more than 1 clock device. In short: we don't need a clock interface and if we do eventually, we should be using the MI one. o Rewrite inittodr() and resettodr() to take into account that: 1) We use the EFI interface directly. 2) time_t is 64-bit and we do need to make sure we can determine leap years from year 2100 and on. Add a nice explanation of where leap years come from and why. 3) This rewrite happened in 2005 so any date prior to 1/1/2005 (either M/D/Y or D/M/Y) is bogus. Reprogram the EFI clock with 1/1/2005 in that case. 4) The EFI clock has a high probability of being correct, so only (further) correct the EFI clock when the file system time is larger. That should never happen in a time-synchronised world. Complain when EFI lost 2 days or more. Replace the copyright notice now that I (pretty much) rewrote all of this file.	2005-04-22 05:04:58 +00:00
Marcel Moolenaar	ff7125a623	Add empty header (except of the multiple-inclusion protection) to get hwpmc(4) to compile on this platform.	2005-04-20 18:44:53 +00:00
Warner Losh	06db52b609	Break out the definition of bus_space_{tag,handle}_t and a few other types into _bus.h to help with name space pollution from including all of bus.h. In a few days, I'll commit changes to the MI code to take advantage of this separation (after I've made sure that these changes don't break anything in the main tree, I've tested in my trees, but you never know...). Suggested by: bde (in 2002 or 2003 I think) Reviewed in principle by: jhb	2005-04-18 21:45:34 +00:00
Marcel Moolenaar	02b47ea204	Add a kpte command to DDB. It dumps the PTE of a KVA. This helps to analyze faults and TLB/VHPT inconsistencies.	2005-04-16 23:38:32 +00:00
Marcel Moolenaar	e190f6efc8	Return better "error" values for UWX_BOTTOM and UWX_ABI_FRAME in unw_step(). Both errors denote the end of a stack trace (i.e. no prior frame), but are otherwise not error conditions. Have db_trace() return 0 when the trace ends due to one of these return codes as they are really normal termination conditions. This change especially improves the output of the "show thread" command in DDB when there are threads in fork_trampoline() and previously db_trace() would return an error, causing the show command to emit '***'.	2005-04-16 05:38:59 +00:00
Marcel Moolenaar	64c92ba929	Initialize curthread before we save the APs MCA state. Saving the MCA state requires a spin lock, which requires a valid curthread. This change allows SMP kernels to boot into multi-user again. While here, update the copyright notice and use __FBSDID for the revision string.	2005-04-15 00:21:23 +00:00
John Baldwin	aa9aa68d2f	Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic operations in some places and simple non-per CPU math in others.	2005-04-12 23:18:54 +00:00
Marcel Moolenaar	a08d773359	Dot the i's: 1 Move the debug.clock_adjust_* sysctls to debug.clock.adjust_* to make it easier to get only the clock statistics. 2 Make the sysctls read-only [suggested by Marius]. 3 When determining the new clock adjustment, we checked for an error either larger than 12.5% or smaller than 12.5%. We left out an error of exactly 12.5%. For errors larger than 12.5% we adjust the clock reload value in such a way that the next clock interrupt would be early (as in premature). For errors less than 12.5% we stopped the adjustment. The current algorithm doesn't benefit from excluding an error of exactly 12.5%. Change the code to stop adjusting the clock if the error is not larger than 12.5% [suggested by Marius]. Discussed with: marius@	2005-04-12 18:50:57 +00:00
John Baldwin	c6a37e8413	Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any effect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case. Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fixup the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch. This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example). Reviewed by: grehan, cognet, arch@, others Tested on: i386, alpha, sparc64, powerpc, arm, possibly more	2005-04-04 21:53:56 +00:00
Maxim Sobolev	6bcf003260	Add USB Communication Device Class Ethernet driver. Originally written for FreeBSD based on aue(4) it was picked by OpenBSD, then from OpenBSD ported to NetBSD and finally NetBSD version merged with original one goes into FreeBSD. Obtained from: http://www.gank.org/freebsd/cdce/ NetBSD OpenBSD	2005-03-22 14:52:40 +00:00
Nate Lawson	ac5f2dab74	s/SLIST/STAILQ to catch up with changes to resource lists. Missed by: imp	2005-03-20 06:55:49 +00:00
Murray Stokely	991f5121f0	Add a comment to note that pseudo-device bpf is required for DHCP. This is mentioned in the Handbook but it is not as obvious to new users why bpf is needed compared to the other largely self-explanatory items in GENERIC. PR: conf/40855 MFC after: 1 week	2005-03-18 15:24:00 +00:00
Ian Dowse	60719a1a44	Split configure() into 3 separate steps like we do on other architectures. This makes it possible to insert hooks before and after the device attachment step. Tested thanks to: marcel	2005-03-18 09:45:43 +00:00
Scott Long	5974e5c71c	Refactor the bus_dma header files so that the interface is described in sys/bus_dma.h instead of being copied in every single arch. This slightly reorders a flag that was specific to AXP and thus changes the ABI there. The interface still relies on bus_space definitions found in <machine/bus.h> so it cannot be included on its own yet, but that will be fixed at a later date. Add an MD <machine/bus_dma.h> for every arch for consistency and to allow for future MD augmentation of the API. sparc64 makes heavy use of this right now due to its different bus_dma implementation.	2005-03-14 16:46:28 +00:00
Scott Long	8bf0837c7a	Remove dead code.	2005-03-07 02:18:52 +00:00
Joerg Wunsch	a5f50ef9e4	netchild's mega-patch to isolate compiler dependencies into a central place. This moves the dependency on GCC's and other compiler's features into the central sys/cdefs.h file, while the individual source files can then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42. By now, GCC and ICC (the Intel compiler) have been actively tested on IA32 platforms by netchild. Extension to other compilers is supposed to be possible, of course. Submitted by: netchild Reviewed by: various developers on arch@, some time ago	2005-03-02 21:33:29 +00:00
Marcel Moolenaar	f685f62c98	Make sure fpswa_iface equals NULL when bootinfo.bi_fpswa equals 0. We need to be able to test for the (possible) non-existence of the FPSWA code. PR: ia64/77591 Submitted by: Christian Kandeler (christian dot kandeler at hob dot de) MFC after: 1 day	2005-03-02 20:29:04 +00:00
Wes Peters	95e2054492	Attempt to doff the pointy hat: implement 'hw.realmem' on remaining architectures. Pointed out by O'Brien, ScottL via email. Reviewed by: obrien (various)	2005-03-01 21:55:27 +00:00
Xin LI	130d7d9ffb	Remove acpi_perf from {ARCH}/conf/NOTES, to make tinderbox happy. Reported by: tinderbox Inspired by: acpi_perf build structure removal commit	2005-02-25 07:10:37 +00:00
Ruslan Ermilov	3971d2cf5e	Use a common multi-inclusion protection, and add such a protection to alpha/include/exec.h.	2005-02-19 21:16:48 +00:00
Marcel Moolenaar	3ec2e857c1	s/descr/oid_descr/	2005-02-09 04:48:23 +00:00
Poul-Henning Kamp	0c3c54da63	Since we are quite unlikely to ever face another platform which uses the i8237 without trying to emulate the PC architecture move the register definitions for the i8237 chip into the central include file for the chip, except for the PC98 case which is magic. Add new isa_dmatc() function which tells us as cheaply as possible if the terminal count has been reached for a given channel.	2005-02-06 13:46:39 +00:00
Nate Lawson	3888a87205	Finish the job of sorting all includes and fix the build by including malloc.h before proc.h on sparc64. Noticed by das@ Compiled on: alpha, amd64, i386, pc98, sparc64	2005-02-06 01:55:08 +00:00
Nate Lawson	69bc96f231	Build cpufreq and acpi_perf on platforms that are likely to be able to use them.	2005-02-05 21:01:09 +00:00
Marcel Moolenaar	6fb59928a6	Include sys/bus.h before sys/cpu.h. The latter needs device_t.	2005-02-04 06:38:58 +00:00
Nate Lawson	4c4381e288	Add an implementation of cpu_est_clockrate(9). This function estimates the current clock frequency for the given CPU id in units of Hz.	2005-02-04 05:32:56 +00:00
Warner Losh	1f0ce611b3	nit in /*-	2005-01-31 08:16:45 +00:00
Marcel Moolenaar	b71bca0f84	Fix handling of post increment: Either the first or second operand is the register with the memory address, and it's that register's value we need to increment or decrement. MFC after: 3 days	2005-01-27 06:01:44 +00:00
Scott Long	9580b6766e	Fix compile errors. Bah.	2005-01-18 11:06:34 +00:00
Scott Long	a9e9b7e47b	Fix an assignment that I missed in the last commit.	2005-01-15 20:03:59 +00:00
Scott Long	33072f4de7	Add bus_dmamap_load_mbuf_sg() to ia64	2005-01-15 19:26:17 +00:00
John Baldwin	f4ef9cec40	- Remove some OBE comments regarding cpu_exit(). cpu_exit() is no longer the last action of kern_exit(). Instead, it is a MD callout to cleanup per-process state during exit. - Add notes of concern to Alpha and ia64 about the possible need to drop fp state in cpu_thread_exit() rather than in cpu_exit() since it is per-thread state rather than per-process.	2005-01-14 20:13:04 +00:00
Warner Losh	86cb007f9f	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 22:18:23 +00:00
Marcel Moolenaar	2fa9a15eca	Further enhance the handling of misaligned loads and stores: o implement double-extended and single precision loads and stores, o implement double precision stores, o replace the machdep.unaligned_print sysctl with debug.unaligned_print and change the default value to 0, o replace the machdep.unaligned_sigbus sysctl with debug.unaligned_test, o Remove the fillfd() function. The function is trivial enough for inline assembly. The debug.unaligned_test sysctl is used to test the emulation of misaligned loads and stores. When PSR.ac is 0, the CPU will handle misaligned memory accesses itself and we don't get an exception for it. When PSR.ac is 1, the process needs to be signalled and we should not emulate. The sysctl takes effect when PSR.ac is 1 and tells us that we should emulate and not send a signal. PR: 72268 MFC after: 1 week	2005-01-02 00:20:54 +00:00
Alan Cox	1f70d62298	Modify pmap_enter_quick() so that it expects the page queues to be locked on entry and it assumes the responsibility for releasing the page queues lock if it must sleep. Remove a bogus comment from pmap_enter_quick(). Using the first change, modify vm_map_pmap_enter() so that the page queues lock is acquired and released once, rather than each time that a page is mapped.	2004-12-23 20:16:11 +00:00
Alan Cox	85f5b24573	In the common case, pmap_enter_quick() completes without sleeping. In such cases, the busying of the page and the unlocking of the containing object by vm_map_pmap_enter() and vm_fault_prefault() is unnecessary overhead. To eliminate this overhead, this change modifies pmap_enter_quick() so that it expects the object to be locked on entry and it assumes the responsibility for busying the page and unlocking the object if it must sleep. Note: alpha, amd64, i386 and ia64 are the only implementations optimized by this change; arm, powerpc, and sparc64 still conservatively busy the page and unlock the object within every pmap_enter_quick() call. Additionally, this change is the first case where we synchronize access to the page's PG_BUSY flag and busy field using the containing object's lock rather than the global page queues lock. (Modifications to the page's PG_BUSY flag and busy field have asserted both locks for several weeks, enabling an incremental transition.)	2004-12-15 19:55:05 +00:00
Marcel Moolenaar	7126a966b3	Fix the last of the instability and the cause of the annoying "vm_fault: fault on nofault entry, addr: %lx" panic. The problem was a stale PTE in the TLB that marked the page as not present, even though we had a good PTE in the VHPT. We typically don't yet insert PTEs in the TLB. We do that lazily. The CPU will look for the PTE in the VHPT when there's no PTE in the TLB. Unfortunately this doesn't handle the case of the stale PTE in the TLB. The quick fix is to invalidate the TLB (sloppily) when the VHPT doesn't contain a valid PTE. This is also the only case that may cause a PTE in the TLB that marks a page as non-present.	2004-12-12 19:27:58 +00:00
Marcel Moolenaar	3579953091	Use primitive types to avoid creating an artificial header dependency: o s/u_long/unsigned long/ o s/uint32_t/unsigned int/g o s/uint64_t/unsigned long/g Trigger case: multimedia/mpeg2codec	2004-12-11 06:15:12 +00:00
Marcel Moolenaar	f5929532f1	Don't obtain the HCDP address directly from the bootinfo structure. Use a function to keep the details at arms length from uart(4).	2004-12-08 05:46:54 +00:00
Marcel Moolenaar	bcc5241c43	Change gdb_cpu_setreg() to not take the value to which to set the specified register, but a pointer to the in-memory representation of that value. The reason for this is twofold: 1. Not all registers can be represented by a register_t. In particular FP registers fall in that category. Passing the new register value by reference instead of by value makes this point moot. 2. When we receive a G or P packet, both are for writing a register, the packet will have the register value in target-byte order and in the memory representation (modulo the fact that bytes are sent as 2 printable hexadecimal numbers of course). We only need to decode the packet to have a pointer to the register value. This change fixes the bug of extracting the register value of the P packet as a hexadecimal number instead of as a bit array. The quick (and dirty) fix to bswap the register value in gdb_cpu_setreg() as it has been added on i386 and amd64 can therefore be removed and has in fact been removed. Tested on: alpha, amd64, i386, ia64, sparc64	2004-12-01 06:40:35 +00:00
Marcel Moolenaar	c0678028d7	Whitespace fixes: o Remove a bogus comment that relates to alpha. o s/u_int64_t/uint64_t/g o Add bi_spare2 to make the internal padding explicit. o Move BOOTINFO_MAGIC after the field it applies to.	2004-11-28 04:34:17 +00:00
David Schultz	6004362e66	Don't include sys/user.h merely for its side-effect of recursively including other headers.	2004-11-27 06:51:39 +00:00
Marcel Moolenaar	2ba0042660	Remove struct ia64_itir and use a plain old uint64_t instead.	2004-11-21 21:40:08 +00:00
David Schultz	ab44ebf537	Remove UAREA_PAGES. Reviewed by: arch@	2004-11-20 02:29:50 +00:00
David Schultz	11111b709f	U areas are going away, so don't allocate one for process 0. Reviewed by: arch@	2004-11-20 02:29:25 +00:00
David Schultz	ff3fd2e764	user.h is included only to get pcb.h, so use the latter directly instead.	2004-11-20 02:28:14 +00:00
Marcel Moolenaar	a0c88fb1d9	Remove the BR tag. When the machine doesn't have the DIG64 HCDP table with console settings, we now only need to know at which address the UART lives. Leaving the baudrate unspecified results in us using the baudrate at which the UART operates. This removes one parameter that can interfere with a successful installation out of the box.	2004-11-14 23:42:48 +00:00
John Baldwin	d39d4a6e64	- Change the ddb paging "support" to use a variable (db_lines_per_page) to control the number of lines per page rather than a constant. The variable can be examined and changed in ddb as '$lines'. Setting the variable to 0 will effectively turn off paging. - Change db_putchar() to force out pending whitespace before outputting newlines and carriage returns so that one can rub out content on the current line via '\r \r' type strings. - Change the simple pager to rub out the --More-- prompt explicitly when the routine exits. - Add some aliases to the simple pager to make it more compatible with more(1): 'e' and 'j' do a single line. 'd' does half a page, and 'f' does a full page. MFC after: 1 month Inspired by: kris	2004-11-01 22:15:15 +00:00
Poul-Henning Kamp	32204b9721	Use bioq_takefirst()	2004-10-23 12:44:19 +00:00
Poul-Henning Kamp	95bc568977	Add new function ttyinitmode() which sets our systemwide default modes on a tty structure. Both the ".init" and the current settings are initialized allowing the function to be used both at attach and open time. The function takes an argument to decide if echoing should be enabled. Echoing should not be enabled for regular physical serial ports unless they are consoles, in which case they should be configured by ttyconsolemode() instead. Use the new function throughout.	2004-10-18 21:51:27 +00:00
Nate Lawson	8f528832e5	Print flags in the nexus for child devices.	2004-10-14 22:36:47 +00:00
Nate Lawson	31ad3b8802	Move the code for halting the CPU (acpi_cpu_c1) into machdep files. This removes the last MD portion of acpi_cpu.c. MFC after: 2 weeks	2004-10-11 05:39:15 +00:00
Marcel Moolenaar	07cf947238	Add the Madison II, which is the second generation Madison. The Madison II is model 2 in the Itanium 2 family and has up to 9MB of L3 cache and clocks higher than 1.5GHz. There's no LV variant AFAICT.	2004-10-06 02:43:28 +00:00
Alan Cox	8ceb3dcb60	The physical address stored in the vm_page is page aligned. There is no need to mask off the page offset bits. (This operation made some sense prior to i386/i386/pmap.c revision 1.254 when we passed a physical address rather than a vm_page pointer to pmap_enter().)	2004-10-03 00:16:43 +00:00
Alan Cox	07b3303943	Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These uses predate the change in the pmap_enter() interface that replaced the page's physical address by the address of its vm_page structure. The PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page structure that was being passed in.	2004-10-02 07:34:58 +00:00
Marcel Moolenaar	287e12f172	...And fix WITNESS builds: declare syscallnames.	2004-09-26 20:39:56 +00:00
Marcel Moolenaar	feae534e49	Fix INVARIANTS build: Include <machine/cpu.h>.	2004-09-26 00:38:56 +00:00
Marcel Moolenaar	03bfdd1362	Move the IA-32 trap handling from trap() to ia32_trap(). Move the ia32_syscall() function along with it to ia32_trap.c. When COMPAT_IA32 is not defined, we'll raise SIGEMT instead.	2004-09-25 04:27:44 +00:00
Marcel Moolenaar	0c32530bb7	Redefine a PTE as a 64-bit integral type instead of a struct of bit-fields. Unify the PTE defines accordingly and update all uses.	2004-09-23 00:05:20 +00:00
Marcel Moolenaar	0675c65d6b	s/u_int#_t/uint#_t/g	2004-09-22 23:12:46 +00:00
Marcel Moolenaar	08d3edb315	For the atomic_{add\|clear\|set\|subtract} family of inlines, return the old or previous value instead of void. This is not as is documented in atomic(9), but is API (and ABI) compatible and simply makes sense. This feature will primarily be used for atomic PTE updates in PMAP/ng.	2004-09-22 19:58:43 +00:00
Marcel Moolenaar	5c48823c36	MFp4: various style fixes, including o s/u_int/uint/g o s/#define<sp>/#define<tab>/g o indent macro definitions o Improve vertical spacing o Globally align line continuation character	2004-09-22 19:47:42 +00:00
John Baldwin	76764432e4	- Add support for "paging" in stack trace output. That is, when you do a stack trace from ddb, the output will pause with a '--More--' prompt every 18 lines. If you hit Enter, it will print another line and prompt again. If you hit space it will output another page and then prompt. If you hit 'q' or 'x' it will abort the rest of the stack trace. - Fix the sparc64 userland stack trace to honor the total count of lines to print. This is useful if your trace happens to walk back onto 0xdeadc0de and gets stuck in an endless loop. MFC after: 1 month Tested on: i386, alpha, sparc64	2004-09-20 19:05:32 +00:00
Marcel Moolenaar	13e6668525	MFp4: Completely remove the remaining EFI includes and add our own (type) definitions instead. While here, abstract more of the internals by providing interface functions.	2004-09-19 03:50:46 +00:00
Alan Cox	7580b56bdc	Release the page queues lock earlier in pmap_protect() and pmap_remove() in order to reduce contention.	2004-09-18 22:56:58 +00:00
Marcel Moolenaar	9f9ae8ebb7	Provide our own FPSWA definitions, instead of depending on the Intel EFI headers and put them all in <machine/fpu.h>. The Intel EFI headers conflict with the Intel ACPI headers (duplicate type definitions), so are being phased out in the kernel.	2004-09-17 22:19:41 +00:00
Marcel Moolenaar	cba8d3ae49	Remove useless inclusion of <machine/fpu.h>	2004-09-17 20:42:45 +00:00
Poul-Henning Kamp	7ce1979be6	Add a new function isa_dma_init() which returns an errno when it fails and which takes a M_WAITOK/M_NOWAIT flag argument. Add compatibility isa_dmainit() macro which whines loudly if isa_dma_init() fails. Problem uncovered by: tegge	2004-09-15 12:09:50 +00:00
Marcel Moolenaar	fa3b7cae8d	Catch up with other platforms: switch the default scheduler to 4BSD.	2004-09-12 05:50:32 +00:00
Scott Long	50736a153b	Fix a problem with tag->boundary inheritance that has existed since day one and was propagated to nearly every platform. The boundary of the child needs to consider the boundary of the parent and pick the minimum of the two, not the maximum. However, if either is 0 then pick the appropriate one. This bug was exposed by a recent change to ATA, which should now be fixed by this change. The alignment and maxsegsz tag attributes likely also need a similar review in the near future. This is a MT5 candidate. Reviewed by: marcel Submitted by: sos (in part)	2004-09-08 04:54:19 +00:00
Marcel Moolenaar	566d143be0	Sync the busdma code with i386. The most tangible upshot is that the alignment and boundary constraints are being respected, which fixes the reported ATA problems with SiI chips. I consider the busdma implementation worrisome nonetheless. Not only is there too much MI code duplicated in MD files, there's a lot of questionable code. I smell a wholesale, cross-platform overhaul coming... MT5 candidate.	2004-09-08 02:55:04 +00:00
Julian Elischer	ed062c8d66	Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure (the per-ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies the scheduler that no more than N threads from that ksegrp should be allowed to be concurrently scheduled. This is also used to enforce 'fairness' at this time so that a ksegrp with 10000 threads can not swamp the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventually develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as libkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week	2004-09-05 02:09:54 +00:00
Marcel Moolenaar	44af2aa001	Add aac(4) and aacp(4). The driver is 64-bit clean for roughly a year now and has been mentioned on the freebsd-ia64 list.	2004-09-02 18:05:26 +00:00
Julian Elischer	5995adc206	Remove an unneeded argument. The removed argument could trivially be derived from the remaining one. That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some systems, so leave it as an argument. Having both proc and thread as arguments just gives an opportunity for them to get out of sync. MFC after: 3 days	2004-08-31 07:34:54 +00:00
Julian Elischer	99e9dcb817	Remove sched_free_thread() which was only used in diagnostics. It has outlived its usefulness and has started causing panics for people who turn on DIAGNOSTIC, in what is otherwise good code. MFC after: 2 days	2004-08-31 06:12:13 +00:00
Alan Cox	bfa15df9ba	Remove unnecessary check for curthread == NULL.	2004-08-30 03:52:05 +00:00
Marcel Moolenaar	224407d6a8	s/ENTRY/ENTRY_NOPROFILE/g for particular functions that do not follow the C calling convention or are otherwise not regular functions. This allows us to boot a profiling kernel.	2004-08-30 01:32:28 +00:00
Marcel Moolenaar	82ecff453f	Catch up with the drive-by renaming of IA32 to COMPAT_IA32. Missed 11 days ago when all the other places were fixed and finally caught by the tinderbox run...	2004-08-27 21:57:00 +00:00
Marcel Moolenaar	0f2fe153bc	Move the kernel-specific logic to adjust frompc from MI to MD. For these two reasons: 1. On ia64 a function pointer does not hold the address of the first instruction of a function's implementation. It holds the address of a function descriptor. Hence the user(), btrap(), eintr() and bintr() prototypes are wrong for getting the actual code address. 2. The logic forces interrupt, trap and exception entry points to be laid out contiguously. This cannot be achieved on ia64 and is generally just bad programming. The MCOUNT_FROMPC_USER macro is used to set the frompc argument to some kernel address which represents any frompc that falls outside the kernel text range. The macro can expand to ~0U to bail out in that case. The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to some kernel address to represent a call to a trap or interrupt handler. This is to avoid the trap or interrupt handler appearing to be called from everywhere in the call graph. The macro can expand to ~0U to prevent adjusting frompc. Note that the argument is selfpc, not frompc. This commit defines the macros on all architectures equivalently to the original code in sys/libkern/mcount.c. People can take it from here... Compile-tested on: alpha, amd64, i386, ia64 and sparc64 Boot-tested on: i386	2004-08-27 19:42:35 +00:00
Alan Cox	8991a235cb	The machine-independent parts of the virtual memory system always pass a valid pmap to the pmap functions that require one. Remove the checks for NULL. (These checks have their origins in the Mach pmap.c that was integrated into BSD. None of the new code written specifically for FreeBSD included them.)	2004-08-27 19:06:17 +00:00
Andre Oppermann	c21fd23260	Always compile PFIL_HOOKS into the kernel and remove the associated kernel compile option. All FreeBSD packet filters now use the PFIL_HOOKS API and thus it becomes a standard part of the network stack. If no hooks are connected the entire packet filter hooks section and related activities are jumped over. This removes any performance impact if no hooks are active. Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.	2004-08-27 15:16:24 +00:00
Marcel Moolenaar	04f093dde7	Get a step closer to profiling the kernel by fixing the definitions of the MCOUNT_ENTER, MCOUNT_EXIT and MCOUNT_DECL defines. Also make sure there's a prototype of _MCOUNT_DECL(). This allows us to build a kernel. There are still unresolved symbols, so linking fails.	2004-08-25 08:03:48 +00:00
Marcel Moolenaar	f0556e70bb	Make profiling actually work. The gcc compiler emits a call to the _mcount() stub when profiling is enabled. Emit this code sequence for assembly routines as well (MCOUNT definition in <machine/asm.h>). We do not pass the GOT entry however as the 4th argument, because it's not used. The _mcount() stub calls __mcount(), which does the actual work. Define _MCOUNT_DECL to define __mcount. We do not have an implementation of mcount(), so we define MCOUNT as empty, but have a weak alias to _mcount() in _mcount.S. Note that the _mcount() stub in the kernel is slightly different from the stub in userland. This is because we do not have to worry about nested routines in the kernel.	2004-08-25 07:42:34 +00:00
Nate Lawson	dc6851d588	Catch up with i386 nexus.c rev 1.59: add bus_get_resource_list().	2004-08-24 19:22:54 +00:00
David E. O'Brien	6cda6c4a35	sr(4) definitely won't work on IA64.	2004-08-24 18:31:27 +00:00
Arun Sharma	2d24da614a	The existing code fails some corner cases. Replace it with ia64_bsp_adjust() which has been tested to work in all cases for arbitrary (bsp, nslots) combinations. reviewed by: marcel@	2004-08-16 22:09:58 +00:00
Marcel Moolenaar	344bbdbd54	As I said: the previous commit was untested... Remove an #endif which should have ceased to exist when its corresponding #if was removed.	2004-08-16 19:05:08 +00:00
Marcel Moolenaar	97752b2cbd	Catch up with the drive-by renaming of IA32 to COMPAT_IA32. It must have been rush hour... While here, move COMPAT_IA32 from opt_global.h to opt_compat.h like on amd64. Consequently, it's unsafe to use the option in pcb.h. We now unconditionally have the ia32 specific registers in the PCB. This commit is untested.	2004-08-16 18:54:23 +00:00
Arun Sharma	646c6dd2c0	ITC.{i,d} instructions use format M41 not M42. reviewed by: marcel@	2004-08-16 18:41:24 +00:00
Marcel Moolenaar	c66fdb617d	Allocate memory in the unwinder with M_NOWAIT. We may need to provide backtraces with locks held.	2004-08-14 05:00:37 +00:00
Marcel Moolenaar	3a00930042	In set_regs(), flush the dirty registers onto the backingstore before we update the registers. That way we don't have any dirty registers to worry about and also know that bsp=bspstore, which makes updating the RSE related registers predictable. This is not the end of it. We need more validity checks, but for now this allows us to complete the gdb testsuite without crashing the kernel.	2004-08-11 05:29:13 +00:00
Marcel Moolenaar	4da47b2fec	Add __elfN(dump_thread). This function is called from __elfN(coredump) to allow dumping per-thread machine specific notes. On ia64 we use this function to flush the dirty registers onto the backingstore before we write out the PRSTATUS notes. Tested on: alpha, amd64, i386, ia64 & sparc64 Not tested on: arm, powerpc	2004-08-11 02:35:06 +00:00
Marcel Moolenaar	b4b7c60d70	Better preserve the original protection for the mappings we maintain. The hardware always gives read access for privilege level 0, which means that we cannot use the hardware access rights and privilege level in the PTE to test whether there's a change in protection. So, we save the original vm_prot_t in the PTE as well. Add pmap_pte_prot() to set the proper access rights and privilege level on the PTE given a pmap and the requested protection. The above allows us to compare the protection in pmap_extract_and_hold() which was missing. While in pmap_extract_and_hold(), add pmap locking. While here, clean up most (i.e. all but one) PTE macros we inherited from alpha. They were either unused, used inconsistently, badly named or simply weren't beneficial. We save the wired and managed state of the PTE in distinct (bit) fields. While in pte.h, s/u_int64_t/uint64_t/g pmap locking obtained from: alc@ feedback & review by: alc@	2004-08-09 20:44:41 +00:00
Marcel Moolenaar	47a86e3cd4	Implement single stepping when we leave the kernel through the EPC syscall path. The basic problem is that we cannot set the single stepping flag directly, because we don't leave the kernel via an interrupt return. So, we need another way to set the single stepping flag. The way we do this is by enabling the lower-privilege transfer trap, which gets raised when we drop the privilege level. However, since we're still running in kernel space (sec), we're not yet done. We clear the lower-privilege transfer trap, enable the taken-branch trap and continue exiting the kernel until we branch into user space. Given the current code, there's a total of two traps this way before we can raise SIGTRAP.	2004-08-08 00:28:07 +00:00
Marcel Moolenaar	6aa84a056a	Slightly move labels around to make sure we call ast() on our way out after a fork(2) in fork_trampoline(). By moving the epc_syscall_return label immediately before the call to do_ast() in epc_syscall(), we not only achieve that but also handle the detour through exception_return when the frame corresponds to an asynchronous kernel entry. Hence, we simplified fork_trampoline() as a side-effect.	2004-08-07 21:55:15 +00:00
Marcel Moolenaar	7d9a8b1cd5	De-inline gdb_cpu_signal() because we need to convert the trap vectors related to breakpoints and single stepping into SIGTRAP so gdb(1) knows why the remote target has stopped. In particular, gdb(1) needs to know if the reason is something of its own doing.	2004-08-07 21:40:52 +00:00
Arun Sharma	d7cf64c9a1	Use a 256MB TR instead of a 64MB TR to make sure that the kernel text/data are covered on APs. This enables the kernel to boot on a 4 way Intel Itanium-2 platform. This has a secondary effect of keeping the TRs identical on BP and the APs. reviewed by: marcel@	2004-08-04 20:09:41 +00:00
Mark Murray	d23a262fc5	Making a loadable null.ko for /dev/(null\|zero) proved rather unpopular, so remove this (mis)feature. Encouragement provided by: jhb (and others)	2004-08-03 19:24:54 +00:00
Maxime Henrion	9f1b87f106	Instead of calling ia32_pause() conditionally on __i386__ or __amd64__ being defined, define and use a new MD macro, cpu_spinwait(). It only expands to something on i386 and amd64, so the compiled code should be identical. Name of the macro found by: jhb Reviewed by: jhb	2004-08-03 18:44:27 +00:00
Marcel Moolenaar	78b9765bee	Fix 2 typos in previous commit: both s/strct/struct/	2004-08-02 18:37:55 +00:00
Marcel Moolenaar	cb1d0dd340	Add the mem and null devices now that they are optional.	2004-08-02 17:53:06 +00:00
Marcel Moolenaar	6d28b03026	Sort the miscellaneous devices to restore ordering after the insertion of the mem and null devices.	2004-08-02 17:50:39 +00:00
Mark Murray	a5ed4a0ad5	Remove extraneous ';'.	2004-08-01 18:51:44 +00:00
Mark Murray	8ab2f5ecc5	Break out the MI part of the /dev/[k]mem and /dev/io drivers into their own directory and module, leaving the MD parts in the MD area (the MD parts _are_ part of the modules). /dev/mem and /dev/io are now loadable modules, thus taking us one step further towards a kernel created entirely out of modules. Of course, there is nothing preventing the kernel from having these statically compiled.	2004-08-01 11:40:54 +00:00
Alan Cox	350fb8ae6a	- Add pmap locking to ia64's pmap_enter() and pmap_enter_quick(). (This brings ia64 to parity with alpha, amd64, and i386 in this area.) - Prevent a race in pmap_find_pte(): If pmap_find_pte() sleeps in uma_zalloc(), another thread could allocate a pte at the same address. Instead, sleep at a higher level and retry the lookup before retrying the allocation. Reviewed and tested by: marcel@	2004-07-30 20:25:12 +00:00
Marcel Moolenaar	f95c91bcee	Fix -O builds with gcc 3.4 by defining ffs as __builtin_ffs instead of creating an inline function that just calls __builtin_ffs.	2004-07-30 07:56:53 +00:00
Poul-Henning Kamp	0658bb8ef8	Move a relic to its correct location(s): Put nfs diskless initialization calls with the code they call. (Yet another example of mindless copy&paste).	2004-07-28 21:54:57 +00:00
Robert Watson	1a8cfbc450	Pass a thread argument into cpu_critical_{enter,exit}() rather than dereference curthread. It is called only from critical_{enter,exit}(), which already dereferences curthread. This doesn't seem to affect SMP performance in my benchmarks, but improves MySQL transaction throughput by about 1% on UP on my Xeon. Head nodding: jhb, bmilekic	2004-07-27 16:41:01 +00:00
Marcel Moolenaar	6dd19a884b	Work around a gcc code generation bug for function descriptor references (target/16559). This fixes SMP configurations. Obtained from: arun@	2004-07-25 07:07:09 +00:00
Alan Cox	e4242deba7	In pmap_mincore() create a private copy of the pte for use after the pmap lock is released.	2004-07-22 02:05:46 +00:00
Alan Cox	756e6d1939	Additional pmap locking Tested by: marcel@	2004-07-21 07:01:48 +00:00
Marcel Moolenaar	fd32d93b97	Unify db_stack_trace_cmd(). All it did was look up the thread given the thread ID and call db_trace_thread(). Since arm has all the logic in db_stack_trace_cmd(), rename the new DB_COMMAND function to db_stack_trace to avoid conflicts on arm. While here, have db_stack_trace parse its own arguments so that we can use a more natural radix for IDs. If the ID is not a thread ID, or more precisely when no thread exists with the ID, try if there's a process with that ID and return the first thread in it. This makes it easier to print stack traces from the ps output. requested by: rwatson@ tested on: amd64, i386, ia64	2004-07-21 05:07:09 +00:00
David Schultz	479f8d2214	Make FLT_ROUNDS correctly reflect the dynamic rounding mode.	2004-07-19 08:17:25 +00:00
Alan Cox	4a5be3f70a	Add partial pmap locking. Tested by: marcel@	2004-07-19 05:39:49 +00:00
Alan Cox	6fe30ff3f2	Remove unused fields from the pmap.	2004-07-16 03:42:45 +00:00
Poul-Henning Kamp	672c05d49c	Preparation commit for the tty cleanups that will follow in the near future: rename ttyopen() -> tty_open() and ttyclose() -> tty_close(). We need the ttyopen() and ttyclose() for the new generic cdevsw functions for tty devices in order to have consistent naming.	2004-07-15 20:47:41 +00:00
Alan Cox	3d2e54c317	Push down the acquisition and release of the page queues lock into pmap_protect() and pmap_remove(). In general, they require the lock in order to modify a page's pv list or flags. In some cases, however, pmap_protect() can avoid acquiring the lock.	2004-07-15 18:00:43 +00:00
Alan Cox	72826d0f4a	A loop in pmap_remove() should use TAILQ_FOREACH_SAFE(), not TAILQ_FOREACH(), because the loop deletes elements from the list. Reviewed by: marcel@	2004-07-15 03:20:00 +00:00
David Xu	53dbf30349	Add ptrace_clear_single_step(); alpha has had it for years. The function will be used by ptrace to clear a thread's single step state.	2004-07-13 07:22:56 +00:00
Alan Cox	440382a953	Simplify pmap_protect().	2004-07-13 06:54:23 +00:00
Alan Cox	ce8da3091f	Push down the acquisition and release of the page queues lock into pmap_remove_pages(). (The implementation of pmap_remove_pages() is optional. If pmap_remove_pages() is unimplemented, the acquisition and release of the page queues lock is unnecessary.) Remove spl calls from the alpha, arm, and ia64 pmap_remove_pages().	2004-07-13 02:49:22 +00:00
Marcel Moolenaar	8bcb1e9e84	Add options KDB and GDB. KDB takes on the function of what DDB used to be. Both DDB and GDB specify which KDB backends to include.	2004-07-11 03:20:09 +00:00
Marcel Moolenaar	1ca618fcaa	Remove the now unused GDB stubs. See src/sys/gdb/* for the new KDB backend.	2004-07-11 01:47:26 +00:00
Marcel Moolenaar	37224cd3fc	Mega update for the KDB framework: turn DDB into a KDB backend. Most of the changes are a direct result of adding thread awareness. Typically, DDB_REGS is gone. All registers are taken from the trapframe and backtraces use the PCB based contexts. DDB_REGS was defined to be a trapframe on all platforms anyway. Thread awareness introduces the following new commands: "thread X" switches to thread X (where X is the TID), and "show threads" lists all threads. The backtrace code has been made more flexible so that one can create backtraces for any thread by giving the thread ID as an argument to trace. With this change, ia64 has support for breakpoints.	2004-07-10 23:47:20 +00:00
Marcel Moolenaar	6d33366c74	Update for the KDB framework: o ksym_start and ksym_end changed type to vm_offset_t. o Make debugging support conditional upon KDB instead of DDB. o Call kdb_enter() instead of breakpoint(). o Remove implementation of Debugger(). o Call kdb_trap() according to the new world order. unwinder: o s/db_active/kdb_active/g o Various s/ddb/kdb/g o Add support for unwinding from the PCB as well as the trapframe. Abuse a spare field in the special register set to flag whether the PCB was actually constructed from a trapframe so that we can make the necessary adjustments. md_var.h: o Add RSE convenience macros. o Add ia64_bsp_adjust() to add or subtract from BSP while taking NaT collections into account.	2004-07-10 22:59:30 +00:00
Marcel Moolenaar	5a39cbaf69	Implement makectx(). The makectx() function is used by KDB to create a PCB from a trapframe for purposes of unwinding the stack. The PCB is used as the thread context and all but the thread that entered the debugger has a valid PCB. This function can also be used to create a context for the threads running on the CPUs that have been stopped when the debugger got entered. This however is not done at the time of this commit.	2004-07-10 19:56:00 +00:00
Marcel Moolenaar	cbc174356c	Introduce the KDB debugger frontend. The frontend provides a framework in which multiple (presumably different) debugger backends can be configured and which provides basic services to those backends. Besides providing services to backends, it also serves as the single point of contact for any and all code that wants to make use of the debugger functions, such as entering the debugger or handling of the alternate break sequence. For this purpose, the frontend has been made non-optional. All debugger requests are forwarded or handed over to the current backend, if applicable. Selection of the current backend is done by the debug.kdb.current sysctl. A list of configured backends can be obtained with the debug.kdb.available sysctl. One can enter the debugger by writing to the debug.kdb.enter sysctl.	2004-07-10 18:40:12 +00:00
Marcel Moolenaar	72d44f31a6	Introduce the GDB debugger backend for the new KDB framework. The backend improves over the old GDB support in the following ways: o Unified implementation with minimal MD code. o A simple interface for devices to register themselves as debug ports, ala consoles. o Compression by using run-length encoding. o Implements GDB threading support.	2004-07-10 17:47:22 +00:00
Brian Somers	0ac4013324	Change the following environment variables to kernel options: bootp -> BOOTP bootp.nfsroot -> BOOTP_NFSROOT bootp.nfsv3 -> BOOTP_NFSV3 bootp.compat -> BOOTP_COMPAT bootp.wired_to -> BOOTP_WIRED_TO - i.e. back out the previous commit. It's already possible to pxeboot(8) with a GENERIC kernel. Pointed out by: dwmalone	2004-07-08 22:35:36 +00:00
Marcel Moolenaar	1e3e78d239	MFamd64 (1.275): Reduce the scope of the Giant lock being held for non-mpsafe syscalls. There was way too much code being covered.	2004-07-08 21:08:07 +00:00
Marcel Moolenaar	469df33664	Better handle the break instruction trap. The runtime specification has outlined which break numbers are software interrupts, debugger breakpoints and ABI specific breaks. We mostly treated all break numbers we didn't care about as debugger breakpoints.	2004-07-08 16:30:42 +00:00
Brian Somers	59e1ebc9b5	Change the following kernel options to environment variables: BOOTP -> bootp BOOTP_NFSROOT -> bootp.nfsroot BOOTP_NFSV3 -> bootp.nfsv3 BOOTP_COMPAT -> bootp.compat BOOTP_WIRED_TO -> bootp.wired_to This lets you PXE boot with a GENERIC kernel by putting this sort of thing in loader.conf: bootp="YES" bootp.nfsroot="YES" bootp.nfsv3="YES" bootp.wired_to="bge1" or even setting the variables manually from the OK prompt.	2004-07-08 13:40:33 +00:00
Alan Cox	8fe61389a7	- Correct pmap_extract()'s return type. It should be vm_paddr_t, not vm_offset_t. - Convert pmap_extract() to the ANSI style of declaration.	2004-07-05 23:18:48 +00:00
John Baldwin	0c0b25ae91	Implement preemption of kernel threads natively in the scheduler rather than as one-off hacks in various other parts of the kernel: - Add a function maybe_preempt() that is called from sched_add() to determine if a thread about to be added to a run queue should be preempted to directly. If it is not safe to preempt or if the new thread does not have a high enough priority, then the function returns false and sched_add() adds the thread to the run queue. If the thread should be preempted to but the current thread is in a nested critical section, then the flag TDF_OWEPREEMPT is set and the thread is added to the run queue. Otherwise, mi_switch() is called immediately and the thread is never added to the run queue since it is switched to directly. When exiting an outermost critical section, if TDF_OWEPREEMPT is set, then clear it and call mi_switch() to perform the deferred preemption. - Remove explicit preemption from ithread_schedule() as calling setrunqueue() now does all the correct work. This also removes the do_switch argument from ithread_schedule(). - Do not use the manual preemption code in mtx_unlock if the architecture supports native preemption. - Don't call mi_switch() in a loop during shutdown to give ithreads a chance to run if the architecture supports native preemption since the ithreads will just preempt DELAY(). - Don't call mi_switch() from the page zeroing idle thread for architectures that support native preemption as it is unnecessary. - Native preemption is enabled on the same archs that supported ithread preemption, namely alpha, i386, and amd64. This change should largely be a NOP for the default case as committed except that we will do fewer context switches in a few cases and will avoid the run queues completely when preempting. Approved by: scottl (with his re@ hat)	2004-07-02 20:21:44 +00:00
Marcel Moolenaar	3d8f0528e5	Unbreak build: define __RMAN_RESOURCE_VISIBLE See also src/sys/sys/rman.h rev. 1.21.	2004-06-30 23:55:14 +00:00
Nate Lawson	1a26ea7f2c	Add machdep quirks functions. On i386, this disables acpi on systems with BIOS dates earlier than Jan 1, 1999. Add prototypes and quirks flags.	2004-06-30 04:42:29 +00:00
Alan Cox	2551e6f323	- Remove unused definitions. - Move a definition inside the scope of a #ifdef _KERNEL.	2004-06-23 08:06:52 +00:00
Bruce Evans	4c5f10a672	Backed out previous commit. Blind substitution of dev_t by `struct cdev *' was just wrong here because the dev_t's are user dev_t's.	2004-06-20 03:52:50 +00:00
Alan Cox	ffcbbfc220	Remove dead code related to pv entry allocation. Reviewed by: marcel@	2004-06-19 20:31:49 +00:00
Poul-Henning Kamp	89c9c53da0	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
Alan Cox	52fae0ba6c	Neither pmap_enter() nor pmap_enter_quick() should create pv entries for unmanaged pages. Tested by: marcel@	2004-06-11 20:11:41 +00:00
Poul-Henning Kamp	1930e303cf	Deorbit COMPAT_SUNOS. We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.	2004-06-11 11:16:26 +00:00
Alan Cox	4e302a62f7	Reduce the number of preallocated pv entries and lpte entries in pmap_init(). Tested by: marcel@	2004-06-11 04:24:35 +00:00
Poul-Henning Kamp	2140d01b27	Machine generated patch which changes linedisc calls from accessing linesw[] directly to using the ttyld...() functions. The ttyld...() functions are inline so there is no performance hit.	2004-06-04 16:02:56 +00:00
Tim J. Robbins	cc05397ffc	Remove checks for curthread == NULL - it can't happen.	2004-06-03 10:22:47 +00:00
Poul-Henning Kamp	fd360128ff	Add missing <sys/module.h> instances which were shadowed by the nested include in <sys/kernel.h>	2004-06-03 05:58:30 +00:00
Tim J. Robbins	fa2a4d0595	Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid having to acquire sched_lock when manipulating it in lockmgr(), uiomove(), and uiomove_fromphys(). Reviewed by: jhb	2004-06-03 01:47:37 +00:00
Poul-Henning Kamp	138fbf675a	Gainfully employ the new ttyioctl in the trivial cases.	2004-06-01 13:49:28 +00:00
Thomas Moestl	65e29c4822	Retire cpu_sched_exit(); it is not used any more.	2004-05-26 12:09:39 +00:00
Bruce Evans	b2321e7cdb	Moved most of the "MI" definitions and declarations from <machine/profile.h> to <sys/gmon.h>. Cleaned them up a little by not attempting to ifdef for incomplete and out of date support for GUPROF in userland, as in the sparc64 version.	2004-05-19 15:41:26 +00:00
Stefan Farfeleder	b1aa0ba527	<stdint.h> should define WINT_M{AX,IN} independent from whether WCHAR_MIN is defined. Otherwise first including <wchar.h> and then <stdint.h> leads to no WINT_M{AX,IN} at all. PR: 64956 Approved by: das (mentor)	2004-05-18 16:04:57 +00:00
Marcel Moolenaar	875bcd3528	Fix typo in comment. While here, end the sentence with a period and remove the empty line between the fdc and sio devices. The empty line suggests that the comment applies to fdc only while it applies to all following devices and options. Typo spotted by: ru@	2004-05-17 18:36:14 +00:00
Marcel Moolenaar	ef1eb2f330	Unbreak build due to previous commit: now that elf_reloc_internal() gets the relocation base passed in relocbase, we cannot declare a local variable with the same name. Assume the argument holds the same value as the local variable did...	2004-05-17 07:11:37 +00:00
Marcel Moolenaar	f2ab7b8bb0	filter out the fdc(4) and sio(4) devices and corresponding options. Note that cy(4) uses COM_MULTIPORT, so we need to keep that option.	2004-05-17 07:03:01 +00:00
Peter Wemm	e8855d4f97	Make a small revision to the api between the elf linker core and the elf_reloc() backends for two reasons. First, to support the possibility of there being two elf linkers in the kernel (eg: amd64), and second, to pass the relocbase explicitly (for relocating .o format kld files).	2004-05-16 20:00:28 +00:00
Marcel Moolenaar	4c86725c82	Revert previous commit. We should not get any FP traps from within the kernel. We can guarantee this by resetting the FP status register. This masks all FP traps. The reason we did get FP traps was that we didn't reset the FP status register in all cases. Make sure to reset the FP status register in syscall(). This is one of the places where it was forgotten. While on the subject, reset the FP status register only when we trapped from user space.	2004-05-07 05:35:31 +00:00
Marcel Moolenaar	99aa060c2b	Make sure to sanitize the FP status register. Specifically this masks all FP traps, which should not happen in the kernel.	2004-05-07 05:29:12 +00:00
Nate Lawson	869ec176fc	Make unnecessary globals static and remove unused includes. Pointed out by: cscout	2004-05-06 02:18:58 +00:00
Nate Lawson	65a7c90189	Add an MI implementation of the ACPI global lock routines and retire the individual asm versions. The global lock is shared between the BIOS and OS and thus cannot use our mutexes. It is defined in section 5.2.9.1 of the ACPI specification. Reviewed by: marcel, bde, jhb	2004-05-05 20:04:14 +00:00
Marcel Moolenaar	e160e18ba6	Floating-point faults and exceptions can happen in the kernel too. Do not panic when it happens; handle them. Run into by: das	2004-05-03 04:13:31 +00:00
Marcel Moolenaar	1b3564abb3	Catch- and cleanup: o Fix and improve comments and references, o Add PFIL_HOOKS, UFS_ACL and UFS_DIRHASH, o Switch from SCHED_4BSD to SCHED_ULE, o Remove SCSI_DELAY (there's no SCSI support),	2004-05-03 00:10:59 +00:00
David E. O'Brien	4e744b5e7f	Spell Ethernet correctly.	2004-05-02 18:57:29 +00:00
Marcel Moolenaar	73b2b50372	Verify the MADT checksum before using the table. Submitted by: njl	2004-05-01 04:08:14 +00:00
David Schultz	be3930682a	Hide FLT_EVAL_METHOD and DECIMAL_DIG in pre-C99 compilation environments. PR: 63935 Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at>	2004-04-25 02:36:29 +00:00
Nate Lawson	8ec94874b2	Don't check for NULL, device_get_softc() always succeeds.	2004-04-21 02:10:58 +00:00
Alan Cox	3edd4a4094	MFamd64 Simplify the sf_buf implementation. In short, make it a veneer over the direct virtual-to-physical mapping.	2004-04-18 07:11:12 +00:00
Alan Cox	4e67150a95	Remove a comment that refers to avail_start and avail_end as these variables no longer exist.	2004-04-11 06:37:36 +00:00
Alan Cox	b14d6acced	- pmap_kenter_temporary() is unused by machine-independent code. Therefore, move its declaration to the machine-dependent header file on those machines that use it. In principle, only i386 should have it. Alpha and AMD64 should use their direct virtual-to-physical mapping. - Remove pmap_kenter_temporary() from ia64. It is unused. Approved by: marcel@	2004-04-10 22:41:46 +00:00
Warner Losh	f36cfd49ad	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 20:46:16 +00:00
Alan Cox	a3b706071c	Remove avail_end. As of yesterday, it is unused.	2004-04-06 01:38:28 +00:00
Alan Cox	c8607538c8	Remove avail_start on those platforms that no longer use it. (Only amd64 does anything with it beyond simple initialization.)	2004-04-05 04:08:00 +00:00
Alan Cox	bdb93eb248	Remove unused arguments from pmap_init().	2004-04-05 00:37:50 +00:00
Alan Cox	121230a40d	In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it should not. Add a new parameter so that the caller can specify which is the case. Reported by: dillon	2004-04-03 09:16:27 +00:00
Marcel Moolenaar	60f66d9174	MFi386: correctly calculate the top-of-stack when a kthread is created with a larger kernel stack. Remove inclusion of opt_kstack_pages.h now that it's unused.	2004-03-27 17:44:25 +00:00
Marcel Moolenaar	6beee8df28	In breakpoint(), use a different immediate to make sure we can distinguish between debugger inserted breakpoints and fixed breakpoints. While here, make sure the break instruction never ends up in the last slot of a bundle by forcing it to be an M-unit instruction. This makes it easier for us to skip over it.	2004-03-21 01:41:29 +00:00
Alan Cox	1af1ebb84e	- Add uiomove_fromphys() implementations to alpha and ia64. These only differ trivially from amd64. - Correct a spelling error in a comment.	2004-03-20 21:06:20 +00:00
Marcel Moolenaar	a36bdc0606	Introduce the cpumask_t type. The purpose of the type is to create a level of abstraction for any and all CPU mask and CPU bitmap variables so that platforms have the ability to break free from the hard limit of 32 CPUs, simply because we don't have more bits in an u_int. Note that the type is not supposed to solve massive parallelism, where the number of CPUs can be larger than the width of the widest integral type. As such, cpumask_t is not supposed to be a compound type. If such would be necessary in the future, we can deal with the issues then and there. For now, it can be assumed that the type is integral and unsigned. With this commit, all MD definitions start off as u_int. This allows us to phase-in cpumask_t at our leisure without breaking anything. Once cpumask_t is used consistently, platforms can switch to wider (or smaller) types if such would be beneficial (or not; whatever :-) Compile-tested on: i386	2004-03-20 20:41:40 +00:00
Marcel Moolenaar	e10f4ce153	Replace uint64_t with unsigned long in struct dbreg.	2004-03-20 05:27:14 +00:00
Marcel Moolenaar	5ab92b80d0	Remove the last traditional hints. These hints only served the purpose for uart(4) to figure out which device to use as console. Use this file to define hw.uart.console instead so that we don't have to put it in the default loader.conf, which makes it hard to override.	2004-03-20 04:23:03 +00:00
John-Mark Gurney	4de27366d1	sync comment with i386's isa.c.. This removes a comment that is YEARS old...	2004-03-17 21:45:55 +00:00
Alan Cox	90ecfebd82	Refactor the existing machine-dependent sf_buf_free() into a machine-dependent function by the same name and a machine-independent function, sf_buf_mext(). Aside from the virtue of making more of the code machine-independent, this change also makes the interface more logical. Before, sf_buf_free() did more than simply undo an sf_buf_alloc(); it also unwired and if necessary freed the page. That is now the purpose of sf_buf_mext(). Thus, sf_buf_alloc() and sf_buf_free() can now be used as a general-purpose ephemeral map cache.	2004-03-16 19:04:28 +00:00
Scott Long	11d905ecd8	Now that contigfree() does not require Giant, don't grab it in busdma.	2004-03-13 15:42:59 +00:00
Marcel Moolenaar	39209a2fad	Identify the Deerfield processor. Deerfield is a low-voltage variant based on the Madison core and targeting the low end of the spectrum. Its clock frequency is 1GHz, whereas Madison starts at 1.3GHz. Since the CPUID information is the same for Madison and Deerfield, we use the clock frequency to identify the processor. Supposedly the Deerfield only uses 62W, which seems to be less than modern Xeon processors (about 70W) and about half what a Madison would need.	2004-03-10 22:23:20 +00:00
Alan Cox	fcffa790e9	Retire pmap_pinit2(). Alpha was the last platform that used it. However, ever since alpha/alpha/pmap.c revision 1.81 introduced the list allpmaps, there has been no reason for having this function on Alpha. Briefly, when pmap_growkernel() relied upon the list of all processes to find and update the various pmaps to reflect a growth in the kernel's valid address space, pmap_pinit2() served to avoid a race between pmap initialization and pmap_growkernel(). Specifically, pmap_pinit2() was responsible for initializing the kernel portions of the pmap and pmap_pinit2() was called after the process structure contained a pointer to the new pmap for use by pmap_growkernel(). Thus, an update to the kernel's address space might be applied to the new pmap unnecessarily, but an update would never be lost.	2004-03-07 21:06:48 +00:00
Alan Cox	d82852fbb9	Integrate the code from pmap_pinit2() into pmap_pinit(), leaving pmap_pinit2() empty. Approved by: marcel	2004-03-07 07:43:13 +00:00
Alan Cox	925d2fedf5	Remove unused declarations. (Some time ago, these variables became fields of vm/vm.h's struct kva_md_info.)	2004-03-07 07:13:15 +00:00
Lukas Ertl	1bcf24ee9d	Fix syntax errors and wrong function prototypes in several MD header files when using non-GNUC compilers. PR: kern/58515 Submitted by: Stefan Farfeleder <stefan@fafoe.narf.at> Approved by: grog (mentor), obrien	2004-03-05 09:19:59 +00:00
Marcel Moolenaar	27e327fdaf	Do not pre-map the I/O port space. On the Intel Tiger 4 this conflicts with a memory mapped I/O range that's immediately before it and is not 256MB aligned. As a result, when an address is accessed in the memory mapped range and a direct mapping is added for it, it overlaps with the pre-mapped I/O port space and causes a machine check. Based on a patch from: arun@	2004-02-22 02:10:48 +00:00
Poul-Henning Kamp	dc08ffec87	Device megapatch 4/6: Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION. Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.	2004-02-21 21:10:55 +00:00
Poul-Henning Kamp	8e1f1df080	Device megapatch 3/6: Add missing D_TTY flags to various drivers. Complete asserts that dev_t's passed to ttyread(), ttywrite(), ttypoll() and ttykqwrite() have (d_flags & D_TTY) and a struct tty pointer. Make ttyread(), ttywrite(), ttypoll() and ttykqwrite() the default cdevsw methods for D_TTY drivers and remove the explicit initializations in various drivers cdevsw structures.	2004-02-21 20:41:11 +00:00
Poul-Henning Kamp	c9c7976f7f	Device megapatch 1/6: Free approx 86 major numbers with a mostly automatically generated patch. A number of strategic drivers have been left behind by caution, and a few because they still (ab)use their major number.	2004-02-21 19:42:58 +00:00
Poul-Henning Kamp	0b7ed341e1	Change the disk(9) API in order to make device removal more robust. Previously the "struct disk" were owned by the device driver and this gave us problems when the device disappeared and the users of that device were not immediately disappearing. Now the struct disk is allocated with a new call, disk_alloc() and owned by geom_disk and just abandoned by the device driver when disk_create() is called. Unfortunately, this results in a ton of "s/\./->/" changes to device drivers. Since I'm doing the sweep anyway, a couple of other API improvements have been carried out at the same time: The Giant awareness flag has been flipped from DISKFLAG_NOGIANT to DISKFLAG_NEEDSGIANT. A version number has been added to disk_create() so that we can detect, report and ignore binary drivers with old ABI in the future. Manual page update to follow shortly.	2004-02-18 21:36:53 +00:00
Marcel Moolenaar	a753b14687	Sort PFIL_HOOKS.	2004-01-27 20:22:53 +00:00
Jeff Roberson	048ac395be	- Recruit some new ULE users by making it the default scheduler in GENERIC. ULE will be in a probationary period to determine whether it will be left as the default in 5.3 which would likely mean the rest of the 5.x series.	2004-01-24 21:38:52 +00:00
Jacques Vidrine	5864cda7c6	Add PFIL_HOOKS to the GENERIC kernel configuration, primarily so that one can load the IPFilter module (which requires PFIL_HOOKS). Requested by: Many, for over a year	2004-01-24 14:59:51 +00:00
Marcel Moolenaar	28466ae036	Fix handling of FP traps: o For traps, the cr.iip register points to the next instruction to execute on interrupt return (modulo slot). Since we need to get the bundle of the instruction that caused the FP fault/trap, make sure we fetch the previous bundle if the next instruction is in fact the first in a bundle. o When we call the FPSWA handler, we need to tell it whether it's a trap or a fault (first argument). This was hardcoded to mean a fault. Also, for FP faults, when a fault is converted to a trap, adjust the cr.iip and cr.ipsr registers to point to the next instruction. This makes sure that the SIGFPE handler gets a consistent state.	2004-01-20 03:29:24 +00:00
Marcel Moolenaar	dd45fd9791	s/framep/tf/g -- this normalizes on the use of tf to point to the trapframe and improves grep-ability.	2004-01-20 02:35:46 +00:00
Dag-Erling Smørgrav	2d6853a650	Whitespace nit.	2004-01-13 15:30:36 +00:00
Jacques Vidrine	e4dc8baa84	Provide sysarch(2) prototypes in the MD sysarch.h headers. While I'm at it, use the ANSI C generic pointer type for the second argument, thus matching the documentation. Remove the now extraneous (and now conflicting) function declarations in various libc sources. Remove now unnecessary casts. Reviewed by: bde	2004-01-09 16:52:09 +00:00
David Xu	a30ec4b99c	Make sigaltstack per-thread, because per-process sigaltstack state is useless for threaded programs: multiple threads cannot share the same stack. The alternative signal stack is private to the thread, so no lock is needed. The original P_ALTSTACK is now moved into td_pflags and renamed to TDP_ALTSTACK. For single-threaded or Linux clone() based threaded programs, there is no semantic change, because those programs only have one kernel thread in every process. Reviewed by: deischen, dfr	2004-01-03 02:02:26 +00:00
Mike Silbersack	ddeb5b242e	Track three new sendfile-related statistics: - The number of times sendfile had to do disk I/O - The number of times sfbuf allocation failed - The number of times sfbuf allocation had to wait	2003-12-28 08:57:09 +00:00
Mike Silbersack	5caf2b00f0	Move the declaration of sfbufspeak and sfbufsused to mbuf.h, and use imax instead of max, as sfbufspeak and sfbufsused are signed. Submitted by: bde	2003-12-28 01:43:22 +00:00
Mike Silbersack	5eda9873e9	Track current and peak sfbuf usage, export the values via sysctl.	2003-12-27 07:52:47 +00:00
Marcel Moolenaar	b3378ed911	Don't use NULL with integral types.	2003-12-24 19:55:07 +00:00
Peter Wemm	7655ebdaaa	Return AE_OK for stub functions returning ACPI_STATUS, not NULL	2003-12-24 05:26:26 +00:00
Peter Wemm	c15e347e22	GC the unused <machine/kse.h> file.	2003-12-24 00:51:30 +00:00
Peter Wemm	9b68618df0	Add an additional field to the elf brandinfo structure to support quicker exec-time replacement of the elf interpreter on an emulation environment where an entire /compat/* tree isn't really warranted.	2003-12-23 02:42:39 +00:00
Peter Wemm	59553d37df	Add missing #include "opt_compat.h" so that the compatibility function freebsd4_freebsd32_sigreturn() is defined when expected. This should unbreak the tinderbox. Sorry.	2003-12-18 06:59:18 +00:00
Marcel Moolenaar	a8434283fc	In set_mcontext(), take into account that kse_switchin(2) will eventually be passed an async. context as well as a syscall context. While here, fix a serious bug in that if the trapframe is a syscall frame, but we're restoring an async context, we need to clear the FRAME_SYSCALL flag so that we leave the kernel via exception_restore.	2003-12-14 01:59:31 +00:00
Peter Wemm	64d85faa1c	Assimilate ia64 back into the fold with the common freebsd32/ia32 code. The split-up code is derived from the ia64 code originally. Note that I have only compile-tested this, not actually run-tested it. The ia64 side of the force is missing some significant chunks of signal delivery code.	2003-12-11 01:05:09 +00:00
Peter Wemm	ba3fa22e8c	Fix last second typo.	2003-12-10 22:59:03 +00:00
Peter Wemm	419d43c635	Use gcc's superior ffs() builtin.	2003-12-10 22:51:40 +00:00
Peter Wemm	a80f5f272c	Use ffs(x) == popcnt(x ^ (x - 1)) to implement 64 bit ffsl(). gcc's ffs() builtin uses this already but truncates the upper 32 bits.	2003-12-10 22:47:02 +00:00
Marcel Moolenaar	b291daba6f	Don't panic for misalignment traps when the onfault handler is set. Not all transfers between kernel and user space are byte oriented and thus alignment safe. Especially fuword() and suword() are sensitive to alignment but in general more optimal than block copies. By catching the misalignment trap we avoid pessimizing the common case of properly aligned memory accesses which we would do if we were to use byte copies or adding tests for proper alignment. Note that the expectation that the kernel produces aligned pointers is unchanged. This change therefore relates to possible unaligned pointers generated in userland.	2003-12-09 09:52:14 +00:00
Nate Lawson	cac6460cfe	Use the ACPI-CA definitions for the various APIC tables instead of our own.	2003-12-09 03:04:19 +00:00
David E. O'Brien	a5b5101f5e	Move the bktr(4) <arch>/include/ioctl_{bt848,meteor}.h files to dev/bktr as these ioctl's aren't MD. This also means they are installed in /usr/include/dev/bktr now. Also provide compatibility wrappers for where these headers lived in 4.x.	2003-12-08 07:22:42 +00:00
Marcel Moolenaar	47eb01b822	Simplify the contexts created by the kernel and remove the related flags. We now create asynchronous contexts or syscall contexts only. Syscall contexts differ from the minimal ABI dictated contexts by having the scratch registers saved and restored because that's where we keep the syscall arguments and syscall return values. Since this change affects KSE, have it use kse_switchin(2) for the "new" syscall context.	2003-12-07 20:47:33 +00:00
Warner Losh	05a463a03d	Ooops. These are still used by the bktr driver. David O'Brien has plans for dealing, but I'll let him deal. Pointy hat to: imp@	2003-12-07 06:37:32 +00:00
Warner Losh	65b4a1b917	Remove meteor driver. It hasn't compiled in over 3 years. If someone makes it compile again, and can test it, we can restore the driver to the tree.	2003-12-07 04:41:11 +00:00
John Baldwin	798a45964d	- Split cpu_mp_probe() into two parts. cpu_mp_setmaxid() is still called very early (SI_SUB_TUNABLES - 1) and is responsible for setting mp_maxid. cpu_mp_probe() is now called at SI_SUB_CPU and determines if SMP is actually present and sets mp_ncpus and all_cpus. Splitting these up allows an architecture to probe CPUs later than SI_SUB_TUNABLES by just setting mp_maxid to MAXCPU in cpu_mp_setmaxid(). This could allow the CPU probing code to live in a module, for example, since sysinit's in modules cannot be invoked prior to SI_SUB_KLD. This is needed to re-enable the ACPI module on i386. - For the alpha SMP probing code, use LOCATE_PCS() instead of duplicating its contents in a few places. Also, add a smp_cpu_enabled() function to avoid duplicating some code. There is room for further code reduction later since much of this code is also present in cpu_mp_start(). - All archs besides i386 still set mp_maxid to the same values they set it to before this change. i386 now sets mp_maxid to MAXCPU. Tested on: alpha, amd64, i386, ia64, sparc64 Approved by: re (scottl)	2003-11-21 22:23:26 +00:00
Marcel Moolenaar	1fc7ca0fb1	Set the ACPI processor Id in the PCPU structure so that CPU idling on SMP systems has a chance of working. This was a loose end of the implementation of the ACPI Cx idle states. Since our logical CPU Id is the ACPI processor Id, we do not need to jump through hoops to obtain it. Approved: re@ (jhb)	2003-11-20 16:42:39 +00:00
Peter Wemm	0bfbe7b935	Widen the enable/disable helper function's argument in line with the ithread_create() changes etc. This should be mostly a NOP.	2003-11-17 06:10:15 +00:00
Bruce Evans	81bbee5996	Fixed a pedantic syntax error (a stray semicolon at the end of PCPU_MD_FIELDS).	2003-11-17 03:40:41 +00:00
Alan Cox	0ec3db3072	- Remove unnecessary synchronization from sf_buf_init(). (There is only one active CPU when sf_buf_init() is performed.)	2003-11-16 23:40:06 +00:00
Alan Cox	e45db9b837	- Modify alpha's sf_buf implementation to use the direct virtual-to-physical mapping. - Move the sf_buf API to its own header file; make struct sf_buf's definition machine dependent. In this commit, we remove an unnecessary field from struct sf_buf on the alpha, amd64, and ia64. Ultimately, we may eliminate struct sf_buf on those architectures except as an opaque pointer that references a vm page.	2003-11-16 06:11:26 +00:00
Nate Lawson	b72e9cf526	Add the pc_acpi_id PCPU member. The new acpi_cpu driver uses this to dereference the softc.	2003-11-15 18:58:29 +00:00
Marcel Moolenaar	eea3bbdff8	Remove ia64_highfp_load() now that it's unused.	2003-11-12 03:24:34 +00:00
Marcel Moolenaar	0d9ae4e24e	Further work-out the handling of the high FP registers. The most important change is in cpu_switch() where we disable the high FP registers for the thread that we switch-out if the CPU currently has its high FP registers. This avoids that the high FP registers remain enabled for the thread even when the CPU has unloaded them or the thread migrated to another processor. Likewise, when we switch-in a thread that has its high FP registers on the CPU, we enable them. This avoids an otherwise harmless, but unnecessary trap to have them enabled. The code that handles the disabled high FP trap (in trap()) has been turned into a critical section for the most part to avoid being preempted. If there's a race, we bail out and have the processor trap again if necessary. Avoid using the generic ia64_highfp_save() function when the context is predictable. The function adds unnecessary overhead. Don't use ia64_highfp_load() for the same reason. The function is now unused and can be removed. These changes make the lazy context switching of the high FP registers in an UP kernel functional.	2003-11-12 01:26:02 +00:00
Marcel Moolenaar	a5ba2b5cc4	Save and restore the high FP registers in {g\|s}et_mcontext(). Note that we currently do not keep track of whether the thread has actually used the high FP registers before. If not, we should not save them in the context which automatically means that we also would not restore them from the context. For now, do it unconditionally so that we can reach functional completeness.	2003-11-11 09:53:37 +00:00
Marcel Moolenaar	9d52656a5a	Fix a nasty bug that got exposed when the sendsig() and sigreturn() functions switched to using {g\|s}et_mcontext(). The problem is that sigreturn(), being a syscall, can be given an async. context (i.e. one corresponding to an interrupt or trap). When this happens, we try to return to user mode via epc_syscall_return with a trapframe that can only be used to return to user mode via exception_restore. To fix this, we check the frame's flags immediately prior to epc_syscall_return and branch to exception_restore for non-syscall frames. Modify the assertion in set_mcontext() to check that if there's a mismatch, it's because of sigreturn().	2003-11-11 09:25:19 +00:00
Marcel Moolenaar	9422d61a1f	In get_mcontext(), do not update bspstore and ndirty in the trapframe. Only update them in the newly created context to reflect the state after copying the dirty registers onto the user stack. If we were to update the trapframe, we lose the state at entry into the kernel. We may need that after we create the context, such as for KSE upcalls. We have to update the trapframe after writing the dirty registers to the user stack for signal delivery to work. But this is best done in sendsig() itself where it applies, not in get_mcontext() where it's done unconditionally.	2003-11-10 05:28:05 +00:00
Marcel Moolenaar	3534a08109	When a thread is being swapped-out, save the high FP registers. We have a pointer in the PCPU to the PCB of the thread that currently has its high FP registers loaded.	2003-11-09 23:13:23 +00:00
Marcel Moolenaar	ac8c7680a6	Use get_mcontext() to construct the signal context in sendsig() and use set_mcontext() to restore the context in sigreturn(). Since we put the syscall number and the syscall arguments in the trapframe (we don't save the scratch registers for syscalls, which allows us to reuse the space to our advantage), create a MD specific flag so that we save the scratch registers even for syscalls. We would not be able to restart a syscall otherwise. The signal trampoline does not need to flush the registers anymore, because get_mcontext() already handles that. In fact, if we set up the context correctly, we do not need to have a trampoline at all. This change however only minimally changes the trampoline code. In follow-up commits this can be further optimized. Note that normally we preserve cfm and iip in the trapframe created by the EPC syscall path when we restore a context in set_mcontext() because those fields are not normally set for a synchronous context. The kernel puts the return address and frame info of the syscall stub in there. By preserving these fields we hide this detail from userland which allows us to use setcontext(2) for user created contexts. However, sigreturn() is commonly called from the trampoline, which means that if we preserve cfm and iip in all cases, we would return to the trampoline after the sigreturn(), which means we hit the safety net: we call exit(2). So, we do not preserve cfm and iip when we have a synchronous context that also has scratch registers (the uncommon context created by sendsig() only), under the assumption that if such a context is created in userland, something special is going on and the use of cfm and iip is then just another quirk. All this is invisible in the common case.	2003-11-09 22:17:36 +00:00
Marcel Moolenaar	fcaa2925a9	Change the clear_ret argument of get_mcontext() to be a flags argument. Since all callers either passed 0 or 1 for clear_ret, define bit 0 in the flags for use as clear_ret. Reserve bits 1, 2 and 3 for use by MI code for possible (but unlikely) future use. The remaining bits are for use by MD code. This change is triggered by a need on ia64 to have another knob for get_mcontext().	2003-11-09 20:31:04 +00:00
Marcel Moolenaar	00bd917263	Remove the atkbd, psm, sc and vga devices. Most ia64 boxes out there are zx1 based machines and they don't particularly like it when we poke at them with PC legacy code. The atkbd and psm devices were disabled in the hints file so that one could enable them on machines that support legacy devices, but that's not really something you can expect from a first-time installer. This still leaves syscons (sc) and the vga device, which were enabled by default and wreaking havoc anyway. We could disable them by default like the atkbd and psm devices, but there's really no point in pretending we're in a better shape that way.	2003-11-08 23:19:13 +00:00
Scott Long	eb3b7bf69f	Document the lockfunc and lockfuncarg arguments to bus_dma_tag_create() in the busdma headers.	2003-11-07 23:29:42 +00:00
John Baldwin	dac33f12cc	Regen.	2003-11-07 20:30:30 +00:00
John Baldwin	a060e9b7ef	Sync with global syscalls.master. ptrace(), dup(), pipe(), ktrace(), ia32_sigaltstack(), sysarch(), issetugid(), utrace(), and ia32_sigaction() are MP safe.	2003-11-07 20:27:16 +00:00
Marcel Moolenaar	51e25af386	Add support for unaligned ld2, st2, st4 and st8. While here, make sure we handle stacked registers properly by taking into account that: 1. bspstore points after the frame (due to cover), 2. we need to adjust for intermediate NaT collections.	2003-11-06 04:26:40 +00:00
Marcel Moolenaar	2642a8845b	Handle unaligned 4-byte loads. While in the neighborhood, remove the cr.isr sanity check. We actually encounter insanities, which very likely means that the insanity check itself is insane. Remove an empty comment while I'm at it.	2003-11-03 08:04:04 +00:00
Marcel Moolenaar	6537124772	Add a bogus definition of __va_list for use by lint. Make it visible only when lint is defined to protect builds with non-GNU compilers.	2003-11-03 05:04:09 +00:00
Marcel Moolenaar	fcca8c1dde	Remove headers copied from i386 and either useless or wrong on ia64. An example of useless is bios.h. An example of wrong is msdos.h (due to the use of long for 32-bit fields). display.h cannot be removed because it's used by syscons. That header however has no platform dependency and shouldn't really be here. Removal of these headers may cause build failures in the ports tree. It's the ports that need fixing in that case. Tested with: buildworld, LINT	2003-11-02 09:19:07 +00:00
Marcel Moolenaar	3bdfa17c6c	When switching the RSE to use the kernel stack as backing store, keep the RNAT bit index constant. The net effect of this is that there's no discontinuity WRT NaT collections which greatly simplifies certain operations. The cost of this is that there can be up to 504 bytes of unused stack between the true base of the kernel stack and the start of the RSE backing store. The cost of adjusting the backing store pointer to keep the RNAT bit index constant, for each kernel entry, is negligible. The primary reasons for this change are: 1. Asynchronous contexts in KSE processes have the disadvantage of having to copy the dirty registers from the kernel stack onto the user stack. The implementation we had so far copied the registers one at a time without calculating NaT collection values. A process that used speculation would not work. Now that the RNAT bit index is constant, we can block-copy the registers from the kernel stack to the user stack without having to worry about NaT collections. They will be in the right place on the user stack. 2. The ndirty field in the trapframe is now also usable in userland. This was previously not the case because ndirty also includes the space occupied by NaT collections. The value could be off by 8, depending on the discontinuity. Now that the RNAT bit index is constant, we have exactly the same number of NaT collection points on the kernel stack as we would have had on the user stack if we didn't switch backing stores. 3. Debuggers and other applications that use ptrace(2) can now copy the dirty registers from the kernel stack (using ptrace(2)) and copy them wherever they want them (onto the user stack of the inferior as might be the case for gdb) without having to worry about NaT collections in the same way the kernel doesn't have to worry about them. 
There's a second order effect caused by the randomization of the base of the backing store, for it depends on the number of dirty registers the processor happened to have at the time of entry into the kernel. The second order effect is that the RSE will have a better cache utilization as compared to having the backing store always aligned at page boundaries. This has not been measured and may be in practice only minimally beneficial, if at all measurable.	2003-10-28 19:38:26 +00:00
Marcel Moolenaar	95b0df9df2	The previous commit removed both clause 3 and clause 4 from the UCB license. Only clause 3 has been revoked. Restore the fourth clause as clause 3. Pointed out by: das@ Remove my name as a copyright holder since I don't use a BSD license compatible or comparable to the UCB license. I choose not to add a complete second license for my work for aesthetic reasons, nor to replace the UCB license on grounds of rewriting more than 90% of the source files. The rewrite can also be seen as an enhancement and since the files were practically empty, it's rather trivial to have changed 90% of the files.	2003-10-27 22:54:34 +00:00
Marcel Moolenaar	f74fae21b8	Add support for userland to access I/O port space. This is primarily added for XFree86. There are 2 reasons for doing this with sysarch(): 1. The memory mapped I/O space is not at a fixed physical address. An application has to use some interface to get the base address. It gets worse if the machine has multiple memory mapped I/O spaces. 2. Access to the memory mapped I/O space needs to happen through a translation that is flagged as uncachable. There's no interface that allows a process to do uncached memory I/O, other than though /dev/mem (possibly). So, until we either disallow direct access to I/O or bus space from userland or have a better way of doing this, sysarch() has the least negative impact on existing interfaces.	2003-10-27 05:45:35 +00:00
Marcel Moolenaar	3a988c5c87	Remove unused header. See also ia64/disasm/disasm.h.	2003-10-24 06:53:43 +00:00
Marcel Moolenaar	2a0a749f39	Remove ia64_pack_bundle() and ia64_unpack_bundle(). They are not used anymore.	2003-10-24 06:52:21 +00:00
Marcel Moolenaar	4d85274d1a	Remove unused file. db_disasm() has been implemented in db_interface.c now.	2003-10-24 06:48:41 +00:00
Marcel Moolenaar	5664617492	Implement db_disasm() by using the new disassembler. Temporarily unimplement db_write_breakpoint() and db_clear_breakpoint().	2003-10-24 06:42:03 +00:00
Arun Sharma	f47392f4c2	Use a TR of size 1 << IA64_ID_PAGE_SHIFT instead of 16M to avoid overlapping TR/TC entries (which results in a machine check). Note that we don't look at the size of the memory descriptor, because it doesn't guarantee non-overlap. With this change, a UP kernel could boot on an Intel Tiger4 machine with the following options: options LOG2_ID_PAGE_SIZE=26 # 64M options LOG2_PAGE_SIZE=14 # 16K Approved by: marcel	2003-10-24 04:56:58 +00:00
Marcel Moolenaar	5c03a7c7f9	Don't use fuword() or suword() unconditionally. They explicitly disallow reading or writing.	2003-10-24 02:33:26 +00:00
Marcel Moolenaar	3fc58f92dc	Remove two unused fields in the operand structure (o_read & o_write).	2003-10-24 02:05:53 +00:00
Marcel Moolenaar	764015afda	Cleanup. Remove the md_flags for threads. It's not used. The flags we had were bogus. While here, reassign the copyright to the Project. There's nothing in this files that originates from NetBSD, especially now that the FreeBSD/alpha bits have been removed, but even then the amount of inherited code that we actually used was nil.	2003-10-23 06:41:59 +00:00
Marcel Moolenaar	32efda28bf	Reimplement unaligned_fixup() using the new disassembler and a mcontext_t for the register values. Currently only ld8 and ldfd instructions are handled as those are the ones we need now (a misaligned ld8 occurs 4 times in ntpd(8) and a misaligned ldfd occurs once in mozilla 1.4 and 1.5). Other instructions are added when needed.	2003-10-23 06:32:34 +00:00
Marcel Moolenaar	49e4ce1f63	Remove unused include of <machine/inst.h>	2003-10-23 06:23:55 +00:00
Marcel Moolenaar	5a931213f0	Remove prototype of unaligned_fixup() and fix a nearby style(9) bug.	2003-10-23 06:21:44 +00:00
Marcel Moolenaar	26c41f9dd1	Add prototypes for spillfd() and unaligned_fixup().	2003-10-23 06:20:38 +00:00
Marcel Moolenaar	075f7fe484	Add spillfd(). This function loads a double-precision FP register at the first address and spills it to the second address. This allows unaligned_fixup() to update the context of the process in a way that assures proper rounding. Similar functions for single-and extended-precision are added when needed.	2003-10-23 06:19:06 +00:00
Marcel Moolenaar	b9eabb421b	Add a new disassembler that improves over the previous disassembler in that it provides an abstract (intermediate) representation for instructions. This significantly improves working with instructions such as emulation of instructions that are not implemented by the hardware (e.g. long branch) or enhancing implemented instructions (e.g. handling of misaligned memory accesses). Not to mention that it's much easier to print instructions. Functions are included that provide a textual representation for opcodes, completers and operands. The disassembler supports all ia64 instructions defined by revision 2.1 of the SDM (Oct 2002).	2003-10-23 06:01:52 +00:00
Marcel Moolenaar	9ee99eb496	Remove md_bspstore from the MD fields of struct thread. Now that the backing store is at a fixed address, there's no need for a per-thread variable.	2003-10-21 01:13:49 +00:00
Marcel Moolenaar	bab1f05277	Put the RSE backing store at a fixed address. This change is triggered by libguile that needs to know the base of the RSE backing store. We currently do not export the fixed address to userland by means of a sysctl so user code needs to hardcode it for now. This will be revisited later. The RSE backing store is now at the bottom of region 4. The memory stack is at the top of region 4. This means that the whole region is usable for the stacks, giving a 61-bit stack space. Port: lang/guile (depended of x11/gnome2)	2003-10-20 05:34:10 +00:00
Nate Lawson	4c3655b418	Add the cpu_idle_hook() function pointer so that other idlers can be hooked at runtime. Make C1 sleep (e.g., HLT) be the default. This prepares the way for further ACPI sleep states.	2003-10-18 22:25:07 +00:00
Marcel Moolenaar	b0f865c1f3	Implement cpu_idle() on ia64. We put the processor in a lightweight halt state that minimizes power consumption while still preserving cache and TLB coherency. Halting the processor is not conditional at this time. Tested with UP and SMP kernels.	2003-10-17 02:24:59 +00:00
Robert Drehmel	ea924c4cd3	Implement preliminary support for the PT_SYSCALL command to ptrace(2).	2003-10-09 10:17:16 +00:00
Marcel Moolenaar	c3f4e4fbb5	With BETA 5 of libuwx some of the application registers are renamed from UWX_REG_MUMBLE to UWX_REG_AR_MUMBLE. Compatibility defines are present in libuwx. Change the names here so that we don't depend on compatibility defines. Note that there's now an UWX_REG_PFS and an UWX_REG_AR_PFS and the former is not a compatibility define for the latter AFAICT. Change to UWX_REG_AR_PFS as that seems to be the one we need to handle.	2003-10-09 03:11:37 +00:00
Marcel Moolenaar	f3e533d270	Include <sys/smp.h> for the prototype of smp_rendezvous().	2003-10-08 19:55:45 +00:00
Bruce M Simpson	2bc7dd5661	Move pmap_resident_count() from the MD pmap.h to the MI pmap.h. Add a definition of pmap_wired_count(). Add a definition of vmspace_wired_count(). Reviewed by: truckman Discussed with: peter	2003-10-06 01:47:12 +00:00
Alan Cox	566526a957	Migrate pmap_prefault() into the machine-independent virtual memory layer. A small helper function pmap_is_prefaultable() is added. This function encapsulates the few lines of pmap_prefault() that actually vary from machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have much in common. Going forward, it's worth considering their merger.	2003-10-03 22:46:53 +00:00
Marcel Moolenaar	5bf2d2b6b4	Swap the syscall caller frame info (i.e. the return pointer and frame marker) and the syscall stub frame info in the trap frame. Previously we stored the stub frame info in (rp,pfs) and the caller frame info in (iip,cfm). This ends up being suboptimal for the following reasons: 1. When we create a new context, such as for an execve(2), we had to set the (rp,pfs) pair for the entry point when using the syscall path out of the kernel but we need to set the (iip,cfm) pair when we take the interrupt way out. This is mostly just an inconsistency from the kernel's point of view, but an ugly irregularity from gdb(1)'s point of view. 2. The getcontext(2) and setcontext(2) syscalls had to swap the (rp,pfs) and (iip,cfm) pairs to make the context compatible with one created purely in userland. Swapping the (rp,pfs) and (iip,cfm) pairs is visible to signal handlers that actually peek at the mcontext_t and to gdb(1). Since this change is made for gdb(1) and we don't care about signal handlers that peek at the mcontext_t because we're still a tier 2 platform, this ABI breakage is academic at this moment in time. Note that there was no real reason to save the caller frame info in (iip,cfm) and the stub frame info in (rp,pfs).	2003-10-03 03:50:29 +00:00
Marcel Moolenaar	c0e56dc2c3	Drop any and all support for varargs. There's no history to worry about because we're still tier 2 and our current compiler, as well as future compilers will not support varargs. This is mostly a no-op in practice, because <sys/varargs.h> should already cause compile failures.	2003-09-28 05:34:07 +00:00
Poul-Henning Kamp	26b0e90ca2	Set cn_name, not cn_dev	2003-09-26 10:37:16 +00:00
Peter Wemm	c460ac3a00	Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit systems where the data/stack/etc limits are too big for a 32 bit process. Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c. Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 hierarchy. Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced. Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does. Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.	2003-09-25 01:10:26 +00:00
Yoshihiro Takahashi	33e38a2cc8	Implement the bus_space_map() function to allocate resources and initialize a bus_handle, but currently it only initializes a bus_handle.	2003-09-23 08:22:34 +00:00
Marcel Moolenaar	719325db8e	Fix the last remaining problem encountered by KSE: apparently it is not guaranteed that the RSE writes the NaT collection immediately, sort of atomically, to the backing store when it writes the register immediately prior to the NaT collection point. This means that we cannot assume that the low 9 bits of the backingstore pointer do not point to the NaT collection. This is rather a surprise and I don't know at this time if it's a bug in the Merced or that it's actually a valid condition of the architecture. A quick scan over the sources does not indicate that we depend on the false assumption elsewhere, but it's something to keep in mind. The fix is to write the saved contents of the ar.rnat register to the backingstore prior to entering the loop that copies the dirty registers from the kernel stack to the user stack.	2003-09-20 20:34:58 +00:00
Marcel Moolenaar	b8d941f010	Move uma_small_alloc() and uma_small_free() to uma_machdep.c. These functions reference UMA internals from <vm/uma_int.h>, which makes them highly unwanted in non-UMA specific files. While here, prune the includes in pmap.c and use __FBSDID(). Move the includes above the descriptive comment. The copyright of uma_machdep.c is assigned to the project and can be reassigned to the foundation if and when such is preferable.	2003-09-20 19:27:48 +00:00
Marcel Moolenaar	fe4723c884	Fix the most significant KSE breakage caused by not restoring the restart instruction bits in the PSR. As such, we were returning from interrupt to the instruction in the bundle that caused us to enter the kernel, only now we're returning to a completely different bundle. While close here: add two KASSERTs to make sure that we restore sync contexts only when entered the kernel through a syscall and restore an async context only when entered the kernel through an interrupt, trap or fault. While not exactly here, but close enough: use suword64() when we copy the dirty registers from the kernel stack to the user stack. The code was intended to be replaced shortly after being added, but that was a couple of weeks ago. I might as well avoid that it is a source for panics until it's replaced.	2003-09-19 22:51:26 +00:00
Marcel Moolenaar	ebe42add33	Revamp trap(): make it more explicit which kinds of traps/faults we can get (or not) and what we do with them. This fixes the behaviour for NaT consumption and speculation faults in that we now don't panic for user faults. Remove the dopanic label and move the code to a function. This makes it easier in the simulator to set a breakpoint. While here, remove the special handling of the old break-based syscall path and move it to where we handle the break vector. While here, reserve a new break immediate for KSE. We currently use the old break-based syscall to deal with restoring async contexts. However, it has the side-effect of also setting the signal mask and calling ast() on the way out. The new break immediate simply restores the context and returns without calling ast().	2003-09-19 22:41:52 +00:00
Marcel Moolenaar	d6c3e38bb2	Change TRAPF_USERMODE and CLOCKF_USERMODE to not test for CPL == 3, but for CPL != 0. For some reason yet unknown it is possible for the CPL to be 2. This would previously be counted as kernel mode, which resulted in nasty panics. By changing the test it is now treated as user mode, which is more correct. We still need to figure out how it is possible that the privilege level can be 2 (or 1 for that matter), because it's not used by us. We only use 3 (user mode) and 0 (kernel mode).	2003-09-19 07:48:22 +00:00
Marcel Moolenaar	549ab7a654	Include "opt_kstack_pages.h". We export KSTACK_PAGES to assembly and better have the right value.	2003-09-19 00:37:41 +00:00
Alan Cox	b9850eb224	Add a new parameter to pmap_extract_and_hold() that is needed to eliminate Giant from vmapbuf(). Idea from: tegge	2003-09-12 07:07:49 +00:00
Marcel Moolenaar	87ad0260ff	Rewrite the SAPIC initialization to always program the RTEs with what we think is the correct trigger mode and polarity. This allows us to implement BUS_CONFIG_INTR() as an update of the RTE in question. Consequently, we can trust the RTE when we enable an interrupt and avoids that we need to know about the trigger mode and polarity at that time.	2003-09-10 22:49:38 +00:00
John Baldwin	42a12b2bd8	Move the definitions for ACPI MADT table entries not present in the ACPICA distribution to a MI header so it can be shared with other architectures.	2003-09-10 06:32:27 +00:00
Marcel Moolenaar	e6882c3469	Introduce IA64_ID_PAGE_{MASK|SHIFT|SIZE} and LOG2_ID_PAGE_SIZE. The latter is a kernel option for IA64_ID_PAGE_SHIFT, which in turn determines IA64_ID_PAGE_MASK and IA64_ID_PAGE_SIZE. The constants are used instead of the literal hardcoding (in its various forms) of the size of the direct mappings created in region 6 and 7. The default and probably only workable size is still 256M, but for kicks we use 128M for LINT.	2003-09-09 05:59:09 +00:00
Alan Cox	ba2157f218	Introduce a new pmap function, pmap_extract_and_hold(). This function atomically extracts and holds the physical page that is associated with the given pmap and virtual address. Such a function is needed to make the memory mapping optimizations used by, for example, pipes and raw disk I/O MP-safe. Reviewed by: tegge	2003-09-08 02:45:03 +00:00
Bill Paul	a94100fa9b	Take the support for the 8139C+/8169/8169S/8110S chips out of the rl(4) driver and put it in a new re(4) driver. The re(4) driver shares the if_rlreg.h file with rl(4) but is a separate module. (Ultimately I may change this. For now, it's convenient.) rl(4) has been modified so that it will never attach to an 8139C+ chip, leaving it to re(4) instead. Only re(4) has the PCI IDs to match the 8169/8169S/8110S gigE chips. if_re.c contains the same basic code that was originally bolted onto if_rl.c, with the following updates: - Added support for jumbo frames. Currently, there seems to be a limit of approximately 6200 bytes for jumbo frames on transmit. (This was determined via experimentation.) The 8169S/8110S chips apparently are limited to 7.5K frames on transmit. This may require some more work, though the framework to handle jumbo frames on RX is in place: the re_rxeof() routine will gather up frames that span multiple 2K clusters into a single mbuf list. - Fixed bug in re_txeof(): if we reap some of the TX buffers, but there are still some pending, re-arm the timer before exiting re_txeof() so that another timeout interrupt will be generated, just in case re_start() doesn't do it for us. - Handle the 'link state changed' interrupt - Fix a detach bug. If re(4) is loaded as a module, and you do tcpdump -i re0, then you do 'kldunload if_re,' the system will panic after a few seconds. This happens because ether_ifdetach() ends up calling the BPF detach code, which notices the interface is in promiscuous mode and tries to switch promisc mode off while detaching the BPF listener. This ultimately results in a call to re_ioctl() (due to SIOCSIFFLAGS), which in turn calls re_init() to handle the IFF_PROMISC flag change. Unfortunately, calling re_init() here turns the chip back on and restarts the 1-second timeout loop that drives re_tick(). 
By the time the timeout fires, if_re.ko has been unloaded, which results in a call to invalid code and blows up the system. To fix this, I cleared the IFF_UP flag before calling ether_ifdetach(), which stops the ioctl routine from trying to reset the chip. - Modified comments in re_rxeof() relating to the difference in RX descriptor status bit layout between the 8139C+ and the gigE chips. The layout is different because the frame length field was expanded from 12 bits to 13, and they got rid of one of the status bits to make room. - Add diagnostic code (re_diag()) to test for the case where a user has installed a broken 32-bit 8169 PCI NIC in a 64-bit slot. Some NICs have the REQ64# and ACK64# lines connected even though the board is 32-bit only (in this case, they should be pulled high). This fools the chip into doing 64-bit DMA transfers even though there is no 64-bit data path. To detect this, re_diag() puts the chip into digital loopback mode and sets the receiver to promiscuous mode, then initiates a single 64-byte packet transmission. The frame is echoed back to the host, and if the frame contents are intact, we know DMA is working correctly, otherwise we complain loudly on the console and abort the device attach. (At the moment, I don't know of any way to work around the problem other than physically modifying the board, so until/unless I can think of a software workaround, this will have to do.) - Created re(4) man page - Modified rlphy.c to allow re(4) to attach as well as rl(4). Note that this code works for the sample 8169/Marvell 88E1000 NIC that I have, but probably won't work for the 8169S/8110S chips. RealTek has sent me some sample NICs, but they haven't arrived yet. I will probably need to add an rlgphy driver to handle the on-board PHY in the 8169S/8110S (it needs special DSP initialization).	2003-09-08 02:11:25 +00:00
Marcel Moolenaar	5e3cb29a6b	Untangle the code in this file to improve understandability. Both ia64_count_cpus() and ia64_probe_sapics() called a single function to do the actual work. The difference in behaviour was handled in that function and was further complicated by adding bootverbose related code. As such, even the simplest of changes was hard to comprehend. Untangling has been done by increasing code duplication and using a more naive style of coding. FWIW, the object file is slightly smaller than before, so things aren't as bad as it may seem. Triggered by: a simple fix on the P4 branch that never got merged.	2003-09-07 23:09:08 +00:00
Alan Cox	5d314346f5	MFamd64/i386 Add necessary page locking to pmap_mincore().	2003-09-07 20:02:38 +00:00
Marcel Moolenaar	10a686623d	MFp4: Revamped GENERIC (and hints). This is so much more pleasant to look at...	2003-09-07 06:39:51 +00:00
Marcel Moolenaar	f1220bfe41	Replace sio(4) with uart(4). Remove the sio(4) hints and only add those hints used by uart(4) for the determination of the serial console in the absence of the HCDP table.	2003-09-07 05:47:10 +00:00
Marcel Moolenaar	8d8d970db1	Fix a place where I forgot to change the code that checks whether we return to kernel or userland. This triggered a panic in a KSE application when TDF_USTATCLOCK was set in the case userland was interrupted, but we never called ast() on our way out. As such, we called ast() at some other time. Unfortunately, TDF_USTATCLOCK handling assumes running in the interrupt thread. This was not the case anymore. To avoid making the same mistake later, interrupt() now returns to its caller whether we interrupted userland or not. This avoids that we have to duplicate the check in assembly, where it's bound to fall off the scope. Now we simply check the return value and call ast() if appropriate. Run into this: davidxu	2003-09-05 22:50:10 +00:00
Marcel Moolenaar	f02e8e8122	Use pmap_steal_memory() for the msgbuf instead of trying to squeeze it in the last chunk (phys_avail block). The last chunk very often is not larger than one or two pages, resulting in a msgbuf that's too small to hold a complete verbose boot. Note that pmap_steal_memory() will bzero the memory it "allocates". Consequently, ia64 will never preserve previous msgbufs. This is not a noticeable difference in practice. If the msgbuf could be reused, it was invariably too small to have anything preserved anyway.	2003-09-01 07:06:57 +00:00
Marcel Moolenaar	47f756866a	Use direct mapped KVA for the sf_buf allocator, as made possible by the previous commit. While here, fix a typo, reformat comments and fix a long line. Tested with: ftpd	2003-09-01 00:12:27 +00:00
Alan Cox	411d10a600	Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy sockets into machine-dependent files. The rationale for this migration is illustrated by the modified amd64 allocator. It uses the amd64's direct map to avoid ephemeral mappings in the kernel's address space. On an SMP, the ephemeral mappings result in an IPI for TLB shootdown for each transmitted page. Yuck. Maintainers of other 64-bit platforms with direct maps should be able to use the amd64 allocator as a reference implementation.	2003-08-29 20:04:10 +00:00
Nate Lawson	5a4d072c93	Minor style cleanups.	2003-08-28 16:30:31 +00:00
Marcel Moolenaar	d0adfaea93	Change LOG2_PAGE_SIZE from 14 to 15 bits. This will cause the CTASSERT in vm_page.h to be reached and thus slightly increases the overall coverage of LINT on ia64.	2003-08-25 20:02:18 +00:00
Marcel Moolenaar	5b6a41bddf	Add the bits for a LINT kernel. It has been verified to compile. We may need to polish this.	2003-08-23 21:47:33 +00:00
Marcel Moolenaar	9539d5b4f6	Remove PAGE_SIZE_4K, PAGE_SIZE_8K and PAGE_SIZE_16K and replace them with LOG2_PAGE_SIZE. A single option is better to LINT than multiple mutually exclusive ones.	2003-08-23 03:39:55 +00:00
Marcel Moolenaar	ca668eda45	Remove unused inclusion of opt_acpi.h	2003-08-23 00:07:52 +00:00
John Baldwin	e7411b9d71	Regen.	2003-08-21 14:16:41 +00:00
John Baldwin	daf54a1e05	Swap sigaction/sigreturn since they are in the wrong order. Noticed indirectly by: peter	2003-08-21 14:16:00 +00:00
Marcel Moolenaar	4a98d8b095	Undo the mistake made in revision 1.77 of trap.c and which was the ultimate trigger for the follow-up fixes in revisions 1.78, 1.80, 1.81 and 1.82 of trap.c. I was simply too pre-occupied with the gateway page and how it blurs kernel space with user space and vice versa that I couldn't see that it was all a load of bollocks. It's not the IP address that matters, it's the privilege level that counts. We never run in user space with lifted permissions and we sure can not run in kernel space without it. Sure, the gateway page is the exception, but not if you look at the privilege level. It's user space if you run with user permissions and kernel space otherwise. So, we're back to looking at the privilege level like it should be. There's no other way. Pointy hat: marcel	2003-08-20 05:30:35 +00:00
Gordon Tetlow	df3d69c217	Fixup the ELF branding information to point to the new home of rtld.	2003-08-17 08:08:38 +00:00
Marcel Moolenaar	710338e94f	In vm_thread_swap{in|out}(), remove the alpha specific conditional compilation and replace it with a call to cpu_thread_swap{in|out}(). This allows us to add similar code on ia64 without cluttering the code even more.	2003-08-16 23:15:15 +00:00
Marcel Moolenaar	26502503e5	Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to cpu.h. This affects db_command.c and kern_shutdown.c. ia64: move all MD prototypes from cpu.h to md_var.h. This affects madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory(). It's not used (vm_machdep.c). alpha: the MD prototypes have been left in cpu.h with a comment that they should be there. Moving them is left for later. It was expected that the impact would be significant enough to be done in a separate commit. powerpc: MD prototypes left in cpu.h. Comment added. Suggested by: bde Tested with: make universe (pc98 incomplete)	2003-08-16 16:57:57 +00:00
Marcel Moolenaar	c6d402d3f2	Fix a range check bug. Don't left-shift the integer argument 'data'. Sign extension happens after the shift, not before so that boundary cases like 0x40000000 will not be caught properly. Instead, right shift ndirty. It is guaranteed to be a multiple of 8. While here, do some manual code motion and code commoning. Range check bug pointed out by: iedowse	2003-08-16 01:49:38 +00:00
Marcel Moolenaar	1fdb0ba9bb	Fix the generation of coredumps. We did not take the dirty registers that were on the kernel stack into account. For now we write them out to the register stack of the process before creating the dump. This however is not the final solution. The problem is that we may invalidate the coredump by overwriting vital information due to an invalid backing store pointer. Instead we need to write the dirty registers to an unused region of VM which will result in a separate segment in the coredump. For now we can at least get to all the registers from a coredump.	2003-08-15 05:52:48 +00:00
Marcel Moolenaar	b00555136c	Add an instruction group break after the move to application register and the move to control register to avoid dependency violations when these functions are used. Note that explicit data and instruction serialization also need to be in a subsequent instruction group. This too requires that we have an igrp break here.	2003-08-15 05:46:33 +00:00
Marcel Moolenaar	60518ee41c	Introduce two machine specific ptrace(2) requests: PT_GETKSTACK and PT_SETKSTACK. These requests allow the tracing process to access the dirty registers of the traced process that are on the kernel stack. Note that there's currently no way to access the rnat register for those dirty registers that are not (yet) covered by a nat collection point. The interface for this is still being slept on. Also note that implied by these requests is the division of work: The tracing process has to keep track of where registers are spilled and is responsible for figuring out where the NaT bits of the stacked registers are at any time during the execution of the traced process. The kernel provides the interfaces but will not abstract the fact that the register stack can be split. This model does not follow the approach taken in Linux where PT_PEEK and PT_POKE deal with this automagically.	2003-08-15 05:40:59 +00:00
Marcel Moolenaar	6e1f209af1	Don't use VM_MIN_KERNEL_ADDRESS to check if the faulting address is in user space or kernel space. VM_MIN_KERNEL_ADDRESS starts after the gateway page, which means that improper memory accesses to the gateway page while in user mode would panic the kernel. Use VM_MAX_ADDRESS instead. It ends before the gateway page. The difference between VM_MIN_KERNEL_ADDRESS and VM_MAX_ADDRESS is exactly the gateway page.	2003-08-13 03:20:10 +00:00
Marcel Moolenaar	dfcba5aae3	Put an instruction group break between the move to ar.rnat and the move to ar.rsc. The RSE must be in enforced lazy mode when writing to RSE modifiable registers. In this case we restore the RSE NaT collection register ar.rnat. I have seen 2 general exception faults on pluto1 now that indicate that the move to ar.rsc has already happened prior to the move to ar.rnat, meaning that the RSE is not in enforced lazy mode anymore. The ia64 dependency and instruction ordering rules seem to allow having both registers written to in the same instruction group, provided ar.rsc is written to later than ar.rnat (based on the ordering semantics). It appears that we may be pushing our luck. For now, put them in separate cycles (by means of the instruction group break). If we ever get a general exception fault on the move to ar.rnat again, we have definite proof that something else is fishy.	2003-08-13 02:49:50 +00:00
Warner Losh	06b4bf3e55	Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's copyrighted files. Approved by: Matt Dillon	2003-08-12 23:24:05 +00:00
Marcel Moolenaar	75cf31a016	Extend identifycpu(): o Differentiate between CPU family and CPU model. There are multiple Itanium 2 models and it's nice to differentiate between them. o Separately export the CPU family and CPU model with sysctl. o Merced is the only model in the Itanium family. o Add Madison to the Itanium 2 family. We already knew about McKinley. o Print the CPU family between parentheses, like we do with the i386 CPU class. My prototype now identifies itself as: CPU: Merced (800.03-Mhz Itanium) pluto1 and pluto2 will eventually identify themselves as: CPU: McKinley (900.00-Mhz Itanium 2)	2003-08-12 08:10:16 +00:00
Marcel Moolenaar	e57196b3db	Cleanup prototypes in cpu.h, including fswintrberr and any references to it. Sort the remaining prototypes in cpu.h. No functional change.	2003-08-12 03:51:53 +00:00
Marcel Moolenaar	322d6e0236	Cleanup and style(9) fixes. No functional change.	2003-08-11 21:25:19 +00:00
Marcel Moolenaar	425963bb80	o move cpu_reset() from vm_machdep.c to machdep.c. o reorder cpu_boot(), cpu_halt() and identifycpu(). No functional change.	2003-08-10 21:33:07 +00:00
Marcel Moolenaar	29952636d3	Now that we can ignore up to 8KB of dirty registers, remove the RSE magic from exec_setregs(). In set_mcontext() we now also don't have to worry that we entered the kernel with more than 512 bytes of dirty registers on the kernel stack. Note that we cannot make any assumptions anymore WRT NaT collection points in exec_setregs(), so we have to deal with them now.	2003-08-10 08:04:21 +00:00
Marcel Moolenaar	f8e1f6d036	MFi386 1.422 & 1.423: lock page queues in pmap_insert_entry().	2003-08-08 00:30:26 +00:00
John Baldwin	8b149b5131	Consistently use the BSD u_int and u_short instead of the SYSV uint and ushort. In most of these files, there was a mixture of both styles and this change just makes them self-consistent. Requested by: bde (kern_ktrace.c)	2003-08-07 15:04:27 +00:00
Marcel Moolenaar	1634f50b1b	Better define the flags in the mcontext_t and properly set the flags when we create contexts. The meaning of the flags is documented in <machine/ucontext.h>. I only list them here to help browsing the commit logs: _MC_FLAGS_ASYNC_CONTEXT _MC_FLAGS_HIGHFP_VALID _MC_FLAGS_KSE_SET_MBOX _MC_FLAGS_RETURN_VALID _MC_FLAGS_SCRATCH_VALID Yes, _MC_FLAGS_KSE_SET_MBOX is a hack and I'm proud of it :-)	2003-08-07 07:52:39 +00:00
Marcel Moolenaar	a50bc30203	o Fix cut-n-paste whitespace corruption in previous commit o For trap-based upcalls the argument (the kse_mailbox) to the UTS must be written onto the kernel stack, not the user stack. While here, deal with the fact that we may be at a NaT collection point.	2003-08-07 07:40:19 +00:00
Marcel Moolenaar	bee4e73025	In cpu_set_upcall_kse(), create the upcall according to the entry path into the kernel. Normally it's due to a syscall, but one can also be created as the result of a clock interrupt (for example). This now even more looks like exec_setregs(). While here, add an assert that we don't expect more than 8KB of dirty registers on the kernel stack.	2003-08-06 23:28:19 +00:00
Marcel Moolenaar	5f20d75a5f	o In revision 1.45 of exception.S we changed exception_restore to unconditionally restore ar.k7 (kernel memory stack) and ar.k6 (kernel register stack). I don't know what I was smoking then, but if you unconditionally restore ar.k6, you also want to compute its value unconditionally. By having the computation predicated and dependent on whether we return to user mode, we would end up writing junk (= invalid value for ar.bspstore) if we would return to kernel mode. But the whole point of the unconditional restoration was that there is a grey area where we still need to have ar.k6 restored. If we restore with a junk value, we would end up wedging the machine on the next interrupt. So, unconditionally calculate the value we unconditionally write to ar.k6. o The previous braino was found while making the following change: We used to clear the lower 9 bits of the value we write to ar.k6. The meaning being that we know that the kernel register stack is at least 512 byte aligned and simply clearing the lower 9 bits allows us to return to a context of which we don't have dirty registers on the kernel stack, even though the context that entered the kernel does have dirty registers on the kernel stack. By masking-off the lower bits, we correctly obtain the base of the register stack without having to worry that we didn't actually reach the base while unwinding it. The change is to mask off the lower 13 bits, knowing that the kernel register stack is always 8KB aligned. The advantage is that we don't have to worry anymore if there's more than 512 bytes of dirty registers on the kernel stack. A situation that frequently occurs. In exec_setregs() in machdep.c:1.147 or older, we had to deal with that situation by copying the active portion of the register stack down in multiples of 512 bytes. Now that we mask off the lower 13 bits we don't have to do that at all. 
Contemporary IPF processors have a register file that can hold up to 96 stacked registers (=784 bytes [incl. 2 NaT collections]). With no indication that register files grow beyond a couple of hundred registers, we should not have to worry about it anymore... and yes, 640KB is enough for everybody :-) This change helps setcontext(2) and cpu_set_upcall_kse() in that they can return to completely different contexts without having to mess with the kernel stack. Of course exec_setregs() doesn't need to do that anymore as well.	2003-08-06 21:32:38 +00:00
Marcel Moolenaar	7f36189f8a	o Put the syscall return registers in the context. Not only do we need this for swapcontext(), KSE upcalls initiated from ast() also need to save them so that we properly return the syscall results after having had a context switch. Note that we don't use r11 in the kernel. However, the runtime specification has defined r8-r11 as return registers, so we put r11 in the context as well. I think deischen@ was trying to tell me that we should save the return registers before. I just wasn't ready for it :-) o The EPC syscall code has 2 return registers and 2 frame markers to save. The first (rp/pfs) belongs to the syscall stub itself. The second (iip/cfm) belongs to the caller of the syscall stub. We want to put the second in the context (note that iip and cfm relate to interrupts. They are only being misused by the syscall code, but are not part of a regular context). This way, when the context is switched to again, we return to the caller of setcontext(2) as one would expect. o Deal with dirty registers on the kernel stack. The getcontext() syscall will flush the RSE, so we don't expect any dirty registers in that case. However, in thread_userret() we also need to save the context in certain cases. When that happens, we are sure that there are dirty registers on the kernel stack. This implementation simply copies the registers, one at a time, from the kernel stack to the user stack. NAT collections are not dealt with. Hence we don't preserve NaT bits. A better solution needs to be found at some later time. We also don't deal with this in all cases in set_mcontext. No temporary solution is implemented because it's not a showstopper. The problem is that we need to ignore the dirty registers and we automatically do that for at most 62 registers. When there are more than 62 dirty registers we have a memory "leak". This commit is fundamental for KSE support.	2003-08-05 18:52:02 +00:00
Marcel Moolenaar	02cc6a6f35	Fix logic bug in the previous commit. Any region less than 5 is a user space region. Hence, we need to test if 5 is greater than the region; not greater equal. This bug caused us to call ast() while interrupting kernel mode.	2003-08-04 22:00:48 +00:00
John Baldwin	3bdbd658f1	- Since td_critnest is now initialized in MI code, it doesn't have to be set in cpu_critical_fork_exit() anymore. - As far as I can tell, cpu_thread_link() has never been used, not even when it was originally added, so remove it.	2003-08-04 20:32:45 +00:00
Marcel Moolenaar	46e31b2612	Cleanup the clock code. This includes: o Remove alpha specific timer code (mc146818A) and compiled-out calibration of said timer. o Remove i386 inherited timer code (i8253) and related acquire and release functions. o Move sysbeep() from clock.c to machdep.c and have it return ENODEV. Console beeps should be implemented using ACPI or if no such device is described, using the sound driver. o Move the sysctls related to adjkerntz, disable_rtc_set and wall_cmos_clock from machdep.c to clock.c, where the variables are. o Don't hardcode a hz value of 1024 in cpu_initclocks() and don't bother faking a stathz that's 1/8 of that. Keep it simple: hz defaults to HZ and stathz equals hz. This is also how it's done for sparc64. o Keep a per-CPU ITC counter (pc_clock) and adjustment (pc_clockadj) to calculate ITC skew and corrections. On average, we adjust the ITC match register once every ~1500 interrupts for a duration of 2 consecutive interrupts. This is to correct the non-deterministic behaviour of the ITC interrupt (there's a delay between the match and the raising of the interrupt). o Add 4 debugging sysctls to monitor clock behaviour. Those are debug.clock_adjust_edges, debug.clock_adjust_excess, debug.clock_adjust_lost and debug.clock_adjust_ticks. The first counts the individual adjustment cycles (when the skew first crosses the threshold), the second counts the number of times the adjustment was excessive (any non-zero value is to be considered a bug), the third counts lost clock interrupts and the last counts the number of interrupts for which we applied an adjustment (debug.clock_adjust_ticks / debug.clock_adjust_edges gives the average duration of an individual adjustment -- should be ~2). While here, remove some nearby (trivial) left-overs from alpha and other cleanups.	2003-08-04 05:13:18 +00:00


1853 Commits