freebsd-skq

Author	SHA1	Message	Date
nwhitehorn	d34514657e	Fix an XXX comment by answering 'no'. OS X does not set the day-of-week counter on SMU-based systems, which causes FreeBSD to reject the RTC time when used in a dual-boot environment. Since we don't use the day-of-week counter anyway, solve this by just not checking that it matches. MFC after: 3 weeks	2010-10-17 17:31:49 +00:00
davidxu	c8ed8cb6af	- Insert thread0 into correct thread hash link list. - In thr_exit() and kthread_exit(), only remove thread from hash if it can directly exit, otherwise let exit1() do it. - In thread_suspend_check(), fix cleanup code when thread needs to exit. This change seems to have fixed the "Bad link elm" panic found by Peter Holm. Stress testing: pho	2010-10-17 11:01:52 +00:00
kib	61f7905664	Provide vfs.ncsizefactor instead of hard-coding namecache ratio. Move debug.ncnegfactor to vfs.ncnegfactor [1]. Provide some descriptions for the namecache related sysctls [1]. Based on the submission by: Rogier R. Mulhuijzen <drwilco drwilco net> [1] MFC after: 2 weeks X-MFC-note: remove debug.ncnegfactor in HEAD after MFC	2010-10-16 09:44:31 +00:00
davidxu	ae4fb003c8	In kern_sigtimedwait(), move initialization code out of process lock, instead of using SIGISMEMBER to test every interesting signal, just unmask the signal set and let cursig() return one, get the signal after it returns, call reschedule_signal() after signals are blocked again. In kern_sigprocmask(), don't call reschedule_signal() when it is unnecessary. In reschedule_signal(), replace SIGISEMPTY() + SIGISMEMBER() with sig_ffs(), rename variable 'i' to sig.	2010-10-14 08:01:33 +00:00
mdf	256615c9b3	Use a safer mechanism for determining if a task is currently running, that does not rely on the lifetime of pointers being the same. This also restores the task KBI. Suggested by: jhb MFC after: 1 month	2010-10-13 22:59:04 +00:00
davidxu	666f83ad9c	sigqueue_collect_set() is no longer needed because other functions maintain pending set correctly.	2010-10-13 06:28:40 +00:00
mdf	58b7823599	Re-expose and briefly document taskqueue_run(9). The function is used in at least one 3rd party driver. Requested by: jhb	2010-10-12 18:36:03 +00:00
avg	55173efe7f	generic_stop_cpus: prevent parallel execution This is based on the same approach as used in panic(). In theory parallel execution of generic_stop_cpus() could lead to two CPUs stopping each other and everyone else, and thus a total system halt. Also, in theory, we should have some smarter locking here, because two (or more CPUs) could be stopping unrelated sets of CPUs. But in practice, it seems, this function is only used to stop "all other" CPUs. Additionally, I took this opportunity to make amd64-specific suspend_cpus() function use generic_stop_cpus() instead of rolling out essentially duplicate code. This code is based on code by Sandvine Incorporated. Suggested by: mdf Reviewed by: jhb, jkim (earlier version) MFC after: 2 weeks	2010-10-12 17:40:45 +00:00
davidxu	47dfb514f5	Add a flag TDF_TIDHASH to prevent a thread from being added to or removed from thread hash table multiple times.	2010-10-12 00:36:56 +00:00
kib	4036cd070d	The r184588 changed the layout of struct export_args, causing an ABI breakage for old mount(2) syscall, since most struct <filesystem>_args embed export_args. The mount(2) is supposed to provide ABI compatibility for pre-nmount mount(8) binaries, so restore ABI to pre-r184588. Requested and reviewed by: bde MFC after: 2 weeks	2010-10-10 07:05:47 +00:00
avg	2e73196837	add kmem_map_free sysctl: query largest contiguous free range in kmem_map Suggested by: alc Reviewed by: alc MFC after: 1 week	2010-10-09 09:03:17 +00:00
avg	dca49a4289	panic_cpu variable should be volatile This is to prevent caching of its value in a register when it is checked and modified by multiple CPUs in parallel. Also, move the variable into the scope of the only function that uses it. Reviewed by: jhb Hint from: mdf MFC after: 1 week	2010-10-09 08:07:49 +00:00
davidxu	55194e796c	Create a global thread hash table to speed up thread lookup, use rwlock to protect the table. In old code, thread lookup is done with process lock held, to find a thread, kernel has to iterate through process and thread list, this is quite inefficient. With this change, test shows in extreme case performance is dramatically improved. Earlier patch was reviewed by: jhb, julian	2010-10-09 02:50:23 +00:00
emaste	a3f6608533	Make a thread's address available via the kern proc sysctl, just like the process address. Add "tdaddr" keyword to ps(1) to display this thread address. Distilled from Sandvine's patch set by Mark Johnston.	2010-10-08 00:44:53 +00:00
avg	7010764d95	vm.kmem_map_size: a sysctl to query current kmem_map->size Based on a patch from Sandvine Incorporated via emaste. Reviewed by: emaste MFC after: 1 week	2010-10-07 18:11:33 +00:00
jh	d93ad5245d	Check the device name validity on device registration. A new function prep_devname() sanitizes a device name by removing leading and redundant sequential slashes. The function returns an error for names which already exist or are considered invalid. A new flag MAKEDEV_CHECKNAME for make_dev_p(9) and make_dev_credf(9) indicates that the caller is prepared to handle an error related to the device name. An invalid name triggers a panic if the flag is not specified. Document the MAKEDEV_CHECKNAME flag in the make_dev(9) manual page. Idea from: kib Reviewed by: kib	2010-10-07 18:00:55 +00:00
imp	dd58e02521	Adjust the all target message (but maybe all: sysent is better?)	2010-10-02 22:12:41 +00:00
imp	1c2f641b98	Turns out this file was how we make sysent stuff, so add that part only back...	2010-10-02 21:35:33 +00:00
marcel	ff3fbc640d	Split the root mount logic from the (generic) mount code and move it (the root mount code) into a new file called vfs_mountroot.c The split is almost trivial, as the code is almost perfectly non-intertwined. The only adjustment needed was to move the UMA zone allocation out of vfs_mountroot() [in vfs_mountroot.c] and into vfs_mount.c, where it had to be done as a SYSINIT [see vfs_mount_init()]. There are no functional changes with this commit.	2010-10-02 19:44:13 +00:00
kib	0b7460fc16	Release the vnode lock and close the linker file vnode earlier in the linker_load_file methods. The change is that the consequent linker_file_unload() call is not under the vnode lock anymore. This prevents the LOR between kernel linker sx xlock and vnode lock, because linker_file_unload() relocks kernel linker lock. MFC after: 2 weeks	2010-10-02 16:04:50 +00:00
avg	c2519e339d	sysctls in kern_shutdown: add twin tunables also make couple of sysctl-controlled variables static Reviewed by: rwatson MFC after: 1 week	2010-10-01 09:34:41 +00:00
avg	eca696eeba	there must be only one SYSINIT with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY order SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY should only be used to call scheduler() function which turns the initial thread into swapper proper and thus there is no further SYSINIT processing. Other SYSINITs with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY may get ordered after scheduler() and thus never executed. That particular relative order is semi-arbitrary. Thus, change such places to use SI_ORDER_MIDDLE. Also, use SI_ORDER_MIDDLE instead of correct, but less appealing, SI_ORDER_ANY - 1. MFC after: 1 week	2010-09-30 17:05:23 +00:00
avg	1f20beb47f	debug.kdb.stop_cpus sysctl: hint that this is also a tunable MFC after: 1 week	2010-09-30 16:47:01 +00:00
avg	54a87db1fb	kmem_size* sysctls: hint that these are also tunables MFC after: 1 week	2010-09-30 16:45:27 +00:00
davidxu	6580ce86ea	- kern_sched_rr_get_interval should return interval for thread 1 in target process. - eliminate a goto. MFC after: 1 week	2010-09-29 07:31:05 +00:00
imp	4087eace5d	This file has been unused for ages. Retire it. Submitted by: pluknet	2010-09-28 15:33:30 +00:00
emaste	2d28788fe8	Remove extra braces for style(9) (found while cleaning up an old work tree).	2010-09-28 01:36:01 +00:00
avg	9864877541	kdb_backtrace: use stack_print_ddb instead of stack_print This is a followup to r212964. stack_print call chain obtains linker sx lock and thus potentially may lead to a deadlock depending on a kind of a panic. stack_print_ddb doesn't acquire any locks and it doesn't use any facilities of ddb backend. Using stack_print_ddb outside of DDB ifdef required taking a number of helper functions from under it as well. It is a good idea to rename linker_ddb_* and stack_*_ddb functions to have 'unlocked' component in their name instead of 'ddb', because those functions do not use any DDB services, but instead they provide unlocked access to linker symbol information. The latter was previously needed only for DDB, hence the 'ddb' name component. Alternative is to ditch unlocked versions altogether after implementing proper panic handling: 1. stop other cpus upon a panic 2. make all non-spinlock lock operations (mutex, sx, rwlock) be a no-op when panicstr != NULL Suggested by: mdf Discussed with: attilio MFC after: 2 weeks	2010-09-22 06:45:07 +00:00
mav	351da3e73c	If kernel built with DEVICE_POLLING, keep one CPU always in active state to handle it.	2010-09-22 05:32:37 +00:00
jhb	e350ad7930	Comment nit, set TDF_NEEDRESCHED after the comment describing why it is done rather than before. MFC after: 1 week	2010-09-21 19:12:22 +00:00
mav	10e1b075c5	If new callout scheduled to another CPU and we are using global timer, there is high probability that timer is already programmed by some other CPU. Especially by one that registered this callout, and so active now.	2010-09-21 17:37:28 +00:00
mav	e7b0e3848a	Remember last kern.eventtimer.periodic value, explicitly set by user. If timer capabilities force us to change periodicity mode, try to restore it later, as soon as the newly chosen timer is capable of it. Without this, timer change like HPET->RTC->HPET always results in enabling periodic mode.	2010-09-21 16:50:24 +00:00
alc	524cb00f17	Fix exec_imgact_shell()'s handling of two error cases: (1) Previously, if the first line of a script exceeded MAXSHELLCMDLEN characters, then exec_imgact_shell() silently truncated the line and passed on the truncated interpreter name or argument. Now, exec_imgact_shell() will fail and return ENOEXEC, which is the commonly used errno among Unix variants for this type of error. (2) Previously, exec_imgact_shell()'s check on the length of the interpreter's name was ineffective. In other words, exec_imgact_shell() could not possibly fail and return ENAMETOOLONG. The reason being that the length of the interpreter name had to exceed MAXSHELLCMDLEN characters in order that ENAMETOOLONG be returned. But, the search for the end of the interpreter name stops after at most MAXSHELLCMDLEN - 2 characters are scanned. (In the end, this particular error is eventually discovered outside of exec_imgact_shell() and ENAMETOOLONG is returned. So, the real effect of this second change is that the error is detected earlier, in exec_imgact_shell().) Update the definition of MAXINTERP to the actual limit on the size of the interpreter name that has been in effect since r142453 (from 2005). In collaboration with: kib	2010-09-21 16:24:51 +00:00
avg	fe208ba095	kdb_backtrace: stack(9)-based code to print backtrace without any backend The idea is to add KDB and KDB_TRACE options to GENERIC kernels on stable branches, so that at least the minimal information is produced for non-specific panics like traps on page faults. The GENERICs in stable branches seem to already include STACK option. Reviewed by: attilio MFC after: 2 weeks	2010-09-21 15:07:44 +00:00
mav	16369ea8b2	Until hardclock() and respectively tc_windup() called first time, system is running on "dummy" time counter. But to function properly in one-shot mode, event timer management code requires working time counter. Slow moving "dummy" time counter delays first hardclock() call by few seconds on my systems, even though timer interrupts were correctly kicking kernel. That causes few seconds delay during boot with one-shot mode enabled. To break this loop, explicitly call tc_windup() first time during initialization process to let it switch to some real time counter.	2010-09-21 08:02:02 +00:00
trasz	3e2d23f909	First step at adopting FreeBSD to support PSARC/2010/029. This makes acl_is_trivial_np(3) properly recognize the new trivial ACLs. From the user point of view, that means "ls -l" no longer shows plus signs for all the files when running ZFS v28.	2010-09-20 17:10:06 +00:00
ed	a67dfa17fa	Just make callout devices and /dev/console force CLOCAL on open(). Instead of adding custom checks to wait for DCD on open(), just modify the termios structure to set CLOCAL. This means SIGHUP is no longer generated when losing DCD as well. Reviewed by: kib@ MFC after: 1 week	2010-09-19 16:35:42 +00:00
ed	99ba5ac113	Ignore DCD handling on /dev/console entirely. This makes /dev/console more fail-safe and prevents a potential console lock-up during boot. Discussed on: stable@ Tested by: koitsu@ MFC after: 1 week	2010-09-19 14:21:39 +00:00
rwatson	b9d3291981	With reworking of the socket life cycle in 7.x, the need for a "sotryfree()" was eliminated: all references to sockets are explicitly managed by sorele() and the protocols. As such, garbage collect sotryfree(), and update sofree() comments to make the new world order more clear. MFC after: 3 days Reported by: Anuranjan Shukla <anshukla at juniper dot net>	2010-09-18 11:18:42 +00:00
avg	6fb9e57674	kern.sched.topology_spec sysctl: use step of 1 for group levels numeration This is just a cosmetic change for prettier output. 'indent' variable/parameter serves two purposes: it specifies whitespace indentation level and also implies cpu group level/depth. It would have been better to split those two uses, but for now just a simple change. MFC after: 1 week	2010-09-18 11:16:43 +00:00
mav	a168297469	When global timer used at SMP system, update nextevent field on BSP before sending IPI to other CPUs. Otherwise, other CPUs will try to honor stale value, programming timer for zero interval. If timer is fast enough, it caused extra interrupt before timer correctly reprogrammed by BSP.	2010-09-18 07:18:30 +00:00
imp	1ffadf8af3	By popular demand, kill all the non GIANT related interrupt messages. They are confusing and add little value. Reviewed by: jhb@	2010-09-17 16:05:25 +00:00
mdf	5695ef4698	Re-add r212370 now that the LOR in powerpc64 has been resolved: Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough SBUF_FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk (original patch)	2010-09-16 16:13:12 +00:00
mav	6eed5acb73	Fix panic on NULL dereference possible after r212541.	2010-09-14 10:26:49 +00:00
mav	6c05aa4db6	Make kern_tc.c provide minimum frequency of tc_ticktock() calls, required to handle current timecounter wraps. Make kern_clocksource.c honor that requirement, scheduling sleeps on first CPU for no more than the specified period. Allow other CPUs to sleep up to 1/4 second (for any case).	2010-09-14 08:48:06 +00:00
mav	5864d6e457	Replace spin lock with the set of atomics. It is impractical for one tc_ticktock() call to wait for another's completion -- just skip it.	2010-09-14 04:57:30 +00:00
mav	5f7bd119f7	Add some foot shooting protection by checking singlemul value correctness. Rephrase sysctls descriptions. Suggested by: edmaste	2010-09-14 04:48:04 +00:00
mdf	3ed6eac561	Revert r212370, as it causes a LOR on powerpc. powerpc does a few unexpected things in copyout(9) and so wiring the user buffer is not sufficient to perform a copyout(9) while holding a random mutex. Requested by: nwhitehorn	2010-09-13 18:48:23 +00:00
avg	ab04d6fe3f	bus_add_child: add specialized default implementation that calls panic If a kobj method doesn't have any explicitly provided default implementation, then it is auto-assigned kobj_error_method. kobj_error_method is proper only for methods that return error code, because it just returns ENXIO. So, in the case of unimplemented bus_add_child caller would get (device_t)ENXIO as a return value, which would cause the mistake to go unnoticed, because return value is typically checked for NULL. Thus, a specialized null_add_child is added. It would have sufficed for correctness to return NULL, but this type of mistake was deemed to be rare and serious enough to call panic instead. Watch out for this kind of problem with other kobj methods. Suggested by: jhb, imp MFC after: 2 weeks	2010-09-13 08:34:20 +00:00
mav	eb4931dc6c	Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generating at full rate of hz + stathz to fulfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle. There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends of chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how much times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they busy or not. By default this option is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generating. As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instructions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads. Tested by: many (on i386, amd64, sparc64 and powerpc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.	2010-09-13 07:25:35 +00:00

11896 Commits