11924 Commits

Author SHA1 Message Date
mdf
a53a6b6520 Whitespace and other aspects of style(9). No functional changes.
MFC after:  3 days
2010-11-08 20:57:08 +00:00
mdf
ccbc087adc Add a taskqueue_cancel(9) to cancel a pending task without waiting for
it to run as taskqueue_drain(9) does.

Requested by:	hselasky
Original code:	jeff
Reviewed by:	jhb
MFC after:	2 weeks
2010-11-08 20:56:31 +00:00
mav
1828dd7478 On AP startup, skip hard-/statclock events whose time passed before the
CPU was launched. A burst of several seconds' worth of events, accumulated
during a long startup, was reported to cause a panic in the SCHED_ULE priority
calculation logic.
2010-11-08 15:25:12 +00:00
jh
5a3e494e92 Add missing curly brackets. By chance, the missing brackets didn't alter
the code behavior.

Submitted by:	Lucius Windschuh
2010-11-07 14:28:01 +00:00
jhb
177b41b47d Remove 'softclock_ih' as it is no longer used. 2010-11-03 15:38:52 +00:00
jhb
6aafe24a5f Tweak the waitchannel messages for the deadlock detection kthread. Use
a shorter message (userland generally only sees the first 6 to 8
characters) when waiting for the allproc lock.  Use "-" when idle to match
the behavior of other kthreads.

Reviewed by:	attilio
MFC after:	1 week
2010-11-02 18:34:31 +00:00
davidxu
4c899bcdf5 Use an integer for the size of cpuset, as it won't be bigger than INT_MAX.
This was requested by bge.
Also move the sysctl into kern_cpuset.c, because it should
always be there; it is independent of the thread scheduler.
2010-11-01 00:42:25 +00:00
mav
f1bdc58681 Fix callout_tickstofirst() behavior after signed integer ticks overflow.
This should fix callout precision drop to 1/4s after 25 days of uptime
with HZ = 1000.

Submitted by:	Taku YAMAMOTO <taku@tackymt.homeip.net>
2010-10-31 11:44:41 +00:00
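The 25-day figure in the entry above follows from a signed 32-bit tick counter running at HZ = 1000. The following is a minimal sketch of that arithmetic and of a wraparound-safe tick difference. It assumes a 32-bit counter, the helper names are hypothetical, and it is not the code that was changed in callout_tickstofirst().

```c
#include <stdint.h>

/*
 * Hypothetical helper: days of uptime before a signed 32-bit tick
 * counter, incremented hz times per second, overflows.
 */
static double
days_until_tick_overflow(int hz)
{
	return ((double)INT32_MAX / hz / 86400.0);
}

/*
 * Hypothetical helper: ticks remaining until 'deadline', computed so
 * that the result stays correct when the counter wraps between 'now'
 * and 'deadline'.  The subtraction is done on unsigned values (which
 * wrap by definition) and the result is then read back as signed;
 * that last conversion is implementation-defined for large values but
 * behaves as two's complement on common compilers.
 */
static int32_t
ticks_until(int32_t deadline, int32_t now)
{
	return ((int32_t)((uint32_t)deadline - (uint32_t)now));
}
```

With hz = 1000 the first helper gives roughly 24.9 days. A plain `deadline < now` comparison would misjudge a deadline that falls just past the wrap point, whereas the unsigned subtraction still yields the small positive distance.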
kib
27d55f5f71 Remove sysctl debug.ncnegfactor, it is renamed to vfs.ncnegfactor.
MFC:	do not
2010-10-30 14:08:26 +00:00
trasz
81e925ec28 Fix uninitialized variable.
Found with:	Coverity Prevent(tm)
CID:		8632
2010-10-29 19:07:36 +00:00
davidxu
a5ea18413e Add sysctl kern.sched.cpusetsize to export the size of kernel cpuset,
also add sysconf() key _SC_CPUSET_SIZE to get sysctl value.

Submitted by: gcooper
2010-10-29 13:31:10 +00:00
jhb
3f3b4d105f Set bootverbose directly in mi_startup() rather than via a SYSINIT. This
ensures 'bootverbose' is in a valid state for all SYSINITs.

Reported by:	avg
MFC after:	1 week
2010-10-28 14:17:06 +00:00
davidxu
f8f25f57e2 - Revert r214409.
- Use long word to figure out sizeof kernel cpuset, hope it works.
2010-10-27 09:29:03 +00:00
davidxu
29be5dcd22 If the input parameter cpusetsize is zero, give userland the size of the
cpuset mask the kernel is using.
2010-10-27 02:32:54 +00:00
ivoras
d91b62acb0 Reduce the difference between hirunningspace and lorunningspace;
this should help interactivity in edge cases.
2010-10-25 14:05:25 +00:00
davidxu
bc55e49455 Use function tdfind() to find a thread. 2010-10-25 13:13:16 +00:00
brucec
41dcd4566c Mostly revert r203420, and add similar functionality into ada(4) since the
existing code caused problems with some SCSI controllers.

A new sysctl kern.cam.ada.spindown_shutdown has been added that controls
whether or not to spin-down disks when shutting down.
Spinning down the disks unloads/parks the heads - this is
much better than removing power when the disk is still
spinning because otherwise an Emergency Unload occurs which may cause damage
to the actuator.

PR:	kern/140752
Submitted by:   olli
Reviewed by:	arundel
Discussed with: mav
MFC after:	2 weeks
2010-10-24 16:31:57 +00:00
trasz
06405a0e7b Remove workaround for ZFS bug; fix was committed to the //depot/user/pjd/zfs/...
branch some time ago.

MFC after:	two weeks
2010-10-23 14:22:50 +00:00
davidxu
841633e7b6 In thr_exit() and kthread_exit(), only remove the thread from the
hash if it can directly exit; otherwise let exit1() do it.
The change should have been in r213950, but for an unknown reason
it was lost.
2010-10-23 13:16:39 +00:00
delphij
0b25084d26 Call chainevh callback when we are invoked with neither MOD_LOAD nor
MOD_UNLOAD.  This makes it possible to add custom hooks for other module
events.

Return EOPNOTSUPP when there is no callback available.

Pointed out by:	jhb
Reviewed by:	jhb
MFC after:	1 month
2010-10-21 20:31:50 +00:00
jhb
a41cfdca06 - When disabling ktracing on a process, free any pending requests that
may be left.  This fixes a memory leak that can occur when tracing is
  disabled on a process via disabling tracing of a specific file (or if
  an I/O error occurs with the tracefile) if the process's next system
  call is exit().  The trace disabling code clears p_traceflag, so exit1()
  doesn't do any KTRACE-related cleanup leading to the leak.  I chose to
  make the free'ing of pending records synchronous rather than patching
  exit1().
- Move KTRACE-specific logic out of kern_(exec|exit|fork).c and into
  kern_ktrace.c instead.  Make ktrace_mtx private to kern_ktrace.c as a
  result.

MFC after:	1 month
2010-10-21 19:17:40 +00:00
delphij
18ce919932 In syscall_module_handler(): all switch branches return, remove
unreached code as pointed out in a Chinese forum [1].

[1] http://www.freebsdchina.org/forum/viewtopic.php?t=50619

Pointed out by:		btw616 <btw s qq com>
MFC after:		1 month
2010-10-21 08:57:25 +00:00
davidxu
520d21ea78 - Don't include sx.h, it is not needed.
- Check NULL pointer, move timeout calculation code outside of
  process lock.
2010-10-20 00:41:38 +00:00
ae
68ea32374c A ZFS pool name is not a real device in devfs. Do not wait for
the device to appear when mounting root from ZFS.

Reviewed by:	marcel
Approved by:	mav (mentor)
2010-10-19 18:32:01 +00:00
emaste
ce87894035 We've already set p = td->td_proc, so use it. 2010-10-18 15:46:58 +00:00
marcel
d56bc2d35f Re-implement the root mount logic using a recursive approach, whereby each
root file system (starting with devfs and a synthesized configuration) can
contain directives for mounting another file system as root. The old root
file system is re-mounted under the new root file system (with /.mount or
/mnt as the mount point) to allow access to the underlying file system.

The configuration allows for creating vnode-backed memory disks that can
subsequently be mounted as root. This allows for an efficient and low-
cost way to distribute and boot FreeBSD software images that reside on
some storage media.

When trying a mount, the kernel will wait for the device in question to
arrive. The timeout is configurable and is part of the configuration.
This allows arbitrarily complex GEOM configurations to be constructed
on the fly.

A side-effect of this change is that all root specifications, whether
compiled into the kernel or typed at the prompt can contain root mount
options.
2010-10-18 05:01:53 +00:00
marcel
f093b8cccc In vfs_filteropt(), only print the errmsg when there's no errmsg
mount option. Otherwise errors tend to get printed multiple times.
2010-10-18 04:34:42 +00:00
marcel
ff2b095a39 Rename boot() to kern_reboot() and make it visible outside of
kern_shutdown.c. This makes it easier for emulators and other
parts of the kernel to initiate a reboot.
2010-10-18 04:30:27 +00:00
nwhitehorn
d34514657e Fix an XXX comment by answering 'no'. OS X does not set the day-of-week
counter on SMU-based systems, which causes FreeBSD to reject the RTC time
when used in a dual-boot environment. Since we don't use the day-of-week
counter anyway, solve this by just not checking that it matches.

MFC after:	3 weeks
2010-10-17 17:31:49 +00:00
davidxu
c8ed8cb6af - Insert thread0 into correct thread hash link list.
- In thr_exit() and kthread_exit(), only remove thread from
  hash if it can directly exit, otherwise let exit1() do it.
- In thread_suspend_check(), fix cleanup code when thread needs
  to exit.
This change seems to have fixed the "Bad link elm " panic found by
Peter Holm.

Stress testing: pho
2010-10-17 11:01:52 +00:00
kib
61f7905664 Provide vfs.ncsizefactor instead of hard-coding namecache ratio.
Move debug.ncnegfactor to vfs.ncnegfactor [1].
Provide some descriptions for the namecache related sysctls [1].

Based on the submission by:	Rogier R. Mulhuijzen <drwilco drwilco net> [1]
MFC after:	2 weeks
X-MFC-note:	remove debug.ncnegfactor in HEAD after MFC
2010-10-16 09:44:31 +00:00
davidxu
ae4fb003c8 In kern_sigtimedwait(), move the initialization code out of the process
lock. Instead of using SIGISMEMBER to test every interesting signal, just
unmask the signal set and let cursig() return one; get the signal after it
returns, then call reschedule_signal() after signals are blocked again.

In kern_sigprocmask(), don't call reschedule_signal() when it is
unnecessary.

In reschedule_signal(), replace SIGISEMPTY() + SIGISMEMBER() with
sig_ffs(), rename variable 'i' to sig.
2010-10-14 08:01:33 +00:00
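The entry above replaces an emptiness check plus per-signal membership tests with a single find-first-set scan. A minimal userland sketch of that idea follows. The `demo_` names and the four-word layout are assumptions made for illustration, not the kernel's sigset_t definition or its sig_ffs() implementation.

```c
#include <stdint.h>

#define DEMO_SIG_WORDS	4	/* assumed: 4 x 32 bits = 128 signals */

struct demo_sigset {
	uint32_t bits[DEMO_SIG_WORDS];
};

/*
 * Return the lowest signal number present in the set (signals are
 * numbered from 1), or 0 if the set is empty.  One scan answers both
 * "is the set empty?" and "which signal is first?".
 */
static int
demo_sig_ffs(const struct demo_sigset *set)
{
	int w, b;

	for (w = 0; w < DEMO_SIG_WORDS; w++) {
		if (set->bits[w] == 0)
			continue;	/* whole word empty, skip it */
		for (b = 0; b < 32; b++) {
			if (set->bits[w] & (UINT32_C(1) << b))
				return (w * 32 + b + 1);
		}
	}
	return (0);
}
```

A zero return doubles as the empty-set test, so a caller needs no separate check before asking for the first pending signal.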
mdf
256615c9b3 Use a safer mechanism for determining if a task is currently running,
that does not rely on the lifetime of pointers being the same. This also
restores the task KBI.

Suggested by:	jhb
MFC after:	1 month
2010-10-13 22:59:04 +00:00
davidxu
666f83ad9c sigqueue_collect_set() is no longer needed because other functions
maintain pending set correctly.
2010-10-13 06:28:40 +00:00
mdf
58b7823599 Re-expose and briefly document taskqueue_run(9). The function is used
in at least one 3rd party driver.

Requested by:	jhb
2010-10-12 18:36:03 +00:00
avg
55173efe7f generic_stop_cpus: prevent parallel execution
This is based on the same approach as used in panic().
In theory parallel execution of generic_stop_cpus()  could lead to two CPUs
stopping each other and everyone else, and thus a total system halt.
Also, in theory, we should have some smarter locking here, because two
(or more) CPUs could be stopping unrelated sets of CPUs.
But in practice, it seems, this function is only used to stop
"all other" CPUs.

Additionally, I took this opportunity to make amd64-specific suspend_cpus()
function use generic_stop_cpus() instead of rolling out essentially
duplicate code.

This code is based on code by Sandvine Incorporated.

Suggested by:	mdf
Reviewed by:	jhb, jkim (earlier version)
MFC after:	2 weeks
2010-10-12 17:40:45 +00:00
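The serialization described in the entry above can be pictured as a single "owner" word that a CPU must claim before it starts stopping others. This is a minimal sketch using C11 atomics in place of the kernel's own atomic primitives. The variable and function names are hypothetical, and the spin-wait a real caller would do on failure is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_NOCPU	(-1)	/* assumed sentinel: nobody is stopping CPUs */

/* ID of the CPU currently stopping others, or DEMO_NOCPU. */
static atomic_int demo_stopping_cpu = DEMO_NOCPU;

/*
 * Try to become the one CPU allowed to stop the others.  Only the
 * caller that swaps DEMO_NOCPU for its own ID succeeds; a concurrent
 * caller fails and would have to wait and retry.
 */
static bool
demo_try_begin_stop(int cpu)
{
	int expected = DEMO_NOCPU;

	return (atomic_compare_exchange_strong(&demo_stopping_cpu,
	    &expected, cpu));
}

/* Release ownership so another CPU may start a stop operation. */
static void
demo_end_stop(void)
{
	atomic_store(&demo_stopping_cpu, DEMO_NOCPU);
}
```

Because only one compare-and-swap can win, two CPUs can no longer stop each other at the same time, which is the failure mode the commit message describes.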
davidxu
47dfb514f5 Add a flag TDF_TIDHASH to prevent a thread from being
added to or removed from thread hash table multiple times.
2010-10-12 00:36:56 +00:00
kib
4036cd070d r184588 changed the layout of struct export_args, causing an ABI
breakage for the old mount(2) syscall, since most struct <filesystem>_args
embed export_args. mount(2) is supposed to provide ABI
compatibility for pre-nmount mount(8) binaries, so restore the ABI to
pre-r184588.

Requested and reviewed by:	bde
MFC after:    2 weeks
2010-10-10 07:05:47 +00:00
avg
2e73196837 add kmem_map_free sysctl: query largest contiguous free range in kmem_map
Suggested by:	alc
Reviewed by:	alc
MFC after:	1 week
2010-10-09 09:03:17 +00:00
avg
dca49a4289 panic_cpu variable should be volatile
This is to prevent caching of its value in a register when it is checked
and modified by multiple CPUs in parallel.
Also, move the variable  into the scope of the only function that uses it.

Reviewed by:	jhb
Hint from:	mdf
MFC after:	1 week
2010-10-09 08:07:49 +00:00
davidxu
55194e796c Create a global thread hash table to speed up thread lookup, and use
an rwlock to protect the table. In the old code, thread lookup was done with
the process lock held: to find a thread, the kernel had to iterate through
the process and thread lists, which is quite inefficient.
With this change, testing shows that performance is dramatically improved
in extreme cases.

Earlier patch was reviewed by: jhb, julian
2010-10-09 02:50:23 +00:00
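A minimal sketch of the lookup structure the entry above describes: a fixed array of buckets keyed by thread ID, with chaining for collisions. All names and the bucket count are hypothetical, and the rwlock that protects the real table is omitted to keep the sketch single-threaded.

```c
#include <stddef.h>

#define DEMO_TIDHASH_SIZE	64	/* assumed; must be a power of two */
#define DEMO_TIDHASH(tid)	((unsigned)(tid) & (DEMO_TIDHASH_SIZE - 1))

struct demo_thread {
	int			tid;
	struct demo_thread	*hash_next;	/* next entry in the same bucket */
};

static struct demo_thread *demo_tidhash[DEMO_TIDHASH_SIZE];

/* Insert at the head of the bucket chosen by the thread ID. */
static void
demo_tidhash_add(struct demo_thread *td)
{
	struct demo_thread **head = &demo_tidhash[DEMO_TIDHASH(td->tid)];

	td->hash_next = *head;
	*head = td;
}

/* Walk one bucket only, instead of every process and thread. */
static struct demo_thread *
demo_tdfind(int tid)
{
	struct demo_thread *td;

	for (td = demo_tidhash[DEMO_TIDHASH(tid)]; td != NULL;
	    td = td->hash_next) {
		if (td->tid == tid)
			return (td);
	}
	return (NULL);
}

/* Unlink a thread from its bucket; a no-op if it is not present. */
static void
demo_tidhash_remove(struct demo_thread *td)
{
	struct demo_thread **pp;

	for (pp = &demo_tidhash[DEMO_TIDHASH(td->tid)]; *pp != NULL;
	    pp = &(*pp)->hash_next) {
		if (*pp == td) {
			*pp = td->hash_next;
			td->hash_next = NULL;
			return;
		}
	}
}
```

Lookup cost depends on the length of one bucket chain rather than on the total number of processes and threads, which is where the speedup in the commit message would come from.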
emaste
a3f6608533 Make a thread's address available via the kern proc sysctl, just like the
process address.

Add "tdaddr" keyword to ps(1) to display this thread address.

Distilled from Sandvine's patch set by Mark Johnston.
2010-10-08 00:44:53 +00:00
avg
7010764d95 vm.kmem_map_size: a sysctl to query current kmem_map->size
Based on a patch from Sandvine Incorporated via emaste.

Reviewed by:	emaste
MFC after:	1 week
2010-10-07 18:11:33 +00:00
jh
d93ad5245d Check the device name validity on device registration.
A new function prep_devname() sanitizes a device name by removing
leading and redundant sequential slashes. The function returns an error
for names which already exist or are considered invalid.

A new flag MAKEDEV_CHECKNAME for make_dev_p(9) and make_dev_credf(9)
indicates that the caller is prepared to handle an error related to the
device name. An invalid name triggers a panic if the flag is not
specified.

Document the MAKEDEV_CHECKNAME flag in the make_dev(9) manual page.

Idea from:	kib
Reviewed by:	kib
2010-10-07 18:00:55 +00:00
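The sanitizing step described in the entry above (dropping leading slashes and collapsing runs of slashes) can be sketched in userland as follows. The function name is hypothetical, and treating an empty result as the only invalid case is an assumption. The real prep_devname() also rejects names that already exist and may apply further validity rules not shown here.

```c
#include <stddef.h>

/*
 * Copy 'in' to 'out' without leading slashes and with each run of
 * consecutive slashes reduced to one.  Returns 0 on success, or -1 if
 * the result is empty or does not fit in 'outlen' bytes.
 */
static int
demo_sanitize_devname(const char *in, char *out, size_t outlen)
{
	size_t n = 0;

	if (outlen == 0)
		return (-1);
	while (*in == '/')
		in++;			/* strip leading slashes */
	for (; *in != '\0'; in++) {
		if (*in == '/' && in[1] == '/')
			continue;	/* keep only the last slash of a run */
		if (n + 1 >= outlen)
			return (-1);	/* no room for char plus NUL */
		out[n++] = *in;
	}
	out[n] = '\0';
	return (n == 0 ? -1 : 0);
}
```

For example, the input "//foo///bar" comes out as "foo/bar", while an input consisting only of slashes is reported as an error.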
imp
dd58e02521 Adjust the all target message (but maybe all: sysent is better?) 2010-10-02 22:12:41 +00:00
imp
1c2f641b98 Turns out this file was how we make sysent stuff, so add that part only back... 2010-10-02 21:35:33 +00:00
marcel
ff3fbc640d Split the root mount logic from the (generic) mount code and move
it (the root mount code) into a new file called vfs_mountroot.c

The split is almost trivial, as the code is almost perfectly
non-intertwined. The only adjustment needed was to move the UMA
zone allocation out of vfs_mountroot() [in vfs_mountroot.c] and
into vfs_mount.c, where it had to be done as a SYSINIT [see
vfs_mount_init()].

There are no functional changes with this commit.
2010-10-02 19:44:13 +00:00
kib
0b7460fc16 Release the vnode lock and close the linker file vnode earlier in
the linker_load_file methods. The change is that the consequent
linker_file_unload() call is not under the vnode lock anymore.
This prevents the LOR between kernel linker sx xlock and vnode lock,
because linker_file_unload() relocks kernel linker lock.

MFC after:	2 weeks
2010-10-02 16:04:50 +00:00
avg
c2519e339d sysctls in kern_shutdown: add twin tunables
also make couple of sysctl-controlled variables static

Reviewed by:	rwatson
MFC after:	1 week
2010-10-01 09:34:41 +00:00
avg
eca696eeba there must be only one SYSINIT with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY order
SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY should only be used to call
scheduler() function which turns the initial thread into swapper proper
and thus there is no further SYSINIT processing.
Other SYSINITs with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY may get ordered
after scheduler() and thus never executed.  That particular relative
order is semi-arbitrary.

Thus, change such places to use SI_ORDER_MIDDLE.
Also, use SI_ORDER_MIDDLE instead of correct, but less appealing,
SI_ORDER_ANY - 1.

MFC after:	1 week
2010-09-30 17:05:23 +00:00
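The ordering hazard in the entry above can be shown with a small model. Entries are sorted by (subsystem, order) and run in sequence, and nothing after an entry that never returns gets executed. The `demo_`/`DEMO_` names and constant values are illustrative assumptions, not the kernel's SYSINIT machinery.

```c
#include <stdlib.h>

#define DEMO_SUB_LAST		0xfffffffu	/* assumed: final subsystem */
#define DEMO_ORDER_MIDDLE	0x1000000u	/* assumed: middle of a subsystem */
#define DEMO_ORDER_ANY		0xfffffffu	/* assumed: last within a subsystem */

struct demo_sysinit {
	unsigned	subsystem;
	unsigned	order;
	int		never_returns;	/* 1 for the scheduler-like entry */
};

/* Compare by subsystem first, then by order within the subsystem. */
static int
demo_sysinit_cmp(const void *a, const void *b)
{
	const struct demo_sysinit *x = a, *y = b;

	if (x->subsystem != y->subsystem)
		return (x->subsystem < y->subsystem ? -1 : 1);
	if (x->order != y->order)
		return (x->order < y->order ? -1 : 1);
	return (0);	/* equal keys: relative order is unspecified */
}

/*
 * Sort the entries and return how many of them would run, counting
 * the entry that never returns as the last one executed.
 */
static size_t
demo_run_sysinits(struct demo_sysinit *v, size_t n)
{
	size_t i;

	qsort(v, n, sizeof(*v), demo_sysinit_cmp);
	for (i = 0; i < n; i++) {
		if (v[i].never_returns)
			return (i + 1);
	}
	return (n);
}
```

If a second entry shared the scheduler's exact key, the sort could place it on either side of the scheduler, so it might never run. Giving it DEMO_ORDER_MIDDLE guarantees it sorts first.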