11612 Commits

Author SHA1 Message Date
jhb
5b4b1c75c0 Style fixes.
Submitted by:	bde
2010-03-11 15:13:55 +00:00
nwhitehorn
142a4d2993 Provide groundwork for 32-bit binary compatibility on non-x86 platforms,
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.

Reviewed by:	kib, jhb
2010-03-11 14:49:06 +00:00
jhb
3a7e251600 Fix a comment nit.
Submitted by:	Alexander Best
2010-03-11 13:16:06 +00:00
jhb
9167691ca3 Add descriptions for debug.ktr sysctl nodes. 2010-03-10 21:35:42 +00:00
imp
3ebe1a7826 Bump up the firmware_table from 30 to 50. bwn needs more than 30, it
seems.
2010-03-07 22:37:35 +00:00
alfred
88b3bf6496 put calls to gzclose() under ifdef COMPRESS_USER_CORES to prevent
undefined symbols on kernels without this option.

Reported by: Alexander Best
2010-03-04 21:53:45 +00:00
rrs
17cf51b0bb sched_getparam was just plain broke for time-share
processes. It did not return an error but instead
just let garbage be passed back. This I fix so
it actually properly translates the priority the
process is at to a posix's high means more priority.
I also fix it so that if the ULE scheduler has bumped
it up to a realtime process you get back a sane value
i.e. the highest priority (63 for time-share).

sched_setscheduler() had the setting of the
timeshare class priority disabled. With some notes
about rejecting the posix high numbers is greater
priority and use nice instead. This fix also
adjusts that to work, with the cavet that a t-s
process may well get bumped up or down i.e. the
setscheduler() will NOT change the nice value only
the current priority. I think this is reasonable
considering if the user wants to play with nice then
he can. At least all the posix'ish interfaces now
respond sanely.

MFC after:	3 weeks
2010-03-03 21:46:51 +00:00
jhb
f9290c7f6d Allow lseek(SEEK_END) to work on disk devices by using the DIOCGMEDIASIZE
to determine the media size.

Submitted by:	nox
MFC after:	1 week
2010-03-03 16:18:04 +00:00
ivoras
14f9175723 Document the VM detection type and sysctl a bit better. 2010-03-02 23:57:42 +00:00
alfred
f34ce3dd38 Merge projects/enhanced_coredumps (r204346) into HEAD:
Enhanced process coredump routines.

  This brings in the following features:
  1) Limit number of cores per process via the %I coredump formatter.
  Example:
    if corefilename is set to %N.%I.core AND num_cores = 3, then
    if a process "rpd" cores, then the corefile will be named
    "rpd.0.core", however if it cores again, then the kernel will
    generate "rpd.1.core" until we hit the limit of "num_cores".

    this is useful to get several corefiles, but also prevent filling
    the machine with corefiles.

  2) Encode machine hostname in core dump name via %H.

  3) Compress coredumps, useful for embedded platforms with limited space.
    A sysctl kern.compress_user_cores is made available if turned on.

    To enable compressed coredumps, the following config options need to be set:
    options COMPRESS_USER_CORES
    device zlib   # brings in the zlib requirements.
    device gzio   # brings in the kernel vnode gzip output module.

  4) Eventhandlers are fired to indicate coredumps in progress.

  5) The imgact sv_coredump routine has grown a flag to pass in more
  state, currently this is used only for passing a flag down to compress
  the coredump or not.

  Note that the gzio facility can be used for generic output of gzip'd
  streams via vnodes.

Obtained from: Juniper Networks
Reviewed by: kan
2010-03-02 06:58:58 +00:00
bruno
3bef33deb1 Deliver siginfo when signal is generated by thr_kill(2) (SI_USER with properly
filled si_uid and si_pid).

Reported by:	Joel Bertrand <joel.bertrand systella fr>
PR:		141956
Reviewed by:	kib
MFC after:	2 weeks
2010-03-01 14:27:16 +00:00
rwatson
fc045dee13 Remove stale comment about socket buffer accounting from access(2) code.
It is the case, however, that the uidinfo of the temporary credential
set up for access(2) is not properly updated when its effective uid is
changed.

MFC after:	3 days
2010-02-27 19:57:40 +00:00
alc
83149d5d10 When running as a guest operating system, the FreeBSD kernel must assume
that the virtual machine monitor has enabled machine check exceptions.
Unfortunately, on AMD Family 10h processors the machine check hardware
has a bug (Erratum 383) that can result in a false machine check exception
when a superpage promotion occurs.  Thus, I am disabling superpage
promotion when the FreeBSD kernel is running as a guest operating system
on an AMD Family 10h processor.

Reviewed by:	jhb, kib
MFC after:	3 days
2010-02-27 18:00:57 +00:00
kib
a8e7da4cf2 For kinfo_proc in kp->ki_siglist, return the set of the signals pending
in the process queue when gathering information for the process, and set
of signals pending for the thread, when gathering information for the
thread. Previously, the sysctl returned a union of the process and some
arbitrary thread pending set for the process, and union of the process
and the thread pending set for the thread.

MFC after:	1 week
2010-02-27 15:32:49 +00:00
kib
695f0b496c Fix several style issues.
Define make_dev_credv() as static to match declaration.

MFC after:	3 days
2010-02-27 15:26:36 +00:00
jilles
9a57b41b18 Include terminated threads in ps's process cpu time field.
MFC after:	2 weeks
2010-02-27 12:15:59 +00:00
brooks
e7e1754f54 Don't inforce an upper bound on kern.ngroups. The INT_MAX-1 limit was
too high due to several overflows.  The actual limit is somewhere in the
neighborhood of INT_MAX/4 on 64-bit machines, but most systems could not
support such a limit due to a lack of memory and the cost of duplicate
credentials.

Reported by:	bde
2010-02-24 15:52:18 +00:00
ed
6c820f6479 Decompose the most lousy named file in sys/kern; kern_subr.c.
Although this file has historically been used as a dumping ground for
random functions, nowadays it only contains functions related to copying
bits {from,to} userspace and hash table utility functions.

Behold, subr_uio.c and subr_hash.c.
2010-02-21 19:53:33 +00:00
bz
7772630fb3 Set curvnet earlier so that it also covers calls to sodisconnect(), which
before were possibly panicing the system in ULP code in the VIMAGE case.

Submitted by:	Igor (igor ispsystem.com)
MFC after:	5 days
2010-02-20 22:29:28 +00:00
attilio
3138bddc87 Use the cached value within comparison.
Submitted by:	jhb
2010-02-19 15:10:05 +00:00
attilio
d4d3373f23 Fix the grammar.
Submitted by:	Brandon Gooch <bgooch at se dot edu>
2010-02-19 15:03:55 +00:00
attilio
5687771459 Fix a race in regard of p_numthreads.
Submitted by:	Giovanni Trematerra
		<giovanni dot trematerra at gmail dot com>
2010-02-19 14:59:41 +00:00
pjd
2d8af38442 - Reduce scope of vnode lock. vfs_mount_alloc() doesn't need vnode to be
locked.
- Remove code duplication.
2010-02-18 22:22:45 +00:00
pjd
a46e983531 Use vput() instead of VOP_UNLOCK()+vrele(). The comment here is out-dated,
we no longer pass thread pointer to VOP_UNLOCK().
2010-02-18 22:14:44 +00:00
pjd
f061fb129c Use NULL instead of 0 when setting up pointer. 2010-02-18 22:12:40 +00:00
neel
aa07cd3091 Kernel module support for mips.
Reviewed by: gonzo

Tested by: Alexandr Rybalko (ray@dlink.ua)
2010-02-18 05:49:52 +00:00
kib
473b74afba Do not leak process lock when current thread is not allowed to see target.
Bumped into by:	ed
MFC after:	3 days
2010-02-14 13:59:01 +00:00
marcel
574fc5dc43 Initialize pve_fsid and pve_fileid to VNOVAL. 2010-02-11 21:10:56 +00:00
marcel
59fa82c67c o Add support for COMPAT_IA32.
o  Incorporate review comments:
   -  Properly reference and lock the map
   -  Take into account that the VM map can change inbetween requests
   -  Add the fileid and fsid attributes

Credits: kib@
Reviewed by: kib@
2010-02-11 18:00:53 +00:00
davidxu
5b7c0a4237 In function umtxq_insert_queue, use parameter q (shared/exclusive queue)
instead of hard coded constant. This does not affect RELENG_8 and previous,
because the code only exists in the HEAD.
2010-02-10 05:47:34 +00:00
marcel
bcd7bda0da Unbreak building kernels with COMPAT_32 enabled. The actual support
for the PT_VM_ENTRY request from 32-bit processes will follow.

Pointy hat: marcel
2010-02-09 17:20:00 +00:00
marcel
764ce56ace Add PT_VM_TIMESTAMP and PT_VM_ENTRY so that the tracing process can
obtain the memory map of the traced process. PT_VM_TIMESTAMP can be
used to check if the memory map changed since the last time to avoid
iterating over all the VM entries unnecesarily.

MFC after:	1 month
2010-02-09 05:52:35 +00:00
ed
d40177139e Remove unused LIBCOMPAT keyword from syscalls.master. 2010-02-08 10:02:01 +00:00
davidxu
7d46cfed0a Set waiters flag before checking semaphore's counter,
otherwise we might lose a wakeup. Tested on postgresql database server.
2010-02-08 07:31:05 +00:00
gavin
c2c8b20929 Spelling nit 2010-02-07 18:00:13 +00:00
ed
760cf1fda3 Remove statistics from the TTY queues.
I added counters to see how often fast copying to userspace was actually
performed, which was only useful during development. Remove these
statistics now we know it to be effective.
2010-02-07 15:42:15 +00:00
mav
fab0a47626 MFp4:
Make CAM to stop all attached devices on system shutdown.
It allows devices to park heads, reducing stress on power loss.
Add `kern.cam.power_down` tunable and sysctl to controll it.
2010-02-03 08:42:08 +00:00
davidxu
47ff0c69ad Fix comments in do_sem_wait(). 2010-02-03 07:21:20 +00:00
davidxu
324cd07ff3 After busied the lock, re-read state word before checking waiters flag,
otherwise, the waiters bit may not be set and a wakeup is lost.

Submitted by:	justin.teller at gmail dot com
MFC after:	3 days
2010-02-03 03:56:32 +00:00
rwatson
ddf7a2e0a7 Only audit pathnames in namei(9) if copying the directory string completes
successfully.  Continue to do this before the empty path check so that the
ENOENT returned in that case gets an empty string token in the BSM record.

MFC after:	3 days
2010-02-02 23:10:27 +00:00
avg
8ad65aae5c KASSERT that return value of interrupt filter complies with contract
For example a return value of zero could lead to a stuck level-triggered
interrupt line.

Reviewed by:	jhb (for INTR_FILTER case)
MFC after:	3 weeks
2010-01-27 09:59:08 +00:00
delphij
d9a0cd0982 Revised revision 199201 (add interface description capability as inspired
by OpenBSD), based on comments from many, including rwatson, jhb, brooks
and others.

Sponsored by:	iXsystems, Inc.
MFC after:	1 month
2010-01-27 00:30:07 +00:00
attilio
953b6f2ced Split out an invariant in order to better check that newtd, when
provided, must be on a runqueue.

Tested by:	Giovanni Trematerra
		<giovanni dot trematerra at gmail dot com>
MFC:		2 weeks
X-MFC:		r202889
2010-01-24 18:16:38 +00:00
attilio
a4281bb884 - Fix the kthread_{suspend, resume, suspend_check}() locking.
In the current code, the locking is completely broken and may lead
  easilly to deadlocks. Fix it by using the proc_mtx, linked to the
  suspending thread, as lock for the operation.  Keep using the
  thread_lock for setting and reading the flag even if it is not entirely
  necessary (atomic ops may do it as well, but this way the code is more
  readable).
- Fix a deadlock within kthread_suspend().
  The suspender should not sleep on a different channel wrt the suspended
  thread, or, otherwise, the awaker should wakeup both. Uniform the
  interface to what the kproc_* counterparts do (sleeping on the same
  channel).
- Change the kthread_suspend_check() prototype.
  kthread_suspend_check() always assumes curthread and must only refer to
  it, so skip the thread pointer as it may be easilly mistaken.
  If curthread is not a kthread, the system will panic.

In collabouration with:	jhb
Tested by:		Giovanni Trematerra
			<giovanni dot trematerra at gmail dot com>
MFC:			2 weeks
2010-01-24 15:07:00 +00:00
attilio
9a7f4738f4 - Fix a race in sched_switch() of sched_4bsd.
In the case of the thread being on a sleepqueue or a turnstile, the
  sched_lock was acquired (without the aid of the td_lock interface) and
  the td_lock was dropped. This was going to break locking rules on other
  threads willing to access to the thread (via the td_lock interface) and
  modify his flags (allowed as long as the container lock was different
  by the one used in sched_switch).
  In order to prevent this situation, while sched_lock is acquired there
  the td_lock gets blocked. [0]
- Merge the ULE's internal function thread_block_switch() into the global
  thread_lock_block() and make the former semantic as the default for
  thread_lock_block(). This means that thread_lock_block() will not
  disable interrupts when called (and consequently thread_unlock_block()
  will not re-enabled them when called). This should be done manually
  when necessary.
  Note, however, that ULE's thread_unblock_switch() is not reaped
  because it does reflect a difference in semantic due in ULE (the
  td_lock may not be necessarilly still blocked_lock when calling this).
  While asymmetric, it does describe a remarkable difference in semantic
  that is good to keep in mind.

[0] Reported by:	Kohji Okuno
			<okuno dot kohji at jp dot panasonic dot com>
Tested by:		Giovanni Trematerra
			<giovanni dot trematerra at gmail dot com>
MFC:			2 weeks
2010-01-23 15:54:21 +00:00
kib
ddbe48bd56 For PT_TO_SCE stop that stops the ptraced process upon syscall entry,
syscall arguments are collected before ptracestop() is called. As a
consequence, debugger cannot modify syscall or its arguments.

For i386, amd64 and ia32 on amd64 MD syscall(), reread syscall number
and arguments after ptracestop(), if debugger modified anything in the
process environment. Since procfs stopeven requires number of syscall
arguments in p_xstat, this cannot be solved by moving stop/trace point
before argument fetching.

Move the code to read arguments into separate function
fetch_syscall_args() to avoid code duplication. Note that ktrace point
for modified syscall is intentionally recorded twice, once with original
arguments, and second time with the arguments set by debugger.

PT_TO_SCX stop is executed after cpu_syscall_set_retval() already.

Reported by:	Ali Polatel <alip exherbo org>
Briefly discussed with:	jhb
MFC after:	3 weeks
2010-01-23 11:45:35 +00:00
kib
a22fc66b6e Staticise sigqueue manipulation functions used only in kern_sig.c.
MFC after:	1 week
2010-01-23 11:43:30 +00:00
kib
604a9a1db3 When traced process is about to receive the signal, the process is
stopped and debugger may modify or drop the signal. After the changes to
keep process-targeted signals on the process sigqueue, another thread
may note the old signal on the queue and act before the thread removes
changed or dropped signal from the process queue. Since process is
traced, it usually gets stopped. Or, if the same signal is delivered
while process was stopped, the thread may erronously remove it,
intending to remove the original signal.

Remove the signal from the queue before notifying the debugger. Restore
the siginfo to the head of sigqueue when signal is allowed to be
delivered to the debugee, using newly introduced KSI_HEAD ksiginfo_t
flag. This preserves required order of delivery. Always restore the
unchanged signal on the curthread sigqueue, not to the process queue,
since the thread is about to get it anyway, because sigmask cannot be
changed.

Handle failure of reinserting the siginfo into the queue by falling
back to sq_kill method, calling sigqueue_add with NULL ksi.

If debugger changed the signal to be delivered, use sigqueue_add()
with NULL ksi instead of only setting sq_signals bit.

Reported by:	Gardner Bell <gbell72 rogers com>
Analyzed and first version of fix by:	Tijl Coosemans <tijl coosemans org>
PR:	142757
Reviewed by:	davidxu
MFC after:	2 weeks
2010-01-20 11:58:04 +00:00
ed
0c91591629 Remove a dead initialization.
Spotted by:	scan-build (uqs)
2010-01-18 18:58:03 +00:00
kib
1c4578d223 Add new function vunref(9) that decrements vnode use count (and hold
count) while vnode is exclusively locked.

The code for vput(9), vrele(9) and vunref(9) is merged.

In collaboration with:	pho
Reviewed by:	alc
MFC after:	3 weeks
2010-01-17 21:24:27 +00:00