Commit Graph

11731 Commits

Author SHA1 Message Date
Colin Percival
32a8b1d832 Correctly copy the M_RDONLY flag when duplicating a reference
to an mbuf external buffer.

Approved by:	so (cperciva)
Approved by:	re (kensmith)
Security:	FreeBSD-SA-10:07.mbuf
2010-07-13 02:45:17 +00:00
Jung-uk Kim
4a82f10889 Use type-specific inline function imax() instead of deprecated macro MAX().
Prodded by:	bde
2010-07-12 15:32:45 +00:00
Alan Cox
2882388376 Change the implementation of vm_hold_free_pages() so that it performs at
most one call to pmap_qremove(), and thus one TLB shootdown, instead of one
call and TLB shootdown per page.

Simplify the interface to vm_hold_free_pages().

MFC after:	3 weeks
2010-07-11 20:11:44 +00:00
Alexander Motin
3bc5958c0e Remove interval validation from cpu_tick_calibrate(). As I found, check
was needed at preliminary version of the patch, where number of CPU ticks
was divided strictly on 16 seconds. Final code instead uses real interval
duration, so precise interval should not be important. Same time aliasing
issues around second boundary causes false positives, periodically logging
useless "t_delta ... too long/short" messages when HZ set below 256.
2010-07-11 16:47:45 +00:00
Alan Cox
b99348e5ea Add support for the VM_ALLOC_COUNT() hint to vm_page_alloc(). Consequently,
the maintenance of vm_pageout_deficit can be localized to just two places:
vm_page_alloc() and vm_pageout_scan().

This change also corrects an off-by-one error in the maintenance of
vm_pageout_deficit.  Historically, the buffer cache functions, allocbuf()
and vm_hold_load_pages(), have not taken into account that vm_page_alloc()
already increments vm_pageout_deficit by one.

Reviewed by:	kib
2010-07-09 19:38:30 +00:00
John Baldwin
e113db82af Accidentally committed an older version of this comment rather than the
final one.
2010-07-09 13:59:53 +00:00
John Baldwin
07b183388a Refine a comment.
Reviewed by:	bde
2010-07-09 13:53:25 +00:00
Jaakko Heinonen
831aa555de Remove redundant high >= 0.
Reported by:	rstone
2010-07-09 10:57:55 +00:00
Jung-uk Kim
4624e08a59 Implement optional 'precision' for numbers. Previously, it was parsed but
ignored.  Some third-party modules (e.g., APCICA) prefer this format over
zero padding flag '0'.
2010-07-08 22:13:23 +00:00
John Baldwin
fc8cca02c7 - Various style and whitespace fixes.
- Make sugid_coredump and kern_logsigexit private to kern_sig.c.

Submitted by:	bde (partially)
MFC after:	1 month
2010-07-08 19:15:26 +00:00
Jaakko Heinonen
501812f2c5 Assert that low and high are >= 0. The allocator doesn't support the
negative range.
2010-07-08 16:53:19 +00:00
Attilio Rao
631cb86f11 - Simplify logic in handling ticks wrap-up
- Fix a bug where thread may be in sleeping state but the wchan won't
  be set, leading to an empty container for sleepq_type(). [0]

Sponsored by:		Sandvine Incorporated
[0] Submitted by:	Bryan Venteicher
			<bryanv at daemoninthecloset dot org>
MFC after:		3 days
X-MFC:			209577
2010-07-07 12:00:11 +00:00
Konstantin Belousov
aa81ae08e9 In revoke(), verify that VCHR vnode indeed belongs to devfs.
Found and tested by:	pho
MFC after:	1 week
2010-07-06 18:20:49 +00:00
Ed Schouten
822eb2b050 Fix a race condition, where a TTY could be destroyed twice.
There are special cases where tty_rel_free() can be called twice in a
row, namely when closing and revoking the TTY at the same moment. Only
call destroy_dev_sched_cb() once.

Reported by:	Jeremie Le Hen
MFC after:	1 week
2010-07-06 08:56:34 +00:00
Konstantin Belousov
5f195aa32e Add the ability for the allocflag argument of the vm_page_grab() to
specify the increment of vm_pageout_deficit when sleeping due to page
shortage. Then, in allocbuf(), the code to allocate pages when extending
vmio buffer can be replaced by a call to vm_page_grab().

Suggested and reviewed by:	alc
MFC after:	2 weeks
2010-07-05 21:13:32 +00:00
Jaakko Heinonen
13c02cbb18 Extend the kernel unit number allocator for allocating specific unit
numbers. This change adds a new function alloc_unr_specific() which
returns the requested unit number if it is free. If the number is
already allocated or out of the range, -1 is returned.

Update alloc_unr(9) manual page accordingly and add a MLINK for
alloc_unr_specific(9).

Discussed on:	freebsd-hackers
2010-07-05 16:23:55 +00:00
Konstantin Belousov
34a39b7b1f Obey sv_syscallnames bounds in syscallname().
Reported and tested by:	pho
2010-07-04 18:16:17 +00:00
Konstantin Belousov
8a26007903 Extend ptrace(PT_LWPINFO) to report siginfo for the signal that caused
debugee stop. The change should keep the ABI. Take care of compat32.

Discussed with:	davidxu, jhb
MFC after:	2 weeks
2010-07-04 11:48:30 +00:00
Alan Cox
41890423b6 Use vm_page_next() instead of vm_page_lookup() in exec_map_first_page()
because vm_page_next() is faster.
2010-07-02 15:50:30 +00:00
John Baldwin
fc0de8f0b6 Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to
<sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
2010-06-30 18:03:42 +00:00
John Baldwin
418a27e99e Update comment for tdsignal() -> tdsendsignal() rename. Forgot to include
this in 209592.
2010-06-30 18:00:45 +00:00
Alan Cox
f4b9ace4f8 Improve bufdone_finish()'s handling of the bogus page. Specifically, if
one or more mappings to the bogus page must be replaced, call pmap_qenter()
just once.  Previously, pmap_qenter() was called for each mapping to the
bogus page.

MFC after:	3 weeks
2010-06-30 04:52:42 +00:00
John Baldwin
7a6f3d7890 Send SIGPIPE to the thread that issued the offending system call
rather than to the entire process.

Reported by:	Anit Chakraborty
Reviewed by:	kib, deischen (concept)
MFC after:	1 week
2010-06-29 20:44:19 +00:00
John Baldwin
ad6eec7b9e Tweak the in-kernel API for sending signals to threads:
- Rename tdsignal() to tdsendsignal() and make it private to kern_sig.c.
- Add tdsignal() and tdksignal() routines that mirror psignal() and
  pksignal() except that they accept a thread as an argument instead of
  a process.  They send a signal to a specific thread rather than to an
  individual process.

Reviewed by:	kib
2010-06-29 20:41:52 +00:00
Doug Barton
d748aee076 If i is going to be used in the loop unconditionally the declaration
has to be unconditional as well.

Conical head covering to:	kib
2010-06-29 01:04:24 +00:00
Konstantin Belousov
13cedde2cb Regenerate 2010-06-28 18:17:21 +00:00
Konstantin Belousov
0d9d996d39 Despite system call deregistration drains the threads executing System V
shm syscalls, and initial check for the number of allocated segments
in the module deinitialization code, the following might happen:
   after the check for active segment, while waiting for threads to
   leave some other syscall, shmget(2) is called. Then, we can end
   up with the shared segment that cannot be detached since sysvshm
   module is unloaded.

Prevent the leak by rechecking and disclaiming a reference to the vm
object owned by sysvshm module, that might have grown during the drain.

Tested by:	pho
Reviewed by:	jhb
MFC after:	1 month
2010-06-28 18:12:42 +00:00
Konstantin Belousov
153ac44cf6 Count number of threads that enter and leave dynamically registered
syscalls. On the dynamic syscall deregistration, wait until all
threads leave the syscall code. This somewhat increases the safety
of the loadable modules unloading.

Reviewed by:	jhb
Tested by:	pho
MFC after:	1 month
2010-06-28 18:06:46 +00:00
Attilio Rao
b2488fc159 Fix a lock leak in the deadlock resolver in case the ticks counter
wrapped up.

Sponsored by:	Sandvine Incorporated
Submitted by:	pluknet <pluknet at gmail dot com>
Reported by:	Anton Yuzhaninov <citrin at citrin dot ru>
Reviewed by:	jhb
MFC after:	3 days
2010-06-28 17:45:00 +00:00
Jaakko Heinonen
bc96d3d17a Correct a comment typo. 2010-06-27 12:19:09 +00:00
Pawel Jakub Dawidek
3297cdd096 Correct arguments order. 2010-06-26 21:44:45 +00:00
Michael Tuexen
e1c97831ec * Do not dereference a NULL pointer when calling an SCTP send syscall
not providing a destination address and using ktrace.
* Do not copy out kernel memory when providing sinfo for sctp_recvmsg().
Both bug where reported by Valentin Nechayev.
The first bug results in a kernel panic.
MFC after: 3 days.
2010-06-26 19:26:20 +00:00
Nathan Whitehorn
bcebf6a165 Reverse the logic of the if statement that sets the default value of
HZ; the list of 1000 Hz platforms was getting unwieldy.

Suggested by:	marcel
2010-06-24 00:27:20 +00:00
Nathan Whitehorn
e864acd42f Move default HZ from 100 to 1000 on powerpc.
Reviewed by:	marcel
MFC after:	2 weeks
2010-06-23 23:26:14 +00:00
Konstantin Belousov
699d648aab Remove the support for int13 FPU exception reporting on i386. It is
believed that all 486-class CPUs FreeBSD is capable to run on, either
have no FPU and cannot use external coprocessor, or have FPU on the
package and can use #MF.

Reviewed by:	bde
Tested by:	pho (previous version)
2010-06-23 11:12:58 +00:00
Alexander Motin
6519968e59 "time lock" is no longer a spin-lock since r209371.
Reported by:	kib@
2010-06-21 21:15:51 +00:00
Ed Schouten
60ae52f785 Use ISO C99 integer types in sys/kern where possible.
There are only about 100 occurences of the BSD-specific u_int*_t
datatypes in sys/kern. The ISO C99 integer types are used here more
often.
2010-06-21 09:55:56 +00:00
Konstantin Belousov
c51050129f Do not report a stack garbage as the old value for debug.ncores sysctl.
Reported by:	brucec
2010-06-21 09:51:25 +00:00
Alexander Motin
875b8844be Implement new event timers infrastructure. It provides unified APIs for
writing event timer drivers, for choosing best possible drivers by machine
independent code and for operating them to supply kernel with hardclock(),
statclock() and profclock() events in unified fashion on various hardware.

Infrastructure provides support for both per-CPU (independent for every CPU
core) and global timers in periodic and one-shot modes. MI management code
at this moment uses only periodic mode, but one-shot mode use planned for
later, as part of tickless kernel project.

For this moment infrastructure used on i386 and amd64 architectures. Other
archs are welcome to follow, while their current operation should not be
affected.

This patch updates existing drivers (i8254, RTC and LAPIC) for the new
order, and adds event timers support into the HPET driver. These drivers
have different capabilities:
 LAPIC - per-CPU timer, supports periodic and one-shot operation, may
freeze in C3 state, calibrated on first use, so may be not exactly precise.
 HPET - depending on hardware can work as per-CPU or global, supports
periodic and one-shot operation, usually provides several event timers.
 i8254 - global, limited to periodic mode, because same hardware used also
as time counter.
 RTC - global, supports only periodic mode, set of frequencies in Hz
limited by powers of 2.

Depending on hardware capabilities, drivers preferred in following orders,
either LAPIC, HPETs, i8254, RTC or HPETs, LAPIC, i8254, RTC.
User may explicitly specify wanted timers via loader tunables or sysctls:
kern.eventtimer.timer1 and kern.eventtimer.timer2.
If requested driver is unavailable or unoperational, system will try to
replace it. If no more timers available or "NONE" specified for second,
system will operate using only one timer, multiplying it's frequency by few
times and uing respective dividers to honor hz, stathz and profhz values,
set during initial setup.
2010-06-20 21:33:29 +00:00
Pawel Jakub Dawidek
d32ef791eb Backout r207970 for now, it can lead to deadlocks.
Reported by:	kan
MFC after:	3 days
2010-06-17 17:39:51 +00:00
Rui Paulo
f05a947676 Make DTrace syscall provider work again by including opt_kdtrace.h here. 2010-06-17 17:34:45 +00:00
Jaakko Heinonen
24e8eaf191 - Fix compilation of the subr_unit.c user space test program.
- Use %zu for size_t in a few format strings.
2010-06-17 16:12:06 +00:00
Andriy Gapon
e7154e7ef1 lock_profile_release_lock: do not compare unsigned with zero
Found by:	Coverity Prevent
CID:		3660
Reviewed by:	jhb
MFC after:	2 weeks
2010-06-17 10:15:13 +00:00
Ed Schouten
2e983ace8f Remove the unit argument from the recently added make_dev_p().
New code that creates character devices shouldn't use device unit
numbers, but only si_drv[12] to hold pointer to per-device data. Make
this function more future proof by removing the unit number argument.

Discussed with:	kib
2010-06-17 08:49:31 +00:00
Jaakko Heinonen
8fa17b7953 Correct the function name in a KASSERT. 2010-06-16 16:02:17 +00:00
Jung-uk Kim
547d94bde3 Implement flexible BPF timestamping framework.
- Allow setting format, resolution and accuracy of BPF time stamps per
listener.  Previously, we were only able to use microtime(9).  Now we can
set various resolutions and accuracies with ioctl(2) BIOCSTSTAMP command.
Similarly, we can get the current resolution and accuracy with BIOCGTSTAMP
command.  Document all supported options in bpf(4) and their uses.

- Introduce new time stamp 'struct bpf_ts' and header 'struct bpf_xhdr'.
The new time stamp has both 64-bit second and fractional parts.  bpf_xhdr
has this time stamp instead of 'struct timeval' for bh_tstamp.  The new
structures let us use bh_tstamp of same size on both 32-bit and 64-bit
platforms without adding additional shims for 32-bit binaries.  On 64-bit
platforms, size of BPF header does not change compared to bpf_hdr as its
members are already all 64-bit long.  On 32-bit platforms, the size may
increase by 8 bytes.  For backward compatibility, struct bpf_hdr with
struct timeval is still the default header unless new time stamp format is
explicitly requested.  However, the behaviour may change in the future and
all relevant code is wrapped around "#ifdef BURN_BRIDGES" for now.

- Add experimental support for tagging mbufs with time stamps from a lower
layer, e.g., device driver.  Currently, mbuf_tags(9) is used to tag mbufs.
The time stamps must be uptime in 'struct bintime' format as binuptime(9)
and getbinuptime(9) do.

Reviewed by:	net@
2010-06-15 19:28:44 +00:00
Alexander Motin
93fc07b434 Virtualize pci_remap_msi_irq() call from general MSI code. It allows MSI
(FSB interrupts) to be used by non-PCI devices, such as HPET.
2010-06-14 07:10:37 +00:00
Konstantin Belousov
f1bb758d4b Add another variation of make_dev(9), make_dev_p(9), that is allowed
to fail and can return useful error code.

Requested by:	jh
Reviewed by:	imp, jh
MFC after:	3 weeks
2010-06-12 13:22:39 +00:00
Konstantin Belousov
76d43557d8 When make_dev_credf(MAKEDEV_WAITOK) is called, use
devctl_notify_f(M_WAITOK) for devfs notifications.

Suggested by:	jh
Reviewed by:	imp, jh
MFC after:	3 weeks
2010-06-12 13:21:25 +00:00
Konstantin Belousov
bebc339116 Add modifications of devctl_notify(9) functions that take flags. Use
flags to specify M_WAITOK/M_NOWAIT. M_WAITOK allows devctl to sleep for
the memory allocation.

As Warner noted, allowing the functions to sleep might cause
reordering of the queued notifications.

Reviewed by:	imp, jh
MFC after:	3 weeks
2010-06-12 13:20:38 +00:00