freebsd-dev

Author	SHA1	Message	Date
John Baldwin	6f7d0018b0	Add a BUS_CHILD_DELETED() method that a bus can hook to allow it to cleanup any bus-specific state (such as ivars) when a child device is deleted. Requested by: kan MFC after: 1 month	2012-08-21 18:13:09 +00:00
Konstantin Belousov	888aefef89	Deliver SIGSYS to the guilty thread, not to the process. MFC after: 1 week	2012-08-18 18:17:10 +00:00
David Xu	e31eb35c3f	regen.	2012-08-17 02:47:16 +00:00
David Xu	d65f1abca7	Implement syscall clock_getcpuclockid2, so we can get a clock id for process, thread or others we want to support. Use the syscall to implement POSIX API clock_getcpuclockid and pthread_getcpuclockid. PR: 168417	2012-08-17 02:26:31 +00:00
John Baldwin	f39f73f47c	Remove D_NEEDGIANT from dead_devsw. biofinish() (and thus dead_strategy) does not need Giant. MFC after: 1 month	2012-08-16 18:04:33 +00:00
Konstantin Belousov	3fa615bc11	As a safety measure, disable lowering pid_max too much. Requested by: Peter Jeremy <peter@rulingia.com> MFC after: 1 week	2012-08-16 13:04:21 +00:00
Konstantin Belousov	abce621c3a	Fix grammar. Submitted by: jh MFC after: 1 week	2012-08-16 13:01:56 +00:00
Warner Losh	79f1fdb83b	Limit popcorn limit to something sane (either 2ns or 2 ticks if that's longer). PR: 156481 Submitted by: Ian Lepore	2012-08-16 02:35:44 +00:00
Alan Cox	33327b9e9b	Correct a KASSERT message. Submitted by: bde	2012-08-15 22:12:01 +00:00
Konstantin Belousov	02c6fc2114	Add a sysctl kern.pid_max, which limits the maximum pid the system is allowed to allocate, and corresponding tunable with the same name. Note that existing processes with higher pids are left intact. MFC after: 1 week	2012-08-15 15:56:21 +00:00
Hans Petter Selasky	c01fc06ee9	Revert r239178 and implement two new functions, namely "device_free_softc()" and "device_claim_softc()", to allow USB serial drivers refcounting the softc. These functions are used to grab the softc from auto-free and to free the softc back to the correct malloc type, respectively. Discussed with: jhb MFC after: 2 weeks	2012-08-15 15:42:57 +00:00
David E. O'Brien	60ee433881	Don't include opt_ddb.h & <ddb/ddb.h> twice.	2012-08-15 14:18:54 +00:00
Jaakko Heinonen	2f0ac2593b	Reserve room for the terminating NUL when setting or getting kernel environment variables. KENV_MNAMELEN and KENV_MVALLEN don't include space for the terminating NUL.	2012-08-14 19:16:30 +00:00
David Xu	d7f97db7bd	Some style fixes inspired by @bde.	2012-08-11 23:48:39 +00:00
Alexander Motin	37f4e0254f	Some more minor tunings inspired by bde@.	2012-08-11 20:24:39 +00:00
Alexander Motin	bf89d544d0	Allow idle threads to steal second threads from other cores on systems with 8 or more cores to improve utilization. None of my tests on a 2xXeon (2x6x2) system showed any slowdown from the mentioned "excess thrashing". At the same time, in the pbzip2 test with the number of threads more than the number of CPUs I see up to 10% speedup with SMT disabled and up to 5% with SMT enabled. Thinking about thrashing, I tried to limit that stealing to within the same last level cache, but got only worse results. The present code prefers to steal threads from topologically closer cores anyway. Sponsored by: iXsystems, Inc.	2012-08-11 15:08:19 +00:00
David Xu	e8afbca2bc	tvtohz will print out an error message if a negative value is given to it, avoid this problem by detecting timeout earlier. Reported by: pho	2012-08-11 00:06:56 +00:00
Alexander Motin	579895df01	Some minor tunings/cleanups inspired by bde@ after previous commits: - remove extra dynamic variable initializations; - restore (4BSD) and implement (ULE) hogticks variable setting; - make sched_rr_interval() more tolerant to options; - restore (4BSD) and implement (ULE) kern.sched.quantum sysctl, a more user-friendly wrapper for sched_slice; - tune some sysctl descriptions; - make some style fixes.	2012-08-10 19:02:49 +00:00
Alexander Motin	9000aabf3b	sched_rr_interval() seems to have always returned a period in hz ticks, but it was always used as a rate. Fix the use side units to period in hz ticks.	2012-08-10 18:19:57 +00:00
Hans Petter Selasky	ea1bd564ac	Add new device method to free the automatically allocated softc structure which is returned by device_get_softc(). This method can be used to easily implement softc refcounting. This can be desirable when the softc has memory references which are controlled by userspace handles for example. This solves the problem of blocking the caller of device_detach() for a non-deterministic time. Discussed with: kib, ed MFC after: 2 weeks	2012-08-10 15:02:49 +00:00
Alexander Motin	3d7f41175d	Rework r220198 change (by fabient). I believe it solves the problem from the wrong direction. Before it, if preemption and end of time slice happened at the same time, the thread was put at the head of the queue as for preemption only. It could cause a single thread to run for an indefinitely long time. r220198 handles it by not clearing TDF_NEEDRESCHED in case of preemption. But that causes a delayed context switch every time preemption happens, even when not needed. Solve the problem by introducing the scheduler-specific thread flag TDF_SLICEEND, set when the thread's time slice is over and it should be put at the tail of the queue. Using the SW_PREEMPT flag for that purpose, as it was before, is just not informative enough to work correctly. In my tests this reduces run time deviation by 2-3 times (improves fairness) in cases when several threads share one CPU. Reviewed by: fabient MFC after: 2 months Sponsored by: iXsystems, Inc.	2012-08-09 19:26:13 +00:00
Alexander Motin	48317e9e27	SCHED_4BSD scheduling quantum mechanism appears to be broken for some time. With switchticks variable being reset each time thread preempted (that is done regularly by interrupt threads) scheduling quantum may never expire. It was not noticed in time because several other factors still regularly trigger context switches. Handle the problem by replacing that mechanism with its equivalent from SCHED_ULE called time slice. It is effectively the same, just measured in context of stathz instead of hz. Some unification is probably not bad.	2012-08-09 18:09:59 +00:00
Konstantin Belousov	c0c6e95f7f	Always initialize pl_event. Submitted by: Andrey Zonov <andrey@zonov.org> MFC after: 3 days	2012-08-08 00:20:30 +00:00
Alexander Kabaev	c9516c94b4	Do not add handler to event handlers list until ithread is created. In rare event when fast and ithread interrupts share the same vector and the fast handler was registered first, we can end up trying to schedule the ithread that is not created yet. The kernel built with INVARIANTS then triggers an assertion. Change the order to create the ithread first and only then add the handler that needs it to the interrupt event handlers list. Reviewed by: jhb	2012-08-06 16:37:43 +00:00
Konstantin Belousov	1c771f9222	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
Alexander Motin	2038943013	Partially MFcalloutng r238425 (by davide): Fix an issue related to old periodic timers. The code in kern_clocksource.c uses interrupt to keep track of time, and this time may not match with binuptime(). In order to address such incoherency, switch periodic timers to binuptime(). Except further calloutng it is needed for already present cyclic subsystem.	2012-08-04 08:06:37 +00:00
Alexander Motin	9b71c63a8b	Partially MFcalloutng r236894 (by davide): ... While here, Bruce Evans told me that "unsigned int" is spelled "u_int" in KNF, so replace it where needed.	2012-08-04 07:46:58 +00:00
Alexander Motin	c0722d20d3	Microoptimize time math. As long as our event periods are always below one second we may skip adding the integer parts by using bintime_addx() instead of bintime_add(). Profiling shows handleevents() time reduction by 15%.	2012-08-03 09:08:20 +00:00
John Baldwin	e838f09cd0	Reorder the management of advisory locks on open files so that the advisory lock is obtained before the write count is increased during open() and the lock is released after the write count is decreased during close(). The first change closes a race where an open() that will block with O_SHLOCK or O_EXLOCK can increase the write count while it waits. If the process holding the current lock on the file then tries to call exec() on the file it has locked, it can fail with ETXTBSY even though the advisory lock is preventing other threads from successfully completing a writable open(). The second change closes a race where a read-only open() with O_SHLOCK or O_EXLOCK may return successfully while the write count is non-zero due to another descriptor that had the advisory lock and was blocking the open() still being in the process of closing. If the process that completed the open() then attempts to call exec() on the file it locked, it can fail with ETXTBSY even though the other process that held a write lock has closed the file and released the lock. Reviewed by: kib MFC after: 1 month	2012-07-31 18:25:00 +00:00
David Xu	5ff2bb52cc	I am comparing current pipe code with the one in 8.3-STABLE r236165, I found 8.3 is a historical BSD version using socket to implement FIFO pipe, it uses per-file seqcount to compare with writer generation stored in per-pipe object. The concept is after all writers are gone, the pipe enters next generation, all old readers that have not closed the pipe should get the indication that the pipe is disconnected, result is they should get EPIPE, SIGPIPE or get POLLHUP in poll(). But newcomer should not know that previous writers were gone, it should treat it as a fresh session. I am trying to bring back FIFO pipe to historical behavior. It is still unclear that if single EOF flag can represent SBS_CANTSENDMORE and SBS_CANTRCVMORE which socket-based version is using, but I have run the poll regression test in tool directory, output is same as the one on 8.3-STABLE now. I think the output "not ok 18 FIFO state 6b: poll result 0 expected 1. expected POLLHUP; got 0" might be bogus, because newcomer should not know that old writers were gone. I got the same behavior on Linux. Our implementation always returns POLLIN for a disconnected pipe even when it should return POLLHUP, but I think it is not wise to remove POLLIN for compatibility reasons, this is our historical behavior. Regression test: /usr/src/tools/regression/poll	2012-07-31 05:48:35 +00:00
David Xu	12a480fa41	When a thread is blocked in direct write state, it only sets PIPE_DIRECTW flag but not PIPE_WANTW, but FIFO pipe code does not understand this internal state, when a FIFO peer reader closes the pipe, it wants to notify the writer, it checks PIPE_WANTW, if not set, it skips calling wakeup(), so blocked writer never noticed the case, but in general, the writer should return from the syscall with EPIPE error code and may get SIGPIPE signal. Setting the PIPE_WANTW fixed problem, or you can turn off direct write, it should fix the problem too. This bug is found by PR/170203. Another bug in FIFO pipe code is when peer closes the pipe, another end which is being blocked in select() or poll() is not notified, it missed to call pipeselwakeup(). Third problem is found in poll regression test, the existing code can not pass 6b,6c,6d tests, but FreeBSD-4 works. This commit does not fix the problem, I still need to study more to find the cause. PR: 170203 Tested by: Garrett Copper < yanegomi at gmail dot com >	2012-07-31 02:00:37 +00:00
Davide Italiano	6e465ac7ce	Until now KTR_ENTRIES, which defines the size of circular buffer used in ktr(4), was constrained to be a power of two. Remove this constraint and update sys/conf/NOTES accordingly. Reviewed by: jhb Approved by: gnn (mentor) Sponsored by: Google Summer of Code 2012	2012-07-30 22:46:42 +00:00
Konstantin Belousov	d8c1da8b90	Add F_DUP2FD_CLOEXEC. Apparently Solaris 11 already did this. Submitted by: Jukka A. Ukkonen <jau iki fi> PR: standards/169962 MFC after: 1 week	2012-07-27 10:41:10 +00:00
Konstantin Belousov	481af8b933	Cosmetics: define FREEBSD32_MINUSER and AOUT32_MINUSER for struct sysentvec .sv_minuser. Also improve style. Submitted by: Oliver Pinter <oliver.pinter@gmail.com> MFC after: 1 week	2012-07-22 13:41:45 +00:00
Konstantin Belousov	a53cab2c6c	(Incomplete) fixes for symbols visibility issues and style in fcntl.h. Append '__' prefix to the tag of struct oflock, and put it under BSD namespace. Structure is needed both by libc and kernel, thus cannot be hidden under #ifdef _KERNEL. Move a set of non-standard F_* and O_* constants into BSD namespace. SUSv4 explicitly allows implementation to pollute F_* and O_* names after fcntl.h is included, but it costs us nothing to adhere to the specification if exact POSIX compliance level is requested by user code. Change some spaces after #define to tabs. Noted by and discussed with: bde MFC after: 1 week	2012-07-21 13:02:11 +00:00
Konstantin Belousov	eb3d975443	Remove line which was accidentally kept in r238614. Submitted by: pjd Pointy hat to: kib MFC after: 1 week	2012-07-19 20:38:03 +00:00
Konstantin Belousov	d1ae5c8337	Fix several reads beyond the mapped first page of the binary in the ELF parser. Specifically, do not allow note reader and interpreter path comparison in the brandelf code to read past end of the page. This may happen if specially crafted ELF image is activated. Submitted by: Lukasz Wojcik <lukasz.wojcik zoho com> MFC after: 3 days	2012-07-19 11:15:53 +00:00
Konstantin Belousov	49d02b13bc	Implement F_DUPFD_CLOEXEC command for fcntl(2), specified by SUSv4. PR: standards/169962 Submitted by: Jukka A. Ukkonen <jau iki fi> MFC after: 1 week	2012-07-19 10:22:54 +00:00
George V. Neville-Neil	57d025c338	Add support for walltimestamp in DTrace. Submitted by: Fabian Keil MFC after: 2 weeks	2012-07-16 20:17:19 +00:00
Gabor Pali	599fc82b06	- Add support for displaying process stack memory regions. Approved by: rwatson MFC after: 3 days	2012-07-16 09:38:19 +00:00
Matthew D Fleming	f806cdcf99	Fix a bug with memguard(9) on 32-bit architectures without a VM_KMEM_MAX_SIZE. The code was not taking into account the size of the kernel_map, which the kmem_map is allocated from, so it could produce a sub-map size too large to fit. The simplest solution is to ignore VM_KMEM_MAX entirely and base the memguard map's size off the kernel_map's size, since this is always relevant and always smaller. Found by: Justin Hibbits	2012-07-15 20:29:48 +00:00
John Baldwin	2919668490	Make the interval timings for EVFILT_TIMER more accurate. tvtohz() always adds an extra tick to account for the current partial clock tick. However, that is not appropriate for a repeating timer when the exact tvtohz() value should be used for subsequent intervals. Fix repeating callouts for EVFILT_TIMER by subtracting 1 tick from the tvtohz() result similar to the fix used in realitexpire() for interval timers. While here, update a few comments to note that if the EVFILT_TIMER code were to move out of kern_event.c, it should move to kern_time.c (where the interval timer code it mimics lives) rather than kern_timeout.c. MFC after: 1 month	2012-07-13 13:24:33 +00:00
Konstantin Belousov	92a9c65b06	Fix build for kernels with dtrace hooks. MFC after: 1 month	2012-07-11 18:50:50 +00:00
George V. Neville-Neil	3fac94ba94	Initial commit of an I/O provider for DTrace on FreeBSD. These probes are most useful when looking into the structures they provide, which are listed in io.d. For example: dtrace -n 'io:genunix::start { printf("%d\n", args[0]->bio_bcount); }' Note that the I/O systems in FreeBSD and Solaris/Illumos are sufficiently different that there is not a 1:1 mapping from scripts that work with one to the other. MFC after: 1 month	2012-07-11 16:27:02 +00:00
David Xu	7ce60f6013	Always clear p_xthread if current thread no longer needs it, in theory, if debugger exited without calling ptrace(PT_DETACH), there is a time window that the p_xthread may be pointing to non-existing thread, in practice, this is not a problem because child process soon will be killed by parent process.	2012-07-10 05:45:13 +00:00
David Xu	5985d61556	If you have pressed CTRL+Z and a process is suspended, then you use gdb to attach to the process, it is surprising that the process is resumed without inputting any gdb commands, however ptrace manual said: The tracing process will see the newly-traced process stop and may then control it as if it had been traced all along. But the current code does not work in this way, unless traced process received a signal later, it will continue to run as a background task. To fix this problem, just send signal SIGSTOP to the traced process after we resumed it, this works like that you are attaching to a running process, it is not perfect but better than nothing.	2012-07-09 09:24:46 +00:00
Mateusz Guzik	4fd85c4b5d	Follow-up commit to r238220: Pass only FEXEC (instead of FREAD|FEXEC) in fgetvp_exec. _fget has to check for !FWRITE anyway and may as well know about FREAD. Make _fget code a bit more readable by converting permission checking from if() to switch(). Assert that correct permission flags are passed. In collaboration with: kib Approved by: trasz (mentor) MFC after: 6 days X-MFC: with r238220	2012-07-09 05:39:31 +00:00
Mateusz Guzik	28a7f60741	Unbreak handling of descriptors opened with O_EXEC by fexecve(2). While here return EBADF for descriptors opened for writing (previously it was ETXTBSY). Add fgetvp_exec function which performs appropriate checks. PR: kern/169651 In collaboration with: kib Approved by: trasz (mentor) MFC after: 1 week	2012-07-08 00:51:38 +00:00
Mikolaj Golub	e71a7957bd	Fix KASSERT message. MFC after: 3 days	2012-07-03 19:08:02 +00:00
Konstantin Belousov	c5c1199c83	Extend the KPI to lock and unlock f_offset member of struct file. It now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is 64bit quantity, always read and written under the mtxpool protection. This fixes apparently easy to trigger race when parallel lseek()s or lseek() and read/write could destroy file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks	2012-07-02 21:01:03 +00:00


12787 Commits