freebsd-skq

Author	SHA1	Message	Date
avg	ab04d6fe3f	bus_add_child: add specialized default implementation that calls panic If a kobj method doesn't have any explicitly provided default implementation, then it is auto-assigned kobj_error_method. kobj_error_method is proper only for methods that return error code, because it just returns ENXIO. So, in the case of unimplemented bus_add_child caller would get (device_t)ENXIO as a return value, which would cause the mistake to go unnoticed, because return value is typically checked for NULL. Thus, a specialized null_add_child is added. It would have sufficied for correctness to return NULL, but this type of mistake was deemed to be rare and serious enough to call panic instead. Watch out for this kind of problem with other kobj methods. Suggested by: jhb, imp MFC after: 2 weeks	2010-09-13 08:34:20 +00:00
mav	eb4931dc6c	Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generating at full rate of hz + stathz to fullfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle. There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends of chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how much times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they busy or not. By default this options is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generating. As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads. Tested by: many (on i386, amd64, sparc64 and powerc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.	2010-09-13 07:25:35 +00:00
mav	29989e4d9c	Do not print "frequency 0 Hz", when frequency is unknown.	2010-09-11 20:18:15 +00:00
kan	a47d3fbf19	Add missing pointer increment to sbuf_cat.	2010-09-11 19:42:50 +00:00
kib	107ea66c07	Protect mnt_syncer with the sync_mtx. This prevents a (rare) vnode leak when mount and update are executed in parallel. Encapsulate syncer vnode deallocation into the helper function vfs_deallocate_syncvnode(), to not externalize sync_mtx from vfs_subr.c. Found and reviewed by: jh (previous version of the patch) Tested by: pho MFC after: 3 weeks	2010-09-11 13:06:06 +00:00
mav	90db957786	Merge some SCHED_ULE features to SCHED_4BSD: - Teach SCHED_4BSD to inform cpu_idle() about high sleep/wakeup rate to choose optimized handler. In case of x86 it is MONITOR/MWAIT. Also it will be needed to bypass forthcoming idle tick skipping logic to not consume resources on events rescheduling when it won't give any benefits. - Teach SCHED_4BSD to wake up idle CPUs without using IPI. In case of x86, when MONITOR/MWAIT is active, it require just single memory write. This doubles performance on some heavily switching test loads.	2010-09-11 07:08:22 +00:00
jamie	5a233127aa	Don't exit kern_jail_set without freeing options when enforce_statfs has an illegal value. MFC after: 3 days	2010-09-10 21:45:42 +00:00
mdf	ab3a8b533a	Replace sbuf_overflowed() with sbuf_error(), which returns any error code associated with overflow or with the drain function. While this function is not expected to be used often, it produces more information in the form of an errno that sbuf_overflowed() did.	2010-09-10 16:42:16 +00:00
mav	aa2a743453	Do not IPI CPU that is already spinning for load. It doubles effect of spining (comparing to MWAIT) on some heavly switching test loads.	2010-09-10 13:24:47 +00:00
avg	c9fe8ad7f0	bus_add_child: change type of order parameter to u_int This reflects actual type used to store and compare child device orders. Change is mostly done via a Coccinelle (soon to be devel/coccinelle) semantic patch. Verified by LINT+modules kernel builds. Followup to: r212213 MFC after: 10 days	2010-09-10 11:19:03 +00:00
mdf	bc54684253	Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk	2010-09-09 18:33:46 +00:00
mdf	73d2d3f18e	Add drain functionality to sbufs. The drain is a function that is called when the sbuf internal buffer is filled. For kernel sbufs with a drain, the internal buffer will never be expanded. For userland sbufs with a drain, the internal buffer may still be expanded by sbuf_[v]printf(3). Sbufs now have three basic uses: 1) static string manipulation. Overflow is marked. 2) dynamic string manipulation. Overflow triggers string growth. 3) drained string manipulation. Overflow triggers draining. In all cases the manipulation is 'safe' in that overflow is detected and managed. Reviewed by: phk (the previous version)	2010-09-09 17:49:18 +00:00
mdf	3b526ad3fb	Refactor sbuf code so that most uses of sbuf_extend() are in a new sbuf_put_byte(). This makes it easier to add drain functionality when a buffer would overflow as there are fewer code points. Reviewed by: phk	2010-09-09 16:51:52 +00:00
rpaulo	a2f2f93652	Fix two bugs in DTrace: * when the process exits, remove the associated USDT probes * when the process forks, duplicate the USDT probes. Sponsored by: The FreeBSD Foundation	2010-09-09 09:58:05 +00:00
pjd	621aa135f8	Remove VI_MOUNT flag from vnode on VFS_MOUNT() failure.	2010-09-09 07:55:13 +00:00
pjd	cb66a2a961	Doing first mount and updating mount points are both handled by the same syscall and the same function, but are very different and share almost no code. To make it easier to read and analyze, split vfs_domount() into vfs_domount_first() and vfs_domount_update(). Reviewed by: kib	2010-09-08 21:00:53 +00:00
pjd	86d6e6cf79	- Log all the problems in devfs_fixup(). - Correct error paths. The system will be useless on devfs_fixup() failure, so why bother? Maybe for the same reason why a dead body is washed and dressed in a nice suit before it is put into a coffin? Maybe system's last will is to panic without any locks held? Reviewed by: kib	2010-09-08 20:56:18 +00:00
avg	ef26efcc43	subr_bus: use hexadecimal representation for bit flags It seems that this format is more custom in our code, and it is more convenient too. Suggested by: jhb No objection: imp MFC after: 1 week	2010-09-08 17:35:06 +00:00
tuexen	908a61906a	Implement correct handling of address parameter and sendinfo for SCTP send calls. MFC after: 4 weeks.	2010-09-05 20:13:07 +00:00
mav	4f9dee93a3	Initialize buffer for case of empty string. Happens only on non-refactored platforms.	2010-09-05 06:16:04 +00:00
avg	f8094d8ba9	struct device: widen type of flags and order fields to u_int Also change int -> u_int for order parameter in device_add_child_ordered. There should not be any ABI change as struct device is private to subr_bus.c and the API change should be compatible. To do: change int -> u_int for order parameter of bus_add_child method and its implementations. The change should also be API compatible, but is a bit more churn. Suggested by: imp, jhb MFC after: 1 week	2010-09-04 17:28:29 +00:00
mdf	62a144da37	Use a better #if guard. Suggested by pluknet <pluknet at gmail dot com>.	2010-09-03 17:42:17 +00:00
mdf	253bb8c9c3	Style(9) fixes and eliminate the use of min().	2010-09-03 17:42:12 +00:00
mdf	9b6269741b	Fix user-space libsbuf build. Why isn't CTASSERT available to user-space?	2010-09-03 17:23:26 +00:00
mdf	b46510221b	Fix brain fart when converting an if statement into a KASSERT.	2010-09-03 16:12:39 +00:00
mdf	edfefb2af9	Use math rather than iteration when the desired sbuf size is larger than SBUF_MAXEXTENDSIZE.	2010-09-03 16:09:17 +00:00
gibbs	6833acab2d	Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic. Add the BIO_ORDERED flag for struct bio and update bio clients to use it. The barrier semantics of bioq_insert_tail() were broken in two ways: o In bioq_disksort(), an added bio could be inserted at the head of the queue, even when a barrier was present, if the sort key for the new entry was less than that of the last queued barrier bio. o The last_offset used to generate the sort key for newly queued bios did not stay at the position of the barrier until either the barrier was de-queued, or a new barrier (which updates last_offset) was queued. When a barrier is in effect, we know that the disk will pass through the barrier position just before the "blocked bios" are released, so using the barrier's offset for last_offset is the optimal choice. sys/geom/sched/subr_disk.c: sys/kern/subr_disk.c: o Update last_offset in bioq_insert_tail(). o Only update last_offset in bioq_remove() if the removed bio is at the head of the queue (typically due to a call via bioq_takefirst()) and no barrier is active. o In bioq_disksort(), if we have a barrier (insert_point is non-NULL), set prev to the barrier and cur to it's next element. Now that last_offset is kept at the barrier position, this change isn't strictly necessary, but since we have to take a decision branch anyway, it does avoid one, no-op, loop iteration in the while loop that immediately follows. o In bioq_disksort(), bypass the normal sort for bios with the BIO_ORDERED attribute and instead insert them into the queue with bioq_insert_tail(). bioq_insert_tail() not only gives the desired command order during insertion, but also provides barrier semantics so that commands disksorted in the future cannot pass the just enqueued transaction. sys/sys/bio.h: Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio. sys/cam/ata/ata_da.c: sys/cam/scsi/scsi_da.c Use an ordered command for SCSI/ATA-NCQ commands issued in response to bios with the BIO_ORDERED flag set. sys/cam/scsi/scsi_da.c Use an ordered tag when issuing a synchronize cache command. Wrap some lines to 80 columns. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c sys/geom/geom_io.c Mark bios with the BIO_FLUSH command as BIO_ORDERED. Sponsored by: Spectra Logic Corporation MFC after: 1 month	2010-09-02 19:40:28 +00:00
mdf	200dc21dcc	Fix UP build. MFC after: 2 weeks	2010-09-02 16:23:05 +00:00
mdf	bbc3957715	Fix a bug with sched_affinity() where it checks td_pinned of another thread in a racy manner, which can lead to attempting to migrate a thread that is pinned to a CPU. Instead, have sched_switch() determine which CPU a thread should run on if the current one is not allowed. KASSERT in sched_bind() that the thread is not yet pinned to a CPU. KASSERT in sched_switch() that only migratable threads or those moving due to a sched_bind() are changing CPUs. sched_affinity code came from jhb@. MFC after: 2 weeks	2010-09-01 20:32:47 +00:00
mlaier	2ef34e9086	rmlock(9) two additions and one change/fix: - add rm_try_rlock(). - add RM_SLEEPABLE to use sx(9) as the back-end lock in order to sleep while holding the write lock. - change rm_noreadtoken to a cpu bitmask to indicate which CPUs need to go through the lock/unlock in order to synchronize. As a side effect, this also avoids IPI to CPUs without any readers during rm_wlock. Discussed with: ups@, rwatson@ on arch@ Sponsored by: Isilon Systems, Inc.	2010-09-01 19:50:03 +00:00
emaste	5216dea167	As long as we are going to panic anyway, there's no need to hide additional information behind DIAGNOSTIC.	2010-09-01 13:47:11 +00:00
davidxu	be8bfd2384	rescure comments from RELENG_4.	2010-09-01 01:26:07 +00:00
mdf	42170bf6d6	The realloc case for memguard(9) will copy too many bytes when reallocating to a smaller-sized allocation. Fix this issue. Noticed by: alc Reviewed by: alc Approved by: zml (mentor) MFC after: 3 weeks	2010-08-31 16:57:58 +00:00
davidxu	ae90c2fd09	If a process is being debugged, skips job control caused by SIGSTOP/SIGCONT signals, because it is managed by debugger, however a normal signal sent to a interruptibly sleeping thread wakes up the thread so it will handle the signal when the process leaves the stopped state. PR: 150138 MFC after: 1 week	2010-08-31 07:15:50 +00:00
jh	c2cb836190	execve(2) has a special check for file permissions: a file must have at least one execute bit set, otherwise execve(2) will return EACCES even for an user with PRIV_VFS_EXEC privilege. Add the check also to vaccess(9), vaccess_acl_nfs4(9) and vaccess_acl_posix1e(9). This makes access(2) to better agree with execve(2). Because ZFS doesn't use vaccess(9) for VEXEC, add the check to zfs_freebsd_access() too. There may be other file systems which are not using vaccess*() functions and need to be handled separately. PR: kern/125009 Reviewed by: bde, trasz Approved by: pjd (ZFS part)	2010-08-30 16:30:18 +00:00
kib	8505815b26	Regen	2010-08-30 14:26:02 +00:00
kib	0a6b8011f4	Make the syscalls reserved for AFS usable by OpenAFS port. Submitted by: Benjamin Kaduk <kaduk mit edu> MFC after: 2 weeks	2010-08-30 14:24:44 +00:00
kib	9d21e17f07	For some file types, select code registers two selfd structures. E.g., for socket, when specified POLLIN\|POLLOUT in events, you would have one selfd registered for receiving socket buffer, and one for sending. Now, if both events are not ready to fire at the time of the initial scan, but are simultaneously ready after the sleep, pollrescan() would iterate over the pollfd struct twice. Since both times revents is not zero, returned value would be off by one. Fix this by recalculating the return value in pollout(). PR: kern/143029 MFC after: 2 weeks	2010-08-28 17:42:08 +00:00
pjd	bc73fabf27	There is a bug in vfs_allocate_syncvnode() failure handling in mount code. Actually it is hard to properly handle such a failure, especially in MNT_UPDATE case. The only reason for the vfs_allocate_syncvnode() function to fail is getnewvnode() failure. Fortunately it is impossible for current implementation of getnewvnode() to fail, so we can assert this and make vfs_allocate_syncvnode() void. This in turn free us from handling its failures in the mount code. Reviewed by: kib MFC after: 1 month	2010-08-28 08:57:15 +00:00
pjd	43af1b0877	Run all tasks from a proper context, with proper priority, etc. Reviewed by: jhb MFC after: 1 month	2010-08-28 08:38:03 +00:00
kib	65295a82b8	Fix typo. Submitted by: Ben Kaduk <minimarmot gmail com>	2010-08-26 11:20:57 +00:00
brian	e098f7b033	If we read zero bytes from the directory, early out with ENOENT rather than forging ahead and interpreting garbage buffer content and dirent structures. This change backs out r211684 which was essentially a no-op. MFC after: 1 week	2010-08-25 18:09:51 +00:00
davidxu	86cb0861ef	If a thread is removed from umtxq while sleeping, reset error code to zero, this gives userland a better indication that a thread needn't to be cancelled.	2010-08-25 03:14:32 +00:00
davidxu	22bc7d14ad	Optimize thr_suspend, if timeout is zero, don't call msleep, just return immediately.	2010-08-24 07:29:55 +00:00
davidxu	6616b254f2	- According to specification, SI_USER code should only be generated by standard kill(). On other systems, SI_LWP is generated by lwp_kill(). This will allow conforming applications to differentiate between signals generated by standard events and those generated by other implementation events in a manner compatible with existing practice. - Bump __FreeBSD_version	2010-08-24 07:22:24 +00:00
imp	50d4a3193c	This should really be MACHINE not MACHINE_ARCH, and is this Makefile even used?	2010-08-23 06:22:35 +00:00
brian	89b2d8bbb4	uio_resid isn't updated by VOP_READDIR for nfs filesystems. Use the uio_offset adjustment instead to calculate a correct *len. Without this change, we run off the end of the directory data we're reading and panic horribly for nfs filesystems. MFC after: 1 week	2010-08-23 05:33:31 +00:00
rpaulo	dda24289cb	Call the systrace_probe_func() when the error value. Sponsored by: The FreeBSD Foundation	2010-08-22 11:30:49 +00:00
rpaulo	ea11ba6788	Add an extra comment to the SDT probes definition. This allows us to get use '-' in probe names, matching the probe names in Solaris.[1] Add userland SDT probes definitions to sys/sdt.h. Sponsored by: The FreeBSD Foundation Discussed with: rwaston [1]	2010-08-22 11:18:57 +00:00
rpaulo	6f62630bc2	Bump KDTRACE_THREAD_ZERO and use M_ZERO as a malloc flag instead of calling bzero. Sponsored by: The FreeBSD Foundation	2010-08-22 11:09:53 +00:00

1 2 3 4 5 ...

11848 Commits