freebsd-skq

Author	SHA1	Message	Date
kib	9d21e17f07	For some file types, the select code registers two selfd structures. E.g., for a socket, when POLLIN|POLLOUT is specified in events, you would have one selfd registered for the receiving socket buffer and one for the sending buffer. Now, if both events are not ready to fire at the time of the initial scan, but are simultaneously ready after the sleep, pollrescan() would iterate over the pollfd struct twice. Since revents is nonzero both times, the returned value would be off by one. Fix this by recalculating the return value in pollout(). PR: kern/143029 MFC after: 2 weeks	2010-08-28 17:42:08 +00:00
pjd	bc73fabf27	There is a bug in vfs_allocate_syncvnode() failure handling in mount code. Actually it is hard to properly handle such a failure, especially in the MNT_UPDATE case. The only reason for the vfs_allocate_syncvnode() function to fail is getnewvnode() failure. Fortunately it is impossible for the current implementation of getnewvnode() to fail, so we can assert this and make vfs_allocate_syncvnode() void. This in turn frees us from handling its failures in the mount code. Reviewed by: kib MFC after: 1 month	2010-08-28 08:57:15 +00:00
pjd	43af1b0877	Run all tasks from a proper context, with proper priority, etc. Reviewed by: jhb MFC after: 1 month	2010-08-28 08:38:03 +00:00
kib	65295a82b8	Fix typo. Submitted by: Ben Kaduk <minimarmot gmail com>	2010-08-26 11:20:57 +00:00
brian	e098f7b033	If we read zero bytes from the directory, early out with ENOENT rather than forging ahead and interpreting garbage buffer content and dirent structures. This change backs out r211684 which was essentially a no-op. MFC after: 1 week	2010-08-25 18:09:51 +00:00
davidxu	86cb0861ef	If a thread is removed from umtxq while sleeping, reset the error code to zero; this gives userland a better indication that the thread need not be cancelled.	2010-08-25 03:14:32 +00:00
davidxu	22bc7d14ad	Optimize thr_suspend, if timeout is zero, don't call msleep, just return immediately.	2010-08-24 07:29:55 +00:00
davidxu	6616b254f2	- According to specification, SI_USER code should only be generated by standard kill(). On other systems, SI_LWP is generated by lwp_kill(). This will allow conforming applications to differentiate between signals generated by standard events and those generated by other implementation events in a manner compatible with existing practice. - Bump __FreeBSD_version	2010-08-24 07:22:24 +00:00
imp	50d4a3193c	This should really be MACHINE not MACHINE_ARCH, and is this Makefile even used?	2010-08-23 06:22:35 +00:00
brian	89b2d8bbb4	uio_resid isn't updated by VOP_READDIR for nfs filesystems. Use the uio_offset adjustment instead to calculate a correct *len. Without this change, we run off the end of the directory data we're reading and panic horribly for nfs filesystems. MFC after: 1 week	2010-08-23 05:33:31 +00:00
rpaulo	dda24289cb	Call the systrace_probe_func() with the error value. Sponsored by: The FreeBSD Foundation	2010-08-22 11:30:49 +00:00
rpaulo	ea11ba6788	Add an extra comment to the SDT probes definition. This allows us to use '-' in probe names, matching the probe names in Solaris.[1] Add userland SDT probe definitions to sys/sdt.h. Sponsored by: The FreeBSD Foundation Discussed with: rwaston [1]	2010-08-22 11:18:57 +00:00
rpaulo	6f62630bc2	Bump KDTRACE_THREAD_ZERO and use M_ZERO as a malloc flag instead of calling bzero. Sponsored by: The FreeBSD Foundation	2010-08-22 11:09:53 +00:00
rpaulo	a34abf7c98	Fix style issues. Sponsored by: The FreeBSD Foundation	2010-08-22 11:08:18 +00:00
davidxu	84d25462c9	make sure thread lock is locked.	2010-08-20 23:51:34 +00:00
jhb	d4890c88b0	Add dedicated routines to toggle lockmgr flags such as LK_NOSHARE and LK_CANRECURSE after a lock is created. Use them to implement macros that otherwise manipulated the flags directly. Assert that the associated lockmgr lock is exclusively locked by the current thread when manipulating these flags to ensure the flag updates are safe. This last change required some minor shuffling in a few filesystems to exclusively lock a brand new vnode slightly earlier. Reviewed by: kib MFC after: 3 days	2010-08-20 19:46:50 +00:00
davidxu	89f466d2b2	If a thread has set TDP_WAKEUP for itself, clear the flag and return EINTR immediately; this is used for implementing reliable pthread cancellation.	2010-08-20 04:28:30 +00:00
jhb	d02cab2556	Remove unused KTRACE includes.	2010-08-19 16:41:27 +00:00
jhb	2f662f7a9c	There isn't really a need to hold the ktrace mutex just to read the value of p_traceflag that is stored in the kinfo_proc structure. It is still racy even with the lock, and the code will read a consistent snapshot of the flag without the lock.	2010-08-19 16:40:30 +00:00
jhb	faa167a723	Fix a whitespace nit and remove a questioning comment. STAILQ_CONCAT() does require the STAILQ the existing list is being added to to already be initialized (it is CONCAT() vs MOVE()).	2010-08-19 16:38:58 +00:00
jhb	d64a4df941	Keep the process locked when calling ktrops() or ktrsetchildren() instead of dropping the lock only to immediately reacquire it.	2010-08-17 21:34:19 +00:00
kib	d9f088a03e	Supply some useful information to the started image using ELF aux vectors. In particular, provide pagesize and pagesizes array, the canary value for SSP use, number of host CPUs and osreldate. Tested by: marius (sparc64) MFC after: 1 month	2010-08-17 08:55:45 +00:00
pjd	120209c66c	Simplify taskqueue_drain() by using proved macros.	2010-08-13 19:20:35 +00:00
gibbs	f5039e4d7d	Allow interrupt driven config hooks to be registered from config hook callbacks. Interrupt driven configuration hooks serve two purposes: they are a mechanism for registering for a callback that is invoked once interrupt services are available, and they hold off root device selection so long as any configuration hooks are still active. Before this change, it was not possible to safely register additional hooks from the context of a configuration hook callback. The need for this feature arises when interrupts are required to discover new devices (e.g. access to the XenStore to find para-virtualized devices) which in turn also require the ability to hold off root device selection until some lengthy, interrupt driven, configuration task has completed (e.g. Xen front/back device driver negotiation). More specifically, the mutex protecting the list of active configuration hooks is never held during a callback, and static information is used to ensure proper ordering and only a single callback to each hook even when faced with registration or removal of a hook during an active run. Sponsored by: Spectra Logic Corporation MFC after: 1 week.	2010-08-12 19:50:40 +00:00
gibbs	6b6ab892f9	Properly indent a continue statement. No functional changes.	2010-08-12 19:26:27 +00:00
jkim	f9341f06d7	Add half of the time-of-day clock resolution when we adjust system time from the time-of-day clock or vice versa. For x86 systems, RTC resolution is one second and we used to lose up to one second whenever we initialized system time from the RTC or wrote system time back to the RTC. With this change, the margin of error per conversion is roughly between -0.5 and +0.5 second rather than between -1 and 0 second. Note that it does not take care of errors from getnanotime(9) (which is up to 1/hz second) or CLOCK_GETTIME() latency. These are just too expensive to correct and not worth the cost.	2010-08-12 17:17:05 +00:00
jkim	1974c6514b	Provide description for 'machdep.disable_rtc_set' sysctl. Clean up style(9) nits. Remove a redundant return statement and an unnecessary variable.	2010-08-12 16:13:24 +00:00
kib	ade28bdd40	The buffer's b_vflags field is not always properly protected by the bufobj lock. If b_bufobj is not NULL, then the bufobj lock should be held when manipulating the flags. Not doing this sometimes leaves BV_BKGRDINPROG erroneously set, causing softdep's getdirtybuf() to get stuck indefinitely in the "getbuf" sleep, waiting for a background write to finish which is not actually being performed. Add BO_LOCK() in the cases where it was missed. In collaboration with: pho Tested by: bz Reviewed by: jeff MFC after: 1 month	2010-08-12 08:36:23 +00:00
mdf	0737955344	Rework memguard(9) to reserve significantly more KVA to detect use-after-free over a longer time. Also release the backing pages of a guarded allocation at free(9) time to reduce the overhead of using memguard(9). Allow setting and varying the malloc type at run-time. Add knobs to allow: - randomly guarding memory - adding un-backed KVA guard pages to detect underflow and overflow - a lower limit on the size of allocations that are guarded Reviewed by: alc Reviewed by: brueffer, Ulrich Spörlein <uqs spoerlein net> (man page) Silence from: -arch Approved by: zml (mentor) MFC after: 1 month	2010-08-11 22:10:37 +00:00
ivoras	9cecaf1c60	Fix (hopefully) the spelling of "queuing." Submitted by: bf1783 at gmail com	2010-08-09 23:32:37 +00:00
ivoras	191f678b27	Bumping the read-ahead count once more, to a value equivalent to 512 KiB on most systems, based on benchmark results on a low-end fibre channel SAN under VMWare. vfs.read_max vs. read performance: 8 (historical default): 83 MB/s; 16 (recent bump): 131 MB/s; 32 (this version): 152 MB/s; 64: 157 MB/s (results are +/- 3 MB/s). As read-ahead is heuristic, based on past IO requests, it shouldn't be problematic. The new default is still smaller than in other OSes.	2010-08-09 22:56:10 +00:00
ivoras	fa067e3c30	Elaborate on how hirunningspace was chosen.	2010-08-09 22:22:46 +00:00
gavin	dbc7cd5ae9	Add descriptions to a handful of sysctl nodes. PR: kern/148580 Submitted by: Galimov Albert <wtfcrap mail.ru> MFC after: 1 week	2010-08-09 14:48:31 +00:00
attilio	307b2c04a2	r208165 fixed a bug related to unsigned integer overflow in the detection of the number of CPUs. However, that was not mentioned at all, the problem was not reported, the patch has not been MFCed and the fix is mostly improper. Fix the original overflow (caused when 32 CPUs must be detected) by just using a different mathematical computation (it also makes the size of the operands involved more explicit, which is good for now, while waiting for more complete support for a large number of CPUs). PR: kern/148698 Submitted by: Joe Landers <jlanders at vmware dot com> Tested by: gianni MFC after: 10 days	2010-08-09 00:23:57 +00:00
jamie	4e0690ba81	Back out r210974. Any convenience of not typing "persist" is outweighed by the possibility of unintended partially-formed jails.	2010-08-08 23:22:55 +00:00
ivoras	252207bbfb	To help with sequential read UFS performance on modern systems, increase the vfs.read_max default. For most systems this means going from 128 KiB to 256 KiB, which is still very conservative and lower than what most other operating systems use, but as a sane default should not interfere much with existing systems. For systems with RAID volumes and/or virtualization environments, where read performance is very important, increasing this sysctl tunable to 32 or even more will demonstrably yield additional performance benefits. If MAXPHYS ever gets bumped up, it will probably be a good idea to slave read_max to it.	2010-08-07 18:30:10 +00:00
tuexen	542f657a7f	Fix a bug where MSG_TRUNC was not returned in all necessary cases for SOCK_DGRAM socket. MSG_TRUNC was only returned when some mbufs could not be copied to the application. If some data was left in the last mbuf, it was correctly discarded, but MSG_TRUNC was not set. Reviewed by: bz MFC after: 3 weeks	2010-08-07 17:57:58 +00:00
jamie	37e8c8fb79	Implicitly make a new jail persistent if it's set not to attach. MFC after: 3 days	2010-08-06 22:04:18 +00:00
jhb	19ddbf5c38	Add a new ipi_cpu() function to the MI IPI API that can be used to send an IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive. Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month	2010-08-06 15:36:59 +00:00
csjp	1e529a8eb9	Add Xen to the list of virtual vendors. In the non PV (HVM) case this fixes the virtualization detection successfully disabling the clflush instruction. This fixes insta-panics for XEN hvm users when the hw.clflush_disable tunable is -1 or 0 (-1 by default). Discussed with: jhb	2010-08-06 15:04:40 +00:00
kib	7c864123d4	Add "show cdev" ddb command. In collaboration with: pho MFC after: 1 month	2010-08-06 09:44:01 +00:00
kib	ba7ee96f4a	Add new make_dev_p(9) flag MAKEDEV_ETERNAL to inform devfs that the created cdev will never be destroyed. Propagate the flag to devfs vnodes as VV_ETERNALDEV. Use the flags to avoid acquiring devmtx and taking a thread reference on such nodes. In collaboration with: pho MFC after: 1 month	2010-08-06 09:42:15 +00:00
alc	b6ec5a5f0a	In order for MAXVNODES_MAX to be an "int" on powerpc and sparc, we must cast PAGE_SIZE to an "int". (Powerpc and sparc, unlike the other architectures, define PAGE_SIZE as a "long".) Submitted by: Andreas Tobler	2010-08-04 05:09:02 +00:00
alc	329f9f0435	Update the "desiredvnodes" calculation. In particular, make the part of the calculation that is based on the kernel's heap size more conservative. Hopefully, this will eliminate the need for MAXVNODES_MAX, but for the time being set MAXVNODES_MAX to a large value. Reviewed by: jhb@ MFC after: 6 weeks	2010-08-02 21:33:36 +00:00
rpaulo	1c3476a3fa	Bump the witness pendlist to 768 to accommodate the increased number of spinlocks.	2010-07-29 16:13:26 +00:00
mdf	6857471cf3	Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple uma zones for each malloc bucket size. The purpose is to isolate different malloc types into hash classes, so that any buffer overruns or use-after-free will usually only affect memory from malloc types in that hash class. This is purely a debugging tool; by varying the hash function and tracking which hash class was corrupted, the intersection of the hash classes from each instance will point to a single malloc type that is being misused. At this point inspection or memguard(9) can be used to catch the offending code. Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files. The suggestion to have this on by default came from Kostik Belousov on -arch. This code is based on work by Ron Steinke at Isilon Systems. Reviewed by: -arch (mostly silence) Reviewed by: zml Approved by: zml (mentor)	2010-07-28 15:36:12 +00:00
alc	55426fcc55	The interpreter name should no longer be treated as a buffer that can be overwritten. (This change should have been included in r210545.) Submitted by: kib	2010-07-28 04:47:40 +00:00
alc	256c63de28	Introduce exec_alloc_args(). The objective being to encapsulate the details of the string buffer allocation in one place. Eliminate the portion of the string buffer that was dedicated to storing the interpreter name. The pointer to the interpreter name can simply be made to point to the appropriate argument string. Reviewed by: kib	2010-07-27 17:31:03 +00:00
alc	02c0473d35	Change the order in which the file name, arguments, environment, and shell command are stored in exec*()'s demand-paged string buffer. For a "buildworld" on an 8GB amd64 multiprocessor, the new order reduces the number of global TLB shootdowns by 31%. It also eliminates about 330k page faults on the kernel address space. Change exec_shell_imgact() to use "args->begin_argv" consistently as the start of the argument and environment strings. Previously, it would sometimes use "args->buf", which is the start of the overall buffer, but no longer the start of the argument and environment strings. While I'm here, eliminate unnecessary passing of "&length" to copystr(), where we don't actually care about the length of the copied string. Clean up the initialization of the exec map. In particular, use the correct size for an entry, and express that size in the same way that is used when an entry is allocated. The old size was one page too large. (This discrepancy originated in 2004 when I rewrote exec_map_first_page() to use sf_buf_alloc() instead of the exec map for mapping the first page of the executable.) Reviewed by: kib	2010-07-25 17:43:38 +00:00
alc	0c709bf109	Eliminate a little bit of duplicated code.	2010-07-23 18:58:27 +00:00
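
The poll fix in 9d21e17f07 can be illustrated with a small model. This is only a sketch in Python, not the kernel code: the `PollFd` class and both counting functions are hypothetical names invented here, and the only thing taken from the commit message is the idea that a socket polled for POLLIN|POLLOUT is visited once per registered selfd, so counting on each visit overstates the result, while recomputing the count from the final array does not.

```python
from dataclasses import dataclass

# Event bits as commonly defined in <poll.h>.
POLLIN = 0x0001
POLLOUT = 0x0004


@dataclass
class PollFd:
    """Toy stand-in for struct pollfd (hypothetical model, not kernel code)."""
    fd: int
    events: int
    revents: int = 0


def count_during_rescan(visits):
    """Count a descriptor on every visit that leaves revents nonzero.

    `visits` is a list of (pollfd, ready_bits) pairs, one per registered
    selfd, so a socket polled for POLLIN|POLLOUT appears twice.
    """
    ready = 0
    for pfd, bits in visits:
        pfd.revents |= bits & pfd.events
        if pfd.revents != 0:
            ready += 1
    return ready


def count_at_copyout(pollfds):
    """Recompute the result from the final array: one per ready descriptor."""
    return sum(1 for pfd in pollfds if pfd.revents != 0)


sock = PollFd(fd=3, events=POLLIN | POLLOUT)
visits = [(sock, POLLIN), (sock, POLLOUT)]
print(count_during_rescan(visits))  # descriptor counted once per selfd
print(count_at_copyout([sock]))     # descriptor counted once
```

Counting at copy-out time keeps the result independent of how many selfd structures a file type happens to register.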

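The error bounds quoted in f9341f06d7 follow from simple arithmetic. The sketch below assumes, as a simplification, that the RTC truncates the true time to whole seconds; the function names are hypothetical and nothing here is taken from the actual clock code.

```python
import math

RTC_RESOLUTION = 1.0  # seconds; the x86 RTC resolution named in the commit


def read_rtc(true_time):
    """Model an RTC that truncates the true time to its resolution."""
    return math.floor(true_time / RTC_RESOLUTION) * RTC_RESOLUTION


def system_time_old(true_time):
    """Old behaviour: take the truncated RTC value as-is."""
    return read_rtc(true_time)


def system_time_new(true_time):
    """New behaviour: add half of the clock resolution."""
    return read_rtc(true_time) + RTC_RESOLUTION / 2


def error_range(convert, samples):
    """Return (min, max) of converted time minus true time."""
    errors = [convert(t) - t for t in samples]
    return min(errors), max(errors)


# Sample one second of true time in 10 ms steps.
samples = [100 + i / 100 for i in range(100)]
print(error_range(system_time_old, samples))  # errors fall within (-1, 0]
print(error_range(system_time_new, samples))  # errors fall within (-0.5, +0.5]
```

The width of the error interval is unchanged at one second; the half-resolution offset only centres it on zero.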

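The vfs.read_max figures from 252207bbfb and 191f678b27 can be put side by side. In this sketch the 16 KiB block size is an inference from the two messages (16 corresponds to 256 KiB, 32 to 512 KiB), not a value stated in either; the throughput numbers are the ones quoted in 191f678b27, and the helper names are hypothetical.

```python
# Inferred from the commit messages: read_max 16 -> 256 KiB, 32 -> 512 KiB.
BLOCK_KIB = 16

# Read throughput in MB/s as quoted in 191f678b27 (+/- 3 MB/s).
BENCHMARK_MBPS = {8: 83, 16: 131, 32: 152, 64: 157}


def read_ahead_kib(read_max):
    """Read-ahead window in KiB for a given vfs.read_max value."""
    return read_max * BLOCK_KIB


def gain_percent(prev, new):
    """Relative throughput gain when moving from one setting to another."""
    before = BENCHMARK_MBPS[prev]
    after = BENCHMARK_MBPS[new]
    return (after - before) / before * 100


for prev, new in [(8, 16), (16, 32), (32, 64)]:
    print(f"{prev} -> {new}: {read_ahead_kib(new)} KiB window, "
          f"{gain_percent(prev, new):.1f}% faster")
```

The gain shrinks with each doubling, and the last step is close to the stated +/- 3 MB/s measurement noise, which is consistent with the commit stopping at 32.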
11811 Commits