freebsd-skq

Author	SHA1	Message	Date
David Xu	7de1ecef2d	Add two commands to the _umtx_op system call to allow a simple mutex to be locked and unlocked completely in userland. Locking and unlocking the mutex in userland reduces the total time a thread holds it. In some application code, a mutex protects only a small piece of code whose execution time is less than that of a simple system call. If lock contention happens in the current implementation, however, the lock holder has to extend its locking time and enter the kernel to unlock it. This change avoids that disadvantage: it first sets the mutex to the free state and then enters the kernel to wake one waiter up. This improves performance dramatically in some sysbench mutex tests. Tested by: kris Sounds great: jeff	2008-06-24 07:32:12 +00:00
John Baldwin	c4f3a35a54	Remove the posixsem_check_destroy() MAC check. It is semantically identical to doing a MAC check for close(), but no other types of close() (including close(2) and ksem_close(2)) have MAC checks. Discussed with: rwatson	2008-06-23 21:37:53 +00:00
Robert Watson	3319d71265	If S_IFIFO is passed to mknod(2), invoke kern_mkfifoat(9) to create a FIFO, as required by SUSv3. No specific privilege check is performed in this case, as FIFOs may be created by unprivileged processes (subject to the normal file system name space restrictions that may be in place). Unlike the Apple implementation, we reject requests to create a FIFO using mknod(2) if there is a non-zero dev argument to the system call, which is permitted by the Open Group specification ("... undefined ..."). We might want to revise this if we find it causes compatibility problems for applications in practice. PR: kern/74242, kern/68459 Obtained from: Apple, Inc. MFC after: 3 weeks	2008-06-22 21:51:32 +00:00
Oleksandr Tymoshenko	22035f4727	Use minimum of max_aio_procs and target_aio_procs when spawning new aiod since there should be no more than max_aio_procs processes.	2008-06-21 11:34:34 +00:00
Warner Losh	c14909b6e2	Split out the probing magic of device_probe_and_attach into device_probe() so that it can be used by busses that may wish to do additional processing between probe and attach. Reviewed by: dfr@	2008-06-20 16:58:15 +00:00
Alan Cox	ac68d1c960	Enforce the mapping of kernel loadable modules in the uppermost 2GB of the kernel virtual address space on amd64.	2008-06-20 06:24:34 +00:00
Xin LI	2110d913c0	Revert rev. 178124 as requested by kris@. Having jail id not being reused too frequently is useful for script controlled environment.	2008-06-19 21:41:57 +00:00
Oleksandr Tymoshenko	23c8064e66	Renew semaphore's pointer after wakeup since during msleep sem_base may have been modified by destroying one of semaphores and semptr would not be valid in this case. PR: kern/123731	2008-06-19 18:08:42 +00:00
Konstantin Belousov	05427aafc6	Struct cdev is always the member of the struct cdev_priv. When devfs needed to promote cdev to cdev_priv, the si_priv pointer was followed. Use member2struct() to calculate address of the wrapping cdev_priv. Rename si_priv to __si_reserved. Tested by: pho Reviewed by: ed MFC after: 2 weeks	2008-06-16 17:34:59 +00:00
John Birrell	5d846378f7	Remove code that isn't required. It actually breaks the case where KDTRACE_HOOKS is defined and KDB isn't. This is the case that it was intended for.	2008-06-16 04:44:29 +00:00
Ed Schouten	0f03ce1bb8	Turn dev2unit(), minor(), unit2minor() and minor2unit() into macros. Now that we got rid of the minor-to-unit conversion and the constraints on device minor numbers, we can convert the functions that operate on minor and unit numbers to simple macros. The unit2minor() and minor2unit() macros are now no-ops. The ZFS code also defined a macro named `minor'. Change the ZFS code to use umajor() and uminor() here, as it is the correct approach to do this. Also add $FreeBSD$ to keep SVN happy. Approved by: philip (mentor), pjd	2008-06-12 08:30:54 +00:00
Ed Schouten	29d4cb241b	Don't enforce unique device minor number policy anymore. Except for the case where we use the cloner library (clone_create() and friends), there is no reason to enforce a unique device minor number policy. There are various drivers in the source tree that allocate unr pools and such to provide minor numbers, without using them themselves. Because we still need to support unique device minor numbers for the cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's that are used in combination with the cloner library should be marked with this flag to make the cloning work. This means drivers can now freely use si_drv0 to store their own flags and state, making it effectively the same as si_drv1 and si_drv2. We still keep the minor() and dev2unit() routines around to make drivers happy. The NTFS code also used the minor number in its hash table. We should not do this anymore. If the si_drv0 field would be changed, it would no longer end up in the same list. Approved by: philip (mentor)	2008-06-11 18:55:19 +00:00
Oleksandr Tymoshenko	c9688a603b	Keep proper track of nsegs counter: sem_free is called for all allocated semaphores, so it's wrong to increase it conditionally, in this case for every over-the-limit semaphore nsegs is decreased without being previously increased. PR: kern/123685 Approved by: cognet (mentor)	2008-06-10 20:55:10 +00:00
Konstantin Belousov	a70537835f	Provide the mutual exclusion between the nfs export list modifications and nfs requests processing. Lockmgr lock provides the shared locking for nfs requests, while exclusive mode is used for modifications. The writer starvation is handled by lockmgr too. Reported by: kris, pho, many Based on the submission by: mohan Tested by: pho MFC after: 2 weeks	2008-06-09 10:31:38 +00:00
Wojciech A. Koszek	2e75877f12	Remove checks against DDB, which isn't used in this file. My intention is to bring no functional change. Discussion on: IRC Reviewed by: ed, kan, rink,	2008-06-08 20:43:27 +00:00
Ed Schouten	5db88944ac	Remove unneeded Giant locking of /dev/tty. The Giant lock is acquired in two places in tty_tty.c. In both places, it is unneeded. There is no reason to specify D_NEEDGIANT on this device node. The device node has only been designed to return ENXIO when opened. It doesn't make any sense to lock/unlock Giant, just to return this error. D_TTY is also unneeded. The unimplemented functions don't need to be patched by devfs. We don't need to lock Giant when we want to lookup the proper TTY vnode. s_ttyvp is already protected by proctree_lock (see devfs_vnops.c). Approved by: philip (mentor)	2008-06-03 12:38:00 +00:00
David Xu	6e24e61797	Use a separate hash table for mutex and rwlock, to avoid wasting time walking through idle threads sleeping on condition variables.	2008-05-30 02:18:54 +00:00
Ed Schouten	06d425f92e	Remove the distinction between device minor and unit numbers. Even though we got rid of device major numbers some time ago, device drivers still need to provide unique device minor numbers to make_dev(). These numbers are only used inside the kernel. They are not related to device major and minor numbers which are visible in devfs. These are actually based on the inode number of the device. It would eventually be nice to remove minor numbers entirely, but we don't want to be too aggressive here. Because the 8-15 bits of the device number field (si_drv0) are still reserved for the major number, there is no 1:1 mapping of the device minor and unit numbers. Because this is now unused, remove the restrictions on these numbers. The MAXMAJOR definition was actually used for two purposes. It was used to convert both the userspace and kernelspace device numbers to their major/minor pair, which is why it is now named UMINORMASK. minor2unit() and unit2minor() have now become useless. Both minor() and dev2unit() now serve the same purpose. We should eventually remove some of them, at least turning them into macros. If devfs would become completely minor number unaware, we could consider using si_drv0 directly, just like si_drv1 and si_drv2. Approved by: philip (mentor)	2008-05-29 12:50:46 +00:00
Ed Schouten	cc8945d204	Remove redundant checks from fcntl()'s F_DUPFD. Right now we perform some of the checks inside the fcntl()'s F_DUPFD operation twice. We first validate the `fd' argument. When finished, we validate the `arg' argument. These checks are also performed inside do_dup(). The reason we need to do this, is because fcntl() should return different errno's when the `arg' argument is out of bounds (EINVAL instead of EBADF). To prevent the redundant locking of the PROC_LOCK and FILEDESC_SLOCK, patch do_dup() to support the error semantics required by fcntl(). Approved by: philip (mentor)	2008-05-28 20:25:19 +00:00
Ed Schouten	09a80aba8e	Rename `tty_subr.c' to `subr_clist.c'. Because clists are also used outside the TTY layer, rename the file containing the clist routines to something more accurate. The mpsafetty TTY layer doesn't use clists. It uses its own buffers, which also implement the unbuffered copying to userspace. We cannot simply remove the clist routines then, because this would break various drivers that are present within the kernel. Approved by: philip (mentor)	2008-05-27 06:41:50 +00:00
Attilio Rao	48972152ee	Improve a comment which, in the actual CVS stock, doesn't completely explain the logic of the code chunk.	2008-05-27 00:27:50 +00:00
Konstantin Belousov	887aedc64e	Take into account possible overflow when multiplying. The casualty is the malloc call later, panicking the kernel due to the oversized allocation. Reported by: pho Reviewed by: jeff	2008-05-26 10:01:13 +00:00
Robert Watson	e4372ceba0	Remove netatm from HEAD as it is not MPSAFE and relies on the now removed NET_NEEDS_GIANT. netatm has been disconnected from the build for ten months in HEAD/RELENG_7. Specifics: - netatm include files - netatm command line management tools - libatm - ATM parts in rescue and sysinstall - sample configuration files and documents - kernel support as a module or in NOTES - netgraph wrapper nodes for netatm - ctags data for netatm. - netatm-specific device drivers. MFC after: 3 weeks Reviewed by: bz Discussed with: bms, bz, harti	2008-05-25 22:11:40 +00:00
Attilio Rao	5047a8fd88	The "if" semantic is not needed, just fix this.	2008-05-25 16:11:27 +00:00
Attilio Rao	258f4727f1	Replace direct atomic operations on the file refcount with the refcount interface. It also introduces the correct usage of memory barriers, as sometimes fdrop() and fhold() are used with shared locks, which don't use any release barrier.	2008-05-25 14:57:43 +00:00
John Birrell	6f5f25e521	Add the vtime (virtual time) hooks for DTrace.	2008-05-25 01:44:58 +00:00
John Birrell	5d217f173c	Add DTrace 'proc' provider probes using the Statically Defined Trace (sdt) mechanism.	2008-05-24 06:22:16 +00:00
Craig Rodrigues	a9722ace80	Do not convert the "snapshot" string to the MNT_SNAPSHOT flag here, since we do it further down in ffs_vfsops.c MFC after: 1 month	2008-05-23 23:33:07 +00:00
Konstantin Belousov	15822fcdbe	Rev. 1.274 put the ttyrel() call before the destroy_dev() in the ttyfree(), freeing the tty. Since destroy_dev() may call d_purge() cdevsw method, that is the ttypurge() for the tty, the code ends up accessing freed tty structure. Put the ttyrel() after destroy_dev() in the ttyfree. To prevent the panic the rev. 1.274 provided fix for, check the TS_GONE in sysctl handler and refuse to provide information on such tty. Reported, debugging help and tested by: pho Discussed with and reviewed by: jhb MFC after: 1 week	2008-05-23 16:47:55 +00:00
Konstantin Belousov	cc57af357b	The dev_refthread() in the tty_gettp() may fail, because Giant is taken in the giant_trick routines after the dev_refthread increments the si_threadcount. Remove assert, do not perform dev_relthread() for failed dev_refthread(), and handle failure in the tty_gettp() callers (cdevsw tty methods). Before kern_conf.c 1.210 and 1.211, the kernel usually panicked in the giant_trick routines dereferencing NULL cdevsw, not taking this fault. Reported by: Vince Hoffman <jhary unsane co uk> Debugging help and tested by: pho Reviewed by: jhb MFC after: 1 week	2008-05-23 16:46:27 +00:00
Konstantin Belousov	ca091c56e3	Use the t_state for the TS_GONE test. Submitted by: jhb MFC after: 3 days	2008-05-23 16:43:59 +00:00
Konstantin Belousov	06fe11294d	Assert that si_threadcount > 0 before decrementing it. This helps catching the improper use of the dev_refthread/dev_relthread. Tested by: pho MFC after: 1 week	2008-05-23 16:38:38 +00:00
Ed Schouten	8837b0dd09	Move TTY unrelated bits out of <sys/tty.h>. For some reason, the <sys/tty.h> header file also contains routines of the clists and console that are used inside the TTY layer. Because the clists are not only used by the TTY layer (example: various input drivers), we'd better move the entire clist programming interface into <sys/clist.h>. Also remove a declaration of nonexistent variable. The <sys/tty.h> header also contains various definitions for the console code (tty_cons.c). Also move these to <sys/cons.h>, because they are not implemented inside the TTY layer. While there, create separate malloc pools for the clist and console code. Approved by: philip (mentor)	2008-05-23 16:06:35 +00:00
Konstantin Belousov	741b6cf8a5	Another problem caused by the knlist_cleardel() potentially dropping PIPE_MTX(). Since the pipe_present is cleared before (potentially) sleeping, the second thread may enter the pipeclose() for the reciprocal pipe end. The test at the end of the pipeclose() for the pipe_present == 0 would succeed, allowing the second thread to free the pipe memory. The first thread then accesses the freed memory after being woken up. Properly track the closing state of the pipe in the pipe_present. Introduce the intermediate state that marks the pipe as mostly dismantled but might be sleeping waiting for the knote list to be cleared. Free the pipe pair memory only when both ends pass that point. Debugging help and tested by: pho Discussed with: jmg MFC after: 2 weeks	2008-05-23 11:14:03 +00:00
Konstantin Belousov	e2e1693f15	Destruction of the pipe calls knlist_cleardel() to remove the knotes monitoring the pipe. The code sets pipe_present = 0 and enters knlist_cleardel(), where the PIPE_MTX might be dropped when knl->kl_list cannot be cleared due to influx knotes. If the following often encountered code fragment if (!(kn->kn_status & KN_DETACHED)) kn->kn_fop->f_detach(kn); knote_drop(kn, td); [1] is executed while the knlist lock is dropped, then the knote memory is freed by the knote_drop() without knote being removed from the knlist, since the filt_pipedetach() contains the following: if (kn->kn_filter == EVFILT_WRITE) { if (!cpipe->pipe_peer->pipe_present) { PIPE_UNLOCK(cpipe); return; Now, the memory may be reused in the zone, causing the access to the freed memory. I got the panics caused by the marker knote appearing on the knlist, which, I believe, is a manifestation of the issue. In the Peter Holm test scenarios, we got unkillable processes too. The pipe_peer that has the knote for write shall be present. Ignore the pipe_present value for EVFILT_WRITE in filt_pipedetach(). Debugging help and tested by: pho Discussed with: jmg MFC after: 2 weeks	2008-05-23 11:09:50 +00:00
John Birrell	4b3d60930a	Add the ctf_get function and update the args to linker_file_function_listall.	2008-05-23 07:08:59 +00:00
John Birrell	82c4945b5b	Add the ctf_get method.	2008-05-23 04:06:49 +00:00
John Birrell	833b4a131a	Allow a rendezvous with just a specified CPU too. Make the API work in the non-smp case too so that a kernel module can work the same regardless of whether or not it is loaded on a SMP kernel or not.	2008-05-23 04:05:26 +00:00
John Birrell	75d94ef6ca	Add the CTF source file which gets shared with link_elf.c and link_elf_obj.c.	2008-05-23 03:04:27 +00:00
John Birrell	a2024a3edf	Add hooks for the Compact C Type Format (CTF) data to be attached to the elf files. This is complicated by the fact that the actual CTF parsing has to be done in CDDL'd code, so the BSD licensed code only knows about the opaque data which it must be able to free.	2008-05-23 00:49:39 +00:00
John Birrell	91dd776cd2	Add support for the DTrace malloc provider which can enable probes on a per-malloc type basis.	2008-05-23 00:43:36 +00:00
Robert Watson	17c2fc0cc7	When sendto(2) is called with an explicit destination address argument, call mac_socket_check_connect() on that address before proceeding with the send. Otherwise policies instrumenting the connect entry point for the purposes of checking destination addresses will not have the opportunity to check implicit connect requests. MFC after: 3 weeks Sponsored by: nCircle Network Security, Inc.	2008-05-22 07:18:54 +00:00
Konstantin Belousov	82f4d64035	Implement the per-open file data for the cdev. The patch does not change the cdevsw KBI. Management of the data is provided by the functions int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr); int devfs_get_cdevpriv(void **datap); void devfs_clear_cdevpriv(void); All of the functions are supposed to be called from the cdevsw method contexts. - devfs_set_cdevpriv assigns the priv as private data for the file descriptor which is used to initiate currently performed driver operation. dtr is the function that will be called when either the last reference to the file goes away, the device is destroyed or devfs_clear_cdevpriv is called. - devfs_get_cdevpriv is the obvious accessor. - devfs_clear_cdevpriv allows clearing the private data for the still open file. Implementation keeps the driver-supplied pointers in the struct cdev_privdata, that is referenced both from the struct file and struct cdev, and cannot outlive any of the referee. Man pages will be provided after the KPI stabilizes. Reviewed by: jhb Useful suggestions from: jeff, antoine Debugging help and tested by: pho MFC after: 1 month	2008-05-21 09:31:44 +00:00
Pawel Jakub Dawidek	988f0e193a	Be more friendly for DDB pager. Educated by: jhb's BSDCan presentation	2008-05-18 21:08:12 +00:00
John Birrell	80544aebe3	Add support for the DTrace struct proc and struct thread extended data via ctor and dtor event handlers. The size of the extra data is allocated opaquely and this file contains a function which the dtrace module can call to check that the kernel supports at least the amount of data that it needs. This file is optionally compiled into the kernel if the KDTRACE_HOOKS kernel option is defined.	2008-05-18 19:43:52 +00:00
John Birrell	5572901b33	Add kernel support for the Statically Defined Trace provider. This is BSD licensed code written specifically for FreeBSD. It initialises using SYSINIT so that the SDT provider, probe and argument description linkage is done whenever a module is loaded, regardless of whether the DTrace modules are loaded or not. This file is optionally compiled into the kernel if the KDTRACE_HOOKS option is defined.	2008-05-18 19:32:36 +00:00
Rui Paulo	221351b7a5	devctl_process_running(): Check for devsoftc.inuse == 1 instead of devsoftc.async_proc != NULL because the latter might not be true sometimes. This way /etc/rc.suspend gets executed. Reviewed by: njl Submitted by: Mitsuru IWASAKI <iwasaki at jp.FreeBSD.org> Tested also by: Andreas Wetzel <mickey242 at gmx.net> MFC after: 1 week	2008-05-18 13:55:51 +00:00
Robert Watson	8e230e30b7	Attempt to improve convergence of POSIX semaphore code with style(9). MFC after: 3 days	2008-05-16 18:10:07 +00:00
George V. Neville-Neil	49f287f8c5	Update the kernel to count the number of mbufs and clusters (all types) used per socket buffer. Add support to netstat to print out all of the socket buffer statistics. Update the netstat manual page to describe the new -x flag which gives the extended output. Reviewed by: rwatson, julian	2008-05-15 20:18:44 +00:00
Attilio Rao	90356491d7	- Embed the recursion counter for any locking primitive directly in the lock_object, using a unified field called lo_data. - Replace lo_type usage with the w_name usage and at init time pass the lock "type" directly to witness_init() from the parent lock init function. Handle delayed initialization before witness_initialize() is called through the witness_pendhelp structure. - Axe out LO_ENROLLPEND as it is not really needed. The case where a mutex with delayed init wants to be destroyed can't happen because witness_destroy() checks for witness_cold and panics in that case. - In enroll(), if we cannot allocate a new object from the freelist, notify that to userspace through a printf(). - Modify the depart function in order to return nothing as in the current CVS version it always returns true and adjust callers accordingly. - Fix the witness_addgraph() argument name prototype. - Remove useless code from itismychild(). This commit leads to a smaller struct lock_object and so smaller locks, in particular on amd64 where 2 uintptr_t (16 bytes per-primitive) are gained. Reviewed by: jhb	2008-05-15 20:10:06 +00:00
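
Commit 887aedc64e above guards a multiplication against overflow before the result reaches malloc. The log does not show the patched code, so the following is only a minimal userland sketch of that general technique, not the kernel change itself. The names mul_overflows() and alloc_array_checked() are hypothetical and introduced purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: report whether a * b would wrap around size_t.
 * The test divides instead of multiplying, so it cannot overflow itself.
 */
static int
mul_overflows(size_t a, size_t b)
{
	return (b != 0 && a > SIZE_MAX / b);
}

/*
 * Hypothetical helper: allocate nmemb elements of the given size, and
 * refuse (return NULL) when the byte count is not representable rather
 * than passing a wrapped, undersized value on to malloc().
 */
static void *
alloc_array_checked(size_t nmemb, size_t size)
{
	if (mul_overflows(nmemb, size))
		return (NULL);
	return (malloc(nmemb * size));
}

/* Small self-check exercising both the rejecting and the normal path. */
void
overflow_guard_selfcheck(void)
{
	void *p;

	assert(alloc_array_checked(SIZE_MAX, 2) == NULL);
	p = alloc_array_checked(4, 8);
	assert(p != NULL);
	free(p);
}
```

Checking with a division before multiplying keeps the guard itself free of overflow. The cost is one extra divide per allocation, which is usually negligible next to the allocation.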

10518 Commits