freebsd-skq

Author	SHA1	Message	Date
ed	e309996070	Enable secure TTY input buffer flushing by default. I'm leaving the sysctl there. If people really notice a slowdown, they can revert to the old behaviour. Discussed with: kib	2009-05-21 16:48:06 +00:00
ed	77fae8a219	Add a new sysctl: kern.tty_inq_flush_secure. When enabled all TTY input queue buffers are zeroed when flushing or closing the TTY. Because TTY input queues are also used to store filled in passwords, this may be an interesting switch to enable for security minded people.	2009-05-21 16:19:54 +00:00
jhb	35ba8bce41	Only use the ABI compat shim for vfs.bufspace if the old buffer is smaller than a long. PR: amd64/134786 Submitted by: Emil Mikulic emikulic\| gmail MFC after: 3 days	2009-05-21 16:18:45 +00:00
attilio	68353e273f	Move the M_WAITOK flag in notify() into an M_NOWAIT one in order to match the behaviour alredy present with the further malloc() call in devctl_notify(). This fixes a bug in the CAM layer where the camisr handler finished to call camperiphfree() (and subsequently destroy_dev() resulting in a new dev notify) while the xpt lock is held. PR: kern/130330 Tested by: Riccardo Torrini <riccardo dot torrini at esaote dot com>	2009-05-21 13:22:07 +00:00
jhb	ebdd571432	Set the umask in a new file descriptor table earlier in fdcopy() to remove two lock operations.	2009-05-20 18:42:04 +00:00
jhb	53d01dc702	Remove an obsolete assertion. We always wake up all waiters when unlocking a mutex and never set the lock cookie == MTX_CONTESTED.	2009-05-20 18:29:14 +00:00
jhb	033485e00c	Fix a typo.	2009-05-20 17:19:30 +00:00
imp	d339e0404f	We no longer need to use d_thread_t for portability here, switch to struct thread *.	2009-05-20 16:58:16 +00:00
kmacy	879984a728	Add minimal ZFS lock hierarchy	2009-05-20 02:51:48 +00:00
rwatson	9c1bc41813	With SMPng, DEVICE_POLLING uses its own idle threads, rather than the system idle loop, to run ether_poll(), so make ether_poll() static. MFC after: 1 week	2009-05-19 19:21:25 +00:00
avg	89d59b82b3	sysctl_rman: report shared resources to devinfo shared uses of a resource are recorded on a sub-list hanging off a main resource object on a main resource list; without this change a shared resource (e.g. irq) is reported only once by devinfo -r/-u; with this change the resource is reported for each driver that allocates it (which is even more than what vmstat -i -a reports). Approved by: jhb (mentor)	2009-05-19 14:08:21 +00:00
rwatson	d9e163e093	Binding interrupts to a CPU consists of two parts: setting up CPU affinity for the interrupt thread, and requesting that underlying hardware direct interrupts to the CPU. For software interrupt threads, implement a no-op interrupt event binder that returns success, so that the interrupt management code will just set the ithread's affinity and succeed. Reviewed by: jhb MFC after: 1 week	2009-05-18 14:02:55 +00:00
ed	a6c06ba89c	Mark the clock sysctls as MPSAFE. These sysctls don't need any form of locking. At least cp_times is used by powerd very often, which means I get 50% less calls to non-MPSAFE sysctls on my system. The other 50% is consumed by dev.cpu.0.freq, but this seems to need Giant for Newbus.	2009-05-18 12:03:43 +00:00
alc	c8b00d493e	Several changes to vfs_bio_clrbuf(): Provide a more descriptive comment. Eliminate dead code. The page cannot possibly have PG_ZERO set. Eliminate unnecessary blank lines. Reviewed by: tegge	2009-05-17 23:25:53 +00:00
alc	dc942dabcf	Introduce vfs_bio_set_valid() and use it from ffs_realloccg(). This eliminates the misuse of vfs_bio_clrbuf() by ffs_realloccg(). In collaboration with: tegge	2009-05-17 20:26:00 +00:00
ed	44192767f8	Print an extra newline when not at the first column already. This makes siginfo output look a lot better when pressing it the first time when in sh(1), for example: $ load: 0.00 cmd: sh 1945 [ttyin] 3.94r 0.00u 0.00s 0% 1960k load: 0.00 cmd: sh 1945 [ttyin] 4.19r 0.00u 0.00s 0% 1960k will now become: $ load: 0.00 cmd: sh 1945 [ttyin] 3.94r 0.00u 0.00s 0% 1960k load: 0.00 cmd: sh 1945 [ttyin] 4.19r 0.00u 0.00s 0% 1960k	2009-05-17 16:17:48 +00:00
ed	69acbef6ca	Several cleanups to tty_info(), better known as Ctrl-T. - Only pick up PROC_LOCK once, which means we can drop the PGRP_LOCK right after picking up PROC_LOCK for the first time. - Print the process real time, making it consistent with tools like time(1). - Use `p' and `td' to reference the process/thread we are going to print. Only use pick-variables inside the loops. We already did this for the threads, but not the processes.	2009-05-17 12:30:25 +00:00
des	9919b017d4	Remove do-nothing code that was required to dirty the old buffer on Alpha. Coverity ID: 838 Approved by: jhb, alc	2009-05-15 21:34:58 +00:00
kib	b8162aa0c9	Revert r192094. The revision caused problems for sysctl(3) consumers that expect that oldlen is filled with required buffer length even when supplied buffer is too short and returned error is ENOMEM. Redo the fix for kern.proc.filedesc, by reverting the req->oldidx when remaining buffer space is too short for the current kinfo_file structure. Also, only ignore ENOMEM. We have to convert ENOMEM to no error condition to keep existing interface for the sysctl, though. Reported by: ed, Florian Smeets <flo kasimir com> Tested by: pho	2009-05-15 14:41:44 +00:00
jhb	cbf4ebe5a3	- Use a separate sx lock to try to limit the number of concurrent userland sysctl requests to avoid wiring too much user memory. Only grab this lock if the user's old buffer is larger than a page as a tradeoff to allow more concurrency for common small requests. - Just use a shared lock on the sysctl tree for user sysctl requests now. MFC after: 1 week	2009-05-14 22:01:32 +00:00
kib	7361db2279	Do not advance req->oldidx when sysctl_old_user returning an error due to copyout failure or short buffer. The later breaks the usermode iterators of the sysctl results that pack arbitrary number of variable-sized structures. Iterator expects that kernel filled exactly oldlen bytes, and tries to interpret half-filled or garbage structure at the end of the buffer. In particular, kinfo_getfile(3) segfaulted. Reported and tested by: pho MFC after: 3 weeks	2009-05-14 10:54:57 +00:00
jeff	20397e6431	- Implement a lockless file descriptor lookup algorithm in fget_unlocked(). - Save old file descriptor tables created on expansion until the entire descriptor table is freed so that pointers may be followed without regard for expanders. - Mark the file zone as NOFREE so we may attempt to reference potentially freed files. - Convert several fget_locked() users to fget_unlocked(). This requires us to manage reference counts explicitly but reduces locking overhead in the common case.	2009-05-14 03:24:22 +00:00
alc	82da6bfdea	Eliminate page queues locking from bufdone_finish() through the following changes: Rename vfs_page_set_valid() to vfs_page_set_validclean() to reflect what this function actually does. Suggested by: tegge Introduce a new version of vfs_page_set_valid() that does no more than what the function's name implies. Specifically, it does not update the page's dirty mask, and thus it does not require the page queues lock to be held. Update two of the three callers to the old vfs_page_set_valid() to call vfs_page_set_validclean() instead because they actually require the page's dirty mask to be cleared. Introduce vm_page_set_valid(). Reviewed by: tegge	2009-05-13 05:39:39 +00:00
trasz	e6d7976851	Add missing 'break' statement. Found with: Coverity Prevent(tm) CID: 3919	2009-05-12 17:05:40 +00:00
kib	60c4168558	Prevent overflow of uio_resid. Noted by: jhb MFC after: 3 days	2009-05-11 19:58:03 +00:00
attilio	c639aa3d25	Fix a kernel compilation error, introduced after r191990, by defining thread with curthread in the AUDIT case. Reported by: dchagin	2009-05-11 16:32:58 +00:00
attilio	1dcb84131b	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
alc	123b385c44	Revert CVS revision 1.94 (svn r16840). Current pmap implementations don't suffer from the race condition that motivated revision 1.94. Consequently, the work-around that was implemented by revision 1.94 is no longer needed. Moreover, reverting this work-around eliminates the need for vfs_busy_pages() to acquire the page queues lock when preparing a buffer for read. Reviewed by: tegge	2009-05-11 05:16:57 +00:00
imp	9a12159016	Spell NULL properly, use (void) rather than () for functions with no parameters. Mark two items as static that aren't used elsewhere...	2009-05-09 19:08:22 +00:00
imp	66ca9cb573	Retire kern.vm.kmem.size. It was marked as obsolete prior to 5.2, so it can go.	2009-05-09 19:00:47 +00:00
kan	7b57a857b7	Do not embed struct ucred into larger netcred parent structures. Credential might need to hang around longer than its parent and be used outside of mnt_explock scope controlling netcred lifetime. Use separate reference-counted ucred allocated separately instead. While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up some unused declarations in new NFS code. Reported by: John Hickey PR: kern/133439 Reviewed by: dfr, kib	2009-05-09 18:09:17 +00:00
zec	b31e199a10	A NOP change: style / whitespace cleanup of the noise that slipped into r191816. Spotted by: bz Approved by: julian (mentor) (an earlier version of the diff)	2009-05-08 14:34:25 +00:00
zec	639797b2e6	Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import. As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet. Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch. Permit kldload / kldunload operations to be executed only from the default vimage context. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz Approved by: julian (mentor)	2009-05-08 14:11:06 +00:00
jamie	267ea54b44	Move the per-prison Linux MIB from a private one-off pointer to the new OSD-based jail extensions. This allows the Linux MIB to accessed via jail_set and jail_get, and serves as a demonstration of adding jail support to a module. Reviewed by: dchagin, kib Approved by: bz (mentor)	2009-05-07 18:36:47 +00:00
kib	78e147b4e4	Eliminate the loop and the call to pause(9) in vfs_vget_ino(). If vfs_busy(MBF_NOWAIT) failed, unlock the vnode and sleep in vfs_busy(). Suggested and reviewed by: jeff Tested by: pho MFC after: 3 weeks	2009-05-07 18:14:21 +00:00
ed	2f086e8725	If we have a regular rint handler, never go into rint_bypass mode. It turns out if we called cfmakeraw() on a TTY with only a rint handler in place, it could inject data into the TTY, even though it should be redirected. Always take a look at the hooks before looking at the termios flags.	2009-05-07 17:39:23 +00:00
zec	d78a1b1a82	Change the curvnet variable from a global const struct vnet , previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discuouraged. This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_ macros expand to whitespace. The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another. The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, whith an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typicall cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions. This change also introduces a DDB subcommand to show the list of all vnet instances. Approved by: julian (mentor)	2009-05-05 10:56:12 +00:00
jamie	8e4ffe653f	Add a constant PR_MAXMETHOD to better define the jail/OSD interface. Reviewed by: dchagin, kib Approved by: bz (mentor)	2009-05-05 05:49:08 +00:00
ed	fb2908c8ff	Remove unneeded check for SESS_LEADER(). We perform the same check ~10 lines above.	2009-05-04 11:11:10 +00:00
jamie	d462264a61	Don't call the OSD destructor if the data slot is NULL (since it's already not done on unused slots, which are indistinguishable to the caller). Approved by: bz (mentor)	2009-04-30 22:43:21 +00:00
zec	39b6dc8ba2	Permit buiding kernels with options VIMAGE, restricted to only a single active network stack instance. Turning on options VIMAGE at compile time yields the following changes relative to default kernel build: 1) V_ accessor macros for virtualized variables resolve to structure fields via base pointers, instead of being resolved as fields in global structs or plain global variables. As an example, V_ifnet becomes: options VIMAGE: ((struct vnet_net ) vnet_net)->_ifnet default build: vnet_net_0._ifnet options VIMAGE_GLOBALS: ifnet 2) INIT_VNET_ macros will declare and set up base pointers to be used by V_ accessor macros, instead of resolving to whitespace: INIT_VNET_NET(ifp->if_vnet); becomes struct vnet_net vnet_net = (ifp->if_vnet)->mod_data[VNET_MOD_NET]; 3) Memory for vnet modules registered via vnet_mod_register() is now allocated at run time in sys/kern/kern_vimage.c, instead of per vnet module structs being declared as globals. If required, vnet modules can now request the framework to provide them with allocated bzeroed memory by filling in the vmi_size field in their vmi_modinfo structures. 4) structs socket, ifnet, inpcbinfo, tcpcb and syncache_head are extended to hold a pointer to the parent vnet. options VIMAGE builds will fill in those fields as required. 5) curvnet is introduced as a new global variable in options VIMAGE builds, always pointing to the default and only struct vnet. 6) struct sysctl_oid has been extended with additional two fields to store major and minor virtualization module identifiers, oid_v_subs and oid_v_mod. SYSCTL_V_ family of macros will fill in those fields accordingly, and store the offset in the appropriate vnet container struct in oid_arg1. In sysctl handlers dealing with virtualized sysctls, the SYSCTL_RESOLVE_V_ARG1() macro will compute the address of the target variable and make it available in arg1 variable for further processing. Unused fields in structs vnet_inet, vnet_inet6 and vnet_ipfw have been deleted. Reviewed by: bz, rwatson Approved by: julian (mentor)	2009-04-30 13:36:26 +00:00
jeff	9ff631ca46	- Fix non-SMP build by encapsulating idle spin logic in a macro. Pointy hat to: me	2009-04-29 23:04:31 +00:00
jamie	8fbb51e637	Regen for new jail system calls in r191673. Approved by: bz (mentor)	2009-04-29 21:50:13 +00:00
jamie	453b86f943	Introduce the extensible jail framework, using the same "name=value" interface as nmount(2). Three new system calls are added: * jail_set, to create jails and change the parameters of existing jails. This replaces jail(2). * jail_get, to read the parameters of existing jails. This replaces the security.jail.list sysctl. * jail_remove to kill off a jail's processes and remove the jail. Most jail parameters may now be changed after creation, and jails may be set to exist without any attached processes. The current jail(2) system call still exists, though it is now a stub to jail_set(2). Approved by: bz (mentor)	2009-04-29 21:14:15 +00:00
bms	32a71137f0	Bite the bullet, and make the IPv6 SSM and MLDv2 mega-commit: import from p4 bms_netdev. Summary of changes: * Connect netinet6/in6_mcast.c to build. The legacy KAME KPIs are mostly preserved. * Eliminate now dead code from ip6_output.c. Don't do mbuf bingo, we are not going to do RFC 2292 style CMSG tricks for multicast options as they are not required by any current IPv6 normative reference. * Refactor transports (UDP, raw_ip6) to do own mcast filtering. SCTP, TCP unaffected by this change. * Add ip6_msource, in6_msource structs to in6_var.h. * Hookup mld_ifinfo state to in6_ifextra, allocate from domifattach path. * Eliminate IN6_LOOKUP_MULTI(), it is no longer referenced. Kernel consumers which need this should use in6m_lookup(). * Refactor IPv6 socket group memberships to use a vector (like IPv4). * Update ifmcstat(8) for IPv6 SSM. * Add witness lock order for IN6_MULTI_LOCK. * Move IN6_MULTI_LOCK out of lower ip6_output()/ip6_input() paths. * Introduce IP6STAT_ADD/SUB/INC/DEC as per rwatson's IPv4 cleanup. * Update carp(4) for new IPv6 SSM KPIs. * Virtualize ip6_mrouter socket. Changes mostly localized to IPv6 MROUTING. * Don't do a local group lookup in MROUTING. * Kill unused KAME prototypes in6_purgemkludge(), in6_restoremkludge(). * Preserve KAME DAD timer jitter behaviour in MLDv1 compatibility mode. * Bump __FreeBSD_version to 800084. * Update UPDATING. NOTE WELL: * This code hasn't been tested against real MLDv2 queriers (yet), although the on-wire protocol has been verified in Wireshark. * There are a few unresolved issues in the socket layer APIs to do with scope ID propagation. * There is a LOR present in ip6_output()'s use of in6_setscope() which needs to be resolved. See comments in mld6.c. This is believed to be benign and can't be avoided for the moment without re-introducing an indirect netisr. This work was mostly derived from the IGMPv3 implementation, and has been sponsored by a third party.	2009-04-29 19:19:13 +00:00
jamie	51a4d1c4a3	Some non-functional changes: whitespace, KASSERT strings, declaration order. Approved by: bz (mentor)	2009-04-29 18:41:08 +00:00
jeff	fe5d856f47	- Fix the FBSDID line.	2009-04-29 03:26:30 +00:00
jeff	88a1cd92bb	- Remove the bogus idle thread state code. This may have a race in it and it only optimized out an ipi or mwait in very few cases. - Skip the adaptive idle code when running on SMT or HTT cores. This just wastes cpu time that could be used on a busy thread on the same core. - Rename CG_FLAG_THREAD to CG_FLAG_SMT to be more descriptive. Re-use CG_FLAG_THREAD to mean SMT or HTT. Sponsored by: Nokia	2009-04-29 03:15:43 +00:00
bz	1507f5bd4d	Prevent a superuser inside a jail from modifying the dedicated root cpuset of that jail. Processes inside the jail will still be able to change child sets. A superuser outside of a jail will still be able to change the jail cpuset and thus limit the number of cpus available to the jail. Problem reported by: 000.fbsd@quip.cz (Miroslav Lachman) PR: kern/134050 Reviewed by: jeff MFC after: 3 weeks X-MFC: backout r191596	2009-04-28 21:00:50 +00:00
rwatson	fb89496678	Improve approximation of style(9).	2009-04-26 21:16:03 +00:00

... 2 3 4 5 6 ...

11262 Commits