freebsd-skq

Author	SHA1	Message	Date
dfr	6929a6d99b	Regen.	2008-11-03 10:39:35 +00:00
dfr	2fb03513fc	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. 
The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month	2008-11-03 10:38:00 +00:00
ivoras	d819bb20f8	Increase the initial sbuf size for CPU topology dump to something more usable for newer CPUs. The new value allows 2 x quad core configuration dumps to fit within the initial buffer without reallocations. Approved by: gnn (mentor) (older version) Pointed out by: rdivacky	2008-11-02 23:11:20 +00:00
attilio	e1f493235e	Improve VFS locking: - Implement real draining for vfs consumers by not relying on the mnt_lock and using instead a refcount in order to keep track of lock requesters. - Due to the change above, remove the mnt_lock lockmgr because it is now useless. - Due to the change above, vfs_busy() is no longer linked to a lockmgr. Change its KPI by removing the interlock argument and defining 2 new flags for it: MBF_NOWAIT which basically replaces the LK_NOWAIT of the old version (which was unlinked from the lockmgr already) and MBF_MNTLSTLOCK which provides the ability to drop the mountlist_mtx once the mnt interlock is held (ability still desired by most consumers). - The stub used in vfs_mount_destroy(), which allows overriding the mnt_ref if running for more than 3 seconds, makes it totally useless. Remove it as it was thought to work in older versions. If a problem of "refcount held never going away" should appear, we will need to fix it properly rather than trust such a hackish solution. - Fix a bug where returning (with an error) from dounmount() was still leaving the MNTK_MWAIT flag on even if the waiters were actually woken up. Just a place in vfs_mount_destroy() is left because it is going to recycle the structure in any case, so it doesn't matter. - Remove the markercnt refcount as it is useless. This patch modifies VFS ABI and breaks KPI for vfs_busy() so manpages and __FreeBSD_version will be modified accordingly. Discussed with: kib Tested by: pho	2008-11-02 10:15:42 +00:00
ed	57b4089c20	Clamp the values of t_column to 5 digits in `pstat -t' and `show all ttys'. We often run into these very high column numbers when we run curses applications, because they don't print any newlines. This messes up the table output of `pstat -t'. If these numbers get really high, they aren't of any use to the reader anyway. Convert them to `99999' when they run out of bounds.	2008-11-01 13:40:46 +00:00
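The clamping described in this commit can be illustrated with a small userland sketch. It is not the code from the commit: `COL_MAX`, `clamp_column()` and `format_column()` are hypothetical names chosen for illustration, and only the 5-digit limit and the `99999` replacement value come from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Largest value that still fits a 5-digit table column (from the commit text). */
#define COL_MAX 99999u

/* Hypothetical helper: out-of-bounds column numbers are reported as 99999. */
static unsigned int
clamp_column(unsigned int col)
{
	return (col > COL_MAX ? COL_MAX : col);
}

/*
 * Hypothetical helper: format the clamped column right-aligned in a
 * 5-character field, so the table layout never shifts.
 */
static int
format_column(char *buf, size_t len, unsigned int col)
{
	assert(len >= 6);	/* 5 digits plus the terminating NUL */
	return (snprintf(buf, len, "%5u", clamp_column(col)));
}
```

Because the clamp is applied only when formatting, the stored column value is unaffected; only the displayed table is kept aligned.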
ed	c2c324d379	Reimplement the /dev/console device node. One of the pieces of code that I had left alone during the development of the MPSAFE TTY layer, was tty_cons.c. This file actually has two different functions: - It contains low-level console input/output routines (cnputc(), etc). - It creates /dev/console and wraps all its cdevsw calls to the appropriate TTY. This commit reimplements the second set of functions by moving it directly into the TTY layer. /dev/console is now a character device node that's basically a regular TTY, but does a lookup of `si_drv1' each time you open it. d_write has also been changed to call log_console(). d_close() is not present, because we must make sure we don't revoke the TTY after writing a log message to it. Even though I'm not convinced this is in line with the future directions of our console code, it is a good move for now. It removes recursive locking from the top half of the TTY layer. The previous implementation called into the TTY layer with Giant held. I'm renaming tty_cons.c to kern_cons.c now. The code hardly contains any TTY related bits, so we'd better give it a less misleading name. Tested by: Andrzej Tobola <ato iem pw edu pl>, Carlos A.M. dos Santos <unixmania gmail com>, Eygene Ryabinkin <rea-fbsd codelabs ru>	2008-11-01 08:35:28 +00:00
peter	1f7fd22cbb	Add three extra fields to the kinfo_proc_vmmap data. kve_offset - the offset within an object that a mapping refers to. fileid and fsid are inode/dev for vnodes. (Linux procfs has these and valgrind is really unhappy without them.) I believe I didn't change the size of the struct.	2008-10-31 05:43:19 +00:00
sobomax	dafc63cd43	Make it possible to compile kernel with KTR but without DDB.	2008-10-30 21:48:28 +00:00
ivoras	483637ae39	Introduce a new sysctl, kern.sched.topology_spec, that returns an XML dump of detected ULE CPU topology. This dump can be used to check the topology detection and for general system information. An example of CPU topology dump is: kern.sched.topology_spec: <groups> <group level="1" cache-level="0"> <cpu count="8" mask="0xff">0, 1, 2, 3, 4, 5, 6, 7</cpu> <flags></flags> <children> <group level="2" cache-level="0"> <cpu count="4" mask="0xf">0, 1, 2, 3</cpu> <flags></flags> </group> <group level="2" cache-level="0"> <cpu count="4" mask="0xf0">4, 5, 6, 7</cpu> <flags></flags> </group> </children> </group> </groups> Reviewed by: jeff Approved by: gnn (mentor)	2008-10-29 13:36:23 +00:00
davidxu	11aa09b488	If threads limit is exceeded, increase the total number of failures.	2008-10-29 12:11:48 +00:00
trasz	4e57a80147	Rename a variable missed in previous accmode_t-related commits. Approved by: rwatson (mentor)	2008-10-28 21:58:48 +00:00
trasz	0ad8692247	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
kib	b9b0d2c54c	Style return statements in vn_pollrecord().	2008-10-28 12:22:33 +00:00
kib	86b5e61ab2	Protect check for v_pollinfo == NULL and assignment of the newly allocated vpollinfo with vnode interlock. Fully initialize vpollinfo before putting pointer to it into vp->v_pollinfo. Discussed with: dwhite Tested by: pho MFC after: 1 week	2008-10-28 12:08:36 +00:00
rwatson	a2129bd144	Rename three MAC entry points from _proc_ to _cred_ to reflect the fact that they operate directly on credentials: mac_proc_create_swapper(), mac_proc_create_init(), and mac_proc_associate_nfsd(). Update policies. Obtained from: TrustedBSD Project	2008-10-28 11:33:06 +00:00
peter	b5b26198a7	After a machine has been up for a bit more than 20 days with HZ=1000, "ticks" goes negative. This breaks the signed comparison in softclock. This causes sleep() to never wake up, tcp to stop, etc etc. This is bad(TM). Use the SEQ_LT() method from tcp's sequence number comparisons.	2008-10-28 03:26:25 +00:00
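The failure mode in this commit follows from arithmetic: a signed 32-bit counter incremented 1000 times per second passes 2^31 after about 24.86 days and then reads as negative. The sketch below contrasts a plain signed comparison with a wrap-safe one modeled on TCP's `SEQ_LT()`. It is an illustration, not the softclock code: `TICKS_LT`, `ticks_after()`, `naive_before()` and `wrapsafe_before()` are hypothetical names, and the conversion of an out-of-range unsigned value to `int32_t` is assumed to wrap in the usual two's complement way.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Wrap-safe "a is before b", in the style of TCP's SEQ_LT(): look at the
 * sign of the difference instead of comparing the raw values.
 */
#define TICKS_LT(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) < 0)

/* Hypothetical helper: the tick value 'delta' ticks after 'now', with wraparound. */
static int32_t
ticks_after(int32_t now, uint32_t delta)
{
	/* The comparison trick only works for gaps below half the range. */
	assert(delta < 0x80000000u);
	return ((int32_t)((uint32_t)now + delta));
}

/* Plain signed comparison: gives the wrong answer across the wrap point. */
static int
naive_before(int32_t a, int32_t b)
{
	return (a < b);
}

/* Difference-based comparison: still correct across the wrap point. */
static int
wrapsafe_before(int32_t a, int32_t b)
{
	return (TICKS_LT(a, b));
}
```

For a tick value just below `INT32_MAX` and a deadline ten ticks later, `naive_before()` reports that the deadline is already in the past, while `wrapsafe_before()` still orders the two correctly.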
jhb	c343bee743	- Whitespace fix for vop_poll. - Use the right label for vop_vptofh lock assertions so they are enforced.	2008-10-27 21:41:55 +00:00
sobomax	2bddeb51d2	vm_pnames should be "const char *const[]". Submitted by: Christoph Mallon	2008-10-27 08:09:05 +00:00
sobomax	c9fd562aa0	vm_pnames has no reason to be global. MFC after: 2 weeks	2008-10-27 06:34:41 +00:00
sobomax	6b076dc603	Default HZ value (1,000) on i386/amd64 is not very virtual machine friendly. Due to the nature of the beast it causes lot of unproductive overhead. This is especially bad when running SMP kernel on VMWare with several virtual processors - idle FreeBSD guest with SMP kernel takes 150% host CPU time on my dual-core MacBook Pro when I am enabling two virtual CPUs, making even host not very usable. Detect when we are running in the sandbox and reduce HZ to 10 (can be adjusted via VM_HZ in the kernel config) in such cases. This brings host CPU usage of idle FreeBSD/SMP on two virtual processors down to 10%. Detect most popular VM platforms out there - VMWare, Parallels, VirtualBox and VirtualPC. MFC after: 2 weeks	2008-10-27 06:25:02 +00:00
dfr	f98e1f1bbf	Don't rely on the value of *statep without first taking the vnode interlock. Reviewed by: Mike Tancsa MFC after: 2 weeks	2008-10-24 16:04:10 +00:00
davidxu	238f3ee5f4	Don't rearm callout if the process is exiting, it may leak a callout because callout_drain() only waits for running callout, but not disable it if it is rearmed.	2008-10-24 01:09:24 +00:00
davidxu	e66e7ee6bb	Partly revert revision 184199: because TDF_NEEDSIGCHK is persistent while a thread is in kernel mode, it can cause a dead loop. Now unlock the process lock after acquiring the sleep queue lock and thread lock to avoid the problem. This means TDF_NEEDSIGCHK and TDF_NEEDSUSPCHK must be set with the process lock and thread lock both held at the same time.	2008-10-24 01:03:31 +00:00
jhb	2e4682de75	Whitespace fix.	2008-10-23 21:50:16 +00:00
des	a1e1ad22e0	Fix a number of style issues in the MALLOC / FREE commit. I've tried to be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.	2008-10-23 20:26:15 +00:00
des	66f807ed8b	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
davidxu	2062caca24	Actually, for signal and thread suspension, extra process spin lock is unnecessary, the normal process lock and thread lock are enough. The spin lock is still needed for process and thread exiting to mimic single sched_lock.	2008-10-23 07:55:38 +00:00
jhb	327ae6eb3a	Split the copyout of *base at the end of getdirentries() out leaving the rest in kern_getdirentries(). Use kern_getdirentries() to implement freebsd32_getdirentries(). This fixes a bug where calls to getdirentries() in 32-bit binaries would trash the 4 bytes after the 'long base' in userland. Submitted by: ups MFC after: 1 week	2008-10-22 21:55:48 +00:00
marcel	7de1858d0c	Trivially avoid a null pointer dereference when drivers don't set the rman description. While drivers should set it, a kernel panic is not the right behaviour when faced without one.	2008-10-22 18:20:45 +00:00
thompsa	0fcb99be5e	Fix spelling mistake in the last rev.	2008-10-21 14:44:25 +00:00
thompsa	8ee58ba9e6	If we have getc_inject hooked then the outq buffer is inaccessible to the driver so skip the drain rather than waiting indefinitely. Reviewed by: ed	2008-10-21 14:18:45 +00:00
kib	cc3d7dc928	Change vn_start_write() to clear *mpp on all failures when non-NULL vp is supplied, since vm_pageout_scan() expects it to be cleared on error. Submitted by: tegge PR: 123768 MFC after: 1 week	2008-10-21 09:55:49 +00:00
attilio	42c5b05453	In the actual code for witness_warn: - If there aren't spinlocks held, but there are problems with old sleeplocks, they are not reported. - If the spinlock found is not the only one, problems are not reported. Fix these 2 problems. Reported by: tegge	2008-10-20 19:22:16 +00:00
kib	e4785f6af4	Assert that v_holdcnt is non-zero before entering lockmgr in vn_lock and ffs_lock. This cannot catch situations where holdcnt is incremented not by curthread, but I think it is useful. Reviewed by: tegge, attilio Tested by: pho MFC after: 2 weeks	2008-10-20 10:11:33 +00:00
kib	015479d466	In vfs_busy(), lockmgr() cannot legitimately sleep, because code checked MNTK_UNMOUNT before, and mnt_mtx is used as interlock. vfs_busy() always tries to obtain a shared lock on mnt_lock, the other user is unmount who tries to drain it, setting MNTK_UNMOUNT before. Reviewed by: tegge, attilio Tested by: pho MFC after: 2 weeks	2008-10-20 10:07:28 +00:00
davidxu	57a7a67ea5	In realtimer_delete(), clear timer's value and interval to tell realtimer_expire() to not rearm the timer, otherwise there is a chance that a callout will be left there and be triggered in future unexpectedly. Bug reported by: tegge@	2008-10-20 02:37:53 +00:00
kib	e8c0b1746f	Ktr(9) stores format string and arguments in the event circular buffer, not the string formatted at the time of CTRX() call. Stack_ktr(9) uses an on-stack buffer for the symbol name, that is supplied as an argument to ktr. As result, stack_ktr() traces show garbage or cause page faults. Fix stack_ktr() by using pointer to module symbol table that is supposed to have a longer lifetime. Tested by: pho MFC after: 1 week	2008-10-19 11:13:49 +00:00
kmacy	4ceda2abba	- Forward port flush of page table updates on context switch or userret - Forward port vfork XEN hack	2008-10-19 01:35:27 +00:00
bz	4d4d2d367d	Add cr_canseeinpcb() doing checks using the cached socket credentials from inp_cred which is also available after the socket is gone. Switch cr_canseesocket consumers to cr_canseeinpcb. This removes an extra acquisition of the socket lock. Reviewed by: rwatson MFC after: 3 months (set timer; decide then)	2008-10-17 16:26:16 +00:00
kmacy	f9a07efdb6	make sure that SO_NO_DDP and SO_NO_OFFLOAD get passed in correctly PR: 127360 MFC after: 3 days	2008-10-17 01:25:45 +00:00
attilio	708fbd2d50	- Fix a race in witness_checkorder() where, between the PCPU_GET() and PCPU_PTR() curthread can migrate on another CPU and get incorrect results. - Fix a similar race into witness_warn(). - Fix the interlock's checks bypassing by correctly using the appropriate children even when the lock_list chunk to be explored is not the first one. - Allow witness_warn() to work with spinlocks too. Bugs found by: tegge Submitted by: jhb, tegge Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-10-16 12:42:56 +00:00
davidxu	3f5ab59cf2	Restore code wrongly removed in SVN revision 173004, it causes threaded process to be stuck in execv(). Noticed by: delphij	2008-10-16 04:17:17 +00:00
ed	48c0c8f51a	Import some improvements to the TTY code from the MPSAFE TTY branch. - Change the ddb(4) commands to be more useful (by thompsa@): - `show ttys' is now called `show all ttys'. This command will now also display the address where the TTY data structure resides. - Add `show tty <addr>', which dumps the TTY in a readable form. - Place an upper bound on the TTY buffer sizes. Some drivers do not want to care about baud rates. Protect these drivers by preventing the TTY buffers from getting enormous. Right now we'll just clamp it to 64K, which is pretty high, taking into account that these buffers are only used by the built-in discipline. - Only call ttydev_leave() when needed. Back in April/May the TTY reference counting mechanism was a little different, which required us to call ttydev_leave() each time we finished a cdev operation. Nowadays we only need to call ttydev_leave() when we really mark it as being closed. - Improve return codes of read() and write() on TTY device nodes. - Make sure we really wake up all blocked threads when the driver calls tty_rel_gone(). There were some possible code paths where we didn't properly wake up any readers/writers. - Add extra assertions to prevent sleeping on a TTY that has been abandoned by the driver. - Use ttydev_cdevsw as a more reliable method to figure out whether a device node is a real TTY device node. Obtained from: //depot/projects/mpsafetty/... Reviewed by: thompsa	2008-10-15 16:58:35 +00:00
davidxu	5068f6dcf0	Move per-thread userland debugging flags into a separate field; this eliminates some locking problems, e.g. a thread lock is needed but cannot be used at that time. Only the process lock is needed now for the new field.	2008-10-15 06:31:37 +00:00
rdivacky	ead773b051	Check the result of copyin and in a case of error return one. This prevents setting wrong priority or (more likely) returning EINVAL. Approved by: kib (mentor)	2008-10-13 21:04:52 +00:00
rwatson	ef6dfc27c4	Downgrade XXX to a Note for fgetsock() and fputsock(). MFC after: 3 days	2008-10-12 20:03:17 +00:00
rwatson	f2c33837dd	Remove stale comment: while uipc_connect2() was, until recently, not static so it could be used by fifofs (actually portalfs), it is now static. Submitted by: kensmith	2008-10-11 17:28:22 +00:00
attilio	b8bf37e585	Remove the struct thread unuseful argument from bufobj interface. In particular following functions KPI results modified: - bufobj_invalbuf() - bufsync() and BO_SYNC() "virtual method" of the buffer objects set. Main consumers of bufobj functions are affected by this change too and, in particular, functions which changed their KPI are: - vinvalbuf() - g_vfs_close() Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit. As a side note, please consider just temporary the 'curthread' argument passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-10-10 21:23:50 +00:00
imp	793aee6634	Close, but not eliminate, a race condition. It is one that properly designed drivers would never hit, but was exposed in diving into another problem... When expanding the devclass array, free the old memory after updating the pointer to the new memory. For the following single race case, this helps: allocate new memory copy to new memory free old memory <interrupt> read pointer to freed memory update pointer to new memory Now we do allocate new memory copy to new memory update pointer to new memory free old memory Which closes this problem, but doesn't even begin to address the multicpu races, which all should be covered by Giant at the moment, but likely aren't completely. Note: reviewers were ok with this fix, but suggested the use case wasn't one we wanted to encourage. Reviewed by: jhb, scottl.	2008-10-10 17:49:47 +00:00
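The reordering described in this commit (update the pointer first, free the old memory afterwards) can be sketched in userland C as follows. This is a minimal illustration, not the devclass code: `struct table` and `table_grow()` are hypothetical names, and `malloc()`/`free()` stand in for the kernel allocator. As the commit notes, the ordering only narrows the window for a reader that interrupts the update on the same CPU; it does not make the structure safe against concurrent access from other CPUs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable pointer array standing in for the devclass array. */
struct table {
	void	**slots;
	size_t	  nslots;
};

/* Grow the array; returns 0 on success, -1 if allocation fails. */
static int
table_grow(struct table *t, size_t newsize)
{
	void **oldp, **newp;
	size_t i;

	assert(newsize > t->nslots);

	/* 1. Allocate new memory. */
	newp = malloc(newsize * sizeof(*newp));
	if (newp == NULL)
		return (-1);

	/* 2. Copy to new memory and clear the added slots. */
	if (t->nslots > 0)
		memcpy(newp, t->slots, t->nslots * sizeof(*newp));
	for (i = t->nslots; i < newsize; i++)
		newp[i] = NULL;

	/* 3. Update the pointer to the new memory. */
	oldp = t->slots;
	t->slots = newp;
	t->nslots = newsize;

	/* 4. Only now free the old memory; nothing reachable points at it. */
	free(oldp);
	return (0);
}
```

With the old order (free, then update), a reader running between the two steps would follow `slots` into freed memory; with this order it sees either the old array, still allocated, or the new one.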
kib	997f16fb43	If the ABI-overridden interpreter was not loaded, do not set have_interp to TRUE. This allows the code in image activator to try /libexec/ld-elf.so.1 as interpreter when newinterp is not found to execute. Reviewed by: peter MFC after: 2 weeks (together with r175105)	2008-10-08 11:11:36 +00:00


10757 Commits