freebsd-skq

Author	SHA1	Message	Date
kan	9761044c2f	Redo previous change using simpler patch that happens to be also more correct. Submitted by: tor	2009-04-14 23:56:48 +00:00
kan	3d27e1410d	Fix yet another negative dotodot entry fallout. Reported by: pho	2009-04-14 23:46:57 +00:00
kmacy	de9c351c80	- use a shared lock for reads - remove stale comment Reviewed by: jeffr	2009-04-13 23:09:44 +00:00
davidxu	87940a2686	Make UMTX_OP_WAIT_UINT actually wait for an unsigned integer on 64-bits machine. MFC after: 1 week	2009-04-13 05:21:17 +00:00
kmacy	0cb01ac4ef	sendfile doesn't modify the vnode - acquire vnode lock shared Reviewed by: ups, jeffr	2009-04-12 05:19:35 +00:00
rwatson	90c1837110	Remove conditionally compiled time counter statistics; tools like DTrace, kernel profiling, etc, can provide this information without the overhead. MFC after: 3 days Suggested by: bde	2009-04-11 22:01:40 +00:00
kan	cae135c489	Fix v_cache_dd handling for negative entries. v_cache_dd pointer was not populated in parent directory if negative entry was being created, yet entry itself was added to the nc_neg list. It was possible for parent vnode to get discarded later, leaving negative entry pointing to now unused memory block. Reported by: dho Revewed by: kib	2009-04-11 20:23:08 +00:00
kib	9260481d50	When zapping v_cache_dd for !MAKEENTRY case in cache_lookup(), we shall lock cache as writer. Reviewed by: kan	2009-04-11 16:12:20 +00:00
zec	b39b54e6de	Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)	2009-04-11 05:58:58 +00:00
rwatson	fba90f2e03	Remove VOP_LEASE and supporting functions. This hasn't been used since the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces. Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd. Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon	2009-04-10 10:52:19 +00:00
kib	a5ecda6fd6	Cache_lookup() for DOTDOT drops dvp vnode lock, allowing dvp to be reclaimed. Check the condition and return ENOENT then. In nfs_lookup(), respect ENOENT return from cache_lookup() when it is caused by dvp reclaim. Reported and tested by: pho	2009-04-10 10:22:44 +00:00
thompsa	39714cb212	Revert r190676,190677 The geom and CAM changes for root_hold are the wrong solution for USB design quirks. Requested by: scottl	2009-04-10 04:08:34 +00:00
ed	df0bd942fb	Fix tty_wait_background() to comply with standards. It turns out my handling of SIGTTOU and SIGTTIN didn't entirely comply to the standards. It is true that in the SIGTTOU case we should not return EIO when the signal is ignored/blocked, but in the SIGTTIN case we must. See also: POSIX issue 7 section 11.1.4	2009-04-08 15:56:50 +00:00
rwatson	b2a611f93a	Nul-terminate strings in the VFS name cache, which negligibly change the size and cost of name cache entries, but make adding debugging and tracing easier. Add SDT DTrace probes for various namecache events: vfs:namecache:enter:done - new entry in the name cache, passed parent directory vnode pointer, name added to the cache, and child vnode pointer. vfs:namecache:enter_negative:done - new negative entry in the name cache, passed parent vnode pointer, name added to the cache. vfs:namecache:fullpath:enter - call to vn_fullpath1() is made, passed the vnode to resolve to a name. vfs:namecache:fullpath:hit - vn_fullpath1() successfully resolved a search for the parent of an object using the namecache, passed the discovered parent directory vnode pointer, name, and child vnode pointer. vfs:namecache:fullpath:miss - vn_fullpath1() failed to resolve a search for the parent of an object using the namecache, passed the child vnode pointer. vfs:namecache:fullpath:return - vn_fullpath1() has completed, passed the error number, and if that is zero, the vnode to resolve, and the returned path. vfs:namecache:lookup:hit - postive name cache entry hit, passed the parent directory vnode pointer, name, and child vnode pointer. vfs:namecache:lookup:hit_negative - negative name cache entry hit, passed the parent directory vnode pointer and name. vfs:namecache:lookup:miss - name cache miss, passed the parent directory pointer and the full remaining component name (not terminated after the cache miss component). vfs:namecache:purge:done - name cache purge for a vnode, passed the vnode pointer to purge. vfs:namecache:purge_negative:done - name cache purge of negative entries for children of a vnode, passed the vnode pointer to purge. vfs:namecache:purgevfs - name cache purge for a mountpoint, passed the mount pointer. Separate probes will also be invoked for each cache entry zapped. vfs:namecache:zap:done - name cache entry zapped, passed the parent directory vnode pointer, name, and child vnode pointer. vfs:namecache:zap_negative:done - negative name cache entry zapped, passed the parent directory vnode pointer and name. For any probes involving an extant name cache entry (enter, hit, zapp), we use the nul-terminated string for the name component. For misses, the remainder of the path, including later components, is provided as an argument instead since there is no handy nul-terminated version of the string around. This is arguably a bug. MFC after: 1 month Sponsored by: Google, Inc. Reviewed by: jhb, kan, kib (earlier version)	2009-04-07 20:58:56 +00:00
rwatson	f775cac272	Add SDT DTrace probes for namei(): vfs:namei:lookup:entry takes parent directory vnode pointer, path to look up, and lookup flags. vfs:namei:lookup:return takes an error value, and if successful, the returned vnode pointer. MFC after: 1 month	2009-04-06 10:32:40 +00:00
dchagin	01bf63c9fb	Fix KBI breakage by r190520 which affects older linux.ko binaries: 1) Move the new field (brand_note) to the end of the Brandinfo structure. 2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer is valid. 3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old modules won't have the flag set, so the new field brand_note would be ignored. Suggested by: jhb Reviewed by: jhb Approved by: kib (mentor) MFC after: 6 days	2009-04-05 09:27:19 +00:00
kan	51a543ed33	Revert change 190655 temporarily. It breaks many setups where nullfs is used and needs to be revisited.	2009-04-04 17:48:38 +00:00
marcel	c9498bd9af	PowerPC, meet kernel core dumps. The support is based on a generic dumper that creates an ELF core file and uses PMAP functions to scan and iterate over memory chunks, as well as handle memory mappings used during dumping. the PMAP layer can choose to return physical memory chunks or virtual memory chunks. For minidumps, the chunks should be virtual. The default MMU I/F implementation for the scan_md() method returns NULL. Thus, when a PMAP implementation does not implement the required methods, an empty core file is created. Here, empty means having an ELF header only. Obtained from: Juniper Networks	2009-04-04 02:12:37 +00:00
thompsa	fe5458f665	Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called in situations where sleeping isnt allowed.	2009-04-03 19:46:12 +00:00
peter	77d3ac29e9	vn_vptocnp() unlocks the name cache and forgets to re-lock it before returning in one error case, and mistakenly unlocks it for the umount -f case.	2009-04-02 21:16:20 +00:00
brueffer	f970c13637	Fix memory leak in semunload(). PR: 133064 Submitted by: Mateusz Guzik <mjguzik@gmail.com> MFC after: 1 week	2009-03-30 15:01:29 +00:00
thompsa	84a2f3fa57	Further rate limit the root wait status, it will be printed once per root_mount_rel() wakeup.	2009-03-30 05:57:55 +00:00
kan	ca739eae4a	Replace v_dd vnode pointer with v_cache_dd pointer to struct namecache in directory vnodes. Allow namecache dotdot entry to be created pointing from child vnode to parent vnode if no existing links in opposite direction exist. Use direct link from parent to child for dotdot lookups otherwise. This restores more efficient dotdot caching in NFS filesystems which was lost when vnodes stoppped being type stable. Reviewed by: kib	2009-03-29 21:25:40 +00:00
jamie	5a5a677581	Whitespace/spelling fixes in advance of upcoming functional changes. Approved by: bz (mentor)	2009-03-27 13:13:59 +00:00
thompsa	8caa2c2d5c	Skip the allocation of the root hold token if the mount already happened.	2009-03-27 03:52:08 +00:00
jhb	7f1bee5227	When looking up the parent devclass of a new devclass, create the parent devclass if it doesn't already exist.	2009-03-25 17:02:05 +00:00
jhb	503529db8f	When a file lookup fails due to encountering a doomed vnode from a forced unmount, consistently return ENOENT rather than EBADF. Reviewed by: kib MFC after: 1 month	2009-03-24 18:16:42 +00:00
jkim	effc077868	Clean up MI inittodr(9) and kill noop code. It was derived from i386 version long ago but never resync'ed again. Originally, i386 version compared the current time from realtime clock with time_second (which was just `time' in the old days). When this MI version was written, it was wrongly compared against `base' AND never used because of a bug (typo?) in the code. This check was killed in i386 version when home-rolled calendaric calculation was removed. Now, we just remove the code here as well to make the code simpler.	2009-03-23 21:16:21 +00:00
jhb	ffcd13a80f	Improve the description of a few sysctls. Submitted by: bde (partially) MFC after: 3 days	2009-03-23 20:18:06 +00:00
kan	ab12866f38	Add safety check that does not allow empty strings to be queued to the devctl notification queue. Empty strings cause devctl read call to return 0 and result in devd exiting prematurely. The actual offender (ugen notes for root hubs) will be fixed by separate commit.	2009-03-23 01:13:34 +00:00
cperciva	b7238dced4	Correctly sanity-check timer IDs. [SA-09:06] Limit the size of malloced buffer when dumping environment variables. [EN-09:01] Approved by: so (cperciva) Approved by: re (kensmith) Security: FreeBSD-SA-09:06.ktimer Errata: FreeBSD-EN-09:01.kenv	2009-03-23 00:00:50 +00:00
kib	4c3e8a8b03	Fix several issues with parsing the notes for ELF objects. Badly formed ELF note may cause the caclulated pointer to the next note to point both after the note region, that was checked in the code, but also to point before the region, that was not checked [1]. Remember the first note location in note0 and leap out if the note is not between note0 and note_end. In the similar way, badly formed note may cause infinite loop by pointing next note into the same or previous note. Guard against this by limiting amount of loop iterations by arbitrary choosen big number. For clarity, check the calculated note alignment in each iteration. Reported by: Chris Palmer <chris noncombatant org> [1] PR: kern/132886 Reviewed and tested by: dchagin MFC after: 3 days	2009-03-22 13:42:41 +00:00
kib	c8b4c76eed	Do not underflow the buffer and then report the problem. Check for the condition before the buffer write. Also, since buflen is unsigned, previous check was ignored. Reviewed by: marcus Tested by: pho	2009-03-20 11:08:57 +00:00
kib	2e1bae397f	Remove unneeded braces to reduce used vertical screen space. The location was missed in r190140.	2009-03-20 11:03:55 +00:00
kib	bbdc41384b	Do not forget to adjust buflen for the first resolution of the path from namecache. While there, compare pointers for equiality. Reviewed by: marcus Tested by: pho	2009-03-20 11:00:39 +00:00
kib	6c64d83946	The nc_nlen member of the struct namecache contains the length of the cached name, not the length + 1. PR: 132620, 132542 Reported by: bf2006a yahoo com Tested by: bf2006a, pho Reviewed by: marcus	2009-03-20 10:59:06 +00:00
kib	0b55c1725c	When ktracing namei operations, log a result of the __getcwd(). MFC after: 1 week	2009-03-20 10:47:16 +00:00
kib	de298453c7	Remove unneeded braces to reduce used vertical screen space.	2009-03-20 10:04:00 +00:00
attilio	9c2a3fc781	Fix an old-standing bug that crept in along the several revisions: B_DELWRI cleanup and vnode disassociation should happen just before to assign the buffer to a queue. Reported by: miwi, Volker <volker at vwsoft dot com>, Ben Kaduk <minimarmot at gmail dot com>, Christopher Mallon <christoph dot mallon at gmx dot de> Tested by: lulf, miwi	2009-03-17 16:30:49 +00:00
kib	e905171fbe	Supply AT_EXECPATH auxinfo entry to the interpreter, both for native and compat32 binaries. Tested by: pho Reviewed by: kan	2009-03-17 12:53:28 +00:00
kib	37de637d31	Use the properly sized types for ELF object header and program headers. This fixes osrel fetching from the FreeBSD branding note for the 64bit platforms. Reported by: swell.k gmail com Reviewed by: dchagin Tested by: dchagin, swell.k gmail com	2009-03-17 09:50:40 +00:00
jkim	3eda4741da	Initial suspend/resume support for amd64. This code is heavily inspired by Takanori Watanabe's experimental SMP patch for i386 and large portion was shamelessly cut and pasted from Peter Wemm's AP boot code.	2009-03-17 00:48:11 +00:00
kib	7ad4235008	Fix two issues with bufdaemon, often causing the processes to hang in the "nbufkv" sleep. First, ffs background cg group block write requests a new buffer for the shadow copy. When ffs_bufwrite() is called from the bufdaemon due to buffers shortage, requesting the buffer deadlock bufdaemon. Introduce a new flag for getnewbuf(), GB_NOWAIT_BD, to request getblk to not block while allocating the buffer, and return failure instead. Add a flag argument to the geteblk to allow to pass the flags to getblk(). Do not repeat the getnewbuf() call from geteblk if buffer allocation failed and either GB_NOWAIT_BD is specified, or geteblk() is called from bufdaemon (or its helper, see below). In ffs_bufwrite(), fall back to synchronous cg block write if shadow block allocation failed. Since r107847, buffer write assumes that vnode owning the buffer is locked. The second problem is that buffer cache may accumulate many buffers belonging to limited number of vnodes. With such workload, quite often threads that own the mentioned vnodes locks are trying to read another block from the vnodes, and, due to buffer cache exhaustion, are asking bufdaemon for help. Bufdaemon is unable to make any substantial progress because the vnodes are locked. Allow the threads owning vnode locks to help the bufdaemon by doing the flush pass over the buffer cache before getnewbuf() is going to uninterruptible sleep. Move the flushing code from buf_daemon() to new helper function buf_do_flush(), that is called from getnewbuf(). The number of buffers flushed by single call to buf_do_flush() from getnewbuf() is limited by new sysctl vfs.flushbufqtarget. Prevent recursive calls to buf_do_flush() by marking the bufdaemon and threads that temporarily help bufdaemon by TDP_BUFNEED flag. In collaboration with: pho Reviewed by: tegge (previous version) Tested by: glebius, yandex ... MFC after: 3 weeks	2009-03-16 15:39:46 +00:00
rwatson	70b6a8119c	Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced in FreeBSD 5.x to allow network device drivers to run with Giant despite the network stack being Giant-free. This significantly simplifies calls into ioctl() on network interfaces, especially in the multicast code, as well as eliminates deferred invocation of interface if_start routines. Disable the build on device drivers still depending on IFF_NEEDSGIANT as they no longer compile. They will be removed in a few weeks if they haven't been made MPSAFE in that time. Disabled drivers: if_ar if_axe if_aue if_cdce if_cue if_kue if_ray if_rue if_rum if_sr if_udav if_ural if_zyd Drivers that were already disabled because of tty changes: if_ppp if_sl Discussed on: arch@	2009-03-15 14:21:05 +00:00
jeff	9fedeedb8d	- Wrap lock profiling state variables in #ifdef LOCK_PROFILING blocks.	2009-03-15 08:03:54 +00:00
jeff	ee1ec823f6	- Implement a new mechanism for resetting lock profiling. We now guarantee that all cpus have acknowledged the cleared enable int by scheduling the resetting thread on each cpu in succession. Since all lock profiling happens within a critical section this guarantees that all cpus have left lock profiling before we clear the datastructures. - Assert that the per-thread queue of locks lock profiling is aware of is clear on thread exit. There were several cases where this was not true that slows lock profiling and leaks information. - Remove all objects from all lists before clearing any per-cpu information in reset. Lock profiling objects can migrate between per-cpu caches and previously these migrated objects could be zero'd before they'd been removed Discussed with: attilio Sponsored by: Nokia	2009-03-15 06:41:47 +00:00
jeff	7cd47702cc	- When a mutex is destroyed while locked we need to inform lock profiling that it has been released.	2009-03-14 11:43:38 +00:00
jeff	4e3c0111d8	- Call lock_profile_release when we're transitioning a lock to be owned by LK_KERNPROC. Discussed with: attilio	2009-03-14 11:43:02 +00:00
jeff	96eaa9ff52	- Fix an error that occurs when mp_ncpu is an odd number. steal_thresh is calculated as 0 which causes errors elsewhere. Submitted by: KOIE Hidetaka <koie@suri.co.jp> - When sched_affinity() is called with a thread that is not curthread we need to handle the ON_RUNQ() case by adding the thread to the correct run queue. Submitted by: Justin Teller <justin.teller@gmail.com> MFC after: 1 Week	2009-03-14 11:41:36 +00:00
dchagin	2408b715a0	Implement new way of branding ELF binaries by looking to a ".note.ABI-tag" section. The search order of a brand is changed, now first of all the ".note.ABI-tag" is looked through. Move code which fetch osreldate for ELF binary to check_note() handler. PR: 118473 Approved by: kib (mentor)	2009-03-13 16:40:51 +00:00

1 2 3 4 5 ...

11044 Commits