freebsd-nq

Author	SHA1	Message	Date
Mateusz Guzik	8066a14a3c	cache: stop holding the ncneg_hot lock across purging Only non-hot entries are purged so the lock is not needed in the first place. This saves one lock/unlock pair. MFC after: 1 week	2017-05-04 03:11:59 +00:00
Brooks Davis	a3b7d0fb60	Regen after r316594.	2017-04-06 23:40:51 +00:00
Mateusz Guzik	dfecf51dd0	cache: use vrefact for '.' lookups and refing the rdir in fullpath	2017-01-30 03:20:05 +00:00
Mateusz Guzik	17071ff298	cache: annotate with __read_mostly and __exclusive_cache_line MFC after: 1 month	2017-01-27 14:56:36 +00:00
Mateusz Guzik	4938d86764	cache: sprinkle __predict_false	2016-12-29 16:35:49 +00:00
Mateusz Guzik	b37707533e	cache: move shrink lock init to nchinit This gets rid of unnecessary sysinit usage. While here also rename the lock to be consistent with the rest.	2016-12-29 12:01:54 +00:00
Mateusz Guzik	0569bc9ca9	cache: depessimize hashing macros/inlines All hash sizes are power-of-2, but the compiler does not know that for sure and 'foo % size' forces doing a division. Store the size - 1 and use 'foo & hash' instead which allows mere shift.	2016-12-29 08:41:25 +00:00
Mateusz Guzik	6dd9661b77	cache: drop the NULL check from VP2VNODELOCK Now that negative entries are annotated with a dedicated flag, NULL vnodes are no longer passed.	2016-12-29 08:34:50 +00:00
Mateusz Guzik	25e578de55	vfs: use vrefact in getcwd and fchdir	2016-12-12 19:16:35 +00:00
Mateusz Guzik	8b0e0c91e0	cache: ensure that the number of bucket locks does not exceed hash size The size can be changed by side effect of modifying kern.maxvnodes. Since numbucketlocks was not modified, setting a sufficiently low value would give more locks than actual buckets, which would then lead to corruption. Force the number of buckets to be not smaller. Note this should not matter for real world cases. Reported and tested by: pho	2016-11-23 19:50:12 +00:00
Mateusz Guzik	6ce45c6ac3	cache: plug a write-only variable in cache_negative_zap_one	2016-11-15 03:43:10 +00:00
Mateusz Guzik	317cac6d5a	cache: fix a race between entry removal and demotion The negative list shrinker can demote an entry with only hotlist + neglist locks held. On the other hand entry removal possibly sets the NCF_DVDROP without aforementioned locks held prior to detaching it from the respective neglist, which can lose the update made by the shrinker. Reported and tested by: truckman	2016-11-15 03:38:05 +00:00
Konstantin Belousov	9bd4f0a2c6	vn_fullpath1() checked VV_ROOT and then unreferenced vp->v_mount->mnt_vnodecovered unlocked. This allowed unmount to race. Lock vnode after we noticed the VV_ROOT flag. See comments for explanation why unlocked check for the flag is considered safe. Reported and tested by: avg Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-07 10:55:56 +00:00
Mateusz Guzik	bb697a20d7	cache: fix up a corner case in r307650 If no negative entry is found on the last list, the ncp pointer will be left uninitialized and a non-null value will make the function assume an entry was found. Fix the problem by initializing to NULL on entry. Reported by: glebius	2016-10-20 19:55:50 +00:00
Mateusz Guzik	a45a1a25b8	cache: split negative entry LRU into multiple lists This splits the ncneg_mtx lock while preserving the hit ratio at least during buildworld. Create N dedicated lists for new negative entries. Entries with at least one hit get promoted to the hot list, where they get requeued every M hits. Shrinking demotes one hot entry and performs a round-robin shrinking of regular lists. Reviewed by: kib	2016-10-19 18:29:52 +00:00
Konstantin Belousov	f71d08566c	Limit scope of the optimization in r306608 to dounmount() caller only. Other uses of cache_purgevfs() do rely on the cache purge for correct operations, when paths are invalidated without unmount. Reported and tested by: jkim Discussed with: mjg Sponsored by: The FreeBSD Foundation	2016-10-07 11:38:28 +00:00
Mateusz Guzik	4876636eb7	cache: ignore purgevfs requests for filesystems with few vnodes purgevfs is purely optional and induces lock contention in workloads which frequently mount and unmount filesystems. In particular, poudriere will do this for filesystems with 4 vnodes or less. Full cache scan is clearly wasteful. Since there is no explicit counter for namecache entries, the number of vnodes used by the target fs is checked. The default limit is the number of bucket locks. Reviewed by: kib	2016-10-03 00:02:32 +00:00
Mateusz Guzik	1d2541fd1a	cache: get rid of the global lock Add a table of vnode locks and use them along with bucketlocks to provide concurrent modification support. The approach taken is to preserve the current behaviour of the namecache and just lock all relevant parts before any changes are made. Lookups still require the relevant bucket to be locked. Discussed with: kib Tested by: pho	2016-09-23 04:45:11 +00:00
Ed Maste	69a2875821	Renumber license clauses in sys/kern to avoid skipping #3	2016-09-15 13:16:20 +00:00
Mateusz Guzik	a27815330c	cache: improve scalability by introducing bucket locks An array of bucket locks is added. All modifications still require the global cache_lock to be held for writing. However, most readers only need the relevant bucket lock and in effect can run concurrently to the writer as long as they use a different lock. See the added comment for more details. This is an intermediate step towards removal of the global lock. Reviewed by: kib Tested by: pho	2016-09-10 16:29:53 +00:00
Mateusz Guzik	591df14528	cache: defer freeing entries until after the global lock is dropped This also defers vdrop for held vnodes. Glanced at by: kib	2016-09-04 16:52:14 +00:00
Mateusz Guzik	31977b420a	cache: manage negative entry list with a dedicated lock Since negative entries are managed with an LRU list, a hit requires a modification. Currently the code tries to upgrade the global lock if needed and is forced to retry the lookup if it fails. Provide a dedicated lock for use when the cache is only shared-locked. Reviewed by: kib MFC after: 1 week	2016-09-04 08:58:35 +00:00
Mateusz Guzik	b9042ae1bf	cache: put all negative entry management code into dedicated functions Reviewed by: kib MFC after: 1 week	2016-09-04 08:55:15 +00:00
Pedro F. Giffuni	e3043798aa	sys/kern: spelling fixes in comments. No functional change.	2016-04-29 22:15:33 +00:00
Konstantin Belousov	0791e0c0e7	Provide more correct sizing of the KVA consumed by a vnode, used by the virtvnodes calculation. Include the size of fs-specific v_data as the nfs nclnode inline, the NFS nclnode is bigger than either ZFS znode or UFS inode. Include the size of namecache_ts and short cache path element, multiplied by the name cache population factor, again inline. Inline defines are used to avoid pollution of the vnode.h with the subsystem-private objects. Non-significant unsynchronized changes of the definitions are fine, we do not care about that precision, and e.g. ZFS consumes much malloced memory per vnode for reasons unaccounted in the formula. Lower the partition of kmem dedicated to vnodes, from 1/7 to 1/10. The measures reduce vnode cache pressure on kmem and bring the vnode cache memory use below some apparent thresholds that were exceeded by r291244 due to more robust vnode reuse. Reported and tested by: marius (i386, previous version) Reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-02-24 15:15:46 +00:00
Mateusz Guzik	b0632ab432	cache: minor changes 1. vhold and zap immediately instead of postponing few lines later 2. increment numneg after new entry is added No functional changes. No objections: kib	2016-01-21 01:09:39 +00:00
Mateusz Guzik	baa2bcf572	cache: perform . lookup without the namecache lock Reviewed by: kib	2016-01-21 01:07:05 +00:00
Mateusz Guzik	db709ecbcc	cache: provide a helper for computing the hash Reviewed by: kib	2016-01-21 01:05:41 +00:00
Mateusz Guzik	76583fa294	cache: use counter(9) API to maintain statistics Previously the code would just increment statistics while only holding a shared lock, in effect losing updates. Separate tracking for nchstats is removed as values can be obtained from existing counters. Note that some fields are updated by external consumers and are left unfixed. This should not be a serious issue as this structure looks quite obsolete. No strong objections: kib	2016-01-21 01:04:03 +00:00
Mateusz Guzik	6b53d1bc6f	cache: ansify functions and fix some style issues No functional changes.	2016-01-07 02:04:17 +00:00
Mark Johnston	3616095801	Fix style issues around existing SDT probes. - Use SDT_PROBE<N>() instead of SDT_PROBE(). This has no functional effect at the moment, but will be needed for some future changes. - Don't hardcode the module component of the probe identifier. This is set automatically by the SDT framework. MFC after: 1 week	2015-12-16 23:39:27 +00:00
Andriy Gapon	2f2f522b5d	save some bytes by using more concise SDT_PROBE<n> instead of SDT_PROBE SDT_PROBE requires 5 parameters whereas SDT_PROBE<n> requires n parameters where n is typically smaller than 5. Perhaps SDT_PROBE should be made a private implementation detail. MFC after: 20 days	2015-09-28 12:14:16 +00:00
Kirk McKusick	17518b1a2b	Track changes to kern.maxvnodes and appropriately increase or decrease the size of the name cache hash table (mapping file names to vnodes) and the vnode hash table (mapping mount point and inode number to vnode). An appropriate locking strategy is the key to changing hash table sizes while they are in active use. Reviewed by: kib Tested by: Peter Holm Differential Revision: https://reviews.freebsd.org/D2265 MFC after: 2 weeks	2015-09-06 05:50:51 +00:00
Mateusz Guzik	752fc07d33	vfs: implement v_holdcnt/v_usecount manipulation using atomic ops Transitions 0->1 and 1->0 (which decide e.g. on putting the vnode on the free list) of either counter are still guarded with vnode interlock. Reviewed by: kib (earlier version) Tested by: pho	2015-07-16 13:57:05 +00:00
Edward Tomasz Napierala	6289b482ec	Modify kern___getcwd() to take max pathlen limit as an additional argument. This will be used for the Linux emulation layer - for Linux, PATH_MAX is 4096 and not 1024. Differential Revision: https://reviews.freebsd.org/D2335 Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation	2015-04-21 13:55:24 +00:00
Kirk McKusick	f351915514	More accurately collect name-cache statistics in sysctl functions sysctl_debug_hashstat_nchash() and sysctl_debug_hashstat_rawnchash(). These changes are in preparation for allowing changes in the size of the vnode hash tables driven by increases and decreases in the maximum number of vnodes in the system. Reviewed by: kib@ Phabric: D2265	2015-04-18 00:59:03 +00:00
Dmitry Chagin	9f7a06f27e	Indeed, instead of hiding the kern___getcwd() bug by bogus cast in r276564, change path type to char * (pathnames are always char *). And remove bogus casts of malloc(). kern___getcwd() internally doesn't actually use or support u_char paths, except to copy them to a normal char * path. These changes are not visible to libc as libc/gen/getcwd.c misdeclares __getcwd() as taking a plain char * path. While here remove _SYS_SYSPROTO_H_ for __getcwd() syscall as we always have sysproto.h. Pointed out by: bde MFC after: 1 week	2015-01-04 10:34:02 +00:00
Hans Petter Selasky	f0188618f2	Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-21 07:31:21 +00:00
Sergey Kandaurov	bcdd3bceb6	vn_path_to_global_path: update comment.	2014-08-03 07:59:19 +00:00
Konstantin Belousov	fe20047039	Fix accounting for the negative cache entries when reusing v_cache_dd. Having ncneg diverge with the actual length of the ncneg tailq causes NULL dereference. Add assertion that an entry taken from ncneg queue is indeed negative. Reported by and discussed with: avg Sponsored by: The FreeBSD Foundation MFC after: 1 week	2013-12-27 17:09:59 +00:00
Andriy Gapon	d9fae5ab88	dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE In its stead use the Solaris / illumos approach of emulating '-' (dash) in probe names with '__' (two consecutive underscores). Reviewed by: markj MFC after: 3 weeks	2013-11-26 08:46:27 +00:00
Attilio Rao	54366c0bd7	- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invocation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip	2013-11-25 07:38:45 +00:00
Andriy Gapon	4633a4c379	namecache sdt: freebsd doesn't support structured characters yet :-) MFC after: 7 days	2013-07-09 08:58:34 +00:00
Kirk McKusick	3289d5877a	When renaming a directory from one parent directory to another, we need to call ufs_checkpath() to walk from our new location to the root of the filesystem to ensure that we do not encounter ourselves along the way. Until now, we accomplished this by reading the ".." entries of each directory in our path until we reached the root (or encountered an error). This change tries to avoid the I/O of reading the ".." entries by first looking them up in the name cache and only doing the I/O when the name cache lookup fails. Reviewed by: kib Tested by: Peter Holm MFC after: 4 weeks	2013-03-20 17:57:00 +00:00
Konstantin Belousov	5050aa86cf	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
Rick Macklem	5e99212d36	Post r230394, the Lookup RPC counts for both NFS clients increased significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels. Reported by: bde Discussed with: kib Reviewed by: jhb MFC after: 2 weeks	2012-03-03 01:06:54 +00:00
Maxim Konovalov	7dfdd83d56	o Reduce chances for integer overflow. o More verbose sysctl description added. MFC after: 2 weeks Sponsored by: Nginx, Inc.	2012-02-25 12:06:40 +00:00
John Baldwin	bf40d24a3f	Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().	2012-02-06 17:00:28 +00:00
Konstantin Belousov	d5210589b7	Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks	2012-01-25 20:48:20 +00:00
Konstantin Belousov	7a7e609a32	Apparently, both nfs clients do not use cache_enter_time() consistently, creating some namecache entries without NCF_TS flag. This causes panic due to failed assertion. As a temporary relief, remove the assert. Return epoch timestamp for the entries without timestamp if asked. While there, consolidate the code which returns timestamps, into a helper cache_out_ts(). Discussed with: jhb MFC after: 2 weeks	2012-01-23 17:09:23 +00:00
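
Commit 0569bc9ca9 above describes replacing 'foo % size' with a mask when the hash size is a power of 2. A minimal sketch of that idea follows; it is not the kernel code. The names hash_table, hash_table_init, bucket_by_modulo and bucket_by_mask are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical table descriptor (not a kernel structure).
 * Only the mask is stored, i.e. size - 1, as the commit describes.
 */
struct hash_table {
	uint32_t mask;
};

/* Returns 1 if size is a nonzero power of 2, 0 otherwise. */
static inline int
is_power_of_2(uint32_t size)
{
	return (size != 0 && (size & (size - 1)) == 0);
}

/* Record size - 1; the mask trick is only valid for power-of-2 sizes. */
static inline void
hash_table_init(struct hash_table *t, uint32_t size)
{
	assert(is_power_of_2(size));
	t->mask = size - 1;
}

/*
 * Bucket index by remainder. The compiler cannot assume that a
 * runtime size is a power of 2, so this generally needs a division.
 */
static inline uint32_t
bucket_by_modulo(uint32_t hash, uint32_t size)
{
	return (hash % size);
}

/* Bucket index by mask: a single AND, same result for power-of-2 sizes. */
static inline uint32_t
bucket_by_mask(const struct hash_table *t, uint32_t hash)
{
	return (hash & t->mask);
}
```

For any power-of-2 size, bucket_by_mask() and bucket_by_modulo() return the same index. The assert in hash_table_init() guards the one precondition the shortcut depends on.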

226 Commits