zfs: wait in arc_lowmem only if curproc == pageproc

... otherwise the current thread might be holding ARC locks and thus run
into a deadlock.  This happens, for example, when a thread does memory
allocation in the ARC code and runs into KVA shortage.
Also, it really makes the most sense to wait in pageproc, so that the
results of ARC reclamation are seen before the page cache is acted upon.
In other cases where vm_lowmem is invoked, e.g. on KVA space shortage,
the callers perform multiple attempts (up to 8) and wait for rather
long intervals between them (up to 4 seconds), so ARC reclaim results
should become visible even without explicit waiting on the ARC thread.

Note that this is not a critical issue for typical ZFS usage, where KVA
space should already be large enough.  On amd64 systems, setting the KVA
size to twice the physical memory size is known to mitigate KVA
fragmentation issues in practice.

Side note: perhaps the vm_lowmem 'how' parameter should be used to
differentiate between causes of the event.

Reported by:	Nikolay Denev <ndenev@gmail.com>
MFC after:	19 days
avg 2012-10-20 10:02:18 +00:00
parent 04a3ee8c86
commit 766d50825e

@@ -3792,8 +3792,16 @@ arc_lowmem(void *arg __unused, int howto __unused)
 	mutex_enter(&arc_reclaim_thr_lock);
 	needfree = 1;
 	cv_signal(&arc_reclaim_thr_cv);
-	while (needfree)
-		msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
+	/*
+	 * It is unsafe to block here in arbitrary threads, because we can come
+	 * here from ARC itself and may hold ARC locks and thus risk a deadlock
+	 * with ARC reclaim thread.
+	 */
+	if (curproc == pageproc) {
+		while (needfree)
+			msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
+	}
 	mutex_exit(&arc_reclaim_thr_lock);
 	mutex_exit(&arc_lowmem_lock);
 }
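
For readers who want to experiment with the pattern outside the kernel, the
following is a rough userland analogue built on POSIX threads.  It is only a
sketch, not the ARC code: every name in it (lowmem, reclaim_thread,
caller_may_block, done_cv and so on) is hypothetical, and the
caller_may_block flag merely stands in for the curproc == pageproc check.
The handler always requests reclamation, but waits for it to finish only
when the caller is known to be safe to block.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland stand-ins for the kernel lock, cv and flag. */
static pthread_mutex_t reclaim_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t reclaim_cv = PTHREAD_COND_INITIALIZER; /* wakes the reclaim thread */
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;    /* wakes blocked lowmem() callers */
static int needfree;
static bool shutting_down;
static pthread_t reclaim_tid;

/* Reclaim thread: sleeps until asked to free memory, then clears needfree. */
static void *
reclaim_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&reclaim_lock);
	for (;;) {
		while (!needfree && !shutting_down)
			pthread_cond_wait(&reclaim_cv, &reclaim_lock);
		if (!needfree)
			break;		/* shutting down, nothing pending */
		/* Real reclamation work would happen here. */
		needfree = 0;
		pthread_cond_broadcast(&done_cv);
	}
	pthread_mutex_unlock(&reclaim_lock);
	return (NULL);
}

/*
 * Low-memory handler.  Always requests reclamation; blocks until it has
 * completed only if the caller says blocking is safe.  Returns true if
 * reclamation is known to have completed by the time we return.
 */
static bool
lowmem(bool caller_may_block)
{
	bool completed;

	pthread_mutex_lock(&reclaim_lock);
	needfree = 1;
	pthread_cond_signal(&reclaim_cv);
	if (caller_may_block) {
		while (needfree)
			pthread_cond_wait(&done_cv, &reclaim_lock);
	}
	completed = (needfree == 0);
	pthread_mutex_unlock(&reclaim_lock);
	return (completed);
}

/* Start the reclaim thread; returns 0 on success. */
static int
start_reclaim_thread(void)
{
	return (pthread_create(&reclaim_tid, NULL, reclaim_thread, NULL));
}

/* Ask the reclaim thread to exit and wait for it. */
static void
stop_reclaim_thread(void)
{
	pthread_mutex_lock(&reclaim_lock);
	shutting_down = true;
	pthread_cond_signal(&reclaim_cv);
	pthread_mutex_unlock(&reclaim_lock);
	pthread_join(reclaim_tid, NULL);
}
```

In this sketch a caller that might hold other locks would call lowmem(false)
and return immediately, while a dedicated thread (the analogue of pageproc)
would call lowmem(true) and see the result of reclamation before proceeding.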