62 Commits

Author SHA1 Message Date
markj
8f237db141 Add a return value to vm_page_remove().
Use it to indicate whether the page may be safely freed following
its removal from the object.  Also change vm_page_remove() to assume
that the page's object pointer is non-NULL, and have callers perform
this check instead.

This is a step towards an implementation of an atomic reference counter
for each physical page structure.

Reviewed by:	alc, dougm, kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20758
2019-06-26 17:37:51 +00:00
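
A minimal caller sketch of the pattern described above (illustrative
only; locking and surrounding cleanup are elided):

	/*
	 * Hypothetical caller: check the object pointer ourselves, then
	 * free the page only if vm_page_remove() says it is safe to do so.
	 */
	if (m->object != NULL && vm_page_remove(m))
		vm_page_free(m);
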
markj
6a9f104098 Replace uses of vm_page_unwire(m, PQ_NONE) with vm_page_unwire_noq(m).
These calls are not the same in general: the former will dequeue the
page if it is enqueued, while the latter will just leave it alone.  But,
all existing uses of the former apply to unmanaged pages, which are
never enqueued in the first place.  No functional change intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20470
2019-06-07 18:23:29 +00:00
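
In sketch form, the substitution amounts to the following (unmanaged
pages only, per the reasoning above):

	/* Before: would also dequeue the page if it were on a page queue. */
	vm_page_unwire(m, PQ_NONE);

	/* After: just drop the wiring; unmanaged pages are never enqueued. */
	vm_page_unwire_noq(m);
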
cem
0e503ca577 Include eventhandler.h in more compilation units
This was enumerated with an exhaustive search for sys/eventhandler.h includes,
cross-referenced against EVENTHANDLER_* usage with the comm(1) utility.  Manual
checking was performed to avoid redundant includes in some drivers where a
common os_bsd.h (for example) included sys/eventhandler.h indirectly, but it is
possible some of these are redundant with driver-specific headers in ways I
didn't notice.

(These CUs did not show up as missing eventhandler.h in tinderbox.)

X-MFC-With:	r347984
2019-05-21 01:18:43 +00:00
imp
ae526637c4 Revert drm2 removal.
Revert r338177, r338176, r338175, r338174, r338172

After long consultations with re@, core members and mmacy, revert
these changes. Follow-up changes will be made to mark them as
deprecated and print a message about where to find the up-to-date
driver.  Follow-up commits will be made to make this clear in the
installer, and further follow-up commits will reduce the POLA impact
in ways we're still exploring.

It's anticipated that after the freeze, this will be removed in
13-current (with the residual of the drm2 code copied to
sys/arm/dev/drm2 for the TEGRA port's use w/o the intel or
radeon drivers).

Due to the impending freeze, there was no formal core vote for
this. I've been talking to different core members all day, as well as
Matt Macy and Glen Barber.  Nobody is completely happy, but all are
grudgingly going along with this.  Work is in progress to mitigate
the negative effects as much as possible.

Requested by: re@ (gjb, rgrimes)
2018-08-24 00:02:00 +00:00
mmacy
8f3d6467d4 Remove legacy drm and drm2 from tree
As discussed on the MLs, drm2 conflicts with the ports' version, and
there is no upstream for most, if not all, of drm.  Both have been
merged into a single port.

Users on powerpc, 32-bit hardware, or with GPUs predating Radeon
and i915 will need to install the graphics/drm-legacy-kmod. All
other users should be able to use one of the LinuxKPI-based ports:
graphics/drm-stable-kmod, graphics/drm-next-kmod, graphics/drm-devel-kmod.

MFC: never
Approved by: core@
2018-08-22 01:50:12 +00:00
kib
ee3d0fb8ef vm_wait() rework.
Make vm_wait() take a vm_object argument, which specifies the domain
set to wait on for the min condition to pass.  If there is no object
associated with the wait, use curthread's policy domainset.  The
mechanics of the wait in vm_wait() and vm_wait_domain() are supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait on for the min condition to pass.

Eliminate pagedaemon_wait().  vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Sketched and reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D14384
2018-02-20 10:13:13 +00:00
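
A hedged sketch of a caller after the conversion (the allocation flags
and retry structure are illustrative, not taken from a specific call
site):

	/*
	 * Retry the allocation, waiting on the domain set derived from obj;
	 * passing NULL instead would use curthread's policy domainset.
	 */
	while ((m = vm_page_alloc(obj, pindex, VM_ALLOC_NORMAL)) == NULL)
		vm_wait(obj);
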
markj
ff85e189b3 Don't bother enqueuing a page immediately before freeing it.
No functional change intended.

MFC after:	1 week
2017-05-18 18:37:19 +00:00
markj
91faea9058 Fix a few uses of kern_yield() in the TTM and the LinuxKPI.
kern_yield(0) effectively causes the calling thread to be rescheduled
immediately since it resets the thread's priority to the highest possible
value. This can cause livelocks when the pattern
"while (!trylock()) kern_yield(0);" is used since the thread holding the
lock may linger on the runqueue for the CPU on which the looping thread is
running.

MFC after:	1 week
2017-05-18 18:35:14 +00:00
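
The problematic pattern, and one possible alternative (the replacement
priority shown here is an assumption, not necessarily what this commit
used):

	/*
	 * Livelock-prone: kern_yield(0) boosts the caller to the highest
	 * priority, so it can keep winning the CPU over the lock holder.
	 */
	while (!trylock(&lk))
		kern_yield(0);

	/* Alternative sketch: keep the caller's current priority instead. */
	while (!trylock(&lk))
		kern_yield(PRI_UNCHANGED);
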
mmel
117cf46453 Allow DRM2 code to be built on platforms without AGP.
This patch is taken from the original drm-3.8 code.

Differential Revision: https://reviews.freebsd.org/D8454
MFC after: 3 weeks
2016-11-13 13:31:23 +00:00
kib
af14bca641 Fix a race in vm_page_busy_sleep(9).
Suppose that we have an exclusively busied page, and a thread which can
accept a shared-busy page.  In this case, typical code waiting for the
page's xbusy state to pass is
again:
	VM_OBJECT_WLOCK(object);
	...
	if (vm_page_xbusied(m)) {
		vm_page_lock(m);
		VM_OBJECT_WUNLOCK(object);    <---1
		vm_page_busy_sleep(m, "vmopax");
		goto again;
	}

Suppose that the xbusy state owner locked the object, unbusied the
page, and unlocked the object after we reached line [1], but before we
executed the load of the busy_lock word in vm_page_busy_sleep().  If it
happens that there are still no waiters recorded for the busy state,
the xbusy owner did not acquire the page lock, so it proceeded.

Further, suppose that some other thread happens to share-busy the page
after the xbusy state was relinquished but before m->busy_lock is read
in vm_page_busy_sleep().  Again, that thread only needs the vm_object
lock to proceed.  Then, vm_page_busy_sleep() reads a busy_lock value
equal to VPB_SHARERS_WORD(1).

In this case, all tests in vm_page_busy_sleep(9) pass and we are going
to sleep, despite the page being share-busied.

Update check for m->busy_lock == VPB_UNBUSIED in vm_page_busy_sleep(9)
to also accept shared-busy state if we only wait for the xbusy state to
pass.

Merge sequential if()s with the same 'then' clause in
vm_page_busy_sleep().

Note that the current code does not share-busy pages from parallel
threads; right now the only way to have more than one sbusy owner is
to recurse.

Reported and tested by:	pho (previous version)
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8196
2016-10-13 14:41:05 +00:00
pfg
eed4bd22ad sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
pfg
6ce01c2d90 etc: minor spelling fixes.
Mostly comments but also some user-visible strings.

MFC after:	2 weeks
2016-05-02 16:47:28 +00:00
kib
c91e1d14af Handle the driver KPI change from r292373.  Ensure that managed device
pagers' fault routines always return with a result page, be it the
proper and valid result page or the initially passed, freshly allocated
placeholder.  Do not free the passed-in page until we are able to
provide the replacement, and do not assign NULL to *mres.

Reported and tested by:	dumbbell
Reviewed by:	royger (who also verified that Xen code is safe)
Sponsored by:	The FreeBSD Foundation
2016-03-24 09:56:53 +00:00
alc
8343c406db Introduce a new mechanism for relocating virtual pages to a new physical
address and use this mechanism when:

1. kmem_alloc_{attr,contig}() can't find suitable free pages in the physical
   memory allocator's free page lists.  This replaces the long-standing
   approach of scanning the inactive and active queues, converting clean
   pages into PG_CACHED pages and laundering dirty pages.  In contrast, the
   new mechanism does not use PG_CACHED pages nor does it trigger a large
   number of I/O operations.

2. on 32-bit MIPS processors, uma_small_alloc() and the pmap can't find
   free pages in the physical memory allocator's free page lists that are
   covered by the direct map.  Tested by: adrian

3. ttm_bo_global_init() and ttm_vm_page_alloc_dma32() can't find suitable
   free pages in the physical memory allocator's free page lists.

In the coming months, I expect that this new mechanism will be applied in
other places.  For example, balloon drivers should use relocation to
minimize fragmentation of the guest physical address space.

Make vm_phys_alloc_contig() a little smarter (and more efficient in some
cases).  Specifically, use vm_phys_segs[] earlier to avoid scanning free
page lists that can't possibly contain suitable pages.

Reviewed by:	kib, markj
Glanced at:	jhb
Discussed with:	jeff
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D4444
2015-12-19 18:42:50 +00:00
glebius
63cd1c131a A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES().
o With the new KPI consumers can request contiguous ranges of pages, and
  unlike before, all pages will be kept busied on return, as was
  previously done only for the 'reqpage'.  Now the reqpage goes away.  With
  the new interface it is easier to implement code protected from race
  conditions.

  Such arrayed requests for now should be preceded by a call to
  vm_pager_haspage() to make sure that the request is possible.  This
  could be improved later, making vm_pager_haspage() obsolete.

  Strengthening the promises on the busy state of the array of pages
  allows us to remove such hacks as swp_pager_free_nrpage() and
  vm_pager_free_nonreq().

o The new KPI accepts two integer pointers that may optionally point at
  values for the read-ahead and read-behind that a pager may do, if it
  can.  These pages are completely owned by the pager and are not
  controlled by the caller.

  This shifts the UFS-specific readahead logic from vm_fault.c, which
  should be file system agnostic, into vnode_pager.c. It also removes
  one VOP_BMAP() request per hard fault.

Discussed with:	kib, alc, jeff, scottl
Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2015-12-16 21:30:45 +00:00
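
A rough caller sketch under the new KPI as described above (argument
names and error handling are assumptions, not a copy of a real call
site):

	int rbehind, rahead;

	/* Probe first, as recommended above. */
	if (!vm_pager_has_page(object, pindex, &rbehind, &rahead))
		return (VM_PAGER_FAIL);

	/*
	 * All 'count' pages in ma[] come back busied; rbehind/rahead report
	 * how much extra read-behind/read-ahead the pager chose to do.
	 */
	error = vm_pager_get_pages(object, ma, count, &rbehind, &rahead);
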
dumbbell
08274e02c7 drm/ttm: Drain taskqueue if taskqueue_cancel_timeout() returned an error
Before, this was done if `pending` was true. This is not what the
manpage suggests and not what was done elsewhere in the same file.
2015-09-08 19:41:19 +00:00
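
In other words, roughly (a sketch of the pattern the manpage describes,
not the exact TTM call site):

	/*
	 * A non-zero return means the timeout task could not be cancelled
	 * (e.g. it is currently running), so wait for it to finish.
	 */
	if (taskqueue_cancel_timeout(tq, &timeout_task, NULL) != 0)
		taskqueue_drain_timeout(tq, &timeout_task);
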
avg
1aac2948e9 ttm_vm_page_alloc: use vm_page_alloc for pages without dma32 restriction
This change re-organizes code a little bit to extract common pieces
of ttm_alloc_new_pages() and ttm_get_pages() into dedicated functions.
Also, for requests without address restrictions, the regular vm_page_alloc()
is used.
Lastly, when vm_page_alloc_contig() fails, we call VM_WAIT before calling
vm_pageout_grow_cache() to ensure that there are enough free pages at all.

Reviewed by:	kib
MFC after:	15 days
2015-06-15 13:43:23 +00:00
dumbbell
e6c59cd5c0 drm: Update the device-independent code to match Linux 3.8.13
This update brings a few features:
    o  Support for the setmaster/dropmaster ioctls. For instance, they
       are used to run multiple X servers simultaneously.
    o  Support for minor devices. The only user-visible change is a new
       entry in /dev/dri but it is useless at the moment. This is a
       first step to support render nodes [1].

The main benefit is to greatly reduce the diff with Linux (at the
expense of an unreadable commit diff). Hopefully, next upgrades will be
easier.

No updates were made to the drivers, besides adapting them to API
changes.

[1] https://en.wikipedia.org/wiki/Direct_Rendering_Manager#Render_nodes

Tested by:	Many people
MFC after:	1 month
Relnotes:	yes
2015-03-17 18:50:33 +00:00
kib
360ba91b98 If vm_page_alloc_contig() failed in the ttm page allocators, do
what other callers of vm_page_alloc_contig() do: retry after
vm_pageout_grow_cache().

Sponsored by:	The FreeBSD Foundation
2015-02-03 13:45:06 +00:00
tijl
b445c32a66 Use default memory type for TTM buffer objects that may be cached.
MFC after:	1 week
2014-11-02 14:08:54 +00:00
tijl
d208d96586 Port the TTM AGP backend to the FreeBSD agp driver and enable AGP support
in the radeonkms driver.

Note: In PCI mode, virtual addresses on the graphics card that map to system
RAM are translated to physical addresses by the graphics card itself.  In
AGP mode, address translation is done by the AGP chipset, so fictitious
addresses appear on the system bus.  For CPU cache management to work
correctly when the CPU accesses this memory, it needs to use the same
fictitious addresses (and let the chipset translate them) instead of using
the physical addresses directly.

Reviewed by:	kib
MFC after:	1 month
2014-10-30 14:26:36 +00:00
royger
d0f808b6f9 drm: fix usage of vm_phys_fictitious_to_vm_page
vm_phys_fictitious_to_vm_page should not be called directly, even when
operating on a range that has been registered using
vm_phys_fictitious_reg_range. PHYS_TO_VM_PAGE should be used instead
because on arches that use VM_PHYSSEG_DENSE the page might come
directly from vm_page_array.

Reported by: nwhitehorn
Tested by: nwhitehorn, David Mackay <davidm.jx8p@gmail.com>
Sponsored by: Citrix Systems R&D
2014-08-06 17:45:59 +00:00
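
In sketch form, the change amounts to:

	/* Only valid for ranges backed purely by fictitious pages: */
	m = vm_phys_fictitious_to_vm_page(pa);

	/*
	 * Preferred: also correct on VM_PHYSSEG_DENSE arches, where the
	 * page may come straight from vm_page_array.
	 */
	m = PHYS_TO_VM_PAGE(pa);
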
attilio
2802c525ad - Modify vm_page_unwire() and vm_page_enqueue() to directly accept
the queue onto which pages that are going to be unwired should be
enqueued.
- Add stronger checks to the enqueue/dequeue paths for the pagequeues
  when adding pages to and removing pages from them.

Of course, for unmanaged pages the queue parameter of vm_page_unwire() will
be ignored, just as the 'active' parameter is today.
This makes adding new pagequeues quicker.

This change effectively modifies the KPI.  __FreeBSD_version will,
however, be bumped only when the full cache of free pages is evicted.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
Tested by:	pho
2014-06-16 18:15:27 +00:00
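
A minimal before/after sketch of the KPI change (the chosen queue is
illustrative):

	/* Before: a boolean-style 'activate' argument; 0 meant the inactive queue. */
	vm_page_unwire(m, 0);

	/*
	 * After: the target queue is named explicitly; PQ_NONE skips
	 * enqueueing, and the argument is ignored for unmanaged pages.
	 */
	vm_page_unwire(m, PQ_INACTIVE);
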
dumbbell
33f08bf94d drm/ttm, drm/radeon: Replace EINTR/ERESTART by ERESTARTSYS...
... for msleep/cv_*wait() return values, where wait_event*() is used
on Linux. ERESTARTSYS is the return code expected by callers when the
operation was interrupted.

For instance, this is the case for radeon_cs_ioctl() (radeon_cs.c): if
an error occurs and the code isn't ERESTARTSYS (e.g., EINTR), it logs an
error.

Note that ERESTARTSYS is defined as ERESTART, but this keeps callers'
code close to Linux.

Submitted by:	avg@ (previous version)
2013-12-21 15:40:36 +00:00
avg
16415009ec ttm_bo_vm_lookup_rb: actually make use of the red-black tree
Previously the code would just iterate over the whole tree as if it were
just a list.

Without this change I would observe X server becoming more and more
jerky over time.

MFC after:	5 days
2013-12-19 12:00:48 +00:00
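
A self-contained sketch of the idea, using illustrative types rather
than the actual TTM structures: descend the tree by comparing the
requested offset against each node's range instead of visiting every
node.

	struct node {
		unsigned long start;		/* first page offset of the object */
		unsigned long num_pages;	/* pages covered by the object */
		struct node *left, *right;	/* tree linkage, keyed by start */
	};

	static struct node *
	lookup(struct node *root, unsigned long page_start)
	{
		while (root != NULL) {
			if (page_start < root->start)
				root = root->left;
			else if (page_start >= root->start + root->num_pages)
				root = root->right;
			else
				return (root);	/* offset is inside this object */
		}
		return (NULL);
	}
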
eadler
44c01df173 Fix undefined behavior: (1 << 31) is not defined, as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31), which gets the
expected result.

This fix is not ideal, as it assumes a 32-bit int, but it does fix the issue
for most cases.

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
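
For illustration (the macro names are made up):

	#define	EXAMPLE_FLAG_BAD	(1 << 31)	/* UB: shifts into the sign bit of int */
	#define	EXAMPLE_FLAG_OK		(1U << 31)	/* OK: unsigned shift (assumes a 32-bit int) */
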
jkim
8fcf363f03 Partially revert r254880. The bitmap operations actually use long type now. 2013-08-29 22:46:21 +00:00
dumbbell
df84fc5689 drm: Use the new drm_atomic.h, following the merge of projects/atomic64
Submitted by:	jkim@
2013-08-25 15:38:16 +00:00
dumbbell
34063ddb05 drm/ttm: Remove unused VM_ALLOC_DMA32 define 2013-08-25 15:33:17 +00:00
dumbbell
3e5001b6b2 drm/ttm: Fix a reversed condition and add missing locks
This allows OpenGL applications to run on at least two test machines
with the Radeon driver.

Approved by:	kib@
2013-08-25 15:29:23 +00:00
dumbbell
b01e0cf4b5 drm/ttm: Fix style in ttm_bo_release_mmap() 2013-08-25 15:26:45 +00:00
dumbbell
bcae8cdece drm/ttm: Fix unmap of buffer object
Add a new ttm_bo_release_mmap() function to unmap pages in a
vm_object_t. Pages are freed when the buffer object is later released.

This function is called in ttm_bo_unmap_virtual_locked(), replacing
Linux' unmap_mapping_range(). In particular this is called when a buffer
object is about to be moved, so that its mapping is invalidated.

However, we don't use this function in ttm_bo_vm_dtor(), because the
vm_object_t is already marked as OBJ_DEAD and the pages will be
unmapped.

Approved by:	kib@
2013-08-25 15:15:55 +00:00
dumbbell
138dd96bce ttm: "to_page->valid = VM_PAGE_BITS_ALL" before vm_page_dirty(to_page)
Approved by:	kib@
2013-08-25 15:12:26 +00:00
dumbbell
56e71221b6 drm/ttm: Improve comment in ttm_bo_vm_ctor() about lack of ref acquisition
Approved by:	kib@
2013-08-25 15:06:48 +00:00
dumbbell
0461a5aeca drm/ttm: When removing a range of pages from a pool, remove all of them
Submitted by:	Mark Kettenis and Jonathan Gray from OpenBSD
Approved by:	kib@
2013-08-25 15:05:22 +00:00
dumbbell
22072c43ad drm/ttm: Fix style errors 2013-08-25 15:01:35 +00:00
dumbbell
4e3539e87c drm/ttm: Make ttm_bo_wait() call uninterruptible in page fault handler
This fixes a crash where a SIGALRM, heavily used by X.Org, would
interrupt the wait, causing the page fault to fail and the "Xorg"
process to receive a SIGSEGV.

Approved by:	kib@
2013-08-25 15:00:48 +00:00
dumbbell
beb0153eed drm/ttm: Import Linux commit ff7c60c580d9722f820d85c9c58ca55ecc1ee7c4
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Jan 14 15:08:14 2013 +0100

    drm/ttm: fix fence locking in ttm_buffer_object_transfer, 2nd try

    This fixes up

    commit e8e89622ed361c46bf90ba4828e685a8b603f7e5
    Author: Daniel Vetter <daniel.vetter@ffwll.ch>
    Date:   Tue Dec 18 22:25:11 2012 +0100

        drm/ttm: fix fence locking in ttm_buffer_object_transfer

    which leaves behind a might_sleep in atomic context, since the
    fence_lock spinlock is held over a kmalloc(GFP_KERNEL) call. The fix
    is to revert the above commit and only take the lock where we need it,
    around the call to ->sync_obj_ref.

    v2: Fixup things noticed by Maarten Lankhorst:
    - Brown paper bag locking bug.
    - No need for kzalloc if we clear the entire thing on the next line.
    - check for bo->sync_obj (totally unlikely race, but still someone
      else could have snuck in) and clear fbo->sync_obj if it's cleared
      already.

    Reported-by: Dave Airlie <airlied@gmail.com>
    Cc: Jerome Glisse <jglisse@redhat.com>
    Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

Approved by:	kib@
2013-08-25 14:58:44 +00:00
dumbbell
3f9b476839 drm/ttm: Import Linux commit 014b34409fb2015f63663b6cafdf557fdf289628
Author: Dave Airlie <airlied@gmail.com>
Date:   Wed Jan 16 15:58:34 2013 +1000

    ttm: on move memory failure don't leave a node dangling

    if we have a move notify callback, when moving fails, we call move notify
    the opposite way around, however this ends up with *mem containing the mm_node
    from the bo, which means we double free it. This is a follow on to the previous
    fix.

    Reviewed-by: Jerome Glisse <jglisse@redhat.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

Approved by:	kib@
2013-08-25 14:56:14 +00:00
dumbbell
419380e66b drm/ttm: Import Linux commit 630541863b29f88c7ab34e647758344e4cd1eafd
Author: Dave Airlie <airlied@gmail.com>
Date:   Wed Jan 16 14:25:44 2013 +1000

    ttm: don't destroy old mm_node on memcpy failure

    When we are using memcpy to move objects around, and we fail to memcpy
    due to lack of memory to populate or failure to finish the copy, we don't
    want to destroy the mm_node that has been copied into old_copy.

    While working on a new kms driver that uses memcpy, if I overallocated bo's
    up to the memory limits, and eviction failed, then machine would oops soon
    after due to having an active bo with an already freed drm_mm embedded in it,
    freeing it a second time didn't end well.

    Reviewed-by: Jerome Glisse <jglisse@redhat.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

Approved by:	kib@
2013-08-25 14:55:08 +00:00
dumbbell
7c851436c8 drm/ttm: Import Linux commit cc4c0c4de3c775be22072ec3251f2e581b63d9a0
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date:   Tue Jan 15 14:57:28 2013 +0100

    drm/ttm: unexport ttm_bo_wait_unreserved

    All legitimate users of this function outside ttm_bo.c are gone, now
    it's only an implementation detail.

    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>

Approved by:	kib@
2013-08-25 14:53:39 +00:00
dumbbell
694c0fc1ca drm/ttm: Import Linux commit f2d476a110bc24fde008698ae9018c99e803e25c
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date:   Tue Jan 15 14:57:10 2013 +0100

    drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers, v2

    This requires re-use of the seqno, which increases fairness slightly.
    Instead of spinning with a new seqno every time we keep the current one,
    but still drop all other reservations we hold. Only when we succeed,
    we try to get back our other reservations again.

    This should increase fairness slightly as well.

    Changes since v1:
     - Increase val_seq before calling ttm_bo_reserve_slowpath_nolru and
       retrying to take all entries to prevent a race.

    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>

Approved by:	kib@
2013-08-25 14:52:20 +00:00
dumbbell
03fccc43b4 drm/ttm: Import Linux commit 5e45d7dfd74100d622f9cdc70bfd1f9fae1671de
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date:   Tue Jan 15 14:57:05 2013 +0100

    drm/ttm: add ttm_bo_reserve_slowpath

    Instead of dropping everything, waiting for the bo to be unreserved
    and trying over, a better strategy would be to do a blocking wait.

    This can be mapped a lot better to a mutex_lock-like call.

    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>

Approved by:	kib@
2013-08-25 14:47:22 +00:00
dumbbell
08c07dcd29 drm/ttm: Import Linux commit 7a1863084c9d90ce4b67d645bf9b0f1612e68f62
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date:   Tue Jan 15 14:56:48 2013 +0100

    drm/ttm: cleanup ttm_eu_reserve_buffers handling

    With the lru lock no longer required for protecting reservations we
    can just do a ttm_bo_reserve_nolru on -EBUSY, and handle all errors
    in a single path.

    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-08-25 14:41:22 +00:00
dumbbell
52ccac1de9 drm/ttm: Import Linux commit 63d0a4195560362e2e00a3ad38fc331d34e1da9b
Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Date:   Tue Jan 15 14:56:37 2013 +0100

    drm/ttm: remove lru_lock around ttm_bo_reserve

    There should no longer be assumptions that reserve will always succeed
    with the lru lock held, so we can safely break the whole atomic
    reserve/lru thing. As a bonus this fixes most lockdep annotations for
    reservations.

    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
    Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-08-25 14:39:51 +00:00
kib
ba12eedccd Remove the deprecated VM_ALLOC_RETRY flag for vm_page_grab(9).
The flag has been mandatory since r209792, where vm_page_grab(9) was
changed to only support the alloc-retry semantics.

Suggested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
2013-08-22 07:39:53 +00:00
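
In sketch form:

	/* Before: callers had to pass the (by then meaningless) flag. */
	m = vm_page_grab(object, pindex, VM_ALLOC_NORMAL | VM_ALLOC_RETRY);

	/* After: the retry semantic is implied, so the flag is gone. */
	m = vm_page_grab(object, pindex, VM_ALLOC_NORMAL);
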
kib
4675fcfce0 Different consumers of struct vm_page abuse the pageq member to keep
additional information when the page is guaranteed to not belong to a
paging queue.  Usually, this results in a lot of type casts, which make
reasoning about the correctness of the code harder.

Sometimes m->object is used instead of pageq, which could cause real
and confusing bugs if a non-NULL m->object is leaked.  See r141955 and
r253140 for examples.

Change the pageq member into a union containing explicitly-typed
members.  Use them instead of type-punning or abusing m->object in the
x86 pmaps, uma, and vm_page_alloc_contig().

Requested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
2013-08-10 17:36:42 +00:00
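
A minimal sketch of the approach (the field and type names here are
illustrative assumptions, not the exact vm_page layout):

	#include <sys/queue.h>

	struct page {
		union {
			TAILQ_ENTRY(page)	q;	/* while on a paging queue */
			SLIST_ENTRY(page)	ss;	/* on a consumer's private list */
			void			*priv;	/* consumer-private pointer */
		} plinks;
	};
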
attilio
e9f37cac74 On all architectures, avoid preallocating the physical memory
for nodes used in vm_radix.
On architectures supporting direct mapping, also avoid preallocating
the KVA for such nodes.

In order to do so, allow the operations derived from vm_radix_insert()
to fail, and handle all the resulting failures.

On the vm_radix side, introduce a new function called vm_radix_replace(),
which can replace an already present leaf node with a new one,
and take into account the possibility, during vm_radix_insert()
allocation, that the operations on the radix trie can recurse.
This means that if the operations in vm_radix_insert() recursed,
vm_radix_insert() will start from scratch again.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc (older version)
Reviewed by:	jeff
Tested by:	pho, scottl
2013-08-09 11:28:55 +00:00
attilio
16c7563cf4 The soft and hard busy mechanisms rely on the vm object lock to work.
Unify the two concepts into a real, minimal sxlock where the shared
acquisition represents the soft busy and the exclusive acquisition
represents the hard busy.
The old VPO_WANTED mechanism becomes the hard path for this new lock,
and it becomes per-page rather than per-object.
The vm_object lock becomes an interlock for this functionality:
it can be held in either read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumption about the busy state unless it is also busying it.

Also:
- Add a new flag to directly share-busy pages while vm_page_alloc
  and vm_page_grab are being executed.  This will be very helpful
  once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag.

The KPI is heavily changed; this is why the version is bumped.
It is very likely that some users of the VM ports will need to change
their own code.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff, kib
Tested by:	gavin, bapt (older version)
Tested by:	pho, scottl
2013-08-09 11:11:11 +00:00
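
A hedged sketch of the resulting per-page interface, as I read the
description above (names reflect the post-change tree; error handling
and the object interlock are omitted):

	/* Soft busy == shared acquisition. */
	vm_page_sbusy(m);
	/* ... read-only use of the page ... */
	vm_page_sunbusy(m);

	/* Hard busy == exclusive acquisition. */
	vm_page_xbusy(m);
	/* ... modify the page ... */
	vm_page_xunbusy(m);
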
jeff
de4ecca213 Replace kernel virtual address space allocation with vmem.  This provides
transparent layering and better fragmentation behavior.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00