1998-11-04 00:29:01 +00:00
|
|
|
/*-
|
|
|
|
* Copyright (c) 1998 Michael Smith <msmith@freebsd.org>
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
* Copyright 2015 Toomas Soome <tsoome@me.com>
|
1998-11-04 00:29:01 +00:00
|
|
|
* All rights reserved.
|
|
|
|
*
|
|
|
|
* Redistribution and use in source and binary forms, with or without
|
|
|
|
* modification, are permitted provided that the following conditions
|
|
|
|
* are met:
|
|
|
|
* 1. Redistributions of source code must retain the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer.
|
|
|
|
* 2. Redistributions in binary form must reproduce the above copyright
|
|
|
|
* notice, this list of conditions and the following disclaimer in the
|
|
|
|
* documentation and/or other materials provided with the distribution.
|
|
|
|
*
|
|
|
|
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
|
|
|
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
|
|
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
|
|
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
|
|
|
|
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
|
|
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
|
|
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
|
|
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
|
|
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
|
|
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
|
|
* SUCH DAMAGE.
|
1998-11-02 23:28:11 +00:00
|
|
|
*/
|
|
|
|
|
2003-08-25 23:30:41 +00:00
|
|
|
#include <sys/cdefs.h>
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
#include <sys/param.h>
|
2003-08-25 23:30:41 +00:00
|
|
|
__FBSDID("$FreeBSD$");
|
|
|
|
|
1998-11-02 23:28:11 +00:00
|
|
|
/*
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
* Simple hashed block cache
|
1998-11-02 23:28:11 +00:00
|
|
|
*/
|
|
|
|
|
2004-10-03 16:34:01 +00:00
|
|
|
#include <sys/stdint.h>
|
|
|
|
|
1998-11-02 23:28:11 +00:00
|
|
|
#include <stand.h>
|
|
|
|
#include <string.h>
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
#include <strings.h>
|
1998-11-02 23:28:11 +00:00
|
|
|
|
|
|
|
#include "bootstrap.h"
|
|
|
|
|
|
|
|
/* #define BCACHE_DEBUG */
|
|
|
|
|
|
|
|
#ifdef BCACHE_DEBUG
|
2019-03-12 16:21:39 +00:00
|
|
|
# define DPRINTF(fmt, args...) printf("%s: " fmt "\n" , __func__ , ## args)
|
1998-11-02 23:28:11 +00:00
|
|
|
#else
|
2019-05-07 07:46:40 +00:00
|
|
|
# define DPRINTF(fmt, args...) ((void)0)
|
1998-11-02 23:28:11 +00:00
|
|
|
#endif
|
|
|
|
|
|
|
|
struct bcachectl
|
|
|
|
{
|
|
|
|
daddr_t bc_blkno;
|
1998-11-02 23:50:59 +00:00
|
|
|
int bc_count;
|
1998-11-02 23:28:11 +00:00
|
|
|
};
|
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
/*
|
|
|
|
* bcache per device node. cache is allocated on device first open and freed
|
|
|
|
* on last close, to save memory. The issue there is the size; biosdisk
|
|
|
|
* supports up to 31 (0x1f) devices. Classic setup would use single disk
|
|
|
|
* to boot from, but this has changed with zfs.
|
|
|
|
*/
|
|
|
|
struct bcache {
|
|
|
|
struct bcachectl *bcache_ctl;
|
|
|
|
caddr_t bcache_data;
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
size_t bcache_nblks;
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
size_t ra;
|
|
|
|
};
|
1998-11-02 23:28:11 +00:00
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
static u_int bcache_total_nblks; /* set by bcache_init */
|
|
|
|
static u_int bcache_blksize; /* set by bcache_init */
|
|
|
|
static u_int bcache_numdev; /* set by bcache_add_dev */
|
|
|
|
/* statistics */
|
|
|
|
static u_int bcache_units; /* number of devices with cache */
|
|
|
|
static u_int bcache_unit_nblks; /* nblocks per unit */
|
|
|
|
static u_int bcache_hits;
|
|
|
|
static u_int bcache_misses;
|
|
|
|
static u_int bcache_ops;
|
|
|
|
static u_int bcache_bypasses;
|
|
|
|
static u_int bcache_bcount;
|
|
|
|
static u_int bcache_rablks;
|
|
|
|
|
|
|
|
#define BHASH(bc, blkno) ((blkno) & ((bc)->bcache_nblks - 1))
|
|
|
|
#define BCACHE_LOOKUP(bc, blkno) \
|
|
|
|
((bc)->bcache_ctl[BHASH((bc), (blkno))].bc_blkno != (blkno))
|
|
|
|
#define BCACHE_READAHEAD 256
|
|
|
|
#define BCACHE_MINREADAHEAD 32
|
|
|
|
|
|
|
|
static void bcache_invalidate(struct bcache *bc, daddr_t blkno);
|
|
|
|
static void bcache_insert(struct bcache *bc, daddr_t blkno);
|
|
|
|
static void bcache_free_instance(struct bcache *bc);
|
1998-11-02 23:28:11 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Initialise the cache for (nblks) of (bsize).
|
|
|
|
*/
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
void
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
bcache_init(size_t nblks, size_t bsize)
|
1998-11-02 23:28:11 +00:00
|
|
|
{
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
/* set up control data */
|
|
|
|
bcache_total_nblks = nblks;
|
1998-11-02 23:28:11 +00:00
|
|
|
bcache_blksize = bsize;
|
2000-03-15 01:56:12 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
* add number of devices to bcache. we have to divide cache space
|
|
|
|
* between the devices, so bcache_add_dev() can be used to set up the
|
|
|
|
* number. The issue is, we need to get the number before actual allocations.
|
|
|
|
* bcache_add_dev() is supposed to be called from device init() call, so the
|
|
|
|
* assumption is, devsw dv_init is called for plain devices first, and
|
|
|
|
* for zfs, last.
|
2000-03-15 01:56:12 +00:00
|
|
|
*/
|
|
|
|
void
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bcache_add_dev(int devices)
|
2000-03-15 01:56:12 +00:00
|
|
|
{
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bcache_numdev += devices;
|
|
|
|
}
|
|
|
|
|
|
|
|
void *
|
|
|
|
bcache_allocate(void)
|
|
|
|
{
|
|
|
|
u_int i;
|
|
|
|
struct bcache *bc = malloc(sizeof (struct bcache));
|
|
|
|
int disks = bcache_numdev;
|
|
|
|
|
|
|
|
if (disks == 0)
|
|
|
|
disks = 1; /* safe guard */
|
|
|
|
|
|
|
|
if (bc == NULL) {
|
|
|
|
errno = ENOMEM;
|
|
|
|
return (bc);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* the bcache block count must be power of 2 for hash function
|
|
|
|
*/
|
|
|
|
i = fls(disks) - 1; /* highbit - 1 */
|
|
|
|
if (disks > (1 << i)) /* next power of 2 */
|
|
|
|
i++;
|
|
|
|
|
|
|
|
bc->bcache_nblks = bcache_total_nblks >> i;
|
|
|
|
bcache_unit_nblks = bc->bcache_nblks;
|
2018-11-23 22:36:56 +00:00
|
|
|
bc->bcache_data = malloc(bc->bcache_nblks * bcache_blksize);
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (bc->bcache_data == NULL) {
|
|
|
|
/* dont error out yet. fall back to 32 blocks and try again */
|
|
|
|
bc->bcache_nblks = 32;
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
bc->bcache_data = malloc(bc->bcache_nblks * bcache_blksize +
|
|
|
|
sizeof(uint32_t));
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
bc->bcache_ctl = malloc(bc->bcache_nblks * sizeof(struct bcachectl));
|
2000-03-15 01:56:12 +00:00
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if ((bc->bcache_data == NULL) || (bc->bcache_ctl == NULL)) {
|
|
|
|
bcache_free_instance(bc);
|
|
|
|
errno = ENOMEM;
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
return (NULL);
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
}
|
2000-03-15 01:56:12 +00:00
|
|
|
|
|
|
|
/* Flush the cache */
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
for (i = 0; i < bc->bcache_nblks; i++) {
|
|
|
|
bc->bcache_ctl[i].bc_count = -1;
|
|
|
|
bc->bcache_ctl[i].bc_blkno = -1;
|
1998-11-19 18:12:03 +00:00
|
|
|
}
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bcache_units++;
|
|
|
|
bc->ra = BCACHE_READAHEAD; /* optimistic read ahead */
|
|
|
|
return (bc);
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
bcache_free(void *cache)
|
|
|
|
{
|
|
|
|
struct bcache *bc = cache;
|
|
|
|
|
|
|
|
if (bc == NULL)
|
|
|
|
return;
|
|
|
|
|
|
|
|
bcache_free_instance(bc);
|
|
|
|
bcache_units--;
|
1998-11-02 23:28:11 +00:00
|
|
|
}
|
|
|
|
|
2001-12-11 00:10:00 +00:00
|
|
|
/*
|
|
|
|
* Handle a write request; write directly to the disk, and populate the
|
|
|
|
* cache with the new values.
|
|
|
|
*/
|
|
|
|
static int
|
2016-12-30 19:06:29 +00:00
|
|
|
write_strategy(void *devdata, int rw, daddr_t blk, size_t size,
|
|
|
|
char *buf, size_t *rsize)
|
2001-12-11 00:10:00 +00:00
|
|
|
{
|
|
|
|
struct bcache_devdata *dd = (struct bcache_devdata *)devdata;
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
struct bcache *bc = dd->dv_cache;
|
2001-12-11 00:10:00 +00:00
|
|
|
daddr_t i, nblk;
|
|
|
|
|
|
|
|
nblk = size / bcache_blksize;
|
|
|
|
|
|
|
|
/* Invalidate the blocks being written */
|
|
|
|
for (i = 0; i < nblk; i++) {
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bcache_invalidate(bc, blk + i);
|
2001-12-11 00:10:00 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Write the blocks */
|
2016-12-30 19:06:29 +00:00
|
|
|
return (dd->dv_strategy(dd->dv_devdata, rw, blk, size, buf, rsize));
|
2001-12-11 00:10:00 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Handle a read request; fill in parts of the request that can
|
1998-11-02 23:28:11 +00:00
|
|
|
* be satisfied by the cache, use the supplied strategy routine to do
|
|
|
|
* device I/O and then use the I/O results to populate the cache.
|
|
|
|
*/
|
2001-12-11 00:10:00 +00:00
|
|
|
static int
|
2016-12-30 19:06:29 +00:00
|
|
|
read_strategy(void *devdata, int rw, daddr_t blk, size_t size,
|
|
|
|
char *buf, size_t *rsize)
|
1998-11-02 23:28:11 +00:00
|
|
|
{
|
|
|
|
struct bcache_devdata *dd = (struct bcache_devdata *)devdata;
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
struct bcache *bc = dd->dv_cache;
|
|
|
|
size_t i, nblk, p_size, r_size, complete, ra;
|
|
|
|
int result;
|
|
|
|
daddr_t p_blk;
|
1998-11-02 23:28:11 +00:00
|
|
|
caddr_t p_buf;
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (bc == NULL) {
|
|
|
|
errno = ENODEV;
|
|
|
|
return (-1);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (rsize != NULL)
|
|
|
|
*rsize = 0;
|
|
|
|
|
1998-11-02 23:28:11 +00:00
|
|
|
nblk = size / bcache_blksize;
|
2016-12-30 19:06:29 +00:00
|
|
|
if (nblk == 0 && size != 0)
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
nblk++;
|
1998-11-02 23:28:11 +00:00
|
|
|
result = 0;
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
complete = 1;
|
1998-11-02 23:28:11 +00:00
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
/* Satisfy any cache hits up front, break on first miss */
|
1998-11-02 23:28:11 +00:00
|
|
|
for (i = 0; i < nblk; i++) {
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (BCACHE_LOOKUP(bc, (daddr_t)(blk + i))) {
|
|
|
|
bcache_misses += (nblk - i);
|
|
|
|
complete = 0;
|
|
|
|
if (nblk - i > BCACHE_MINREADAHEAD && bc->ra > BCACHE_MINREADAHEAD)
|
|
|
|
bc->ra >>= 1; /* reduce read ahead */
|
|
|
|
break;
|
1998-11-02 23:28:11 +00:00
|
|
|
} else {
|
|
|
|
bcache_hits++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (complete) { /* whole set was in cache, return it */
|
|
|
|
if (bc->ra < BCACHE_READAHEAD)
|
|
|
|
bc->ra <<= 1; /* increase read ahead */
|
2016-12-30 19:06:29 +00:00
|
|
|
bcopy(bc->bcache_data + (bcache_blksize * BHASH(bc, blk)), buf, size);
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Fill in any misses. From check we have i pointing to first missing
|
|
|
|
* block, read in all remaining blocks + readahead.
|
|
|
|
* We have space at least for nblk - i before bcache wraps.
|
|
|
|
*/
|
|
|
|
p_blk = blk + i;
|
|
|
|
p_buf = bc->bcache_data + (bcache_blksize * BHASH(bc, p_blk));
|
|
|
|
r_size = bc->bcache_nblks - BHASH(bc, p_blk); /* remaining blocks */
|
|
|
|
|
|
|
|
p_size = MIN(r_size, nblk - i); /* read at least those blocks */
|
|
|
|
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
/*
|
|
|
|
* The read ahead size setup.
|
|
|
|
* While the read ahead can save us IO, it also can complicate things:
|
|
|
|
* 1. We do not want to read ahead by wrapping around the
|
|
|
|
* bcache end - this would complicate the cache management.
|
|
|
|
* 2. We are using bc->ra as dynamic hint for read ahead size,
|
|
|
|
* detected cache hits will increase the read-ahead block count, and
|
|
|
|
* misses will decrease, see the code above.
|
|
|
|
* 3. The bcache is sized by 512B blocks, however, the underlying device
|
|
|
|
* may have a larger sector size, and we should perform the IO by
|
|
|
|
* taking into account these larger sector sizes. We could solve this by
|
|
|
|
* passing the sector size to bcache_allocate(), or by using ioctl(), but
|
|
|
|
* in this version we are using the constant, 16 blocks, and are rounding
|
|
|
|
* read ahead block count down to multiple of 16.
|
|
|
|
* Using the constant has two reasons, we are not entirely sure if the
|
|
|
|
* BIOS disk interface is providing the correct value for sector size.
|
|
|
|
* And secondly, this way we get the most conservative setup for the ra.
|
|
|
|
*
|
|
|
|
* The selection of multiple of 16 blocks (8KB) is quite arbitrary, however,
|
2017-02-08 18:32:53 +00:00
|
|
|
* we want to cover CDs (2K) and 4K disks.
|
|
|
|
* bcache_allocate() will always fall back to a minimum of 32 blocks.
|
|
|
|
* Our choice of 16 read ahead blocks will always fit inside the bcache.
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
*/
|
|
|
|
|
2017-04-06 15:57:53 +00:00
|
|
|
if ((rw & F_NORA) == F_NORA)
|
|
|
|
ra = 0;
|
|
|
|
else
|
|
|
|
ra = bc->bcache_nblks - BHASH(bc, p_blk + p_size);
|
|
|
|
|
loader: bcache read ahead block count should take account the large sectors
The loader bcache is implementing simple read-ahead to boost the cache.
The bcache is built based on 512B block sizes, and the read ahead is attempting
to read number of cache blocks, based on amount of the free bcache space.
However, there are devices using larger sector sizes than 512B, most obviously
the CD media is based on 2k sectors. This means the read-ahead can not be just
random number of blocks, but we should use value suitable also for use with
larger sectors, as for example, with CD devices, we should read multiple of 2KB.
Since the sector size from disk interface is not too reliable, i guess we can
just use "good enough" value, so the implementation is rounding down the read
ahead block count to be multiple of 16.
This means we have covered sector sizes to 8k.
In addition, the update does implement the end of cache marker, to help to
detect the possible memory corruption - I have not seen it happening so far,
but it does not hurt to have the detection mechanism in place.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9179
2017-02-06 08:58:40 +00:00
|
|
|
if (ra != 0 && ra != bc->bcache_nblks) { /* do we have RA space? */
|
|
|
|
ra = MIN(bc->ra, ra - 1);
|
|
|
|
ra = rounddown(ra, 16); /* multiple of 16 blocks */
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
p_size += ra;
|
1998-11-02 23:28:11 +00:00
|
|
|
}
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
|
|
|
|
/* invalidate bcache */
|
|
|
|
for (i = 0; i < p_size; i++) {
|
|
|
|
bcache_invalidate(bc, p_blk + i);
|
1998-11-02 23:28:11 +00:00
|
|
|
}
|
2016-05-01 21:06:59 +00:00
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
r_size = 0;
|
2016-05-01 21:06:59 +00:00
|
|
|
/*
|
|
|
|
* with read-ahead, it may happen we are attempting to read past
|
|
|
|
* disk end, as bcache has no information about disk size.
|
|
|
|
* in such case we should get partial read if some blocks can be
|
|
|
|
* read or error, if no blocks can be read.
|
|
|
|
* in either case we should return the data in bcache and only
|
|
|
|
* return error if there is no data.
|
|
|
|
*/
|
2017-04-06 15:57:53 +00:00
|
|
|
rw &= F_MASK;
|
2016-12-30 19:06:29 +00:00
|
|
|
result = dd->dv_strategy(dd->dv_devdata, rw, p_blk,
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
p_size * bcache_blksize, p_buf, &r_size);
|
|
|
|
|
|
|
|
r_size /= bcache_blksize;
|
|
|
|
for (i = 0; i < r_size; i++)
|
|
|
|
bcache_insert(bc, p_blk + i);
|
|
|
|
|
2016-05-01 21:06:59 +00:00
|
|
|
/* update ra statistics */
|
|
|
|
if (r_size != 0) {
|
|
|
|
if (r_size < p_size)
|
|
|
|
bcache_rablks += (p_size - r_size);
|
|
|
|
else
|
|
|
|
bcache_rablks += ra;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* check how much data can we copy */
|
|
|
|
for (i = 0; i < nblk; i++) {
|
|
|
|
if (BCACHE_LOOKUP(bc, (daddr_t)(blk + i)))
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2016-07-30 17:45:56 +00:00
|
|
|
if (size > i * bcache_blksize)
|
|
|
|
size = i * bcache_blksize;
|
|
|
|
|
2016-05-01 21:06:59 +00:00
|
|
|
if (size != 0) {
|
2016-12-30 19:06:29 +00:00
|
|
|
bcopy(bc->bcache_data + (bcache_blksize * BHASH(bc, blk)), buf, size);
|
2016-05-01 21:06:59 +00:00
|
|
|
result = 0;
|
|
|
|
}
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
|
1998-11-02 23:28:11 +00:00
|
|
|
done:
|
|
|
|
if ((result == 0) && (rsize != NULL))
|
|
|
|
*rsize = size;
|
|
|
|
return(result);
|
|
|
|
}
|
|
|
|
|
2001-12-11 00:10:00 +00:00
|
|
|
/*
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
* Requests larger than 1/2 cache size will be bypassed and go
|
2001-12-11 00:10:00 +00:00
|
|
|
* directly to the disk. XXX tune this.
|
|
|
|
*/
|
|
|
|
int
|
2016-12-30 19:06:29 +00:00
|
|
|
bcache_strategy(void *devdata, int rw, daddr_t blk, size_t size,
|
|
|
|
char *buf, size_t *rsize)
|
2001-12-11 00:10:00 +00:00
|
|
|
{
|
|
|
|
struct bcache_devdata *dd = (struct bcache_devdata *)devdata;
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
struct bcache *bc = dd->dv_cache;
|
|
|
|
u_int bcache_nblks = 0;
|
|
|
|
int nblk, cblk, ret;
|
|
|
|
size_t csize, isize, total;
|
2001-12-11 00:10:00 +00:00
|
|
|
|
|
|
|
bcache_ops++;
|
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (bc != NULL)
|
|
|
|
bcache_nblks = bc->bcache_nblks;
|
2001-12-11 00:10:00 +00:00
|
|
|
|
|
|
|
/* bypass large requests, or when the cache is inactive */
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (bc == NULL ||
|
2016-12-30 19:06:29 +00:00
|
|
|
((size * 2 / bcache_blksize) > bcache_nblks)) {
|
2019-03-12 16:21:39 +00:00
|
|
|
DPRINTF("bypass %zu from %qu", size / bcache_blksize, blk);
|
2001-12-11 00:10:00 +00:00
|
|
|
bcache_bypasses++;
|
2017-04-06 15:57:53 +00:00
|
|
|
rw &= F_MASK;
|
2016-12-30 19:06:29 +00:00
|
|
|
return (dd->dv_strategy(dd->dv_devdata, rw, blk, size, buf, rsize));
|
2001-12-11 00:10:00 +00:00
|
|
|
}
|
|
|
|
|
2017-04-06 15:57:53 +00:00
|
|
|
switch (rw & F_MASK) {
|
2001-12-11 00:10:00 +00:00
|
|
|
case F_READ:
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
nblk = size / bcache_blksize;
|
2016-12-30 19:06:29 +00:00
|
|
|
if (size != 0 && nblk == 0)
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
nblk++; /* read at least one block */
|
|
|
|
|
|
|
|
ret = 0;
|
|
|
|
total = 0;
|
|
|
|
while(size) {
|
|
|
|
cblk = bcache_nblks - BHASH(bc, blk); /* # of blocks left */
|
|
|
|
cblk = MIN(cblk, nblk);
|
|
|
|
|
|
|
|
if (size <= bcache_blksize)
|
|
|
|
csize = size;
|
2016-12-30 19:06:29 +00:00
|
|
|
else
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
csize = cblk * bcache_blksize;
|
|
|
|
|
2016-12-30 19:06:29 +00:00
|
|
|
ret = read_strategy(devdata, rw, blk, csize, buf+total, &isize);
|
2016-05-01 21:06:59 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* we may have error from read ahead, if we have read some data
|
|
|
|
* return partial read.
|
|
|
|
*/
|
|
|
|
if (ret != 0 || isize == 0) {
|
|
|
|
if (total != 0)
|
|
|
|
ret = 0;
|
|
|
|
break;
|
|
|
|
}
|
2016-12-30 19:06:29 +00:00
|
|
|
blk += isize / bcache_blksize;
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
total += isize;
|
|
|
|
size -= isize;
|
|
|
|
nblk = size / bcache_blksize;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (rsize)
|
|
|
|
*rsize = total;
|
|
|
|
|
|
|
|
return (ret);
|
2001-12-11 00:10:00 +00:00
|
|
|
case F_WRITE:
|
2017-04-06 15:57:53 +00:00
|
|
|
return write_strategy(devdata, F_WRITE, blk, size, buf, rsize);
|
2001-12-11 00:10:00 +00:00
|
|
|
}
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
1998-11-02 23:28:11 +00:00
|
|
|
/*
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
* Free allocated bcache instance
|
1998-11-02 23:28:11 +00:00
|
|
|
*/
|
|
|
|
static void
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bcache_free_instance(struct bcache *bc)
|
1998-11-02 23:28:11 +00:00
|
|
|
{
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (bc != NULL) {
|
2019-05-07 08:14:30 +00:00
|
|
|
free(bc->bcache_ctl);
|
|
|
|
free(bc->bcache_data);
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
free(bc);
|
1998-11-02 23:28:11 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
* Insert a block into the cache.
|
1998-11-02 23:28:11 +00:00
|
|
|
*/
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
static void
|
|
|
|
bcache_insert(struct bcache *bc, daddr_t blkno)
|
1998-11-02 23:28:11 +00:00
|
|
|
{
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
u_int cand;
|
1998-11-02 23:28:11 +00:00
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
cand = BHASH(bc, blkno);
|
|
|
|
|
2019-03-12 16:21:39 +00:00
|
|
|
DPRINTF("insert blk %llu -> %u # %d", blkno, cand, bcache_bcount);
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bc->bcache_ctl[cand].bc_blkno = blkno;
|
|
|
|
bc->bcache_ctl[cand].bc_count = bcache_bcount++;
|
1998-11-02 23:28:11 +00:00
|
|
|
}
|
|
|
|
|
2001-12-11 00:10:00 +00:00
|
|
|
/*
|
|
|
|
* Invalidate a block from the cache.
|
|
|
|
*/
|
|
|
|
static void
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bcache_invalidate(struct bcache *bc, daddr_t blkno)
|
2001-12-11 00:10:00 +00:00
|
|
|
{
|
|
|
|
u_int i;
|
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
i = BHASH(bc, blkno);
|
|
|
|
if (bc->bcache_ctl[i].bc_blkno == blkno) {
|
|
|
|
bc->bcache_ctl[i].bc_count = -1;
|
|
|
|
bc->bcache_ctl[i].bc_blkno = -1;
|
2019-03-12 16:21:39 +00:00
|
|
|
DPRINTF("invalidate blk %llu", blkno);
|
2001-12-11 00:10:00 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
#ifndef BOOT2
|
1998-11-02 23:28:11 +00:00
|
|
|
COMMAND_SET(bcachestat, "bcachestat", "get disk block cache stats", command_bcache);
|
|
|
|
|
|
|
|
static int
|
2019-05-07 10:01:45 +00:00
|
|
|
command_bcache(int argc, char *argv[] __unused)
|
1998-11-02 23:28:11 +00:00
|
|
|
{
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
if (argc != 1) {
|
|
|
|
command_errmsg = "wrong number of arguments";
|
|
|
|
return(CMD_ERROR);
|
1998-11-02 23:28:11 +00:00
|
|
|
}
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
|
2018-11-29 14:21:01 +00:00
|
|
|
printf("\ncache blocks: %u\n", bcache_total_nblks);
|
|
|
|
printf("cache blocksz: %u\n", bcache_blksize);
|
|
|
|
printf("cache readahead: %u\n", bcache_rablks);
|
|
|
|
printf("unit cache blocks: %u\n", bcache_unit_nblks);
|
|
|
|
printf("cached units: %u\n", bcache_units);
|
|
|
|
printf("%u ops %d bypasses %u hits %u misses\n", bcache_ops,
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
bcache_bypasses, bcache_hits, bcache_misses);
|
1998-11-02 23:28:11 +00:00
|
|
|
return(CMD_OK);
|
|
|
|
}
|
A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
|
|
|
#endif
|