.\" Copyright (c) 1990, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\" This product includes software developed by the University of
.\" California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)btree.3 8.4 (Berkeley) 8/18/94
.\" $FreeBSD$
.\"
.Dd August 18, 1994
.Dt BTREE 3
.Os
.Sh NAME
.Nm btree
.Nd "btree database access method"
.Sh SYNOPSIS
.In sys/types.h
.In db.h
.Sh DESCRIPTION
The routine
.Fn dbopen
is the library interface to database files.
One of the supported file formats is
.Nm
files.
The general description of the database access methods is in
.Xr dbopen 3 ;
this manual page describes only the information specific to
.Nm .
.Pp
The
.Nm
data structure is a sorted, balanced tree structure storing
associated key/data pairs.
.Pp
The
.Nm
access method specific data structure provided to
.Fn dbopen
is defined in the
.In db.h
include file as follows:
.Bd -literal
typedef struct {
	u_long flags;
	u_int cachesize;
	int maxkeypage;
	int minkeypage;
	u_int psize;
	int (*compare)(const DBT *key1, const DBT *key2);
	size_t (*prefix)(const DBT *key1, const DBT *key2);
	int lorder;
} BTREEINFO;
.Ed
.Pp
The elements of this structure are as follows:
.Bl -tag -width indent
.It Va flags
The flag value is specified by
.Em or Ns 'ing
any of the following values:
.Bl -tag -width indent
.It Dv R_DUP
Permit duplicate keys in the tree, i.e., permit insertion if the key to be
inserted already exists in the tree.
The default behavior, as described in
.Xr dbopen 3 ,
is to overwrite a matching key when inserting a new key or to fail if
the
.Dv R_NOOVERWRITE
flag is specified.
The
.Dv R_DUP
flag is overridden by the
.Dv R_NOOVERWRITE
flag, and if the
.Dv R_NOOVERWRITE
flag is specified, attempts to insert duplicate keys into
the tree will fail.
.Pp
If the database contains duplicate keys, the order of retrieval of
key/data pairs is undefined if the
.Va get
routine is used; however,
.Va seq
routine calls with the
.Dv R_CURSOR
flag set will always return the logical
.Dq first
of any group of duplicate keys.
.El
.It Va cachesize
A suggested maximum size (in bytes) of the memory cache.
This value is
.Em only
advisory, and the access method will allocate more memory rather than fail.
Since every search examines the root page of the tree, caching the most
recently used pages substantially improves access time.
In addition, physical writes are delayed as long as possible, so a moderate
cache can reduce the number of I/O operations significantly.
Obviously, using a cache increases (but only increases) the likelihood of
corruption or lost data if the system crashes while a tree is being modified.
If
.Va cachesize
is 0 (no size is specified), a default cache is used.
.It Va maxkeypage
The maximum number of keys which will be stored on any single page.
Not currently implemented.
.\" The maximum number of keys which will be stored on any single page.
.\" Because of the way the
.\" .Nm
.\" data structure works,
.\" .Va maxkeypage
.\" must always be greater than or equal to 2.
.\" If
.\" .Va maxkeypage
.\" is 0 (no maximum number of keys is specified) the page fill factor is
.\" made as large as possible (which is almost invariably what is wanted).
.It Va minkeypage
The minimum number of keys which will be stored on any single page.
This value is used to determine which keys will be stored on overflow
pages; i.e., if a key or data item is longer than the page size divided
by the
.Va minkeypage
value, it will be stored on overflow pages instead
of in the page itself.
If
.Va minkeypage
is 0 (no minimum number of keys is specified), a value of 2 is used.
.It Va psize
Page size is the size (in bytes) of the pages used for nodes in the tree.
The minimum page size is 512 bytes and the maximum page size is 64K.
If
.Va psize
is 0 (no page size is specified), a page size is chosen based on the
underlying file system I/O block size.
.It Va compare
The
.Va compare
element is the key comparison function.
It must return an integer less than, equal to, or greater than zero if the
first key argument is considered to be respectively less than, equal to,
or greater than the second key argument.
The same comparison function must be used on a given tree every time it
is opened.
If
.Va compare
is
.Dv NULL
(no comparison function is specified), the keys are compared
lexically, with shorter keys considered less than longer keys.
.It Va prefix
The
.Va prefix
element
is the prefix comparison function.
If specified, this routine must return the number of bytes of the second key
argument which are necessary to determine that it is greater than the first
key argument.
If the keys are equal, the key length should be returned.
Note that the usefulness of this routine is very data dependent, but for some
data sets it can produce significantly reduced tree sizes and search times.
If
.Va prefix
is
.Dv NULL
(no prefix function is specified),
.Em and
no comparison function is specified, a default lexical comparison routine
is used.
If
.Va prefix
is
.Dv NULL
and a comparison routine is specified, no prefix comparison is
done.
.It Va lorder
The byte order for integers in the stored database metadata.
The number should represent the order as an integer; for example,
big endian order would be the number 4321.
If
.Va lorder
is 0 (no order is specified), the current host order is used.
.El
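.Pp
The contract described above for the
.Va compare
and
.Va prefix
elements can be illustrated with a minimal sketch.
It is not the library's own code: the names
.Fn example_compare
and
.Fn example_prefix
are hypothetical, and the
.Vt DBT
type is redeclared locally (with the
.Va data
and
.Va size
members it has in
.In db.h )
only so that the fragment compiles on its own.
The sketch assumes byte-wise ordering in which a key that is a proper
prefix of another key sorts first.
.Bd -literal
```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the DBT type from <db.h>, declared here only so that
 * the sketch is self-contained; real code would include <db.h>.
 */
typedef struct {
	void	*data;	/* pointer to the key bytes */
	size_t	 size;	/* length of the key in bytes */
} DBT;

/*
 * Hypothetical comparison routine: byte-wise order, with a key that
 * is a proper prefix of another key considered the lesser of the two.
 */
int
example_compare(const DBT *key1, const DBT *key2)
{
	size_t len = key1->size < key2->size ? key1->size : key2->size;
	int r = len == 0 ? 0 : memcmp(key1->data, key2->data, len);

	if (r != 0)
		return (r);
	if (key1->size == key2->size)
		return (0);
	return (key1->size < key2->size ? -1 : 1);
}

/*
 * Hypothetical prefix routine for keys ordered by example_compare().
 * It assumes key1 is not greater than key2 and returns how many
 * leading bytes of key2 are needed to tell the two keys apart.
 */
size_t
example_prefix(const DBT *key1, const DBT *key2)
{
	const unsigned char *p1 = key1->data;
	const unsigned char *p2 = key2->data;
	size_t len = key1->size < key2->size ? key1->size : key2->size;
	size_t i;

	for (i = 0; i < len; i++)
		if (p1[i] != p2[i])
			return (i + 1);	/* first differing byte decides */
	/* key1 is a prefix of key2: one more byte, or the length if equal. */
	return (key1->size < key2->size ? key1->size + 1 : key1->size);
}
```
.Ed
.Pp
Under these assumptions, the two functions would be installed by assigning
them to the
.Va compare
and
.Va prefix
members of a
.Vt BTREEINFO
structure before it is passed to
.Fn dbopen .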
.Pp
If the file already exists (and the
.Dv O_TRUNC
flag is not specified), the
values specified for the
.Va flags , lorder ,
and
.Va psize
elements
are ignored
in favor of the values used when the tree was created.
.Pp
Forward sequential scans of a tree are from the least key to the greatest.
.Pp
Space freed up by deleting key/data pairs from the tree is never returned
to the file system, although it is normally made available for reuse.
This means that the
.Nm
storage structure is grow-only.
The only solutions are to avoid excessive deletions, or to create a fresh
tree periodically from a scan of an existing one.
.Pp
Searches, insertions, and deletions in a
.Nm
will all complete in
O(lg base N) time, where N is the number of keys in the tree and base,
the base of the logarithm, is the average fill factor.
Often, inserting ordered data into
.Nm Ns s
results in a low fill factor.
This implementation has been modified to make ordered insertion the best
case, resulting in a much better than normal page fill factor.
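.Pp
As a rough illustration of the logarithmic bound, the following sketch
estimates how many page levels are needed to reach N keys when every
page holds the same number of keys.
The function name
.Fn example_levels
and the uniform fill factor are assumptions made for the example;
the real page layout is not modelled.
.Bd -literal
```c
/*
 * Hypothetical helper: the smallest number of levels h (at least 1)
 * such that fill multiplied by itself h times reaches nkeys, i.e. the
 * logarithm of nkeys to base fill, rounded up.
 * Assumes fill >= 2.
 */
unsigned int
example_levels(unsigned long nkeys, unsigned long fill)
{
	unsigned int levels = 1;
	unsigned long reach = fill;	/* keys reachable through one level */

	while (reach < nkeys) {
		levels++;
		/* Stop before the multiplication could overflow. */
		if (reach > nkeys / fill)
			break;
		reach *= fill;
	}
	return (levels);
}
```
.Ed
.Pp
With these assumptions, one million keys need three levels at a fill
factor of 100 but six levels at a fill factor of 10.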
.Sh ERRORS
The
.Nm
access method routines may fail and set
.Va errno
for any of the errors specified for the library routine
.Xr dbopen 3 .
.Sh SEE ALSO
.Xr dbopen 3 ,
.Xr hash 3 ,
.Xr mpool 3 ,
.Xr recno 3
.Rs
.%T "The Ubiquitous B-tree"
.%A Douglas Comer
.%J "ACM Computing Surveys"
.%V 11
.%N 2
.%D June 1979
.%P 121-138
.Re
.Rs
.%A Bayer
.%A Unterauer
.%T "Prefix B-trees"
.%J "ACM Transactions on Database Systems"
.%V 2
.%N 1
.%D March 1977
.%P 11-26
.Re
.Rs
.%B "The Art of Computer Programming Vol. 3: Sorting and Searching"
.%A D. E. Knuth
.%D 1968
.%P 471-480
.Re
.Sh BUGS
Only big and little endian byte orders are supported.