Rank balanced (RB) trees are a class of balanced trees that includes

AVL trees, red-black trees, and others. Weak AVL (wavl) trees are a
recently discovered member of that class. This change replaces
red-black rebalancing with weak AVL rebalancing in the RB tree macros.

Wavl trees sit between AVL and red-black trees in terms of how
strictly balance is enforced. They have the stricter balance of AVL
trees as the tree is built - a wavl tree is an AVL tree until the
first deletion. Once removals start, wavl trees are lazier about
rebalancing than AVL trees, so that removals can be fast, but the
balance of the tree can decay to that of a red-black tree. Subsequent
insertions can push balance back toward the stricter AVL conditions.

Removing a node from a wavl tree never requires more than two
rotations, which is better than either red-black or AVL
trees. Inserting a node into a wavl tree never requires more than two
rotations, which matches red-black and AVL trees. The only
disadvantage of wavl trees to red-black trees is that more insertions
are likely to adjust the tree a bit. That's the cost of keeping the
tree more balanced.

Testing has shown that for the cases where red-black trees do worst,
wavl trees better balance leads to faster lookups, so that if lookups
outnumber insertions by a nontrivial amount, lookup time saved exceeds
the extra cost of balancing.

Reviewed by:	alc, gbe, markj
Tested by:	pho
Discussed with:	emaste
Differential Revision:	https://reviews.freebsd.org/D25480
This commit is contained in:
dougm 2020-07-23 17:16:20 +00:00
parent 0c33107f0a
commit 504ba65c8c
2 changed files with 170 additions and 164 deletions

View File

@ -99,7 +99,7 @@
.Nm RB_INSERT ,
.Nm RB_REMOVE ,
.Nm RB_REINSERT
.Nd "implementations of splay and red-black trees"
.Nd "implementations of splay and rank-balanced (wavl) trees"
.Sh SYNOPSIS
.In sys/tree.h
.Fn SPLAY_PROTOTYPE NAME TYPE FIELD CMP
@ -195,7 +195,7 @@
.Fn RB_REINSERT NAME "RB_HEAD *head" "struct TYPE *elm"
.Sh DESCRIPTION
These macros define data structures for different types of trees:
splay trees and red-black trees.
splay trees and rank-balanced (wavl) trees.
.Pp
In the macro definitions,
.Fa TYPE
@ -364,26 +364,26 @@ macro:
The
.Fn SPLAY_EMPTY
macro should be used to check whether a splay tree is empty.
.Sh RED-BLACK TREES
A red-black tree is a binary search tree with the node color as an
extra attribute.
It fulfills a set of conditions:
.Bl -enum -offset indent
.It
Every search path from the root to a leaf consists of the same number of
black nodes.
.It
Each red node (except for the root) has a black parent.
.It
Each leaf node is black.
.El
.Sh RANK-BALANCED TREES
Rank-balanced (RB) trees are a framework for defining height-balanced
binary search trees, including AVL and red-black trees.
Each tree node has an associated rank.
Balance conditions are expressed by conditions on the differences in
rank between any node and its children.
Rank differences are stored in each tree node.
.Pp
The balance conditions implemented by the RB macros lead to weak AVL
(wavl) trees, which combine the best aspects of AVL and red-black
trees.
Wavl trees rebalance after an insertion in the same way AVL trees do,
with the same worst-case time as red-black trees offer, and with
better balance in the resulting tree.
Wavl trees rebalance after a removal in a way that requires less
restructuring, in the worst case, than either AVL or red-black trees
do. Removals can lead to a tree almost as unbalanced as a red-black
tree; insertions lead to a tree becoming as balanced as an AVL tree.
.Pp
Every operation on a red-black tree is bounded as
.Fn O "lg n" .
The maximum height of a red-black tree is
.Fn 2lg "n + 1" .
.Pp
A red-black tree is headed by a structure defined by the
A rank-balanced tree is headed by a structure defined by the
.Fn RB_HEAD
macro.
A
@ -488,7 +488,7 @@ The
macro initializes the tree referenced by
.Fa head .
.Pp
The red-black tree can also be initialized statically by using the
The rank-balanced tree can also be initialized statically by using the
.Fn RB_INITIALIZER
macro like this:
.Bd -ragged -offset indent
@ -567,7 +567,7 @@ and will be overwritten to provide safe traversal.
.Pp
The
.Fn RB_EMPTY
macro should be used to check whether a red-black tree is empty.
macro should be used to check whether a rank-balanced tree is empty.
.Pp
The
.Fn RB_REINSERT
@ -581,7 +581,7 @@ a node's key.
This is a lower overhead alternative to removing the element
and reinserting it again.
.Sh EXAMPLES
The following example demonstrates how to declare a red-black tree
The following example demonstrates how to declare a rank-balanced tree
holding integers.
Values are inserted into it and the contents of the tree are printed
in order.
@ -697,6 +697,17 @@ to indicate an error.
.Sh SEE ALSO
.Xr arb 3 ,
.Xr queue 3
.Rs
.%A "Bernhard Haeupler"
.%A "Siddhartha Sen"
.%A "Robert E. Tarjan"
.%T "Rank-Balanced Trees"
.%U "http://sidsen.azurewebsites.net/papers/rb-trees-talg.pdf"
.%J "ACM Transactions on Algorithms"
.%V "11"
.%N "4"
.%D "June 2015"
.Re
.Sh HISTORY
The tree macros first appeared in
.Fx 4.6 .

View File

@ -36,7 +36,7 @@
/*
* This file defines data structures for different types of trees:
* splay trees and red-black trees.
* splay trees and rank-balanced trees.
*
* A splay tree is a self-organizing data structure. Every operation
* on the tree causes a splay to happen. The splay moves the requested
@ -50,15 +50,24 @@
* and n inserts on an initially empty tree as O((m + n)lg n). The
* amortized cost for a sequence of m accesses to a splay tree is O(lg n);
*
* A red-black tree is a binary search tree with the node color as an
* extra attribute. It fulfills a set of conditions:
* - every search path from the root to a leaf consists of the
* same number of black nodes,
* - each red node (except for the root) has a black parent,
* - each leaf node is black.
* A rank-balanced tree is a binary search tree with an integer
* rank-difference as an attribute of each pointer from parent to child.
* The sum of the rank-differences on any path from a node down to null is
* the same, and defines the rank of that node. The rank of the null node
* is -1.
*
* Every operation on a red-black tree is bounded as O(lg n).
* The maximum height of a red-black tree is 2lg (n+1).
* Different additional conditions define different sorts of balanced
* trees, including "red-black" and "AVL" trees. The set of conditions
* applied here are the "weak-AVL" conditions of Haeupler, Sen and Tarjan:
* - every rank-difference is 1 or 2.
* - the rank of any leaf is 1.
*
* For historical reasons, rank differences that are even are associated
* with the color red (Rank-Even-Difference), and the child that a red edge
* points to is called a red child.
*
* Every operation on a rank-balanced tree is bounded as O(lg n).
* The maximum height of a rank-balanced tree is 2lg (n+1).
*/
#define SPLAY_HEAD(name, type) \
@ -294,7 +303,7 @@ void name##_SPLAY_MINMAX(struct name *head, int __comp) \
(x) != NULL; \
(x) = SPLAY_NEXT(name, head, x))
/* Macros that define a red-black tree */
/* Macros that define a rank-balanced tree */
#define RB_HEAD(name, type) \
struct name { \
struct type *rbh_root; /* root of the tree */ \
@ -325,25 +334,16 @@ struct { \
* that the left or right child of the tree node is "red".
*/
#define RB_UP(elm, field) (elm)->field.rbe_parent
#define RB_BITS(elm, field) *(__uintptr_t *)&RB_UP(elm, field)
#define RB_RED_L (__uintptr_t)1
#define RB_RED_R (__uintptr_t)2
#define RB_RED_MASK (__uintptr_t)3
#define RB_BITS(elm, field) (*(__uintptr_t *)&RB_UP(elm, field))
#define RB_RED_L ((__uintptr_t)1)
#define RB_RED_R ((__uintptr_t)2)
#define RB_RED_MASK ((__uintptr_t)3)
#define RB_FLIP_LEFT(elm, field) (RB_BITS(elm, field) ^= RB_RED_L)
#define RB_FLIP_RIGHT(elm, field) (RB_BITS(elm, field) ^= RB_RED_R)
#define RB_RED_LEFT(elm, field) ((RB_BITS(elm, field) & RB_RED_L) != 0)
#define RB_RED_RIGHT(elm, field) ((RB_BITS(elm, field) & RB_RED_R) != 0)
#define RB_PARENT(elm, field) ((__typeof(RB_UP(elm, field))) \
(RB_BITS(elm, field) & ~RB_RED_MASK))
/*
* This header may appear in user code where 'bool' is not defined,
* so it defines its own boolean type to avoid breaking that code.
*/
#define RB_BOOL int
#define RB_TRUE 1
#define RB_FALSE 0
#define RB_ROOT(head) (head)->rbh_root
#define RB_EMPTY(head) (RB_ROOT(head) == NULL)
@ -357,7 +357,7 @@ struct { \
RB_LEFT(elm, field) = RB_RIGHT(elm, field) = NULL; \
} while (/*CONSTCOND*/ 0)
#define RB_COLOR(elm, field) (RB_PARENT(elm, field) == NULL ? RB_FALSE : \
#define RB_COLOR(elm, field) (RB_PARENT(elm, field) == NULL ? 0 : \
RB_LEFT(RB_PARENT(elm, field), field) == elm ? \
RB_RED_LEFT(RB_PARENT(elm, field), field) : \
RB_RED_RIGHT(RB_PARENT(elm, field), field))
@ -422,7 +422,8 @@ struct { \
#define RB_PROTOTYPE_INSERT_COLOR(name, type, attr) \
attr void name##_RB_INSERT_COLOR(struct name *, struct type *)
#define RB_PROTOTYPE_REMOVE_COLOR(name, type, attr) \
attr void name##_RB_REMOVE_COLOR(struct name *, struct type *)
attr void name##_RB_REMOVE_COLOR(struct name *, \
struct type *, struct type *)
#define RB_PROTOTYPE_REMOVE(name, type, attr) \
attr struct type *name##_RB_REMOVE(struct name *, struct type *)
#define RB_PROTOTYPE_INSERT(name, type, attr) \
@ -464,123 +465,132 @@ struct { \
attr void \
name##_RB_INSERT_COLOR(struct name *head, struct type *elm) \
{ \
struct type *gparent, *parent; \
struct type *child, *parent; \
while ((parent = RB_PARENT(elm, field)) != NULL) { \
if (RB_LEFT(parent, field) == elm) \
RB_FLIP_LEFT(parent, field); \
else \
RB_FLIP_RIGHT(parent, field); \
if ((gparent = RB_PARENT(parent, field)) == NULL) \
break; \
if (RB_RED_LEFT(gparent, field) && \
RB_RED_RIGHT(gparent, field)) { \
RB_FLIP_LEFT(gparent, field); \
RB_FLIP_RIGHT(gparent, field); \
elm = gparent; \
continue; \
} \
if (RB_RED_LEFT(gparent, field) && \
parent == RB_LEFT(gparent, field)) { \
if (RB_RIGHT(parent, field) == elm) { \
RB_ROTATE_LEFT(head, parent, elm, field);\
RB_FLIP_RIGHT(parent, field); \
RB_FLIP_LEFT(elm, field); \
parent = elm; \
} \
RB_ROTATE_RIGHT(head, gparent, parent, field); \
RB_FLIP_LEFT(gparent, field); \
RB_FLIP_RIGHT(parent, field); \
} else if (RB_RED_RIGHT(gparent, field) && \
parent == RB_RIGHT(gparent, field)) { \
if (RB_LEFT(parent, field) == elm) { \
RB_ROTATE_RIGHT(head, parent, elm, field);\
if (RB_LEFT(parent, field) == elm) { \
if (RB_RED_LEFT(parent, field)) { \
RB_FLIP_LEFT(parent, field); \
RB_FLIP_RIGHT(elm, field); \
parent = elm; \
return; \
} \
RB_FLIP_RIGHT(parent, field); \
if (RB_RED_RIGHT(parent, field)) { \
elm = parent; \
continue; \
} \
if (!RB_RED_RIGHT(elm, field)) { \
RB_FLIP_LEFT(elm, field); \
RB_ROTATE_LEFT(head, elm, child, field);\
if (RB_RED_LEFT(child, field)) \
RB_FLIP_RIGHT(elm, field); \
else if (RB_RED_RIGHT(child, field)) \
RB_FLIP_LEFT(parent, field); \
elm = child; \
} \
RB_ROTATE_RIGHT(head, parent, elm, field); \
} else { \
if (RB_RED_RIGHT(parent, field)) { \
RB_FLIP_RIGHT(parent, field); \
return; \
} \
RB_ROTATE_LEFT(head, gparent, parent, field); \
RB_FLIP_RIGHT(gparent, field); \
RB_FLIP_LEFT(parent, field); \
if (RB_RED_LEFT(parent, field)) { \
elm = parent; \
continue; \
} \
if (!RB_RED_LEFT(elm, field)) { \
RB_FLIP_RIGHT(elm, field); \
RB_ROTATE_RIGHT(head, elm, child, field);\
if (RB_RED_RIGHT(child, field)) \
RB_FLIP_LEFT(elm, field); \
else if (RB_RED_LEFT(child, field)) \
RB_FLIP_RIGHT(parent, field); \
elm = child; \
} \
RB_ROTATE_LEFT(head, parent, elm, field); \
} \
RB_BITS(elm, field) &= ~RB_RED_MASK; \
break; \
} \
}
#define RB_GENERATE_REMOVE_COLOR(name, type, field, attr) \
attr void \
name##_RB_REMOVE_COLOR(struct name *head, struct type *par) \
name##_RB_REMOVE_COLOR(struct name *head, \
struct type *parent, struct type *elm) \
{ \
struct type *gpr, *sib, *nec; \
RB_BOOL left_elm, left_par, red_gpr; \
left_par = (RB_LEFT(par, field) == NULL); \
do { \
left_elm = left_par; \
if (left_elm ? \
!RB_RED_RIGHT(par, field) : \
!RB_RED_LEFT(par, field)) { \
gpr = RB_PARENT(par, field); \
left_par = gpr != NULL && \
RB_LEFT(gpr, field) == par; \
red_gpr = gpr == NULL ? \
RB_TRUE: RB_COLOR(par, field); \
} \
if (left_elm) { \
if (RB_RED_RIGHT(par, field)) { \
red_gpr = RB_TRUE; \
RB_ROTATE_LEFT(head, par, gpr, field); \
RB_FLIP_RIGHT(par, field); \
RB_FLIP_LEFT(gpr, field); \
struct type *sib; \
if (RB_LEFT(parent, field) == elm && \
RB_RIGHT(parent, field) == elm) { \
RB_BITS(parent, field) &= ~RB_RED_MASK; \
elm = parent; \
parent = RB_PARENT(elm, field); \
if (parent == NULL) \
return; \
} \
do { \
if (RB_LEFT(parent, field) == elm) { \
if (!RB_RED_LEFT(parent, field)) { \
RB_FLIP_LEFT(parent, field); \
return; \
} \
sib = RB_RIGHT(par, field); \
if (RB_RED_RIGHT(sib, field)) { \
if (RB_RED_LEFT(sib, field)) { \
if (RB_RED_RIGHT(parent, field)) { \
RB_FLIP_RIGHT(parent, field); \
elm = parent; \
continue; \
} \
sib = RB_RIGHT(parent, field); \
if ((~RB_BITS(sib, field) & RB_RED_MASK) == 0) {\
RB_BITS(sib, field) &= ~RB_RED_MASK; \
elm = parent; \
continue; \
} \
RB_FLIP_RIGHT(sib, field); \
if (RB_RED_LEFT(sib, field)) \
RB_FLIP_LEFT(parent, field); \
else if (!RB_RED_RIGHT(sib, field)) { \
RB_FLIP_LEFT(parent, field); \
RB_ROTATE_RIGHT(head, sib, elm, field); \
if (RB_RED_RIGHT(elm, field)) \
RB_FLIP_LEFT(sib, field); \
RB_FLIP_RIGHT(par, field); \
} \
RB_FLIP_RIGHT(sib, field); \
} else if (RB_RED_LEFT(sib, field)) { \
RB_ROTATE_RIGHT(head, sib, nec, field); \
RB_FLIP_LEFT(sib, field); \
sib = nec; \
} else { \
RB_FLIP_RIGHT(par, field); \
par = gpr; \
continue; \
if (RB_RED_LEFT(elm, field)) \
RB_FLIP_RIGHT(parent, field); \
RB_BITS(elm, field) |= RB_RED_MASK; \
sib = elm; \
} \
RB_ROTATE_LEFT(head, par, sib, field); \
return; \
RB_ROTATE_LEFT(head, parent, sib, field); \
} else { \
if (RB_RED_LEFT(par, field)) { \
red_gpr = RB_TRUE; \
RB_ROTATE_RIGHT(head, par, gpr, field); \
RB_FLIP_LEFT(par, field); \
RB_FLIP_RIGHT(gpr, field); \
if (!RB_RED_RIGHT(parent, field)) { \
RB_FLIP_RIGHT(parent, field); \
return; \
} \
sib = RB_LEFT(par, field); \
if (RB_RED_LEFT(sib, field)) { \
if (RB_RED_RIGHT(sib, field)) { \
RB_FLIP_RIGHT(sib, field); \
RB_FLIP_LEFT(par, field); \
} \
RB_FLIP_LEFT(sib, field); \
} else if (RB_RED_RIGHT(sib, field)) { \
RB_ROTATE_LEFT(head, sib, nec, field); \
RB_FLIP_RIGHT(sib, field); \
sib = nec; \
} else { \
RB_FLIP_LEFT(par, field); \
par = gpr; \
if (RB_RED_LEFT(parent, field)) { \
RB_FLIP_LEFT(parent, field); \
elm = parent; \
continue; \
} \
RB_ROTATE_RIGHT(head, par, sib, field); \
return; \
sib = RB_LEFT(parent, field); \
if ((~RB_BITS(sib, field) & RB_RED_MASK) == 0) {\
RB_BITS(sib, field) &= ~RB_RED_MASK; \
elm = parent; \
continue; \
} \
RB_FLIP_LEFT(sib, field); \
if (RB_RED_RIGHT(sib, field)) \
RB_FLIP_RIGHT(parent, field); \
else if (!RB_RED_LEFT(sib, field)) { \
RB_FLIP_RIGHT(parent, field); \
RB_ROTATE_LEFT(head, sib, elm, field); \
if (RB_RED_LEFT(elm, field)) \
RB_FLIP_RIGHT(sib, field); \
if (RB_RED_RIGHT(elm, field)) \
RB_FLIP_LEFT(parent, field); \
RB_BITS(elm, field) |= RB_RED_MASK; \
sib = elm; \
} \
RB_ROTATE_RIGHT(head, parent, sib, field); \
} \
} while (!red_gpr); \
if (gpr == NULL); \
else if (left_par) \
RB_FLIP_LEFT(gpr, field); \
else \
RB_FLIP_RIGHT(gpr, field); \
break; \
} while ((parent = RB_PARENT(elm, field)) != NULL); \
}
#define RB_GENERATE_REMOVE(name, type, field, attr) \
@ -588,7 +598,6 @@ attr struct type * \
name##_RB_REMOVE(struct name *head, struct type *elm) \
{ \
struct type *child, *old, *parent, *right; \
RB_BOOL red; \
\
old = elm; \
parent = RB_PARENT(elm, field); \
@ -600,9 +609,6 @@ name##_RB_REMOVE(struct name *head, struct type *elm) \
else { \
if ((child = RB_LEFT(right, field)) == NULL) { \
child = RB_RIGHT(right, field); \
red = RB_RED_RIGHT(old, field); \
if (red) \
RB_FLIP_RIGHT(old, field); \
RB_RIGHT(old, field) = child; \
parent = elm = right; \
} else { \
@ -611,28 +617,17 @@ name##_RB_REMOVE(struct name *head, struct type *elm) \
while ((child = RB_LEFT(elm, field)) != NULL); \
child = RB_RIGHT(elm, field); \
parent = RB_PARENT(elm, field); \
red = RB_RED_LEFT(parent, field); \
if (red) \
RB_FLIP_LEFT(parent, field); \
RB_LEFT(parent, field) = child; \
RB_SET_PARENT(RB_RIGHT(old, field), elm, field); \
RB_SET_PARENT(RB_RIGHT(old, field), elm, field);\
} \
RB_SET_PARENT(RB_LEFT(old, field), elm, field); \
elm->field = old->field; \
} \
if (elm == child) { \
red = RB_COLOR(old, field); \
if (!red); \
else if (RB_LEFT(parent, field) == old) \
RB_FLIP_LEFT(parent, field); \
else \
RB_FLIP_RIGHT(parent, field); \
} \
RB_SWAP_CHILD(head, old, elm, field); \
if (child != NULL) \
RB_SET_PARENT(child, parent, field); \
else if (!red && parent != NULL) \
name##_RB_REMOVE_COLOR(head, parent); \
if (parent != NULL) \
name##_RB_REMOVE_COLOR(head, parent, child); \
while (parent != NULL) { \
RB_AUGMENT(parent); \
parent = RB_PARENT(parent, field); \