14B Red-Black Trees

Version 5.2.21

(Drake, sec. 14.4)

A red-black tree (RBT) is a height-balanced binary search tree. Height-balanced means (roughly) that the left and right subtrees are "almost" the same height, in a way that guarantees the height of the whole tree is Θ(log N).

In addition to the order property required for all BSTs, a RBT has the following requirements:

  1. The red condition: Every red node has a black parent.
  2. The black condition: every path from the root to an empty subtree has the same number of black nodes. (This number is called the black height of the tree.)

Compare these with Drake's 3 conditions on pp. 389-390.

Note that while it is required that every subtree of a red-black tree is a binary search tree, it is not required that every subtree be a red-black tree. In fact, some subtrees will have red roots, which disqualifies them from being red-black trees in themselves.

The red and black conditions imply that the longest path from root to leaf is no more than twice the length of the shortest path from root to leaf.

Think about it. The shortest possible path would consist of all black nodes. The longest possible path would have to have alternating red and black nodes, since every red node has a black parent. Consequently, its length is double that of the shortest possible path. Of course, in a particular RBT, the shortest actual path might not be the shortest possible, and the longest actual path might not be the longest possible.

It is in this sense that a RBT is "balanced". It is (usually) not perfectly balanced, like a "perfect" binary tree. (Ah, so that's why those are called "perfect".) But it is reasonably well balanced, enough to guarantee the following:

Theorem: The height of a RBT containing N nodes is not more than 2 ceil(log(N + 1)), and is therefore Θ(log N).

(Proof omitted).

Algorithms for Red-Black Trees

Search is the same as in BST.

Insertion

Insertion is in some ways like a BST, and in some ways like a heap. As in a BST, the insertion is made (initially) at a leaf, the same place that would be found by the search procedure. As in a heap, this initial insertion may violate a required condition, and we make some adjustments to restore the condition.

The place for a new node to be inserted is always a leaf, and is found in the same way as for a plain BST.

And we always insert a red node.

Consequently, the insertion cannot violate the black property (inserting a red node does not change the number of black nodes on a path from root to leaf), and we only have to worry about the red property being violated.

Since the new node is red, if it has a red parent, we are in violation of the red property and must repair the damage.

Damage is repaired by a combination of tools, including color changes and "rotations". A rotation rearranges the positions of some of the nodes. For the details of the rotations and color changes, I will diverge from the textbook and cover a simpler method that works for purely functional red-black trees.

  1. If the new node is the root (i.e., we inserted into an empty tree), we simply blacken it to restore the red property.
  2. If the red condition is violated below the root, then there must be a black node with a red child and a red grandchild. There are four variations of this:

    1. Red left child and left-left grandchild (i.e., the left child's left child).
    2. Red left child and left-right grandchild (i.e., the left child's right child).
    3. Red right child and right-left grandchild.
    4. Red right child and right-right grandchild.

    See diagram for details.3

Exercises

  1. Insert these values into an initially empty RBT: −8, 0, 6, −25, −17, −10, −15 (or some similar stream of values).

  2. Insert a stream of ascending numbers: 1, 2, 3, ....

Deletion

Deletion is again like in a BST, but the initial deletion of a node may violate either the red or the black condition, and we have to perform more adjustments to restore it to a red-black tree. We omit the details, which are complex.

Implementations of the Algorithms

In Haskell

Since the presentation has taken a functional approach, let's implement it in Haskell.

(Currently, this will not run on merlin because we do not have graphviz installed.)

In Java

It seems cruel to do this in Java, by comparison, but here it is. The Java implementation does show some more detail in the visualization---showing each step in rebuilding the tree after an insertion.

Since I have completely redone my Java implementation of functional binary trees and binary search trees, I am providing links to all the source files:

(Currently, this will not run on merlin because we do not have graphviz installed.)

References

Footnotes


  1. Revision log.
    • Version 5.2.2, 2012 Nov 28. Clarified red and black conditions' relation to Drake's conditions; revised exercises; revised Haskell code for GHC 7.4.2 compatibility; switched order of Haskell and Java implementations.
    • Version 5.2.1, 2012 Nov 27. Fixed diagram.
    • Version 5.2, 2010 Nov 16. Converted to markdown.
    • Version 5.1, 2009 Nov 20. Added diagram.
    • Version 5, 2009 Nov 16. Converted to RST.
    • Version 4, 2008 Nov 5. Replaced imperative-style red-black trees with functional-style based on Okasaki's. Added Java and Haskell source code links.
    • Version 3, 2008 Nov 3. Split file into 14a heaps and tries and 14b red-black trees. Needs thorough revision to simplify the insert algorithm. Needs formatting. Need to add figures to the web page.
    • Version 2, 2006 Nov 14. Complete, except for the delete operation in red-black trees.
  2. Empty subtrees in a red-black tree are sometimes considered as leaves with no data, and called "nils" or "nulls". This is silly and apt to be confusing, because null refers to no object. If a child pointer is null, then there is no child. Although I will not speak in this confusing way, you should know that some people, including Drake, do so. To complicate matters further, they usually say that these "null children" are black nodes. If we say that an empty subtree is a black node, then this adds one to the number of black nodes (the "black height") along every path from the root to an empty subtree. This doesn't change anything, of course: if all black heights were equal before adding 1, then they are all equal after adding 1.

  3. Diagram made with Inkscape. Based on a diagram by Chris Okasaki in Purely Functional Data Structures.