Binary heaps

Binary heaps dynamically store values in a tree structure built according to a given ordering of these values. Thanks to this structure, a number of operations can be efficiently implemented. For a binary heap of n values, pushing a new value, extracting the first value out of the heap, deleting a value, and replacing a value all have a complexity of O(log(n)) at worst. Just getting the first value without extracting it out of the heap is an O(1) operation. Here, values means objects of any type that implement an ordering rule, not necessarily numeric values. Depending on the ordering chosen for the heap, the first value (also called the root value) may be the least or the greatest one.

Basic usage

In QuickHeaps, a binary heap is created by the BinaryHeap constructor:

h = BinaryHeap{T}(o = TotalMin)

where T is the type of the values stored by the heap and o::Base.Order.Ordering is the ordering rule for sorting values. The default TotalMin ordering yields a min-heap whose root entry is the smallest one. With o = TotalMax, a max-heap is created. A min-heap (resp. a max-heap) sorts its values in increasing (resp. decreasing) order, followed by NaN then missing values.

For floating-point values, FastMin or FastMax orders may be used, respectively, instead of TotalMin or TotalMax for (slightly) faster sorting of the heap. However, FastMin and FastMax do not define the order of NaN values and fail on missing values. If your data may contain NaN or missing values, use TotalMin or TotalMax instead of FastMin or FastMax.

A vector vals storing the initial values of the binary heap can be specified:

h = BinaryHeap{T}(vals, o = TotalMin)

to create a binary heap starting with the values in vals. Type parameter T can be omitted to assume T=eltype(vals). The initial values need not be ordered, the BinaryHeap constructor automatically takes care of that. If vals is a Vector instance with elements of type T, the binary-heap will be directly built into vals. Call BinaryHeap(copy(vals)) to create a binary heap with its own storage.

A binary heap h can be used as an ordered queue of values:

pop!(h)     # yields the root value and discard it from the heap
push!(h, x) # pushes value x in heap h

The root value is the first one according to the ordering of the heap. To examine the root value without discarding it from the heap, call either of:

peek(h)
first(h)
h[1]

A binary heap h behaves like an abstract vector (with 1-based linear indices), in particular:

length(h)   # yields the number of values in heap h
h[i]        # yields the i-th value of heap h
h[i] = x    # replaces the i-th value of heap h and heapify h

Note that h[1] is the value of the root entry of the heap h (the least heap values for a min-heap, the greatest heap value for a max-heap) and that setting a value in the heap may trigger reordering of the values stored by the heap to maintain the binary heap structure. In particular, after doing h[i] = x, do not assume that h[i] yields x because the heap h is automatically (and efficiently) re-ordered so as to maintain its heap structure.

To delete the i-th value from the heap h, call:

delete!(h, i)

Call empty!(h) to delete all the values of the binary heap h and isempty(h) to query whether h is empty.

Call Base.Order.Ordering(h) to retrieve the order o of the binary heap h.

Note

Operations that modify the heap, like deletion by delete!(h,i), insertion by h[i] = x, pushing by push!(h,x), and extracting by pop!(h) are of numerical complexity O(1) in the best case, O(log(n)) in the worst case, with n = length(h) the number of values in the heap h. Query a given value with peek(h), first(h), or h[i] is always of complexity O(1).

Advanced usage

Instances of BinaryHeap store their values in a Julia vector whose length is always equal to the number of stored values. Slightly faster binary heaps are created by the FastBinaryHeap constructor. Such binary heaps never automatically reduce the size of the array backing the storage of their values (even though the size is automatically augmented as needed). You may call resize!(h) to explicitly reduce the storage to its minimum.

A hint about the anticipated size n of a heap h (of any kind) can be set by:

sizehint!(h, n)

which yields h.

Customize binary heaps

The behavior of the binary heap types provided by QuickHeaps can be tweaked by using a particular instance of the ordering o::Baae.Order.Ordering and by specializing the Base.lt method called as Base.Order.lt(o,x,y) to decide whether value x strictly occurs before value y according to ordering o. In the implementation of binary heaps by the QuickHeaps package, x and y are always both of type T, the type of the values stored by the heap.

If this is not sufficient, a custom binary heap type may be created that inherits from AbstractBinaryHeap{T,O} with T the type of the values stored by the heap and O the type of the ordering. Assuming the array backing the storage of the values in the custom heap type has 1-based linear indexing, it is sufficient to specialize the following methods for an instance h of the custom heap type, say CustomBinaryHeap:

  • Base.length(h::CustomBinaryHeap) yields the number of values in h;
  • Base.empty!(h::CustomBinaryHeap) delete all values in h;
  • QuickHeaps.storage(h::CustomBinaryHeap) yields the array backing the storage of values;
  • QuickHeaps.unsafe_grow!(h::CustomBinaryHeap, n::Integer) to grow the size of the object backing the storage of h to have n elements;
  • QuickHeaps.unsafe_shrink!(h::CustomBinaryHeap, n::Integer) to shrink the size of the object backing the storage of h to have n elements;

to have a fully functional custom binary heap type.

By default, Base.resize!(h) does nothing (except returning its argument) for any instance h of a type that inherits from AbstractBinaryHeap; but this method may also be specialized.

The QuickHeaps package provides a number of public methods (some un-exported) that may be useful for implementing new binary heap types:

Note that the heapify, heapify!, and isheap functions which are exported by the QuickHeaps package have the same behavior but are different than those in the DataStructures package. If you are using both packages, you'll have to explicitly prefix these methods by the package module, that is QuickHeaps or DataStructures.

Simple priority queues

A binary heap can be used to implement a simple priority queue with keys of type K and values of type V as follows:

struct Node{K,V}
   key::K
   val::V
end
Base.Order.lt(o::Base.Order.Ordering, a::T, b::T) where {T<:Node} =
    Base.Order.lt(o, a.val, b.val)
Q = FastBinaryHeap{Node{K,V}}()

Another possibility is to use key => val pairs as entries and provide an ordering wrapper to compare against values of pairs:

struct ByValue{O<:Base.Order.Ordering} <: Base.Order.Ordering
   order::O
end
Base.Order.lt(o::ByValue, (_,a)::Pair, (_,b)::Pair) = Base.Order.lt(o.order, a, b)
Q = FastBinaryHeap{Pair{K,V}}(ByValue(Forward))

These simple priority queues are binary heaps of key-value entries sorted according to their values and indexed as a vector. This is useful if you are mainly interested in accessing the root entry. Such simple priority queues are faster than DataStructures.PriorityQueue but provide no means to access an entry by its key (Q is indexed by an integer linear index which depends on the position of the entry in the heap, hence, on its value not on its key) nor to ensure that keys are unique (an auxiliary array, dictionary, or set must be used for that). The QuickHeaps package provides more advanced priority queues which are sorted by value but indexed by key. QuickHeaps.PriorityQueue and QuickHeaps.FastPriorityQueue are concrete implementations of these priority queues.