We have seen several ways to construct dictionaries: hash tables, unbalanced binary search trees, red-black trees, and skip lists. There are several factors that influence the choice of an algorithm:
· Sorted output. If sorted output is required, then hash tables are not a viable alternative.
Entries are stored in the table based on their hashed value, with no other ordering. For
binary trees, the story is different. An in-order tree walk will produce a sorted list. For
example:
void WalkTree(Node *P) {
if (P == NIL) return;
WalkTree(P->Left);
/* examine P->Data here */
WalkTree(P->Right);
}
WalkTree(Root);
To examine skip list nodes in order, simply chain through the level-0 pointers. For
example:
Node *P = List.Hdr->Forward[0];
while (P != NIL) {
/* examine P->Data here */
P = P->Forward[0];
}
· Space. The amount of memory required to store a value should be minimized. This is
especially true if many small nodes are to be allocated.
¨ For hash tables, only one forward pointer per node is required. In addition, the
hash table itself must be allocated.
¨ For red-black trees, each node has a left, right, and parent pointer. In addition, the
color of each node must be recorded. Although this requires only one bit, more
space may be allocated to ensure that the size of the structure is properly aligned.
Therefore each node in a red-black tree requires enough space for 3-4 pointers.
¨ For skip lists, each node has a level-0 forward pointer. The probability of having
a level-1 pointer is ½. The probability of having a level-2 pointer is ¼. In
general, the number of forward pointers per node is
· Time. The algorithm should be efficient. This is especially true if a large dataset is
expected. Table 3-2 compares the search time for each algorithm. Note that worst-case
behavior for hash tables and skip lists is extremely unlikely. Actual timing tests are
described below.
· Simplicity. If the algorithm is short and easy to understand, fewer mistakes may be made.
This not only makes your life easy, but the maintenance programmer entrusted with the
task of making repairs will appreciate any efforts you make in this area. The number of
statements required for each algorithm is listed in Table 3-2.
n 1
1
2
1
4
= + + + L =2 .
- 28 -
method statements average time worst-case time
hash table 26 O (1) O (n)
unbalanced tree 41 O (lg n) O (n)
red-black tree 120 O (lg n) O (lg n)
skip list 55 O (lg n) O (n)
Table 3-2: Comparison of Dictionaries
Average time for insert, search, and delete operations on a database of 65,536 (216) randomly
input items may be found in Table 3-3. For this test the hash table size was 10,009 and 16 index
levels were allowed for the skip list. Although there is some variation in the timings for the four
methods, they are close enough so that other considerations should come into play when
selecting an algorithm.
method insert search delete
hash table 18 8 10
unbalanced tree 37 17 26
red-black tree 40 16 37
skip list 48 31 35
Table 3-3: Average Time (ms), 65536 Items, Random Input
order count hash table unbalanced tree red-black tree skip list
16 4 3 2 5
random 256 3 4 4 9
input 4,096 3 7 6 12
65,536 8 17 16 31
16 3 4 2 4
ordered 256 3 47 4 7
input 4,096 3 1,033 6 11
65,536 7 55,019 9 15
Table 3-4: Average Search Time (us)
Table 3-4 shows the average search time for two sets of data: a random set, where all values
No comments:
Post a Comment