Thursday, June 26, 2008

Binary Search using recursion.


Implement a function to perform a binary search on a sorted array of integers to find the index of a given integer. Use this method declaration:

int binarySearch( int[] array, int lower, int upper, int target );

Comment on the efficiency of this search and compare it with other search methods.

In a binary search, you compare the central element in your sorted search space (an array, in this case) with the item you’re looking for. There are three possibilities. If it’s less than what you’re searching for, you eliminate the first half of the search space. If it’s more than the search value, you eliminate the second half of the search space. In the third case, the central element is equal to the search item and you stop the search. Otherwise, you repeat the process on the remaining portion of the search space. If it’s not already familiar to you from computer science courses, this algorithm may remind you of the optimum strategy in the children’s number-guessing game in which one child guesses numbers in a given range and a second responds “higher” or “lower” to each incorrect guess.

Because a binary search can be described in terms of binary searches on successively smaller portions of the search space, it lends itself to a recursive implementation. Your method will need to be passed the array it is searching, the limits within which it should search, and the element for which it is searching. You can subtract the lower limit from the upper limit to find the size of the search space, divide this size by two, and add it to the lower limit to find the index of the central element. Next compare this element to the search element. If they’re equal, return the index. Otherwise, if the search element is smaller, then the new upper limit becomes the central index – 1; if the search element is larger, the new lower limit is the central index + 1. Recurse until you match the element you’re searching for.

Before you code, consider what error conditions you’ll need to handle. One way to think about this is to consider what assumptions you’re making about the data you are being given and then consider how these assumptions might be violated. One assumption, explicitly stated in the problem, is that only a sorted array can be searched, so you’ll want to detect unsorted lists. You can do this by checking whether the value at the upper limit is less than the value at the lower limit, although this won’t catch all unsorted lists. If the limits are wrong, you should return an error code. (Another way to handle this case would be to call a sort routine and then restart the search, but that’s more than you need to do in an interview.)

Another assumption implicit in a search may be a little less obvious: The element you’re searching for is assumed to exist in the array. If you don’t terminate the recursion until you find the element, you’ll recurse infinitely when the element is missing from the array. You can avoid this by returning an error code if the upper and lower limits are equal and the element at that location is not the element you’re searching for. Finally, you assume that the lower limit is less than or equal to the upper limit. For simplicity, you can just return an error code in this case, although in a real program you’d probably want to either define this as an illegal call and use an assert to check it (for more efficient programs) or silently reverse the limits when they are out of order (for easier programming).

Now you can translate these algorithms and error checks into C# code:

const int NOT_IN_ARRAY = -1;
const int ARRAY_UNORDERED = -2;
const int LIMITS_REVERSED = -3;

int binarySearch( int[] array, int lower, int upper, int target ){
int center, range;

range = upper - lower;
if (range < 0) {
return LIMITS_REVERSED;
} else if( range == 0 && array[lower] != target ){
return NOT_IN_ARRAY;
}

if( array[lower] > array[upper] )
return ARRAY_UNORDERED;

center = ((range)/2) + lower;

if( target == array[center] ){
return center;
} else if( target < array[center] ){
return binarySearch( array, lower, center - 1, target );
} else {
return binarySearch( array, center + 1, upper, target );
}
}

Note that the example returns an error code instead of throwing an exception. You could use this as a chance to discuss the pros and cons of exception throwing with the interviewer.

Although the preceding routine completes the given task, it is not as efficient as it could be. As discussed at the beginning of this chapter, recursive implementations are generally less efficient than equivalent iterative implementations.

If you analyze the recursion in the previous solution, you can see that each recursive call serves only to change the search limits. There’s no reason why you can’t change the limits on each iteration of a loop and avoid the overhead of recursion. The method that follows is a more-efficient, iterative analog of the recursive binary search:

int iterBinarySearch( int[] array, int lower, int upper, int target ){
int center, range;

if( lower > upper )
return LIMITS_REVERSED;

while( true ){
range = upper - lower;
if( range == 0 && array[lower] != target )
return NOT_IN_ARRAY;

if( array[lower] > array[upper] )
return ARRAY_UNORDERED;

center = ((range)/2) + lower;
if( target == array[center] ){
return center;
} else if( target < array[center] ){
upper = center - 1;
} else {
lower = center + 1;
}
}
}

A binary search is O(log(n)) because half of the search is eliminated (in a sense, searched) on each iteration. This is more efficient than a simple search through all the elements, which would be O(n).

However, in order to perform a binary search the array must be sorted, an operation that is usually O(n log(n)), unless of course the array is always kept in sorted order.

No comments: