Java Notes: Algorithms: Binary Search

Divide in half

A fast way to search a sorted array is to use a binary search. The idea is to look at the element in the middle. If the key is equal to that, the search is finished. If the key is less than the middle element, do a binary search on the first half. If it's greater, do a binary search of the second half.

Performance

The advantage of a binary search over a linear search is astounding for large numbers. For an array of a million elements, binary search, O(log N), will find the target element with a worst case of only 20 comparisons. Linear search, O(N), on average will take 500,000 comparisons to find the element. Probably the only faster kind of search uses hashing, a topic that isn't covered in these notes.

This performance comes at a price - the array must be sorted first. Because sorting isn't a fast operation, O(N log N), it may not be worth the effort to sort when there are only a few searches.

Example

  1 
  2 
  3 
  4 
  5 
  6 
  7 
  8 
  9 
 10 
 11 
 12 
 13 
 14 
 15 
 16 
 17 
 18 
 19 
 20 
 21 
 22 
 23 
 24 
 25 
 26 
 27 
//=========================================================== binarySearch
/** Binary search of sorted array.  Negative value on search failure.
 *  The upperbound index is not included in the search.
 *  This is to be consistent with the way Java in general expresses ranges.
 *  The performance is O(log N).
 *  @param sorted Array of sorted values to be searched.
 *  @param first Index of first element to serach, sorted[first].
 *  @param upto  Index of last element to search, sorted[upto-1].
 *  @param key   Value that is being looked for.
 *  @return      Returns index of the first match, or or -insertion_position
 *               -1 if key is not in the array. This value can easily be
 *               transformed into the position to insert it.
 */
public static int binarySearch(int[] sorted, int first, int upto, int key) {
    
    while (first < upto) {
        int mid = (first + upto) / 2;  // Compute mid point.
        if (key < sorted[mid]) {
            upto = mid;     // repeat search in bottom half.
        } else if (key > sorted[mid]) {
            first = mid + 1;  // Repeat search in top half.
        } else {
            return mid;     // Found it. return position
        }
    }
    return -(first + 1);    // Failed to find key
}

Style violation? Note: this code violates the advice of not altering method parameters. The reason is that it can under some circumstances make the code less readable. In this case, however, the code would probably be less readable by introducing two extra local variables.

Computing the midpoint

There are two common ways to compute the index of the middle element. The most common way is to add the lowest and hightest and divide by two. For example,

int mid = (first + last) + 2;

Overflow. This works well, or at least has worked well, until recently. The problem appeared when memories, and arrays, got very large. For the first time the sum of two array indexes, an intermediate value in the computation, can overflow the size of an int. Computing the midpoint by adding the two and then dividing doesn't work.

Reordering the expession avoids overflow. The solution is to rewrite the expression so that no intermediate value overflows the size of an int. This is easy to do using the following.

int mid = first + (last - first) / 2;

For most programs the difference between the two computations will never be seen because it will only appeaer with very large arrays.

Variation on binary search

The following method shows another version of a binary search method: (1) it compares elements of a string array (so comparisons use the compareTo() method), and (2) it sorts the entire array so fewer parameters are supplied, and therefore local variables are declared to take their place.

  1 
  2 
  3 
  4 
  5 
  6 
  7 
  8 
  9 
 10 
 11 
 12 
 13 
 14 
 15 
 16 
public static int binarySearch(String[] sorted, String key) {
    int first = 0;
    int upto  = sorted.length;
    
    while (first < upto) {
        int mid = (first + upto) / 2;  // Compute mid point.
        if (key.compareTo(sorted[mid]) < 0) {
            upto = mid;       // repeat search in bottom half.
        } else if (key.compareTo(sorted[mid]) > 0) {
            first = mid + 1;  // Repeat search in top half.
        } else {
            return mid;       // Found it. return position
        }
    }
    return -(first + 1);      // Failed to find key
}

Related Pages

Linear Search, Recursive Binary Search