Concepts in Computing
CS4 - Winter 2007
Instructor: Fabio Pellacini

Lecture 8: Sorting Algorithms

Overview

  • Putting an array into sorted order
  • How "fast" is an algorithm: algorithmic efficiency

Sorted order

As we saw last time, searching goes a lot faster if the array is in sorted order. Algorithms for searching and sorting are some of the most fundamental. To sort a sequence means to re-arrange the elements of the sequence so that they are "in order," for instance either numeric order or alphabetical order. Let's concentrate on the idea of numeric order for now. Which of these arrays are sorted?

  • A[] = { 5, 9, 10, 11, 15, 20, 25 }
  • B[] = { 5, 5, 6, 12, 15, 15, 16 }
  • C[] = { 5, 6, 7, 8, 6, 7, 9, 10 }

If you have an array of numbers, how do you tell whether or not it is sorted? One way is to look at each pair of adjacent numbers in the array, and ensure that the first one is no bigger than the second. For instance:

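function CheckSorting(N) {
    var index = 0;
    // Compare each element with its right-hand neighbor.
    while (index < N.length-1) {
        if (N[index] > N[index+1]) {
            print("Not sorted");
            return false;
        } else {
            index = index + 1;
        }
    }
    // No out-of-order pair was found, so the array is sorted.
    print("Sorted");
    return true;
}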

This is similar to the sequential search algorithm described last time.
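To answer the earlier question about the arrays A, B, and C, the same adjacent-pair test can be run on each of them. This is only a sketch: isSorted is a hypothetical variant of CheckSorting that returns a value without printing, and console.log is assumed here in place of the print function used above.

```javascript
// Hypothetical helper: the same adjacent-pair test as CheckSorting,
// but it only returns true or false and prints nothing itself.
function isSorted(N) {
    for (var index = 0; index < N.length - 1; index++) {
        if (N[index] > N[index + 1]) {
            return false;   // found a pair that is out of order
        }
    }
    return true;            // every adjacent pair is in order
}

var A = [5, 9, 10, 11, 15, 20, 25];
var B = [5, 5, 6, 12, 15, 15, 16];
var C = [5, 6, 7, 8, 6, 7, 9, 10];

console.log(isSorted(A));   // true
console.log(isSorted(B));   // true: equal neighbors are allowed
console.log(isSorted(C));   // false: 8 is followed by 6
```

Note that B counts as sorted because the test only rejects a pair when the first number is strictly bigger than the second.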

Sorting

Now, suppose you have an array of numbers, and they are not sorted. How do you sort them? In other words, how do you rearrange them so that they are in the correct order? This is the problem of sorting.

to Sort given N[]:

  • On input, N is an array of numbers in any order.
  • When Sort halts, running CheckSorting(N) should return true.
  • The elements of N remain the same, only their order changes.
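The two requirements on the output can be checked mechanically. The sketch below is one possible way to do it, not part of the course code: isValidSort is a hypothetical name, and it takes a copy of the array from before sorting together with the array after sorting.

```javascript
// Hypothetical checker for the requirements on Sort's output.
function isValidSort(before, after) {
    if (before.length !== after.length) return false;

    // Requirement 1: the output is in sorted order.
    for (var i = 0; i < after.length - 1; i++) {
        if (after[i] > after[i + 1]) return false;
    }

    // Requirement 2: the elements are the same, only reordered.
    // Count each value in `before`, then cancel the counts using `after`.
    var counts = new Map();
    for (var j = 0; j < before.length; j++) {
        counts.set(before[j], (counts.get(before[j]) || 0) + 1);
    }
    for (var k = 0; k < after.length; k++) {
        var remaining = counts.get(after[k]) || 0;
        if (remaining === 0) return false;   // value missing or used too often
        counts.set(after[k], remaining - 1);
    }
    return true;
}

console.log(isValidSort([3, 1, 2], [1, 2, 3]));   // true
console.log(isValidSort([3, 1, 2], [1, 2, 2]));   // false: in order, but the 3 was lost
console.log(isValidSort([3, 1, 2], [3, 1, 2]));   // false: same elements, but not in order
```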

Below are intuitive descriptions of four sorting algorithms. Pretend you've been dealt a hand of cards, which you're holding in your right hand.

Insertion sort
The left hand, initially empty, will hold cards that have been sorted. Repeat: take the next card from the right hand and put it in the correct position in the left hand. To put it in the correct position, repeatedly swap it with the card in front of it, as long as it's smaller than that card.
Selection sort
Again, the left hand will hold the sorted cards. Repeat: find the smallest card in the right hand, and add it at the back of the left hand. In practice, to add it at the back of the left hand, we swap it with the card at the front of the right hand.
Merge sort
Divide the cards into two piles. Sort the two piles, and then merge them by repeatedly taking the smaller of the two elements at the top of the two piles. To sort the two piles, do the same thing -- divide them into two smaller piles, etc.
Quicksort
We can avoid the need to merge if, when we divide the cards into piles, we make sure that all the cards in one pile are smaller than all the cards in the other. So, choose a card (say the first), and divide the cards according to whether they're smaller or larger than it. Note that one pile may be significantly larger than the other. Do the same thing separately in each pile.
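
The four descriptions above can be sketched in code as follows. These are minimal illustrations under some assumptions, not the versions shown in the xSortLab demos: all function names are hypothetical, insertionSort and selectionSort rearrange the array in place (the front of the array plays the role of the left hand), while mergeSort and quickSort return a new sorted array and leave the input untouched.

```javascript
// Exchange the elements at positions i and j of array N.
function swap(N, i, j) {
    var tmp = N[i];
    N[i] = N[j];
    N[j] = tmp;
}

// Insertion sort: positions 0..i-1 are the sorted "left hand".
function insertionSort(N) {
    for (var i = 1; i < N.length; i++) {
        // Swap the new card toward the front while it is smaller
        // than the card in front of it.
        for (var j = i; j > 0 && N[j] < N[j - 1]; j--) {
            swap(N, j, j - 1);
        }
    }
    return N;
}

// Selection sort: find the smallest card in the "right hand" (i..end)
// and swap it with the card at the front of the right hand.
function selectionSort(N) {
    for (var i = 0; i < N.length - 1; i++) {
        var smallest = i;
        for (var j = i + 1; j < N.length; j++) {
            if (N[j] < N[smallest]) smallest = j;
        }
        swap(N, i, smallest);
    }
    return N;
}

// Merge sort: split into two piles, sort each, then merge.
function mergeSort(N) {
    if (N.length <= 1) return N.slice();
    var middle = Math.floor(N.length / 2);
    var left = mergeSort(N.slice(0, middle));
    var right = mergeSort(N.slice(middle));
    var merged = [], i = 0, j = 0;
    // Repeatedly take the smaller of the two top cards.
    while (i < left.length && j < right.length) {
        if (left[i] <= right[j]) merged.push(left[i++]);
        else merged.push(right[j++]);
    }
    // One pile is now empty; the rest of the other is already in order.
    return merged.concat(left.slice(i), right.slice(j));
}

// Quicksort: split around the first card, then sort each pile separately.
function quickSort(N) {
    if (N.length <= 1) return N.slice();
    var pivot = N[0], smaller = [], larger = [];
    for (var i = 1; i < N.length; i++) {
        if (N[i] < pivot) smaller.push(N[i]);
        else larger.push(N[i]);   // cards equal to the pivot go here too
    }
    return quickSort(smaller).concat([pivot], quickSort(larger));
}

var hand = [7, 2, 9, 4, 2];
console.log(insertionSort(hand.slice()));   // [2, 2, 4, 7, 9]
console.log(selectionSort(hand.slice()));   // [2, 2, 4, 7, 9]
console.log(mergeSort(hand));               // [2, 2, 4, 7, 9]
console.log(quickSort(hand));               // [2, 2, 4, 7, 9]
```

The calls to hand.slice() pass a copy, so the original hand is still unsorted when the later algorithms run.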

Demos at xSortLab (David Eck).

Efficiency

In addition to being correct, we would ultimately like our algorithms to be efficient. For example, in writing an algorithm to play chess, we can't have one move take trillions of years. The analysis of algorithms is a complex and at times difficult process, but crucial to nearly every aspect of computer science. We will not spend too much time discussing the mathematics behind this, but I want to give you some appreciation for why this analysis is important -- beyond how long you have to wait for MS Word to start up. For example, such analysis can prove that your password is secure (more on this later).

Let's consider the following four algorithms on an array whose length is n.

  • Binary search runs in time proportional to log(n)
  • Sequential search runs in time proportional to n
  • Many sorting algorithms (such as selection sort) run in time proportional to n^2. So does any algorithm that deals with all the pairs of numbers (think of a round-robin tournament).
    For example, given {1, 2, 3, 4, 5}, compute:
    1+1  2+1  ...  5+1
    1+2  2+2  ...  5+2
    1+3  2+3  ...  5+3
    1+4  2+4  ...  5+4
    1+5  2+5  ...  5+5
  • Algorithms that deal with all possible combinations of the items run in time proportional to 2^n.
    For example, given {1, 2, 3, 4, 5}, compute:
    1. 1, 2, 3, 4, 5
    2. 1+2, 1+3, 1+4, 1+5, 2+3, 2+4, 2+5, 3+4, 3+5, 4+5
    3. 1+2+3, 1+2+4, 1+2+5, 1+3+4, 1+3+5, ...
    4. 1+2+3+4, ...
    5. 1+2+3+4+5
    (Also think of all possible ways that n coin tosses could turn out, or all possible games of chess.)
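
The last two examples in the list above can be counted directly. The sketch below uses hypothetical function names and counts the empty combination (sum 0) as well, which is why it reports 32 rather than the 31 non-empty combinations listed above.

```javascript
// Add up every ordered pair of items: n * n additions in total.
function allPairSums(items) {
    var sums = [];
    for (var i = 0; i < items.length; i++) {
        for (var j = 0; j < items.length; j++) {
            sums.push(items[j] + items[i]);
        }
    }
    return sums;
}

// Add up every combination of items. Each item is either in or out
// of a combination, so the number of sums doubles with each item: 2^n.
function allSubsetSums(items) {
    var sums = [0];   // the empty combination
    for (var i = 0; i < items.length; i++) {
        var count = sums.length;
        for (var j = 0; j < count; j++) {
            sums.push(sums[j] + items[i]);   // same combination, plus item i
        }
    }
    return sums;
}

var items = [1, 2, 3, 4, 5];
console.log(allPairSums(items).length);     // 25 = 5 * 5
console.log(allSubsetSums(items).length);   // 32 = 2^5
```

Adding a sixth item would raise the pair count from 25 to 36, but would double the combination count from 32 to 64.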

Now, computers are always getting faster, but these "orders of growth" help us see at a glance the inherent differences in run-time for different algorithms. Supposing a computer could do a single operation in 0.0001 second, we'd have the following total amounts of time, for various problem sizes and various orders of growth.

order     n = 10      n = 50      n = 100          n = 1000
log(n)    0.0003 s    0.0006 s    0.0007 s         0.001 s
n         0.001 s     0.005 s     0.01 s           0.1 s
n^2       0.01 s      0.25 s      1 s              1.67 min
2^n       0.1024 s    3570 yrs    4x10^18 yrs      forget about it
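
The entries in the table above can be recomputed with a few lines of code. This sketch assumes the logarithm is base 2 (which agrees with the values shown) and a year of 365.25 days. The names secondsFor, SECONDS_PER_OPERATION, and SECONDS_PER_YEAR are hypothetical.

```javascript
var SECONDS_PER_OPERATION = 0.0001;
var SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60;   // assumed length of a year

// Time in seconds to run an algorithm of the given order on input size n.
function secondsFor(order, n) {
    var operations;
    if (order === "log(n)") operations = Math.log2(n);   // assumes base 2
    else if (order === "n") operations = n;
    else if (order === "n^2") operations = n * n;
    else if (order === "2^n") operations = Math.pow(2, n);
    else throw new Error("unknown order: " + order);
    return operations * SECONDS_PER_OPERATION;
}

console.log(secondsFor("n", 50));                         // about 0.005 seconds
console.log(secondsFor("n^2", 1000) / 60);                // about 1.67 minutes
console.log(secondsFor("2^n", 50) / SECONDS_PER_YEAR);    // roughly 3570 years
```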

[Figures: growth plots over the ranges 1-4, 1-8, 1-16, 1-32, 1-64, and 1-128]

Clearly, when designing algorithms we need to be careful. For example, a brute-force chess algorithm has runtime proportional to 2^n, which makes it completely impractical. Interestingly, though, this type of complexity can help us. In particular, it is difficult for someone to crack your password because the best known algorithms for the underlying problem (specifically, factoring large numbers into primes) have running times that, like 2^n, grow far too quickly to be practical for large n.