# Lecture 8: Sorting Algorithms

## Overview

- Putting array into sorted order
- How "fast" is an algorithm: algorithmic efficiency

## Sorted order

As we saw last time, searching goes a lot faster if the array is in
sorted order. Algorithms for searching and *sorting* are some
of the most fundamental. To sort a sequence means to re-arrange the
elements of the sequence so that they are "in order," for instance
either numeric order or alphabetical order. Let's concentrate on the
idea of numeric order for now. Which of these arrays are sorted?

`A[] = { 5, 9, 10, 11, 15, 20, 25 }`

`B[] = { 5, 5, 6, 12, 15, 15, 16 }`

`C[] = { 5, 6, 7, 8, 6, 7, 9, 10 }`

If you have an array of numbers, how do you tell whether or not it is sorted? One way is to look at each pair of adjacent numbers in the array, and ensure that the first one is no bigger than the second. For instance:

```
function CheckSorting(N) {
    var index = 0;
    while (index < N.length - 1) {
        if (N[index] > N[index + 1]) {
            print("Not sorted");
            return false;
        } else {
            index = index + 1;
        }
    }
    print("Sorted");
    return true;
}
```

This is similar to the sequential search algorithm described last time.
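As a runnable version of the same idea (a JavaScript sketch, with `console.log` standing in for the pseudocode's `print`), we can apply the check to the arrays A, B, and C from above:

```javascript
// Returns true if N is in non-decreasing order, false otherwise.
function checkSorting(N) {
    for (var index = 0; index < N.length - 1; index++) {
        if (N[index] > N[index + 1]) {
            console.log("Not sorted");
            return false;
        }
    }
    console.log("Sorted");
    return true;
}

var A = [5, 9, 10, 11, 15, 20, 25];
var B = [5, 5, 6, 12, 15, 15, 16];
var C = [5, 6, 7, 8, 6, 7, 9, 10];

checkSorting(A); // prints "Sorted" -- strictly increasing
checkSorting(B); // prints "Sorted" -- duplicates are fine, since 5 <= 5
checkSorting(C); // prints "Not sorted" -- 8 is followed by 6
```

Note that B counts as sorted: "no bigger than" allows equal neighbors.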

## Sorting

Now, suppose you have an array of numbers, and they are not sorted. How do you sort them? In other words, how do you rearrange them so that they are in the correct order? This is the problem of sorting.

to Sort given `N[]`:

- On input, `N` is an array of numbers in any order.
- When `Sort` halts, running `CheckSorting(N)` should return true.
- The elements of `N` remain the same; only their order changes.

Below are some intuitive descriptions of some sorting algorithms. Pretend you've been dealt a hand of cards, which you're holding in your right hand.

- **Insertion sort:** The left hand, initially empty, holds the cards that have been sorted. Repeat: take the next card from the right hand and put it in the correct position in the left hand. To put it in the correct position, repeatedly swap it with the card in front of it, as long as it's smaller than that card.
- **Selection sort:** Again, the left hand holds the sorted cards. Repeat: find the smallest card in the right hand, and add it at the back of the left hand. In practice, to add it at the back of the left hand, we swap it with the card at the front of the right hand.
- **Merge sort:** Divide the cards into two piles. Sort the two piles, and then merge them by repeatedly taking the smaller of the two cards at the top of the two piles. To sort the two piles, do the same thing -- divide each into two smaller piles, and so on.
- **Quicksort:** We can avoid the need to merge if, when we divide the cards into piles, we make sure that all the cards in one pile are smaller than all the cards in the other. So, choose a card (say the first), and divide the rest according to whether they're smaller or larger than it. Note that one pile may be significantly larger than the other. Do the same thing separately in each pile.
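The first two card descriptions translate almost directly into code. Here is one possible sketch in JavaScript (the function names are ours), where the front portion of the array plays the role of the sorted "left hand":

```javascript
// Insertion sort: everything before index i is sorted; repeatedly
// swap the new card backward until it reaches its correct position.
function insertionSort(N) {
    for (var i = 1; i < N.length; i++) {
        for (var j = i; j > 0 && N[j] < N[j - 1]; j--) {
            var tmp = N[j]; N[j] = N[j - 1]; N[j - 1] = tmp;
        }
    }
}

// Selection sort: find the smallest remaining card and swap it
// to the front of the unsorted portion.
function selectionSort(N) {
    for (var i = 0; i < N.length - 1; i++) {
        var smallest = i;
        for (var j = i + 1; j < N.length; j++) {
            if (N[j] < N[smallest]) smallest = j;
        }
        var tmp = N[smallest]; N[smallest] = N[i]; N[i] = tmp;
    }
}
```

Both rearrange the array in place, so `CheckSorting(N)` afterward would return true.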

Demos at xSortLab (David Eck).
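Merge sort and quicksort can be sketched the same way. For simplicity, these sketches return a new sorted array rather than rearranging the cards in place (one of several ways to code them):

```javascript
// Merge sort: split the pile in two, sort each half, then merge by
// repeatedly taking the smaller front card of the two piles.
function mergeSort(N) {
    if (N.length <= 1) return N;
    var mid = Math.floor(N.length / 2);
    var left = mergeSort(N.slice(0, mid));
    var right = mergeSort(N.slice(mid));
    var merged = [];
    while (left.length > 0 && right.length > 0) {
        merged.push(left[0] <= right[0] ? left.shift() : right.shift());
    }
    return merged.concat(left).concat(right); // one pile may have leftovers
}

// Quicksort: pick the first card as the pivot, split the rest into a
// smaller pile and a larger pile, and sort each pile the same way.
function quickSort(N) {
    if (N.length <= 1) return N;
    var pivot = N[0], smaller = [], larger = [];
    for (var i = 1; i < N.length; i++) {
        (N[i] < pivot ? smaller : larger).push(N[i]);
    }
    return quickSort(smaller).concat([pivot]).concat(quickSort(larger));
}
```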

## Efficiency

In addition to being correct, we would ultimately like our algorithms to be efficient. For example, in writing an algorithm to play chess, we can't have one move take trillions of years. The analysis of algorithms is a complex and at times difficult process, but it is crucial to nearly every aspect of computer science. We will not spend too much time discussing the mathematics behind this analysis, but I want to give you some appreciation for why it is important -- beyond how long you have to wait for MS Word to start up. For example, such analysis can explain why it is hard for anyone to crack your password (more on this later).

Let's consider the following four algorithms on an array whose length
is *n*.

- Binary search runs in time proportional to *log(n)*.
- Sequential search runs in time proportional to *n*.
- Many sorting algorithms (such as selection sort) run in time proportional to *n*^{2}. So does any algorithm that deals with all the pairs of numbers (think of a round-robin tournament). For example, given {1, 2, 3, 4, 5}, compute:

      1+1  2+1  ...  5+1
      1+2  2+2  ...  5+2
      1+3  2+3  ...  5+3
      1+4  2+4  ...  5+4
      1+5  2+5  ...  5+5

  With *n* = 5, that's 5 × 5 = 25 sums.
- Algorithms that deal with all possible combinations of the items run in time proportional to 2^{n}. For example, given {1, 2, 3, 4, 5}, compute:
  - 1, 2, 3, 4, 5
  - 1+2, 1+3, 1+4, 1+5, 2+3, 2+4, 2+5, 3+4, 3+5, 4+5
  - 1+2+3, 1+2+4, 1+2+5, 1+3+4, 1+3+5, ...
  - 1+2+3+4, ...
  - 1+2+3+4+5

  (Think of all the ways *n* coin tosses could turn out, or all possible games of chess.)
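One way to see where these growth rates come from is to count operations directly. A small illustrative sketch (the function names are ours, not part of the lecture):

```javascript
// An "all pairs" algorithm does one operation per pair (i, j),
// so it performs n * n = n^2 operations.
function countPairs(n) {
    var count = 0;
    for (var i = 0; i < n; i++) {
        for (var j = 0; j < n; j++) {
            count++;
        }
    }
    return count;
}

// Every item is either in or out of a combination, so there are
// 2 * 2 * ... * 2 = 2^n possibilities (including the empty one).
function countCombinations(n) {
    var count = 1;
    for (var i = 0; i < n; i++) {
        count = count * 2;
    }
    return count;
}

console.log(countPairs(5));        // 25
console.log(countCombinations(5)); // 32
```

Going from 5 items to 10 roughly quadruples the pair count (25 to 100) but multiplies the combination count by 32 (32 to 1024) -- exponential growth explodes much faster.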

Now, computers are always getting faster, but these "orders of growth" help us see at a glance the inherent differences in run-time for different algorithms. Supposing a computer could do a single operation in 0.0001 second, we'd have the following total amounts of time, for various problem sizes and various orders of growth.

| order | n = 10 | n = 50 | n = 100 | n = 1000 |
|---|---|---|---|---|
| log(n) | 0.0003 s | 0.0006 s | 0.0007 s | 0.001 s |
| n | 0.001 s | 0.005 s | 0.01 s | 0.1 s |
| n^{2} | 0.01 s | 0.25 s | 1 s | 1.67 min |
| 2^{n} | 0.1024 s | 3570 yrs | 4x10^{18} yrs | forget about it |
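The table entries follow directly from the 0.0001-second-per-operation assumption. For instance, a short sketch reproducing the n^{2} and 2^{n} rows (helper names are ours):

```javascript
// Seconds for an n^2-step algorithm at 0.0001 s (1/10000 s) per step.
function quadraticTime(n) {
    return (n * n) / 10000;
}

// Years for a 2^n-step algorithm at the same speed.
function exponentialYears(n) {
    var seconds = Math.pow(2, n) / 10000;
    return seconds / (60 * 60 * 24 * 365); // seconds in a year
}

console.log(quadraticTime(100));               // 1 (second)
console.log(quadraticTime(1000));              // 100 seconds, about 1.67 min
console.log(Math.round(exponentialYears(50))); // 3570 (years)
```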

Clearly, when designing algorithms we need to be careful. For example, a brute-force chess algorithm that examines every possible game runs in 2^{n} time, which makes it completely impractical. Interestingly, though, this type of complexity can also work in our favor. In particular, the reason it is difficult for someone to crack your password is that the best known algorithms for doing so take time that grows exponentially with the size of the problem (specifically, factoring large numbers into primes, the hard problem underlying much modern encryption).