Algorithm analysis
Table of contents
- Introduction
- The RAM model of computation
- Big O notation
- Growth rates and dominance relations
- References
Introduction
Algorithm analysis involves calculating the complexity of algorithms, usually either their time complexity or their space complexity.
Two common tools used for algorithm analysis are the RAM model of computation and the asymptotic analysis of worst-case complexity [1, P. 31].
The RAM model of computation
The RAM (Random Access Machine) model is used for analyzing algorithms without running them on a physical machine.
The RAM model has the following properties:
- A simple operation (+, -, *, /, =, if) takes one time step.
- Loops and subroutines are composed of simple operations.
- Memory is unlimited and access takes one time step (the RAM model does not consider whether data is in cache or on disk).
Using the RAM model, you measure the running time of an algorithm by counting the number of steps an algorithm takes on a given input [1, P. 32].
Despite the simplifying assumptions made by the RAM model, it works well in practice.
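As a rough illustration of counting steps in the RAM model, the following sketch instruments a simple summing loop. The function name and the choice of exactly which operations to count are my own, not part of the model itself:

```python
def sum_array_with_step_count(values):
    """Sum a list while counting RAM-model 'simple operations'.

    Each assignment, comparison, and addition counts as one time step;
    which operations to count is a simplifying choice for illustration.
    """
    steps = 0
    total = 0           # one assignment
    steps += 1
    for value in values:
        total += value  # one addition plus one assignment
        steps += 2
    steps += 1          # returning counts as one step
    return total, steps


total, steps = sum_array_with_step_count([3, 1, 4, 1, 5])
print(total, steps)  # the step count grows linearly with the input size
```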
Best, worst, and average-case complexity
Often, an algorithm’s input determines the number of steps the algorithm takes to run.
For a given input size n:
- The best-case complexity is the minimum number of steps taken over all inputs of size n.
- The worst-case complexity is the maximum number of steps taken over all inputs of size n.
- The average-case complexity is the average number of steps taken over all inputs of size n.
Each of these complexities defines a numerical function that represents running time versus problem size. For example, the worst-case time complexity of an algorithm is a function T(n) giving the maximum number of steps the algorithm takes on any input of size n.
Plotting these functions against n shows the worst-case curve sitting above the average case, which in turn sits above the best case.
Worst-case complexity is normally the most useful measure. This is because the best-case is often so unlikely that it isn’t beneficial to think about. The average-case can be useful, but it is usually difficult to determine. The worst-case is both likely to happen, and easy to calculate [1, P. 33].
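As a concrete example (linear search is my choice here, not an example taken from the text above), consider scanning a list for a target value. The best case finds the target in the first position, the worst case scans the entire list, and the average case scans about half of it:

```python
def linear_search_steps(values, target):
    """Return the number of elements examined before finding target.

    Best case: 1 (target is the first element).
    Worst case: len(values) (target is last or absent).
    Average case: roughly len(values) / 2 when the target is present
    and equally likely to be at any position.
    """
    steps = 0
    for value in values:
        steps += 1
        if value == target:
            break
    return steps


data = [7, 2, 9, 4, 6]
print(linear_search_steps(data, 7))  # best case: 1 step
print(linear_search_steps(data, 6))  # worst case: 5 steps
```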
Big O notation
Big O is a mathematical notation that describes the limiting behavior of a function as its input tends towards infinity.
Big O is useful in algorithm analysis because the functions that we get from counting steps often require a lot of detail to specify. For example, a worst-case step count often comes out as a messy quadratic polynomial, full of constant factors and lower-order terms that depend on exactly which operations were counted.
In reality, this level of detail is not much more informative than stating that “the time grows quadratically with n” [1, P. 34].
It’s easier to instead compare the upper and lower bounds of time-complexity functions with asymptotic notation. This includes big O, big omega, and big theta [1, P. 34].
Big O describes an upper bound on a function f(n). f(n) = O(g(n)) means c · g(n) is an upper bound on f(n): there exists some constant c such that f(n) ≤ c · g(n) for large enough n (i.e., for all n ≥ n₀, for some constant n₀) [1, P. 35].
Big O notation ignores multiplicative constants. In big O analysis, f(n) = 2n and g(n) = n are identical.
Big omega describes a lower bound on a function. f(n) = Ω(g(n)) means c · g(n) is a lower bound on f(n): there exists some constant c such that f(n) ≥ c · g(n) for all n ≥ n₀ [1, P. 35].
Big theta defines a tight bound, where a function is bounded both above and below. f(n) = Θ(g(n)) means c₁ · g(n) is an upper bound on f(n) and c₂ · g(n) is a lower bound on f(n), for all n ≥ n₀: there exist constants c₁ and c₂ such that f(n) ≤ c₁ · g(n) and f(n) ≥ c₂ · g(n). This means that g(n) provides a tight bound on f(n) [1, P. 35].
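As a sanity check of these definitions, the following sketch picks a concrete f(n) = 3n² + 2n (an arbitrary example, not one from the text) and verifies numerically that the witnesses c₁ = 4, c₂ = 3, and n₀ = 2 make f(n) = Θ(n²) hold over a test range:

```python
def f(n):
    return 3 * n * n + 2 * n   # example function, arbitrary choice


def g(n):
    return n * n               # candidate tight bound


# Candidate witnesses for f(n) = Theta(g(n)): upper constant c1,
# lower constant c2, and threshold n0.
c1, c2, n0 = 4, 3, 2

for n in range(n0, 1000):
    assert f(n) <= c1 * g(n)   # big O:     f(n) <= c1 * g(n)
    assert f(n) >= c2 * g(n)   # big omega: f(n) >= c2 * g(n)

print("f(n) = Theta(n^2) holds over the tested range")
```

A numerical check like this cannot prove the bound for all n, but it is a quick way to convince yourself that a chosen pair of constants is plausible.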
Big O is the most common notation used when analyzing algorithms.
Growth rates and dominance relations
Big O notation creates classes of functions, where all functions in a class are equivalent in big O analysis [1, P. 39].
There are a few common classes:
Notation | Name | Description
---|---|---
O(1) | Constant | No matter the size of the input, the algorithm takes the same amount of time to complete.
O(log n) | Logarithmic | Logarithmic algorithms grow slowly because they halve the amount of data they work with on each iteration. For example, binary search.
O(n) | Linear | Linear algorithms grow linearly with n. For example, an algorithm that calculates the exponent of a number by multiplying a value n times in a loop.
O(n log n) | Superlinear | Superlinear algorithms grow just a little faster than linear algorithms. Common sorting algorithms like Mergesort and Quicksort run in superlinear time.
O(n^2) | Quadratic | Quadratic algorithms run slowly. An example is an algorithm that checks for duplicates in an array by looping over each element, and then looping over every other element to check for matches.
O(2^n) | Exponential | Exponential algorithms are very slow. An example is a recursive algorithm to find the nth term of the Fibonacci sequence.
O(n!) | Factorial | Factorial algorithms quickly become useless. They occur when generating all permutations of n items [1, P. 39].
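The following sketches (my own minimal implementations, matching the examples named in the table) illustrate a few of these classes: logarithmic binary search, linear loop-based exponentiation, the quadratic duplicate check, and the exponential recursive Fibonacci:

```python
def binary_search(sorted_values, target):
    """O(log n): halves the search range on every iteration."""
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def power(base, n):
    """O(n): multiplies base together n times in a loop."""
    result = 1
    for _ in range(n):
        result *= base
    return result


def has_duplicates(values):
    """O(n^2): compares every element against every other element."""
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                return True
    return False


def fib(n):
    """O(2^n): naive recursion recomputes the same subproblems."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```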
You can see the growth rate of the common classes in the following table, which shows running times assuming one operation per nanosecond:
n | lg n | n | n lg n | n^2 | 2^n | n!
---|---|---|---|---|---|---
10 | 0.003 μs | 0.01 μs | 0.033 μs | 0.1 μs | 1 μs | 3.63 ms
20 | 0.004 μs | 0.02 μs | 0.086 μs | 0.4 μs | 1 ms | 77.1 years
30 | 0.005 μs | 0.03 μs | 0.147 μs | 0.9 μs | 1 sec | 8.4 × 10^15 yrs
40 | 0.005 μs | 0.04 μs | 0.213 μs | 1.6 μs | 18.3 min | 
50 | 0.006 μs | 0.05 μs | 0.282 μs | 2.5 μs | 13 days | 
100 | 0.007 μs | 0.1 μs | 0.644 μs | 10 μs | 4 × 10^13 yrs | 
1,000 | 0.010 μs | 1.00 μs | 9.966 μs | 1 ms | | 
10,000 | 0.013 μs | 10 μs | 130 μs | 100 ms | | 
100,000 | 0.017 μs | 0.10 ms | 1.67 ms | 10 sec | | 
1,000,000 | 0.020 μs | 1 ms | 19.93 ms | 16.7 min | | 
10,000,000 | 0.023 μs | 0.01 sec | 0.23 sec | 1.16 days | | 
100,000,000 | 0.027 μs | 0.10 sec | 2.66 sec | 115.7 days | | 
1,000,000,000 | 0.030 μs | 1 sec | 29.90 sec | 31.7 years | | 
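The timings above are consistent with a rate of one operation per nanosecond. The following sketch (that rate, and the formatting, are my own assumptions) recomputes a few rows of the table from the raw operation counts:

```python
import math

NS_PER_OP = 1e-9  # assumption: each simple operation takes one nanosecond


def seconds(ops):
    """Convert an operation count into an estimated running time."""
    return ops * NS_PER_OP


for n in [10, 20, 30, 1_000, 1_000_000]:
    row = {
        "lg n": seconds(math.log2(n)),
        "n": seconds(n),
        "n lg n": seconds(n * math.log2(n)),
        "n^2": seconds(n ** 2),
    }
    print(n, {label: f"{t:.3e} s" for label, t in row.items()})
```

For n = 10 this gives roughly 0.003 μs for lg n, 0.01 μs for n, 0.033 μs for n lg n, and 0.1 μs for n^2, matching the corresponding row of the table.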
References
- [1] S. Skiena, The Algorithm Design Manual, 2nd ed. Springer, 2008.