The notes on unit testing can be found here.
Last week, as we discussed two different searching algorithms, it was pretty clear that their runtimes were very different. Ultimately, we reasoned that for an input of size $n$, linear search ran in time $\propto n$ and binary search in time $\propto \log_2(n)$. Today, we will look at a way to generalize this kind of analysis: describing the runtime of a function in terms of the size of its inputs.
Big-O analysis is a way of reasoning about the asymptotic runtime of a function. Recall from last time that we can count the number of operations in a function to come up with some expression $T(n)$. One of the important aspects of big-O is that the constant factors in this expression don't matter and, more generally, only the largest term is meaningful. We also don't care about small inputs, only large ones.
One justification for this is that the constants we determine are based on some inaccurate assumptions. For example, we've assumed that addition takes just as long as multiplication and that function calls have no overhead, neither of which is true. Even if we could measure these constants precisely on a particular system, the results wouldn't generalize to different hardware anyway. There are situations where these constant factors do make a difference, but we will not be concerned with that in this class.
Below is a graph of some common runtime classes, with various constants. Hopefully this helps to convince you that the various constants we use don't matter by the time we get to large inputs. In some cases, the functions that are better at large inputs are slower for small inputs. However, most functions run "instantly" on small inputs anyway, so this is also not a problem.
We've already talked a little about being able to use functions with the same prototype interchangeably, so what does this mean in terms of big-O? First, if we swap two functions with the same big-O runtime in some code we've written, we can expect the overall runtime to be similar. More importantly, say we've written a large program that uses linear search over sorted data; we can then replace it with binary search and expect the program to run faster with no side effects (as long as we aren't assuming anything beyond what's in the prototype!).
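To make this concrete, here is a minimal sketch in C (the function and parameter names are mine, not from lecture): both searches share the same prototype, so a caller can swap one for the other without any other changes.

```c
#include <stddef.h>

/* Both functions share the same prototype: given a sorted array of
 * `len` ints, return the index of `target`, or -1 if it's absent. */

/* O(n): scan elements in order. */
int linear_search(const int *a, size_t len, int target) {
    for (size_t i = 0; i < len; i++) {
        if (a[i] == target)
            return (int)i;
        if (a[i] > target)      /* array is sorted, so we can stop early */
            return -1;
    }
    return -1;
}

/* O(log n): halve the search range each iteration. */
int binary_search(const int *a, size_t len, int target) {
    size_t lo = 0, hi = len;    /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] == target)
            return (int)mid;
        else if (a[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

Because the prototypes match, replacing `linear_search` with `binary_search` at a call site changes the runtime from $O(n)$ to $O(\log n)$ without changing behavior.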
Formally, we define big-O as $$O(f) = \{g \mid \exists c > 0 \textrm{ and } n_0 \geq 0 \textrm{ such that } \forall n \geq n_0,\ g(n) \leq cf(n)\}$$ Don't let the set notation scare you. All we're saying here is, for example, that all functions that run in (at most) linear time are in the same class of functions.
Now for a simple example: consider $T(n) = 34n^2$. If we let $c = 34$ and $n_0 = 0$, then $T(n) \leq cn^2$ for all $n \geq n_0$, so $T$ satisfies the formal definition of $O(n^2)$.
How about a more complicated case? Consider $T(n) = 9n^3 + 32n^2 + 99$. Intuitively, we can see that this is $O(n^3)$, but how do we show it using the formal definition? We need a $c$ such that $9n^3 + 32n^2 + 99 \leq cn^3$. We make approximations: for $n \geq 1$ we can say that $n^3 \geq n^2 \geq 1$. Therefore, we can write an inequality stating that $$9n^3 + 32n^2 + 99 \leq 9n^3 + 32n^3 + 99n^3 = 140n^3$$ Therefore $c = 140$ works. Now, how about $n_0$? The inequality fails for $n_0 = 0$ (note $T(0) = 99 > 0$), but it holds if we let $n_0 = 1$.
Suppose we have two functions $f$ and $g$ where $T_f(n) = 4n$ and $T_g(n) = 33n + 5$. A big-O analysis would allow us to say that $O(T_f) = O(T_g)$. You may have noticed that our definition isn't very strict. For example, in addition to saying that $T_f \in O(n)$, it is also true that $T_f \in O(n^2)$. Since we've already defined big-O in terms of set notation, we can also say that $O(n) \subseteq O(n^2)$. In other words, every function that has a linear bound on its runtime also has a quadratic bound. You should be familiar with these relations and be comfortable with the notation. However, if we ask you for the big-O runtime of a function, we are looking for the smallest bounding function. We could explicitly ask for the tight bound (big-$\Theta$), but the proof would be more involved.
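To see the containment concretely, here is the $T_f$ from above bounded quadratically (a worked step of my own, following the formal definition): $$T_f(n) = 4n \leq 4n \cdot n = 4n^2 \quad \textrm{for all } n \geq 1$$ so $c = 4$ and $n_0 = 1$ witness $T_f \in O(n^2)$, even though $O(n)$ is the tighter bound.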
Here is another interesting result of big-O. Consider functions $f$ and $g$ where $T_f(n) = \log_2(n)$ and $T_g(n) = \log_3(n)$. What can we say about their big-O runtimes? It turns out that $O(T_f) = O(T_g)$. Recall the change of base formula for logs: $$\log_a(n) = \frac{\log_b(n)}{\log_b(a)}$$ This holds for any valid bases $a$ and $b$, so the two differ by only a constant factor. For this reason, we usually don't write any base on our logs in big-O analysis.
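Instantiating the formula with our two runtimes (my choice of constants, as an illustration): $$\log_3(n) = \frac{\log_2(n)}{\log_2(3)} \approx 0.63 \log_2(n)$$ so $c = 1$ and $n_0 = 1$ give $\log_3(n) \leq c\log_2(n)$, and $c = \log_2(3)$ works in the other direction.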
How about exponents? Consider functions $f$ and $g$ where $T_f(n) = 2^n$ and $T_g(n) = 3^n$. These are not in the same class: there is no $c$ such that $3^n \leq c2^n$, regardless of $n_0$. The key to understanding this is that $c$ must be generic, i.e. not tied to $n$. Yes, for a particular value of $n$ we can find such a constant, but we can increase $n$ to the point where that constant would no longer work.
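One way to see this (a reasoning step not spelled out above): $$\frac{3^n}{2^n} = \left(\frac{3}{2}\right)^n \to \infty \textrm{ as } n \to \infty$$ so for any fixed $c$, the ratio eventually exceeds $c$ and the inequality $3^n \leq c2^n$ fails from that point on.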
So, how do we take this very abstract stuff and bring it back to analyzing actual code? Well, we certainly don't want to keep counting operations in everything we write. In practice, this comes down to recognizing common patterns in code and matching them to runtime classes we already know. Here are some examples:
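The sketches below are my own illustrations (not the exact patterns from lecture) of the most common code shapes and the runtime class each one suggests:

```c
#include <stddef.h>

/* O(1): a fixed number of operations, independent of n. */
int first(const int *a, size_t n) {
    (void)n;        /* n doesn't affect the work done */
    return a[0];
}

/* O(n): a single loop doing constant work per element. */
long sum(const int *a, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* O(n^2): nested loops, each running up to n times. */
size_t count_duplicate_pairs(const int *a, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (a[i] == a[j])
                count++;
    return count;
}

/* O(log n): the amount of remaining work is halved each iteration. */
int num_halvings(size_t n) {
    int steps = 0;
    while (n > 1) {
        n /= 2;
        steps++;
    }
    return steps;
}
```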
One final note: big-O runtimes can involve more than one variable. Suppose you have nested for loops that iterate over sizes $m$ and $n$, with only constant-time operations inside; the runtime would then be $O(mn)$. Don't feel compelled to find a way to reduce this to one variable.
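For instance, a sketch (function name mine) of a doubly nested loop over two independent sizes:

```c
#include <stddef.h>

/* O(mn): the inner body executes exactly m * n times. */
long sum_all_products(const int *a, size_t m, const int *b, size_t n) {
    long total = 0;
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            total += (long)a[i] * b[j];
    return total;
}
```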
Runtime analysis will be pivotal to our discussions of data structures (introduced next week) for the remainder of the semester.
If you have any comments or see any errors, please let me know.