Editor's note: Our recent blog about why engineers should embrace and understand Big-O notation to help them best implement algorithms in their work started a bit of a discussion. (Hacker News thread here. To bring the counterpoint side of this argument to our blog pages, I welcomed in one commenter (a writer and engineer in his own right) to pen their full case for why Big-O's usefulness in software development – and software developer hiring – is often overplayed. (Oh, and since he prefers the styling Big O, it will henceforth be shown here as such. Works for me!)

Why should you learn Big O and stop hacking your way through algorithms? I guess I would answer by saying there's a lot wrong with that premise. You see, I think Big O notation should stop being pushed as a practical software development pillar, and my thesis on this essentially has three main points:

  1. Especially in startups, it's more important to build things that work rather than worry about the internals; the name of the game is iterating quickly and finding product-market fit.
  2. Interviewing processes should focus more on code-writing and code-reading rather than memorized trivia (e.g. what's the Big O of X or Y).
  3. Big O is deceptively trivialized, as certain algorithms and data structures can wildly swing in complexity as well as its analysis.

The root of all evil

Outside of computer science departments, very few people actually do complexity analysis professionally. In fact, Donald Knuth famously wrote:

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: Premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

Keep in mind Knuth was an academic. In the real world, one might spend even less time worrying about inefficiencies. Unless working in very specific performance-sensitive niches (think high frequency trading or embedded systems), code should be easy to follow, easy to debug, and easy to test. When working in a startup, this is doubly important. Spending time optimizing an algorithm from cubic to quadratic complexity, for example, is a huge waste when the startup is still trying to gauge product-market fit. Not only that, but spending time even working out the complexity of an algorithm can take inordinate amounts of time and expertise. Further, programming languages and operating systems these days to their best to hand-hold developers:

  • C++'s STL (standard libraries) are well-tested and optimized. There are some variants (namely, EA's) that specifically focus on performance.
  • Java was one of the first languages to implement thread-safe data structures (ConcurrentHashMap and its ilk). These days, C++, as well as Rust, have followed suit.
  • Modern compiler optimizations are extremely advanced: See gcc's -O flag or MSVC's /O options.

The main throughline here is that fiddling with Big O, analyzing, and micro-optimizing are less important than simply getting your program to do what it's supposed to do. This is doubly important in fast-moving scenarios (like early-stage startups). Eternally mulling over algorithmic complexities can be a great way to get stuck in an eternal analysis paralysis where you're figuring out for months-on-end what's the best framework, the best language, the best platform, the best data structure, or the best algorithm. Voltaire said it best: Il meglio è l'inimico del bene – the best is the enemy of the good.

The elephant in the room

But we must be here for a reason – why do people study Big O? Why is the internet full of blog posts and Stack Overflow questions about Big O? The answer is twofold:

  1. Big O is part of computer science curriculum.
  2. Big O is part of the professional interview process.

The first point is understandable. Grasping algorithmic behavior as some input tends to infinity should certainly be understood by undergraduate, let alone graduate, students. Understanding best case, worst case, and average case time and space complexity is central to proving things about algorithms and data structures. Big O is a part of a wide arsenal of tools a computer scientist might have at her disposal, alongside λ-calculus, proof theory, type theory, various formal grammars, and so on.

The second point, however, I am fundamentally opposed to. Big O during interview processes won't have any kind of mathematical richness or rigor behind it; it won't be analyzed in a deep way, nor will it be used in a formal proof one way or another. Big O in interviews is trivia. What's the Big O of quicksort? – the correct answer is O(n log n). Learning this kind of Big O trivia is well-covered by books like Cracking the Coding Interview and websites like bigocheatsheet.com.

But the why and the how? The answers to these questions – the interesting ones – are all but ignored. This does a disservice both to interviewers, as well as interviewees. Instead of being evaluated on coding ability like writing, reading, reviewing, debugging, etc., candidates are evaluated on memorized trivia. And instead of providing the company with actual value, we're hiring people that are good at memorization but not at actually building things. This is one of the few hills I'm willing to die on: Big O questions during interviews need to stop. They provide no value and are easily gamed.

Big O is hard

Finally, I'd like to look at a couple of seemingly-benign Big O interview-style questions you might encounter. First, we have one of my favorite Stack Overflow questions: O(log n) algorithm for finding max of array?

The question is simple: does an algorithm exist that finds the maximum of an unsorted array in O(log n) time? The answer is also seemingly simple: no. Mathematically, we don't have anything to repeatedly cut in half to achieve the logarithmic behavior we're looking for.

This is, on the face of it, correct. But if we dig a bit deeper, we find a rich discussion in the question's comments. Alexander Yee (of y-cruncher fame), claims that we can, in fact, achieve logarithmic complexity using parallel reduction, given that we have k > n processors. I contend that even with parallel reduction, thread allocation is still O(n), so the complexity will remain, in the best case, O(n log n). There's some back and forth where we discuss how expensive thread allocation is, and whether or not O(processors × time complexity) applies given constant thread allocation or not. I also touch on algorithm cascading (see Slide 30) and Brent's Theorem (1974), which states:

"Assume a parallel computer where each processor can perform an operation in unit time. Further, assume that the computer has exactly enough processors to exploit the maximum concurrency in an algorithm with M operations, such that T time steps suffice. Brent’s Theorem say that a similar computer with fewer processes, P, can perform the algorithm in:

Screen Shot 2020-09-10 at 3.19.01 PM.png

And eventually, our debate is put to rest by another user: I know with Sorting Networks, complexity is explicitly qualified as either the 'size' (correlating with O(#processors x time complexity) ) or 'depth' (correlating with O(time complexity) ), so I would extrapolate to this question as the idea that either is correct, if adequately qualified. The point here is that even a simple first-year CS question that seems entirely obvious is actually quite deep.

Next, I'd like to look at another simple Big O question: What's the complexity of a HeapSort-like algorithm? Of course, we memorized that HeapSort is O(n log n), but let's try and actually derive that. HeapSort, or a HeapSort-like algorithm, will look something like so:

  • The input is an unsorted array of n elements, A
  • We need a temporary array T, of n+1 elements, which will function as a heap
  • Insert elements from A into T, by calling T.insert(A[i]) for every index i
  • Call T.deleteMin() n times, moving elements from T to A one-by-one

This seems like it would take a fairly straightforward complexity analysis, so let's take a look:

Screen Shot 2020-09-10 at 3.19.21 PM.png

But that's not right; or at least it doesn't agree with our memorized O(n log n) complexity class. The devil, as they say, is in the details. Even if we assume that we might have the knowledge to successfully derive the third step – and get to O(log(n!)) – we almost certainly won't be equipped with the mathematical tools to go further: namely, Stirling's approximation. Stirling's approximation is one of the many approximations of the factorial function, and can be obtained by evaluating the sum approximated by the integral:

Screen Shot 2020-09-10 at 3.19.34 PM.png

In our case (because we're looking at asymptotic behavior), we ignore the trailing leftovers, and we are left with:

Screen Shot 2020-09-10 at 3.19.58 PM.png

This will dominate the preceding factor of O(n), and we simply use the above as the complexity of HeapSort, which is what we memorized in the first place! But the road was long and circuitous: we had to know what a factorial is, we had to have remembered Stirling's approximation, and, if not, we would have had to understand how to integrate the natural logarithm. Real life code is orders of magnitude more complicated than my benign HeapSort example – with more cases, conditional loops, and wildly varying best, worst, and average cases. Let's not pretend we can simply do Big O on a lunchtime napkin.

Parting words

At the end of the day, Big O will remain, for most people, a curiosity. Unless you're a computer scientist or you're working in niches where you'd need a deep understanding of performance constraints, just building stuff that works will, more often than not, be the best course of action.

This is not meant to be a denigration of Big O or the math behind it; I myself think it's profoundly interesting. However, I don't think it has much application, especially when modern languages and operating systems do a pretty good job of hand-holding as is. In fact, Big O's primary practical use these days is in trivia-style interview questions that ought to be abolished.

If you'd like to do Big O, by all means, do Big O – but be warned that here be dragons.

Correction: We fixed a small error in the Big O of quicksort explanation above.

Discussion

Categories
Share Article

Continue Reading