Kolmogorov Complexity and Formula Size Lower Bounds

PhD thesis of Troy Lee, University of Amsterdam, January 2006

[ pdf ] (1096 KB)


Part I: Kolmogorov Complexity

Randomness is a concept familiar to all of us in our daily lives. From changing weather, to the flip of a coin, to the shuffle function on a music player---random is frequently used to describe the behavior of things around us. Despite this familiarity, or perhaps because of it, randomness eluded a precise mathematical definition until the second half of the twentieth century.

In the early 1960's, independently and within a few years of each other, Solomonoff, Kolmogorov, and Chaitin all developed an elegant way to capture when a sequence of events (viewed as a string with letters 0 and 1) is random. They proposed the following definition: the complexity of a string is the length of a shortest description of that string. Here a description can be thought of as a computer program. We will refer to this measure as the Kolmogorov complexity of the string, and write $C(x\pipe y)$ for the length of a shortest program for $x$ which is allowed to make use of the advice string $y$. While the string $01010101010101010101010101010101010101010101010101$ has 50 symbols, it has a much shorter description saying ``write 25 times the string 01''. On the other hand, a random string will have no structure which would allow a description of length shorter than the length of the string itself. Thus we call a string random if its Kolmogorov complexity is at least as large as its length.

Far beyond its initial purpose, Kolmogorov complexity has become an important tool in theoretical computer science, witnessing many diverse applications. Almost all of the applications use at least one of the following four fundamental theorems, which we call the `four pillars' of Kolmogorov complexity:

A drawback to Kolmogorov complexity is that it is uncomputable, and this sometimes limits its range of applicability in computational complexity. One way to scale down the theory into the feasible domain is to require that the program which prints the string $x$ does so in time which is polynomial in the length of $x$. The main goal of the first part of s thesis is to see what analogues, if any, of the four pillars hold in this resource bounded setting.

The first pillar holds unchanged in the resource bounded setting, as adding restrictions to a program can only make it harder to describe a string.

Things get much more interesting beginning with the second pillar. The most natural resource bounded analog of the second pillar is the statement: every element $x$ in a set $A$ decidable in polynomial time has a description of length $\log \card{A}$ from which $x$ can be generated in polynomial time. This statement is unlikely to hold, however, as it implies the polynomial hierarchy collapses. We show that going up one step higher in the polynomial hierarchy, however, we can find an analog of the second pillar. That is, we are able to show every element $x$ in a set $A$ which can be decided in nondeterministic polynomial time can be given a description of length about $\log \card{A}$ from which $x$ can be generated in nondeterministic polynomial time.

The most natural analogue of the third pillar would be: any element in the support of a polynomial time computable probability distribution $P$ can be given a description of length about $-\log P(x)$. We show that such a statement implies that randomized algorithms (those which can make decisions based on the outcome of a coin flip) can be simulated by deterministic algorithms (which cannot flip coins). Precisely, we show this implies $\BPP \ne \EXP$. Recently, Antunes and Fortnow were able to prove a weak converse to this statement: under a derandomization assumption the polynomial time version of the third pillar holds. With these results we have a fairly complete picture of the third pillar in the resource bounded setting.

Finally, we turn to symmetry of information which is perhaps the most tricky of the four pillars in the resource bounded domain. To prove symmetry of information seems to require the ability to do both language compression and source compression. We are only able to prove a weaker version of symmetry of information which says that the polynomial time complexity of the pair $(x,y)$ is larger than the randomized nondeterministic complexity of $x$ plus the randomized nondeterministic complexity of $y$ given $x$. This result is tight with respect to relativizing techniques---we give an oracle where even the nondeterministic complexity of $(x,y)$ is nearly twice as large as the nondeterministic complexity of $x$ plus the nondeterministic complexity of $y$ given $x$.

Part II: Formula Size Lower Bounds

One of the most famous, important, and yes, most difficult problems in complexity theory is the question of whether or not P is equal to NP. P is the set of all problems that can be solved in time polynomial in the length of the problem description; NP is the set of problems which can be efficiently verified, meaning that if someone gives you the solution to the problem you can indeed check that it is a proper solution in polynomial time. Currently, it is widely believed that P is not equal to NP. To prove this one must show a lower bound, that is give an example of a problem in NP that requires more than polynomial time to solve. A well-studied approach to doing this is through circuit complexity, where a circuit consists of AND, OR, and NOT gates. The current best lower bound on the size of a circuit required to compute a function in NP is 5n. As the task of proving larger circuit lower bounds seems extremely difficult, we look instead at the weaker model of formula size, where a formula is simply a circuit where every gate has exactly one outgoing wire. The current best lower bound for formula size is $n^3$.

We give a new algebraic approach to proving formula size lower bounds. We are not able to improve on the best $n^3$ lower bound; we are, however, able to simultaneously generalize several techniques from the literature and show them as part of an overarching theory. We also give some concrete examples of functions for which our new method is able to provably do better than previous methods. Perhaps the most interesting consequence of our results, however, is that it gives evidence for a connection between the formula size of a function and its complexity in a very different model, namely its quantum query complexity. In fact, our results can be taken as evidence for the provocative conjecture that the square of the quantum query complexity of a function lower bounds its formula size.