Nathan Linial, Yishay Mansour, and Noam Nisan:
Constant Depth Circuits, Fourier Transform, and Learnability

This paper pioneered the use of the Fourier transform to analyze Boolean functions. The Fourier transform expresses any real-valued function on the Boolean cube as a real linear combination of parity functions. Viewing a Boolean-valued function as a special case of a real-valued function, and using the Fourier transform to investigate its properties, has proved to be of central importance and is by now a standard tool.
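As a minimal illustration (assuming the standard ±1 convention for Boolean inputs; the function and variable names here are mine, not the paper's), the full Fourier expansion of a small function can be computed by brute force:

```python
import math
from itertools import product, combinations

def fourier_coefficients(f, n):
    """All 2^n Fourier coefficients f_hat(S) = E_x[f(x) * chi_S(x)] of a
    function f: {-1,+1}^n -> R, where chi_S(x) is the parity of the bits
    indexed by S, i.e. the product of x_i for i in S."""
    points = list(product([-1, 1], repeat=n))
    return {
        S: sum(f(x) * math.prod(x[i] for i in S) for x in points) / len(points)
        for k in range(n + 1)
        for S in combinations(range(n), k)
    }

# Example: 3-bit majority. Its expansion is
#   MAJ(x) = (x1 + x2 + x3)/2 - (x1*x2*x3)/2,
# so all the Fourier mass sits on the three singletons and the full set.
maj = lambda x: 1 if sum(x) > 0 else -1
coeffs = fourier_coefficients(maj, 3)
print(coeffs[(0,)], coeffs[(0, 1, 2)])  # prints: 0.5 -0.5
```

By Parseval's identity, the squared coefficients of any ±1-valued function sum to exactly 1, which gives a quick sanity check on the output.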

This paper proved that any function computed by a Boolean circuit of small depth and limited size has almost all of its Fourier mass concentrated on coefficients corresponding to small parities. Apart from being a beautiful mathematical statement, this theorem has a number of important consequences. In particular, any such function has low average sensitivity, is well approximated by a low-degree polynomial over the real numbers, and can be learned efficiently under the uniform distribution.
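A small brute-force check of this low-degree concentration, for the simplest possible case of an AND gate (a depth-1 circuit). This is my own illustrative sketch, not the paper's construction:

```python
import math
from itertools import product, combinations

def and_fn(x):
    # AND of the bits in the +/-1 convention: +1 iff every input is +1.
    return 1 if all(b == 1 for b in x) else -1

n = 8
points = list(product([-1, 1], repeat=n))

# Fourier weight at each degree k: the sum over |S| = k of f_hat(S)^2.
weight = [0.0] * (n + 1)
for k in range(n + 1):
    for S in combinations(range(n), k):
        c = sum(and_fn(x) * math.prod(x[i] for i in S) for x in points) / len(points)
        weight[k] += c * c

low = sum(weight[:2])
print(low)  # ~0.985: almost all Fourier mass on degrees 0 and 1
```

Even for this trivial circuit the effect is visible: over 98% of the mass sits on the constant and linear coefficients, and the tail at higher degrees is exponentially small.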

Nick Littlestone and Manfred K. Warmuth:
The Weighted Majority Algorithm

This paper presents and analyzes the weighted majority algorithm, one of the most fundamental innovations in all of online learning. The algorithm combines the predictions of many experts on a sequence of online binary prediction tasks in such a way that it makes only slightly more mistakes than the best individual expert makes on that sequence; its rate of prediction errors quickly approaches that of the best expert.

The algorithm makes its decisions based on a weighted combination of the experts’ predictions that is multiplicatively re-weighted based on the experts’ past performance. Both the algorithm and its analysis are simple and beautiful. By showing the simplicity and power of multiplicative weight updates, the methods of this paper have become a key paradigm for showing that algorithms based on these updates yield efficient solutions for a wide variety of algorithmic problems.
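A minimal sketch of the deterministic weighted majority algorithm as just described, assuming 0/1 predictions and a halving factor beta = 1/2; the function and variable names are mine:

```python
def weighted_majority(expert_preds, outcomes, beta=0.5):
    """Deterministic Weighted Majority (sketch).

    expert_preds[t][i] is expert i's {0,1} prediction in round t;
    outcomes[t] is the true {0,1} label for round t.
    After each round, every wrong expert's weight is multiplied by beta.
    Returns the number of mistakes the combined predictor makes."""
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, outcomes):
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        mistakes += (guess != y)
        # Multiplicative update: penalize the experts that were wrong.
        w = [wi * (beta if p != y else 1.0) for wi, p in zip(w, preds)]
    return mistakes

# Three experts; expert 0 is always right, the others always predict 0.
preds = [[1, 0, 0]] * 5
print(weighted_majority(preds, [1] * 5))  # prints: 1
```

For beta = 1/2, the standard analysis bounds the algorithm's mistakes by roughly 2.41 times the best expert's mistake count plus 2.41·log2(n), which is the "small amount more than the best expert" guarantee described above.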

Seinosuke Toda:
On the Computational Power of PP and ⊕P

This paper proves a fundamental and surprising result in computational complexity: every problem in the polynomial-time hierarchy (PH, which contains NP at level 1) is polynomial-time Turing reducible to the problem of determining whether CNF formulas are satisfied by a majority of their input assignments. In technical terms, PH ⊆ P^PP, which also implies that PH ⊆ P^#P.
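To make the majority-of-assignments problem concrete, here is a tiny brute-force checker (my own sketch; real instances are of course intractable this way, which is what makes the class PP powerful):

```python
from itertools import product

def majsat(clauses, n):
    """Brute-force MajSAT for a tiny CNF over n variables (sketch).

    clauses: list of clauses, each a list of nonzero ints in DIMACS style:
    literal +i means variable i, -i means its negation.
    Returns True iff strictly more than half of the 2^n assignments
    satisfy every clause."""
    count = 0
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            count += 1
    return count > 2 ** (n - 1)

# (x1 or x2) and (not x1 or x3): satisfied by 4 of 8 assignments -> not a majority.
print(majsat([[1, 2], [-1, 3]], 3))  # prints: False
# (x1 or x2) alone: satisfied by 3 of 4 assignments -> a majority.
print(majsat([[1, 2]], 2))          # prints: True
```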

The main result is established by showing the stronger containments PH ⊆ BP·⊕P ⊆ P^PP. The first of these says that if one had a subroutine to determine whether instances of some NP-complete problem have an odd number of solutions, then any instance of a problem in PH could be solved with high probability using a single call to this subroutine on a related instance created using random choices and polynomial time.

Beyond the intrinsic value of this paper in clarifying the landscape of complexity classes associated with NP, its results and techniques, especially the polynomial-based technique it introduced for amplifying solution counts, have inspired subsequent work on other fundamental questions in computational complexity theory.
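The amplification idea can be checked numerically. The low-degree polynomial commonly given for Toda's proof is a(s) = 3s^4 + 4s^3, which turns a solution count known modulo m into one known modulo m^2; the snippet below is my own verification sketch of that property:

```python
def amplify(s):
    # Toda's amplifying polynomial: a(s) = 3*s^4 + 4*s^3.
    return 3 * s ** 4 + 4 * s ** 3

# Claimed property: if s = 0 (mod m) then a(s) = 0 (mod m^2), and
# if s = -1 (mod m) then a(s) = -1 (mod m^2). Iterating the polynomial
# therefore pushes a count known mod 2 up to a count known mod 2^(2^k).
for m in range(2, 30):
    for s in range(3 * m):
        if s % m == 0:
            assert amplify(s) % (m * m) == 0
        if s % m == m - 1:
            assert amplify(s) % (m * m) == (m * m) - 1
print("amplification verified for m = 2..29")
```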