Aggregating algorithms are a class of prediction algorithms
developed in the strand of machine learning known as prediction with expert advice.
The main algorithms in this class are the Strong Aggregating Algorithm
(SAA, introduced in my COLT 1990 paper)
and the Weak Aggregating Algorithm
(WAA, introduced by Yura Kalnishkan and Misha Vyugin in their COLT 2005 paper).
The SAA is the generalization of the Bayesian merging rule
(introduced in learning theory by DeSantis, Markowsky, and Wegman
in their FOCS 1988 paper)
to a wider class of loss functions;
it includes as special cases the earlier Weighted Majority Algorithm
(proposed independently by Littlestone and Warmuth,
COLT 1989 and *Information and Computation* 1994,
and by me, *Information and Computation* 1992)
and the universal portfolio algorithm
(Cover, *Mathematical Finance* 1991).
The WAA extends the basic scheme of the SAA to a still wider class of loss functions.
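For the log loss function the SAA with learning rate 1 coincides with the Bayesian merging rule, so the simplest instance of the algorithm is just Bayesian mixing of the experts' probability forecasts. A minimal Python sketch of this special case (the function name and interface are mine, for illustration only):

```python
import math

def aggregate_log_loss(expert_preds, outcomes):
    """Aggregating Algorithm for binary log loss with learning rate 1.

    In this special case the algorithm coincides with the Bayesian
    merging rule: the weights are posterior probabilities and the
    learner's forecast is the weighted mixture of the experts'
    forecasts.

    expert_preds[t][i] is expert i's probability that outcome t is 1;
    outcomes is a list of 0/1 outcomes.  Returns the learner's total
    log loss, which never exceeds the best expert's total loss by more
    than ln(number of experts).
    """
    n = len(expert_preds[0])
    weights = [1.0 / n] * n  # uniform prior over the experts
    total_loss = 0.0
    for preds, y in zip(expert_preds, outcomes):
        # Learner's forecast: weighted average of the experts' forecasts.
        p = sum(w * q for w, q in zip(weights, preds))
        total_loss += -math.log(p if y == 1 else 1.0 - p)
        # Bayesian update: multiply each weight by the expert's
        # likelihood for the observed outcome, then renormalize.
        weights = [w * (q if y == 1 else 1.0 - q)
                   for w, q in zip(weights, preds)]
        z = sum(weights)
        weights = [w / z for w in weights]
    return total_loss
```

For instance, with two experts always forecasting 0.9 and 0.1 on a sequence of 1s, the learner's total loss stays within ln 2 of the better expert's.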

There are many other techniques in prediction with expert advice, such as gradient descent (and its important modification, exponentiated gradient descent), following the perturbed leader, and defensive forecasting. The first two techniques are typically more computationally efficient (although one often has to pay the price of weaker performance guarantees). There are also exciting new kinds of results in prediction with expert advice (such as prediction with limited feedback and the analysis of internal regret) that are not reflected on this page as yet.
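Of the alternative techniques just mentioned, following the perturbed leader is perhaps the simplest to state. The sketch below (the name and the exponential perturbations are my illustrative choices, following Kalai and Vempala's version of Hannan's idea) draws one perturbation per expert at the start and then always follows the expert whose perturbed cumulative loss is smallest:

```python
import random

def follow_the_perturbed_leader(loss_rounds, eta=1.0, seed=0):
    """Sketch of Follow the Perturbed Leader for prediction with
    expert advice.

    loss_rounds[t][i] is expert i's loss (in [0, 1]) in round t.
    One exponential perturbation per expert is drawn once up front;
    each round the learner suffers the loss of the expert minimizing
    cumulative past loss minus its perturbation.  Returns the
    learner's total loss.
    """
    rng = random.Random(seed)
    n = len(loss_rounds[0])
    # One perturbation per expert, drawn once at the start.
    perturb = [rng.expovariate(eta) for _ in range(n)]
    cum = [0.0] * n  # cumulative losses of the experts so far
    total = 0.0
    for losses in loss_rounds:
        leader = min(range(n), key=lambda i: cum[i] - perturb[i])
        total += losses[leader]
        for i in range(n):
            cum[i] += losses[i]
    return total
```

The appeal of the method is computational: each round requires only a minimization over the experts (or, in structured problems, a single call to an offline optimizer), at the cost of randomization and somewhat weaker guarantees.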

Papers (zipped PostScript or PDF; where no authors are given, I am the only author):

- "Aggregating strategies" (COLT 1990, pp 371–383). This paper introduces the Strong Aggregating Algorithm (SAA).
- "A game of prediction with expert advice"
(
*Journal of Computer and System Sciences***56**, 153–173, 1998). This paper shows that the SAA is optimal in a natural sense. - "Derandomizing stochastic prediction strategies"
(
*Machine Learning***35**, 247–282, 1999). Application of the SAA to the problem of tracking the best expert and related problems; in particular, the paper shows that an earlier algorithm proposed by Herbster and Warmuth is a special case of the SAA. - "Competitive on-line statistics"
(
*International Statistical Review***69**, 213–248, 2001). Application of the SAA to uncountably many experts; review of the field. - "The weak aggregating algorithm and weak mixability" by Kalnishkan and Vyugin (COLT 2005). This paper introduces the Weak Aggregating Algorithm.
- Competing with stationary prediction strategies.
In this paper we introduce the class of stationary prediction strategies and construct a prediction algorithm that asymptotically performs as well as the best continuous stationary strategy. We make mild compactness assumptions but no stochastic assumptions about the environment. In particular, no assumption of stationarity is made about the environment, and the stationarity of the considered strategies only means that they do not depend explicitly on time; we argue that it is natural to consider only stationary strategies even for highly non-stationary environments. The conference version of this paper is published in the COLT 2007 Proceedings.

- Competing with Markov prediction strategies.
Assuming that the loss function is convex in the prediction, we construct a prediction strategy universal for the class of Markov prediction strategies, not necessarily continuous. Allowing randomization, we remove the requirement of convexity.

- Metric entropy in competitive on-line prediction.
Competitive on-line prediction (also known as universal prediction of individual sequences) is a strand of learning theory avoiding making any stochastic assumptions about the way the observations are generated. The predictor's goal is to compete with a benchmark class of prediction rules, which is often a proper Banach function space. Metric entropy provides a unifying framework for competitive on-line prediction: the numerous known upper bounds on the metric entropy of various compact sets in function spaces readily imply bounds on the performance of on-line prediction strategies. This paper discusses strengths and limitations of the direct approach to competitive on-line prediction via metric entropy, including comparisons to other approaches.

- Prediction with expert advice for the Brier game.
In this technical report I show that the Brier game of prediction is perfectly mixable and find the optimal learning rate and substitution function for it. These results are straightforward, but the computations are surprisingly messy.

Talks:

- Talk at the workshop "Metric Entropy and Applications in Analysis, Learning Theory and Probability" (Edinburgh, Scotland, 12 September 2006); for the slides, click here. The proofs for all the results stated in the talk (or relevant references) can be found here.
- Talk at the 12th International Conference on Approximation Theory (San Antonio, Texas, 7 March 2007): click here.
- Talk at COLT 2007 (San Diego, California, 15 June 2007): click here. The talk is based on this paper.
- Talk at the workshop on Frontiers of Computational Reasoning (Microsoft Research Cambridge, 19 March 2009): click here.