Introduction to online optimization: introduction
online learning protocol:
characteristic:
limited feedback
Exponentially weighed average forecaster
Bounded convex loss and expert regret
Hoeffding’s's inequality
lemma 2.1:
Therorem: For any convex loss taking values in [0,1], the Exp strategy satisfies:
Exp-concave loss and expert regret
Lower bound
General convex and bounded loss is unimprovable.
Anytime strategy
For any convex loss with values in [0,1], the exp strategy with time-varying parameter satisfies for all
:
Subdifferentiable loss with bounded subgradient
online finite optimization
For any loss with values in [0,1], the finite Exp strategy with parameter satisfies with probability at least
: