The paper shows that designing learning algorithms for a circuit class C can yield circuit lower bounds against C. This has been shown before in a paper of Fortnow and Klivans, but the current paper gives a straightforward and simple proof, which is also much stronger quantitatively.
One nice special case of one of their results is that any non-trivial learning algorithm for a circuit class C in the exact learning model (with membership and equivalence queries) separates C from EXP. By non-trivial they mean makes less that $2^n$ queries (and spends at most exponential time in total). I see it as a kind of an analogue to Ryan Williams works, with SAT algorithms replaced with learning. The nice thing here is that it shows that learning algorithms allow us to separate C from EXP and not just from NEXP as in Ryan's works.
As far as I can see, the instantiation highlighted by Or does not appear explicitly in the paper, although it is elluded to in the concluding section (Sec. VI). Also, Thm 1 neglectly omits the (crucial) condition $M<2^n$, which does appear in the formal statements in Sec. III.
I wish to stress that the paper provides an array of results for various learning models, ranging from exact learning (mentioned above) to standard PAC and Statistical learning models.
See CCC'13 proceedings.