I have not yet gotten to study these two related works, but the results sound very impressive. Rather than waiting another month (or so) until I get to study them, let me take a small risk and recommend them somewhat blindly (i.e., based on the abstracts).
Note that these results refer to approximation up to an additive error term, whereas most previous works referred to constant-factor (i.e., multiplicative) approximation. In the case of entropy, the additive error term is an arbitrarily small constant.
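To spell out the distinction (using my own notation, not the papers'), for a distribution p with entropy H(p), the two guarantees for an estimate read as follows:

```latex
% additive approximation (the guarantee in these works):
|\hat{H} - H(p)| \le \epsilon \quad \text{for an arbitrarily small constant } \epsilon > 0;
% constant-factor approximation (most previous works):
H(p)/c \;\le\; \hat{H} \;\le\; c \cdot H(p) \quad \text{for some constant } c > 1.
```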
The papers are available as TR10-179 and TR10-180 of ECCC.
We introduce a new approach to characterizing the unobserved portion of a distribution, which provides sublinear-sample additive estimators for a class of properties that includes entropy and distribution support size. Together with the lower bounds proven in the companion paper [29], this settles the longstanding question of the sample complexities of these estimation problems (up to constant factors). Our algorithm estimates these properties up to an arbitrarily small additive constant, using O(n/log n) samples; [29] shows that no algorithm on o(n/log n) samples can achieve this (where n is a bound on the support size, or, in the case of estimating the support size, 1/n is a lower bound on the probability of any element of the domain). Previously, no explicit sublinear-sample algorithms for either of these problems were known. Additionally, our algorithm runs in time linear in the number of samples used.
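For contrast, here is a minimal Python sketch of the naive "plug-in" estimators (the empirical entropy, and the number of distinct elements seen). This is emphatically not the papers' algorithm; it is only the baseline whose bias in the sublinear-sample regime illustrates the difficulty the papers overcome. All parameter values (n, the sample size) are hypothetical.

```python
import math
import random
from collections import Counter

def plugin_entropy(samples):
    """Naive plug-in estimator: entropy of the empirical distribution.

    NOT the papers' algorithm; this baseline needs roughly a linear
    (in the support size n) number of samples, whereas the papers
    achieve small additive error with O(n / log n) samples.
    """
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def plugin_support_size(samples):
    """Naive support-size estimate: number of distinct elements seen.

    An underestimate whenever some domain elements go unobserved; the
    papers' approach instead characterizes the unseen portion.
    """
    return len(set(samples))

# Hypothetical illustration: uniform distribution on n elements,
# sampled well below n times (the sublinear regime).
n = 10_000
samples = [random.randrange(n) for _ in range(n // 10)]
print(f"true entropy:      {math.log2(n):.3f} bits")
print(f"plug-in estimate:  {plugin_entropy(samples):.3f} bits")  # biased low
print(f"distinct seen:     {plugin_support_size(samples)} of {n}")  # underestimate
```

Running this, both naive estimates come out noticeably low, since most of the support is never sampled; that unseen portion is exactly what the papers' estimators account for.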