Note: a handful of other works are covered by previous choices.
Re (1): I'm happy to see yet another application of the paradigm of decomposing any graph into parts that are rapidly mixing, while omitting relatively few edges. Let me seize this opportunity and mention the application of this paradigm in the bipartite tester for bounded-degree graphs.
Re (2): When small-bias spaces were first defined and constructed, I felt that the exact polynomial dependence on the length and bias parameters (i.e., $n$ and $\eps$, respectively) is quite immaterial. Nowadays, being even more convinced of the fundamental nature of this notion, obtaining an efficient construction of optimal size seems a worthy goal to me, and this paper seems to make a significant step in this direction.
Re (3): This work generalizes Chernoff Bound to the case that the 0-1 random variables may not be independent but have a small bias in the sense that the bias of any linear combination decreases exponentially with its size. On the other hand, these biases correspond to co-moments of the random variables, which may hint to why this condition may lead to strong tail inequalities. Turning back to the dual view of biases, which are reflected in guessing XORs of values, this leads to proofs of Threshold Direct-Product Theorems (DPTs) in various setting. Thus, while I still maintain that it is odd to derive a standard DPT via the XOR Lemma (cf. Remark following Lemma 7 in the old survey of Yao's XOR Lemma), this work demonstrates that the XOR Lemma is a natural tool towards deriving Theorems DPTs.
Re (4): Being fascinated with Valiant's model of Evolvability, this paper is an obvious choice for me. It seems that this paper indicates (more than prior works) that evlovability as modeled by Valiant is quite realistic. I refer to the monotone feature, and do note that it is established (for all SQ learnable classes) only via a distribtrubtion-dependent algorithm. Still!