We prove the following results concerning the combinatorics of list decoding, motivated by the exponential gap between the known upper bound (of $O(1/\gamma)$) and lower bound (of $\Omega_p(\log (1/\gamma))$) for the list size needed to decode up to a fraction $p$ of errors at rate $1-h(p)-\gamma$, i.e., at rate $\gamma$ below capacity (here $p\in (0,1/2)$ and $\gamma > 0$).
We prove that in any binary code $C \subseteq \{0,1\}^n$ of rate $1-h(p)-\gamma$, there must exist a set $\mathcal{L} \subset C$ of $\Omega_p(1/\sqrt{\gamma})$ codewords such that the average distance of the points in $\mathcal{L}$ from their centroid is at most $pn$. In other words, there must exist $\Omega_p(1/\sqrt{\gamma})$ codewords with low ``average radius.'' The motivation for this result is that it gives a list-size lower bound for a strong notion of list decoding; this strong form has implicitly been used in the previous negative results for list decoding. (The usual notion of list decoding corresponds to replacing average radius by the minimum radius of an enclosing Hamming ball.)
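For concreteness, the two notions of radius can be written side by side. This is only a sketch under assumed notation that does not appear in the text above: $\Delta$ denotes Hamming distance, the names $\mathrm{rad}_{\mathrm{avg}}$ and $\mathrm{rad}_{\max}$ are introduced here for illustration, and the center $z$ is taken to range over $\{0,1\}^n$ (so that ``centroid'' is read as a minimizer of the average distance).
\[
\mathrm{rad}_{\mathrm{avg}}(\mathcal{L}) \;=\; \min_{z \in \{0,1\}^n} \frac{1}{|\mathcal{L}|} \sum_{c \in \mathcal{L}} \Delta(c, z),
\qquad
\mathrm{rad}_{\max}(\mathcal{L}) \;=\; \min_{z \in \{0,1\}^n} \max_{c \in \mathcal{L}} \Delta(c, z).
\]
Since an average never exceeds a maximum, $\mathrm{rad}_{\mathrm{avg}}(\mathcal{L}) \le \mathrm{rad}_{\max}(\mathcal{L})$. Hence any set of codewords contained in a Hamming ball of radius $pn$ also has average radius at most $pn$, which is why forbidding large sets of small average radius is the stronger requirement on a code.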
The remaining results are for the usual notion of list decoding:
1. We give a short simple proof, over all fixed alphabets, of the above-mentioned $\Omega_p(\log (1/\gamma))$ lower bound due to Blinovsky.
2. We show that one {\em cannot} improve the $\Omega_p(\log (1/\gamma))$ lower bound via techniques based on identifying the zero-rate regime for list decoding of constant-weight codes (this is a typical approach for negative results in coding theory, including the $\Omega_p(\log (1/\gamma))$ list size lower bound). On a positive note, our $\Omega_p(1/\sqrt{\gamma})$ lower bound for the strong form of list decoding does circumvent this barrier.
3. We establish a ``reverse connection'': constant-weight codes for list decoding imply general codes for list decoding with higher rate. This shows that the best possible list size, as a function of the gap $\gamma$ of the rate to the capacity limit, is the same up to constant factors for both constant-weight codes and general codes.
4. We give simple second-moment-based proofs that, with high probability, a list size of $\Omega_p(1/\gamma)$ is needed for list decoding {\em random} codes from errors as well as erasures, at rates which are $\gamma$ away from the corresponding capacities. For {\em random linear} codes, the corresponding list-size bounds are $\Omega_p(1/\gamma)$ for errors and $\exp(\Omega_{p}(1/\gamma))$ for erasures.