We study the fundamental problems of (i) uniformity testing of a discrete distribution,
and (ii) closeness testing between two discrete distributions with bounded $\ell_2$-norm.
These problems have been extensively studied in distribution testing,
and sample-optimal estimators are known for them~\cite{Paninski:08, CDVV14, VV14, DKN:15}.
In this work, we show that the original collision-based testers proposed for these
problems~\cite{GRdist:00, BFR+:00} are sample-optimal, up to constant factors.
Previous analyses showed sample complexity upper bounds for these testers that are optimal
as a function of the domain size $n$, but suboptimal by polynomial factors
in the error parameter $\epsilon$. Our main contribution is a new tight analysis
establishing that these collision-based testers are information-theoretically optimal,
up to constant factors, both in the dependence on $n$ and in the dependence on $\epsilon$.
The collision probability tester, introduced by Goldreich and Ron ({\em ECCC}, TR00-020, 2000),
distinguishes the uniform distribution over $[n]$ from any distribution that is $\epsilon$-far from it using $\poly(1/\epsilon)\cdot\sqrt{n}$ samples.
While the original analysis established only an
upper bound of $O(\epsilon^{-4})\cdot\sqrt{n}$ on the sample complexity, a recent analysis of Diakonikolas, Gouleakis, Peebles, and Price
({\em ECCC}, TR16-178, 2016) established the optimal upper bound of $O(\epsilon^{-2})\cdot\sqrt{n}$.
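For concreteness, here is a minimal Python sketch of such a collision-based tester. The acceptance threshold $(1+2\epsilon^2)/n$ and the constant in the sample size are illustrative assumptions, not the tuned choices of the analyses cited above.
\begin{verbatim}
import random
from collections import Counter

def collision_tester(samples, n, eps):
    """Accept iff the empirical collision probability looks uniform.

    The uniform distribution over [n] has collision probability 1/n,
    while any eps-far distribution has collision probability greater
    than (1 + 4*eps**2)/n; the midpoint-style threshold used below is
    an illustrative choice, not a tuned constant from the literature.
    """
    m = len(samples)
    # Count colliding pairs via the multiplicity of each symbol.
    collisions = sum(c * (c - 1) // 2 for c in Counter(samples).values())
    estimate = collisions / (m * (m - 1) // 2)
    return estimate <= (1 + 2 * eps ** 2) / n

# Illustrative usage with an arbitrary constant C = 20 in the sample
# size m = C * sqrt(n) / eps**2 (again, not a constant from the papers).
n, eps = 10_000, 0.25
m = int(20 * n ** 0.5 / eps ** 2)
samples = [random.randrange(n) for _ in range(m)]
print(collision_tester(samples, n, eps))  # typically prints True
\end{verbatim}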
In this note we survey their analysis,
while highlighting the sources of improvement. Specifically:
\begin{itemize}
\item
While the original analysis reduces the testing problem to approximating the collision probability of the unknown distribution up to a factor of $1+\epsilon^2$, the improved analysis capitalizes on the fact that the latter problem
needs to be solved only ``at the extreme'' (i.e., it suffices to distinguish the uniform distribution, which has collision probability $1/n$, from any distribution whose collision probability exceeds $(1+4\epsilon^2)/n$; see the calculation following this list).
\item
While the original analysis gives an almost
optimal bound on the variance of the estimator when $\epsilon=\Omega(1)$, a more careful analysis yields a significantly better bound in the case $\epsilon=o(1)$, which is the relevant regime here (see the variance formula following this list).
\end{itemize}
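To make the first item concrete, the following standard calculation (recorded here for completeness) shows that any distribution $p$ over $[n]$ that is $\epsilon$-far from uniform in total variation distance indeed has collision probability exceeding $(1+4\epsilon^2)/n$. Writing $u$ for the uniform distribution, using $\sum_i (p_i-u_i)=0$ for the equality, and using Cauchy--Schwarz together with $\|p-u\|_1 > 2\epsilon$ for the two inequalities,
\[
\sum_{i\in[n]} p_i^2 \;=\; \frac{1}{n} + \|p-u\|_2^2 \;\ge\; \frac{1}{n} + \frac{\|p-u\|_1^2}{n} \;>\; \frac{1+4\epsilon^2}{n} .
\]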
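As for the second item, the improvement can be traced through the exact variance of the collision count $C=\sum_{1\le i<j\le m}\mathbf{1}[X_i=X_j]$ among $m$ samples, where $\mu=\sum_i p_i^2$ denotes the collision probability: pairs of samples sharing one index have covariance $\|p\|_3^3-\mu^2$, disjoint pairs are independent, and so
\[
\mathrm{Var}[C] \;=\; \binom{m}{2}\bigl(\mu-\mu^2\bigr) \;+\; 6\binom{m}{3}\bigl(\|p\|_3^3-\mu^2\bigr) .
\]
A crude bound on the third-moment term $\|p\|_3^3$ already suffices when $\epsilon=\Omega(1)$; roughly speaking, the more careful analysis exploits the fact that in the relevant regime $\mu\le(1+O(\epsilon^2))/n$, which (via $\|p\|_3^3\le\|p\|_2^3=\mu^{3/2}$) forces this term, and hence the variance, to be much smaller.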