This article considers experimental design based on rerandomization to increase the efficiency of experiments. Two aspects of rerandomization are addressed. First, we propose a two-stage allocation sample scheme for randomization inference to the units in experiments that guarantees that the difference-in-means estimator is unbiased for the sample average treatment effect in any experiment, preserves the exactness of randomization inference, and halves the computation time of the rerandomization design. Second, we propose a rank-based covariate-balance measure that can take into account the estimated relative weight of each covariate. Several strategies for estimating these weights from pre-experimental data are proposed. Using Monte Carlo simulations, the proposed strategies are compared with complete randomization and Mahalanobis-based rerandomization. An empirical example is given in which the power of a mean-difference test of the electricity consumption of 54 households is increased by 99%, relative to complete randomization, using one of the proposed designs based on high-frequency longitudinal electricity-consumption data. Supplementary materials for this article are available online.
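The abstract does not specify the exact form of the rank-based balance measure or of the weight estimation. As a rough illustration only, the sketch below computes a weighted sum of absolute differences in mean covariate ranks between treated and control units; the function name, the particular rank statistic, and the weights are all assumptions, not the article's definitions.

```python
import numpy as np

def weighted_rank_imbalance(X, w, weights):
    """Hypothetical rank-based imbalance measure (illustrative only).

    X       : (n, k) covariate matrix
    w       : binary allocation vector (1 = treated, 0 = control)
    weights : length-k vector of relative covariate weights
    """
    # Rank-transform each covariate column (ordinal ranks 0 .. n-1)
    ranks = X.argsort(axis=0).argsort(axis=0)
    # Absolute difference in mean ranks between the two groups, per covariate
    diff = np.abs(ranks[w == 1].mean(axis=0) - ranks[w == 0].mean(axis=0))
    # Combine across covariates using the estimated relative weights
    return float(np.dot(weights, diff))
```

In a rerandomization design, such a statistic would play the role of the balance criterion: allocations whose measure exceeds a chosen threshold are discarded before randomizing among the rest.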
Recently, a computation-based experimental design strategy called rerandomization has been proposed as an alternative or complement to traditional blocked designs. The idea of rerandomization is to remove from consideration those allocations with large imbalances in observed covariates, according to a balance criterion, and then randomize within the set of acceptable allocations. Based on the Mahalanobis distance criterion for balancing the covariates, we show that asymptotic inference to the population from which the units in the sample are randomly drawn is possible using only the set of best, or 'optimal', allocations. Finally, we show that for the optimal and near-optimal designs, the rather complex asymptotic sampling distribution derived by Li et al. (2018) is well approximated by a normal distribution.
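The Mahalanobis-based rerandomization procedure described above can be sketched as follows: draw equal-probability allocations and accept the first one whose Mahalanobis distance between treated and control covariate means falls below a threshold. This is a minimal illustration of the general scheme, not the authors' implementation; the function names and the acceptance threshold are assumptions.

```python
import numpy as np

def mahalanobis_imbalance(X, w):
    """Mahalanobis distance between treated and control covariate means.

    X : (n, k) covariate matrix; w : binary allocation vector (1 = treated).
    """
    n1, n0 = w.sum(), (1 - w).sum()
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    # Covariance of the mean difference under equal-probability assignment
    cov = np.atleast_2d(np.cov(X, rowvar=False)) * (1 / n1 + 1 / n0)
    return float(diff @ np.linalg.solve(cov, diff))

def rerandomize(X, n_treated, threshold, rng=None, max_draws=100_000):
    """Redraw allocations until one satisfies the balance criterion."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    for _ in range(max_draws):
        w = np.zeros(n, dtype=int)
        w[rng.choice(n, size=n_treated, replace=False)] = 1
        if mahalanobis_imbalance(X, w) <= threshold:
            return w
    raise RuntimeError("no acceptable allocation found; raise the threshold")
```

Tightening the threshold toward zero shrinks the acceptable set toward the best-balanced allocations, which is the 'optimal' regime studied in the abstract above.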
Blocking is commonly used in randomized experiments to increase the efficiency of estimation. A generalization of blocking removes allocations with imbalance in covariate distributions between treated and control units, and then randomizes within the remaining set of balanced allocations. This idea of rerandomization was formalized by Morgan and Rubin (Annals of Statistics, 2012, 40, 1263–1282), who suggested using the Mahalanobis distance between treated and control covariate means as the criterion for removing unbalanced allocations. Kallus (Journal of the Royal Statistical Society, Series B: Statistical Methodology, 2018, 80, 85–112) proposed reducing the set of balanced allocations to the minimum. Here we discuss the implications of such an 'optimal' rerandomization design for inferences to the units in the sample and to the population from which those units were randomly drawn. We argue that, in general, it is a bad idea to seek the optimal design, because the resulting inference typically reflects only the uncertainty from the random sampling of units, which is usually hypothetical, and not the uncertainty from the randomization of units to treatment versus control.