Alwyn Young - Research

Stata Code

Randomization confidence intervals for OLS regression models: randcmdci.ado & randcmdci.sthlp.  Randomization-t confidence intervals and p-values that are asymptotically robust to deviations from the sharp null in favour of average treatment effects.

Randomization inference p-values (updated 5/2020): randcmd.ado & randcmd.sthlp (also available through ssc install in Stata).  Randomization-c and -t p-values for individual treatment effects and joint Wald and Westfall-Young multiple-testing tests of statistical significance for equations with multiple treatment effects and for an experiment as a whole.  The omnibus test (available in earlier versions) has been restored at the request of users.
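As a rough illustration of what randomization-c and randomization-t p-values measure, the sketch below permutes a binary treatment indicator and compares the observed coefficient (a difference in means) and its t-statistic with their permutation distributions.  It is not the randcmd implementation: the function names, the simulated data, the unequal-variance standard error and the (hits + 1)/(draws + 1) p-value convention are all assumptions made for this example.

```python
import random
import statistics


def coef_and_t(y, d):
    """Difference in means between treated and control, and its t-statistic."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    coef = statistics.mean(treated) - statistics.mean(control)
    # Unequal-variance standard error (an assumption of this sketch).
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return coef, coef / se


def randomization_pvalues(y, d, draws=2000, seed=0):
    """Two-sided randomization-c and randomization-t p-values under the
    sharp null of no treatment effect for any unit."""
    rng = random.Random(seed)
    obs_c, obs_t = (abs(v) for v in coef_and_t(y, d))
    shuffled = list(d)
    hits_c = hits_t = 0
    for _ in range(draws):
        rng.shuffle(shuffled)          # re-assign treatment at random
        c, t = coef_and_t(y, shuffled)
        hits_c += abs(c) >= obs_c
        hits_t += abs(t) >= obs_t
    # Count the observed assignment as one of the draws (a common convention).
    return (hits_c + 1) / (draws + 1), (hits_t + 1) / (draws + 1)


# Hypothetical data: 20 treated and 20 control units, true effect of 2.
data_rng = random.Random(1)
d = [1] * 20 + [0] * 20
y = [data_rng.gauss(0.0, 1.0) + 2.0 * di for di in d]
p_c, p_t = randomization_pvalues(y, d)
print(p_c, p_t)
```

The two p-values differ only in the statistic that is ranked across permutations; the options randcmd actually supports are documented in randcmd.sthlp.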

Effective degrees of freedom corrections (updated 10/2016): edfreg.ado & edfreg.sthlp.  The updated version runs faster and allows for absorbed fixed effects, weights and noconstant.

Working Papers

Nearly Collinear Regressors and the Replicability and Robustness of Published Results (comment on Oreopoulos 2006).  July 2021. 


Users of the Oreopoulos (2006) public use code and data have had trouble replicating the paper's estimated Mincerian return to schooling for the United Kingdom.  The UK regressor matrices contain ancillary variables which are very nearly collinear, magnifying computation errors and making the estimated Mincerian return substantively sensitive to factors as trivial as the processor used and the order of the data and variables. For example, by permuting the order of the variables, the Mincerian return can be made to vary between 1.2 and 3000 percent in a single specification.
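The mechanism can be sketched with a toy example that has nothing to do with the Oreopoulos data.  Two regressors that differ only by 1e-4 are used to fit an outcome whose true coefficients are (1, 1), and a single outcome value is then nudged by 1e-4.  All names and numbers here are hypothetical, and the nudge merely stands in for the rounding and ordering effects described in the abstract.

```python
def ols_two_regressors(x1, x2, y):
    """Solve the 2x2 normal equations (no constant) by Cramer's rule."""
    a = sum(u * u for u in x1)
    b = sum(u * v for u, v in zip(x1, x2))
    c = sum(v * v for v in x2)
    p = sum(u * w for u, w in zip(x1, y))
    q = sum(v * w for v, w in zip(x2, y))
    det = a * c - b * b                # close to zero when x1 and x2 are nearly collinear
    return (c * p - b * q) / det, (a * q - b * p) / det


eps = 1e-4
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [1.0, -1.0, 1.0, -1.0, 1.0]
x2 = [u + eps * s for u, s in zip(x1, z)]    # nearly a copy of x1
y = [u + v for u, v in zip(x1, x2)]          # true coefficients are (1, 1)

base = ols_two_regressors(x1, x2, y)

y_bumped = list(y)
y_bumped[0] += 1e-4                          # tiny change to one observation
bumped = ols_two_regressors(x1, x2, y_bumped)

shift = bumped[1] - base[1]                  # movement in the coefficient on x2
print(base, bumped, shift)
```

The coefficient on x2 moves by several orders of magnitude more than the size of the nudge, because the determinant of the normal equations is nearly zero.  With regressors that are closer still to collinear, floating-point rounding alone can play the role of the nudge.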

"Consistency of the OLS Bootstrap for Independently but Not-Identically Distributed Data: A Permutation Perspective."  May 2021.  On-line appendix.

Analyzing the distribution of the OLS bootstrap as that of a permutation statistic easily reveals moment conditions sufficient for consistency with full-sampling and independently but potentially not-identically distributed (inid) data which are less demanding than previous results for independently and identically distributed (iid) data.  With inid data, sub-sampling, which often solves failures of the bootstrap and in the case of iid data delivers consistency with minimal assumptions, may render the OLS bootstrap inconsistent in environments where full-sampling is otherwise consistent.  Intuitively, sub-sampling increases the tail influence of increasingly extreme but increasingly rare distributions, whose effects are otherwise averaged out in the full sample.
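For readers unfamiliar with the distinction, the sketch below contrasts a full-sample pairs bootstrap of an OLS slope with a sub-sampled version that draws only m < n observations per replication.  It does not reproduce any result from the paper.  The function names and the heteroskedastic simulated data are hypothetical, and treating sub-sampling as drawing m observations with replacement is an assumption of this example.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with a constant."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slopes(x, y, m=None, draws=500, seed=0):
    """Pairs bootstrap: resample (x, y) rows with replacement.
    m=None gives full-sampling (m = n); m < n gives sub-sampling."""
    rng = random.Random(seed)
    n = len(x)
    m = n if m is None else m
    slopes = []
    for _ in range(draws):
        idx = [rng.randrange(n) for _ in range(m)]
        slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    return slopes


# Hypothetical inid data: the error variance grows with x.
data_rng = random.Random(2)
x = [i / 10 for i in range(50)]
y = [1.0 + 2.0 * xi + data_rng.gauss(0.0, 0.5 + xi) for xi in x]

estimate = ols_slope(x, y)
full = bootstrap_slopes(x, y)            # m = n = 50
sub = bootstrap_slopes(x, y, m=25)       # m = n / 2
print(estimate, statistics.pstdev(full), statistics.pstdev(sub))
```

In practice the sub-sampled distribution is rescaled to account for the smaller m before it is used for inference; that step is omitted here.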

"Asymptotically Robust Randomization Confidence Intervals for Parametric OLS Regression."  September 2020.  On-line appendix.

Merging results concerning the asymptotic distribution of permutation statistics with White's (1980) proof of the asymptotic accuracy of heteroskedasticity robust inference, this paper shows that insofar as conventional inference asymptotically provides accurate coverage probabilities, randomization tests using sharp nulls that incorrectly specify an absence of heterogeneity do so as well, regardless of covariate interactions or the level of standard error clustering.  Intuitively, in all cases the distribution of the Wald statistic across permutations of treatment converges to that of a chi-squared variable.  When the conventional test statistic is similarly asymptotically distributed chi-squared, the order statistics of the permutation distribution average the linearly accurate coverage probabilities of the conventional test.  The assumptions used to prove the result, in addition to those of White (1980), are that treatment is iid, interactions are included separately as covariates, sufficiently high moments of treatment, interactions and errors exist, and the maximum size of interdependent observations is bounded.

"Misspecified Politics and the Recurrence of Populism."  (joint with Gilat Levy and Ronny Razin).  Forthcoming American Economic Review.  On-line appendix.

We develop a dynamic model of political competition between two groups that differ in their subjective model of the data generating process for a common outcome. One group has a simpler model than the other group as they ignore some relevant policy variables. We show that policy cycles must arise and that simple world views - which can be interpreted as populist world views - imply extreme policy choices. Periods in which those with a more complex model govern increase the specification error of the simpler world view, leading the latter to overestimate the positive impact of a few extreme policy actions.


“Leverage, Heteroskedasticity and Instrumental Variables in Practical Application.”  (formerly "Consistency without Inference") June 2021.  Appendix.

I use Monte Carlo simulations, the jackknife and multiple forms of the bootstrap to study a comprehensive sample of 1359 instrumental variables regressions in 31 papers published in the journals of the American Economic Association.  Monte Carlo simulations based upon published regressions show that non-iid error processes in highly leveraged regressions, both prominent features of published work, adversely affect the size and power of IV estimates, while increasing the bias of IV relative to OLS.  Weak instrument pre-tests based upon F-statistics are found to be largely uninformative of both size and bias.  In published papers, statistically significant IV results generally depend upon only one or two observations or clusters.  IV has little power: despite producing substantively different estimates, it rarely rejects the OLS point estimate or the null that OLS is unbiased.  The statistical significance of excluded instruments is also exaggerated.


“Improved, Nearly Exact, Statistical Inference with Robust and Clustered Covariance Matrices using Effective Degrees of Freedom Corrections.”  January 2016.

I propose bias and effective degrees of freedom corrections, based upon mimicking the first two moment properties of a chi-squared variable, to statistical inference using robust and clustered covariance matrices.  Simulation, using 1378 practical regression examples found in 44 experimental papers, shows that these corrections render the test statistics nearly exact in the face of ideal iid normal errors and provide large improvements in the accuracy of statistical inference in the presence of distinctly non-iid non-normal errors.

Published Papers by Topic

Applied Econometrics

"Channelling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results."  Quarterly Journal of Economics 134 (May 2019): 557-598.  Preprint.  Appendix.

Macroeconomic Implications of Worker Self Selection

"Structural Transformation, the Mismeasurement of Productivity Growth, and the Cost Disease of Services."  American Economic Review 104 (November 2014): 3635-67. 

    Files: Appendix.  US sectoral TFP growth.  Data and programmes.

"Inequality, the Urban-Rural Gap, and Migration." Quarterly Journal of Economics 128 (November 2013): 1727-1785. Preprint.  Appendix.  DHS based migration data.

The AIDS Epidemic and African Growth

"The African Growth Miracle." Journal of Political Economy 120 (August 2012): 696-739.  Data and programmes.

"In Sorrow to Bring Forth Children: Fertility amidst the Plague of HIV." Journal of Economic Growth 12 (December 2007): 283-327.

“The Gift of the Dying: The Tragedy of AIDS and the Welfare of Future African Generations.”  Quarterly Journal of Economics 120 (May 2005): 423-466.

Growth and Reform in the People's Republic of China

“Gold into Base Metals: Productivity Growth in the People’s Republic of China during the Reform Period.”  Journal of Political Economy 111 (December 2003): 1220-1261.

“The Razor’s Edge: Distortions and Incremental Reform in the People’s Republic of China.”  Quarterly Journal of Economics 115 (November 2000): 1091-1135.

East Asian Productivity Growth

“The Tyranny of Numbers: Confronting the Statistical Realities of the East Asian Growth Experience.”  Quarterly Journal of Economics 110 (August 1995): 641-680.

“Lessons from the East Asian NICs: A Contrarian View.”  European Economic Review 38 (1994): 964-973.

“A Tale of Two Cities: Factor Accumulation and Technical Change in Hong Kong and Singapore.” In NBER, Macroeconomics Annual 1992. Cambridge, MA: MIT Press, 1992. 

Models of Endogenous Growth

“Growth without Scale Effects.”  Journal of Political Economy 106 (February 1998): 41-63.

“Substitution and Complementarity in Endogenous Innovation.”  Quarterly Journal of Economics 108 (August 1993): 775-807.

“Invention and Bounded Learning by Doing.”  Journal of Political Economy 101 (June 1993): 443-472.

“Learning by Doing and the Dynamic Effects of International Trade.”  Quarterly Journal of Economics 106 (May 1991): 369-405.

Ballistic Missile Defense

“Ballistic Missile Defense:  Capabilities and Constraints.”  The Fletcher Forum 8 (Winter 1984).

          Reprinted in Department of Defense, Current News, Special Edition, No. 1142, 25 April 1984 and Air War College, Strategic Nuclear Force Posture, Vol. II, Chap. 5, December 1984.


Resting Papers

"The Gini Coefficient for a Mixture of Ln-Normal Populations."  December 2011.  (Data and programmes).

“Demographic Fluctuations, Generational Welfare and Intergenerational Transfers.” October 2001.

“Transport, Processing and Information: Value Added and the Circuitous Movement of Goods.” May 1999.