We introduce an asymptotically unbiased estimator for the full high-dimensional parameter vector in linear regression models where the number of variables exceeds the number of available observations. The estimator is accompanied by a closed-form expression for the covariance matrix of the estimates that is free of tuning parameters. This enables the construction of confidence intervals that are valid uniformly over the parameter vector. Estimates are obtained by using a scaled Moore-Penrose pseudoinverse as an approximate inverse of the singular empirical covariance matrix of the regressors. The approximation induces a bias, which is then corrected for using the lasso. Regularization of the pseudoinverse is shown to yield narrower confidence intervals under a suitable choice of the regularization parameter. The methods are illustrated in Monte Carlo experiments and in an empirical example where gross domestic product is explained by a large number of macroeconomic and financial indicators.
# 17-032/III (2017-03-14; 2017-07-05)
- Tom Boot, University of Groningen, The Netherlands; Didier Nibbering, Erasmus University Rotterdam, The Netherlands
- high-dimensional regression, confidence intervals, Moore-Penrose pseudoinverse, random projection, ridge regression