valid statistical inference when an experimental data set is supplemented with predictions from a machine-learning system

We present prediction-powered inference, a framework that provides an affirmative answer to the question of whether predictions can improve inferential quality.

Instead, our key idea is to use the gold-standard data set to quantify how the prediction errors affect the imputed estimate, and then construct a confidence set for θ* by adjusting for this effect.

rectifier

the imputation approach

the rectifier captures a notion of prediction error. In the general setting of convex estimation problems, the relevant notion of error is the bias of the subgradient gθ computed using the predictions
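As a worked special case, consider estimating a mean. This is a sketch under stated assumptions, not the general construction: it assumes the squared loss, so that the subgradient is g_θ(x, y) = θ − y, and it fixes one sign convention for the rectifier (written Δ here). The estimator symbols with superscript PP and the sample sizes n (gold-standard) and N (imputed) are notation chosen for this illustration.

```latex
\begin{align*}
  \ell_\theta(y) &= \tfrac{1}{2}(\theta - y)^2,
  \qquad g_\theta(x, y) = \theta - y, \\
  \Delta &= \mathbb{E}\bigl[g_\theta(X, Y) - g_\theta(X, f(X))\bigr]
          = \mathbb{E}\bigl[f(X) - Y\bigr], \\
  \hat{\theta}^{\mathrm{PP}}
         &= \frac{1}{N}\sum_{i=1}^{N} f(\tilde{X}_i)
          \;-\; \frac{1}{n}\sum_{i=1}^{n} \bigl(f(X_i) - Y_i\bigr).
\end{align*}
```

In this case the θ terms cancel, so the bias of the subgradient reduces to the average prediction error, and subtracting its gold-standard estimate from the imputed mean removes that bias.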

In each of the following applications, we compute the prediction-powered confidence interval for an estimand of interest and compare it to two alternatives: the classical interval, which uses only the gold-standard data (X, Y), and the imputed interval, which uses only the imputed data (X̃, f̃) by treating it as gold-standard data.
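The three-way comparison can be sketched for the simplest estimand, a mean. This is a minimal illustration on synthetic data, not the applications' actual code: the function names `simulate` and `mean_intervals`, the normal-approximation intervals, the sample sizes, and the prediction bias of 0.3 are all assumptions made for the example.

```python
import random
from statistics import NormalDist, fmean, variance


def simulate(seed, n=200, N=10_000, true_mean=2.0, bias=0.3):
    """Hypothetical data: outcomes Y and biased, noisy predictions f."""
    rng = random.Random(seed)

    def draw(size):
        y = [rng.gauss(true_mean, 1.0) for _ in range(size)]
        f = [yi + bias + rng.gauss(0.0, 0.2) for yi in y]
        return y, f

    y_lab, f_lab = draw(n)   # gold-standard data: outcomes and predictions
    _, f_unlab = draw(N)     # imputed data: predictions only
    return y_lab, f_lab, f_unlab


def mean_intervals(y_lab, f_lab, f_unlab, alpha=0.1):
    """Return (classical, imputed, prediction_powered) intervals for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n, N = len(y_lab), len(f_unlab)

    # Classical: gold-standard outcomes only.
    c_hat = fmean(y_lab)
    c_se = (variance(y_lab) / n) ** 0.5

    # Imputed: predictions treated as if they were outcomes.
    i_hat = fmean(f_unlab)
    i_se = (variance(f_unlab) / N) ** 0.5

    # Prediction-powered: imputed estimate minus the estimated
    # average prediction error measured on the gold-standard data.
    errors = [f - y for f, y in zip(f_lab, y_lab)]
    pp_hat = i_hat - fmean(errors)
    pp_se = (variance(f_unlab) / N + variance(errors) / n) ** 0.5

    return tuple(
        (est - z * se, est + z * se)
        for est, se in ((c_hat, c_se), (i_hat, i_se), (pp_hat, pp_se))
    )


if __name__ == "__main__":
    labels = ("classical", "imputed", "prediction-powered")
    for name, (lo, hi) in zip(labels, mean_intervals(*simulate(seed=0))):
        print(f"{name:>20}: [{lo:.3f}, {hi:.3f}]")
```

In this setup the imputed interval is the narrowest but is centered on the biased prediction mean, while the prediction-powered interval pays for the correction with a variance term from the n gold-standard points and is still narrower than the classical one because the prediction errors vary much less than the outcomes.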

Applications

