This discussion paper led to a publication in 'Biometrika', 2015, 102(2), 325-343.
We investigate the information-theoretic optimality properties of the score function of the predictive likelihood as a device to update parameters in observation-driven time-varying parameter models. The results provide a new theoretical justification for the class of generalized autoregressive score models, which covers the GARCH model as a special case. Our main contribution is to show that only parameter updates based on the score always reduce the local Kullback-Leibler divergence between the true conditional density and the model-implied conditional density. This result holds irrespective of the severity of model misspecification. We also show that the use of the score leads to a considerably smaller global Kullback-Leibler divergence in empirically relevant settings. We illustrate the theory with an application to time-varying volatility models. We show that the reduction in Kullback-Leibler divergence across a range of different settings can be substantial compared with updates based on, for example, squared lagged observations.
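
To make the statement that the GARCH model is a special case concrete, the following is a minimal sketch and not the paper's own code. It assumes a zero-mean Gaussian predictive density with time-varying variance f, a score scaled by the inverse Fisher information (one common scaling choice), and hypothetical parameter names and values omega, alpha, and beta. Under these assumptions the scaled score equals y^2 - f, so the score-driven recursion coincides with a GARCH(1,1) recursion with coefficients alpha and beta - alpha.

```python
import math
import random


def gaussian_score_update(f, y, omega, alpha, beta):
    """One score-driven update of a Gaussian variance f given observation y.

    Assumption: y | f ~ N(0, f), score scaled by the inverse Fisher information.
    """
    score = (y * y - f) / (2.0 * f * f)  # d/df of log N(y; 0, f)
    inv_fisher = 2.0 * f * f             # inverse Fisher information for f
    return omega + alpha * inv_fisher * score + beta * f


def garch_update(f, y, omega, a, b):
    """One GARCH(1,1) update driven by the squared lagged observation."""
    return omega + a * y * y + b * f


def demo(n=200, seed=0):
    """Run both recursions on the same simulated data and track their gap."""
    rng = random.Random(seed)
    omega, alpha, beta = 0.05, 0.10, 0.90  # illustrative values only
    f_gas = f_garch = 1.0
    max_gap = 0.0
    for _ in range(n):
        y = rng.gauss(0.0, math.sqrt(f_gas))
        f_gas = gaussian_score_update(f_gas, y, omega, alpha, beta)
        # Equivalent GARCH coefficients: a = alpha, b = beta - alpha.
        f_garch = garch_update(f_garch, y, omega, alpha, beta - alpha)
        max_gap = max(max_gap, abs(f_gas - f_garch))
    return f_gas, f_garch, max_gap


if __name__ == "__main__":
    f_gas, f_garch, max_gap = demo()
    print(f"final variance (score-driven): {f_gas:.6f}")
    print(f"final variance (GARCH):        {f_garch:.6f}")
    print(f"largest absolute difference:   {max_gap:.2e}")
```

The two variance paths agree up to floating-point rounding. For a non-Gaussian predictive density, such as a fat-tailed one, the score is no longer linear in y^2 and the two updates would differ.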