Weak consistency for Metropolis-Hastings algorithm in high-dimension

Abstract

The high-dimensional asymptotics of the random walk Metropolis-Hastings algorithm are well understood for a class of light-tailed target distributions. In this talk, we extend the analysis to (spherically symmetric) heavy-tailed target distributions. In particular, we obtain consistency of the random walk Metropolis-Hastings algorithm and a kind of optimal rate for consistency.
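
For readers unfamiliar with the algorithm, the following is a minimal sketch of a random walk Metropolis-Hastings sampler applied to a spherically symmetric heavy-tailed target. The multivariate t target, the function names, and the step size are illustrative assumptions; in particular, the step size is not the scaling studied in the talk.

```python
import math
import random


def log_t_density(x, nu=3.0):
    """Unnormalized log-density of a spherically symmetric multivariate t
    distribution with nu degrees of freedom (an assumed example target)."""
    d = len(x)
    r2 = sum(v * v for v in x)
    return -0.5 * (nu + d) * math.log1p(r2 / nu)


def random_walk_metropolis(log_target, x0, n_steps, step_size, seed=0):
    """Random walk Metropolis-Hastings with Gaussian increments.

    Returns the list of visited states and the empirical acceptance rate.
    """
    rng = random.Random(seed)
    x = list(x0)
    lp = log_target(x)
    accepted = 0
    chain = []
    for _ in range(n_steps):
        # Symmetric proposal: current state plus scaled Gaussian noise.
        proposal = [v + step_size * rng.gauss(0.0, 1.0) for v in x]
        lp_prop = log_target(proposal)
        # Accept with probability min(1, pi(proposal) / pi(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
            accepted += 1
        chain.append(list(x))
    return chain, accepted / n_steps


if __name__ == "__main__":
    d = 20
    chain, rate = random_walk_metropolis(log_t_density, [0.0] * d, 2000, 0.3)
    print("acceptance rate:", rate)
```

Because the proposal is symmetric, the acceptance ratio involves only the target density, which is why the unnormalized log-density is sufficient here.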

On construction of a volatility information criterion

Abstract

Stochastic differential equations are used for modeling various structures of volatility. Model selection is therefore a basic problem in real data analysis. In this talk, we present a spot volatility information criterion, sVIC, for volatility model selection.
Statistics become non-ergodic when observations are sampled discretely over a finite time horizon. As in classical mathematical statistics, inferential statistical methods give an approach to this question for dependent data. However, constructing the information criterion is not straightforward because of the non-ergodicity. The main tools are the quasi-likelihood analysis and the martingale expansion in the mixed normal limit.
This is joint work with Masayuki Uchida.

Semi-optimal on-line learning for restricted gradients

Abstract

It is known that a well-designed on-line algorithm asymptotically converges to the solution as fast as any other algorithm. In a simple case where infinitely many i.i.d. training samples are obtained and the training cost is the likelihood, the convergence speed reaches the Cramér-Rao bound with Newton-Raphson gradients or Gauss-Newton gradients and 1/t annealing. On the other hand, in some practical cases, e.g. the Elo rating system for evaluating the relative skill levels of players, only a few elements of the parameter are allowed to be updated at each training step, so optimally designed gradients cannot be applied.
In this talk, we consider the case where gradients are restricted to a specific subspace and discuss practically optimal designs of on-line algorithms in terms of convergence speed.
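
To make the restriction concrete, here is a minimal sketch of the standard Elo update, in which a single game changes only the two ratings of the players involved. The function name `elo_update` and the choice K = 32 are illustrative assumptions; the sketch does not implement the designs discussed in the talk.

```python
def elo_update(ratings, a, b, score_a, k=32.0):
    """Return a new rating list after one game between players a and b.

    score_a is 1.0 if a wins, 0.5 for a draw, and 0.0 if a loses.
    Only the entries at indices a and b differ from the input.
    """
    # Expected score of player a under the usual logistic Elo model.
    expected_a = 1.0 / (1.0 + 10.0 ** ((ratings[b] - ratings[a]) / 400.0))
    delta = k * (score_a - expected_a)
    updated = list(ratings)
    updated[a] += delta  # the winner's gain ...
    updated[b] -= delta  # ... equals the loser's loss
    return updated


if __name__ == "__main__":
    ratings = [1500.0, 1500.0, 1700.0]
    print(elo_update(ratings, 0, 1, 1.0))  # [1516.0, 1484.0, 1700.0]
```

The update direction is confined to the two coordinates of the players in the game, whatever the size of the rating vector, which is the kind of restricted update the abstract describes.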

Hybrid multi-step estimators for diffusion processes

Abstract

From a computational point of view, we treat efficient parametric estimation for diffusion processes based on sampled data. By using adaptive estimators and the Newton-Raphson method, hybrid multi-step estimators are obtained and their asymptotic properties are shown.

We construct a copula from the multivariate skew t-distribution of Azzalini and Capitanio (2003). This copula can capture asymmetric and extreme dependence between variables, and it is one of the few that is effective when the number of dimensions is high. However, two problems arise when estimating the parameters by maximum likelihood estimation. Here, we solve these problems and provide a concrete maximum likelihood estimation algorithm. We test our solution by simulating trivariate data with realistic parameters. The parameters are estimated from the daily returns of three stock indices: the SP500, DAX, and Nikkei225.

High-frequency lead-lag relationships in the Japanese stock market

Abstract

We are concerned with very short-term lead-lag relationships between market prices of identical stocks traded concurrently on multiple domestic trading venues. Using high-frequency limit-order book data for major Japanese stocks with millisecond time resolution, we empirically investigate whether such leads and lags exist among the venues and, if so, measure how large the lead/lag times are.
We adopt the lead-lag estimation framework proposed by Hoffmann et al. (2010, 2013), which utilizes the nonsynchronous covariance estimator of Hayashi and Yoshida (2005).
The datasets consist of prices of the TOPIX Core30 Index component stocks on the Tokyo Stock Exchange, the Japannext PTS, and the Chi-X Japan PTS. The period is the first 7.5 months of the year 2012.
After measuring the lead/lag times for each stock on each day, we pool them and conduct a longitudinal data analysis (or panel data analysis) to understand systematic patterns in the observed lead/lag times in terms of observable characteristics of the individual stocks. Empirical findings will be presented in the talk.
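
As background for the estimator named above, the following is a minimal sketch of the Hayashi and Yoshida (2005) nonsynchronous covariance estimator: it sums products of increments of the two series over all pairs of observation intervals that overlap. The function name and the quadratic-time double loop are illustrative choices, not the implementation used in the study.

```python
def hayashi_yoshida_cov(times_x, values_x, times_y, values_y):
    """Hayashi-Yoshida covariance estimate for two series observed at
    possibly different (increasing) timestamps.

    The product of increments dX_i * dY_j is included whenever the intervals
    (s_{i-1}, s_i] and (t_{j-1}, t_j] have a non-empty intersection.
    """
    total = 0.0
    for i in range(1, len(times_x)):
        dx = values_x[i] - values_x[i - 1]
        for j in range(1, len(times_y)):
            lower = max(times_x[i - 1], times_y[j - 1])
            upper = min(times_x[i], times_y[j])
            if lower < upper:  # the two half-open intervals overlap
                total += dx * (values_y[j] - values_y[j - 1])
    return total


if __name__ == "__main__":
    # X observed at times 0 and 2; Y observed at times 0, 1 and 2.
    print(hayashi_yoshida_cov([0, 2], [0.0, 1.0], [0, 1, 2], [0.0, 1.0, 3.0]))
```

When both series share the same timestamps, only intervals with the same index overlap and the estimate reduces to the usual realized covariance. A lead-lag analysis in the spirit of the framework above would evaluate such a quantity after shifting the timestamps of one series by a candidate lag; that step is not shown here.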

We establish a central limit theorem for a class of pre-averaging covariance estimators in a general endogenous time setting. In particular, we show that the time endogeneity has no impact on the asymptotic distribution in the first order. This contrasts with the case of the realized volatility in a pure diffusion setting.

Quasi-likelihood analysis for nonsynchronously observed diffusion processes

Abstract

We construct a quasi-maximum likelihood estimator and a Bayes type estimator for a statistical model of nonsynchronously observed diffusion processes. We show asymptotic mixed normality and asymptotic efficiency of these estimators.

Limit theorems for nearly unstable Hawkes processes

Abstract

(FMSP seminar)

Because of their tractability and their natural interpretations in terms of market quantities, Hawkes processes are nowadays widely used in high-frequency finance. However, in practice, statistical estimation results often suggest that only nearly unstable Hawkes processes are able to fit the data properly. By nearly unstable, we mean that the L1 norm of their kernel is close to unity. In this work, we study such processes, for which the stability condition is almost violated. Our main result states that, after suitable rescaling, they asymptotically behave like integrated Cox-Ingersoll-Ross models. Thus, modeling financial order flows as nearly unstable Hawkes processes may be a good way to reproduce both their high- and low-frequency stylized facts. We then extend this result to the Hawkes-based price model introduced by Bacry et al. We show that under a similar criticality condition, this process converges to a Heston model. Again, we recover well-known stylized facts of prices, both at the microstructure level and at the macroscopic scale.

Joint work with Thibault Jaisson (Ecole Polytechnique Paris).
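
To illustrate what a nearly unstable Hawkes process looks like, here is a minimal simulation sketch using Ogata's thinning method. The exponential kernel phi(t) = alpha * beta * exp(-beta * t), whose L1 norm equals alpha, and all parameter values are illustrative assumptions; the talk's results are not tied to this kernel.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon] of a Hawkes process with baseline
    intensity mu > 0 and kernel phi(t) = alpha * beta * exp(-beta * t).

    The L1 norm of the kernel is alpha, so alpha close to 1 corresponds to
    the nearly unstable regime.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at time t
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for the thinning step.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        if t + wait > horizon:
            break
        t += wait
        excitation *= math.exp(-beta * wait)
        # Accept the candidate with probability intensity(t) / upper.
        if rng.random() * upper <= mu + excitation:
            events.append(t)
            excitation += alpha * beta  # jump of the kernel at lag zero
    return events


if __name__ == "__main__":
    events = simulate_hawkes(mu=1.0, alpha=0.95, beta=2.0, horizon=100.0)
    print(len(events), "events")
```

For alpha < 1 the stationary mean intensity is mu / (1 - alpha), so the event count grows sharply as alpha approaches 1, which is the regime the abstract refers to.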

Remark on the large deviation inequality in mixed-rates asymptotics

Abstract

We will show how the polynomial type large deviation inequality for statistical random fields can carry over into the non-conventional "mixed-rates" framework, which in particular covers the sparse M-estimation paradigm. In particular, we clarify the rate of convergence of the probability in zero-parameter estimation, e.g. the rate of consistency in variable selection in regression. This talk is partly based on joint work with Yusuke Shimizu (Kyushu University).

Prediction of the approximation accuracy of the asymptotic expansion using random forest

Abstract

The asymptotic expansion is a useful method for approximating expectations related to various stochastic differential equation models. In general, the error between the true value and the approximate value cannot be represented explicitly. In this talk, we use random forests to classify model data by the size of the error and to correct each approximate value.

Risk bounds for convex and Bayesian tensor estimators

Abstract

Low rank tensor estimation is a useful statistical tool for analyzing multi-dimensional array data such as spatio-temporal data and purchase data. We develop two types of estimators for a low rank tensor: a convex regularized risk minimizer and a Bayes estimator. We show convergence rates for these methods and see how the rank of the tensor affects the rate.
This talk is partly based on joint work with Ryota Tomioka (Toyota Technological Institute at Chicago).