Estimating multiple geometric shapes, such as tracks or surfaces, creates significant mathematical challenges, particularly in the presence of unknown data association. Problems of this type have two major difficulties. The first is that the object of interest is typically infinite dimensional whilst the data are finite dimensional. As a result, the inverse problem is ill-posed without regularization. The second is that the data association makes the likelihood function highly oscillatory.

The focus of this thesis is on techniques for validating approaches to estimation problems in geometric statistical inference. We use convergence in the large data limit as an indicator of the robustness of the methodology. One particular advantage of our approach is that we can prove convergence under modest conditions on the data generating process. This allows the theory to be applied where very little is known about the data, which indicates robustness in applications to real-world problems.

The results of this thesis therefore concern the asymptotics of a selection of statistical inference problems. We construct our estimates as minimizers of an appropriate functional and study their behaviour in the large data limit. In each case we show that our estimates converge to a minimizer of a limiting functional. In certain cases we also give rates of convergence.

The emphasis is on problems which contain a data association or classification component. More precisely, we study a generalized version of the k-means method, which combines data association with spline smoothing and is suitable for estimating multiple trajectories from unlabeled data. Another problem considered is a graphical approach to estimating the labeling of data points. Our approach uses minimizers of the Ginzburg-Landau functional on a suitably defined graph.

In order to study these problems we use variational techniques and, in particular, Γ-convergence. This is the natural framework for studying sequences of minimization problems. A key advantage of this approach is that it allows us to deal with infinite dimensional and highly oscillatory functionals.
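
As a rough illustration of the idea of constructing an estimate as the minimizer of an empirical functional and watching its behaviour as the sample size grows, the sketch below uses the classical one-dimensional k-means energy, not the generalized, spline-regularized version studied in the thesis. The two-component Gaussian data model, the function names, and the use of Lloyd iterations (which only find a local minimizer) are all assumptions made for this example.

```python
import random

def kmeans_energy(data, centers):
    """Empirical k-means functional: mean squared distance to the nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in data) / len(data)

def lloyd(data, centers, steps=50):
    """Lloyd iterations: alternate data association and center updates."""
    centers = list(centers)
    for _ in range(steps):
        # Data association: assign each point to its nearest center.
        clusters = [[] for _ in centers]
        for x in data:
            j = min(range(len(centers)), key=lambda i: (x - centers[i]) ** 2)
            clusters[j].append(x)
        # Center update: mean of each non-empty cluster (empty clusters keep their center).
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

def sample(n, rng):
    """Hypothetical data: an equal mixture of N(-2, 0.5^2) and N(2, 0.5^2)."""
    return [rng.gauss(rng.choice([-2.0, 2.0]), 0.5) for _ in range(n)]

rng = random.Random(0)
for n in (100, 1000, 10000):
    data = sample(n, rng)
    centers = lloyd(data, [-1.0, 1.0])
    # As n grows, the fitted centers should settle near the mixture means.
    print(n, centers, kmeans_energy(data, centers))
```

Under these assumptions, the fitted centers should stabilize near -2 and 2 as n increases, which is the kind of large data behaviour that the convergence results make rigorous in far more general settings.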