
Samko, Oksana
Languages: English
Types: Doctoral thesis
Subjects: QA75
Building models of high-dimensional data in a low-dimensional space has become extremely popular in recent years. Motion tracking, facial animation, stock market tracking, digital libraries and many other models have been built and tuned to specific application domains. However, when the underlying structure of the original data is unknown, modelling such data remains an open question. The problem is of interest because capturing and storing large amounts of high-dimensional data has become trivial, yet the capability to process, interpret and use this data is limited. In this thesis, we introduce novel algorithms for modelling high-dimensional data of unknown structure, which allow us to represent the data accurately and compactly. This work presents a novel, fully automated dynamic hierarchical algorithm, together with a novel automatic data partitioning method, designed to work alongside existing specific models (talking head, human motion). Our algorithm is applicable to hierarchical data visualisation and classification, meaningful pattern extraction and recognition, and new data sequence generation. During this work we also investigated problems related to low-dimensional data representation: automatic optimal input parameter estimation, and robustness against noise and outliers. We demonstrate our modelling on many data domains (talking head, motion, audio, etc.) and believe it can be adapted to other domains as well.
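The abstract's central tool family, nonlinear dimensionality reduction with Isomap, can be illustrated with a short, hedged sketch. This is scikit-learn's Isomap on a synthetic swiss roll, not the thesis' own implementation; the neighbourhood size of 7 is an illustrative choice, echoing the parameter whose optimal value the thesis estimates automatically:

```python
# Hedged sketch: embed a noisy 3-D "swiss roll" into 2-D with Isomap.
# Not the thesis code; uses scikit-learn's implementation for illustration.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# 1000 noisy samples on a 3-D manifold with 2 intrinsic dimensions.
X, _ = make_swiss_roll(n_samples=1000, noise=0.04, random_state=0)

# n_neighbors plays the role of the neighbourhood parameter K; 7 is an
# illustrative value, not the thesis' estimated optimum for this data.
embedding = Isomap(n_neighbors=7, n_components=2)
X_low = embedding.fit_transform(X)

print(X.shape, "->", X_low.shape)  # (1000, 3) -> (1000, 2)
```

The thesis goes further than this off-the-shelf usage: it also treats inverse projection back to the original space and out-of-sample mapping of new points, which plain Isomap does not provide.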
  • The results below were extracted automatically by pilot algorithms.

    • 2 Literature Review
    • 2.1 Reducing Dimensionality of the Data Space
    • 2.1.1 Review of Dimensionality Reduction Techniques
    • 2.1.2 Applications of Nonlinear Dimensionality Reduction Techniques
    • 2.1.3 Isomap Algorithm Problems and Existing Solutions
    • 2.1.4 Partial Data Representation: NMF
    • 2.2 Data Intrinsic Dimensionality
    • 2.3 Subspace Clustering for High-Dimensional Data
    • 2.3.1 Flat Clustering Algorithms
    • 2.3.2 Hierarchical Clustering Algorithms
    • 2.3.3 Clustering Problems
    • 3 Nonlinear Dimensionality Reduction Using Isomap
    • 3.1 Introduction
    • 3.2 Isomap Algorithm
    • 3.3 Kernel Trick: New Data Sampling into Isomap Space
    • 3.4 Isomap Inverse Projection
    • 3.5 Optimal Neighbourhood Parameter Value for the Isomap Algorithm
    • 3.6 Experimental Results
    • 3.6.1 Sculpture Face Data Set
    • 3.6.2 Swissroll
    • 3.6.3 S-Curve
    • 3.6.4 Ear Database
    • 3.6.5 Classification Experiments: Olivetti Face Database
    • 3.6.6 Classification Experiments: Handwritten Digits, MNIST Data
    • 3.7 Summary
    • 4 Hierarchical Modelling of High Dimensional Data
    • 4.1 Introduction
    • 4.2 Hierarchical Clustering Algorithm Overview
    • 4.3 Selecting an Appropriate Number of Clusters
    • 4.4 Data Modelling as a Gaussian Mixture Model
    • 4.5 Hierarchical Agglomerative Clustering
    • 4.6 Application: Joint Data Estimation
    • 4.7 Experiments with Real World Data
    • 4.7.1 Talking Head Data Set
    • 4.7.2 Human Motion
    • 4.7.3 MIDI Data
    • 4.7.4 Handwritten Digits
    • 4.8 Summary
    • 5.1 Introduction
    • 5.2 HHMM: Definition and Representation as a DBN
    • 5.2.1 Hidden Markov Model
    • 5.2.2 Hierarchical Hidden Markov Models
    • 5.2.3 Analysis and Estimation of HHMM
    • 5.4.1 Definition of the CPDs
    • 5.4.2 DBN: Inference and Learning
    • 5.5.1 Dynamic Framework: Training Process
    • 5.5.2 Dynamic Framework: Testing Processes
    • 7 Automatic Part-Based Data Decomposition
    • 7.1 Introduction
    • 7.2 Data Preprocessing and Parameter Setting
    • 7.3 Constructing Modified Sparse NMF
    • 7.3.1 Sparse NMF Modification: Random Acol Initialisation
    • 7.3.2 Sparse NMF Modification: Earth Mover's Distance
    • 7.4 Data Postprocessing: Mask Construction
    • 7.5 Experimental Results
    • 7.5.1 Talking Head Data
    • 7.5.2 Facial Expression Data
    • 7.5.3 Motion Data
    • 7.6 Summary
    • 8.1 Conclusions
    • 8.2 Future Work
    • 8.2.1 Robust Isomap Algorithm Modification
    • 8.2.2 Dynamic Modelling of Multi-Source Data
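Chapter 4 in the outline above models the embedded data as a Gaussian mixture (4.4) and selects an appropriate number of clusters (4.3). A minimal sketch of that combination, assuming scikit-learn, with BIC as a stand-in selection criterion (the thesis describes its own cluster-count method, which this does not reproduce):

```python
# Hedged sketch: fit GMMs with varying component counts to synthetic
# low-dimensional data and pick the count by BIC. Not the thesis code;
# BIC stands in for the thesis' own cluster-count selection method.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Three well-separated blobs, standing in for data already embedded
# in a 2-D Isomap space.
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2))
               for c in ([0.0, 0.0], [4.0, 0.0], [2.0, 3.0])])

# Fit one mixture per candidate component count, keep the lowest BIC.
models = [GaussianMixture(n_components=k, random_state=0).fit(X)
          for k in range(1, 7)]
best = min(models, key=lambda m: m.bic(X))
labels = best.predict(X)
print(best.n_components)  # 3 for this well-separated synthetic data
```

Each fitted component then acts as one cluster in the hierarchy, which is the role the GMM plays in the thesis' hierarchical model.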
    • 3.1 A new data projection into Isomap space (1000 points, green) for the swissroll data, 1000 points (blue), 4% noise
    • 3.2 Cost function for the sculpture data set
    • 3.3 Two-dimensional Isomap embedding of the sculpture data set with optimal parameter K = 7
    • 3.4 Two-dimensional Isomap representation of the sculpture data set with optimal parameter K = 7
    • 3.5 Two-dimensional Isomap representation of the sculpture data set with suboptimal parameter K = 15
    • 3.6 Two-dimensional Isomap representation of the sculpture data set with suboptimal parameter ε = 1.2
    • 3.7 Swissroll data set
    • 3.8 Swissroll cost functions at 200, 1000 and 2000 points
    • 3.9 Residual variance, correlation coefficients and Spearman's ρ with height and angle (left to right) for the Swissroll data. In each plot these values are shown for ε = 4, 5, 6.2
    • 4.6 Two-dimensional Isomap mapping of the appearance parameters
    • 4.7 Two-dimensional Isomap mapping of the Mel-Cepstral coefficients
    • 4.8 Hierarchical model representation of the talking head data
    • 4.9 GMM in the Isomap space. Different colours correspond to different clusters
    • 4.10 Walking data projection into Isomap space (first two dimensions), visual representation
    • 4.11 Walking data projection into Isomap space (second and third dimensions), visual representation
    • 4.12 Hierarchical model representation of the human motion, 3D view
    • 4.13 Hierarchical model representation of the human motion, side view
    • 4.14 Dendrogram for the MIDI data
    • 4.15 Hierarchical model representation of MIDI data
    • 4.16 Hierarchical model representation of MNIST data (digits from 0 to 9 in the bottom line)
    • 4.17 Hierarchical data modelling algorithm
    • 5.1 A two-level HHMM with observations at the bottom. Black edges denote vertical and horizontal transitions between states and observations; dashed edges denote returns from the end state of each level to the level's parent state
    • 5.2 A dendrogram for the walking data. The red line indicates the cut-off level; four clusters are formed in this example
    • 5.3 A hierarchical representation for the motion data
    • 5.4 An HHMM state transition diagram for the motion data
    • 5.5 An HHMM represented as a DBN. Q_t^d is the state at time t, level d; F_t^d = 1 if the HMM at level d has finished (entered its exit state), otherwise F_t^d = 0. Shaded nodes are observed; the remaining nodes are hidden
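The dendrogram cut-off described in Figure 5.2 (cutting the tree at a fixed level so that four clusters form) can be sketched with SciPy's agglomerative clustering tools. This uses synthetic data rather than the thesis' walking data, and Ward linkage as one plausible merge criterion:

```python
# Hedged sketch: agglomerative (Ward) clustering plus a dendrogram cut.
# Synthetic stand-in data; cutting the tree into at most four flat
# clusters plays the role of the red cut-off line in Figure 5.2.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(1)
# Four well-separated groups, standing in for segments of a motion sequence.
X = np.vstack([rng.normal(loc=c, scale=0.2, size=(25, 2))
               for c in ([0.0, 0.0], [3.0, 0.0], [0.0, 3.0], [3.0, 3.0])])

Z = linkage(X, method="ward")                    # bottom-up merge tree
labels = fcluster(Z, t=4, criterion="maxclust")  # cut into <= 4 clusters
print(len(set(labels.tolist())))                 # 4
```

The flat clusters produced by the cut are what the thesis then models individually, one sub-model per cluster, inside the hierarchy.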
