Liverani, Silvia
Languages: English
Types: Doctoral thesis
Subjects: QA, QH426
This thesis is concerned with the study of a Bayesian clustering algorithm, proposed by Heard et al. (2006), used successfully for microarray experiments over time. It focuses not only on the development of new ways of setting hyperparameters so that inferences both reflect the scientific needs and contribute to the inferential stability of the search, but also on the design of new fast algorithms for the search over the partition space. First we use the explicit forms of the associated Bayes factors to demonstrate that such methods can be unstable under common settings of the associated hyperparameters. We then prove that the regions of instability can be removed by setting the hyperparameters in an unconventional way. Moreover, we demonstrate that MAP (maximum a posteriori) search is satisfied when a utility function is defined according to the scientific interest of the clusters. We then focus on the search over the partition space. In model-based clustering a comprehensive search for the highest scoring partition is usually impossible, due to the huge number of partitions of even a moderately sized dataset. We propose two methods for the partition search. One method encodes the clustering as a weighted MAX-SAT problem, while the other views clusterings as elements of the lattice of partitions. Finally, this thesis includes the full analysis of two microarray experiments for identifying circadian genes.
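
To give a sense of why an exhaustive search over partitions is infeasible, the short sketch below counts the partitions of a set of n items. That count is the Bell number B_n. The sketch is an illustration added here and is not taken from the thesis; the function name `bell_numbers` is hypothetical.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max], where B_n counts the partitions of n items.

    Uses the Bell triangle: each new row starts with the last entry of the
    previous row, and each further entry adds the entry to its left to the
    entry above that one. The first entry of row n is B_n.
    """
    row = [1]      # row 0 of the Bell triangle
    bells = [1]    # B_0 = 1 (the empty set has exactly one partition)
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


if __name__ == "__main__":
    counts = bell_numbers(20)
    for n in (5, 10, 20):
        print(f"n = {n:2d}: {counts[n]} partitions")
```

Even 20 items already admit more than 5 x 10^13 partitions, which is why heuristic or structured searches over the partition space are needed for datasets of realistic size.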

    • Chapter 5 Utility-based Clustering
        5.1 A Clustering for Time-course Data
        5.2 Utility over Partitions
            5.2.1 A Useful Class of Utilities
            5.2.2 Marginal Search
        5.3 Properties of the Product Utility
            5.3.1 Product Utilities and Local Moves
            5.3.2 Relationships between Product Utility and MAP
            5.3.3 Robustness of the Utility Weighted Score
            5.3.4 Some Practical Issues
        5.4 Examples
    • 5.1 Clusters obtained on 18 genes with direct AHC
    • 5.2 Clusters obtained on 18 genes with AHC on interesting clusters
    • 5.3 Reclassification of a known gene from a potentially not interesting cluster to a potentially circadian cluster
    • 7.1 The lattice of partitions