Quadrianto, Novi; Ghahramani, Zoubin (2015)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Languages: English
Types: Article
Subjects: QA273

Classified by OpenAIRE into

arxiv: Statistics::Computation
ACM Ref: ComputingMethodologies_PATTERNRECOGNITION
Random forests work by averaging the predictions of several de-correlated trees. We show a conceptually radical approach to generating a random forest: randomly sampling many trees from a prior distribution, and then forming a weighted ensemble of their predictive probabilities. Our approach uses priors that allow decision trees to be sampled even before looking at the data, and a power likelihood that explores the space spanned by combinations of decision trees. While each tree performs Bayesian inference to compute its predictions, our aggregation procedure uses the power likelihood rather than the likelihood and is therefore, strictly speaking, not Bayesian. Nonetheless, we refer to it as a Bayesian random forest, but one with built-in safety: it retains good predictive performance even if the underlying probabilistic model is wrong. We demonstrate empirically that our Safe-Bayesian random forest outperforms MCMC- or SMC-based Bayesian decision trees in terms of speed and accuracy, and achieves performance competitive with entropy- or Gini-optimised random forests, yet is very simple to construct.
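The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific below is an assumption made for illustration: features are taken to lie in [0, 1], the prior is a complete binary tree of fixed depth with uniformly random split features and thresholds, each leaf uses a symmetric Dirichlet prior with parameter `alpha`, and `eta` stands for the exponent of the power likelihood. The function names (`sample_tree`, `fit_forest`, `predict_proba`) are hypothetical.

```python
import math
import numpy as np

def sample_tree(rng, n_features, depth):
    # Assumed prior: complete binary tree, random feature and threshold per
    # internal node, drawn without looking at the data.
    n_internal = 2 ** depth - 1
    return rng.integers(0, n_features, n_internal), rng.uniform(0, 1, n_internal)

def leaf_index(tree, X, depth):
    # Route every row of X from the root (heap index 0) down to a leaf.
    features, thresholds = tree
    node = np.zeros(len(X), dtype=int)
    for _ in range(depth):
        go_right = X[np.arange(len(X)), features[node]] > thresholds[node]
        node = 2 * node + 1 + go_right
    return node - (2 ** depth - 1)

def log_marginal(counts, alpha):
    # Log Dirichlet-multinomial marginal likelihood, summed over leaves.
    k = counts.shape[1]
    total = 0.0
    for c in counts:
        total += math.lgamma(k * alpha) - math.lgamma(k * alpha + c.sum())
        total += sum(math.lgamma(alpha + n) - math.lgamma(alpha) for n in c)
    return total

def fit_forest(X, y, n_classes, n_trees=200, depth=3, alpha=1.0, eta=0.5, seed=0):
    rng = np.random.default_rng(seed)
    trees, leaf_probs, log_w = [], [], []
    for _ in range(n_trees):
        tree = sample_tree(rng, X.shape[1], depth)
        counts = np.zeros((2 ** depth, n_classes))
        np.add.at(counts, (leaf_index(tree, X, depth), y), 1.0)
        post = counts + alpha  # Dirichlet posterior parameters per leaf
        trees.append(tree)
        leaf_probs.append(post / post.sum(axis=1, keepdims=True))
        # Power likelihood: the tree's marginal likelihood raised to eta.
        log_w.append(eta * log_marginal(counts, alpha))
    log_w = np.array(log_w)
    weights = np.exp(log_w - log_w.max())  # subtract max for stability
    return trees, leaf_probs, weights / weights.sum(), depth

def predict_proba(forest, X):
    # Weighted ensemble of the trees' posterior predictive probabilities.
    trees, leaf_probs, weights, depth = forest
    return sum(w * p[leaf_index(t, X, depth)]
               for t, p, w in zip(trees, leaf_probs, weights))

# Toy usage on synthetic data (illustrative only).
rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 2))
y = (X[:, 0] > 0.5).astype(int)
forest = fit_forest(X, y, n_classes=2)
proba = predict_proba(forest, X)
```

Setting `eta=1` in this sketch weights trees by their ordinary marginal likelihood, which tends to concentrate almost all weight on a single tree; a smaller `eta` flattens the weights so that more trees contribute to the ensemble.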