Makris, Dimitrios; Ellis, Tim (2005)
Publisher: Institute of Electrical and Electronics Engineers
Languages: English
Types: Article
Subjects: computer
This paper considers the problem of automatically learning an activity-based semantic scene model from a stream of video data. A scene model is proposed that labels regions according to an identifiable activity in each region, such as entry/exit zones, junctions, paths, and stop zones. We present several unsupervised methods that learn these scene elements and present results that show the efficiency of our approach. Finally, we describe how the models can be used to support the interpretation of moving objects in a visual surveillance environment.
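One way to picture the kind of unsupervised learning described above is the entry/exit-zone case: the start and end points of observed trajectories cluster around the places where objects appear and disappear, and fitting a Gaussian mixture by EM recovers those zones. The sketch below is an illustration of that idea only, not the authors' implementation; the zone count, the synthetic "doorway" coordinates, and the use of scikit-learn are all assumptions for the example.

```python
# Hedged sketch: learning entry/exit zones from trajectory endpoints by
# fitting a 2-D Gaussian mixture with EM. Synthetic data stands in for
# real tracked trajectories; coordinates and zone count are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic trajectory endpoints: two "doorways" in image coordinates.
endpoints = np.vstack([
    rng.normal(loc=[50.0, 200.0], scale=5.0, size=(100, 2)),   # left zone
    rng.normal(loc=[300.0, 40.0], scale=5.0, size=(100, 2)),   # top zone
])

# EM fits one 2-D Gaussian per candidate entry/exit zone.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(endpoints)

# Each mixture mean is the centre of a learned zone; the covariance
# describes its spatial extent.
for mean, cov in zip(gmm.means_, gmm.covariances_):
    print("zone centre:", mean.round(1),
          "extent (std):", np.sqrt(np.diag(cov)).round(1))
```

In a real surveillance setting the same fit would run over the endpoints produced by a tracker, and the number of components would be chosen by a model-selection criterion rather than fixed in advance.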

