Lopes, Manuel; Montesano, Luis (2014)
Languages: English
Types: Preprint
Subjects: Computer Science - Artificial Intelligence
In this survey we present different approaches that allow an intelligent agent to autonomously explore its environment to gather information and learn multiple tasks. Different communities have proposed different solutions that are, in many cases, similar and/or complementary. These solutions include active learning, exploration/exploitation, online learning, and social learning. The common aspect of all these approaches is that it is the agent that selects and decides what information to gather next. Applications of these approaches already include tutoring systems, autonomous grasp learning, navigation and mapping, and human-robot interaction. We discuss how these approaches are related, explaining their similarities and differences in terms of problem assumptions and metrics of success. We consider that such an integrated discussion will improve interdisciplinary research and applications.