SEMINARS IN COMPUTER SCIENCE AND TELECOMMUNICATION ENGINEERING 2006-2007


Doctorate in Computer Science and Telecommunication Engineering

Official Postgraduate Program in Computer Science and Telecommunication Engineering
Escuela Politécnica Superior, Universidad Autónoma de Madrid


Wednesday, 14 February 2007, 15:00

Salón de Grados, Escuela Politécnica Superior, Universidad Autónoma de Madrid


Efficient methods for control of agents in a dynamical environment.

Prof. dr. H.J. Kappen   

Foundation for Neural Networks (SNN)

Department of Medical Physics and Biophysics

University of Nijmegen (The Netherlands)


     

Abstract

One of the important challenges in robotics is to design control systems that allow robots to act properly in changing and unforeseen environments, for instance robots in the home. Existing approaches to this problem are most often rule-based, which has the disadvantage that all possible scenarios need to be anticipated. In this talk I propose the use of optimal control theory for robot action planning. In general, solving an optimal control problem is computationally too demanding to be practical. However, I will introduce a class of non-linear stochastic control problems that can be solved efficiently using a path integral. In this control formalism, the central concept of the cost-to-go, or value function, becomes a free energy, and methods and concepts from statistical physics, such as Monte Carlo sampling or the Laplace approximation, can be readily applied. When applied to a receding horizon problem in a stationary environment, the solution resembles the one obtained by traditional reinforcement learning with discounted reward. It is shown that this solution can be computed more efficiently than in the discounted reward framework. The main advantage of the path integral control method is that it can be applied to time-dependent tasks, and it is therefore of great relevance for modeling real-time interactions between agents. We propose to use opponent modeling to predict the near-future behaviour of the environment, and show its feasibility in a multi-agent setting.
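
As an illustration of how the cost-to-go can be estimated as a free energy by Monte Carlo sampling, the following is a minimal sketch and not the speaker's implementation. It assumes a one-dimensional toy problem that does not appear in the abstract: dynamics dx = u dt + noise, a quadratic end cost, and a quadratic control cost. The function name and all parameter names are hypothetical. Uncontrolled noisy trajectories are sampled, each is weighted by exp(-cost / lambda), and the weighted averages give the cost-to-go and the control.

```python
import math
import random


def path_integral_control(x0, target, horizon=1.0, n_steps=5, nu=1.0,
                          control_cost=1.0, end_cost_weight=1.0,
                          n_samples=50000, seed=0):
    """Estimate control and cost-to-go at (x0, t=0) by sampling paths.

    Assumed toy model (illustrative only):
      dynamics:  dx = u dt + dxi, with Var(dxi) = nu * dt
      cost:      end_cost_weight/2 * (x_T - target)^2
                 + control_cost/2 * integral of u^2 dt
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    lam = nu * control_cost      # "temperature" linking noise and control cost
    sigma = math.sqrt(nu * dt)   # standard deviation of one noise increment

    weight_sum = 0.0
    weighted_first_step = 0.0
    for _ in range(n_samples):
        # Simulate one uncontrolled (u = 0) noisy trajectory.
        first_step = rng.gauss(0.0, sigma)
        x = x0 + first_step
        for _ in range(n_steps - 1):
            x += rng.gauss(0.0, sigma)
        # State cost of this path; in this toy model only the end cost is nonzero.
        path_cost = 0.5 * end_cost_weight * (x - target) ** 2
        w = math.exp(-path_cost / lam)
        weight_sum += w
        weighted_first_step += w * first_step

    psi = weight_sum / n_samples               # sampled partition sum
    cost_to_go = -lam * math.log(psi)          # free energy
    u = weighted_first_step / weight_sum / dt  # weighted mean of the first noise step
    return u, cost_to_go


if __name__ == "__main__":
    u, cost_to_go = path_integral_control(x0=0.0, target=1.0)
    print("estimated control:", u)
    print("estimated cost-to-go:", cost_to_go)
```

For this linear-quadratic toy case the control has the closed form (target - x0) / (horizon + control_cost / end_cost_weight), which is 0.5 with the defaults above, so the sampled estimate can be checked against it. The sampling approach itself does not rely on linearity, which is why it is of interest for the non-linear problems described in the abstract.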

PDF presentation

Bert Kappen

Bert Kappen studied particle physics in Groningen, the Netherlands, and completed his PhD in this field in 1987 at the Rockefeller University in New York. From 1987 until 1989 he worked as a scientist at the Philips Research Laboratories in Eindhoven, the Netherlands. Since 1989 he has been conducting research on neural networks at the laboratory for biophysics of the University of Nijmegen, the Netherlands. He has been an associate professor at this university since 1997 and a full professor since 2004. His group consists of 10 people and is involved in research on machine learning (stochastic processes, learning algorithms, probabilistic reasoning and several applications in collaboration with industry) and computational neuroscience. In 1997 his research was awarded the prestigious national PIONIER research subsidy. In 1998 he co-founded the company Smart Research, which sells prediction software based on neural networks. He has developed a medical diagnostic expert system called Promedas, which assists doctors in making accurate diagnoses of patients. Promedas is currently being commercialized through a new spin-off company. He is director of the Dutch Foundation for Neural Networks (SNN), which coordinates research on neural networks in the Netherlands. He organizes annual national conferences on machine learning and artificial intelligence. He is the author of approximately 120 publications.