Using Dec-POMDP algorithms to solve multi-agent decision problems in robot soccer

Aşık, Okan.

dc.contributor	Graduate Program in Computer Engineering.
dc.contributor.advisor	Akın, H. Levent.
dc.contributor.author	Aşık, Okan.
dc.date.accessioned	2023-03-16T10:01:10Z
dc.date.available	2023-03-16T10:01:10Z
dc.date.issued	2012.
dc.identifier.other	CMPE 2012 A75
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/12224
dc.description.abstract	The Decentralized Partially Observable Markov Decision Process (Dec-POMDP) is a recent mathematical framework for modeling multi-agent coordination and decision making, but its real-world applications remain limited. Robot soccer is a good testbed for investigating the potential of Dec-POMDP algorithms. In this work, we use the Dec-POMDP algorithm developed by Eker and Akın [1], a policy search algorithm that explores the policy space with a genetic algorithm and uses a simulator to estimate the fitness of each chromosome. Two policy representations are considered. The finite state controller representation is used for discrete Dec-POMDP models; we extend Eker and Akın’s algorithm with a neural network representation for continuous Dec-POMDP problems. The experiments are carried out in the RoboCup 2D robot soccer simulator and the TeamBots simulator. We show that the algorithm is capable of solving complex problems such as robot soccer. Among the fitness functions we experimented with, the game score performed best. We also compare the Dec-POMDP algorithm with reinforcement learning. The Dec-POMDP algorithm with the finite state controller representation outperforms the reinforcement learning method. In the Keepaway problem, the Dec-POMDP algorithm with the neural network representation outperforms a hand-coded benchmark policy and is comparable to the reinforcement learning method.
dc.format.extent	30 cm.
dc.publisher	Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2012.
dc.relation	Includes appendices.
dc.subject.lcsh	Markov processes.
dc.subject.lcsh	Decision making -- Mathematical models.
dc.title	Using Dec-POMDP algorithms to solve multi-agent decision problems in robot soccer
dc.format.pages	xiii, 50 leaves ;