Abstract:
The Decentralized Partially Observable Markov Decision Process (Dec-POMDP) is a recent mathematical framework used to model multi-agent coordination and decision making. However, its real-life applications have remained limited. Robot soccer is a good testbed for investigating the potential of Dec-POMDP algorithms. In this work, we use the Dec-POMDP algorithm developed by Eker and Akın [1], a policy search method that explores the policy space with a genetic algorithm; the genetic algorithm uses a simulator to estimate the fitness of chromosomes. Two policy representations are employed: the finite state controller representation for discrete Dec-POMDP models, and, as our extension of Eker and Akın’s algorithm, a neural network representation for continuous Dec-POMDP problems. The experiments are carried out in the RoboCup 2D soccer simulator and the TeamBots simulator. We show that the algorithm is capable of solving complex problems such as robot soccer. Among the fitness functions we experimented with, the game score performed best. We also compare the Dec-POMDP algorithm against reinforcement learning, and find that the Dec-POMDP algorithm with the finite state controller representation outperforms the reinforcement learning method. We further show that, on the Keepaway problem, the Dec-POMDP algorithm with the neural network representation outperforms a hand-coded benchmark policy and is comparable to the reinforcement learning method.
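To make the search procedure concrete, the sketch below illustrates genetic-algorithm policy search over finite state controller (FSC) policies with simulator-based fitness estimation, as summarized above. It is a minimal sketch on an assumed toy discrete Dec-POMDP: the problem sizes, the chromosome encoding, the GA operators, and the `simulate_episode` stub (which a real implementation would replace with the RoboCup 2D or TeamBots simulator) are all hypothetical illustrations, not the thesis implementation.

```python
# Hypothetical sketch: GA policy search over finite state controllers for a Dec-POMDP.
import copy
import random

N_AGENTS, N_NODES, N_OBS, N_ACTIONS = 2, 4, 3, 3  # assumed toy problem sizes

def random_chromosome():
    """One FSC per agent: an action per node, a next node per (node, observation)."""
    return [{
        "action": [random.randrange(N_ACTIONS) for _ in range(N_NODES)],
        "next": [[random.randrange(N_NODES) for _ in range(N_OBS)]
                 for _ in range(N_NODES)],
    } for _ in range(N_AGENTS)]

def simulate_episode(chromosome, steps=50):
    """Stand-in for the fitness simulator; returns a score for the joint policy.
    Dummy observations and reward stand in for the real environment dynamics."""
    nodes = [0] * N_AGENTS
    score = 0.0
    for _ in range(steps):
        actions = [fsc["action"][n] for fsc, n in zip(chromosome, nodes)]
        obs = [random.randrange(N_OBS) for _ in range(N_AGENTS)]
        score += 1.0 if len(set(actions)) == N_AGENTS else 0.0  # reward diverse actions
        nodes = [fsc["next"][n][o] for fsc, n, o in zip(chromosome, nodes, obs)]
    return score

def fitness(chromosome, episodes=5):
    """Estimate fitness by averaging the score over several simulated episodes."""
    return sum(simulate_episode(chromosome) for _ in range(episodes)) / episodes

def mutate(chromosome, rate=0.05):
    """Randomly perturb action and transition entries of each agent's FSC."""
    for fsc in chromosome:
        for i in range(N_NODES):
            if random.random() < rate:
                fsc["action"][i] = random.randrange(N_ACTIONS)
            for j in range(N_OBS):
                if random.random() < rate:
                    fsc["next"][i][j] = random.randrange(N_NODES)

def evolve(pop_size=20, generations=30):
    """Truncation-selection GA: keep the best half, refill with mutated copies."""
    population = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        survivors = scored[: pop_size // 2]
        children = [copy.deepcopy(random.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        for child in children:
            mutate(child)
        population = survivors + children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("best fitness:", fitness(best))
```

For the continuous case described above, the same loop applies with the FSC chromosome replaced by a vector of neural network weights and the mutation operator replaced by Gaussian perturbation of those weights.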