Robot skill acquisition via representation sharing and reward conditioning

dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Uğur, Emre.
dc.contributor.author Akbulut, Mete Tuluhan.
dc.date.accessioned 2023-03-16T10:05:29Z
dc.date.available 2023-03-16T10:05:29Z
dc.date.issued 2021.
dc.identifier.other CMPE 2021 A43
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/12457
dc.description.abstract Skill acquisition is a hallmark of intelligent behavior that Robot Learning aims to give to robots. An effective approach is to teach an initial version of the skill through demonstration, a form of Supervised Learning (SL) called Learning from Demonstrations (LfD), and then let the robot improve it and adapt to novel tasks via Reinforcement Learning (RL). In this thesis, we first propose a novel LfD+RL framework, Adaptive Conditional Neural Movement Primitives (ACNMP), which utilizes LfD and RL simultaneously during adaptation and makes demonstrations and RL-guided trajectories share the same latent representation space. We show through simulation experiments that (i) ACNMP successfully adapts the skill using an order of magnitude fewer trajectory samples than baselines; (ii) its simultaneous training method preserves the demonstration characteristics; and (iii) ACNMP enables skill transfer between robots with different morphologies. Our real-world experiments verify the suitability of ACNMP for real-world applications where non-linearity and the number of dimensions increase. Next, we extend the idea of using SL in reward-based skill learning tasks and propose our second framework, Reward Conditioned Neural Movement Primitives (RC-NMP), in which learning is done using only SL. RC-NMP takes rewards as input and generates trajectories conditioned on desired rewards. The model uses variational inference to create a stochastic latent representation space from which varying trajectories are sampled to form a trajectory population. Finally, the diversity of the population is increased using crossover and mutation operations from Evolutionary Strategies to handle environments with sparse rewards, multiple solutions, or local minima. Our simulation and real-world experiments show that RC-NMP is more stable and efficient than ACNMP and two other robotic RL algorithms.
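
Note: the following is a minimal, illustrative Python sketch of the reward-conditioning idea described in the abstract, not the thesis implementation. A toy linear decoder stands in for RC-NMP's neural network trained with variational inference; it maps a desired reward and a latent sample to a trajectory, a population is drawn from the stochastic latent space, and Evolution Strategies-style crossover and mutation increase the population's diversity. All names (generate_trajectory, W_reward, W_latent, etc.) are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    T, LATENT_DIM = 20, 4                        # trajectory length and latent dimensionality
    W_reward = rng.normal(size=(T, 1))           # toy decoder weights (stand-in for a trained network)
    W_latent = rng.normal(size=(T, LATENT_DIM))

    def generate_trajectory(desired_reward, z):
        # Decode a 1-D trajectory conditioned on the desired reward and a latent sample.
        return (W_reward * desired_reward + W_latent @ z.reshape(-1, 1)).ravel()

    def sample_population(desired_reward, n=16):
        # Draw a trajectory population from the stochastic latent space.
        return [generate_trajectory(desired_reward, rng.standard_normal(LATENT_DIM)) for _ in range(n)]

    def crossover(a, b):
        # Single-point crossover of two trajectories.
        cut = rng.integers(1, T)
        return np.concatenate([a[:cut], b[cut:]])

    def mutate(traj, sigma=0.05):
        # Gaussian mutation to further diversify the population.
        return traj + rng.normal(scale=sigma, size=traj.shape)

    population = sample_population(desired_reward=1.0)
    child = mutate(crossover(population[0], population[1]))
    print(child.shape)   # (20,)

In this sketch, conditioning on higher desired rewards simply scales the decoder output; in the actual framework the mapping from reward to trajectory is learned from experience, and the population-based operators serve to handle sparse rewards, multiple solutions, and local minima.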
dc.format.extent 30 cm.
dc.publisher Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2021.
dc.subject.lcsh Machine learning.
dc.subject.lcsh Ability.
dc.subject.lcsh Robotics -- Programming.
dc.title Robot skill acquisition via representation sharing and reward conditioning
dc.format.pages xiv, 54 leaves ;

