Posterior Goal Sampling for Hierarchical Reinforcement Learning
Original GitHub repository: Deep Reinforcement Learning Algorithms with PyTorch.
This work was done as a class project for CS 330: Deep Multi-Task and Meta Learning. It replicates Hierarchical-DQN (h-DQN), introduced in "Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation" (Kulkarni et al., 2016), on the Long Corridor Game environment.
We build on the h-DQN algorithm and provide a modified version that uses a Hierarchical-Ensemble DQN for more efficient high-level policy learning. The code is in agents/hierarchical_agents/bh_agents.py.
We also added a new function, pick_actions2(), to agents/DQN_agents/DQN.py that allows goal sampling across all Q-head functions.
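As a rough illustration of sampling a goal across an ensemble of Q heads, the sketch below picks one head uniformly at random and then chooses the goal with the highest value under that head, in the style of bootstrapped-ensemble (Thompson-like) sampling. This is one plausible reading of the idea, not the code in pick_actions2(); the function name sample_goal_from_heads and the nested-list layout of the Q-values are assumptions made for this example only.

```python
import random
from typing import Optional, Sequence, Tuple


def sample_goal_from_heads(
    q_values_per_head: Sequence[Sequence[float]],
    rng: Optional[random.Random] = None,
) -> Tuple[int, int]:
    """Pick one Q head uniformly at random, then act greedily on it.

    q_values_per_head[k][g] is head k's Q-value estimate for goal g.
    This layout is hypothetical and not taken from the repository.
    Returns (head_index, goal_index).
    """
    if not q_values_per_head:
        raise ValueError("need at least one Q head")
    rng = rng or random.Random()
    # Sampling a head stands in for drawing one hypothesis from the posterior.
    head = rng.randrange(len(q_values_per_head))
    q_values = q_values_per_head[head]
    # Greedy goal under the sampled head (first index wins on ties).
    goal = max(range(len(q_values)), key=lambda g: q_values[g])
    return head, goal


if __name__ == "__main__":
    # Two heads that disagree about the best of three goals.
    q = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.1]]
    print(sample_goal_from_heads(q, random.Random(0)))
```

Because the heads can disagree, repeated calls select different goals in proportion to how many heads favor them, which gives the high-level policy a form of exploration without a separate epsilon schedule.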
To run the Long Corridor experiment:
python results/Long_Corridor.py
The repository’s high-level structure is:
├── agents
│   ├── DQN_agents
│   ├── actor_critic_agents
│   ├── hierarchical_agents
│   └── policy_gradient_agents
├── environments
├── exploration_strategies
├── results
│   └── data_and_graphs
├── tests
├── utilities
│   └── data structures