Open-World Reinforcement Learning over Long Short-Term Imagination

1 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University
2 Ningbo Institute of Digital Twin, Eastern Institute of Technology
3 School of Computer Science and Technology, East China Normal University

*Denotes equal contribution     Indicates corresponding author

The general framework of LS-Imagine, an MBRL agent that operates solely on raw pixels. The fundamental idea is to extend the imagination horizon within a limited number of state transition steps, enabling the agent to explore behaviors that potentially lead to promising long-term feedback.

Abstract

Training visual reinforcement learning agents in a high-dimensional open world presents significant challenges. While various model-based methods have improved sample efficiency by learning interactive world models, these agents tend to be “short-sighted”, as they are typically trained on short snippets of imagined experiences. We argue that the primary challenge in open-world decision-making is improving the efficiency of exploration across an extensive state space. In this paper, we present LS-Imagine, which extends the imagination horizon within a limited number of state transition steps, enabling the agent to explore behaviors that potentially lead to promising long-term feedback. The foundation of our approach is to build a long short-term world model. To achieve this, we simulate goal-conditioned jumpy state transitions and compute corresponding affordance maps by zooming in on specific areas within single images. This facilitates the integration of direct long-term values into behavior learning. Our method demonstrates significant improvements over state-of-the-art techniques in MineDojo.
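
To make the idea of computing an affordance map by zooming in on areas of a single image more concrete, here is a minimal sketch. It is not the LS-Imagine implementation: the function names `affordance_map` and `toy_score`, the window and stride sizes, and the brightness-based scoring are all assumptions for illustration. In practice the scorer would be some learned model that rates how relevant a zoomed-in crop is to the task goal.

```python
def affordance_map(image, score_fn, window=4, stride=2):
    """Slide a zoom window over a 2D image (list of rows), score each crop,
    and average the scores per pixel. Illustrative sketch only."""
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for top in range(0, h - window + 1, stride):
        for left in range(0, w - window + 1, stride):
            # "Zoom in" on one region of the single input image.
            crop = [row[left:left + window] for row in image[top:top + window]]
            score = float(score_fn(crop))
            for r in range(top, top + window):
                for c in range(left, left + window):
                    total[r][c] += score
                    count[r][c] += 1
    # Average overlapping windows; pixels no window covers stay at zero.
    mean = [[total[r][c] / count[r][c] if count[r][c] else 0.0
             for c in range(w)] for r in range(h)]
    lo = min(min(row) for row in mean)
    hi = max(max(row) for row in mean)
    if hi == lo:
        return [[0.0] * w for _ in range(h)]
    # Rescale to [0, 1] so the map can be read as relative goal relevance.
    return [[(v - lo) / (hi - lo) for v in row] for row in mean]


def toy_score(crop):
    # Hypothetical stand-in for a goal-relevance model: mean brightness.
    flat = [v for row in crop for v in row]
    return sum(flat) / len(flat)


# Toy 8x8 image whose bottom-right quadrant contains the "goal".
image = [[1.0 if r >= 4 and c >= 4 else 0.0 for c in range(8)] for r in range(8)]
amap = affordance_map(image, toy_score)
```

On this toy input the map is highest in the bottom-right corner and lowest in the top-left, which is the kind of signal an agent could use to decide where a long-distance jump toward the goal is worthwhile.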

Evaluation

Showcases of LS-Imagine in MineDojo.

Harvest log in plains
Harvest water with bucket
Harvest sand
Shear sheep
Mine iron ore

Performance comparison of LS-Imagine against existing approaches in MineDojo.

BibTeX

@article{li2024open,
    title={Open-World Reinforcement Learning over Long Short-Term Imagination},
    author={Jiajian Li and Qi Wang and Yunbo Wang and Xin Jin and Yang Li and Wenjun Zeng and Xiaokang Yang},
    journal={arXiv preprint arXiv:2410.03618},
    year={2024}
}

Acknowledgements

This website is adapted from the Nerfies template.