1. Sedlmeier, A., Müller, R., Illium, S., and Linnhoff-Popien, C. 2020. Policy entropy for out-of-distribution classification. Artificial Neural Networks and Machine Learning–ICANN 2020: 29th International Conference on Artificial Neural Networks, Bratislava, Slovakia, September 15–18, 2020, Proceedings, Part II 29, Springer International Publishing, 420–431.

PEOC Performance

This work proposes PEOC, a policy entropy-based classifier for detecting previously unencountered states in deep reinforcement learning. PEOC uses the entropy of the agent's policy as its classification score to identify out-of-distribution states, a capability that is crucial for safety in real-world applications. Evaluated against advanced one-class classifiers in procedurally generated environments, PEOC performs competitively. The work also presents a structured benchmarking process for out-of-distribution classification in reinforcement learning, which can be used to evaluate the reliability and effectiveness of such classifiers. [Sedlmeier et al. 2020]
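To make the idea of using policy entropy as a classification score concrete, the following is a minimal sketch and not the authors' implementation. It assumes a discrete action space, that higher entropy indicates an out-of-distribution state, and a fixed decision threshold; the function names, the example probabilities, and the threshold value of 0.7 are all hypothetical choices made for illustration.

```python
import math


def policy_entropy(action_probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    # Zero-probability actions contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in action_probs if p > 0.0)


def is_out_of_distribution(action_probs, threshold):
    """Flag a state as out-of-distribution when the entropy exceeds the threshold.

    Assumption: the policy is confident (low entropy) on states similar to
    those seen in training and uncertain (high entropy) on unfamiliar ones.
    """
    return policy_entropy(action_probs) > threshold


# Hypothetical policy outputs for two states with four available actions.
confident = [0.97, 0.01, 0.01, 0.01]  # one action clearly preferred
uncertain = [0.25, 0.25, 0.25, 0.25]  # uniform, i.e. maximum entropy

THRESHOLD = 0.7  # hypothetical; in practice it would be tuned on held-out data

for name, probs in [("confident", confident), ("uncertain", uncertain)]:
    score = policy_entropy(probs)
    print(f"{name}: entropy={score:.3f}, OOD={is_out_of_distribution(probs, THRESHOLD)}")
```

In this sketch the score is a single scalar per state, so the classifier can be evaluated like any other threshold-based detector by sweeping the threshold over a range of values.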

PEOC Pipeline