Optimal Decision Tree Policies for Markov Decision Processes
On 30 Jan, 2023 By admin 0 Comments
In practical applications, we can rarely assume full observability of a system's environment, despite such knowledge being important for determining a reactive control system's precise interaction with its environment. Therefore, we propose an approach for reinforcement learning (RL) in partially observable environments. While assuming that the environment behaves like a partially observable Markov decision process with known discrete actions, we assume no knowledge about its structure or transition probabilities.
Jonathan Rubin, Ohad Shamir, Naftali Tishby