Add to Favourites
To login click here

Researchers have introduced Temporal Reward Decomposition (TRD) to enhance explainability in reinforcement learning by predicting the next N expected rewards, revealing when and what rewards are anticipated. This approach allows for better interpretation of an agent’s decisions and can be integrated into existing RL models with minimal performance impact.