Artificial Intelligence (AI) is used in a growing range of real-life applications, such as autonomous cars, drones, chatbots, and smart speakers. Reinforcement Learning (RL) is a framework in which an agent learns to master a task through rewards and punishments. However, AI systems must also respect moral constraints, such as behaving fairly towards other agents, not damaging their environment, and not leaking user data. This thesis proposes three algorithms (UCRL-V, BUCRL and TSUCRL) to address these challenges, which are shown to be near-optimal. Additionally, a novel objective connected to fairness and justice is proposed for the stateless version of reinforcement learning (the multi-armed bandit setting), and an algorithm (UCRG) is developed to solve it. Finally, algorithms are designed to preserve differential privacy in multi-armed bandit scenarios.
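
As a rough illustration of the stateless (multi-armed bandit) setting and of the optimism-in-the-face-of-uncertainty principle that upper-confidence-bound methods build on, the sketch below implements the classic UCB1 strategy. It is not UCRL-V, BUCRL, TSUCRL, or UCRG from the thesis; the function names and the toy Bernoulli arms are illustrative assumptions.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Standard UCB1: try each arm once, then repeatedly pick the arm with
    the highest empirical mean plus an exploration bonus (optimism)."""
    counts = [0] * n_arms        # how often each arm has been pulled
    sums = [0.0] * n_arms        # accumulated reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1          # initialisation round: pull every arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums

# Toy usage: three Bernoulli arms with unknown means (hypothetical values).
means = [0.2, 0.5, 0.8]
counts, _ = ucb1(lambda a: float(random.random() < means[a]), len(means), 10_000)
print(counts)  # the best arm (index 2) should receive most of the pulls
```

The exploration bonus shrinks as an arm is pulled more often, so the strategy converges on the best arm while still sampling uncertain ones; this is the same style of confidence-bound reasoning that regret analyses of optimistic bandit and RL algorithms rely on.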