Machine Learning Lab PhD Student Forum Series (Session 10): Distributional Perspective on Reinforcement Learning
Speaker: Hao Jin (PKU)
Time: 2021-07-21 15:10-16:10
Venue: Large Conference Room, 1st Floor, Jingyuan Courtyard 6 & Tencent Meeting 498 2865 6467
Abstract: Distributional reinforcement learning was proposed to characterize the intrinsic uncertainty of an MDP during learning. Specifically, Q(s, a) is treated as a random variable rather than a scalar function. The traditional Bellman update of the scalar Q(s, a) must therefore be extended to an update of the distribution of Q(s, a). This difference poses challenges for both algorithm design and the theoretical analysis of performance. Algorithms motivated by this distributional perspective are known as distributional reinforcement learning (DRL) algorithms. In this talk, we first introduce several well-known DRL algorithms, from C51 to FQF, which differ in how they update the random variable Q(s, a). We then focus on the theoretical analysis of these algorithms: comparisons between DRL and traditional RL methods, the relation to risk-sensitive algorithms, and the proposal of a unified framework for all DRL algorithms. Finally, we introduce several works motivated by the distributional perspective in other settings, from algorithm design in policy learning to multi-agent reinforcement learning.
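For readers new to the topic, the contrast between the scalar and distributional Bellman updates mentioned in the abstract can be sketched as follows. This is a minimal illustration rather than the speaker's own formulation. The symbols are assumptions that do not appear in the abstract: a fixed policy \pi, a discount factor \gamma, a transition kernel P, a random reward R(s, a), and Z for the random return (the abstract writes the random variable as Q(s, a)).

```latex
% Scalar Bellman equation for a fixed policy \pi: a statement about expected returns.
Q^{\pi}(s,a) = \mathbb{E}\big[R(s,a)\big]
  + \gamma\, \mathbb{E}_{S' \sim P(\cdot \mid s,a),\; A' \sim \pi(\cdot \mid S')}\big[Q^{\pi}(S',A')\big]

% Distributional counterpart: an equality in distribution between random returns.
% The scalar value is recovered as the mean of the return distribution.
Z^{\pi}(s,a) \overset{D}{=} R(s,a) + \gamma\, Z^{\pi}(S',A'),
\qquad Q^{\pi}(s,a) = \mathbb{E}\big[Z^{\pi}(s,a)\big]
```

Under this reading, algorithms such as C51 and FQF differ mainly in how they parameterize and update the distribution of Z.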