Citation
Voloshin, Cameron (2024) Guaranteed Policy Performance in Reinforcement Learning. Dissertation (Ph.D.), California Institute of Technology. doi:10.7907/n2fg-e554. https://resolver.caltech.edu/CaltechTHESIS:06062024-061120491
Abstract
Decision-making is ubiquitous in everyday life. Increasingly, researchers are seeking answers on how to optimally solve sequential decision-making tasks. Thanks to the recent availability of computation, advances in deep learning, and the release of open-source code, it has become easy to train a computational agent to make decisions in many domains. Nevertheless, in realistic scenarios where the consequences of failure are high, running a trained computational agent in the wild poses substantial risk.
The goal of this thesis is to develop and advance techniques that guarantee a learned agent does what we expect it to do. The thesis tackles two central questions:
1) Given an agent, how can we predict if it will perform desirably?
2) Can we structure the learning process to guarantee desirable post-learning performance?
On the former question, this thesis proposes multiple algorithms to evaluate such agents, identifies the factors that most influence the success of agent evaluation, and open-sources benchmarks for further development in the space.
On the latter question, this thesis formulates desirable agent behavior as a constrained optimization problem, where the type of constraint depends on the structure afforded to the practitioner. Constraining the search space during learning ensures that post-learning behaviors will, by definition, perform as desired.
| Item Type: | Thesis (Dissertation (Ph.D.)) |
|---|---|
| Subject Keywords: | Reinforcement Learning, Policy Learning, Off policy Evaluation |
| Degree Grantor: | California Institute of Technology |
| Division: | Engineering and Applied Science |
| Major Option: | Computing and Mathematical Sciences |
| Thesis Availability: | Public (worldwide access) |
| Research Advisor(s): | |
| Thesis Committee: | |
| Defense Date: | 13 June 2023 |
| Non-Caltech Author Email: | clvoloshin (AT) gmail.com |
| Record Number: | CaltechTHESIS:06062024-061120491 |
| Persistent URL: | https://resolver.caltech.edu/CaltechTHESIS:06062024-061120491 |
| DOI: | 10.7907/n2fg-e554 |
| Related URLs: | |
| ORCID: | |
| Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 16508 |
| Collection: | CaltechTHESIS |
| Deposited By: | Cameron Voloshin |
| Deposited On: | 06 Jun 2024 23:04 |
| Last Modified: | 14 Jun 2024 21:18 |
Thesis Files
PDF - Final Version (18MB). See Usage Policy.