Job summary: Post-train policies via behaviour cloning and RL; own the full loop from data to deployment. Partner with the Data Collection team to drive new data collection: specify what good data looks like, identify failure modes, ensure ...
13 days ago