Job summary: Post-train policies via behaviour cloning and RL; own the full loop from data to deployment. Partner with the Data Collection team to drive new data collection: specify what good data looks like, identify failure modes, ensure ...
14 days ago