Description:
Job summary: Post-train policies via behaviour cloning and RL; own the full loop from data to deployment. Partner with the Data Collection team to drive new data collection: specify what good data looks like, identify failure modes, and ensure diversity and coverage. Work closely with external partners to ensure a steady supply of high-quality, pretraining-scale data. Run pre-, mid-, and post-training on the VLA stack; explore new modalities and architecture changes. Build and maintain continuous pipelines:
Feb 6, 2026;
from:
dice.com