Members of Technical Staff, Physical AI (Robotics / World Models)
Orbifold
Software Engineering, IT, Data Science
Palo Alto, CA, USA
About Orbifold AI
Orbifold AI advances the frontier of physical AI and world models through rigorous evaluation and curated, real-world data. We work directly with leading robotics and world model research teams on the field's hardest problem: systematically surfacing where today's foundation models break, and producing the curated multimodal data that closes the gap.
Our work sits at the intersection of data, evaluation, and model training. We design evaluation harnesses that expose a model's true failure modes; each failure becomes a structured data deficit; and our curation pipeline produces the targeted, high-quality, fully verified data that fills it — collected against a specific failure mode, sampled to balance the long tail, annotated to a co-defined taxonomy, and verified before it reaches a training or reinforcement learning run. Each cycle compounds: sharper evaluations expose finer failures, finer failures drive more precise curation, and more precise curation narrows the distance between demo and deployment.
We collaborate with partners end to end, co-designing datasets, evaluation frameworks, and training and RL pipelines that shape how their models learn from real-world signals. The data standard and curation framework we're building will define the next frontier of robotics and world model training.
About the Role
We are hiring Members of Technical Staff to build the data and evaluation foundations for world models and embodied AI systems. Today's frontier models look impressive on cherry-picked demos but break in production: they fail on long-tail edge cases, hallucinate, lose temporal coherence, mishandle contact and causality, generalize poorly out of distribution, and produce silent failures that automated metrics don't catch. Closing that gap requires evaluation infrastructure that can systematically surface, categorize, and diagnose failures—and feed those signals directly back into data and training.
In this role, you will work closely with internal teams and external research partners to design, build, and iterate on data pipelines, workflows, and evaluation frameworks that drive model quality. You'll define what "good" means for a given partner, build the harnesses that measure it, and translate failures into the next round of data curation and training. This is a highly applied role focused on real-world system performance, not purely theoretical research.
Key Responsibilities
- Co-design data pipelines and model training workflows end to end with robotics and world model teams
- Define how multimodal data (video, image, audio, sensor, text) should be structured and indexed for training and evaluation
- Build scalable systems for data ingestion, cleaning, annotation, and taxonomy-driven curation and balanced sampling across the failure modes that matter for partner deployments
- Work on training or fine-tuning models for perception, policy learning, or multimodal reasoning, including automated critics and judges that approximate human evaluation at scale
- Iterate on datasets and training setups based on downstream model performance, closing the loop between evaluation findings and the next round of data curation
- Build comprehensive evaluation frameworks covering fine-grained failure taxonomies, edge-case discovery, long-tail probing, distribution-shift and robustness testing, and behavior under adversarial or out-of-spec inputs
- Bridge gaps between raw data, dataset design, and model behavior, and translate evaluation findings into concrete data and training recommendations for partner teams
Preferred Qualifications
- Self-driven and high-agency, with experience working in fast-paced applied research or startup environments
- Experience in robotics, embodied AI, world models, spatiotemporal reasoning, multimodal reasoning, and/or reinforcement learning
- Strong understanding of model training (e.g., VLA systems, world models, multimodal reasoning, video generation models) and RL workflows at scale
- Experience working with real-world data pipelines, including collection, preprocessing, and curation
- Experience designing evaluation frameworks, including failure mode analysis, benchmark construction, rubric and metric design, and automated critics that correlate with downstream model behavior
- Ability to reason about how data quality, mixture, and structure impact model performance
- Hands-on experience building or iterating on applied ML systems at scale
- Comfortable operating across both research and engineering, and building production-grade systems
Nice to Have
- Experience with large-scale video or multimodal datasets
- Experience in simulation to real transfer or real-world robotics data collection
- Familiarity with dataset annotation strategies or evaluation frameworks
- Experience training or evaluating Multimodal Large Language Models (MLLMs) as critics, judges, or reward models
Why join Orbifold AI?
- Work directly with leading teams building real-world AI systems, the same partners shipping the next generation of world models and embodied agents
- Stay on the cutting edge of RL, physical AI, and world model research
- Tackle one of the most important and underexplored problems in the field: turning evaluation from a vanity metric into a real driver of model improvement
- Operate across both data and model training, not just one side — owning the loop from raw data to evaluation to the next training run
- High ownership, fast iteration, and real impact on deployed systems
Apply Now: Send your resume, a short introduction, and any relevant work (papers, projects, repos) to careers@orbifold.ai. We'd love to hear from you!