Simulation Systems Engineer

  • Zürich, Switzerland
  • Full-Time
  • On-Site

Job Description:

Delta Labs uses AI to simulate and predict consumer behaviour at scale. We build Elaiia, a simulation engine that generates AI Twins — intelligent synthetic agents that mirror real consumer populations. Our clients use Elaiia to simulate customer decisions before committing to them: pricing strategies, product launches, campaign messaging, channel allocation. We replace surveys, focus groups, and intuition with simulation-based evidence.

We’re a small, focused team and we intend to stay that way. We give people ownership, trust, and the autonomy to do their best work. We work with urgency and expect new team members to match our pace. Elaiia is live, paying customers use it daily, and the problems ahead are about scaling what works — not figuring out if it works.


What you’ll do

•       Own and evolve the retrieval and generation pipeline that powers AI Twin behaviour — from selecting and weighting variables out of enriched data sources to producing behaviourally realistic Twin responses

•       Optimise our RAG infrastructure: improve how individual- and aggregate-level data is retrieved, matched, and injected into Twin generation

•       Design and extend LLM-based architectures for distinct cognitive processes within a Twin (e.g. reasoning, preference formation, response generation)

•       Build evaluation frameworks to systematically benchmark LLMs across pipeline stages — accuracy, consistency, cost, latency

•       Validate Twin response quality: define consistency metrics, detect drift, and close the loop between simulated and real-world behaviour

•       Develop infrastructure to scale Twin samples while preserving the statistical properties of real populations — distributions, correlations, and segment structure

•       Fine-tune and integrate LLMs where they improve fidelity, and know when they don’t


What you need

•       You’ve built and maintained Python services that handle real traffic and real failure modes — not notebooks, systems

•       Hands-on experience with RAG pipelines in production: retrieval strategies, embedding models, chunking, ranking

•       Experience with LLMs in production: context engineering (retrieval, prompt design, context assembly), fine-tuning, structured evaluation, cost/quality tradeoffs

•       Strong intuition for statistics: distributions, correlations, sampling methods. You’ll need to reason about whether synthetic populations actually reflect real ones

•       You’ve designed or contributed to evaluation/benchmarking systems for ML or LLM outputs

•       Comfortable with SQL and working with structured data from diverse sources — survey responses, behavioural logs, demographic profiles

•       You think carefully about concurrency, queuing, and failure modes

•       You use AI tools as a core part of your development workflow and consider AI-assisted engineering the standard, not optional



Nice to have

•       Background in agent-based modelling, computational social science, or behavioural science

•       Experience with synthetic data generation or population modelling at scale

•       Familiarity with psychometrics or survey methodology

•       Familiarity with Rust or other performance-critical languages