Senior Software Engineer - AI

2 days ago

Lattice

CA · Full-time · C$160,000 – C$180,000

About this role

Lattice’s AI Engineering team is building the systems that power how AI works across the company. The team has laid the foundations with traces flowing and evals running, and is now focused on defining how AI products are measured, improved, and trusted at scale.

In this high-ownership role, you will shape evaluation methodology, agent architecture, and the core systems that determine how AI performs in production. The work centers on building maintainable, performant systems using modern technologies while collaborating closely with product and design.

Day-to-day responsibilities include designing end-to-end evaluation frameworks, architecting reusable agent infrastructure with LangGraph, and scaling RAG pipelines. You will instrument meaningful metrics, maintain automated scoring pipelines, and contribute to production AI systems with emphasis on reliability and observability.

The position offers the chance to own projects end-to-end, inform technical direction on agent quality strategy, and raise the AI engineering bar through code review and technical debate across the broader team.

Requirements

5+ years of professional software engineering experience with significant time spent on production AI/ML systems.
Deep hands-on experience with LLM-based systems, including prompt engineering, RAG pipelines, agent orchestration, evaluation metrics, and model fine-tuning.
Proven ability to build and operate agentic AI systems in production, including multi-step workflows and multi-agent topologies.
Strong command of AI evaluation with experience building eval frameworks and distinguishing good evals from vanity metrics.
Production-grade Python engineering skills with clean, maintainable, testable code.
Experience with LangGraph or comparable agent orchestration frameworks and real agent workflows in production.
Experience with LangSmith or comparable LLM observability tooling for tracing, evaluation, and debugging.
Proven ability to work with data and apply statistics, especially when running experiments.

Responsibilities

Design and ship a robust, end-to-end AI evaluation framework covering offline evals, production tracing, and human-in-the-loop feedback loops.
Define and instrument metrics for agent task completion, hallucination rates, response quality, user engagement, and downstream business outcomes.
Build and maintain evaluation datasets, test harnesses, and automated scoring pipelines to catch regressions before they ship.
Architect and implement reusable agent infrastructure including multi-turn workflows, LLM DAGs, and standardized patterns using LangGraph.
Build and scale RAG pipelines and retrieval infrastructure, including vector store management and retrieval quality optimization.
Make principled build vs. buy decisions across LLM providers, agent frameworks, and evaluation tooling.
Own projects end-to-end: scope them, drive them to completion, and bring in the right people at the right time.
Partner with engineering leads to inform technical direction on agent quality and evaluation strategy.