Open position
Applied AI Engineer
About Us
We build software that makes governance, risk, and compliance (GRC) approachable: turning dense regulatory frameworks and internal policy into something organizations can work with. AI is central to how we do that: surfacing the right information, making sense of complex documents, and keeping outputs trustworthy in a domain where accuracy matters.
We're a small team, early in the journey. The core architecture is in place and proving itself, but there's plenty left to shape. You'll have real ownership over both what you build and how it evolves. We're based in Stockholm and work hybrid.
Role Overview
You'll own the AI layer end-to-end: how LLMs are prompted, orchestrated, and evaluated, and how documents are ingested, understood, and retrieved. There's real influence over where it goes next.
We're looking for a strong mid-level to senior engineer who has built LLM-powered systems that real users depend on (not just prototypes) and who wants meaningful ownership from early on. You'll work closely with the backend team. You don't need a background in GRC, but you need to be genuinely curious about the problem.
What You'll Work On
- Owning LLM prompt design, orchestration, and evaluation across our Azure AI Foundry deployments (Claude and GPT family): questioning what exists, improving what matters, proposing what's next
- Driving retrieval quality across the full RAG pipeline: chunking, embedding, indexing (Milvus / HNSW), and re-ranking, with eval sets that prove a change is an improvement, not just a variation
- Building the eval and safety layer: hallucination detection, citation faithfulness, contradiction detection, and regression suites in CI — in a regulated domain, evaluation is a first-class engineering concern
- Extending and hardening the Celery-based document processing pipeline (extraction → graph): improving reliability, observability, and cost efficiency
- Shipping to production via AKS and Terraform, instrumenting what you build, and staying accountable after launch
- Challenging the architecture and ways of working when you see a better approach, and making the case
What We're Looking For
- 5+ years of software engineering experience, with at least 2 years building LLM-powered systems in production
- Strong Python; comfort in a typed, tested codebase (we use mypy --strict, ruff, and pytest)
- Hands-on experience with RAG: vector databases (Milvus, pgvector, Pinecone, or similar), embedding model selection, chunking strategies, re-ranking
- A practical approach to evals: you can talk concretely about how you've measured retrieval quality or output reliability on a real system
- Experience with prompt engineering for structured output and multi-step pipelines, including handling failures and partial results
- Understanding of where LLMs fail and how to defend against those failure modes: prompt injection, hallucination, citation drift, model version changes
- Product instinct: you connect technical decisions to user outcomes
Helpful But Not Required
- Experience in a regulated domain (GRC, finance, legal, healthcare) where audit trails and source attribution matter
- Celery, Redis, or similar distributed task-queue experience
- Apache AGE or other graph database experience
- Microsoft Teams bot or Adaptive Card development
How to Apply
Send your resume to careers@nooga.net, ideally with a short cover letter describing your relevant experience and what excites you about this role.