When a company decides to deploy AI, the first instinct is almost always the same: hire a data scientist. That instinct is understandable, since the data scientist is the iconic role of the AI era. In practice, however, we often see this hire turn out to be the first mistake, one that slows or stalls the project.
The right team composition depends on what you are specifically building: a RAG system over company documents is different from a predictive anomaly model on a production line, which is different from an autonomous agent in customer service. Each of these projects requires a different mix of people — and some of them do not need an in-house team at all.
Why a data scientist is not the first answer
A data scientist traditionally focuses on exploratory data analysis, statistical modelling, and experimentation. They are valuable when you do not know what you are looking for in the data — in predictive analytics, when uncovering patterns in historical data, or when running A/B tests.
Most AI projects that companies actually need, however, are not about discovering patterns. They are system integrations: an LLM connected to company documentation, an agent that validates orders in an ERP, automated email triage. Here you need someone who can build production software — not exploratory analysis.
In practice it tends to go like this: a company hires a data scientist, who runs neat experiments in a Jupyter notebook, and the results look promising. Then comes the question "when do we deploy this to production?", and silence follows. A data scientist is not necessarily a software engineer; many have little hands-on experience with Docker, CI/CD, API design, or production system monitoring. The project gets stuck in the prototype phase.
The roles you actually need
ML / AI engineer — the core of the team
This is the role most companies underestimate, yet it is critical for production AI. An ML/AI engineer combines two competencies: they understand models (fine-tuning, embeddings, inference, prompt engineering) and they know how to package those capabilities into robust software (APIs, queue systems, monitoring, testing).
In practice: an ML engineer sets up a RAG pipeline with hybrid search, selects the right embedding model, tunes retrieval, and exposes the result as a production API with full observability. A data scientist would handle the first step; a software engineer without ML context would handle the last. An ML/AI engineer handles both.
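To make "hybrid search" concrete, here is a minimal sketch of one common way to merge a keyword ranking and a vector ranking: reciprocal rank fusion. The text above does not say which fusion method to use, so treat this as one illustrative option. The function name, the document IDs, and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in.
    k is an assumed smoothing constant; 60 is a frequently used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the result is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword index and from a vector index.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the behaviour hybrid search is after. Choosing and tuning a scheme like this, then serving it behind a monitored API, is the kind of work that spans both halves of the ML/AI engineer role.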
For companies building on frontier models via API (Claude, GPT, Gemini), the ML/AI engineer role becomes even more important — here the focus shifts away from training and toward orchestration, prompt engineering, tool calling, and integrations. For more on what drives cost and reliability, see AI agent costs in production.
Domain expert — without one, the project goes nowhere
This is the role companies most commonly leave out, and its absence is the second most frequent reason for failure (after poor data quality).
A domain expert is someone who deeply understands the process you are automating. In practice this means an experienced operator, department head, technician, or subject-matter specialist. This person may know nothing about LLMs, but they know precisely:
- Which answers are correct and which merely look correct
- Where the edge cases, exceptions, and failure modes are
- What a "good result" looks like from a business perspective, not from an accuracy metric
Without a domain expert, the ML engineer optimises a metric, not real value. The result is a system that looks great in a demo and repeats the same mistakes over and over in production.
A domain expert does not need to be a full-time team member. Four to eight hours per week for output review, evaluation calibration, and feedback cycles is sufficient.
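A weekly review session of this kind can be as simple as comparing what an automated check says with what the expert says. The sketch below is a hypothetical illustration, not a prescribed process: the `ReviewedOutput` structure, the field names, and the sample cases are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ReviewedOutput:
    case_id: str
    metric_pass: bool  # the automated check accepted the output
    expert_pass: bool  # the domain expert judged the output correct


def pass_rates(reviews):
    """Return the share of outputs accepted by the metric and by the expert."""
    if not reviews:
        raise ValueError("no reviewed outputs")
    n = len(reviews)
    return {
        "metric": sum(r.metric_pass for r in reviews) / n,
        "expert": sum(r.expert_pass for r in reviews) / n,
    }


def looks_correct_but_is_wrong(reviews):
    """List the cases the metric accepted but the expert rejected."""
    return [r.case_id for r in reviews if r.metric_pass and not r.expert_pass]


# Hypothetical sample from one review session.
reviews = [
    ReviewedOutput("order-001", metric_pass=True, expert_pass=True),
    ReviewedOutput("order-002", metric_pass=True, expert_pass=False),
    ReviewedOutput("order-003", metric_pass=False, expert_pass=False),
    ReviewedOutput("order-004", metric_pass=True, expert_pass=True),
]

rates = pass_rates(reviews)
disputed = looks_correct_but_is_wrong(reviews)
print(rates, disputed)
```

The gap between the two pass rates, and the list of disputed cases, give the ML/AI engineer something concrete to calibrate the evaluation against.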
Data engineer — only when you have a data problem
If your AI project depends on data pipelines — cleaning, transformation, event streaming from machines, integrating multiple sources — you need a data engineer. This profile builds the infrastructure that reliably supplies AI with data.
If, however, you are building a RAG system over existing documents or an agent that calls existing APIs, a data engineer is not critical in the first phase. Do not overestimate the need for this role — bring it in when you genuinely have a data problem, not pre-emptively.
MLOps engineer — from a certain scale onward
MLOps covers model deployment, drift monitoring, version management, and retraining pipelines. It is a critical role, but only once you have models in production that need to be maintained and updated.
For initial projects, the ML engineer typically covers this function alone — a dedicated MLOps specialist is unnecessary until there is something to manage. Bring one in when you are managing more than three to five production models or are dealing with retraining cycles.
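As an illustration of what "drift monitoring" can mean at this early stage, the following sketch computes a population stability index (PSI) between a reference sample and a recent sample of one numeric feature. PSI is only one of several possible drift measures, and the text above does not prescribe it. The bin count, the `eps` floor, and the sample data are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """Compare two samples of one numeric feature using equal-width bins.

    Bin edges come from the reference ("expected") sample. Values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant reference sample: everything lands in one bin

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: a reference window and a clearly shifted recent window.
reference = [i / 100 for i in range(100)]
recent = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, recent)
print(psi_same, psi_shifted)
```

A widely quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on the feature and the use case. A check like this is something a single ML engineer can run on a schedule long before a dedicated MLOps role is justified.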
Product owner — an underestimated but essential role
Every AI project needs someone who is accountable for which problem you are solving and how you measure success. Without this role, the ML/AI engineer optimises for technically interesting things, not business value.
A product owner (or AI product manager) defines success metrics before the project begins, prioritises use cases by ROI, and bridges communication between the technical team and stakeholders. This role can be filled by an internal project manager with sufficient technical understanding — it does not have to be an AI specialist.
Minimum viable team for a first project
From experience, for a typical first production AI project (RAG system, agent, LLM integration):
- 1× ML/AI engineer (full-time during implementation)
- 1× domain expert (part-time, 4–8 h/week)
- 1× product owner / business owner (part-time, 2–4 h/week)
This is the minimum viable team. Fewer than three people in this configuration is a risky setup — either technical competence is missing, or the link to real business context is missing.
For more complex projects (multi-agent systems, fine-tuning a custom model, integration with multiple systems) add a data engineer and an MLOps specialist. For guidance on when fine-tuning a model makes sense, see RAG vs fine-tuning — how to decide.
In-house team vs. partner — when to choose which
This is a question companies ask too late — usually only after the hiring process stalls or the project gets stuck.
Build an in-house team when:
- AI is the core of your product or a key competitive advantage (not just support for an existing process)
- You have a long investment horizon (12+ months) and a stable product to build AI on top of
- You need full control over data, models, and pipelines (regulated environment, GDPR-critical systems)
- You plan to iterate quickly and in short cycles; an externally managed team is slower here
Bring in an external partner when:
- You are running your first or second AI project and do not yet have internal competence
- You need a fast result (3–6 months); hiring and onboarding an internal team typically takes 4–9 months
- The use case is well-defined and requires only maintenance after deployment, not active development
- You want knowledge transfer: a good partner not only delivers the solution but also trains your internal team to work with it
We covered the decision between building in-house and buying a ready-made solution in more detail in Build vs buy AI solution.
A hybrid model works well: an external partner delivers the first project, the internal team learns alongside them and takes ownership of maintenance and development. What matters is that the partner actively supports this transfer — rather than simply delivering and walking away.
Common mistakes when building a team
You search for an "AI expert" — and cannot find one
A generalist AI expert does not exist. Every genuinely experienced specialist has a focus: inference and serving, fine-tuning, agent orchestration, RAG architecture. Look for someone with specific relevance to your use case — not a universal genius.
You underestimate the domain expert's time commitment
Companies typically plan for "a few hours a month for consultations." In practice, a quality AI project requires regular, consistent involvement from the domain expert — not a one-off kickoff at the start and a sign-off at the end.
You mistake experimentation for production competence
A candidate who shows you impressive prototypes in a Colab notebook does not necessarily have experience with production deployment. Ask for specific examples: What exactly did you ship to production? How did you handle monitoring? How did you respond to a regression in quality? The answers to these questions separate experimenters from engineers.
You scale before validating the use case
We see this regularly: a company hires a full team (three to five people), buys GPU infrastructure, and six months later discovers the use case was not valuable enough for production deployment. Validate first with a minimal team and an external solution; scale up once you know what works.
Signs that the team has the wrong composition
From experience — warning signals we see with clients:
- The project has been stuck in the prototype phase for more than three months — typically the ML/AI engineer with production experience is missing, or there is no product owner to define exit criteria
- The system "does not work in practice" despite good benchmarks — the domain expert is absent from the evaluation cycle
- The team cannot answer the question "how do you measure success" — the product owner is missing, or metrics are defined in purely technical terms (accuracy, F1) with no business link
- Hiring has been ongoing for 6+ months with no result — the profile is too broad or salary expectations do not match the market; consider an external partner for the first phase
How to proceed if you are starting out
If you are building your first AI project and have no internal team, we recommend:
1. Define the use case and success metrics before any hiring. Without this step you do not know who you are looking for.
2. Identify a domain expert internally. This is a role you cannot buy from outside.
3. Consider an external partner for the first phase: faster start, lower risk, and the opportunity for knowledge transfer.
4. Start hiring an ML/AI engineer in parallel with the pilot, so the team is ready to take over the project at the right moment.
For more on how to structure the first 90 days of an AI project, see How to start with AI in your company.
Frequently asked questions
Do I need a data scientist for an LLM-based AI project?
In most cases no — at least not as the primary role. A data scientist is valuable for exploratory analysis and statistical modelling. Projects built on LLMs (RAG, agents, integrations) primarily need an ML/AI engineer who understands both models and production software. A data scientist can complement the team later — for example in evaluation or output analysis.
What is the minimum number of people needed for a production AI project?
From experience: three roles are the minimum — an ML/AI engineer (full-time), a domain expert (part-time), and a product owner / business owner (part-time). A team of one person, or a team without a domain expert, carries significantly higher risk of failure or getting stuck in the prototype phase.
Is it better to hire a team or go with an external partner?
It depends on your horizon and where you are today. For a first project we recommend an external partner or a hybrid model — it is faster and reduces the risk of costly hiring before the use case is validated. An in-house team makes sense when AI is the core of your product and you have a long-term investment horizon.
How do I recognise that a candidate has production experience rather than just prototype experience?
Ask about specific deployments: What exactly ran in production? What was the data volume or user load? How did you handle monitoring and error states? How did you respond when output quality degraded? Prototypers answer in general terms; production engineers describe specific decisions and trade-offs.
Does the domain expert need to understand AI?
No — and it is not even desirable. The domain expert needs to understand the process you are automating: be able to judge whether an output is correct, identify edge cases, and define what a "good result" looks like from a business perspective. Technical knowledge of AI is not required here.
*If you are working through team composition for an AI project, or weighing what to build in-house versus what to hand to a partner, we are happy to meet for a free consultation. We will help you assess which roles you actually need — and when external collaboration makes more sense than hiring.*
