Glossary

SFT

Supervised fine-tuning (SFT) adapts a pretrained model by training it to reproduce curated target outputs for specified inputs or conversational contexts.

For AI leaders, model and data teams, evaluation teams, and technical buyers

Category: Alignment and post-training

Full Definition

In modern foundation-model development, SFT usually refers to instruction or behavior tuning after pretraining. The examples can include question-answer pairs, multi-turn conversations, code, structured outputs, tool calls, critiques, domain solutions, multimodal responses, or demonstrations of policy-compliant behavior. The model is optimized with a supervised objective over the target sequence or action representation.
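For text targets, the supervised objective is often a token-level negative log-likelihood averaged over target positions only. The sketch below illustrates that idea in plain Python. It assumes the model already exposes the probability it assigned to each reference token; the function name `masked_nll` and the toy numbers are illustrative, not taken from any particular framework.

```python
import math

def masked_nll(token_probs, loss_mask):
    """Average negative log-likelihood over positions where loss_mask is 1.

    token_probs: probability the model assigned to each reference token.
    loss_mask:   1 for target (assistant) tokens, 0 for prompt tokens.
    """
    if len(token_probs) != len(loss_mask):
        raise ValueError("token_probs and loss_mask must have the same length")
    n_target = sum(loss_mask)
    if n_target == 0:
        raise ValueError("no target tokens to train on")
    total = -sum(math.log(p) for p, m in zip(token_probs, loss_mask) if m)
    return total / n_target

# Hypothetical example: two prompt tokens (masked out), three target tokens.
probs = [0.9, 0.8, 0.5, 0.25, 0.5]
mask = [0, 0, 1, 1, 1]
loss = masked_nll(probs, mask)
print(round(loss, 4))
```

Only the three target positions contribute here, so the prompt probabilities have no effect on the loss. Changing the mask changes what the model is trained to imitate.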

SFT teaches a model what a good response or trajectory looks like under the supplied distribution. It does not guarantee that the model understands the underlying rule, generalizes to unseen conditions, remains calibrated, or correctly refuses unsafe requests. Whether the adaptation is useful depends on dataset composition, example quality, model scale, and training recipe, and it can only be confirmed by evaluation on protected, held-out data.

How It Works in Practice

A production SFT pipeline begins with an intended behavior and task taxonomy. Data may be written by experts, derived from licensed or customer sources, generated by a stronger model and verified, extracted from successful tool executions, or transformed from existing records. Each example is normalized into a versioned schema, checked for source rights and privacy, validated automatically, reviewed semantically, and assigned use and quality classes.

Training mixtures should preserve source and slice metadata so teams can inspect balance across task, domain, language, difficulty, modality, policy category, and origin. Deduplication and contamination checks help prevent memorization and evaluation leakage. After training, capability, safety, style, and regression tests determine whether the examples changed the intended behavior and whether a smaller or more targeted mixture would be preferable.
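As a rough illustration of the deduplication and contamination checks, the sketch below drops exact duplicates after light normalization and flags training examples that share a word n-gram with an evaluation item. The helper names, the normalization rule, and the 5-gram threshold are assumptions chosen for the example; catching true near-duplicates would need a fuzzier method such as MinHash, which is not shown.

```python
import hashlib
import re

def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def fingerprint(text):
    """Stable hash of the normalized text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()

def ngrams(text, n=5):
    """Set of word n-grams in the normalized text."""
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def dedupe(examples):
    """Keep the first occurrence of each normalized text."""
    seen, kept = set(), []
    for ex in examples:
        fp = fingerprint(ex)
        if fp not in seen:
            seen.add(fp)
            kept.append(ex)
    return kept

def overlaps_eval(example, eval_items, n=5):
    """True if the example shares any word n-gram with an evaluation item."""
    grams = ngrams(example, n)
    return any(grams & ngrams(item, n) for item in eval_items)

# Hypothetical data for illustration only.
train = [
    "Sort the list in ascending order.",
    "sort the  list in ascending order.",
    "Explain how binary search works on a sorted array.",
]
eval_items = ["Question: explain how binary search works on a sorted array of integers."]

unique_train = dedupe(train)
flagged = [ex for ex in unique_train if overlaps_eval(ex, eval_items)]
```

In this toy run the second example is removed as a duplicate of the first, and the binary-search example is flagged for review because it overlaps the evaluation item.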

Why It Matters for AI Data

SFT is often the first high-leverage step for turning a general model into a usable assistant, specialist, or tool-using system. It can establish output formats and baseline behavior before preference optimization. For data procurement, “SFT data” is too broad to be a specification: buyers need the atomic record, target provenance, expert qualifications, source mix, difficulty, acceptance criteria, and evidence that the mixture improves a held-out system metric.

What a Production Record May Contain

Field or artifact | Purpose
Input | Instruction, conversation history, evidence, attachments, environment state, and task tags.
Target | Assistant output, structured object, code, action, tool call, or complete trajectory.
Source and verification | Human/synthetic/licensed origin, author qualification, evidence, tests, and reviewer status.
Training controls | Template, loss mask, truncation, weight, split, and mixture membership.
Governance | Rights, privacy, sensitivity, retention, release version, and limitations.
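One way to make these fields concrete is a small versioned record type serialized as one JSON object per line. The sketch below is only an assumed layout: the class name `SFTRecord`, the key names, the required-governance list, and all example values are hypothetical and would be replaced by a team's own schema.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class SFTRecord:
    schema_version: str  # hypothetical versioning field
    input: dict          # instruction, history, evidence, task tags
    target: dict         # assistant output, tool call, or trajectory
    source: dict         # origin, author qualification, reviewer status
    training: dict       # template, loss mask policy, weight, split
    governance: dict     # rights, privacy, retention, release version

# Assumed minimum governance keys; a real program would define its own.
REQUIRED_GOVERNANCE = ("rights", "privacy", "release_version")

def missing_governance(record):
    """Return the required governance keys that are absent or empty."""
    return [k for k in REQUIRED_GOVERNANCE if not record.governance.get(k)]

record = SFTRecord(
    schema_version="1.0",
    input={"instruction": "Summarize the ticket.", "task_tags": ["summarization"]},
    target={"type": "text", "content": "Customer reports a login failure."},
    source={"origin": "human", "reviewer_status": "approved"},
    training={"template": "chat-v1", "split": "train", "weight": 1.0},
    governance={"rights": "licensed", "privacy": "no-pii", "release_version": "r1"},
)

line = json.dumps(asdict(record), sort_keys=True)  # one JSONL line
print(missing_governance(record))
```

Keeping source, training, and governance metadata on the record itself, rather than in a separate spreadsheet, lets later mixture and audit steps filter on it directly.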

Quality and Governance Risks

  • Low-quality targets teach the model to produce hallucinations, formatting artifacts, unsupported reasoning, or unsafe shortcuts, and to do so confidently.
  • Overly homogeneous examples can narrow behavior, reduce robustness, and erase useful pretrained capabilities.
  • Synthetic examples can recursively reproduce model biases or source content unless lineage and verification are preserved.
  • Near-duplicates and benchmark leakage can inflate evaluation and encourage memorization.
  • Loss masking, chat templates, tool schemas, and truncation can silently change what the model is trained to imitate.
  • Examples containing private reasoning or unsupported rationales should not be marketed as ground-truth internal thought.
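The loss-masking and truncation risk can be checked mechanically. The sketch below assumes simple right-truncation and a per-token 0/1 loss mask, and measures how many target tokens still receive loss after truncation; the function names and the toy lengths are illustrative.

```python
def truncate(tokens, loss_mask, max_len):
    """Right-truncate a tokenized example and its loss mask together."""
    return tokens[:max_len], loss_mask[:max_len]

def supervised_fraction(mask_before, mask_after):
    """Share of the original target tokens that still receive loss."""
    before = sum(mask_before)
    return sum(mask_after) / before if before else 0.0

# Hypothetical example: 6 prompt tokens followed by 4 target tokens.
tokens = list(range(10))
mask = [0] * 6 + [1] * 4

_, mask_8 = truncate(tokens, mask, max_len=8)
_, mask_6 = truncate(tokens, mask, max_len=6)

print(supervised_fraction(mask, mask_8))  # half of the target survives
print(supervised_fraction(mask, mask_6))  # the whole target was cut off
```

An example whose fraction falls to zero contributes no supervised signal at all, so reporting this statistic per slice is one way to surface silent truncation before training.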

Practical Example

A software-agent SFT record may contain a repository snapshot, an issue, the allowed tools, the executed sequence of file reads and edits, test results, the final patch, and an outcome verifier. The data team preserves the native trajectory and terminal state, removes secrets, checks that the tests actually pass, labels any intervention, and distinguishes successful, recovery, and failed examples. A protected repository set then measures whether the fine-tuned agent generalizes beyond the demonstrations.
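The three outcome classes could be assigned with a rule like the one sketched below. The definitions are assumptions made for illustration: a trajectory is "failed" if the final tests do not pass, "recovery" if they pass after at least one failed step, and "successful" otherwise. The step format and the function name are hypothetical.

```python
def label_trajectory(steps, final_tests_passed):
    """Classify a trajectory as 'successful', 'recovery', or 'failed'.

    steps: list of dicts with a boolean 'ok' flag per tool call (assumed format).
    final_tests_passed: result reported by the outcome verifier.
    """
    if not final_tests_passed:
        return "failed"
    if any(not step["ok"] for step in steps):
        return "recovery"
    return "successful"

# Hypothetical trajectories for illustration only.
clean = [{"tool": "read_file", "ok": True}, {"tool": "edit_file", "ok": True}]
bumpy = [{"tool": "edit_file", "ok": False}, {"tool": "edit_file", "ok": True}]

labels = [
    label_trajectory(clean, final_tests_passed=True),
    label_trajectory(bumpy, final_tests_passed=True),
    label_trajectory(bumpy, final_tests_passed=False),
]
print(labels)
```

Storing the label alongside the trajectory lets the training mixture weight or exclude each class deliberately instead of treating every demonstration as equally desirable.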

Related Terms

RLHF · DPO · Data Curation · Tool-Use Trajectory

Key Takeaway

SFT is supervised imitation of selected behavior. A credible SFT program couples high-quality, traceable targets with mixture governance and independent evaluation rather than equating more instruction pairs with a better model.