Infrastructure

Model Versioning

Track every model, dataset, and experiment. Enable reproducibility, rollback, and compliance in production ML.

📜 Version History

Version  Date        Change                      Accuracy  Actions
v1.0.0   2023-06-01  Initial production model    68.0%     Current
v1.1.0   2023-08-15  Added weather features      71.0%
v2.0.0   2023-11-20  Switched to XGBoost         74.0%
v2.1.0   2024-01-10  Hyperparameter tuning       76.0%
v2.2.0   2024-01-25  Feature engineering update  75.0%
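
The history above shows why rollback matters: v2.2.0 scored lower than v2.1.0. As a minimal sketch, assuming the table is held in memory, the check could look like this. `ModelVersion` and `rollback_candidate` are hypothetical names used for illustration, not part of any tool named in this tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVersion:
    version: str
    date: str        # ISO date of the release
    change: str
    accuracy: float  # percent, as shown in the table


# Rows copied from the version history table.
HISTORY = [
    ModelVersion("v1.0.0", "2023-06-01", "Initial production model", 68.0),
    ModelVersion("v1.1.0", "2023-08-15", "Added weather features", 71.0),
    ModelVersion("v2.0.0", "2023-11-20", "Switched to XGBoost", 74.0),
    ModelVersion("v2.1.0", "2024-01-10", "Hyperparameter tuning", 76.0),
    ModelVersion("v2.2.0", "2024-01-25", "Feature engineering update", 75.0),
]


def rollback_candidate(history):
    """Return the best earlier version if the latest release regressed, else None."""
    latest = history[-1]
    best_prior = max(history[:-1], key=lambda v: v.accuracy)
    return best_prior if best_prior.accuracy > latest.accuracy else None


candidate = rollback_candidate(HISTORY)
if candidate is not None:
    print(f"{HISTORY[-1].version} regressed; consider rolling back to {candidate.version}")
```

Accuracy is the only signal used here; a real rollback decision would usually weigh other metrics as well.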

📦 What to Version

  • 💻 Code (tool: Git): training scripts, configs
  • 📊 Data (tool: DVC): training datasets, features
  • 🧠 Model (tool: MLflow): weights, artifacts
  • 🔬 Experiments (tool: W&B): hyperparameters, metrics
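
Large data and model files are usually versioned indirectly: a small pointer holding a content hash goes into Git, while the file itself lives in separate storage. The sketch below illustrates that idea with the standard library only. The pointer layout and the `.pointer.json` suffix are assumptions made for this example and are not DVC's `.dvc` format.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pointer(path):
    """Write a small JSON pointer next to `path` and return the pointer's path.

    The pointer layout is hypothetical; it is not DVC's .dvc format.
    """
    path = Path(path)
    pointer = {
        "path": path.name,
        "sha256": file_sha256(path),
        "size": path.stat().st_size,
    }
    pointer_path = path.with_name(path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(pointer, indent=2) + "\n")
    return pointer_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data_file = Path(tmp) / "training_data.parquet"
        data_file.write_bytes(b"stand-in bytes, not real parquet")
        print(write_pointer(data_file).read_text())
```

Because the hash changes whenever the bytes change, committing the pointer ties each Git commit to one exact version of the data.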

🔗 Lineage Tracking

Data

  • raw_odds_2024.parquet
  • player_stats_v3.csv
→

Features

  • feature_store/nfl_v2
  • embeddings/team_v1
→

Training

  • config.yaml
  • train.py@a3f2d1
→

Model

  • model_v2.1.0.pkl
  • scaler.pkl
→

Deploy

  • Dockerfile
  • k8s/deployment.yaml

Full lineage: know exactly what data, code, and config produced each model
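
One way to make lineage queryable is to store it as a record alongside the model. The sketch below uses the file names from the pipeline above. `LineageRecord` and `upstream_of` are hypothetical names, and linking this record to v2.1.0 is inferred from the artifact name `model_v2.1.0.pkl`.

```python
import json
from dataclasses import asdict, dataclass

# Pipeline stages in the order shown above.
STAGES = ["data", "features", "training", "model", "deploy"]


@dataclass(frozen=True)
class LineageRecord:
    model_version: str
    data: tuple
    features: tuple
    training: tuple
    model: tuple
    deploy: tuple


record = LineageRecord(
    model_version="v2.1.0",
    data=("raw_odds_2024.parquet", "player_stats_v3.csv"),
    features=("feature_store/nfl_v2", "embeddings/team_v1"),
    training=("config.yaml", "train.py@a3f2d1"),
    model=("model_v2.1.0.pkl", "scaler.pkl"),
    deploy=("Dockerfile", "k8s/deployment.yaml"),
)


def upstream_of(rec, stage):
    """Return every input that fed into `stage`, keyed by earlier stage."""
    idx = STAGES.index(stage)
    return {name: list(getattr(rec, name)) for name in STAGES[:idx]}


# Serialize the record so it can be stored next to the model artifact.
print(json.dumps(asdict(record), indent=2))
# Answer "what produced this model?" from the record alone.
print(upstream_of(record, "model"))
```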

✅ Best Practices

Semantic Versioning

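MAJOR.MINOR.PATCH: bump MAJOR for breaking changes (such as a new model family), MINOR for backward-compatible improvements, PATCH for fixes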

Immutable Artifacts

Never modify, always create new

Full Lineage

Track data → code → model → predictions

Reproducibility

Same inputs (data, code, config, random seed) → same outputs

Rollback Ready

Previous versions instantly deployable

A/B Comparison

Compare versions in production
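
Two of these practices are easy to enforce in code: semantic version bumps and immutable artifacts. The following is a minimal sketch, not a registry implementation. `bump` and `save_immutable` are hypothetical helpers, and the `model_<version>.pkl` naming follows the artifact name used in the lineage example.

```python
import tempfile
from pathlib import Path


def bump(version, part):
    """Return the next MAJOR.MINOR.PATCH version, keeping an optional 'v' prefix."""
    prefix = "v" if version.startswith("v") else ""
    major, minor, patch = (int(p) for p in version[len(prefix):].split("."))
    if part == "major":
        major, minor, patch = major + 1, 0, 0
    elif part == "minor":
        minor, patch = minor + 1, 0
    elif part == "patch":
        patch += 1
    else:
        raise ValueError(f"unknown part: {part!r}")
    return f"{prefix}{major}.{minor}.{patch}"


def save_immutable(directory, version, payload):
    """Write model bytes once; refuse to overwrite an existing version."""
    target = Path(directory) / f"model_{version}.pkl"
    # Mode "xb" raises FileExistsError if the file already exists.
    with open(target, "xb") as fh:
        fh.write(payload)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        version = bump("v2.1.0", "minor")
        print(save_immutable(tmp, version, b"serialized model bytes").name)
        try:
            save_immutable(tmp, version, b"different bytes")
        except FileExistsError:
            print(f"{version} already exists; publish a new version instead")
```

Opening the file in exclusive-create mode means the overwrite check and the write happen in one step, so an existing version cannot be replaced by accident.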

DVC + Git Example

# Initialize DVC in your repo
dvc init

# Track large data files
dvc add data/training_data.parquet
git add data/training_data.parquet.dvc data/.gitignore

# Track model files
dvc add models/player_projection_v2.pkl
git add models/player_projection_v2.pkl.dvc models/.gitignore

# Commit everything together
git add .
git commit -m "v2.1.0: Hyperparameter tuning"
git tag v2.1.0

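# Push the tag to Git and the data to remote storage (S3, GCS).
# Assumes a Git remote named "origin" and a default DVC remote
# already configured with `dvc remote add -d`.
git push origin v2.1.0
dvc push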

# Reproduce on another machine
git clone <repo>
cd <repo-dir>  # directory created by git clone
git checkout v2.1.0
dvc pull  # Downloads data/models from remote

# Compare versions
dvc diff v2.0.0 v2.1.0
dvc metrics diff v2.0.0 v2.1.0
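
Conceptually, comparing two versions comes down to a per-metric delta. The sketch below illustrates that comparison using the accuracies from the version history table. It is not how DVC implements `dvc metrics diff`, and `metrics_diff` is a hypothetical helper.

```python
def metrics_diff(old, new):
    """Compare two metric dicts; delta is None when a metric is missing on one side."""
    rows = {}
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        delta = None if a is None or b is None else round(b - a, 6)
        rows[name] = {"old": a, "new": b, "delta": delta}
    return rows


# Accuracies taken from the version history table.
v2_0_0 = {"accuracy": 74.0}
v2_1_0 = {"accuracy": 76.0}

for name, row in metrics_diff(v2_0_0, v2_1_0).items():
    print(f"{name}: {row['old']} -> {row['new']} (delta {row['delta']})")
```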

✅ Key Takeaways

  • Version code (Git), data (DVC), models (MLflow)
  • Use semantic versioning (MAJOR.MINOR.PATCH)
  • Never modify artifacts; always create new versions
  • Track full lineage for reproducibility
  • Enable instant rollback in production
  • Versioning and lineage support audit and compliance

Pricing Models & Frameworks Tutorial