Knowledge Distillation
Training a smaller 'student' model to reproduce the outputs of a larger 'teacher' model, typically by matching the teacher's temperature-softened class probabilities alongside the ground-truth labels. The resulting student is cheaper to store and run while retaining much of the teacher's accuracy.
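A minimal sketch of the classic distillation objective (Hinton et al., 2015) in PyTorch: the student's loss blends a KL-divergence term against the teacher's softened distribution with ordinary cross-entropy on hard labels. The temperature `T`, mixing weight `alpha`, and the `Linear` placeholder models and random data are illustrative assumptions, not a prescribed setup.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target and hard-target losses for distillation."""
    # Soft targets: KL divergence between temperature-softened distributions.
    # The T**2 factor keeps the soft-target gradient magnitude comparable to
    # the hard-target term as T varies.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Hard targets: standard cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Illustrative training step with placeholder models and dummy data.
teacher = torch.nn.Linear(16, 10)   # stands in for a large pretrained model
student = torch.nn.Linear(16, 10)   # smaller model being trained
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

inputs = torch.randn(8, 16)
labels = torch.randint(0, 10, (8,))

with torch.no_grad():               # the teacher stays frozen
    teacher_logits = teacher(inputs)
loss = distillation_loss(student(inputs), teacher_logits, labels)
loss.backward()
optimizer.step()
```

In practice `alpha` and `T` are tuned per task; higher temperatures expose more of the teacher's relative ranking over incorrect classes, which is often where the useful "dark knowledge" lives.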