SFT

De Wiki BackProp
Révision datée du 1 août 2023 à 12:00 par Jboscher (discussion | contributions) (Page créée avec « Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels. == Références == * [https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32] Fine-Tune Your Own Llama 2 Model in a Colab Notebook »)
(diff) ← Version précédente | Voir la version actuelle (diff) | Version suivante → (diff)
Aller à la navigation Aller à la recherche

Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.

Références

  • [1] Fine-Tune Your Own Llama 2 Model in a Colab Notebook