SFT

From Wiki BackProp

Revision as of 12:00, 1 August 2023, by Jboscher

Supervised Fine-Tuning (SFT): A model is trained on a dataset of instruction and response pairs. Training adjusts the weights of the LLM to minimize the difference between the answers it generates and the ground-truth responses, which act as labels.
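
The "difference" between generated and ground-truth responses is commonly measured as token-level cross-entropy. The following is a minimal sketch of that idea in plain Python, not the method of any particular library. Several things here are assumptions made for illustration: the function names (`softmax`, `sft_loss`, `sgd_step`), the toy vocabulary of 4 tokens, the choice to mask prompt tokens out of the loss, and the simplification of treating the logits themselves as the trainable weights.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sft_loss(logits_per_position, target_ids, loss_mask):
    """Mean negative log-likelihood of the target tokens where loss_mask is 1.

    Assumption: prompt (instruction) positions carry mask 0, so only the
    response tokens act as labels.
    """
    total, count = 0.0, 0
    for logits, target, keep in zip(logits_per_position, target_ids, loss_mask):
        if keep:
            total += -math.log(softmax(logits)[target])
            count += 1
    if count == 0:
        raise ValueError("loss_mask selects no positions")
    return total / count


def sgd_step(logits_per_position, target_ids, loss_mask, lr=1.0):
    """One gradient-descent step on the toy 'weights' (here, the logits).

    For cross-entropy, the gradient with respect to the logits is
    softmax(logits) - one_hot(target), averaged over the unmasked positions.
    """
    n = sum(loss_mask)
    updated = []
    for logits, target, keep in zip(logits_per_position, target_ids, loss_mask):
        if not keep:
            updated.append(list(logits))  # masked positions are left unchanged
            continue
        probs = softmax(logits)
        grad = [(p - (1.0 if i == target else 0.0)) / n for i, p in enumerate(probs)]
        updated.append([w - lr * g for w, g in zip(logits, grad)])
    return updated


# Toy sequence of 4 tokens: 2 instruction tokens followed by 2 response tokens.
logits = [[0.0] * 4 for _ in range(4)]   # uniform predictions to start
targets = [1, 3, 2, 0]                   # ground-truth token ids
mask = [0, 0, 1, 1]                      # loss only on the response tokens

loss_before = sft_loss(logits, targets, mask)
new_logits = sgd_step(logits, targets, mask)
loss_after = sft_loss(new_logits, targets, mask)
print(loss_before, loss_after)
```

With uniform starting predictions, the initial loss equals ln(4), and a single update step lowers it because probability mass moves toward the ground-truth response tokens. In a real LLM the gradient would flow back through the network's parameters rather than being applied to the logits directly.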

References

[1] Fine-Tune Your Own Llama 2 Model in a Colab Notebook. https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32
