« SFT » : différence entre les versions

De Wiki BackProp
Aller à la navigation Aller à la recherche
(Page créée avec « Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels. == Références == * [https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32] Fine-Tune Your Own Llama 2 Model in a Colab Notebook »)
 
Aucun résumé des modifications
 
Ligne 1 : Ligne 1 :
Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.
Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.[1]
 
However, in some cases, updating the knowledge of the model is not enough and you want to modify the behavior of the LLM. In these situations, you will need a supervised fine-tuning (SFT) dataset, which is a collection of prompts and their corresponding responses. SFT datasets can be manually curated by users or generated by other LLMs. Supervised fine-tuning is especially important for LLMs such as ChatGPT, which have been designed to follow user instructions and stay on a specific task across long stretches of text. This specific type of fine-tuning is also referred to as instruction fine-tuning [2]


== Références ==
== Références ==


* [https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32] Fine-Tune Your Own Llama 2 Model in a Colab Notebook
* [https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32] Fine-Tune Your Own Llama 2 Model in a Colab Notebook
* [https://bdtechtalks.com/2023/07/10/llm-fine-tuning/] The complete guide to LLM fine-tuning

Version actuelle datée du 1 août 2023 à 12:10

Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.[1]

However, in some cases, updating the knowledge of the model is not enough and you want to modify the behavior of the LLM. In these situations, you will need a supervised fine-tuning (SFT) dataset, which is a collection of prompts and their corresponding responses. SFT datasets can be manually curated by users or generated by other LLMs. Supervised fine-tuning is especially important for LLMs such as ChatGPT, which have been designed to follow user instructions and stay on a specific task across long stretches of text. This specific type of fine-tuning is also referred to as instruction fine-tuning [2]

Références

  • [1] Fine-Tune Your Own Llama 2 Model in a Colab Notebook
  • [2] The complete guide to LLM fine-tuning