« SFT » : différence entre les versions

Version actuelle datée du 1 août 2023 à 12:10

Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.[1]

However, in some cases, updating the knowledge of the model is not enough and you want to modify the behavior of the LLM. In these situations, you will need a supervised fine-tuning (SFT) dataset, which is a collection of prompts and their corresponding responses. SFT datasets can be manually curated by users or generated by other LLMs. Supervised fine-tuning is especially important for LLMs such as ChatGPT, which have been designed to follow user instructions and stay on a specific task across long stretches of text. This specific type of fine-tuning is also referred to as instruction fine-tuning [2]

Références

[1] Fine-Tune Your Own Llama 2 Model in a Colab Notebook
[2] The complete guide to LLM fine-tuning

@@ Ligne 1 : / Ligne 1 : @@
-Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.
+Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.[1]
+However, in some cases, updating the knowledge of the model is not enough and you want to modify the behavior of the LLM. In these situations, you will need a supervised fine-tuning (SFT) dataset, which is a collection of prompts and their corresponding responses. SFT datasets can be manually curated by users or generated by other LLMs. Supervised fine-tuning is especially important for LLMs such as ChatGPT, which have been designed to follow user instructions and stay on a specific task across long stretches of text. This specific type of fine-tuning is also referred to as instruction fine-tuning [2]
 == Références ==
 * [https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32] Fine-Tune Your Own Llama 2 Model in a Colab Notebook
+* [https://bdtechtalks.com/2023/07/10/llm-fine-tuning/] The complete guide to LLM fine-tuning

« SFT » : différence entre les versions

Version actuelle datée du 1 août 2023 à 12:10

Références

Menu de navigation

Rechercher

« SFT » : différence entre les versions