SFT
Supervised Fine-Tuning (SFT): Models are trained on a dataset of instructions and responses. It adjusts the weights in the LLM to minimize the difference between the generated answers and ground-truth responses, acting as labels.[1]
However, in some cases, updating the knowledge of the model is not enough and you want to modify the behavior of the LLM. In these situations, you will need a supervised fine-tuning (SFT) dataset, which is a collection of prompts and their corresponding responses. SFT datasets can be manually curated by users or generated by other LLMs. Supervised fine-tuning is especially important for LLMs such as ChatGPT, which have been designed to follow user instructions and stay on a specific task across long stretches of text. This specific type of fine-tuning is also referred to as instruction fine-tuning [2]