Textual Inversion

From Wiki BackProp







Current revision as of 5 January 2023, 13:35

Textual Inversion is defined as follows:

"We learn to generate specific concepts, like personal objects or artistic styles, by describing them using new "words" in the embedding space of pre-trained text-to-image models. These can be used in new sentences, just like any other word."

Textual inversion is a process in which you quickly "teach" a new word to the text model and place its embedding close to some visual representation. This is achieved by adding a new token to the vocabulary, freezing the weights of all the models (except the text encoder, where in practice only the new token's embedding is updated), and training with a few representative images.
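
The mechanics described above (add a token, keep every existing weight frozen, update only the new embedding) can be illustrated with a toy sketch in plain Python. This is not the actual training procedure: the names add_token and train_new_token, the two-dimensional vectors, and the target "image feature" vectors are all hypothetical, and the squared-distance loss is a stand-in for the diffusion reconstruction loss that real textual inversion backpropagates through the frozen image model.

```python
import random


def add_token(vocab, embeddings, token, dim, seed=0):
    """Append a new token with a small random embedding and return its index."""
    rng = random.Random(seed)
    vocab[token] = len(embeddings)
    embeddings.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return vocab[token]


def train_new_token(embeddings, new_id, targets, lr=0.1, steps=200):
    """Update only embeddings[new_id]; every other row stays frozen.

    Stand-in loss (an assumption for illustration): 0.5 * ||row - target||^2,
    whose gradient with respect to the row is simply (row - target).
    """
    for _ in range(steps):
        for target in targets:
            row = embeddings[new_id]
            embeddings[new_id] = [r - lr * (r - t) for r, t in zip(row, target)]
    return embeddings[new_id]


# Hypothetical pre-trained vocabulary and embedding table.
vocab = {"a": 0, "photo": 1, "of": 2}
embeddings = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
frozen_before = [row[:] for row in embeddings]

# Hypothetical feature vectors standing in for a few representative images.
targets = [[1.0, 0.0], [0.8, 0.2]]

new_id = add_token(vocab, embeddings, "<my-concept>", dim=2)
learned = train_new_token(embeddings, new_id, targets)
print(new_id, learned)
```

After training, the new row sits near the target vectors while the pre-existing rows are untouched, which is why the new "word" can then be used in prompts alongside ordinary words.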


References

  • [1] An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion