Textual Inversion
Current version as of 5 January 2023, 13:35
Textual Inversion is defined as follows:
"We learn to generate specific concepts, like personal objects or artistic styles, by describing them using new "words" in the embedding space of pre-trained text-to-image models. These can be used in new sentences, just like any other word."
Textual inversion is a process that quickly "teaches" a new word to the text model and pulls its embedding close to some visual representation. This is achieved by adding a new token to the vocabulary, freezing the weights of all the models (except the text encoder), and training with a few representative images.
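The recipe above can be sketched at toy scale. This is an illustrative assumption, not the authors' code: a small embedding table stands in for the text encoder's vocabulary, one new row is added for the pseudo-word, all existing rows are frozen, and only the new row is optimized toward a fixed target vector that plays the role of the image-reconstruction objective.

```python
import torch

torch.manual_seed(0)
vocab_size, dim = 10, 4  # toy vocabulary and embedding size (assumed)

# Extend the vocabulary by one slot for the new pseudo-word (e.g. "S*").
embedding = torch.nn.Embedding(vocab_size + 1, dim)
new_token_id = vocab_size
frozen_rows = embedding.weight.data[:vocab_size].clone()  # snapshot to verify freezing

# Freeze the whole table; train the new row as a separate leaf parameter.
embedding.weight.requires_grad_(False)
new_row = torch.nn.Parameter(embedding.weight[new_token_id].clone())

# Placeholder for the visual concept the pseudo-word should capture.
target = torch.randn(dim)

opt = torch.optim.Adam([new_row], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(new_row, target)
    loss.backward()
    opt.step()

# Write the learned embedding back; every other row is untouched.
embedding.weight.data[new_token_id] = new_row.data
```

In the real method the loss is the diffusion model's denoising objective on a few images of the concept, but the trainable surface is the same: a single new embedding row, with everything else frozen.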
References
- [1] An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion (https://textual-inversion.github.io)