ArunKumar R’s Post

Transformers

Transformers have been a major milestone in the field of NLP and are heavily used in generative AI. There are three variants of transformer-based models:

1. Encoder-only
2. Decoder-only
3. Encoder-decoder

Encoder-only models: These are also called autoencoding models and are pretrained using a technique called masked language modeling. Text with a randomly masked token is fed to the model, which must predict the masked token. For example, consider the text "If you don't stop at the sign, you will get a ticket." The training input passed to the model is "If you don't _____ at the sign, you will get a ticket." and the model is expected to predict the token (word) "stop". These models build bidirectional representations of the input, so each token is understood in its full left and right context. Examples: the BERT family of models. (A minimal code sketch of this objective follows below.)

Decoder-only models: These are called autoregressive models and are pretrained using a technique called causal language modeling: the model predicts the next token from the previous tokens. Using the same text, the training input passed to the decoder model is "If you don't ______" and the model still tries to predict the word "stop", but only from the preceding tokens. These models are used for generative tasks, including question answering. Examples: the GPT family, Falcon, and Llama models. (A second sketch below shows this.)

Encoder-decoder models: These are called sequence-to-sequence models, and the pretraining objective varies from model to model. For example, T5 (the base of the popular FLAN-T5) is pretrained with consecutive multi-token masking called span corruption. Here the training input is "If you don't _____ _____ the sign, you will get a ticket." and the model tries to predict the tokens "stop at". These models are good at translation tasks. Examples: the T5 family. (A third sketch below shows this.)

In all the explanations above we used just one sentence of text; the LLMs you see in the market are trained on huge volumes of text from across the internet.

#LLM #encoder #decoder #transformer #genai
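To make the masked-language-modeling objective concrete, here is a minimal sketch. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint, neither of which the post names; it simply asks BERT to fill in the blank from the example sentence.

```python
# Minimal masked-language-modeling sketch, assuming the Hugging Face
# "transformers" library is installed (pip install transformers torch).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# [MASK] plays the role of the blank in the post's example.
text = "If you don't [MASK] at the sign, you will get a ticket."

# BERT scores candidate tokens for the blank using context on BOTH sides.
for pred in fill_mask(text, top_k=3):
    print(f"{pred['token_str']:>10}  score={pred['score']:.3f}")
```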
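For the causal-language-modeling objective, a decoder-only model can be sketched the same way. Again, the transformers library and the gpt2 checkpoint are my assumptions, not the post's; the model continues the prompt using only the tokens to its left.

```python
# Minimal causal-language-modeling sketch, assuming the Hugging Face
# "transformers" library. GPT-2 predicts the next token left-to-right.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model sees only the preceding tokens and must continue from there.
outputs = generator(
    "If you don't",
    max_new_tokens=8,
    do_sample=True,          # sample so the three continuations differ
    num_return_sequences=3,
)
for out in outputs:
    print(out["generated_text"])
```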
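Finally, a sketch of span corruption with an encoder-decoder model. This assumes the transformers library and the t5-small checkpoint; T5 marks the corrupted span with a sentinel token (<extra_id_0>) and the decoder generates the missing span.

```python
# Minimal span-corruption sketch, assuming the Hugging Face "transformers"
# and "sentencepiece" packages. T5 replaces a contiguous span with a
# sentinel token and is trained to generate the span back.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# <extra_id_0> stands in for the corrupted span "stop at".
text = "If you don't <extra_id_0> the sign, you will get a ticket."
inputs = tokenizer(text, return_tensors="pt")

# The decoder emits the sentinel followed by its guess for the span,
# e.g. "<extra_id_0> stop at <extra_id_1>".
output_ids = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```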
