huggingface_pytorch-pretrained-bert_gpt.md · Jeremy Lee/pytorch-hubhub

Model Description

GPT was released together with the paper Improving Language Understanding by Generative Pre-Training by Alec Radford et al. at OpenAI. It combines two ideas: the Transformer architecture and large-scale unsupervised pre-training.

Three models based on OpenAI's pre-trained weights are available, along with the associated tokenizer:

openAIGPTModel: raw OpenAI GPT Transformer model (fully pre-trained)
openAIGPTLMHeadModel: OpenAI GPT Transformer with the tied language modeling head on top (fully pre-trained)
openAIGPTDoubleHeadsModel: OpenAI GPT Transformer with the tied language modeling head and a multiple choice classification head on top (OpenAI GPT Transformer is pre-trained, the multiple choice classification head is only initialized and has to be trained)
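The "tied" language modeling head in the list above is commonly understood as reusing the token embedding matrix as the output projection. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the toy matrix, the vocabulary size, and the helper names `embed` and `lm_logits` are illustrative assumptions.

```python
# Toy illustration of a tied language modeling head (a sketch under stated
# assumptions, not the pytorch-pretrained-BERT code).

# Hypothetical embedding matrix: 4 tokens in the vocabulary, hidden size 3.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]


def embed(token_id):
    """Input side: look up the embedding row for a token id."""
    return embeddings[token_id]


def lm_logits(hidden_state):
    """Output side: score every token by its dot product with the hidden state.

    The same `embeddings` matrix is reused here, which is what "tied" means.
    """
    return [sum(h * e for h, e in zip(hidden_state, row)) for row in embeddings]


hidden = [0.2, 0.9, 0.1]  # stand-in for the Transformer's final hidden state
logits = lm_logits(hidden)
# The predicted token is the one with the highest score.
predicted_id = max(range(len(logits)), key=logits.__getitem__)
```

Because the head shares its weights with the embeddings, it adds no new parameters on top of the pre-trained Transformer.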

Requirements

Unlike most other PyTorch Hub models, GPT requires a few additional Python packages to be installed.

pip install tqdm boto3 requests regex ftfy spacy
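Before loading the models, it can help to confirm that these packages are importable. The snippet below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches its pip name, which holds for the six listed above.

```python
import importlib.util

# Import names for the packages in the pip command above.
REQUIRED = ["tqdm", "boto3", "requests", "regex", "ftfy", "spacy"]


def missing_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All additional packages are available.")
```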

Example

Here is an example of how to tokenize text with openAIGPTTokenizer, then either get the hidden states computed by openAIGPTModel or predict the next token using openAIGPTLMHeadModel. Finally, we show how to use openAIGPTDoubleHeadsModel, which combines the language modeling head with a multiple choice classification head.

### First, tokenize the input
#############################
import torch
tokenizer = torch.hub.load('huggingface/pytorch-pretrained-BERT', 'openAIGPTTokenizer', 'openai-gpt')

#  Prepare tokenized input
text = "Who was Jim Henson ? Jim Henson was a puppeteer"
tokenized_text = tokenizer.tokenize(text)
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
tokens_tensor = torch.tensor([indexed_tokens])


### Get the hidden states computed by `openAIGPTModel`
######################################################
model = torch.hub.load('huggingface/pytorch-pretrained-BERT', 'openAIGPTModel', 'openai-gpt')
model.eval()

# Compute hidden states features for each layer
with torch.no_grad():
    hidden_states = model(tokens_tensor)


### Predict the next token using `openAIGPTLMHeadModel`
#######################################################
lm_model = torch.hub.load('huggingface/pytorch-pretrained-BERT', 'openAIGPTLMHeadModel', 'openai-gpt')
lm_model.eval()

# Predict all tokens
with torch.no_grad():
    predictions = lm_model(tokens_tensor)

# Get the last predicted token
predicted_index = torch.argmax(predictions[0, -1, :]).item()
predicted_token = tokenizer.convert_ids_to_tokens([predicted_index])[0]
assert predicted_token == '.</w>'


### Language modeling and multiple choice classification `openAIGPTDoubleHeadsModel`
####################################################################################
double_head_model = torch.hub.load('huggingface/pytorch-pretrained-BERT', 'openAIGPTDoubleHeadsModel', 'openai-gpt')
double_head_model.eval() # Set the model to train mode if used for training

text_bis = "Who was Jim Henson ? Jim Henson was a mysterious young man"
tokenized_text_bis = tokenizer.tokenize(text_bis)
indexed_tokens_bis = tokenizer.convert_tokens_to_ids(tokenized_text_bis)
tokens_tensor = torch.tensor([[indexed_tokens, indexed_tokens_bis]])
mc_token_ids = torch.LongTensor([[len(tokenized_text)-1, len(tokenized_text_bis)-1]])

with torch.no_grad():
    lm_logits, multiple_choice_logits = double_head_model(tokens_tensor, mc_token_ids)
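In the call above, `mc_token_ids` gives, for each choice, the position of the token whose hidden state the multiple choice head reads (here, the last token). Building the input tensor also requires every choice to have the same number of tokens. The sketch below shows one way to prepare both when the choices differ in length, using plain Python lists; the helper name `pad_choices` and the padding id `0` are assumptions for illustration and not part of the library.

```python
def pad_choices(choices, pad_id=0):
    """Pad token-id lists to a common length and record each last real token.

    Returns (padded, last_positions); pad_id=0 is an illustrative assumption.
    """
    max_len = max(len(ids) for ids in choices)
    padded = [ids + [pad_id] * (max_len - len(ids)) for ids in choices]
    last_positions = [len(ids) - 1 for ids in choices]
    return padded, last_positions


# Hypothetical token ids for two choices of different lengths.
choice_a = [11, 12, 13]
choice_b = [11, 12, 14, 15, 16]

padded, last_positions = pad_choices([choice_a, choice_b])
# Wrapping each result in one more list gives the [batch, choices, ...] layout
# used above, e.g. torch.tensor([padded]) and torch.LongTensor([last_positions]).
```

Padded positions would still produce language modeling logits, so a training loss would likely need to ignore them; that step is left out of this sketch.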

Requirement

The model only supports Python 3.

Resources

Paper: Improving Language Understanding by Generative Pre-Training
Blogpost from OpenAI
Initial repository (with detailed examples and documentation): pytorch-pretrained-BERT
