# fastNLP.modules.encoder.gpt2 module¶

class fastNLP.modules.encoder.gpt2.GPT2Model(config)[source]

Outputs: Tuple comprising various elements depending on the configuration (config) and inputs:
last_hidden_state: torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)

Sequence of hidden-states at the last layer of the model.

past:

list of torch.FloatTensor (one for each layer) of shape (2, batch_size, num_heads, sequence_length, embed_size_per_head) that contains pre-computed hidden states (keys and values in the attention blocks). Can be used (see the past input) to speed up sequential decoding. Token ids whose past has already been given to this model should not be passed as input_ids, as they have already been computed.

hidden_states: (optional, returned when config.output_hidden_states=True)

list of torch.FloatTensor (one for the output of each layer + the output of the embeddings) of shape (batch_size, sequence_length, hidden_size): Hidden-states of the model at the output of each layer plus the initial embedding outputs.

attentions: (optional, returned when config.output_attentions=True)

list of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length): Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
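The composition of the output tuple therefore depends on the two config flags. A minimal, framework-free sketch of which index holds which element (the helper name `output_tuple_layout` is hypothetical, not part of fastNLP):

```python
def output_tuple_layout(output_hidden_states=False, output_attentions=False):
    """Names of the elements GPT2Model outputs, in tuple order, for the
    given config flags (illustrative sketch, not fastNLP API)."""
    # last_hidden_state and past are always returned; the rest are optional
    layout = ["last_hidden_state", "past"]
    if output_hidden_states:
        layout.append("hidden_states")
    if output_attentions:
        layout.append("attentions")
    return layout

print(output_tuple_layout())  # ['last_hidden_state', 'past']
print(output_tuple_layout(output_hidden_states=True, output_attentions=True))
```

With both flags enabled, `outputs[2]` holds the hidden states and `outputs[3]` the attentions.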

Examples:

```python
import torch
from fastNLP.modules.encoder.gpt2 import GPT2Model
# The GPT2Tokenizer import path may differ across fastNLP versions

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')
input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze(0)  # Batch size 1
outputs = model(input_ids)
last_hidden_states = outputs[0]  # The last hidden state is the first element of the output tuple
```
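To make the `past` shapes above concrete, here is a framework-free sketch that computes the shape each `past` entry would have. The figures assume the GPT-2 small configuration (12 layers, 12 heads, hidden size 768); the helper itself is illustrative, not part of fastNLP:

```python
def past_shapes(num_layers, batch_size, num_heads, seq_len, hidden_size):
    """Shape of each element of `past`: one tensor per layer holding the
    stacked keys and values, (2, batch, heads, seq_len, embed_size_per_head).
    Illustrative sketch, not fastNLP API."""
    embed_size_per_head = hidden_size // num_heads
    return [(2, batch_size, num_heads, seq_len, embed_size_per_head)
            for _ in range(num_layers)]

# GPT-2 small: 12 layers, 12 heads, hidden size 768 -> 64 dims per head
shapes = past_shapes(num_layers=12, batch_size=1, num_heads=12,
                     seq_len=7, hidden_size=768)
print(len(shapes), shapes[0])  # 12 (2, 1, 12, 7, 64)
```

On the next decoding step only the new token is fed as input_ids, and `seq_len` in the cached shapes grows by one.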

property dtype

torch.dtype: The dtype of the module (assuming that all the module parameters have the same dtype).

forward(input_ids, state=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, output_attentions=True)[source]

• input_ids (torch.LongTensor) – batch_size x max_len, or batch_size x beam_size x 1 during beam-search decoding

• state (GPT2State) – the previous state

training: bool
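For padded batches, `attention_mask` marks real tokens with 1 and padding with 0, matching the shape of input_ids. A minimal sketch of building both from variable-length id sequences (pure Python lists for illustration; real code would build torch.LongTensor inputs, and the `pad_id` value is an assumption):

```python
def pad_and_mask(sequences, pad_id=0):
    """Right-pad id sequences to a common length and build the matching
    attention mask (1 = real token, 0 = padding). Illustrative sketch,
    not fastNLP API."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq))
                      for seq in sequences]
    return input_ids, attention_mask

ids, mask = pad_and_mask([[15496, 11, 616], [50256]])
print(ids)   # [[15496, 11, 616], [50256, 0, 0]]
print(mask)  # [[1, 1, 1], [1, 0, 0]]
```

The masked positions are excluded from attention, so padding does not influence the hidden states of real tokens.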