## BertForMaskedLM

from transformers import BertTokenizer, BertForMaskedLM
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
input_ids = tokenizer("Hello, my dog is cute", return_tensors="pt")["input_ids"]
# print(input_ids)
outputs = model(input_ids, labels=input_ids)
loss, prediction_scores = outputs[:2]
print(type(loss),loss)
print(type(prediction_scores),prediction_scores)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForMaskedLM were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['cls.predictions.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

<class 'torch.Tensor'> tensor(6.6588, grad_fn=<NllLossBackward>)
<class 'torch.Tensor'> tensor([[[ -7.8962,  -7.8105,  -7.7903,  ...,  -7.0694,  -7.1693,  -4.3590],
         [ -8.4461,  -8.4401,  -8.5044,  ...,  -8.0625,  -7.9909,  -5.7160],
         [-15.2953, -15.4727, -15.5865,  ..., -12.9857, -11.7038, -11.4293],
         ...,
         [-14.0628, -14.2535, -14.3645,  ..., -12.7151, -11.1621, -10.2317],
         [-10.6576, -10.7892, -11.0402,  ..., -10.3233, -10.1578,  -3.7721],
         [-11.3383, -11.4590, -11.1767,  ...,  -9.2152,  -9.5209,  -9.5571]]],
       grad_fn=<AddBackward0>)
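
Passing `labels=input_ids` as in the cell above scores every position against its own, unmasked input token. For actual masked-LM training, the labels are usually set to -100 everywhere except the masked positions, because -100 is the index that PyTorch's cross-entropy ignores by default. A minimal pure-Python sketch of that idea; the token ids are assumed to be the `bert-base-uncased` ids for this sentence, and `mask_token_at` is a hypothetical helper, not a transformers function:

```python
MASK_ID = 103        # [MASK] id in the bert-base-uncased vocabulary (assumed)
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_token_at(input_ids, position):
    """Hypothetical helper: mask one position and build the matching labels."""
    masked_ids = list(input_ids)
    labels = [IGNORE_INDEX] * len(input_ids)
    labels[position] = input_ids[position]  # the model has to recover this id
    masked_ids[position] = MASK_ID
    return masked_ids, labels


# Assumed ids for "[CLS] hello , my dog is cute [SEP]"
input_ids = [101, 7592, 1010, 2026, 3899, 2003, 10140, 102]
masked_ids, labels = mask_token_at(input_ids, 4)  # mask "dog"
print(masked_ids)
print(labels)
```

Wrapping both lists in `torch.tensor([...])` and calling `model(masked, labels=labels)` would then restrict the loss to the masked position.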

## BertForNextSentencePrediction
from transformers import BertTokenizer, BertForNextSentencePrediction
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForNextSentencePrediction.from_pretrained('bert-base-uncased')
prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
next_sentence = "The sky is blue due to the shorter wavelength of blue light."
encoding = tokenizer(prompt, next_sentence, return_tensors='pt')
loss, logits = model(**encoding, next_sentence_label=torch.LongTensor([1]))
assert logits[0, 0] < logits[0, 1] # next sentence was random
print('loss',type(loss),loss)
print('logits',type(logits),logits)
print('encoding',type(encoding),encoding)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForNextSentencePrediction: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

loss <class 'torch.Tensor'> tensor(0.0001, grad_fn=<NllLossBackward>)
logits <class 'torch.Tensor'> tensor([[-3.0729,  5.9056]], grad_fn=<AddmmBackward>)
encoding <class 'transformers.tokenization_utils_base.BatchEncoding'> {'input_ids': tensor([[  101,  1999,  3304,  1010, 10733,  2366,  1999,  5337, 10906,  1010,
          2107,  2004,  2012,  1037,  4825,  1010,  2003,  3591,  4895, 14540,
          6610,  2094,  1012,   102,  1996,  3712,  2003,  2630,  2349,  2000,
          1996,  7820, 19934,  1997,  2630,  2422,  1012,   102]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
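
The `token_type_ids` printed above mark which sentence each token belongs to: 0 for the prompt, up to and including the first `[SEP]`, and 1 for the candidate next sentence. A small sketch that rebuilds them from the printed `input_ids`; `segment_ids` is a hypothetical name used only for illustration:

```python
SEP_ID = 102  # [SEP] id, as it appears in the printed input_ids


def segment_ids(input_ids):
    """0 up to and including the first [SEP], 1 for everything after it."""
    first_sep = input_ids.index(SEP_ID)
    return [0 if i <= first_sep else 1 for i in range(len(input_ids))]


# input_ids copied from the encoding printed above
input_ids = [101, 1999, 3304, 1010, 10733, 2366, 1999, 5337, 10906, 1010,
             2107, 2004, 2012, 1037, 4825, 1010, 2003, 3591, 4895, 14540,
             6610, 2094, 1012, 102, 1996, 3712, 2003, 2630, 2349, 2000,
             1996, 7820, 19934, 1997, 2630, 2422, 1012, 102]
token_type_ids = segment_ids(input_ids)
print(token_type_ids.count(0), token_type_ids.count(1))
```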

## BertForSequenceClassification
from transformers import BertTokenizer, BertForSequenceClassification
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
labels = torch.tensor([1]).unsqueeze(0)  # Batch size 1
outputs = model(**inputs, labels=labels)
loss, logits = outputs[:2]


print('loss',type(loss),loss)
print('logits',type(logits),logits)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

loss <class 'torch.Tensor'> tensor(0.9083, grad_fn=<NllLossBackward>)
logits <class 'torch.Tensor'> tensor([[ 0.1846, -0.2075]], grad_fn=<AddmmBackward>)
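
The printed loss is the cross-entropy between the softmax of the logits and the label. As a sanity check, it can be recomputed from the printed logits with the standard library alone; this is a sketch of the formula, not the library's own code:

```python
import math


def cross_entropy(logits, label):
    """Negative log-softmax of the logit at index `label`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


logits = [0.1846, -0.2075]  # values printed above
loss = cross_entropy(logits, 1)  # labels = torch.tensor([1])
print(round(loss, 4))
```

The result should agree with the `tensor(0.9083)` above up to the rounding of the printed logits.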

## BertForMultipleChoice
from transformers import BertTokenizer, BertForMultipleChoice
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMultipleChoice.from_pretrained('bert-base-uncased')
prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
choice0 = "It is eaten with a fork and a knife."
choice1 = "It is eaten while held in the hand."
choice2 = "It is eaten while held in the handle."
choice3 = "It is eaten while held in the way."

labels = torch.tensor(0).unsqueeze(0)  # choice0 is correct (according to Wikipedia ;)), batch size 1
encoding = tokenizer([[prompt, prompt], [choice0, choice1]], return_tensors='pt', padding=True)
outputs = model(**{k: v.unsqueeze(0) for k,v in encoding.items()}, labels=labels)  # batch size is 1
# the linear classifier still needs to be trained
loss, logits = outputs[:2]

print('loss',type(loss),loss)
print('logits',type(logits),logits)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMultipleChoice: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForMultipleChoice from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForMultipleChoice from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForMultipleChoice were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

loss <class 'torch.Tensor'> tensor(0.6803, grad_fn=<NllLossBackward>)
logits <class 'torch.Tensor'> tensor([[-0.2700, -0.2957]], grad_fn=<ViewBackward>)
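
With one logit per choice, a softmax over the choices shows how undecided the freshly initialized classifier is. The sketch below converts the printed logits into probabilities and compares the loss with that of a uniform guess, `log(num_choices)`:

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [-0.2700, -0.2957]  # one logit per choice, as printed above
probs = softmax(logits)
chance_loss = math.log(len(logits))  # loss of a uniform guess over the choices
print([round(p, 3) for p in probs], round(chance_loss, 4))
```

Both probabilities sit close to 0.5 and the printed loss is close to the chance-level value, which matches the comment that the linear classifier still needs to be trained.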

## BertForTokenClassification
from transformers import BertTokenizer, BertForTokenClassification
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForTokenClassification.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
labels = torch.tensor([1] * inputs["input_ids"].size(1)).unsqueeze(0)  # Batch size 1
outputs = model(**inputs, labels=labels)
loss, scores = outputs[:2]
print('loss',type(loss),loss)
print('scores',type(scores),scores)  # per-token logits, shape (batch_size, sequence_length, num_labels)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForTokenClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

loss <class 'torch.Tensor'> tensor(0.7565, grad_fn=<NllLossBackward>)

## BertForQuestionAnswering
from transformers import BertTokenizer, BertForQuestionAnswering
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForQuestionAnswering.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
start_positions = torch.tensor([1])
end_positions = torch.tensor([3])
outputs = model(**inputs, start_positions=start_positions, end_positions=end_positions)
loss, start_scores, end_scores = outputs[:3]


print('loss',type(loss),loss)
print('start_scores',type(start_scores),start_scores)

print('outputs',type(outputs),outputs)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForQuestionAnswering: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['qa_outputs.weight', 'qa_outputs.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

loss <class 'torch.Tensor'> tensor(2.0500, grad_fn=<DivBackward0>)
start_scores <class 'torch.Tensor'> tensor([[ 0.0505,  0.2813,  0.3198,  0.4249, -0.1320,  0.1518, -0.1591,  0.4437]],
       grad_fn=<SqueezeBackward1>)
outputs <class 'tuple'> (tensor(2.0500, grad_fn=<DivBackward0>), tensor([[ 0.0505,  0.2813,  0.3198,  0.4249, -0.1320,  0.1518, -0.1591,  0.4437]],
       grad_fn=<SqueezeBackward1>), tensor([[0.0317, 0.3645, 0.9012, 0.4658, 0.3815, 0.7034, 0.6190, 0.1950]],
       grad_fn=<SqueezeBackward1>))
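
The question-answering loss is the average of two cross-entropies, one over the start scores and one over the end scores (hence the `DivBackward0`). The sketch below recomputes it from the printed tensors and also picks the most likely start and end token, a simple greedy decoding that is only meant as an illustration:

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the logit at index `target`."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits)) - logits[target]


# Values copied from the output above
start_scores = [0.0505, 0.2813, 0.3198, 0.4249, -0.1320, 0.1518, -0.1591, 0.4437]
end_scores = [0.0317, 0.3645, 0.9012, 0.4658, 0.3815, 0.7034, 0.6190, 0.1950]

# start_positions=1 and end_positions=3, as in the call above
loss = (cross_entropy(start_scores, 1) + cross_entropy(end_scores, 3)) / 2

# Greedy span prediction: index of the highest start score and end score
best_start = max(range(len(start_scores)), key=start_scores.__getitem__)
best_end = max(range(len(end_scores)), key=end_scores.__getitem__)
print(round(loss, 4), best_start, best_end)
```

The greedy start index comes out after the greedy end index, so the span is not even valid. That is expected while `qa_outputs` is newly initialized, as the warning says.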

{'score': 0.6969107183737014, 'start': 17, 'end': 20, 'answer': 'cute'}

## pipeline
from transformers import pipeline
nlp_sentence_classif = pipeline('sentiment-analysis')
nlp_sentence_classif('Such a nice weather outside !')

[{'label': 'POSITIVE', 'score': 0.9997655749320984}]

nlp_token_class = pipeline('ner')
nlp_token_class('Hugging Face is a French company based in New-York.')

[{'word': 'Hu', 'score': 0.9970937967300415, 'entity': 'I-ORG', 'index': 1},
 {'word': '##gging',
  'score': 0.9345751404762268,
  'entity': 'I-ORG',
  'index': 2},
 {'word': 'Face', 'score': 0.9787060618400574, 'entity': 'I-ORG', 'index': 3},
 {'word': 'French',
  'score': 0.9981995820999146,
  'entity': 'I-MISC',
  'index': 6},
 {'word': 'New', 'score': 0.9983047246932983, 'entity': 'I-LOC', 'index': 10},
 {'word': '-', 'score': 0.8913456797599792, 'entity': 'I-LOC', 'index': 11},
 {'word': 'York', 'score': 0.9979523420333862, 'entity': 'I-LOC', 'index': 12}]
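
The `ner` output is one entry per word piece, so `Hu`, `##gging` and `Face` come back as three separate items. A rough way to merge them into entity strings is to join consecutive indices that share a tag. `merge_wordpieces` below is a hypothetical helper written for this note, not part of transformers, and it joins pieces with spaces, so `New-York` comes out as `New - York`:

```python
def merge_wordpieces(tokens):
    """Group consecutive tokens that share an entity tag into (text, entity) pairs."""
    groups = []
    prev_index, prev_entity = None, None
    for tok in tokens:
        word = tok['word']
        contiguous = prev_index is not None and tok['index'] == prev_index + 1
        if contiguous and tok['entity'] == prev_entity:
            text, entity = groups[-1]
            # '##' marks a continuation of the previous word piece
            text += word[2:] if word.startswith('##') else ' ' + word
            groups[-1] = (text, entity)
        else:
            groups.append((word, tok['entity']))
        prev_index, prev_entity = tok['index'], tok['entity']
    return groups


# Pipeline output from above, with the scores left out for brevity
ner_output = [
    {'word': 'Hu', 'entity': 'I-ORG', 'index': 1},
    {'word': '##gging', 'entity': 'I-ORG', 'index': 2},
    {'word': 'Face', 'entity': 'I-ORG', 'index': 3},
    {'word': 'French', 'entity': 'I-MISC', 'index': 6},
    {'word': 'New', 'entity': 'I-LOC', 'index': 10},
    {'word': '-', 'entity': 'I-LOC', 'index': 11},
    {'word': 'York', 'entity': 'I-LOC', 'index': 12},
]
merged = merge_wordpieces(ner_output)
print(merged)
```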

nlp_qa = pipeline('question-answering')
nlp_qa(context="Hello, my dog is cute", question='how about my dog?')
nlp_qa(context="Hello, my cat is baby", question='who is my baby?')

{'score': 0.5036106829213413, 'start': 7, 'end': 13, 'answer': 'my cat'}

nlp_qa(context="this weather, 오늘은 날씨가 좋습니다.", question='오늘 날씨 어때요?')
nlp_qa(context="어제는 날씨가 안 좋았습니다, 오늘은 날씨가 좋습니다.", question='오늘 날씨 어때요?')
## Still looking into why Korean input does not work

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-39-9df7e2d06b35> in <module>
      1 nlp_qa(context="this weather, 오늘은 날씨가 좋습니다.", question='오늘 날씨 어때요?')
----> 2 nlp_qa(context="어제는 날씨가 안 좋았습니다, 오늘은 날씨가 좋습니다.", question='오늘 날씨 어때요?')

C:\anacondas\envs\tf22\lib\site-packages\transformers\pipelines.py in __call__(self, *args, **kwargs)
   1314                         ),
   1315                     }
-> 1316                     for s, e, score in zip(starts, ends, scores)
   1317                 ]
   1318 

C:\anacondas\envs\tf22\lib\site-packages\transformers\pipelines.py in <listcomp>(.0)
   1314                         ),
   1315                     }
-> 1316                     for s, e, score in zip(starts, ends, scores)
   1317                 ]
   1318 

KeyError: 0

from transformers import BertForSequenceClassification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
model.train()


python: pytorch, question answering, using BERT