/content/image-captioning
2023-09-02 18:30:18.889829: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Device: cuda:0
Images found: 263
Split size: 263
Checkpoint loading...
load checkpoint from ./checkpoints/model_large_caption.pth
Model to cuda:0
Inference started
0batch [00:01, ?batch/s]
Traceback (most recent call last):
File "/content/image-captioning/inference.py", line 88, in
caption = model.generate(
File "/content/image-captioning/models/blip.py", line 201, in generate
outputs = self.text_decoder.generate(
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1675, in generate
return self.beam_search(
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 3014, in beam_search
outputs = self(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/image-captioning/models/med.py", line 886, in forward
outputs = self.bert(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/image-captioning/models/med.py", line 781, in forward
encoder_outputs = self.encoder(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/image-captioning/models/med.py", line 445, in forward
layer_outputs = layer_module(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/image-captioning/models/med.py", line 361, in forward
cross_attention_outputs = self.crossattention(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/image-captioning/models/med.py", line 277, in forward
self_outputs = self.self(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/image-captioning/models/med.py", line 178, in forward
attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0
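For context on what the RuntimeError means: torch.matmul broadcasts the leading (batch) dimensions of its inputs, and sizes 3 and 9 cannot be broadcast against each other. The actual query/key shapes inside BLIP's cross-attention will differ, but a minimal standalone reproduction of the same message, using made-up shapes, looks like this:

```python
import torch

# Made-up shapes chosen only to reproduce the error message above;
# they are not the real query/key shapes inside models/med.py.
query_layer = torch.randn(3, 12, 64)  # leading (batch) dim is 3
key_layer = torch.randn(9, 12, 64)    # leading (batch) dim is 9 -> not broadcastable with 3

# Raises: RuntimeError: The size of tensor a (3) must match the size of
# tensor b (9) at non-singleton dimension 0
attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
```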
I found a solution, at least for my setup: the error appears to be a version mismatch in some of the dependencies. Check the requirements text file; if any of the listed packages are newer on your system than the pinned versions, a deprecated or changed API may be what keeps this from running.
I had this exact error and fixed it with this command: pip install timm==0.4.12 transformers==4.17.0 fairscale==0.4.4 pycocoevalcap pillow
pip reported that my timm, transformers, and fairscale were on newer versions, downgraded them, and the script worked on the first try.
If you already rely on newer versions of these packages for other projects, downgrading may break them, so it may not be worth it unless you really need this tool.
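A quick way to spot the mismatch before downgrading anything is sketched below; the pins are just the ones from my pip command above, so adjust them to whatever your requirements.txt actually lists:

```python
# Compare installed package versions against the pins that worked for me.
from importlib.metadata import PackageNotFoundError, version

pins = {
    "timm": "0.4.12",
    "transformers": "4.17.0",
    "fairscale": "0.4.4",
}

for name, wanted in pins.items():
    try:
        installed = version(name)
    except PackageNotFoundError:
        installed = "not installed"
    flag = "" if installed == wanted else "  <-- mismatch"
    print(f"{name}: installed={installed}, expected={wanted}{flag}")
```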
EDIT:
This error also occurs if you request a batch size larger than the number of image files being processed; a defensive clamp for that case is sketched below.
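I haven't checked how inference.py actually builds its batches, so the names below are placeholders rather than the script's real variables, but the idea is simply to cap the batch size at the dataset length:

```python
from torch.utils.data import DataLoader

# image_dataset and requested_batch_size are hypothetical names; the real
# script constructs its own dataset/loader. The point is to never ask for a
# batch larger than the number of images available.
batch_size = min(requested_batch_size, len(image_dataset))
loader = DataLoader(image_dataset, batch_size=batch_size, shuffle=False)
```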
My images are 256x256 pixels.