Problem description
Running python3.7 run_net.py --config-file=configs/retinanet_gaofen.py --task=train fails with a CUDA error.
Full log
XXX@DESKTOP-8B01LP5:/mnt/e/cpt/JDet-master/projects/retinanet$ python3.7 run_net.py --config-file=configs/retinanet_gaofen.py --task=train
[i 0914 20:35:15.018271 64 compiler.py:869] Jittor(1.2.3.101) src: /home/llc/.local/lib/python3.7/site-packages/jittor
[i 0914 20:35:15.024461 64 compiler.py:870] g++ at /usr/bin/g++(7.5.0)
[i 0914 20:35:15.024553 64 compiler.py:871] cache_path: /home/llc/.cache/jittor/default/g++
[i 0914 20:35:15.319920 64 install_cuda.py:37] cuda_driver_version: [11, 6]
[i 0914 20:35:15.337710 64 init.py:286] Found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc(11.2.152) at /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc.
[i 0914 20:35:15.403338 64 init.py:286] Found addr2line(2.30) at /usr/bin/addr2line.
[i 0914 20:35:15.491815 64 compiler.py:959] py_include: -I/usr/include/python3.7m -I/usr/include/python3.7m
[i 0914 20:35:15.579729 64 compiler.py:961] extension_suffix: .cpython-37m-x86_64-linux-gnu.so
[i 0914 20:35:15.719783 64 init.py:178] Total mem: 7.75GB, using 2 procs for compiling.
[i 0914 20:35:16.493494 64 jit_compiler.cc:22] Load cc_path: /usr/bin/g++
[i 0914 20:35:16.493646 64 init.cc:57] Found cuda archs: [75,]
[i 0914 20:35:16.641731 64 compile_extern.py:451] mpicc not found, distribution disabled.
[i 0914 20:35:16.717446 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/cublas.h
[i 0914 20:35:16.739669 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcublas.so
[i 0914 20:35:16.739794 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcublasLt.so.11
[i 0914 20:35:17.317255 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/cudnn.h
[i 0914 20:35:17.341903 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn.so.8
[i 0914 20:35:17.341998 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_ops_infer.so.8
[i 0914 20:35:17.349224 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_ops_train.so.8
[i 0914 20:35:17.350055 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_cnn_infer.so.8
[i 0914 20:35:17.395974 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcudnn_cnn_train.so.8
[i 0914 20:35:17.411565 64 compiler.py:667] handle pyjt_include/home/llc/.local/lib/python3.7/site-packages/jittor/extern/cuda/cudnn/inc/cudnn_warper.h
[i 0914 20:35:17.923592 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/include/curand.h
[i 0914 20:35:17.950855 64 compile_extern.py:20] found /home/llc/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/lib64/libcurand.so
[i 0914 20:35:18.847675 64 cuda_flags.cc:26] CUDA enabled.
Loading config from: configs/retinanet_gaofen.py
[e 0914 20:35:22.246316 64 init.py:996] load parameter rpn_net.retina_cls.weight failed: expect the shape of rpn_net.retina_cls.weight to be [777,256,3,3,], but got [315,256,3,3,]
[e 0914 20:35:22.246449 64 init.py:996] load parameter rpn_net.retina_cls.bias failed: expect the shape of rpn_net.retina_cls.bias to be [777,], but got [315,]
[w 0914 20:35:22.246808 64 init.py:998] load total 311 params, 2 failed
Tue Sep 14 20:35:22 2021 Loading model parameters from weights/yx_init_pretrained.pk_jt.pk
Tue Sep 14 20:35:22 2021 Loading model parameters from work_dirs/retinanet_gaofen/checkpoints/ckpt_30.pkl
Tue Sep 14 20:35:22 2021 Start running
Tue Sep 14 20:35:22 2021 Testing...
0%| | 0/1126 [00:00<?, ?it/s]
[e 0914 20:35:28.524608 64 executor.cc:527]
=== display_memory_info ===
total_cpu_ram: 7.75GB total_cuda_ram: 24GB
hold_vars: 587 lived_vars: 3579 lived_ops: 3546
update queue: 311/311
name: sfrl is_cuda: 1 used: 210.1MB(94.6%) unused: 11.94MB(5.38%) total: 222MB
name: sfrl is_cuda: 1 used: 367.1MB(92%) unused: 31.85MB(7.98%) total: 399MB
name: sfrl is_cuda: 0 used: 367.1MB(92%) unused: 31.85MB(7.98%) total: 399MB
name: sfrl is_cuda: 0 used: 180.5KB(17.6%) unused: 843.5KB(82.4%) total: 1MB
name: temp is_cuda: 0 used: 0 B(-nan%) unused: 0 B(-nan%) total: 0 B
name: temp is_cuda: 1 used: 0 B(-nan%) unused: 0 B(-nan%) total: 0 B
cpu&gpu: 1021MB gpu: 621MB cpu: 400MB
free: cpu(922.4MB) gpu(22.09GB)
[e 0914 20:35:28.525250 64 executor.cc:531] [Error] source file location: /home/llc/.cache/jittor/default/g++/jit/_opkey0:broadcast_to_Tx:float32__DIM=7__BCAST=19__JIT:1__JIT_cuda:1__index_t:int32___opkey...hash:7e74aa6468b00eb_op.cc
0%| | 0/1126 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "run_net.py", line 54, in <module>
    main()
  File "run_net.py", line 45, in main
    runner.run()
  File "/usr/local/lib/python3.7/dist-packages/jdet-0.1.0.0-py3.7.egg/jdet/runner/runner.py", line 89, in run
    self.test()
  File "/home/llc/.local/lib/python3.7/site-packages/jittor/__init__.py", line 89, in inner
    ret = func(*args, **kw)
  File "/home/llc/.local/lib/python3.7/site-packages/jittor/__init__.py", line 257, in inner
    ret = func(*args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/jdet-0.1.0.0-py3.7.egg/jdet/runner/runner.py", line 197, in test
    result = self.model(images,targets)
  File "/home/llc/.local/lib/python3.7/site-packages/jittor/__init__.py", line 737, in __call__
    return self.execute(*args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/jdet-0.1.0.0-py3.7.egg/jdet/models/networks/retinanet.py", line 64, in execute
    results,losses = self.rpn_net(features, targets)
  File "/home/llc/.local/lib/python3.7/site-packages/jittor/__init__.py", line 737, in __call__
    return self.execute(*args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/jdet-0.1.0.0-py3.7.egg/jdet/models/roi_heads/retina_head.py", line 351, in execute
    results = self.get_bboxes(all_proposals,all_bbox_pred,all_cls_score,targets)
  File "/usr/local/lib/python3.7/dist-packages/jdet-0.1.0.0-py3.7.egg/jdet/models/roi_heads/retina_head.py", line 231, in get_bboxes
    jt.sync([bbox_j, score_j])
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.sync)).
Types of your inputs are:
self = module,
args = (list, ),
The function declarations are:
void sync(const vector<VarHolder*>& vh=vector<VarHolder*>(), bool device_sync=false)
Failed reason:[f 0914 20:35:28.525344 64 executor.cc:533] Execute fused operator(116/574) failed: [Op(0x2d46dcb0:0:0:1:i1:o1:s0,broadcast_to->0x2de7ec90),Op(0x2d32f690:0:0:1:i1:o1:s0,reindex->0x2d433bc0),Op(0x2e5f0e30:0:0:1:i2:o1:s0,binary.multiply->0x2de764c0),Op(0x2e5f3e30:0:0:1:i1:o1:s0,reduce.add->0x2de7a8c0),]
Reason: [f 0914 20:35:28.524532 64 helper_cuda.h:126] CUDA error at /home/llc/.local/lib/python3.7/site-packages/jittor/src/mem/allocator/cuda_managed_allocator.cc:23 code=2( cudaErrorMemoryAllocation ) cudaMallocManaged(&ptr, size)
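Besides the CUDA allocation failure, the log above also reports two parameters that failed to load because of a shape mismatch. As a rough aid for reading such logs, here is a minimal sketch that extracts those entries with the standard-library `re` module. The function names (`parse_shape`, `find_shape_mismatches`) are hypothetical helpers written for this illustration, not part of Jittor or JDet, and the pattern assumes the exact message wording shown in this log.

```python
import re

# Pattern matching the "load parameter ... failed" lines as they appear in the
# log above. The wording is taken from this log only; other Jittor versions
# may phrase the message differently (assumption).
MISMATCH_RE = re.compile(
    r"load parameter (?P<name>\S+) failed: "
    r"expect the shape of \S+ to be \[(?P<expected>[\d,]*)\], "
    r"but got \[(?P<got>[\d,]*)\]"
)


def parse_shape(text):
    """Turn a string such as '777,256,3,3,' into the tuple (777, 256, 3, 3)."""
    return tuple(int(part) for part in text.split(",") if part)


def find_shape_mismatches(log_text):
    """Return (name, expected_shape, got_shape) for each mismatch line in log_text."""
    return [
        (m.group("name"), parse_shape(m.group("expected")), parse_shape(m.group("got")))
        for m in MISMATCH_RE.finditer(log_text)
    ]


# The two error lines copied from the log above.
SAMPLE_LOG = """\
[e 0914 20:35:22.246316 64 init.py:996] load parameter rpn_net.retina_cls.weight failed: expect the shape of rpn_net.retina_cls.weight to be [777,256,3,3,], but got [315,256,3,3,]
[e 0914 20:35:22.246449 64 init.py:996] load parameter rpn_net.retina_cls.bias failed: expect the shape of rpn_net.retina_cls.bias to be [777,], but got [315,]
"""

for name, expected, got in find_shape_mismatches(SAMPLE_LOG):
    print(f"{name}: expected {expected}, got {got}")
```

In both entries only the leading dimension differs (777 versus 315). That would be consistent with the classification head being built for a different number of outputs than the file being loaded, but this is an assumption and the log alone does not confirm it.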