[tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed] Error and Solution

港控/mmm° 2022-12-10 03:59

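I recently trained a model with YOLOv3 using the same configuration and environment as before, yet the error below appeared and I could not make sense of it. I read many blog posts and spent a long time investigating...

It was only today, while searching for a similar error, that I came across a fix suggested in the comments of a blog post, and the problem that had troubled me for so long was finally solved!

The error: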

    E tensorflow/stream_executor/cuda/cuda_blas.cc:652] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
    Traceback (most recent call last):
      File "E:/Project/keras-yolo3-person&vehicle&aeroplane/train.py", line 190, in <module>
        _main()
      File "E:/Project/keras-yolo3-person&vehicle&aeroplane/train.py", line 84, in _main
        callbacks=[logging, checkpoint, reduce_lr, early_stopping])
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
        return func(*args, **kwargs)
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
        initial_epoch=initial_epoch)
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\keras\engine\training_generator.py", line 217, in fit_generator
        class_weight=class_weight)
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\keras\engine\training.py", line 1217, in train_on_batch
        outputs = self.train_function(ins)
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
        return self._call(inputs)
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
        fetched = self._callable_fn(*array_vals)
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\tensorflow\python\client\session.py", line 1439, in __call__
        run_metadata_ptr)
      File "D:\ProgramData\Anaconda3\envs\keras-yolo3-cp36\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
        c_api.TF_GetCode(self.status.status))
    tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=86528, n=32, k=64
         [[{{node conv2d_3/convolution}} = Conv2D[T=DT_FLOAT, _class=["loc:@training/Adam/gradients/conv2d_3/convolution_grad/Conv2DBackpropInput"], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](leaky_re_lu_2/LeakyRelu, conv2d_3/kernel/read)]]
         [[{{node yolo_loss/while_2/strided_slice_1/stack_1/_4337}} = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_11657_yolo_loss/while_2/strided_slice_1/stack_1", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopyolo_loss/while_2/strided_slice_1/stack_2/_4125)]]
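
The m, n, k values on the final error line are the dimensions of the matrix multiplication that cuBLAS was asked to run. As a rough sanity check that is not part of the original post, the sketch below parses those values and estimates how many bytes the operands occupy. It assumes the standard GEMM convention C(m x n) = A(m x k) @ B(k x n) and 4-byte float32 elements; the helper names gemm_dims and operand_bytes are made up for illustration.

```python
import re

# Final error line of the traceback, copied verbatim.
message = "Blas SGEMM launch failed : m=86528, n=32, k=64"


def gemm_dims(msg):
    """Extract the integers m, n, k from a 'Blas SGEMM launch failed' message."""
    found = dict(re.findall(r"\b([mnk])=(\d+)", msg))
    return int(found["m"]), int(found["n"]), int(found["k"])


def operand_bytes(m, n, k, itemsize=4):
    """Bytes per operand, assuming C(m x n) = A(m x k) @ B(k x n) in float32."""
    return {
        "A": m * k * itemsize,
        "B": k * n * itemsize,
        "C": m * n * itemsize,
    }


m, n, k = gemm_dims(message)
sizes = operand_bytes(m, n, k)
total_mib = sum(sizes.values()) / 2**20
print(f"m={m}, n={n}, k={k}, operands total about {total_mib:.1f} MiB")
```

The estimate covers only this one multiplication, not the memory used by the rest of the model or by other processes on the same GPU.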

Solution

Add the following code:

    # Set this before TensorFlow initializes CUDA (safest: before importing it).
    import os
    os.environ['CUDA_VISIBLE_DEVICES'] = '/gpu:0'

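Caveat: with this change, training no longer runs on the GPU; it runs on the CPU instead. '/gpu:0' is a TensorFlow device name rather than a valid CUDA_VISIBLE_DEVICES value (which expects device indices such as 0), so CUDA exposes no GPU to TensorFlow.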
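
To show why this setting leaves no GPU available, here is a minimal sketch of how CUDA reads CUDA_VISIBLE_DEVICES. It is a simplified model, not CUDA's actual parser: it assumes entries are plain integer indices (GPU UUIDs, which CUDA also accepts, are not handled), and the function name visible_gpu_indices is hypothetical.

```python
def visible_gpu_indices(value):
    """Simplified model of how CUDA interprets CUDA_VISIBLE_DEVICES.

    Assumption: entries are comma-separated integer indices. Parsing stops
    at the first invalid entry, so only devices listed before it are visible.
    """
    indices = []
    for entry in value.split(","):
        if not entry.isdecimal():
            break  # invalid entry: it and everything after it is hidden
        indices.append(int(entry))
    return indices


# Compare the value used in the post with some common alternatives.
for setting in ["/gpu:0", "-1", "0", "0,1"]:
    print(repr(setting), "->", visible_gpu_indices(setting))
```

Under this model, '/gpu:0' behaves like '-1': the list of visible devices is empty. Setting the variable to '0' would keep the first GPU visible, though whether the SGEMM error then returns is not something the post covers.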
