# GPU Usage Guide
# 1. Install the NVIDIA graphics driver
An NVIDIA driver must be installed before the GPU can be used:
- Linux: version ≥ 525.60.13
- Windows: version ≥ 528.33
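The minimum versions above can be checked programmatically. The following is a minimal sketch, not part of SmartJavaAI: the class `DriverVersionCheck` and its methods are hypothetical names, and it assumes `nvidia-smi` is on the PATH and prints the driver version as a dotted number when queried with `--query-gpu=driver_version --format=csv,noheader`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Illustrative helper; the class and method names are not part of SmartJavaAI.
final class DriverVersionCheck {

    /** Compares two dotted numeric versions; missing components count as 0. */
    static int compareVersions(String a, String b) {
        String[] pa = a.trim().split("\\.");
        String[] pb = b.trim().split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Minimum driver version from this guide; any non-Windows OS gets the Linux value. */
    static String minimumFor(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? "528.33" : "525.60.13";
    }

    /** Asks nvidia-smi for the driver version and returns the first output line. */
    static String queryDriverVersion() throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                "nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader")
                .redirectErrorStream(true)
                .start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String firstLine = reader.readLine();
            process.waitFor();
            if (firstLine == null) {
                throw new IOException("nvidia-smi produced no output");
            }
            return firstLine.trim();
        }
    }

    public static void main(String[] args) {
        try {
            String installed = queryDriverVersion();
            String required = minimumFor(System.getProperty("os.name"));
            boolean ok = compareVersions(installed, required) >= 0;
            System.out.println("Driver " + installed
                    + (ok ? " meets" : " is below") + " the minimum " + required);
        } catch (Exception e) {
            // nvidia-smi missing or unexpected output: report instead of failing.
            System.out.println("Could not determine the driver version: " + e);
        }
    }
}
```

The comparison is numeric per component, so a version such as 535.104.05 is treated as newer than 525.60.13 even though a plain string comparison of the last component would be misleading.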
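# 2. Install CUDA and cuDNN
The following versions are recommended:
- CUDA: v12.4
- cuDNN: v8.9.7
Download and installation links:
After installation, restart the computer, then verify that the installation succeeded:
nvcc -V
If the output reports version 12.4, the installation succeeded.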
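The same verification can be scripted from Java. This is a sketch under stated assumptions: `NvccVersionCheck` is a hypothetical class, `nvcc` is assumed to be on the PATH, and its output is assumed to contain a fragment of the form `release <major>.<minor>`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper; the class and method names are not part of SmartJavaAI.
final class NvccVersionCheck {

    // Assumption: nvcc reports its toolkit version as "release <major>.<minor>".
    private static final Pattern RELEASE = Pattern.compile("release\\s+(\\d+\\.\\d+)");

    /** Extracts the "major.minor" release from nvcc output, if present. */
    static Optional<String> parseRelease(String nvccOutput) {
        Matcher matcher = RELEASE.matcher(nvccOutput);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** Runs "nvcc -V" and returns everything it printed. */
    static String runNvcc() throws IOException, InterruptedException {
        Process process = new ProcessBuilder("nvcc", "-V").redirectErrorStream(true).start();
        // InputStream.readAllBytes requires Java 9 or later.
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        process.waitFor();
        return output;
    }

    public static void main(String[] args) {
        try {
            Optional<String> release = parseRelease(runNvcc());
            boolean ok = release.isPresent() && release.get().equals("12.4");
            System.out.println("CUDA release: " + release.orElse("unknown")
                    + (ok ? " (matches this guide)" : " (this guide expects 12.4)"));
        } catch (Exception e) {
            // nvcc missing from PATH usually means CUDA is not installed or not configured.
            System.out.println("Could not run nvcc: " + e);
        }
    }
}
```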
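# 3. Add the offline GPU dependencies (recommended)
This applies only to models that run on the PyTorch engine.
By default, DJL downloads the GPU dependencies automatically, but the download is slow, so adding the Maven dependencies manually is recommended:
# Windows (x86_64)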
<dependency>
<groupId>ai.djl.pytorch</groupId>
<artifactId>pytorch-native-cu124</artifactId>
<classifier>win-x86_64</classifier>
<version>2.5.1</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>ai.djl.pytorch</groupId>
<artifactId>pytorch-jni</artifactId>
<version>2.5.1-0.32.0</version>
<scope>runtime</scope>
</dependency>
# Linux (x86_64)
<dependency>
<groupId>ai.djl.pytorch</groupId>
<artifactId>pytorch-native-cu124</artifactId>
<classifier>linux-x86_64</classifier>
<version>2.5.1</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>ai.djl.pytorch</groupId>
<artifactId>pytorch-jni</artifactId>
<version>2.5.1-0.32.0</version>
<scope>runtime</scope>
</dependency>
# 4. Configure system environment variables (Windows)
# Specify the GPU in code
SmartJavaAI uses the CPU by default. To use the GPU, set the device type explicitly:
FaceModelConfig config = new FaceModelConfig();
config.setModelEnum(FaceModelEnum.RETINA_FACE); // face model
config.setDevice(DeviceEnum.GPU); // run on the GPU
FaceModel faceModel = FaceModelFactory.getInstance().getModel(config);
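On the first run, the program automatically extracts the dependency libraries and prints logs like the ones below. If the program throws an exception afterwards, that is not a problem: the only purpose of this step is to extract the dependency files into the cache path.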
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/asmjit.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/c10.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/c10_cuda.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/caffe2_nvrtc.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cublas64_12.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cublasLt64_12.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudart64_12.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn_adv64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn_cnn64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn_engines_precompiled64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn_engines_runtime_compiled64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn_graph64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn_heuristic64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cudnn_ops64_9.dll to cache ...
[main] INFO ai.djl.pytorch.jni.LibUtils - Extracting pytorch/cu124/win-x86_64/cufft64_11.dll to cache ...
# Cache directories
| OS | Cache directory |
|---|---|
| Windows | C:/Users/{user}/smartjavaai_cache |
| Linux | /root/smartjavaai_cache |
| macOS | /Users/{user}/smartjavaai_cache |
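The paths in the table all look like `smartjavaai_cache` under the current user's home directory. Assuming that is how the location is derived (an inference from the table, not a confirmed SmartJavaAI behavior), the directory can be resolved as sketched below. `CacheDirLocator` and its methods are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative helper; the class and method names are not part of SmartJavaAI.
final class CacheDirLocator {

    /** Assumption: the cache lives in "smartjavaai_cache" under the user's home directory. */
    static Path cacheDir(String userHome) {
        return Paths.get(userHome, "smartjavaai_cache");
    }

    /** Resolves the extracted PyTorch native library directory inside the cache. */
    static Path pytorchNativeDir(String userHome, String flavorDir) {
        return cacheDir(userHome).resolve("pytorch").resolve(flavorDir);
    }

    public static void main(String[] args) {
        String home = System.getProperty("user.home");
        // Directory name taken from this guide (Windows, CUDA 12.4 build).
        Path nativeDir = pytorchNativeDir(home, "2.5.1-20241113-cu124-win-x86_64");
        System.out.println("Cache directory:   " + cacheDir(home));
        System.out.println("PyTorch libraries: " + nativeDir);
        System.out.println("Extracted:         " + Files.isDirectory(nativeDir));
    }
}
```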
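# Configuration steps
Open the cache path and locate this directory:
pytorch/2.5.1-20241113-cu124-win-x86_64
Note: if the directory pytorch/2.5.1-20241113-cu124-win-x86_64 does not exist in the cache directory, check that the previous steps were completed.
- Add this directory to the system PATH environment variable.
- Remove any existing CUDA paths from PATH to avoid conflicts.
- After changing the environment variables, be sure to restart your IDE or your computer.
- Once the steps above are done, run the program again to confirm that it starts successfully.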
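The PATH changes described above can be checked before starting the application. The sketch below is illustrative only: `PathEntryCheck` is a hypothetical class, the comparison is case-insensitive and slash-insensitive (which suits Windows), and treating any other entry containing "cuda" as a possible conflict is a heuristic, not a rule from SmartJavaAI.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Illustrative helper; the class and method names are not part of SmartJavaAI.
final class PathEntryCheck {

    /** Unifies slashes, drops trailing slashes, and lower-cases (Windows-style comparison). */
    static String normalize(String dir) {
        String s = dir.trim().replace('\\', '/');
        while (s.length() > 1 && s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        return s.toLowerCase(Locale.ROOT);
    }

    /** Splits a PATH value into normalized, non-empty entries. */
    static List<String> entries(String pathValue, String separator) {
        List<String> result = new ArrayList<>();
        if (pathValue == null) {
            return result;
        }
        for (String entry : pathValue.split(Pattern.quote(separator))) {
            if (!entry.trim().isEmpty()) {
                result.add(normalize(entry));
            }
        }
        return result;
    }

    /** True if the given directory is one of the PATH entries. */
    static boolean containsDir(String pathValue, String separator, String dir) {
        return entries(pathValue, separator).contains(normalize(dir));
    }

    /** Heuristic: other entries mentioning "cuda" may conflict with the bundled libraries. */
    static List<String> otherCudaEntries(String pathValue, String separator, String dir) {
        String wanted = normalize(dir);
        List<String> result = new ArrayList<>();
        for (String entry : entries(pathValue, separator)) {
            if (entry.contains("cuda") && !entry.equals(wanted)) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        // Assumption: the cache directory is under the user's home directory.
        String dir = System.getProperty("user.home")
                + "/smartjavaai_cache/pytorch/2.5.1-20241113-cu124-win-x86_64";
        System.out.println("On PATH: " + containsDir(path, File.pathSeparator, dir));
        System.out.println("Other CUDA entries: " + otherCudaEntries(path, File.pathSeparator, dir));
    }
}
```

Because a running JVM sees the environment it was started with, the result only reflects PATH changes after the IDE or terminal has been restarted.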
# 5. GPU guide for Seetaface6 models
- Seetaface6 models require CUDA v11.6.2 to be installed.
- Add CUDA to the system PATH environment variable.
After these two steps, Seetaface6 can run in GPU mode.
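# 6. GPU guide for the OCR module
The OCR module uses ONNX Runtime as its inference engine. After completing the GPU setup above (sections 1 through 4), two more changes are needed to enable the GPU:
- Exclude the CPU build of onnxruntime.
- Add a dependency on onnxruntime_gpu.
Note: if the project also includes other SmartJavaAI modules, make sure the CPU build of onnxruntime is excluded from their transitive dependencies as well. Otherwise there may be runtime conflicts, or GPU acceleration may not take effect.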
<dependency>
<groupId>cn.smartjavaai</groupId>
<artifactId>smartjavaai-ocr</artifactId>
<scope>runtime</scope>
<exclusions>
<exclusion>
<groupId>com.microsoft.onnxruntime</groupId>
<artifactId>onnxruntime</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.microsoft.onnxruntime</groupId>
<artifactId>onnxruntime_gpu</artifactId>
<version>1.20.0</version>
<scope>runtime</scope>
</dependency>
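One way to confirm that the exclusion took effect is to look at which onnxruntime jars ended up on the classpath. The sketch below is illustrative: `OnnxRuntimeClasspathCheck` is a hypothetical class, it assumes Maven's usual `artifactId-version.jar` file naming, and it only sees entries listed in the `java.class.path` system property, so it may not apply to fat jars or custom class loaders.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative helper; the class and method names are not part of SmartJavaAI.
final class OnnxRuntimeClasspathCheck {

    // Assumption: jars follow Maven's "artifactId-version.jar" naming.
    private static final Pattern CPU_JAR = Pattern.compile("onnxruntime-\\d.*\\.jar");
    private static final Pattern GPU_JAR = Pattern.compile("onnxruntime_gpu-\\d.*\\.jar");

    /** Returns the file name of every classpath entry. */
    static List<String> fileNames(String classPath, String separator) {
        List<String> names = new ArrayList<>();
        if (classPath == null) {
            return names;
        }
        for (String entry : classPath.split(Pattern.quote(separator))) {
            String unified = entry.replace('\\', '/');
            names.add(unified.substring(unified.lastIndexOf('/') + 1));
        }
        return names;
    }

    private static boolean hasMatch(String classPath, String separator, Pattern jar) {
        for (String name : fileNames(classPath, separator)) {
            if (jar.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    /** True if the CPU build of onnxruntime is on the classpath. */
    static boolean hasCpuJar(String classPath, String separator) {
        return hasMatch(classPath, separator, CPU_JAR);
    }

    /** True if the GPU build of onnxruntime is on the classpath. */
    static boolean hasGpuJar(String classPath, String separator) {
        return hasMatch(classPath, separator, GPU_JAR);
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        System.out.println("onnxruntime (CPU) present: " + hasCpuJar(classPath, File.pathSeparator));
        System.out.println("onnxruntime_gpu present:   " + hasGpuJar(classPath, File.pathSeparator));
    }
}
```

The intended state for GPU use is that the CPU jar is absent and the GPU jar is present. If both are reported, another module is probably still pulling in the CPU build transitively.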
# 7. Common errors and solutions
# Example error log 1:
ai.djl.engine.EngineException: Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend.
This could be because the operator doesn't exist for this backend,
or was omitted during the selective/custom build process (if using custom build).
If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions.
'aten::empty_strided' is only available for these backends: [CPU...].
Cause: the installed CUDA and cuDNN versions do not match.
Solution: install the versions required in this document.
# Example error log 2:
Caused by: java.lang.UnsatisfiedLinkError: C:\Users\Administrator\smartjavaai_cache\pytorch\2.5.1-20241113-cu124-win-x86_64\torch_cuda.dll: Can't find dependent libraries
at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method)
at java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2437)
at java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2494)
at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2694)
at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2624)
at java.base/java.lang.Runtime.load0(Runtime.java:765)
at java.base/java.lang.System.load(System.java:1852)
at ai.djl.pytorch.jni.LibUtils.loadNativeLibrary(LibUtils.java:379)
at ai.djl.pytorch.jni.LibUtils.loadLibTorch(LibUtils.java:195)
at ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:82)
at ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53)
... 39 more
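Cause: the CUDA environment variables are not configured correctly.
Solution: see the section on configuring system environment variables.
# Example error log 3: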
Caused by: java.lang.Exception: Compute device gpu has no memory device registered. Please call RegisterMemoryDevice firstly.
at com.seeta.sdk.FaceDetector.construct(Native Method)
at com.seeta.sdk.FaceDetector.<init>(FaceDetector.java:17)
at com.seeta.pool.FaceDetectorPool$1.makeObject(FaceDetectorPool.java:37)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:566)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:306)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:233)
at cn.smartjavaai.face.model.facerec.SeetaFace6Model.extractFeatures(SeetaFace6Model.java:853)
... 29 more
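Cause: Seetaface6 did not load its GPU dependency libraries correctly.
Solution: use the latest version of SmartJavaAI; older versions may have compatibility issues.
# Example error log 4: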
java.lang.UnsatisfiedLinkError: C:\Users\Administrator\smartjavaai_cache\seetaface6\tennis.dll: Can't find dependent libraries
Cause: a Seetaface6 model is in use, but CUDA is not installed or its version is wrong.
Solution: install CUDA v11.6.2 and configure the system environment variables.