1. Environment required for deep learning
- Python 2 / Python 3: https://www.python.org/, or Miniconda from the Tsinghua mirror: https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/
- TensorFlow (Google): https://www.tensorflow.org/, https://tensorflow.google.cn/; PyTorch (Facebook): https://pytorch.org/
- CUDA driver: the NVIDIA graphics driver (the lowest layer) https://www.nvidia.cn/Download/index.aspx?lang=cn
- cudatoolkit: the CUDA toolkit package https://developer.nvidia.com/cuda-toolkit-archive
- CUDA: NVIDIA's parallel computing framework for its own GPUs https://developer.nvidia.com/cuda-toolkit-archive
- cuDNN: NVIDIA's acceleration library for deep neural networks https://developer.nvidia.com/rdp/cudnn-download
2. Version compatibility
- TensorFlow with cuDNN and CUDA: https://tensorflow.google.cn/install/source#gpu, https://tensorflow.google.cn/install/source_windows#gpu
- PyTorch with CUDA: https://pytorch.org/get-started/previous-versions/
- Graphics driver with CUDA: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
- cuDNN with CUDA: https://developer.nvidia.com/rdp/cudnn-archive
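Once the required versions have been read off the pages above, the comparison with what is actually installed can be automated. The following is a minimal sketch, not an official tool: `REQUIREMENTS`, `major_minor`, and `check` are hypothetical names, the single table entry is illustrative and should be verified against the TensorFlow page, and matching on exactly `major.minor` is a simplifying assumption.

```python
# Hypothetical requirement table; copy real values from the official pages.
# The entry below is an illustrative example and should be verified there.
REQUIREMENTS = {
    "tensorflow-gpu==1.13.1": {"cuda": "10.0", "cudnn": "7.4"},
}


def major_minor(version):
    """Return (major, minor) as ints from a string such as '10.0.130'."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def check(package, installed):
    """Return a list of mismatch messages; an empty list means no mismatch."""
    problems = []
    for component, required in REQUIREMENTS[package].items():
        found = installed.get(component)
        if found is None:
            problems.append(f"{component}: not found, need {required}")
        elif major_minor(found) != major_minor(required):
            # Assumption: only major.minor must match; patch level is ignored.
            problems.append(f"{component}: found {found}, need {required}")
    return problems


if __name__ == "__main__":
    installed = {"cuda": "10.0.130", "cudnn": "7.4.2"}
    print(check("tensorflow-gpu==1.13.1", installed))
```

The `installed` dictionary is filled in by hand here; in practice its values would come from the version checks in the next section.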
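3. Checking versions
- Python (command line): `python` (the startup banner shows the version) or `which python` (shows which interpreter is on the PATH)
- TensorFlow
  - Method 1 (in Python): `import tensorflow as tf; print(tf.__version__)`
  - Method 2 (command line): `pip list` or `conda list`
- PyTorch
  - Method 1 (in Python): `import torch; print(torch.__version__)`
  - Method 2 (command line): `pip list` or `conda list`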
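The framework checks above can also be done without importing the frameworks themselves, which is useful when an import fails because of a CUDA mismatch. A minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and that the packages were installed with metadata visible to the current interpreter; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as they appear in `pip list`.
    for name in ("tensorflow", "tensorflow-gpu", "torch"):
        print(name, installed_version(name) or "not installed")
```

This reads package metadata only, so it reports what is installed, not whether the package can actually load its CUDA libraries.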
- Graphics driver (command line): `nvidia-smi` or `cat /proc/driver/nvidia/version`
- CUDA: `nvcc -V` or `nvcc --version` (this only works once the CUDA environment variables are set; see section 4 below), or `cat /usr/local/cuda/version.txt`
- cuDNN: `cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2` or `cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2`
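The `grep` on the cuDNN header and the `nvcc` output can be turned into plain version strings with a little parsing. A minimal sketch: `parse_cudnn_version` and `parse_nvcc_release` are hypothetical helper names, and the header paths are the ones used in the commands above. Note that from cuDNN 8 onward the version macros live in `cudnn_version.h` rather than `cudnn.h`.

```python
import re


def parse_cudnn_version(header_text):
    """Extract 'major.minor.patch' from the text of a cuDNN header, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{key}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


def parse_nvcc_release(nvcc_output):
    """Extract the release number (e.g. '10.0') from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Paths taken from the commands above; adjust them for other installations.
    for path in ("/usr/local/cuda/include/cudnn.h", "/usr/include/cudnn.h"):
        try:
            with open(path) as handle:
                print(path, parse_cudnn_version(handle.read()))
        except OSError:
            print(path, "not readable")
```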
4. Installing
- Let conda handle everything automatically: `conda install tensorflow-gpu==1.13.1`
- Graphics driver: https://www.nvidia.cn/Download/index.aspx?lang=cn
- CUDA: download the installer from https://developer.nvidia.com/cuda-toolkit-archive, run it, then add the CUDA paths to `~/.bashrc`:

      sudo sh cuda_9.0.176_384.81_linux.run
      vim ~/.bashrc
      # Add the lines below to ~/.bashrc; uncomment the pair that matches the installed CUDA version.
      # cuda-9.0
      # export PATH="/usr/local/cuda-9.0/bin:$PATH"
      # export LD_LIBRARY_PATH="/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH"
      # cuda-10.0
      export PATH="/usr/local/cuda-10.0/bin:$PATH"
      export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"
      # Reload the shell configuration afterwards: source ~/.bashrc

- cuDNN: https://developer.nvidia.com/rdp/cudnn-archive
- Official guide: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#handle-uninstallation
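After editing `~/.bashrc`, it can help to confirm that the two variables really contain the CUDA directories. A minimal sketch for Linux: `cuda_paths_configured` is a hypothetical helper name, and `/usr/local/cuda-10.0` is the install prefix assumed from the export lines above.

```python
import os


def cuda_paths_configured(cuda_home, env=None):
    """Report whether PATH and LD_LIBRARY_PATH contain the CUDA directories."""
    env = os.environ if env is None else env
    home = cuda_home.rstrip("/")

    def contains(variable, directory):
        # Linux-style lists are colon-separated; ignore empty entries.
        entries = [e.rstrip("/") for e in env.get(variable, "").split(":") if e]
        return directory in entries

    return {
        "PATH": contains("PATH", home + "/bin"),
        "LD_LIBRARY_PATH": contains("LD_LIBRARY_PATH", home + "/lib64"),
    }


if __name__ == "__main__":
    # Assumed install prefix; change it to match the installed CUDA version.
    print(cuda_paths_configured("/usr/local/cuda-10.0"))
```

A `False` for `PATH` typically explains why `nvcc` is not found even though CUDA is installed.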
5. Selecting a GPU by ID
CUDA_VISIBLE_DEVICES=1 python my_script.py  # on the command line
export CUDA_VISIBLE_DEVICES=1  # in a shell script
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # in Python; set this before importing TensorFlow or PyTorch
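The Python variant can be wrapped in a small helper that also accepts several GPU ids, since `CUDA_VISIBLE_DEVICES` takes a comma-separated list. A minimal sketch; `select_gpus` is a hypothetical name, and the call must happen before TensorFlow or PyTorch is imported, because the variable is read when CUDA is initialized.

```python
import os


def select_gpus(*gpu_ids):
    """Restrict this process to the given GPU ids and return the value set."""
    value = ",".join(str(gpu_id) for gpu_id in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


if __name__ == "__main__":
    # Call this first, then import the deep learning framework.
    print(select_gpus(0, 1))  # prints "0,1"
```

Inside the process the visible devices are renumbered from 0, so with `select_gpus(1)` the framework sees physical GPU 1 as device 0.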
6. BERT on a GPU with 12 GB of memory: maximum batch size versus sequence length
System | Seq Length | Max Batch Size |
---|---|---|
BERT-Base | 64 | 64 |
… | 128 | 32 |
… | 256 | 16 |
… | 320 | 14 |
… | 384 | 12 |
… | 512 | 6 |
BERT-Large | 64 | 12 |
… | 128 | 6 |
… | 256 | 2 |
… | 320 | 1 |
… | 384 | 0 |
… | 512 | 0 |
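The table can be used as a lookup when choosing a batch size. A minimal sketch that encodes only the values listed above: `MAX_BATCH_SIZE` and `max_batch_size` are hypothetical names, and rounding an unlisted sequence length up to the next listed one is a conservative assumption, not something the table states. A result of 0 is read here as "does not fit in 12 GB".

```python
import bisect

# Values copied from the table above (12 GB GPU).
MAX_BATCH_SIZE = {
    "BERT-Base": {64: 64, 128: 32, 256: 16, 320: 14, 384: 12, 512: 6},
    "BERT-Large": {64: 12, 128: 6, 256: 2, 320: 1, 384: 0, 512: 0},
}


def max_batch_size(model, seq_length):
    """Return the tabulated max batch size, rounding seq_length up to a listed value."""
    table = MAX_BATCH_SIZE[model]
    lengths = sorted(table)
    index = bisect.bisect_left(lengths, seq_length)
    if index == len(lengths):
        raise ValueError(f"sequence length {seq_length} exceeds the table")
    return table[lengths[index]]


if __name__ == "__main__":
    print(max_batch_size("BERT-Base", 200))   # uses the 256 row: 16
    print(max_batch_size("BERT-Large", 384))  # 0
```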