My previous CUDA installation on Ubuntu 14.04 was frustrating because of boot problems after installation: the system would often freeze on the Ubuntu logo while booting. Some say this is caused by driver issues.
So I turned to Windows, where the issue seems to be solved. I basically followed this blog guide, but did some steps differently.
Here are the steps:
1. Install Visual Studio (2013)
First, check which versions of Visual C++ (VC) are supported by the CUDA version you will be downloading. At the time of writing, the latest supported VC was 2013, so VC 2015, for example, would NOT be supported.
After installation, add the following two paths to your system PATH:
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\;C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE
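To confirm that the PATH change took effect, the two directories can be compared against the current PATH value. The following is a minimal sketch, assuming Python 3; the helper name missing_path_entries is hypothetical, and ntpath is used so that Windows-style paths are compared without regard to letter case or trailing backslashes.

```python
import ntpath
import os

# The two Visual Studio 2013 directories from the step above.
REQUIRED_DIRS = [
    r"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin",
    r"C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE",
]


def _normalize(path):
    # Windows paths are case-insensitive; also drop trailing separators.
    return ntpath.normcase(ntpath.normpath(path.strip()))


def missing_path_entries(path_value, required=REQUIRED_DIRS):
    """Return the required directories that do not appear in path_value."""
    present = {_normalize(p) for p in path_value.split(";") if p.strip()}
    return [d for d in required if _normalize(d) not in present]


if __name__ == "__main__":
    missing = missing_path_entries(os.environ.get("PATH", ""))
    print("Missing from PATH:", missing if missing else "nothing")
```

Open a new command prompt before running it, since an already open prompt keeps the old PATH.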
2. Install CUDA (7.5)
My version was 7.5, the latest at the time of writing. Installation should be quite simple.
3. Python and dependencies
Since I use Anaconda as my Python distribution, I just typed the command below to install the additional dependencies:
conda install mingw libpython
4. Install Theano
I chose to install the bleeding-edge version from GitHub. As with other Python packages installed this way, cd to anaconda/Lib/site-packages and clone the package from GitHub:
git clone https://github.com/Theano/Theano.git
Then cd into the Theano folder and install:
cd Theano
python setup.py develop
After installing Theano, create a .theanorc.txt file in your HOME directory (somewhere like c:/Users/YOURNAME/) with the following contents:
[global]
floatX = float32
device = gpu
[nvcc]
flags=-LC:\SciSoft\Anaconda\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
Make sure
flags=-LC:\SciSoft\Anaconda\libs
points to the libs folder of your own Anaconda installation. (e.g. I installed Anaconda in HOME\SciSoft\Anaconda)
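A wrong directory in this file is easy to miss, so it can help to read the two [nvcc] paths back and check that they exist. The following is a minimal sketch, assuming Python 3 (on Python 2.7 the module is named ConfigParser) and assuming that flags holds a single -L entry as in the example above. The helper name read_nvcc_paths and the SAMPLE text are only for illustration.

```python
import configparser

# Same contents as the .theanorc.txt example above.
SAMPLE = r"""
[global]
floatX = float32
device = gpu
[nvcc]
flags=-LC:\SciSoft\Anaconda\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
"""


def read_nvcc_paths(text):
    """Return (library_dir, compiler_bindir) from .theanorc-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    flags = parser.get("nvcc", "flags")
    if not flags.startswith("-L"):
        raise ValueError("expected flags to start with -L, got %r" % flags)
    # Strip the leading "-L" to get the directory itself.
    return flags[2:], parser.get("nvcc", "compiler_bindir")


if __name__ == "__main__":
    import os
    lib_dir, bindir = read_nvcc_paths(SAMPLE)
    for label, path in (("libs", lib_dir), ("compiler_bindir", bindir)):
        print(label, path, "exists" if os.path.isdir(path) else "NOT FOUND")
```

To check your real file, pass its contents to read_nvcc_paths instead of SAMPLE.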
5. Install PyCUDA
Download the .whl file from here. My Python was 2.7, so I downloaded the latest matching wheel, pycuda-2015.1.3+cuda7518-cp27-none-win_amd64.whl.
And installed it using
pip install pycuda-2015.1.3+cuda7518-cp27-none-win_amd64.whl
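The wheel filename encodes which Python it was built for (cp27 means CPython 2.7, win_amd64 means 64-bit Windows), and pip refuses a wheel whose tags do not match the interpreter. The following is a minimal sketch that splits the filename into its fields, based on the standard wheel naming scheme; the helper names parse_wheel_filename and matches_interpreter are hypothetical.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    # An optional build tag may sit between the version and the Python tag.
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {"name": name, "version": version, "python": python_tag,
            "abi": abi_tag, "platform": platform_tag}


def matches_interpreter(python_tag, version_info=None):
    """Check a CPython tag such as 'cp27' against an interpreter version."""
    major, minor = (version_info or sys.version_info)[:2]
    return python_tag == "cp%d%d" % (major, minor)


if __name__ == "__main__":
    info = parse_wheel_filename(
        "pycuda-2015.1.3+cuda7518-cp27-none-win_amd64.whl")
    print(info)
    print("Matches this Python:", matches_interpreter(info["python"]))
```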
6. Testing Theano and PyCUDA
Theano with GPU:
Importing theano should print a line similar to the one below:
import theano
Using gpu device 0: GeForce GT 705 (CNMeM is disabled)
Then follow the Theano docs and test with an example snippet like the one below:
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
PyCUDA:
import pycuda.autoinit
import pycuda.driver as drv
import numpy
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)

dest = numpy.zeros_like(a)
multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1), grid=(1,1))

print dest-a*b
And this should give a screen of zeros.
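If it is unclear why the output is all zeros, the kernel can be imitated on the CPU. The following is a rough stand-in in plain Python (no GPU or PyCUDA needed), not the actual kernel: the loop variable plays the role of threadIdx.x, and one loop iteration corresponds to one of the 400 threads launched with block=(400,1,1).

```python
import random


def multiply_them(dest, a, b, block_x):
    """CPU stand-in for the kernel: one loop iteration per GPU thread."""
    for thread_idx_x in range(block_x):  # plays the role of threadIdx.x
        i = thread_idx_x
        dest[i] = a[i] * b[i]


n = 400  # same size as block=(400,1,1) in the PyCUDA test
a = [random.gauss(0.0, 1.0) for _ in range(n)]
b = [random.gauss(0.0, 1.0) for _ in range(n)]
dest = [0.0] * n
multiply_them(dest, a, b, block_x=n)

# dest holds the same products that are subtracted here, so every
# difference is exactly zero.
differences = [d - x * y for d, x, y in zip(dest, a, b)]
print(all(diff == 0.0 for diff in differences))
```

With a single block, threadIdx.x alone is enough to index every element, which is why the test uses grid=(1,1).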
You are good to go!
Thanks a lot, Weimin Wang. Your blog helped a lot.
Clear, concise, and it worked!
Thank you so much! You’ve saved me hours
When I run the PyCUDA code, it shows me this error:
CompileError: nvcc compilation of C:\Users\NRB\AppData\Local\Temp\tmplothnihj\kernel.cu failed
I have also added the NVCC compiler path, but I get the same error. Can you please tell me how I can fix the problem?