gpgpu - The performance of convn using GPU in Matlab -


I found a related question on Stack Overflow.

It did not solve my problem. I found that when the dimension m is small, convn on the GPU performs worse than the one on the CPU. I used the following test code:

    m = 28;
    n = 10;
    %m = 100;
    %n = 100;
    nsample = 50;
    k = 5;

    % GPU version
    x1 = gpuArray.rand(m, m, nsample, 'single');
    x2 = gpuArray.rand(k, 'single');
    tic;
    for i = 1:n
        gc = convn(x1, x2, 'valid');
    end
    toc

    % CPU version
    x1 = rand(m, m, nsample, 'single');
    x2 = rand(k, 'single');
    tic;
    for i = 1:n
        c = convn(x1, x2, 'valid');
    end
    toc

For different values of m, the performance differs (the first time is the GPU loop, the second is the CPU loop):

When m = 28: elapsed time is 0.076827 seconds (GPU), elapsed time is 0.020734 seconds (CPU).

When m = 100: elapsed time is 0.086314 seconds (GPU), elapsed time is 0.234197 seconds (CPU).

When m = 200: elapsed time is 0.077589 seconds (GPU), elapsed time is 0.243595 seconds (CPU).

It seems that MATLAB's GPU code is poorly optimized when m is small. I think this may depend on the I/O cost.

Is there a way to optimize the code when m is small?

Thank you guys very much.
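GPU calls in MATLAB can return before the device has finished its work, so plain tic/toc may not capture the full cost. Below is a minimal sketch of the same comparison. It assumes Parallel Computing Toolbox with a supported GPU, and it uses the `gputimeit` and `timeit` helpers, which are not part of the original test:

```matlab
% Sketch: compare convn on GPU and CPU with timers that account for
% GPU synchronisation. Sizes follow the small case (m = 28) above.
m = 28; nsample = 50; k = 5;

x1 = gpuArray.rand(m, m, nsample, 'single');
x2 = gpuArray.rand(k, 'single');
% gputimeit runs the function several times and waits for the GPU to finish
tGpu = gputimeit(@() convn(x1, x2, 'valid'));

% Same data on the host, so both versions convolve identical inputs
y1 = gather(x1);
y2 = gather(x2);
tCpu = timeit(@() convn(y1, y2, 'valid'));

fprintf('m = %d: GPU %.6f s, CPU %.6f s\n', m, tGpu, tCpu);
```

If the GPU time stays roughly constant as m changes, that would be consistent with a fixed per-call overhead dominating for small inputs, but this is only one possible reading of the numbers.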

-----------------------Extra experiment results (11/03/2014)----------------------------

Hi,

Thank you for your replies.

My colleague and I have run some experiments and found some results. In short, we observe two problems in MATLAB:

  1. It spends time on garbage collection.
  2. It throws an error when we keep allocating new variables.

For problem 1:

I used the following code:

    timel = [];
    for i = 1:10
        m1 = 1000;
        m2 = 1000;
        nsample = 50;
        n = 10;
        k = 5;
        tic;
        x1 = gpuArray.rand(m1, m2, nsample, 'single');
        x2 = gpuArray.rand(k, 'single');
        clear x1 x2;
        time = toc;   % allocation plus clear
        timel = [timel, time];
    end
    timel

The result is:

timel =   0.0646    0.0579    0.0563    0.0566    0.0572    0.0571    0.0560    0.0574    0.0573    0.0582 

It seems that it takes a long time to collect the garbage memory space.

For the running time of convn, I use the following code, which adds a call to convn whose result is saved in gc:

    timel = [];
    for i = 1:10
        m1 = 1000;
        m2 = 1000;
        nsample = 50;
        n = 10;
        k = 5;
        x1 = gpuArray.rand(m1, m2, nsample, 'single');
        x2 = gpuArray.rand(k, 'single');

        % running time of convn
        tic;
        gc = convn(x1, x2, 'valid');
        %clear gc;
        time = toc;

        clear x1 x2;
        timel = [timel, time];
    end
    timel

The running time is:

 timel = 0.0643    0.1215    0.0075    0.2517    0.0075    0.2504    0.0075    0.2500    0.0075    0.0075 

The running time is not stable: it varies between 0.0075 and 0.2517.

If I uncomment the line `clear gc;`, the running time is quite stable, as follows:

timel = 0.0627    0.0657    0.0634    0.0634    0.0623    0.0623    0.0629    0.0632    0.0632    0.0622 

It indicates that the running time of convn itself is fast, but the memory collection is the bottleneck.

Does anyone know how I can reuse one fixed part of GPU memory without memory collection? For example, in C-style code:
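One way to check this reading is to synchronise with the device before each `toc`, so that the convolution and the clear are timed separately. The sketch below is an assumption about how to separate the two costs, not something taken from the experiments above; it uses `wait(gpuDevice)`, and the names `timeConv` and `timeClear` are hypothetical:

```matlab
% Sketch: time convn and "clear gc" separately, waiting for the GPU
% to finish before each toc. Sizes follow the experiment above.
g = gpuDevice;
timeConv  = zeros(1, 10);
timeClear = zeros(1, 10);
for i = 1:10
    x1 = gpuArray.rand(1000, 1000, 50, 'single');
    x2 = gpuArray.rand(5, 'single');
    wait(g);                        % inputs are fully generated

    tic;
    gc = convn(x1, x2, 'valid');
    wait(g);                        % convolution has actually finished
    timeConv(i) = toc;

    tic;
    clear gc;
    wait(g);                        % any pending device work is done
    timeClear(i) = toc;

    clear x1 x2;
end
timeConv
timeClear
```

If `timeConv` is stable and `timeClear` is small, the fluctuation seen earlier may come from asynchronous execution rather than from memory collection alone.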

`convn(x1,x2,&gc,'valid')` 

For problem 2:

I used the following code:

    %% test memory allocation
    timel = [];
    for i = 1:10000
        m1 = 1000;
        m2 = 1000;
        nsample = 50;
        n = 10;
        k = 5;
        tic;
        x1 = gpuArray.rand(m1, m2, nsample, 'single');
        %clear x1;
        time = toc;
        timel = [timel, time]
    end

The result shows that the time cost of allocating new memory is trivial, but it raises an error after several loops:

    Error using gpuArray.rand
    An unexpected error occurred trying to launch a kernel. The CUDA error was:
    the launch timed out and was terminated
    Error in test (line 20)
            x1 = gpuArray.rand(m1,m2,nsample,'single');

This is reasonable, since I have allocated a lot of GPU memory. The worse thing is that, if I use

    clear all;

I get warnings, as follows:

    Warning: An unexpected error occurred during CUDA execution. The CUDA error was: CUDA_ERROR_LAUNCH_TIMEOUT
    Warning: An unexpected error occurred during CUDA execution. The CUDA error was: CUDA_ERROR_LAUNCH_TIMEOUT

I still cannot allocate new GPU memory until I reset the GPU device using `reset(gpuDevice())`. It seems that MATLAB does not track memory once the previous error has been raised, and does not clear the allocated memory.
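The recovery step described above can be sketched as a try/catch around the allocation. This is only an illustration: the loop count of 100 is arbitrary, and the `AvailableMemory` property is assumed to be present in the MATLAB release in use (older releases may expose a differently named property):

```matlab
% Sketch: allocate repeatedly, and reset the device if an allocation fails.
g = gpuDevice;
for i = 1:100
    try
        x1 = gpuArray.rand(1000, 1000, 50, 'single');
    catch err
        fprintf('Iteration %d failed: %s\n', i, err.message);
        clear x1;    % drop the stale variable (no effect if it does not exist)
        reset(g);    % clears all GPU memory and reinitialises the device
        break;
    end
end
fprintf('Available GPU memory: %.0f MB\n', g.AvailableMemory / 2^20);
```

Note that `reset` invalidates every existing gpuArray in the workspace, so any data that is still needed has to be brought back with `gather` beforehand.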

