gpgpu - The performance of convn using GPU in MATLAB
I found a related question on Stack Overflow, but it did not solve my problem. I found that when the dimension m is small, convn on the GPU performs worse than convn on the CPU. I used the following test code:
    m = 28;
    n = 10;
    %m = 100;
    %n = 100;
    nsample = 50;
    k = 5;

    x1 = gpuArray.rand(m,m,nsample,'single');
    x2 = gpuArray.rand(k,'single');
    tic;
    for i = 1:n
        gc = convn(x1, x2, 'valid');
    end
    toc

    x1 = rand(m,m,nsample,'single');
    x2 = rand(k,'single');
    tic;
    for i = 1:n
        c = convn(x1, x2, 'valid');
    end
    toc
For different values of m, the performance differs (in each case the first time is the GPU loop and the second is the CPU loop):
When m = 28:

    Elapsed time is 0.076827 seconds.
    Elapsed time is 0.020734 seconds.

When m = 100:

    Elapsed time is 0.086314 seconds.
    Elapsed time is 0.234197 seconds.

When m = 200:

    Elapsed time is 0.077589 seconds.
    Elapsed time is 0.243595 seconds.
It seems that MATLAB does not optimize the GPU code well when m is small: the GPU time stays at roughly 0.08 seconds regardless of m. I think it may depend on the I/O cost.
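One caveat about the timings above: gpuArray operations can run asynchronously, so tic/toc may stop the clock before the GPU has actually finished unless the device is synchronized first. The following is a minimal sketch of a synchronized measurement. It reuses the variable names from the test code and assumes a release that provides `gputimeit` and `timeit`; the names `gd`, `tGpuLoop`, `tGpuCall`, `tCpuCall`, `h1` and `h2` are introduced here for illustration only. No timings are shown because they depend on the hardware.

```matlab
m = 28; nsample = 50; k = 5; n = 10;

x1 = gpuArray.rand(m, m, nsample, 'single');
x2 = gpuArray.rand(k, 'single');

% Manual timing: synchronize before starting and before stopping the timer.
gd = gpuDevice;
wait(gd);
tic;
for i = 1:n
    gc = convn(x1, x2, 'valid');
end
wait(gd);            % block until all queued GPU work has finished
tGpuLoop = toc;

% gputimeit takes care of repetition and synchronization for a single call.
tGpuCall = gputimeit(@() convn(x1, x2, 'valid'));

% CPU reference on the same data, timed with timeit.
h1 = gather(x1);
h2 = gather(x2);
tCpuCall = timeit(@() convn(h1, h2, 'valid'));

fprintf('GPU loop: %.6f s, GPU per call: %.6f s, CPU per call: %.6f s\n', ...
        tGpuLoop, tGpuCall, tCpuCall);
```

If the per-call GPU time stays nearly constant as m changes, that would be consistent with a fixed per-call overhead dominating for small inputs, though this sketch alone does not prove where that overhead comes from.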
Is there a way to optimize the code when m is small?
Thank you all very much.
-----------------------extra experiment results (11/03/2014)----------------------------
Hi,
Thank you for your replies.
My colleague and I have run some experiments and found some results. In short, we observe two problems in MATLAB:
- It takes time on garbage collection.
- It throws an error when we keep allocating new variables.
For problem 1
I used the following code:
    timel = [];
    for i = 1:10
        m1 = 1000;
        m2 = 1000;
        nsample = 50;
        n = 10;
        k = 5;
        tic;
        x1 = gpuArray.rand(m1,m2,nsample,'single');
        x2 = gpuArray.rand(k,'single');
        clear x1 x2;
        time = toc;
        timel = [timel , time];
    end
    timel
The result is:

    timel =
        0.0646  0.0579  0.0563  0.0566  0.0572  0.0571  0.0560  0.0574  0.0573  0.0582

It seems that it takes a long time to collect the garbage memory space.
For the running time of convn, I used the following code, which adds the function convn and saves the result in gc:
    timel = [];
    for i = 1:10
        m1 = 1000;
        m2 = 1000;
        nsample = 50;
        n = 10;
        k = 5;
        x1 = gpuArray.rand(m1,m2,nsample,'single');
        x2 = gpuArray.rand(k,'single');
        % running time of convn
        tic;
        gc = convn(x1,x2,'valid');
        %clear gc;
        time = toc;
        clear x1 x2;
        timel = [timel , time];
    end
    timel
The running time is:

    timel =
        0.0643  0.1215  0.0075  0.2517  0.0075  0.2504  0.0075  0.2500  0.0075  0.0075

The running time is not stable: it varies between 0.0075 and 0.2517 seconds.
If I uncomment the line `clear gc;`, the running time is quite stable, as follows:

    timel =
        0.0627  0.0657  0.0634  0.0634  0.0623  0.0623  0.0629  0.0632  0.0632  0.0622
This indicates that the running time of convn itself is fast, but memory collection is the bottleneck.
Does anyone know how I can use a fixed part of GPU memory, without memory collection? For example, something like this C-style code:
`convn(x1,x2,&gc,'valid')`
For problem 2
I used the following code:
    %% test memory allocation
    timel = [];
    for i = 1:10000
        m1 = 1000;
        m2 = 1000;
        nsample = 50;
        n = 10;
        k = 5;
        tic;
        x1 = gpuArray.rand(m1,m2,nsample,'single');
        %clear x1;
        time = toc;
        timel = [timel , time]
    end
The result shows that the time cost of allocating new memory is trivial, but an error is raised after several loops:
    Error using gpuArray.rand
    An unexpected error occurred trying to launch a kernel. The CUDA error was:
    the launch timed out and was terminated

    Error in test (line 20)
    x1 = gpuArray.rand(m1,m2,nsample,'single');
This seems reasonable, since I have allocated a lot of GPU memory. What is worse is that if I then use

    clear all;

I get warnings, as follows:
    Warning: An unexpected error occurred during CUDA execution. The CUDA error was:
    CUDA_ERROR_LAUNCH_TIMEOUT
    Warning: An unexpected error occurred during CUDA execution. The CUDA error was:
    CUDA_ERROR_LAUNCH_TIMEOUT
I still cannot allocate new GPU memory until I reset the GPU device, using `reset(gpuDevice())`.
It seems that MATLAB does not track the memory once the previous error has been raised, and does not clear the allocated memory.
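To check whether the device memory is really being exhausted before the error appears, the allocation loop can log the free memory reported by the device. This is only a sketch under stated assumptions: it relies on the `AvailableMemory` property of the `gpuDevice` object (older releases may expose `FreeMemory` instead), and the iteration count `nIter` and the array `freeMem` are hypothetical names chosen for illustration.

```matlab
m1 = 1000; m2 = 1000; nsample = 50;
nIter = 20;                          % arbitrary small count for illustration

freeMem = zeros(1, nIter);

for i = 1:nIter
    x1 = gpuArray.rand(m1, m2, nsample, 'single');
    gd = gpuDevice;                  % query the current device (no reset)
    wait(gd);                        % make sure the random fill has finished
    freeMem(i) = gd.AvailableMemory; % bytes currently free on the device
end

fprintf('free memory (MB): min %.1f, max %.1f\n', ...
        min(freeMem) / 2^20, max(freeMem) / 2^20);

clear x1;
% If the device ends up in an error state, reset(gpuDevice) reinitializes it
% and invalidates all existing gpuArray variables.
```

If the logged free memory stays roughly constant across iterations, the old x1 is being released each time it is overwritten, and the timeout would then need a different explanation than running out of memory.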