About

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

  also known as zed
  & handle of notzed

Wednesday, 14 September 2011, 07:20

Shared Bank Conflicts

So yesterday I was working on some code for a wavelet transform - essentially a bunch of convolutions performed in a specific way.

One routine I have does 4 interleaved convolutions simultaneously, and had a large number of shared bank conflicts - 200% or so. So I spent quite a while trying to remove them. I got rid of most of them - but for some weird reason I still get about 14% and I ran out of day (and patience) to work out what was going on there.
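To illustrate the kind of conflict involved (this is not the kernel from the post), here is a minimal host-side sketch that counts bank conflicts for a stride-4 access pattern such as 4 interleaved channels might produce. It assumes 32 banks of 4-byte words and a group of 32 work-items, which is typical of GPUs from that era but is not stated in the post. The names `conflict_degree`, `interleaved` and `padded` are hypothetical, and the padding trick shown is one common remedy, not necessarily the one used here.

```python
from collections import defaultdict

NUM_BANKS = 32  # assumption: 32 banks, each one 4-byte word wide


def conflict_degree(word_indices, num_banks=NUM_BANKS):
    """Worst-case number of distinct words hitting the same bank.

    1 means conflict-free; N means the access is serialised N ways.
    Work-items reading the very same word are counted once (broadcast).
    """
    banks = defaultdict(set)
    for w in word_indices:
        banks[w % num_banks].add(w)
    return max(len(words) for words in banks.values())


def interleaved(i, channels=4):
    # Work-item i reads the first of its 4 interleaved values.
    return i * channels


def padded(i, channels=4, num_banks=NUM_BANKS):
    # Same layout, but with one unused word inserted after every
    # num_banks words so successive rows start on different banks.
    w = i * channels
    return w + w // num_banks


work_items = range(32)  # assumption: 32 work-items access memory together
plain = conflict_degree([interleaved(i) for i in work_items])
fixed = conflict_degree([padded(i) for i in work_items])
print("stride-4:", plain, "way; padded:", fixed, "way")
```

Under these assumptions the plain stride-4 layout maps every fourth work-item onto the same bank, while the padded layout spreads all 32 accesses over distinct banks at the cost of a little wasted local memory.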

I find this kind of problem quite tricky - trying to juggle dozens of sets of numbers in your head and still come up with something that works. I've developed a few rules of thumb but I still haven't done it often enough to make it easy.

But for all that effort I got a very minor performance improvement: barely 2%. Kernel time went from 70 µs to 67 µs. Hardly seemed worth it.

Tagged opencl.
Copyright (C) 2019 Michael Zucchi, All Rights Reserved. Powered by gcc & me!