Decided it was time i had a look into 'the state of things' wrt these new apis. I've not been keeping up with OpenCL of late either so I guess I should refresh on that (oh that's right, c++ kernels, well fuck that shit).
Vulkan definitely looks interesting. I mean vulkan looks like a huge amount of boilerplate from what little is available so far (the GDC 2015 slides from khronos) but it all makes plenty of sense. Pity it's still work in progress so all I can do is read about snippets. One hopes JavaFX will use it as a backend in the future though, and maybe i'll do a ZVK to compliment ZCL myself (it's good way to learn about an api).
There's a bizarre thread on the khronos forums where one poster wants to scare away beginner programmers from Vulkan least they get confused by such wild concepts as memory allocation and asynchronous synchronisation. That's ... well that's just wrong. All that information hiding in OpenGL is what makes it such a pain to understand beyond trivial cases, particularly since fixed pipeline went away. If people are using C (as they must to use it) then malloc is hardly an insurmountable barrier. And asynchronous sync is the only technique which makes multi-threading code manageable and really should be the main technique for MT code taught in any software engineering course. To be honest I think it's all that learning about critical sections and mutexes which do all the damage and make multi-threading seem so complicated and they should be something unlearnt as quickly as possible (as a general MT sync technique that is, obviously they are needed if only to implement more useful primitives like queues and barriers).
Anyway we know how it will play out for beginners and learning professionals alike: we'll all just use google to find the first example of the boilerplate that compiles and then keep using that for years. If I was khronos I'd make sure it's their up to date examples and documentation that using google finds and not some shitty agglomeration site full of out-dated shit.
SPIR looks a little "odd" to me at first glance, but i'm not a compiler writer so it probably should. To my mind i would describe it (a bit) like assembly language that has been pre-processed and annotated ready for an optimising compiler and then it's internal object representation flattened into a flat global space referenced by index. Rather than registers or addresses results are referenced by the instruction that produces them. It should make a frontend compiler fairly simple as the compiler doesn't need to deal with registers or stacks, and it makes a backend a bit simpler as some of the work is already done. And apparently it's not a file format, although it sure looks like one and i can't imagine it would make sense to anyone to create a different one to store the same information.
I wonder what this mean for HSA (or infact, any other compiler system using it's own virtual processor)? I've not looked into it so maybe something is already announced but if not i suspect SPIR is in it's future too. BRIG takes quite a different approach in that a specific register-based virtual machine is targeted and it's up to the frontend compiler to perform register optimisation and the backend isn't (supposed to be) much more than a machine code translator (which is still strictly speaking, a compiler). I just can't see hardware vendors keeping both maintained.
Oh my gods, it's full of compilers ...
One realisation I had when playing with the soft-GPU code on the parallella was that modern GPU drivers are mostly just compilers with a bit of memory and queue management bolted on the side. And similarly that modern GPUs are really just moderately wide vector CPUs with a little bit logic for fragment generation and texture lookup. of HSA itself is more about compiler writers than the end-user too.
It's just a bit of a personal bummer because although compilers are a bit interesting they're not enough interesting to me to spend the years required to get up to speed.