B.E. (Comp. Sys. Eng.)
also known as zed
& handle of notzed
Kept plugging away and added options for new images and new layers at last.
Not all of the layer types work 100% yet, but they work for normal drawing and layer operations and PNG at least saves mono images as mono. I probably don't need them all there - I just did it because I could.
All the calculations and blending are just done in float RGBA so none of the code needed changing, although it does mean it's doing a lot of redundant work if the result is only monochrome.
Save Take 2 of 2
Had another stab at saving again today. Damn, I forgot how tedious it is to try and make relatively nice looking GUI components, particularly when the layout toolkit seems to want to fight you at every stage ...
It's pretty basic but it should suffice for what I need. Actually this is the first post where I created the images using ImageZ itself, so it marks a bit of a milestone.
I haven't actually hooked up the comment stuff yet, but i've done it before and it's a matter of frobbing around with the metadata. Unfortunately it is in XML ...
I decided to leave in the RGBA support for JPEG files after reading the JPEG metadata specification for Java. Maybe nothing else will read them properly but it looks like Java will. But it defaults to flattening the image for JPEG files just to avoid confusion and i'll probably add a warning.
While doing testing I got really sick of the childish 'this file exists are you sure' crap (yeah no fucking shit sherlock!) and found I think a reasonable way to get rid of it. The file requester is used as normal to select the file to save to, and if you happen to select or enter a file that exists it just shows a warning in the bottom of the window and relabels the Save button to Overwrite. That's almost always what I want to do when I save over the same name. So the idea is not to interfere with typical use, but still let you back out if you truly made a mistake. Actually it lets you just type in a name from this window so you don't have to back out too far either.
I'm still not sure about the `Save Version' button here - the idea is that it will just create a name automatically if there is a clash. But I have a feeling that without filesystem multi-version support it's just a good way to lose what you're working on too easily.
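The requester behaviour above boils down to two small decisions, roughly sketched here. This is not the ImageZ code: the class and method names are made up, and the "-1", "-2" suffix scheme for `Save Version' is purely an assumption about what an automatic name might look like.

```java
import java.io.File;

/** Hypothetical helper, for illustration only. */
final class SaveNames {
    /** Label for the confirm button: "Overwrite" if the target is an existing file. */
    static String buttonLabel(File target) {
        return (target != null && target.isFile()) ? "Overwrite" : "Save";
    }

    /** Warning for the bottom of the window, or "" when there is no clash. */
    static String warning(File target) {
        return (target != null && target.isFile())
                ? "'" + target.getName() + "' exists and will be overwritten"
                : "";
    }

    /**
     * Assumed naming scheme for a 'Save Version' button: keep the name if it
     * is free, otherwise insert -1, -2, ... before the extension until an
     * unused name is found.
     */
    static File versionedName(File target) {
        if (!target.exists()) {
            return target;
        }
        String name = target.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        for (int n = 1; ; n++) {
            File candidate = new File(target.getParentFile(), base + "-" + n + ext);
            if (!candidate.exists()) {
                return candidate;
            }
        }
    }
}
```

The label and warning would be refreshed whenever the selection or the typed name changes, so nothing ever pops up in the way of a normal save.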
Unfortunately behind the scenes there's still a load of rubbish to deal with such as keeping track of the meta-data from the image loading, transferring meta-data between formats, and so on.
But with this weather I might be spending a lot of time inside ... sigh, really cold, wet, windy, shitty weather. And I hurt my foot doing something I can't remember although it's getting better quickly because i'm giving it plenty of rest. Both are at quite inconvenient times, I finally got council approval for my shed and I still have a lot of work to do getting the yard ready - and soon will have a hard deadline in which to achieve it.
Had a poke at saving images tonight. I don't know where the damn day went but I only started looking at doing some hacking at 10pm tonight.
To start with I thought i'd skip any esoteric floating point formats or bothering with layers and just stick with the standard formats and trying to fit the images to suit. So about the best format is PNG with 16 bits per channel. It's a bit of a pig's breakfast but basically it scans the image to find the highest depth/format image in the layer stack and then starts trying to find an image writer for that format and just keeps dropping down to lower depth resolution until one works. e.g. it tries floats, then shorts, then bytes.
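A minimal sketch of that fallback using the standard javax.imageio calls. The class name and the choice of an interleaved sRGB RGBA layout are assumptions for illustration; the real code starts from whatever the layer stack contains rather than always from float.

```java
import java.awt.color.ColorSpace;
import java.awt.image.DataBuffer;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageTypeSpecifier;
import javax.imageio.ImageWriter;

/** Hypothetical helper, for illustration only. */
final class WriterFallback {
    /** Depths to try, highest first: floats, then shorts, then bytes. */
    static final int[] DEPTHS = {
        DataBuffer.TYPE_FLOAT, DataBuffer.TYPE_USHORT, DataBuffer.TYPE_BYTE
    };

    /** Describes a non-premultiplied interleaved RGBA image of the given data type. */
    static ImageTypeSpecifier rgba(int dataType) {
        return ImageTypeSpecifier.createInterleaved(
                ColorSpace.getInstance(ColorSpace.CS_sRGB),
                new int[] {0, 1, 2, 3}, dataType, true, false);
    }

    /**
     * Returns the first DataBuffer type for which some installed writer of
     * the named format claims it can encode RGBA, or -1 if none does.
     */
    static int pickDataType(String formatName) {
        for (int dataType : DEPTHS) {
            Iterator<ImageWriter> writers =
                    ImageIO.getImageWriters(rgba(dataType), formatName);
            if (writers.hasNext()) {
                return dataType;
            }
        }
        return -1;
    }
}
```

Note this only asks whether a writer accepts the complete image type; it can't ask which individual depths or channel layouts a format supports.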
Well, things are a bit mixed ... to say the least.
PNG isn't too bad - it's saving as 16 bit per channel too - but JPEG does weird things. Actually I was surprised it even saved a 4 channel image - I expected it to fail. What's strange is that Java loads this weird 4 channel JPG just fine as an RGBA image but nothing else does. Java's in the wrong here I'm pretty sure?
Unfortunately the image writer interface doesn't let you find out what type of individual things the image format supports and all you can do is query if it supports writing a given complete format. I guess i'll just have to hardcode it for every format I will support - I suppose I need to anyway to handle the meta-data properly and stuff like compression options. The standard JRE doesn't support much anyway and besides I don't need to either. I suppose i'll use OpenRaster as the 'native' format, although TBH anything with XML and interoperability mentioned together just sounds like a nightmare.
Hmm, then there's all the state and metadata to muck about with from inside the app. Oh won't that be fun.
Storing everything as RGBA also complicates matters ... although I can cheat a bit and put options on the per-format requester for target bit depth & channels saved.
I got a fufut.
I had a go at porting Apple's OpenCL FFT to JOCL and coming up with a simple demo to add to the JOCL demo project.
Rotated image from previous post with rotated motion blur. The image on the left is used to generate the convolution kernel.
So here's a screenshot of a 'blur tool thing' I came up with. Because of screen-size I forced it to only process a 512x512 image, but even at 1024x1024 it does the convolution as fast as the mouse can send events (the raw GPU time for a 1024x1024 convolution is about 1200 µs per plane, excluding data conversion). I had previously written a separable convolution for OpenCL and this is about on par with the 63x63 convolution processing time - but isn't limited to separable convolutions or small kernels (e.g. a separable kernel can't do a rotation like the one above). It does take somewhat more processing to build the kernel though, since it's the same size as the image, but that's something easily off-loaded to the GPU as well.
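For anyone who hasn't seen the trick: transform the image and a kernel padded to the same size, multiply them element by element, and transform back. Below is a tiny 1D CPU sketch of that idea with a textbook recursive radix-2 FFT. It is not the Apple OpenCL FFT or the JOCL port, the names are made up, and it assumes power-of-two lengths and circular (wrap-around) convolution.

```java
/** Hypothetical 1D illustration of FFT-based circular convolution. */
final class FftConvolve {
    /** Recursive radix-2 FFT, in place. Length must be a power of two. The inverse is left unscaled. */
    static void fft(double[] re, double[] im, boolean inverse) {
        int n = re.length;
        if (n == 1) {
            return;
        }
        int h = n / 2;
        double[] evenRe = new double[h], evenIm = new double[h];
        double[] oddRe = new double[h], oddIm = new double[h];
        for (int i = 0; i < h; i++) {
            evenRe[i] = re[2 * i];
            evenIm[i] = im[2 * i];
            oddRe[i] = re[2 * i + 1];
            oddIm[i] = im[2 * i + 1];
        }
        fft(evenRe, evenIm, inverse);
        fft(oddRe, oddIm, inverse);
        double sign = inverse ? 1.0 : -1.0;
        for (int k = 0; k < h; k++) {
            double angle = sign * 2.0 * Math.PI * k / n;
            double wr = Math.cos(angle), wi = Math.sin(angle);
            double tr = wr * oddRe[k] - wi * oddIm[k];
            double ti = wr * oddIm[k] + wi * oddRe[k];
            re[k] = evenRe[k] + tr;
            im[k] = evenIm[k] + ti;
            re[k + h] = evenRe[k] - tr;
            im[k + h] = evenIm[k] - ti;
        }
    }

    /** Circular convolution of two equal-length, power-of-two sized real arrays. */
    static double[] convolve(double[] signal, double[] kernel) {
        int n = signal.length;
        double[] sr = signal.clone(), si = new double[n];
        double[] kr = kernel.clone(), ki = new double[n];
        fft(sr, si, false);
        fft(kr, ki, false);
        for (int i = 0; i < n; i++) {
            // Complex multiply in the frequency domain.
            double r = sr[i] * kr[i] - si[i] * ki[i];
            double m = sr[i] * ki[i] + si[i] * kr[i];
            sr[i] = r;
            si[i] = m;
        }
        fft(sr, si, true);
        for (int i = 0; i < n; i++) {
            sr[i] /= n; // scale the unnormalised inverse
        }
        return sr;
    }
}
```

The cost doesn't depend on what the kernel looks like, which is why an arbitrary rotated motion blur is no slower than a simple box.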
More bits. And pieces.
I don't know why the default size of the Java file requester is so unusably small - perhaps it is some microsoft legacy, but it's really quite nice when it's sized up a bit, and set to the vertical list mode. If only you could drive it from the keyboard ...
Well I added an image preview to the one in ImageZ anyway. The info area is a bit naff but I just grabbed some of the generic meta-data from the ImageReader. The images are loaded/iconised in the background, of course. I first tried using a SwingWorker and cancelling it if a new image was browsed - the problem is that the image loading ignores the interruption and keeps going until it's just about finished anyway. So it didn't really suit.
Then I tried writing a loader using the lower-level api's which have an explicit cancel request. Still no go - it just ignores the abort until it's finished anyway. Damn. So I gave up and just launch a new thread every time. If the thread gets interrupted/cancelled I quit the thread as soon as I can, otherwise the image is loaded, scaled, and then dropped into the preview box. Given it loads the full image to create the icon (if there's no thumbnail in the image, which there usually isn't), I should hook the loaded (or loading) image to the OK button too - images will load immediately then.
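The thread-per-request approach might look something like this. It's a sketch with made-up names, not the ImageZ code: the loader is any function that produces the scaled icon, and the result is only delivered if nobody asked for a different image in the meantime. In a real Swing app the delivery step would presumably go through SwingUtilities.invokeLater.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical "latest request wins" preview loader. */
final class PreviewLoader<T> {
    private Thread current;

    /**
     * Starts a new thread for this request and interrupts the previous one.
     * The load itself may well ignore the interruption and run to the end;
     * the flag is checked afterwards so a stale result is simply dropped.
     */
    synchronized Thread request(Supplier<T> load, Consumer<T> deliver) {
        if (current != null) {
            current.interrupt();
        }
        Thread worker = new Thread(() -> {
            T result = load.get();
            if (!Thread.currentThread().isInterrupted()) {
                deliver.accept(result);
            }
        });
        worker.setDaemon(true);
        current = worker;
        worker.start();
        return worker;
    }
}
```

There is still a small window between the check and the delivery, and nothing here limits how many threads pile up if the images are huge, which is the edge case that still needs handling.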
For 'normal' sized images it's quite quick anyway although I guess it needs to handle the edge cases to avoid a backlog of threads. Actually loading a bunch of images using multi-select is pretty fast - loading all of the 14 images so far from this series of posts at once is under a second - at least 3x faster than GIMP on this box. Maybe it's the threading (and yes, not entirely fair, a full app vs a toy).
I also hacked up a very nasty bit of code which lets tools add information to the display window itself.
e.g. the current brush 'shape'.
Right now the tool has to handle calculating the dirty regions and other crap so I need a better solution, but having it there will let me investigate.
On the other hand it is pretty flexible. I thought i'd start poking at my 'affine' tool - trying a rotation to start with. First just to see how slow it would be, and then to try ways of controlling it. The tool just tells the image window to repaint every time the angle changes, and then its paint callback just blats the image over the top. And it's not too shabby by default speedwise - even the bicubic filter is around the same speed as GIMP's nearest neighbour for medium sized images.
Then I played with a few other ideas - in the above shot the image is scaled down first, set to acceleratable, and then drawn as the mouse moves (using nearest-neighbour in this shot). i.e. it runs very very quick once you start rotating the image (as fast as the mouse sends events). Unfortunately for very large images the initial scaling can be quite slow, although it is probably still worth it. Scaling down large BufferedImages seems unusually slow so perhaps scaling it myself in integer increments would be faster for things like this where I just need an approximation. Just 'scaling' the image by 1, and converting from a `BufferedImage' to a `Toolkit' `Image' makes a huge difference too as it presumably gets some hardware acceleration.
As can be seen from the shots I also worked out some problems with the Nimbus theme - its JTable widget handles the default renderer types a little differently so I had to change the column type to an ImageIcon rather than just an Icon for the layer icon. Nimbus looks ok - the scrollbars are a bit ugly/out of character and the general spacing is a little loose but it looks modern enough. And I added a little info box to the bottom of the toolbox. Now what to put in the white empty area ...
And I fixed some bugs in the handling of layer translation. I haven't hooked any tools up for that yet but it should hopefully just work when I get to that point.
The GIMP's taken to hiding all the control boxes when I change virtual desktops now, and won't let me move them to other desktops either. I wonder if it's trying to tell me something.
XBMC BeagleBoard GSOC '10 Wrap Up
GSOC 2010 is coming to an end and the final assessments have been made, so it's about time I posted an update on the result and my experiences.
The Story So Far ...
XBMC had been compiled to run on the BeagleBoard but wasn't really practical for use. The menus and video ran poorly at under 10fps even for low-resolution video and in general it wasn't usable.
Part of the reason is because the rendering system is written as a game loop - everything is drawn every frame and it relies on relatively powerful video hardware to ensure it runs at a reasonable rate.
To speed up the menus and video playback to a point where it might be usable, or at least demonstrate it could be possible.
The initial aim was to try to reduce load on the graphics rendering subsystem by reducing the amount of work it was given. Ideally working towards an event driven widget system but at least not drawing things that haven't changed from frame to frame. Although this was much of the initial proposal and most of the time spent, the resultant improvements were only modest. The menus did speed up some but the full-screen video wasn't markedly changed since usually there are no graphics to draw at the same time anyway.
Two other suggestions became key to improving the performance. One was to use the video overlay hardware in the beagleboard to perform the video to rgb conversion and image scaling. The other was to come up with a more modest theme which didn't tax the system so much - reducing animations and large background images.
The good news first. You can now play small videos smoothly and navigating the menus on their own is also responsive enough to be usable. Just on this alone the project was clearly a success - 11 weeks is really a very short amount of time to do anything much and certainly not enough time to debug someone else's software.
Whilst playing a video in the background the menus still have some issues - things slow down quite badly and the video overlay (well, under-lay) doesn't work terribly well. I think the custom theme should help this though.
The main menu in the default theme.
Unfortunately I hit a pretty major bug when trying the custom theme made for GSOC, so I had to leave it with the (mostly) original theme. This appears to be some issue inside the deep lower-bowels of XBMC.
Default theme problems
Although even the default theme had some major problems. The above is the video source menu - but it's only showing the text and you cannot see the mouse when it is stationary, and the keyboard does not highlight the current menu item.
Not being an XBMC user i'm not terribly familiar with the controls - which makes it a bit difficult to use when you can't see them much of the time. I seem to hit a lot of stability problems when trying to quit a video and try another one. And if things don't shut down cleanly you can be left with a frame of video or a blank screen covering everything.
Playing at least a low-resolution video from local media is quite fine - there is no tearing, the scaling looks quite acceptable, and the video and audio generally keep pretty good sync. The above screenshot is whilst playing a 640x480 24 fps mpeg-1 video, and at least it's got enough headroom to keep the video playing.
Playing music seems to have an unusually high cpu usage - but still under 50%. And it plays quite well across the network from a mediatomb server I have.
So for comparison here is my favourite reliable video player mplayer's top results on the same video - so there's still obviously some way to go on the video pipeline front since they're both using ffmpeg for the decoding (afaict). Although mplayer tears noticeably so isn't terribly great either.
Also for comparison, mplayer is able to play 576p (720x576x25) recordings from a digital tuner (mpeg2) without dropping frames - and under 90% cpu utilisation. And I believe that is only using the CPU (and presumably Xvideo overlays). XBMC cannot keep up with this and doesn't degrade terribly gracefully (audio stutters, video remains black, the menus become unresponsive).
Where to now?
Obviously there is still a lot of work to be done before it's practical as a media player. If the issues with the themes and some general stability problems were fixed it could at least be used in a limited way - e.g. as a remote player for a small screen or an analogue tv (and the lower resolution is a big help for performance of the front end). With more work it could certainly make it as an SD media player and an HD one (720p) for the Beagleboard XM (the next-gen one that was just released a few days ago). The DSP is also sitting idle so the hardware is capable of quite a bit more.
From a very limited amount of profiling I did, it appears the XBMC codebase is littered with snippets of less than ideal code which eventually add up. For example the background images are scaled using a simple double for loop and put/get pixel routines - which gets very slow as the resolution of the interface is scaled up (simple fix here is theme tweaking). The audio is being remixed when I believe it doesn't need to be since the audio bitrate is flexible, and so on and so forth.
Reports from Tobias suggest there are opportunities for tuning of the calls to the graphics library to improve performance. e.g. splitting up large textures into smaller segments and so on. Some of the GL library is still in software so ideally you'd avoid those code-paths. The XM board also has updated hardware which implements more of the API directly so has different tuning characteristics and just runs faster.
And from the way it runs I gather the media loop is still a bit too tightly coupled to the front-end. Given that the video is being rendered on a separate surface to the menus there's the potential for them to be completely de-coupled although apparently that isn't going to be easy.
XBMC is a very big piece of software that appears to be muddled together from a lot of separate pieces tied together with python. Although a lot of the code seems pretty decent I wonder if it isn't straining a little under its own weight, particularly on a machine with limited capabilities. And well, python ...
My own experiences
GSOC was I guess an 'interesting' experience. I think what made it most strange for me was it was for software of which I am not a maintainer (or even contributor or user). Not having intimate knowledge of the software and specific internal goals for it made it hard to judge the direction things should move in. Fortunately Tobias had a better idea of that than me, was good at finding things out on his own, and took directions well. And I know a thing or two about GUI programming, and a bit about video, so I still had some important tips to offer. I got side-tracked with work and other projects as well and the timezone difference (~10 hours) didn't help so I didn't really have as much time as I thought I might when I put my name down initially - but I don't think that had any effect on Tobias's progress and I made a point to keep up with the few mail conversations we had.
I also had a strange problem for which I still haven't identified the source - all of a sudden the code just wouldn't run any more. Odd errors about pixel shaders not compiling and bogus messages in the log file. I tried recompiling multiple times, updating the os, upgrading the os - in the end changing to another spare board fixed it. I'm not convinced it's a hardware issue since everything else was working ok but haven't had the time to track it down.
The BeagleBoard/TI fellows were decent and quite helpful. Jason deserves special mention trying to manage this all for the first time and doing a pretty good job. Others helped chairing the IRC meetings when he was unavailable. Which can't have been that fun as nobody seemed terribly attentive at them.
The Google side of things wasn't exactly inspiring - not a great deal of direction on things such as grading (although again, a project one was more involved in would help), and an utterly dreadful piece of software they use to manage it. It was almost enough to want to give up before it even started but thankfully after using it constantly during that time, you only had to use it a couple more times after that. As a general mentor (not the boss-man) there wasn't really any direct interaction with Google anyway.
For students - make sure you're doing something you're actually interested in. And you really should have some free software experience under your belt already and ideally with the project in question. Experience with source control, email and IRC and remote development is pretty important. I can't imagine too many mentors would, like me, merely do it because they could and out of curiosity - they probably want a real outcome and hopefully a long-term contributor to their pet project. And make sure you really have the time to dedicate to it - university, family, or other personal commitments can very quickly eat into the limited amount of time you have available.
For mentors - choose wisely. From the sounds of it the beagleboard projects all went pretty well, and there was certainly a lot of deliberation over which projects to choose. Also, don't waste time on students who can't even be bothered to submit a full proposal to start with or who have ideas a little too crazy (Beagleboard seemed to get a lot of these - I think because it had a hardware/software component and only a few specifically targeted software projects). I'd personally avoid anyone who looks like they're just doing it for school, or the pittance of money on offer.
I'm kind of in two minds whether the whole thing itself is a terribly good idea. If you need such encouragement to discover the joy of hacking then maybe it isn't for you - most projects are looking for programmers all the time so it isn't hard to find something to play with. Although I guess they aren't always all terribly welcoming. Perhaps on the last point it has some merit since it forces organisations to get their shit together a bit wrt novice contributors although I imagine for many of them it isn't actually worth the effort.
Bits and pieces
Finally a day with a bit of sunshine ... did a backlog of washing, pulled weeds for 3 hours, mowed the nature strip and started emptying out a garden shed I need to move. All in all a productive day.
I spent most of the rest of the week since the last post hacking away on ImageZ, although I took it a bit easier with not so many late nights, and had room to fit in an extended binge drinking session with a couple of mates. I guess next week I'll be back to work again.
Tough life for some ...
I bit the bullet and put a 'tool bar' in. I kinda hate those things ... but the alternative was lots of popup windows so I don't really think I had much choice. Thing is if you're working with layers you really need them handy so it seemed the obvious place to put it. Also means i can link them with the document and that simplifies a few things both for the code and using it. At least you can show/hide the whole thing with a single keypress. Still may change it to an internal window too. Not sure what else I will put there - I suppose something about the current tool may as well go there as well. Which might mean less need for popup tool selectors ...
I added ellipse selection - very easy. And hooked up a few menus for select all/clear/invert selection. I started looking at editing the 'current' ellipse/rectangle but I haven't gotten very far yet - it adds two invisible hit-boxes at the start/end point that lets you drag it around so the mechanics are there, I just need to present it somehow. Since I need similar facilities for structured objects I may be able to re-use code or at least how it works.
I fixed most of the state tracking issues - now the window that contains the drawing surface is listening to mouse events and re-routing them to the current tool, rather than the tool listening itself.
Found a workable method for the menu item actions to find out which window invoked them, so I added accelerator keys to them, and added all the tools to a menu item too.
They `broke' again when I added the blend mode and opacity to the layer viewer (yeah I hooked those up too). Turns out I wasn't pre-multiplying alpha for the result (which wasn't necessary for the tool layer). So this has hopefully fixed all those problems up ... until the next lot come along. Oh also I had started with a checker-board pattern as the base data before starting the composition ... which was a big mistake even though it looks the same for normal blend modes. Now I just blend the result into a checker-board pattern once I have a composed result.
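That last step is just a normal 'over' with the checker-board as the opaque destination. A rough per-pixel sketch, assuming the composed result is premultiplied float RGBA; the class name, grey levels and square size are all made up for the example.

```java
/** Hypothetical per-pixel illustration of blending onto a checker-board. */
final class CheckerBlend {
    /** Grey level of the checker-board at (x, y); the two levels are arbitrary. */
    static float checker(int x, int y, int size) {
        return (((x / size) + (y / size)) & 1) == 0 ? 0.6f : 0.4f;
    }

    /**
     * Blends one premultiplied RGBA pixel over the opaque checker-board and
     * returns the RGB that ends up on screen. With premultiplied colour the
     * 'over' operator reduces to src + (1 - srcAlpha) * dst.
     */
    static float[] over(float[] rgbaPremultiplied, int x, int y, int size) {
        float background = checker(x, y, size);
        float keep = 1f - rgbaPremultiplied[3];
        return new float[] {
            rgbaPremultiplied[0] + keep * background,
            rgbaPremultiplied[1] + keep * background,
            rgbaPremultiplied[2] + keep * background,
        };
    }
}
```

Because the checker-board only appears at the very end, it never takes part in the layer blend modes.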
Moved the undo tracking to the image itself rather than have a global one. I'm still tossing up whether I use the Swing UndoManager (which lets me track state changes from other swing objects), or stick to my own which is simpler ...
Made a couple of other plain and sparse-tile backends for different data-types. Unfortunately the sparse int layer isn't any faster than the sparse float one since it has to go through the same generic code-paths, although I guess it uses less memory. Memory usage is a bit of a problem - it uses a lot. I guess with the GC though you can't do much about that and it's the price you pay for it running quickly. The 'native' int-based backend is very quick though. I still need to do a 64-bit backend (16 bit elements) but that's relatively simple. I stuffed up a bit and my layers are RGBA instead of ARGB - it doesn't really matter since I data-convert anyway ... but maybe it makes that less efficient. It is what OpenCL supports though.
Played with a lot of different ideas to do with threading. Right now drawing is on the event thread - if it gets too much to draw it starts dropping mouse events. I played with running the layer composite rebuild on another thread, and running the tool rendering on another thread. Hmm, various trade-offs here and I can't say i've settled on anything yet. Given that the custom image types are so slow to draw to using Graphics2D, I may have to consider using one of the built-in 8-bit types as a tool layer - for most operations this is more than adequate (the only real place I can think of it not being is with fine gradients).
Actually I want more of a 'structured graphics' layer. Well, ... that's really a whole project in itself. I just played a bit with the low-level text api's which can do word-wrap and so on. Text is always a bit of a pain and this isn't any exception. I'll surely be able to do everything I want, but there's a lot of api's to learn first and adding the right level of front-end is the hard bit.
Ultimately I want there to be a layer type that contains more than text - possibly multiple text and graphics objects. Yeah i'm dreamin here ...
Still no real save (I have some test code but it isn't hooked up). I came across the OpenRaster format, which is pretty much what I was going to do anyway - except the XML bit. So I'll probably use that for compound images. Friggan XML.
Still thinking about possibly doing a light version of the interface. Maybe drop out the layers and just support 8 bit RGBA (for memory use and simpler i/o). Mostly so I can test out ideas without being bogged down in complexity and try to get something I can use.
Clipboard, screen capture
Well after writing all the above and forgetting to post I looked into first capturing the screen and then when I found out that was so simple I moved onto the clipboard. Capturing the whole screen is one line of code, but trying to find where a given window is seems to be beyond Java. So I cheated a bit and invoke xwininfo and read the content from its stdout. That pops up a cross suspiciously like the one that GIMP uses to select frames for grabbing, and then spits out some window details. A simple loop that parses the lines and I extract the window bounds and now I can grab windows too, at least on GNU systems. I still need to add a little requester asking for some details but that's a piece of piss.
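The parsing loop is about as simple as it sounds. A sketch with a made-up class name, not the ImageZ code; it relies on the `Absolute upper-left X/Y', `Width' and `Height' lines that xwininfo prints by default and ignores everything else.

```java
import java.awt.Rectangle;

/** Hypothetical parser for the default text output of xwininfo. */
final class XwinInfoParser {
    /** Returns the window bounds found in the output, or null if any field is missing. */
    static Rectangle parseBounds(String output) {
        Integer x = null, y = null, width = null, height = null;
        for (String line : output.split("\n")) {
            String s = line.trim();
            if (s.startsWith("Absolute upper-left X:")) {
                x = intAfterColon(s);
            } else if (s.startsWith("Absolute upper-left Y:")) {
                y = intAfterColon(s);
            } else if (s.startsWith("Width:")) {
                width = intAfterColon(s);
            } else if (s.startsWith("Height:")) {
                height = intAfterColon(s);
            }
        }
        if (x == null || y == null || width == null || height == null) {
            return null;
        }
        return new Rectangle(x, y, width, height);
    }

    /** Parses the integer following the first ':' on the line. */
    private static int intAfterColon(String s) {
        return Integer.parseInt(s.substring(s.indexOf(':') + 1).trim());
    }
}
```

The text itself would come from running xwininfo via something like ProcessBuilder and reading its stdout; the resulting rectangle can then go straight to java.awt.Robot's createScreenCapture(Rectangle), which is the same call that grabs the whole screen when given the screen bounds.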
Merged ellipse fuzzy select pasted into everyone's favourite free image editor.
Clipboard support turned out to be almost as easy, so I added that too. I've done it before using gtk+ and basically it works the same way - negotiated content using mime types - but all the details are done for you and you end up with a BufferedImage if that's what you're after in a line of code.
I did come across one issue (bug?): if I alpha-select something in GIMP it pastes fine, but if I alpha-select something in ImageZ, paste to GIMP and then copy from GIMP again it comes back with no alpha by default. Looks like the image has been converted to low-quality JPEG as well. If I explicitly ask for a PNG file it works fine, but then I suspect it will force all selections to 8 bit even with internal copy/paste ... (assuming it isn't already). I guess nothing's perfect.
Well I guess that's enough for one weekend, off for a long ride in the hills to try to burn off some of this winter/hacking fat (although with beer, grog and pizza possibly at the end of it, it may not go far toward that end).
Mistakes and milestones
I had a bit more of a poke around the SampleModel and WritableRaster classes last night and worked out it wasn't too difficult to add some optimised code-paths for the sparse tile data-buffer. With that in place it's pretty comparable on small images to a simple flat array and there's still a few tweaks left. So today I filled that stuff out a bit and wrote some better getLine() code for the compositor and did a bit of profiling and playing.
Then I thought i'd tackle undo - just for the image edits. Turned out to be very simple and pretty easy (in hindsight perhaps not so surprising - it isn't like editing a tree where you have complex data-structures to manipulate, it's just a rectangle of bits). I just made a new version of the tool layer composition routine and had it calculate a delta as it went - and that delta is just stored in another sparse layer. Because the compositor is not sparse-aware I added a check in the sparse layer setLine() call to see if a non-existent tile was being written with a row of zeros and it does nothing if it was ... yeah it's not terribly efficient!
The delta along with some pointers to the relevant objects is then just pushed onto a stack. Undoing or redoing the delta is a simple matter of applying an addition or subtraction to the target region. Again it isn't sparse-aware but it could be made thusly without a lot of work. I'm just doing the edit undo globally for the application (which isn't right!) but that is easy to change.
Lastly I just tested to see how well the deltas compress. I added a step to compress (and discard) the sparse layer delta every time it is saved to the undo stack. For very small edits it's very good - from 5% or so, for larger single-colour paints up to about 20% of the original size. A full-sized (1024x700) wide-radius gaussian blur was more like 70% ... but it's probably still worth it. I'll have to find a way to lazily do it in the background from the undo manager though as it can start to take a while to run as the data size gets bigger.
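Stripped of the sparse layers, the delta idea fits in a few lines. This is an illustrative sketch on flat float arrays with made-up names; using java.util.zip.Deflater for the compression step is an assumption, since the post doesn't say which compressor was used.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

/** Hypothetical flat-array illustration of delta-based undo. */
final class DeltaUndo {
    /** delta = after - before, computed once when the edit is committed. */
    static float[] delta(float[] before, float[] after) {
        float[] d = new float[before.length];
        for (int i = 0; i < d.length; i++) {
            d[i] = after[i] - before[i];
        }
        return d;
    }

    /** Undo subtracts the delta from the current pixels. */
    static void undo(float[] pixels, float[] delta) {
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] -= delta[i];
        }
    }

    /** Redo adds the delta back on. */
    static void redo(float[] pixels, float[] delta) {
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] += delta[i];
        }
    }

    /** Deflates a delta for storage on the undo stack (assumed compressor). */
    static byte[] compress(float[] delta) {
        ByteBuffer raw = ByteBuffer.allocate(delta.length * 4);
        raw.asFloatBuffer().put(delta);
        Deflater deflater = new Deflater();
        deflater.setInput(raw.array());
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

One thing to watch with floats is that adding and subtracting a delta isn't guaranteed to round-trip bit for bit for arbitrary values, so an exact undo may want the delta stored with that in mind.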
I got a bit sidetracked trying to fit a colour selector and layer list into the paint pox ... the Java one is quite large. And so now the window is getting a bit big (maybe not too big though). I started working on my own version but i'm not sure it's a path I really want to go down. I'm also not sure what to do with the layer list - it's sort of something you need handy from every tool but I want to avoid having a separate window for it that is always around. Might be a job for an iframe ... maybe. Having one central layer list is a little clumsy if you use focus-follows-mouse like I do so it might make sense to have it per image somehow.
But I've already found the pox idea works a lot better on my laptop (only 1024x768 screen) than having a separate toolbox which is too easily lost or hidden.
I do need to rethink the way the tools track the current document. Right now there is a central model which tracks the application state. When the mouse enters a window it notifies the model of the change - but this is really broken (e.g. drag a paintbrush outside the current window over another ... nasty things happen). I did have it based on focus but that didn't seem very reliable either. I probably need to manually re-route window messages to the current tool rather than having the tool listen to the current window (which is clumsy as hell anyway). Menu items are still a bit of a pain since i'm using a single menu across all windows and the actions have no direct context to go by when they fire (although now i think about it, since i'm manually popping them up I already know where they came from).
Now i've pretty much got the 'guts' I need as a baseline it's probably time to fix these niggling issues and bed it down a bit more solidly. Which probably means things will shift into low-gear for a while since that stuff can get tedious. Not a huge amount to fix though.
Ahh the milestone. Broken 10KLOC, at least according to wc. I suppose that's ok for 4 weeks of spare time ... must be slipping in my old age.
Copyright (C) 2019 Michael Zucchi, All Rights Reserved.