Michael Zucchi

 B.E. (Comp. Sys. Eng.)


Sunday, 07 September 2008, 06:53

Chrome, again.

I recently posted my last entry on b.g.o, and I said I wasn't going to rant about what is wrong with the desktop (well I did before I deleted it). But maybe I should have, as with fortuitous timing, my second to last entry about Chrome should have reminded me what Chrome is capable of. I will only say in my defence that I was only considering Chrome as a browser, and maybe as an `ms office' replacement, and dismissing views otherwise (well that is how I use a browser).

First, some background. I had been noticing the trend to move toward Python in GNOME in particular, and I haven't liked it. I know why developers like it (well why they claim to like it), but as a user it leaves a lot to be desired - slow, extremely heavy applications, that too-often bomb out with meaningless backtraces. I had some ideas that could make it palatable to users (well, beyond just debugging), but it relied on some features which Python lacks, so I gave up thinking about it. But Python isn't the only problem.

The GNU desktop is in an awful state - and that's even if you stick to just one flavour and its attendant applications (I don't know about KDE, but the following is true of both GNOME and Xfce). If you take a default install of your average `distribution', for example, Ubuntu, after installing a rather large number of packages you end up with a pretty login window, and a relatively pretty desktop, and quite a few applications, from basic to outstanding, from buggy to stable. But what is behind the actual desktop? A mish-mash of random programmes the packager/desktop team determined to be useful for themselves or some mythical `average luser'. Some work well, some don't, some are necessary for the basic operation of the machine (auto-mounting and network selection), others are pure fluff, most are in-between. Also - it barely runs ok if you have only 256MB of memory, for example that `older machine' that GNU/Linux can supposedly take advantage of, or embedded/special machines, like a Playstation 3, both of which actually affect me.

One problem is that the `in thing' these days seems to be to write (or re-write!) many of the applets/applications that provide core desktop functionality using Visual BAS... oh oops ... Python. Now Python is a `scripting language'. This means that every time you run a Python app, it must compile the source-code into byte-code or perhaps machine code (I do not know if there are pre-compilers for it). This takes time, and it takes memory, and to do it well it can take a lot of memory and time, and this is one reason traditionally that developers had much beefier machines than users - because they're the only ones who had to do this step, once. If it only compiles to byte-code, then every basic instruction is emulated using a state machine - a 'virtual machine' (VM), which is at least an order of magnitude slower than the physical machine is. Any conversion to machine code and further optimisations which make the running speed faster, also generally cost in memory and cpu time during the compilation phase. For simple scripts and applications this is no big deal, but for more complex applications it can start to add up. Not only that, because many of the libraries themselves are written using the scripting language, every application which uses those libraries needs to recompile the same libraries every time they run - and more importantly store their own copy of the byte/machine code. I will also mention in passing that many of these `libraries' are just `wrappers' - glue code which just calls some `C' library to do the actual work; but someone has to write those too, so either the script engine `vendor' or the library `vendor' must expend additional resources (which wouldn't otherwise be needed) for this work, so the cost isn't borne solely by the users.
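To make the byte-code step a bit more concrete, here is a minimal sketch using CPython's standard `compile()` and `dis` facilities. The function `add_one` and the variable names are made up for illustration. One caveat I can state with confidence: CPython does cache the compiled byte-code of imported modules on disk (the `.pyc` files), so the compile step is not always repeated, but each running process still loads and keeps its own private copy of that byte-code in memory.

```python
import dis


def add_one(x):
    # A trivial function; its body is stored as byte-code, not machine code.
    return x + 1


# compile() turns source text into a code object holding byte-code.
code = compile("total = 1 + 2\n", "<example>", "exec")

# The interpreter loop (the 'virtual machine') then executes that byte-code.
namespace = {}
exec(code, namespace)

# List the VM instructions the function was compiled to. The exact opcode
# names vary between Python versions, so none are assumed here beyond the
# final return.
instructions = [ins.opname for ins in dis.get_instructions(add_one)]
```

Running `dis.dis(add_one)` at an interactive prompt prints the same listing in a readable form.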

Scripting languages are just fine for short-lived applications, they run, do their job, and finish, releasing the memory they used - even if it is excessive it doesn't usually matter. And often they are `batch' processes anyway - non-interactive programmes which run by themselves, and so long as they run to completion they needn't be particularly speedy. But now with applets and other trivial applications that run for the entire time you're at the computer, or they require interactive response, they are a potential disaster. You now have a separate VM for every application loaded, with all the non-shareable data that entails. Often scripting VMs haven't even been designed with this in mind, and in that case they may be quite cavalier with their use of memory because it isn't an issue for the workloads for which they were designed. Most of these languages use garbage collection too - but garbage collectors are quite hard to write properly, so there are often bugs, but even when those are all fixed, to get performance they generally need more total memory than they're actually using (sometimes by a lot, but often about twice). And again, all of this overhead needs to be duplicated for each VM running. Contrast that to say a C application. When an application is compiled in the normal way, all of the code, and all of the code of the libraries can be shared in memory. Far more time and memory can be spared during the compilation phase, since it is only done once. And explicit memory management at least forces you to think about it, even if you don't take advantage of that opportunity for thought (even if explicit memory management has spare/overheads for efficiency, it's a trade-off you can control). And finally, often the reason programmers use scripting languages in the first place is because they are easier - or to translate (in some cases) - they don't know any better. Although they may have the enthusiasm and the ideas, they may just not have the skills to pull it off properly.

Another problem affects all languages - that is the startup time/non-shared data overhead. Things such as font metric tables (sigh, and font glyph tables/glyph cache, now the font server has been basically dropped - remote X sucks shit now, even though networks are much faster), display information, other global state tables, and other data which is loaded at run-time, and could otherwise be shared among applications. This only gets worse when you have many versions of the same library present, and/or completely different libraries which do the same thing. Sure you can run a KDE application on a GNOME desktop, but it isn't at a zero cost, as even basic things like displaying a string of text involves an extraordinary amount of logic and data, little of which will be shared.

Having so many libraries to choose from, and indeed a continually changing set of libraries to choose from, is also a particular problem with GNU desktops (and Windows at least). Add to that - people keep coming up with their own `framework' which will `solve all the problems' in a specific domain, but all it really does is add yet another set of libraries (and versions over time) that we all have to put up with if we want to run a particular application that uses them (or worse, the poor developer is burdened with having to develop and maintain yet-another backend when they could be doing real - and more importantly, interesting - work). Even if the one library is the one everyone uses, new versions seem to come out every year or so.

So the result is, that in 2008 we have a desktop with barely more features than one in 2000, yet consuming far more resources. Tiny little applets which could just as easily have been written in any language, are dragging in millions of lines of code and megabytes of memory by virtue of being written in a scripting one. Lots of libraries - many which do the same thing, even just different versions of the same one - often end up being installed as well.

There are at least a couple of ways to get around the scripting problem, and they also cover the shared state and libraries breeding like fundie children as well. If you're not using scripting they don't help - but shared state could be addressed using traditional IPC mechanisms (i.e. use a server), but because of the complexity this is often not done. Fixing the breeding library problem in general is tricky - each library needs to be far more disciplined in its design, and make use of ld features for backward/forward compatibility if required. Some duplication is still necessary - competition is generally good - although perhaps application developers should avoid using every new library that comes out just because it is new and promises to abolish world hunger.

First possibility, you have a separate process that compiles and executes all scripts - a script `application server', in today's language. For a stand-alone script, a small client uploads/tells the server which script to execute, and the server sends the results back to the client using queues and/or rpc. Because the scripts are executed in the same address space, they can share libraries, the garbage collector, and other resources. You also have the benefit that if you want to extend your application with scripting facilities, any application can use the same mechanism to run their own scripts. This could also provide a powerful system whereby you can write meta-applications, talking between applications as well, if you design the system properly. Threading is an issue - but it's an issue that only has to be solved once, by people who probably have an idea, rather than clueless application programmers.
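A rough sketch of that first idea, purely as an illustration: one long-lived interpreter takes script source from a request queue, runs it against a set of libraries that were loaded once, and hands the result back on a reply queue. Everything here is hypothetical (the `ScriptServer` class, the convention that a script leaves its answer in a variable called `result`), and a thread stands in for what would really be a separate process talking over IPC.

```python
import math
import queue
import threading


class ScriptServer:
    """Hypothetical script host: one interpreter, many client scripts."""

    def __init__(self, shared_libs):
        self.shared_libs = shared_libs      # loaded once, visible to every script
        self.requests = queue.Queue()
        self.worker = threading.Thread(target=self._serve, daemon=True)
        self.worker.start()

    def _serve(self):
        while True:
            source, reply = self.requests.get()
            if source is None:              # shutdown marker
                reply.put(("ok", None))
                return
            # Fresh globals per script, but the library objects are shared.
            env = dict(self.shared_libs)
            try:
                exec(compile(source, "<script>", "exec"), env)
                reply.put(("ok", env.get("result")))
            except Exception as exc:
                # A broken script must not take the server down with it.
                reply.put(("error", repr(exc)))

    def run(self, source):
        """What the small client does: send the script, wait for the reply."""
        reply = queue.Queue(maxsize=1)
        self.requests.put((source, reply))
        return reply.get()

    def stop(self):
        self.run(None)


server = ScriptServer({"math": math})
rounded = server.run("result = math.floor(3.7)")
broken = server.run("result = 1 / 0")
server.stop()
```

Note the single worker thread: scripts run one at a time, which side-steps the threading issue rather than solving it. A real server would also need proper isolation between scripts, which `exec` does not provide.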

The other way is to move your applications to the (one) server. All applications simply run in the same VM/address space, and again all code and much data can easily be shared among applications. Where you need additional non-scripted facilities you either build them in/use plugins, or use IPC mechanisms. And you only have to do it once too. Although meta-application programming is certainly possible, it would have to be an additional layer or protocol that needn't be there by design. And you can't really write an application that has a scripting `extension mechanism' either - since the app is the script.

The first way is sort of how AREXX worked. It can be quite simple, yet very powerful. Nobody wrote applications in AREXX, but they did write meta-applications which literally let completely unrelated applications `talk' to one another. The second way, if taken to the extreme, is something like JavaOS or that M$ thingy that does the same thing.

Hmmm. So I guess one potential realisation of the second idea is Chrome. It isn't a browser, it's an application framework, or rather, an os-independent application execution environment, a meta-operating system if you will. The sort of thing Java was capable of, but didn't work so well because it was too fine grained/no central server. The sort of thing Flash is basically doing now, although it's too buggy and also no central server. Probably the closest is the sort of thing GNOME was originally envisioned to be (as I fuzzily remember it - the NOM in GNOME) before being down-graded to basically a Gtk theme - although the glandular-fever infected among them are still thinking along those lines, I think. The sort of thing Firefox always claimed to be, but you couldn't take seriously because we all know what a bloaty pig's bum it was, and still is, even though they've made great strides in the swine's bun-tone. Well, at least the process model in Chrome makes sense now.

So watch out GNOME and KDE and Xfce. All of those little crapplets that deal with no or small amounts of data - they can all be re-written as trivial JavaScript applications, and probably with network transparency built in (I haven't mentioned `google gadgets', because it should be obvious this is one and the same thing). e.g. post-it notes, a desktop clock/calendar which links into your planner, rss aggregators, umm, whatever it is people run on their desktop, file browsers aren't much different from an internet browser either. So maybe the `start menu' (for native apps) can't be written - well, yet - because of the OS integration, so that is safe for now. Still, who knows, they've got the sandboxing, so there will perhaps be a mechanism for privilege escalation as well, and it can be made as secure as yum or apt-get (i.e. not very). If they implement a VIDEO tag, and SVG properly, with any luck Flash and M$' flash knock-off can get the bullets in the head they deserve as an added bonus. Good riddance to bad rubbish there.

Ok, so perhaps I was wrong in my second to last post on b.g.o. Chrome isn't just another featureless webkit browser after all (although it is still too featureless for me). But it isn't just Firefox that has to fear from another browser, it is not just desktop applications that have to fear from another browser, it is the desktop as we have come to know it - and thank fuck for that too.

Ahh well, maybe that isn't the idea `they' had. It has the potential though, if the VM and GC are as good as the claims on the box. And if Google doesn't do it, someone else can - because it's free software.

Wednesday, 20 August 2008, 13:11

A Hacker's Introduction

Hello once more, or for the first time.

I figure that as I am no longer a part of the GNOME community, do not use most of their software, and have no interest in it, my diary on blogs.gnome.org is no longer particularly appropriate. So here is yet-another web diary to add to the 3 or so stale ones I have lying about the place. I will start with a little introduction and history - since I don't have any great code ideas to share today. It isn't in strictly chronological order, but it should cover the important bits.

So, ... the story so far ...

The first computer my family owned was a Commodore 64. We just played some games copied from friends for the most part, and typed in the odd BASIC programme from magazines and books. I dabbled a little bit with programming - entering sprites using hand-encoded binary and making farty beeps through the SID chip. When one of my brothers bought another one, which came with a disk drive and a pile of magazines, it opened up a whole new world.

After typing in some 'acceleration' libraries from a Compute!'s Gazette (iirc) magazine I discovered the wonders of machine code, and subsequently assembly language. I typed in an assembler from one of the magazines - it extended BASIC with mnemonics, and you had to implement the 3 (or more) passes using FOR loops(!) - and then taught myself 6502 assembly language from the related article and tiny snippets I found in various magazines. We lived in the country and had no access to bulletin board systems or much information, so it was all down to my own curiosity and probing. I can't remember how I learnt the rest of the mnemonics, perhaps a book in the town library or more magazine articles, but I ended up fairly proficient at it. I also can't remember where I got it from - perhaps I typed it in from a magazine too - but I ended up with a 'machine code monitor' as well (a debugger/disassembler) which let me disassemble demos from magazine cover disks and learn further. I even wrote my own disassembler (in assembly language of course) and dumped the entire BASIC and Kernel ROMs to a printer to learn further (reading the code subsequently I couldn't understand most of it ever again). Other bits and pieces were an interpolating printer driver for GEOS, and lots of other little useless toys, graphics ('vector' graphics, sprite multiplexers, raster interrupt stuff), and sound routines (a primitive sequencer iirc). I still used the machine to type my first few essays for uni.

Of course as a Commodore-head I dreamed of an Amiga, and finally got one during my first year at uni. Of course the first thing I did was borrow a book on 68K and hand-assemble a matrix multiplication routine to accelerate some 3d graphics in the horrid BASIC the Amiga came with - at least it came with a language though. I'd heard about Fish disks from magazines, so I soon got the Fish disk with A68K on it, and slowly collected other tools and got to work - learning M68K and hacking up code on a floppy-based system. And yet it booted faster than my latest GNU machine does (and just as well, it sure rebooted a lot more often too) and the editor was more responsive. Graphics routines, interrupt queue-based blitter 'engines', 2d stuff, 3d stuff, even a mod player routine. The golden age when all was learning and no other distractions like life to get in the way. Using snippets from magazines and books (the university necessarily had a much better stock of technical books) I read up on assembly language and hardware and i/o registers and all the rest.

Then I got a modem. At this point the internet didn't really exist - usenet was around, and ftp was just starting to pick up - but you could only get it at uni. That opened up access to other software, and other people, and I got in touch with a 'demo group' who wanted a coder for demos. We never really did anything to speak of, but we had a lot of fun being creative, and trying to mix real-time code, graphics and music together. Along the way I got an Amiga 1200 - having a hard drive was a nice step up.

It all led up to a fateful Easter Long Weekend where I managed to stay up for 73 hours straight working on our demo, attended the demo competition, got disqualified for a silly optimisation mistake which had it fail on the target hardware, slept an hour on a plastic chair with my head on a Laminex table, had an external floppy drive stolen, set up for a video presentation at a rave/dance party, blew the 'blue' output of my 1200 (everything turned a yucky mustard grey/yellow) from a dodgy home-made video cable which shorted out, stayed up all night helping run the visuals at the party, and nearly had several accidents being driven home by a friend who all but fell asleep at the wheel mid Monday morning. I think I've been tired ever since, but maybe that's just a coincidence (sleep apnoea diagnosed years later). Although I kept coding on it, and even bought new hardware, I think the golden age had passed - for me and Commodore.

I got the ROM Kernel Reference Manuals and started writing more OS-friendly code rather than hardware-banging stuff. It was still in assembly language though. A little multi-threaded file manager I never finished, my MOD player routine came along about then. I dabbled a little in extending Amiga E - but couldn't maintain interest - my contribution was a 3-d library written in assembly language. Amiga E was a bit like Pascal but had an outrageously fast compiler - 45K of machine code - written by the clever and nice bloke who also wrote the first BrainFuck compiler - 1024 bytes of compiler which generated executables directly. I wrote a freeware extension to the AmigaOS's multi-media platform - datatypes - which read and displayed GIF files much faster than anything else. I learnt a lot of stuff about the importance of latency vs throughput, asynchronous I/O, threading and optimising code (it was all in assembly language), even what OO is all about.

Linux started to show up around me about then, along with the growing internet. Many GNU tools made their way to AmigaOS too, and I started to learn about the FSF (at first I couldn't believe anyone would give away such software - let alone the source-code, none of which I could use anyway). Eventually a mate had a cheap motherboard going and this new 'Linux' thing seemed to be getting more usable, so I bought a PeeCee and stayed up all one night with him trying to get RedHat 3 (I think?) going on it. After failing to get anywhere, I tried Slackware and it worked. Suddenly I was in the world of pain that is the IBM compatible PC! Things get a bit hazy there. Work, life, and whatnot, too many very late nights doing geeky stuff with mates or IRC or other stuff. I can't remember what I used to use the Linux machine for other than internet access and compiling applications. I had learnt C by this time but I can't remember if I even wrote any software - other than for work, which was the occasional portability patch or hacked up Perl scripts. With a bunch of mates I helped run an ISP for a while, I lost some money, some lost skin through stress. A bit of a dark period I guess, which put me off the idea of ever wanting to run a business.

After the positive experience with the gif 'datatype', and sending in the odd patch here and there at work, I slowly became more interested in writing software as a hobby for free again, but this time including the source-code (I regret now that I never released the zgif.datatype source-code, such as it was). The KDE project looked very interesting - but I don't know if it was the GPL issue back then or perhaps C++ scared me off, but I never tried it again. I became involved in the GNOME project around 1998 - working on what would become the (now long-gone) second iteration of the GNOME Terminal application.

I started work on libzvt because I had had the idea from some terminal application on the Amiga which didn't have to 'copy' the lines it displayed. It just used Copper (a video co-processor) tricks to scroll by just telling the video DMA which lines to display, rather than re-ordering them so they displayed in the right order. Of course, on an X Windows display no such facilities were available, but I used similar ideas to optimise the screen update and minimise memory allocations - and by the end it was a pretty bloody fast and solid piece of code. Since I did much of my work on a Solaris box - which had an awfully slow malloc - re-using allocated memory was a big win. Unfortunately nobody else working on the code-base ever understood why I did it that way, so patches came along and removed this desirable behaviour, and slowly I lost interest in maintaining it - because of a bit of a loss of direction for it, because I was too busy on Evolution by then, and because my ownership of the application had been undermined by someone who wouldn't let me do a few things I wanted. However, before then my work on libzvt/gnome-terminal got me a job working for Ximian (as #8) on this new 'Evolution' application.
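The general trick can be sketched in a few lines. This is not the libzvt code, just an illustration of the idea as described above: the line buffers are allocated once, scrolling only moves an index saying which buffer is currently the top row, and the line that scrolls off is wiped and recycled as the new bottom row. The `ScrollBuffer` class and its method names are invented for the example.

```python
class ScrollBuffer:
    """Hypothetical terminal back-buffer that scrolls without copying lines."""

    def __init__(self, rows, cols):
        self.cols = cols
        # Every line is allocated exactly once, up front.
        self.lines = [[" "] * cols for _ in range(rows)]
        self.top = 0    # which stored line is shown as screen row 0

    def row(self, y):
        # Map a screen row to a stored line by offsetting from 'top'.
        return self.lines[(self.top + y) % len(self.lines)]

    def scroll_up(self):
        # Wipe the old top line in place; it becomes the new bottom line.
        recycled = self.lines[self.top]
        for i in range(self.cols):
            recycled[i] = " "
        self.top = (self.top + 1) % len(self.lines)

    def write(self, y, text):
        line = self.row(y)
        for i, ch in enumerate(text[:self.cols]):
            line[i] = ch

    def text(self, y):
        return "".join(self.row(y)).rstrip()


buf = ScrollBuffer(rows=3, cols=8)
buf.write(0, "one")
buf.write(1, "two")
buf.write(2, "three")
ids_before = [id(line) for line in buf.lines]

buf.scroll_up()             # "one" scrolls off the top
buf.write(2, "four")        # the recycled line is now the bottom row
ids_after = [id(line) for line in buf.lines]
```

No line contents are moved by `scroll_up()`, and no memory is allocated or freed after construction, which is the part that matters when malloc is slow.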

Actually I never really had much of an interest in Evolution (I was a die-hard Elm user!), but I ended up working on it for 6 years. Rather interesting times. Working for a startup like that was really a unique experience. We worked like maniacs for the first year and produced tons of code - I don't know about the other guys but I never really expected to make gazillions like some other startups, but it would've been a nice bonus - so it wasn't the money driving me. I'm not sure I'd do it all again but certainly some of it was worth it. I do wish I'd known what I know now about programming and design, although I don't think we really made too many mistakes given what we were working with and aiming for. I always disagreed with the clone-MS aspect of the project - we all thought we could have done better, and I still think we could have. I would also have used CORBA more, and more effectively - but without a fully working ORB in the early days, and since most of us were busy just getting things working, I guess it was always going to end up the way it did. I also would have kept things simpler - but it is hard to know how simple something should be before you've done it once or twice. I think the project worked quite well considering few of the developers resided in the same city as any other, let alone the same timezone. Although it wasn't all sweetness and light ... I had a lot of nasty conflicts with inexperienced management (not all their fault I'm sure) and was wildly out of touch with many goings on by virtue of being literally on the other side of the planet. It was probably saved by virtue of the code-base being easily compartmentalised into chunks one or two developers could tackle in almost total isolation. And I got along reasonably well with Jeff on the mail component once he came along.

Novell buying Ximian was a double-edged sword. On the one hand they had money to keep paying us, on the other they started pushing uninteresting things like Groupwise backends into the mix. We already had a pretty low opinion of Groupwise from working around its external protocol bugs (not as bad as Exchange mind you), and they threw inexperienced programmers at the task who didn't write very good code to start with, so the opinion didn't go up. Also being part of a larger organisation meant simple one-on-one management and a flat management structure were out the window. Now you had the over-paid baggage of a HR department breathing down your neck for pointless 'objective management tool' reports which they made you fill out on fear of death (which I might add you needed internet explorer to access properly - or maybe I'm confusing that with the expenses system which needed the MS proprietary Java). Other niceties were the yearly business ethics forms you had to fill out to cover their legal arses, and being told in no uncertain terms that you were there to work for the company and they had no obligation to give anything back in return. Even if they paid well, this was a foreign (and offensive - workers are not slaves) concept having grown up in a blue-collar slightly-pinko family during the Hawke/Keating years.

Ahh well. Anyway, we kept struggling on. I wrote a lot of code which never got used (actually some of it did eventually, in a changed form anyway), which is a pity, but by the end I was very burnt out on the project. Novell didn't really have anything I wanted to work on, or wouldn't let me work on projects I thought were potentially interesting at the time (e.g. Mono). And I couldn't come up with a project that they would let me work on either. I despised HR, and no longer had any interest in Evolution. So after a fairly extended and quite complete hand-over period of the Evolution code-base to the Bangalore team (I think I did quite a good job - although compared to any previous hand-over efforts, any job would've been an improvement), a fortuitously timed redundancy let me let go of all that and move forward. I found the hand-over quite cathartic - during the last few weeks, as I 'brain-dumped' 6 years of background into a few wiki pages, I felt more relief as each paragraph and section ticked off. Since then I haven't even run Evolution - I still cannot bring myself to, nearly 3 years later.

Since then, I've been working (for money) on a .NET, WPF desktop application! Bletch. Well, at least I can now say with confidence that MS stuff is awful. What a badly documented, sourceless, buggy, slow, dead-end piece of still-born technology this is. Well it pays the bills, but I will have to see what happens after this - I don't think I can keep it up.

Apart from work, I was pretty burnt out after leaving the Evolution project. I had a fairly long break from just about any sort of hobby coding. I'm still not sure where I'm headed. I've become more 'militant' about Free Software - I never liked 'open source' but now I see it as highly damaging. Commercial interests are muddying the waters and trying to impose their corrupt way of business onto Free Software, and it really stinks. I joined the FSF.

I dabbled with AROS a little bit - but although they seem like a nice bunch of guys, I couldn't get terribly inspired. I wrote them an AVL library though - and that got me interested in C again for a while. I also dabbled with some other ADTs, and memory allocation routines, just some nice raw code to try to blow the c-hash cobwebs out of my head - and it was a true joy to return to Emacs. I played with literate programming systems along the way, but I can't make it work for me - the authoring is ok but debugging and writing libraries is not so nice. I even read a bit about Ada and Scheme and Lisp, although none of those inspired me either. I looked at writing a vorbis decoder. I wouldn't mind learning more about signal processing, and I just wondered if I could do it from the spec. I didn't get very far though - it has quite a nasty bit-stream format, and that scared me away. I did quite a bit of work on a content management system/blogging thing. I had some ideas on database versioning and document processing I wanted to play with, but have exhausted most of them now. Cheap branches, automatic indexing/toc/cross reference generation, web-friendly and print-friendly and not too author-unfriendly. I will get back to it if it becomes fun again, but who knows when that will be.

I have a Playstation 3 with Ubuntu installed, and have written a few little CELL routines. That is a lot of fun - I really love the architecture - it deserves success outside of super-computers and game consoles. But I can't really think of anything useful to do with it yet. At least, something useful I can do without it being too big a project for one man and his spare time to contemplate. I have some job-queue stuff, a bi-linear up-scaling routine and YUV to RGB converter. Just having that patched into mplayer makes quite a difference.

Well, that pretty much gets us to today. As for the future, well more to come no doubt.

Tagged biographical.
Copyright (C) 2018 Michael Zucchi, All Rights Reserved. Powered by gcc & me!