About

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

Tuesday, 15 May 2018, 19:00

Backend stuff

Winter has hit here and along with insomnia i'm not really feeling like doing much of an evening but i've dabbled a few times and basically ported the Java version of a tree-revision database to C.

At this point i've just got the core done - schema/bindings and most of the client api. I'm pretty sure it's solid but I need to write a lot of testing and validation code to make sure it will be reliable and performant enough, and then write a bunch more to turn it into something interesting.

But i've been at a desk for 10 hours straight and my feet are icy cold so it's not happening tonight.

Tagged hacking, zedzone.
Tuesday, 15 May 2018, 18:39

Evolution and S/MIME

So I noticed there was an S/MIME security fault in a bunch of email software - including Evolution.

Now my memory is a bit faded because it was 15+ years ago but I'm pretty sure we wrote the code to handle this case (mostly Larry and Jeff). For this, each decoded segment was displayed separately with a special gtkhtml tag to reset the html parser between blocks. It might have only been at the signature level, so I could be wrong, but in general it didn't just dump the whole email to HTML, for all sorts of reasons. The MIME parser could handle all sorts of broken streams so truncated HTML was expected to come up once in a while.

Of course that must've all been thrown away when the renderer was replaced by the 'better' renderer from apple going by some of the reports of the 'vulnerability'.

Not that i've ever used S/MIME or gpg - it's pretty much useless to me since nobody I know knows how to use it and hardly anyone uses email these days anyway.

I was also horrified to see that evolution now uses cmake. Well just as well I completely ignored the project after I took a voluntary redundancy ... I would've gone absolutely ballistic! Not that compiling with libtool didn't suck complete arse but at least it worked.

But GNOME was already going to shit back before I quit, both due to redhat throwing their weight around and Miguel being such an obnoxious microsoft fanboi. Haven't touched it in any meaningful way (or Evolution) in over a decade and all I see of it is going backwards by continuously copying the next shitty GUI-trend-of-the-month and/or being bullied into shitty designs by a bunch of fuckwits.

Tagged rants.
Sunday, 13 May 2018, 12:12

Oops

Had a bug in my fastcgi code that broke the blog for some web clients depending on their ID string. It just happened to break on mobile phones more often. Oops.

Tagged zedzone.
Friday, 11 May 2018, 11:20

King PUSS

Some photos of the cat.

He's a bit of a pretty-boy but he's smarter than he looks.

Ostensibly his name is Cooper (as in Cooper's Original Pale Ale).

But I just call him cat.

Tagged biographical.
Monday, 07 May 2018, 21:00

c dez port

I had a couple of hours to burn Sunday morning so I ported over the rest of the dez code to C, although I didn't feel like testing it till I had some hours to burn today.

Anyway, I fixed some bugs and ran some tests. It's only about 30-50% faster than the Java version on the bible test for practical "limit" values. The patches generated aren't necessarily identical because of some minor changes in the hash table design but the differences are minor. The C code also requires some more bounds and error checking for robustness.

I also added CRC32 checksums to the file format as a quick check that the input and output aren't corrupted.
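As a rough illustration of that kind of check, here is a minimal sketch of a CRC32 routine in C. The post doesn't say which CRC32 variant the file format uses or where the value is stored, so this assumes the common reflected polynomial 0xEDB88320 (the one zlib and gzip use), and `crc32_update` is a hypothetical name rather than anything from the dez source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 over a buffer, assuming the reflected polynomial
 * 0xEDB88320 (zlib/gzip style).  Pass 0 as crc for the first chunk and
 * the previous return value for any following chunks.
 */
static uint32_t crc32_update(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    assert(p != NULL || len == 0);

    crc = ~crc;                       /* undo the final inversion of the last call */
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* shift one bit out; xor in the polynomial if that bit was set */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

A writer would run this over the input and output as they stream through and store the results with the patch; a reader recomputes them and refuses the result on a mismatch. A table-driven version would be faster but this is enough for a quick corruption check.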

Tagged dez.
Saturday, 05 May 2018, 16:36

cdez + other stuff

I started porting dez to C to look at using it here somewhere. Along the way I found a bug in the matcher implementation but otherwise got very distracted trying to gain a few negligible percent out of the delta sizes by manipulating the address encoding mechanism.

I tried modifying the matcher in various ways - experimenting with the hash table details. These involved including the hash value (i.e. to reduce spurious string matching - it just slows it down) or using a separate index table (no real difference). Probably the most surprising was that the performance was already somewhat better than covered in the dez benchmarks. Both considerably faster processing and smaller generated deltas. I guess that must have been an earlier implementation and I need to update them. For example the bible compression test only takes 11 seconds and creates a 1 566 019 byte delta - or 65% of the runtime at 90% of the output size.

This inspired me to play with the chain limit tunable - which sets how deep the hashtable chain gets before it starts to throw away older values. Using a setting of 5 (32 depth) it just beats the previous published results but in only 0.7s - still somewhat slower than 0.1s for gzip but at least it's not out of the range of practicality. This is where I found the bug in the entry discard indexing which was an easy fix.
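To make the chain limit idea concrete, here is a minimal sketch in C of a hash table whose per-bucket chain is capped at `1 << limit` entries, overwriting the oldest entry once full. This is not the dez data structure: the names (`chain_table`, `chain_add` and so on), the flat ring-buffer layout and the power-of-two bucket count are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: every bucket is a small ring of source offsets. */
typedef struct {
    uint32_t *slots;    /* nbuckets * depth offsets */
    uint32_t *count;    /* total inserts per bucket (wrap-around ignored here) */
    uint32_t nbuckets;  /* power of two */
    uint32_t depth;     /* 1 << limit, e.g. limit 5 gives 32 */
} chain_table;

static int chain_init(chain_table *t, unsigned bucket_bits, unsigned limit)
{
    assert(bucket_bits < 32 && limit < 16);
    t->nbuckets = 1u << bucket_bits;
    t->depth = 1u << limit;
    t->slots = calloc((size_t)t->nbuckets * t->depth, sizeof *t->slots);
    t->count = calloc(t->nbuckets, sizeof *t->count);
    if (!t->slots || !t->count) {
        free(t->slots);
        free(t->count);
        return -1;
    }
    return 0;
}

static void chain_free(chain_table *t)
{
    free(t->slots);
    free(t->count);
}

/* Record an offset under a hash; once the chain is full the oldest is lost. */
static void chain_add(chain_table *t, uint32_t hash, uint32_t offset)
{
    uint32_t b = hash & (t->nbuckets - 1);
    uint32_t i = t->count[b] & (t->depth - 1);

    t->slots[(size_t)b * t->depth + i] = offset;
    t->count[b]++;
}

/* Number of candidates currently kept for a hash. */
static uint32_t chain_len(const chain_table *t, uint32_t hash)
{
    uint32_t b = hash & (t->nbuckets - 1);

    return t->count[b] < t->depth ? t->count[b] : t->depth;
}

/* k-th candidate for a hash, where k = 0 is the most recently added. */
static uint32_t chain_get(const chain_table *t, uint32_t hash, uint32_t k)
{
    uint32_t b = hash & (t->nbuckets - 1);

    assert(k < chain_len(t, hash));
    return t->slots[(size_t)b * t->depth + ((t->count[b] - 1 - k) & (t->depth - 1))];
}
```

The matcher only ever has to walk at most `depth` candidates per hash, which is where the speed comes from; the cost is that matches against older parts of the source are forgotten once a chain fills up.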

This does mean that the other timings I did are pretty much pointless though - using a larger block search size than 1 just produces so much worse results and it's still slower. I haven't tried with a large source input string however, where a chain limit will truncate the search space prematurely.

Then I spent way too much time and effort trying various address encoding mechanisms to try to squeeze a little bit more out of the algorithm. In the end although I managed to get about 2.5% best case improvement in some cases I doubt it's really worth worrying about. However some of the alternative address encoding schemes are conceptually and mechanically simpler so I might use one of them (and break the file format).

Because of all that faffing about I never really got very far with the cdez conversion although I have the substring matcher basically done which is the more complex part. The encoding/decoding code is quite involved but otherwise straightforward bit bashing.

Update: I tried a different test - one where i simulated the total delta size of encoding 180 revisions of jjmpeg development - not a particularly active project but still a real one. The original encoding is easily the best in this case.

bloggone

For some reason the blog went offline for a few hours. It kept getting segfaults in libc somewhere. All I did to fix it was run make install (which simply copied the binary into the cgi directory and didn't rebuild anything) and it started working again. Unfortunately I didn't think to preserve the binary that was there to find out why it stopped working.

Something to keep an eye on anyway.

Tagged dez, zedzone.
Sunday, 29 April 2018, 12:02

BDB | !BDB?

I mentioned a few posts ago that there don't seem to be many NoSQL databases around anymore - at least last time I looked a year or two ago, all the buzz from a decade ago had gone away. Various libraries became proprietary-commercial or got abandoned.

For some reason I can't remember I went looking for BerkeleyDB alternatives and hit this stackoverflow question which points to some of them.

So I guess I was a little mistaken, there are still a few around, but not all are appropriate for what I want it for:

I guess the best of those is LMDB - i'd come across it whilst using Caffe but never looked into it. Given its roots in replacing BDB it has enough similarities in API and features to be a good match for what I want (and written in a sane language) although a couple of niggles exist such as the lack of sequences and all the fixed-sized structures (and database size). Being a part of a specific project (OpenLDAP) means it's hit maturity without features that might be useful elsewhere.

The multi-version concurrency control and so on is pretty neat anyway. No transaction logs is a good thing. If I ever get time I might play with those ideas a little in Java - not because I necessarily think it's a great idea but just to see if it's possible. I played with an extensible hash thing for indexing in camel many years ago but it was plagued by durability problems.

Back to LMDB - i'll definitely give it a go for my revisioned database thing - at some point.

Tagged hacking, zedzone.
Saturday, 28 April 2018, 14:54

https, TLS upgrade

Ahah, so it seems things have changed a bit since last I looked into certificates and certificate authorities - and even then I was looking into code and email signing certs anyway.

After a short poke around I quickly became aware of the Let's Encrypt project which provides automated and free server domain certificates. It can be automated because you control the server and part of the issuing process creates temporary server resources that the signer can cross-check. And all the certs are created locally.

So after a bit of fudging around with the C-based acme client and some apache config I got it all turned on and (compatible) browsers automagically redirecting to the TLS protected url.

Yay.

I didn't want to go with the official CertBot because python isn't otherwise installed on this server and I didn't want to drag all that snot in for no other reason.

Because the acme-client is a little out of date I had to pass it a few extra parameters to make it create certificates (and had to do some small porting related changes to it using libressl rather than libopenssl).

acme-client \
  -ahttps://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf \
  -C/var/zedzone/acme \
  -vNn \
  zedzone.space www.zedzone.space code.zedzone.space

Once created, a daily cron job runs it (without the -vNn options) which requests new certificates if the old ones are within a month of their expiry date (since the Let's Encrypt certificates only last for 90 days).
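The renewal rule itself is just date arithmetic. As a minimal sketch in C, assuming "within a month" means 30 days (the function and macro names here are made up for illustration and are not taken from acme-client):

```c
#include <assert.h>
#include <time.h>

#define CERT_LIFETIME_DAYS 90  /* Let's Encrypt certificate lifetime */
#define RENEW_WINDOW_DAYS  30  /* assumption: "within a month" taken as 30 days */

/* Returns 1 if a certificate expiring at not_after should be renewed at now. */
static int needs_renewal(time_t not_after, time_t now)
{
    /* the window has to be shorter than the lifetime or every run would renew */
    assert(RENEW_WINDOW_DAYS < CERT_LIFETIME_DAYS);

    /* difftime gives the seconds left until expiry (negative if already expired) */
    return difftime(not_after, now) < RENEW_WINDOW_DAYS * 86400.0;
}
```

With a daily run this means a fresh certificate is left alone for roughly the first 60 days and then replaced on the first run after that, leaving about a month of slack if a renewal fails.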

I then added an https server config:

<VirtualHost www.zedzone.space:443>
    ServerName www.zedzone.space

    ...

    SSLEngine on
    SSLCertificateFile      /etc/ssl/acme/cert.pem
    SSLCertificateKeyFile /etc/ssl/acme/private/privkey.pem
    SSLCertificateChainFile /etc/ssl/acme/fullchain.pem
    SSLUseStapling on

    Header always set Strict-Transport-Security "max-age=31536000"
    Header always set Content-Security-Policy upgrade-insecure-requests
</VirtualHost>

And finally I added another header to the main (http) server which tells compatible clients to upgrade to use https. This can be a bit odd on the first access but thereafter it does the right thing. I hope!

<VirtualHost www.zedzone.space:80>
    ServerName www.zedzone.space

    ...

    Header always set Content-Security-Policy upgrade-insecure-requests
</VirtualHost>

I didn't want to use a rewrite rule because at the moment I want to keep both urls active, but i might change that in the future. It seems like it might be useful - on the other hand any client anyone is likely to use will support TLS won't it?

I've left code.zedzone.space unencrypted for now (even though it's currently the only part of the site that can be logged into!) because I need to check things work with virtual servers on https first and more importantly i'm too hungover to care this fine yet overcast afternoon!

Update: For what it's worth, the server gets an A+ rating on ssllabs SSL Server Test at the time of posting. Although to get the score above B required a few mod_ssl config changes.

Tagged zedzone.
Copyright (C) 2018 Michael Zucchi, All Rights Reserved. Powered by gcc & me!