About

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

Tags

android (44)
beagle (63)
biographical (82)
business (1)
code (56)
cooking (29)
dez (6)
dusk (30)
ffts (3)
forth (3)
free software (4)
games (32)
gloat (2)
globalisation (1)
gnu (4)
graphics (16)
gsoc (4)
hacking (414)
haiku (2)
horticulture (10)
house (23)
hsa (6)
humour (7)
imagez (28)
java (216)
java ee (3)
javafx (48)
jjmpeg (67)
junk (3)
kobo (15)
linux (3)
mediaz (27)
ml (15)
nativez (3)
opencl (117)
os (17)
parallella (97)
pdfz (8)
philosophy (26)
picfx (2)
politics (7)
ps3 (12)
puppybits (17)
rants (134)
readerz (8)
rez (1)
socles (36)
termz (3)
videoz (6)
wanki (3)
workshop (2)
zedzone (13)
Tuesday, 23 October 2018, 18:47

Fast incremental Java builds with openjdk 11 and GNU make

This post wont match the article but I think i've solved all the main problems needed to make it work. The only thing missing is ancestor scanning - which isn't trivial but should be straightforward.

Conceptually it's quite simple and it doesn't take much code but bloody hell it took a lot of mucking about with make and the javac TaskListener.

I took the approach I outlined yesterday, I did try to get more out of the AST but couldn't find the info I needed. The module system is making navigating source-code a pain in Netbeans (it wont find modules if the source can't). Some of the 'easy' steps turned out to be a complete headfuck. Anyway some points of interest.

Dependency Tracking

Even though a java file can create any number of classes one doesn't need to track any of the non top-level .class files that might be created for dependency purposes. Any time a .java file is compiled all of it's generated classes are created at the same time. So if any java file uses for example a nested class it only need to track the source file.

I didn't realise this at first and it got messy fast.

Modified Class List

I'm using a javac plugin (-Xplugin) to track the compilation via a TaskListener. This notifies the plugin of various compilation stages including generating class files. The painful bit here is that you don't get information on the actual class files generated, only on the source file and the name of the class being generated. And you can't get the actual name of the class file for anonymous inner classes (it's in the implementation but hidden from public view). In short it's a bit messy getting a simple and complete list of every class file generated from every java file compiled.

But for various other reasons this isn't terribly important so I just track the toplevel class files; but it was a tedious discovery process on a very poorly documented api.

When the compiler plugin get the COMPILATION finished event it uses the information it gathered (and more it discovers) to generate per-class dependency files similar to `gcc -MD'.

Dependency Generation & Consistency

To find all the (immediate) dependencies the .class file is processed. The ClassInfo records provide a starting point but all field and method signatures (descriptors) must be parsed as well.

When an inner class is encountered it's container class is used to determine if the inner class is still extant in the source code - if not it can be deleted.

And still this isn't quite enough - if you have a package private additional class embedded inside the .java file there is no cross-reference between the two apart from the SourceFile attribute and implied package path. So to determine if this is stale one needs to check the Modified Class List instead.

The upshot is that you can't just parse the modified class list and any inner classes that reference them. I scan a whole package at a time and then look for anomilies.

One-shot compile

Because invoking the compiler is slow - but also because it will discover and compile classes as necessary - it's highly beneficial to run it once only. Unfortunately this is not how make works and so it needs to be manipulated somewhat. After a few false starts I found a simple way that works:

The per-module rules are required due to the source-tree naming conventions used by netbeans (src/[module]/classes/[name] to build/modules/[module]/[name]), a common-stem based approach is also possible in which case it wouldn't be required. In practice it isn't particularly onerous as I use metamake facilities to generate these per-module rules automatically.

I spent an inordinate amount of time trying to get this to work but kept hitting puzzling (but documented) behaviour with pattern and implicit rule chaining and various other issues. One big one was using concrete rules (made files) for tracking stages, suddenly everything breaks.

I resorted to just individual java invocations as one would do for gcc, and trying the compiler server idea to mitigate the costs. It worked well enough particularly since it parallelises properly. But after I went to bed I realised i'd fucked up and then spent a few hours working out a better solution.

Example

This is the prototype i've been using to develop the idea.

modules:=notzed.proto notzed.build

SRCS:=$(shell find src -name '*.java')
CLASSES:=$(foreach mod,$(modules),\
 $(patsubst src/$(mod)/classes/%.java,classes/$(mod)/%.class,$(filter src/$(mod)/%,$(SRCS))))

all: $(CLASSES)
        lists='$(foreach mod,$(modules),$(wildcard status/$(mod).list))' ; \
        built='$(patsubst %.list,%.built,$(foreach mod,$(modules),$(wildcard status/$(mod).list)))' ; \
        files='$(addprefix @,$(foreach mod,$(modules),$(wildcard status/$(mod).list)))' ; \
        if [ -n "$$built" ] ; then \
                javac -Xplugin:javadepend --processor-module-path classes --module-source-path 'src/*/classes' -d classes $$files ; \
                touch $$built; \
                rm $$lists ; \
        else \
                echo "All classes up to date" ; \
        fi

define makemod=
classes/$1/%.class: src/$1/classes/%.java
        $$(file >> status/$1.list,$$<)

$1: $2
        if [ -f status/$1.list ] ; then \
                javac --module-source-path 'src/*/classes' -d classes @status/$1.list ; \
                rm status/$1.list ; \
                touch status/$1.built ; \
        fi
endef

$(foreach mod,$(modules),$(eval $(call makemod,$(mod),\
  $(patsubst src/$(mod)/classes/%.java,classes/$(mod)/%.class,$(filter src/$(mod)/%,$(SRCS))))))

-include $(patsubst classes/%,status/%.d,$(CLASSES))

In addition there is a compiler plugin which is about 500 lines of standalone java code. This creates the dependency files (included at the end above) and purges any stale .class files.

I still need to work out a few details with ancestor dependencies and a few other things.

Tagged java.
Monday, 22 October 2018, 17:42

Java 11 Modules, Building

I'm basically done modularising the code at work - at least the active code. I rather indulgently took the opportunity to do a massive cleanup - pretty well all the FIXMEs, should be FIXMEs, TODOs, dead-code and de-duplication that's collected over the last few years. Even quite a few 'would be a bit nicers'. It's not perfect but it was nice to be able to afford the time to do it.

I'm still trying to decide if I break the projects up into related super-projects or just put everything in the single one as modules. I'm aiming toward the latter because basically i'm sick of typing "cd ../blah" so often, and Netbeans doesn't recompile the dependencies properly.

I'm going to reset the repository and try using git. I don't like it but I don't much like mercurial either.

Building

At the moment I have a build system which uses make and compiles at the module level - i.e. any source changes and the whole module is recompiled, and one can add explicit module-module dependencies to control the build order and ensure a consistent build.

One reason I do this is because there is no 1:1 correspondance between build sources and build classes. If you add or remove nested or anonymous (or co-located) classes from a source file that adds or removes .class files which are generated. So to ensure there are no stale classes I just reset it on every build.

This isn't too bad and absolutely guarantees a consistent build (assuming one configures the inter-module dependencies properly) but the compiler is still invoked multiple times which has overheads.

Building Faster

Really the speed isn't a problem for these projects but out of interest i'm having a look at a couple of other things.

One is relatively simple - basically use JSR-199 to create a compiler server something like the jdk uses to build itself.

The more complicated task is incremental builds using GNU Make. I think I should be able to hook into JavacTask and with a small bit of extra code create something akin to the "gcc -MD" option for auto-generating dependencies. It has the added complication of having to detect and remove stale .class files, and doing it all in a way that make understands. I've already done a few little experiments today while I was procrastinating over some weeding.

Using JavacTask it is possible to find out all the .class files that are generated for a given source file. This is one big part of the puzzle and covers the first-level dependencies (i.e. %.java: %.class plus all the co-resident classes). One can also get the AST and other information but that isn't necessary here.

To find the other dependencies I wrote a simple class file decoder which finds all the classes referenced by the binary. Some relatively simple pattern matching and name resolution should be able to turn this into a dependency list.

Actually it may not be necessary to use JavacTask for this because the .class files contain enough information. There is extra overhead because they must be parsed, but they are simple to parse.

Tagged java.
Thursday, 11 October 2018, 20:29

Concurrent Hash Tables

So I got curious about whether the GC in NativeZ would cause any bottlenecks in highly contested situations - one I already faced with an earlier iteration of the library. The specific case I had was running out of memory when many threads were creating short-lived native objects; the single thread consuming values from the ReferenceQueue wasn't able to keep up.

The out of memory situation was fairly easily addressed by just running a ReferenceQueue poll whenever a new object is created, but I was still curious about the locking overheads.

A field of tables

I made a few variations of hash tables which supported the interface I desired which was to store a value which also provides the key itself, as a primitive long. So its more of a set but with the ability to remove items by the key directly. For now i'll just summarise them.

SynchronisedHash

This subclasses HashMap and adds the desired interfaces, each of which is synchronised to the object.

Rather than use the pointer directly as the key it is shifted by 4 (all malloc are 16-byte aligned here) to avoid Long.hasCode() pitfalls on pointers.

ConcurrentHash

This subclasses ConcurrentHashMap and adds the desired interfaces, but no locking is required.

The same shift trick is used for the key as above.

PointerTable

This was the use-weakreference-as-node implementation mentioned in the last post - a simple single-threaded chained hash table implementation. I just synchronised all the entry points.

A load factor of 2.0 is used for further memory savings.

CopyHash

This is a novel(?) approach in which all modifications to the bucket chains is implemented by copying the whole linked list of nodes and replacing the table entry with the new list using compare and set (CAS). A couple of special cases can avoid any copies.

In the event of a failure it means someone else updated it so it simply retries. Some logic akin to a simplified version of what ConcurrentHashMap does is used to resize the table when needed, potentially concurrently.

It uses a load factor of 1.0.

I also tested a fast-path insert which doesn't try to find an existing value (this occurs frequently with the design) but that so overloaded the remove() mechanism it wasn't actually a good fit!

ArrayHash

This is similar to CopyHash but instead of a linked list of container nodes it either stores the value directly in the table, or it stores an array of objects. Again any modifications require rewriting the table entry. Again resize is handled specially and all threads can contribute.

This also has a small modification in that the retry loop includes a ficonacci-increasing Thread.sleep(0,nanos) delay if it fails which may slow down wall-clock time but can improve the cpu load.

I had some ideas for other approaches but i've had enough for now.

Tests

I created a native class which allocates and frees memory of a given size, and a simple 'work' method which just adds up the octets within. I ran two tests, one which just allocated 16 bytes whic immediately went out of scope, and another which allocated 1024 bytes, added them up (a small cpu-bound task), then let it fall out of scope.

Note that lookups are never tested in this scenario apart from the implicit lookup during get/put. I did implement get() - which are always non-blocking in the case of the two latter implementations (even during a resize), but I didn't test or use them here.

I then created 8 000 000 objects in a tight loop in one or more threads. Once the threads finish I invoke System.gc(), then wait for all objects to be freed. The machine I'm running it on has 4 cores/8 threads so I tried it with 1 and with 8 threads.

Phew, ok that's a bit of a mouthful that probably doesn't make sense, bit tired. The numbers below are pretty rough and are from a single run using openjdk 11+28, with -Xmx1G.

                   8 threads, alloc  8 threads, alloc+sum        1 thread, alloc     1 thread, alloc+sum
                     x 1 000 000           x 1 000 000            x 8 000 000             x 8 000 000
                  Elapsed     User      Elapsed     User       Elapsed     User        Elapsed     User
		   
 SynchronisedHash    8.3       38           11       54           8.4       27             16       40
 ConcurrentHash      6.9       35           11       52           8.2       27             17       38

 PointerTable        7.2       26           13       44           7.9       17             19       29
 CopyHash            6.6       31            8.2     42           8.1       24             18       33
 ArrayHash           6.0       28            8.2     39           8.5       23             16       24
 ArrayHash*          6.9       23           12       30           8.1       21             17       23

		    * indicates using a delay in the retry loop for the remove() call
  

To be honest I think the only real conclusion is that this machine doesn't have enough threads for this task to cause a bottleneck in the hash table! Even in the most contested case (alloc only) simple synchronised methods are just about as good as anything else. And whilst this is represenative of a real life scenario, it's just a bloody pain to properly test high concurrency code and i'm just not that into it for a hobby (the journey not the destination and so forth).

I haven't shown the numbers above but in the case of the Copy and Array implementations I count how many retries are required for both put and get calls. In the 8-thread cases where there is no explicit delay it can be in the order of a million times! And yet they still run faster and use less memory. Shrug.

All of the non java.util based implementations also benefit in both time and space from using the primitive key directly without boxing and storing the key as part of the containee object and not having to support the full Collections and Streams interfaces. PointerTable also benefits from fewer gc passes due to not needing any container nodes and having a higher load factor.

BTW one might note that this is pretty bloody slow compared to C or pure Java - there are definitely undeniably high overheads of the JNI back and forths.

The other thing of note is that well hashtables are hashtables - they're all pretty good at what they're doing here because of the algorithm that they share. There's not really all that much practical difference between any of them.

But why?

I'm not sure what posessed me to look into it this deeply but I've done it now. Maybe i'll post the code a bit later, it may have enough bugs to invalidate all the results but it was still a learning experience.

For the lockless algorithms I made use of VarHandles (Java 9's 'safe' interface to Unsafe) to do the CAS and other operations, and some basic monitor locking for coordinating a resize pass.

The idea of 'read, do work, fail and retry' is something I originally learnt about using a hardware feature of the CELL SPU (and Power based PPU). On that you can reserve a memory location (128 bytes on the SPU, 4 or 8 on the PPU), and if the ticket is lost by the time you write back to it the write fails and you know it failed and can retry. So rather than [spin-lock-doing-nothing WORK unlock], you spin on [reserve WORK write-or-retry]. I guess it's a speculative reservation. It's not quite as clean when using READ work CAS (particularly without the 128 byte blocks on the SPU!) but the net result is (mostly) the same. One significant difference is that you actually have to write a different value back and in this instance being able to merely indicate change could have been useful.

ConcurrentHashMap does something similar but only for the first insert into an empty hash chain, after that it locks the entry and only ever appends to the chain.

Some of the trickiest code was getting the resize mechanism to synchronise across threads, but I really only threw it together without much thought using object mointors and AtomicInteger. Occasionally i'm gettting hangs but they don't appear to make sense: some number of threads will be blocked waiting to enter a synchronised method, while a few others are already on a wait() inside it, all the while another thread is calling it at will - without blocking. If I get keen i'll revisit this part of the code.

Tagged java, nativez.
Tuesday, 09 October 2018, 10:07

Other JNI bits/ NativeZ, jjmpeg.

Yesterday I spent a good deal of time continuing to experiment and tune NativeZ. I also ported the latest version of jjmpeg to a modularised build and to use NativeZ objects.

Hashing C Pointers

C pointers obtained by malloc are aligned to 16-byte boundaries on 64-bit GNU systems. Thus the lower 4 bits are always zero. Standard malloc also allocates a contiguous virtual address range which is extended using sbrk(2) which means the upper bits rarely change. Thus it is sufficient to generate a hashcode which only takes into account the lower bits (excluding the first 4).

I did some experimenting with hashing the C pointer values using various algorithms, from Knuth's Magic Number to various integer hashing algorithms (e.g. hash-prospector), to Long.hashCode(), to a simple shift (both 64-bit and 32-bit). The performance analysis was based on Chi-squared distance between the hash chain lengths and the ideal, using pointers generated from malloc(N) for different fixed values of N for multiple runs.

Although it wasn't the best statistically, the best performing algorithm was a simple 32-bit, 4 bit shift due to it's significantly lower cost. And typically it compared quite well statically regardless.

static int hashCode(long p) {
    return (int)p >>> 4;
}

In the nonsensical event that 28 bits are not sufficient the hash bucket index it can be extended to 32-bits:

static int hashCode(long p) {
    return (int)(p >>>> 4);
}

And despite all the JNI and reflection overheads, using the two-round function from the hash-prospector project increased raw execution time by approximately 30% over the trivial hashCode() above.

Whilst it might not be ideal for 8-bit aligned allocations it's probably not that bad either in practice. One thing I can say for certain though is NEVER use Long.hashCode() to hash C pointers!

Concurrency

I also tuned the use of synchronisation blocks very slightly to make critical sections as short as possible whilst maintaining correct behaviour. This made enough of a difference to be worth it.

I also tried more complex synchronisation mechanisms - read-write locks, hash bucket row-locks and so on, but it was at best a bit slower than using synchronize{}.

The benchmark I was using wasn't particularly fantastic - just one thread creating 10^7 `garbage' objects in a tight loop whilst the cleaner thread freed them. No resolution of exisitng objects, no multiple threads, and so on. But apart from the allocation rate it isn't an entirely unrealistic scenario either and i was just trying to identify raw overheads.

Reflection

I've only started looking at the reflection used for allocating and releaseing objects on the Java side, and in isolation these are the highest costs of the implementation.

There are ways to reduce these costs but at the expense of extra boilerplate (for instantiation) or memory requirements (for release).

Still ongoing. And whilst the relative cost over C is very high, the absolute cost is still only a few hundred nanoseconds per object.

From a few small tests it looks like that maximum i could achieve is a 30% reduction in object instantiation/finalisation costs, but I don't think it's worth the effort or overheads.

Makefile foo

I'm still experiemnting with this, I used some macros and implicit rules to get most things building ok, but i'm not sure if it couldn't be better. The basic makefile is working ok for multi-module stuff so I think i'm getting there. Most of the work is just done by the jdk tools as they handle modules and so on quite well and mostly dicatate the disk layout.

I've broken jjmpeg into 3 modules - the core, the javafx related classes and the awt related classes.

Tagged java, jjmpeg, nativez.
Sunday, 07 October 2018, 08:51

GC JNI, HashTables, Memory

I had a very busy week with work working on porting libraries and applications to Java modules - that wasn't really the busy part, I also looked into making various implementation's pluggable using services and then creating various pluggable implementations, often utilising native code. Just having some (much faster) implementation of parts also opened other opportunities and it sort of cascaded from there.

Anyway along the way I revisited my implementation of Garbage Collection with JNI and started working on a modular version that can be shared between libraries without having to copy core object, and then along the way found bugs and things to improve.

Here are some of the more interesting pieces I found along the way.

JNI call overheads

The way i'm writing jni these days is typically just write the method signature as if it were a Java method and just mark it native. Let the jni handle Java to C mappings direclty. This is different to how I first started doing it and flies in the convention i've typically seen amongst JNI implementations where the Java just passes the pointers as a long and has a wrapper function which resolves these longs as appropriate.

The primary reason is to reduce boilerplate and signficiantly simplify the Java class writing without having a major impact on performance. I have done some performance testing before but I re-ran some tests and they confirm the design decisions used in zcl for example.

Array Access

First, I tested some mechanisms for accessing arrays. I passed two arrays to a native function and had it perform various tests:

  1. No op;
  2. GetPrimitiveArrayCritical on both arrays;
  3. GetArrayElements for read-only arrays (call Release(ABORT))
  4. GetArrayElements for read-only on one array and read-write on the other (call Release(Abort, Commit));
  5. GetArrayRegion for read-only, to memory allocated using alloca
  6. GetArrayRegion and SetArrayRegion for one array, to memory using alloca
  7. GetArrayRegion for read-only, to memory allocated using malloc
  8. GetArrayRegion and SetArrayRegion for one array, to memory using malloc

I then ran these tests for different sized float[] arrays, for 1 000 000 iterations, and the results in seconds are below. It's some intel laptop.

       NOOP         Critical     Elements                  Region/alloca             Region/malloc
            0            1            2            3            4            5            6            7
    1  0.014585537  0.116005779  0.199563981  0.207630731  0.104293268  0.127865782  0.185149189  0.217530639
    2  0.013524620  0.118654092  0.201340322  0.209417471  0.104695330  0.129843794  0.193392346  0.216096210
    4  0.012828157  0.113974453  0.206195102  0.214937432  0.107255090  0.127068808  0.190165219  0.215024016
    8  0.013321001  0.116550424  0.209304277  0.205794572  0.102955338  0.130785133  0.192472825  0.217064583
   16  0.013228272  0.116148320  0.207285227  0.211022409  0.106344162  0.139751496  0.196179709  0.222189471
   32  0.012778452  0.119130446  0.229446026  0.239275912  0.111609011  0.140076428  0.213169077  0.252453033
   64  0.012838540  0.115225274  0.250278658  0.259230054  0.124799171  0.161163577  0.230502836  0.260111468
  128  0.014115022  0.120103332  0.264680542  0.282062633  0.139830967  0.182051151  0.250609001  0.297405818
  256  0.013412645  0.114502078  0.315914219  0.344503396  0.180337154  0.241485525  0.297850562  0.366212494
  512  0.012669807  0.117750316  0.383725378  0.468324904  0.261062826  0.358558946  0.366857041  0.466997977
 1024  0.013393850  0.120466096  0.550091063  0.707360155  0.413604094  0.576254053  0.518436072  0.711689270
 2048  0.013493996  0.118718871  0.990865614  1.292385065  0.830819392  1.147347700  0.973258653  1.284913436
 4096  0.012639675  0.116153318  1.808592969  2.558903773  1.628814486  2.400586604  1.778098089  2.514406096

Some points of note:

I didn't look at ByteBuffer because it doesn't really fit what i'm doing with these functions.

Anyway - the overheads are unavoidable with JNI but are quite modest. The function in question does nothing with the data and so any meaningful operation will quickly dominate the processing time.

Object Pointer resolution

The next test I did was to compare various mechanisms for transferring the native C pointer from Java to C.

I created a Native object with two long fields, native final long p, and native long q.

  1. No operation;
  2. C invokes getP() method which returns p;
  3. C invokes getQ() method which returns q;
  4. C access to .p field;
  5. C access to .q field;
  6. The native signature takes a pointer directly, call it resolving the .p field in the caller;
  7. The native signature takes a pointer directly, call it resolving the .p field via a wrapper function.

Again invoking it 1 000 000 times.

NOOP         getP()       getQ()       (C).p        (C).q        (J).p        J wrapper
     0            1            2            3            4            5            6
0.016606942  0.293797182  0.294253973  0.020146810  0.020154508  0.015827028  0.016979563

In short, just passing Java objects directly and having the C resolve the pointer via a field lookup is slightly slower but requires much less boilerplate and so is the preferred solution.

Logger

After I sorted out the basic JNI mechanisms I started looking at the reference tracking implementation (i'll call this NativeZ from here on).

For debugging and trying to be a more re-usable library I had added logging to various places in the C code using Logger.getLogger(tag).fine(String.printf());

It turns out this was really not a wise idea and the logging calls alone were taking approximately 50% of the total execution time - versus java to C to java, hashtable lookups and synchronisation blocks.

Simply changing to use the Supplier versions of the logging functions approximately doubled the performance.

  Logger.getLogger(tag).fine(String.printf());
->
  Logger.getLogger(tag).fine(() -> String.printf());
    

But I also decided to just make including any of the code optional by bracketing each call to a test against a final static boolean compile-time constant.

This checking indirectly confirmed that the reflection invocations aren't particualrly onerous assuming the're doing any work.

HashMap<Long,WeakReference>

Now the other major component of the NativeZ object tracking is using a hash-table to map C pointers to Java objects. This serves two important purposes:

  1. Allows the Java to resolve separate C pointers to the same object;
  2. Maintains a hard reference to the WeakReference, without which they just don't work.

For simplicity I just used a HashMap for this purpose. I knew it wasn't ideal but I did the work to quantify it.

Using jol and perusing the source I got some numbers for a jvm using compressed oops and an 8-byte object alignment.

ObjectSize
HashMap.Node32Used for short hash chains.
HashMap.TreeNode56Used for long hash chains.
Long24The node key
CReference48The node value. Subclass of WeakReference

Thus the incremental overhead for a single C object is either 104 bytes when a linear hashchain is used, and 128 bytes when a tree is used.

Actually its a bit more than that because the hashtable (by default) uses a 75% load factor so also allocates 1.5 pointers for each object but that's neither here nor there and also a feature of the algorithm regardless of implementation.

But there are other bigger problems, the Long.hashCode() method just mixes the low and high words together using xor. If all C pointers are 8 (or worse, 16) byte aligned you essentially only get every 8 (or 16) buckets ever in use. So apart from the wasted buckets the HashMap is very likely to end up using Trees to store each chain.

So I wrote another hashtable implementation which addresses this by using the primitive long stored in the CReference directly as the key, and using the CReference itself as the bucket nodes. I also used a much better hash function. This reduced the memory overhead to just the 48 bytes for the CReference plus a (tunable) overhead for the root table - anywhere from 1/4 to 1 entry per node works quite well with the improved hash function.

This uses less memory and runs a bit faster - mostly because the gc is run less often.

notzed.nativez

So i'm still working on wrapping this all up in a module notzed.nativez which will include the Java base class and a shared library for other JNI libraries to link to which includes the (trivial) interface to the NativeZ object and some helpers to help write small and robust JNI libraries.

And then of course eventually port jjmpeg and zcl to use it.

Tagged java, nativez.
Saturday, 29 September 2018, 09:09

Bye Bye Jaxby

So one of the biggst changest affecting my projects with Java 11 is the removal of java.xml.bind from the openjdk. This is a bit of a pain because the main reason I used it was the convenience, which is a double pain because not only do i have to undo all that inconvience, all that time using and learning it in the first place has just been confirmed as wasted.

I tried using the last release as modules but they are incompatible with the module system because one or two of the packages are split. I tried just making a module out of them but couldn't get it to work either. And either i'm really shit at google-foo or it's just shit but I couldn't for the life of me find any other reasonable approach so after wasting too much time on it I bit the bullet and just wrote some SAXParser and XMLStreamWriter code mandraulically.

Fortunately the xml trees I had made parsing quite simple. First, none of the element names overlapped so even parsing embedded structures works without having to keep track of the element state. Secondly almost all the simple fields were encoded as attributes rather than elements. So this means almost all objects can be parsed from the startElement callback, and a single stack is used to track encapsulated fields. Becuase I use arrays in a few places a coule of ancilliary lists are used to build them (or I could just change them to Lists).

It's still tedious and error-prone and a pretty shit indightment on the state of Java SE in 2018 vs other languages but once it's done it's done and not having a dependency on half a dozen badly over-engineered packages means it's only done once and i'm not wasting my time learning another fucking "framework".

I didn't investigate where javaee is headed - it'll no doubt eventually solve this problem but removing the dependency from desktop and command-line tools isn't such a bad thing - there have to be good reasons it was dropped from JavaSE in the first place.

One might point to json but that's just as bad to use as a DOM based mechanism which is also just as tedious and error prone. json only really works with fully dynamic languages where you don't have to write any of the field bindings, although there are still plenty of issues with no canonicalised encoding of things like empty arrays or null strings. In any event I need file format compatability so the fact that I also think it's an unacceptably shit solution is entirely moot.

Modules

By the end of the week i'd modularised my main library and ported one of the applications that uses it to the new structure. The application itself also needs quite a bit of modularisation but that's a job for next week, as is testing and debugging - it runs but there's a bunch of broken shit.

So using the modules it's actually quite nice - IF you're using modules all the way down. I didn't have time to look further to find out if it's just a problem with netbeans but adding jars to the classpath generally fucks up and it starts adding strange dependencies to the build. So in a couple of cases I took existing jars and added a module-info myself. When it works it's actually really nice - it just works. When it doesn't, well i'm getting resource path issues in one case.

I also like the fact the tools are the ones dictating the source and class file structures - not left to 3rd party tools to mess up.

Unfortunately I suspect modularisation will be a pretty slow-burn and it will be a while before it benefits the average developer.

Netbeans / CVS

As an update on netbeans I joined the user mailing list and asked about CVS - apparently it's in the netbeans plugin portal. Except it isn't, and after providing screenshots of why I would think that it doesn't exist I simply got ignored.

Yeah ok.

Command line will have to do for me until it decides to show up in my copy.

Java After Next

So with Oracle loosening the reigns a bit (?) on parts of the java platform like JavaFX i'm a little concerned about where things will end up.

Outside of the relatively tight core of SE the java platform there are some pretty shitty "industry standard" pieces. ant - it's just a horrible to use tool. So horrible it looks like they've added javascript to address some of it's issues (oh yay). maven has a lot of issues beyond just being slow as fuck. The ease with which it allows one to bloat out dependencies is not a positive feature.

So yeah, if the "industry" starts dictating things a bit more, hopefully they wont have a negative impact.

Tagged java, rants.
Wednesday, 26 September 2018, 10:32

Java Modules

So I might not be giving a shit and doing it for fun but I'm still looking into it at work.

After a couple of days of experiments and quite a bit of hacking i've taken most of the libraries I have and re-combined them into a set of modules. Ostensibly the modules are grouped by functionality but I also moved a few bits and pieces around for dependency reasons.

One handy thing is the module-info (along with netbeans) lets you quickly determine dependencies between modules, so for example when I wanted to remove java.desktop and javafx from a library I could easily find the usages. It has made the library slightly more difficult to use because i've moved some methods to static functions (and these functions are used a lot in my prototype code so there's a lot of no-benefit fixing to be done to port it) but it seems like a reasonable compromise for the first cut. There may be other approaches using interfaces or subclasses too, although I tend to think that falls into over-engineering.

Spi

One of the biggest benefits is the service provider mechanism that enables pluggability by just including modules the path. It's something I should've looked into earlier rather than the messy ad-hoc stuff i've been doing but I guess things get done eventually.

I've probably not done a good job with it yet either but it's a start and easy to modify. There should be a couple of other places I can take advantage of it as well.

Redesign

I'm also mid-way through cleaning out a lot of stuff - cut and paste, newer-better implementations, or just experiments that take too much code and are rarely used.

I had a lot of stream processing experiements which just ended up being over-engineered. For example I tried experimenting with using streams and a Collector to calculate more than just min/sum/max, instead calculating multi-dimensional statistics (i.e. all at once) on multi-dimensional data (e.g. image channels). So I came up with a set of classes (1 to 4 dimensions), collector factories, and so on - it's hundreds of lines of code (and a lot of bytecode) and I think I only use it in one or two places in non-performance critical code. So it's going in the bin and if i do decide to replace it I think I can get by with at most a single class and a few factory methods.

The NotAnywhereZone

Whilst looking for some info on netbeans+cvs I tried finding my own posts, and it seems this whole site has effectively vanished from the internet. Well with a specific search you can still find blog posts on google, but not using the date-ranges (maybe the date headers are wrong here). All you can find on duckduckgo is the site root page.

So if you're reading this, congratulations on not being a spider bot!

Tagged java.
Sunday, 23 September 2018, 12:26

Not Netbeans 9.0, Java 11

Well that effort was short-lived, no CVS plugin anymore.

It's not that hard to live without, just use the command line and/or emacs, but today i've already wasted enough time trying to find out if it will ever return (on which question I found no answer, clear or otherwise).

It was also going to be a bit of a pain translating an existing project into a 'modular' one, even though from the perspective of a makefile it's only a couple of small changes.

Tagged java.
Older Posts
Copyright (C) 2018 Michael Zucchi, All Rights Reserved.Powered by gcc & me!