
For the past two years Google have been working on something for Android that could herald a sea change on the platform. ART.

ART is a new runtime that incorporates AOT (Ahead Of Time) compilation, essentially taking JIT to the next level by compiling an application in full at the time of installation on the device.

.NET, of course, already has this technology, or at least something very similar.

The implications of this are obvious and significant.

Any developer using a platform-native, Java-based development environment today – including tools such as Oxygene for Java – is assured of reaching the broadest range of Android devices and is able to take full advantage of the Android SDKs, including proprietary extensions such as the Samsung Galaxy S-Pen SDK.

With ART, those same platform-native applications will now also benefit from the performance boost that comes from full compilation to native code. And it seems this will simply come “free of charge”, without any need to modify code or to change or even upgrade tools.

At a stroke, a wholly native-code approach to development loses any advantage it had to outweigh the limitations, constraints and additional difficulties that it brings.

With ART, Android SDK developers can continue to take full advantage of the platform while their users enjoy the benefit of native code.

Everyone’s a winner! 🙂

Furthermore, they will get this benefit on all the hardware that Google supports with their ART technology. Even better, developers won’t have to choose which particular native code their compiler should produce, since it is generated at the time of installation and is therefore targeted to the user’s device at that time.

The hardware that will be supported by this technology is not yet known, but it could in theory mean the full range of Android devices, whether ARM-based, Intel, or some other entirely new and exotic architecture that might emerge.

Even if there is some hardware that ART does not support, Android SDK apps will of course continue to run on those devices, only without the additional benefit of ART.

But it’s not all fluffy kittens and daisies. There is a cost.

Pre-compiled applications increase in size compared to their byte-code sources, an inflation said to be in the region of 10%–20%. So a 40KB widget will now blow out to a massive 44KB or, at worst, a whopping 48KB.

A 1MB application will explode to 1.2MB. Gadzooks!

Ok, so maybe it is all fluffy kittens and daisies after all.

🙂

ART is still an experimental technology, but it’s an experiment you may be able to participate in.

If you are an Oxygene for Java developer targeting Android (or using Java itself, of course) and have access to a Google Nexus device with KitKat, you can enable ART (on the device – nothing to do with the code or the tools) and see for yourself what difference it makes to turn your already platform-native applications into native code applications as well.
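
If you do try this, it is worth confirming which runtime the device is actually using once you have flipped the setting. The sketch below is plain Java and relies on the behaviour Google documented for KitKat – Dalvik reports a 1.x VM version while ART reports 2.0.0 or higher – so treat it as a quick diagnostic rather than gospel:

    // Quick check of the active runtime: on KitKat, Dalvik reports a 1.x
    // VM version and ART reports 2.0.0 or higher (per Google's KitKat-era
    // ART verification notes).
    public final class RuntimeCheck {
        private RuntimeCheck() {}

        public static boolean isRunningOnArt() {
            String vmVersion = System.getProperty("java.vm.version");
            if (vmVersion == null) {
                return false; // unknown runtime; assume Dalvik
            }
            try {
                return Integer.parseInt(vmVersion.split("\\.")[0]) >= 2;
            } catch (NumberFormatException e) {
                return false;
            }
        }
    }

Logging the result of RuntimeCheck.isRunningOnArt() from your application’s startup code is enough to verify that the switch in Developer Options actually took effect.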

Unfortunately I don’t have such a device myself, but who knows…. perhaps some day soon…

🙂

Here are some more links on the subject, to alleviate the strain on Google’s servers:

19 thoughts on “A True Work of ART”

  1. Now all that remains to be done to make Java usable for realtime applications is to ditch the garbage collector in favor of ARC…

    1. You have a better chance of seeing ARC go away than GC. The last major ARC platform is Apple’s, and they’re slowly but surely losing ground.
      If you look at hardware trends, you see more RAM and more CPU cores, both of which favor GC designs over ARC.

      And let’s be honest: while ARC is somewhat more comfortable than manual memory management, it is still problematic (the weak-references business), isn’t computationally free (implicit exception frames, bus locks) and is just not in the same comfort league as GC or ARC+GC hybrids.

      1. GC is the one reason we can’t use .NET in our company. We handle a lot of realtime sensor data and just can’t afford the GC “stop the world” behavior – we need smooth and predictable data flow; interruptions are intolerable. I would prefer a 10 ns bus lock any time over a 1 second “stop the world”.

        In my experience, an Android device needs 2 GB RAM to run *almost* as smoothly as an iOS device with only 0.5 GB RAM, and even then many programs (such as video streaming apps) have trouble with stutter. No Android device with 1 GB or less comes anywhere close to iOS. Though I myself switched from iOS to Android years ago, I’m the first to admit that iOS’s performance is superior in this respect; it runs much more smoothly than Android because the system isn’t interrupted all the time to free up memory.

        As a developer, I think there’s nothing better than deterministic memory management, and IMHO ARC is a good candidate to make it more comfortable. I have used refcounting for ages because most stuff I write is interface-based. Sure, it isn’t free. But it is verifiable and predictable, and I really don’t care if it costs me 1% performance; my virus scanner costs ten times more anyway!

        GC is really deadly in a realtime environment. Read this (http://goo.gl/Lp7G1v) to see how spectacularly .NET can fail.

        BTW, I happily program for ARM CPUs (Raspberry Pi) using deterministic memory management (FreePascal) and it runs smoothly.

        1. Video on Android is usually handled by the hardware, so GC doesn’t come into play; if you have stutters they come from something else (I don’t see any on Nexus 4 & 7). One-second world freezes were an oddity of .NET in server GC mode. As for smoothness, since iOS 7, Android definitely ranks ahead IME, as freezes of several seconds are what I sometimes see when switching tasks on the iPad. So the smoothness of iOS is more myth than reality these days, AFAICT.

          The Stock Exchange failure was one for MS, not GC; the solutions they use now on Linux involve Java and Python, which rely on GC and ARC+GC respectively.

          And finally, world freezes are what you get in Delphi’s ARC implementation when weak references are involved, and to a lesser degree in Apple’s implementation. If you say “avoid weak references”, I’ll answer “good luck”, and also that in a GC you can still use static memory (just sub-allocate and use object pools), which means little to no GC (and is behind the “jank-free” games, apps and demos in JS, for instance); there’s a minimal sketch of the idea at the end of this comment.

          As for real-time, you can use workers with independent GCs, so no world freezes, and complete scalability with core count.
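
          To make the object-pool point concrete, here is a minimal, hedged sketch in plain Java (the names are invented for illustration): every instance is allocated up front, so the steady-state path acquires and releases objects without creating any garbage for the collector to chase.

              import java.util.ArrayDeque;

              // Fixed-size pool: all Sample instances are allocated at construction
              // time, so the hot path produces no garbage and triggers no collections.
              final class SensorSamplePool {
                  static final class Sample {
                      long timestamp;
                      double value;
                  }

                  private final ArrayDeque<Sample> free = new ArrayDeque<>();

                  SensorSamplePool(int capacity) {
                      for (int i = 0; i < capacity; i++) {
                          free.push(new Sample());
                      }
                  }

                  Sample acquire() {
                      Sample s = free.poll();
                      return (s != null) ? s : new Sample(); // fall back if the pool is exhausted
                  }

                  void release(Sample s) {
                      free.push(s);
                  }
              }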

          1. Hi Eric, *streaming* video (such as is done by the app provided by my internet/TV provider) involves downloading/decrypting/buffering the data before the built-in playback routines of the OS come into play. AFAIK that is done by the app itself.

            I own a Nexus 7 (2013 model) myself and it performs brilliantly thanks to its 2 GB RAM and its lack of proprietary software on top of Android. Memory simply never runs out, or at least I’ve never noticed.

            I also own a Galaxy Note N7000 with only half the RAM (which was a high-end device only two years ago), and its performance is beyond abysmal. You can observe memory running out, and the GC only kicks in when it’s almost down to zero. Whenever that happens, everything becomes incredibly sluggish: even the keyboard starts lagging key presses by several seconds, and for some reason even the accuracy of the x/y decoding of the touch screen suffers from those delays, causing 20% of key presses to be mis-typed. Killing tasks solves the problem only partially; I have to reset the device every few days to make it snappy again.

            I wasn’t aware that the new software of the stock exchange is Java-based; where did you read that?

            I myself am still developing using Delphi 2009 and XE, and most of my development is interface-based. If an object needs a reference to its owner and a circular reference is created, I simply simulate a “weak” reference by typecasting a pointer inside a method which returns the parent interface, and there’s nothing in the background stopping anything.

            Sure, you can use static memory in a GC environment by *avoiding* memory allocation and pooling/re-using objects, but that’s just it – working around the system instead of improving the system. However, even this method isn’t watertight: even if *one* application is well-behaved and avoids unnecessary memory allocation, that does not automatically mean that all the *other* apps and services running simultaneously on the same device behave just as nicely. These may still gobble up RAM and cause GC sweeps at inappropriate times.

            Which is probably the case on my smartphone. There are a few suckers eating up RAM and not releasing it. Without rooting the device, I can’t get rid of those apps. So now the whole phone sucks because of a few bad guys. Luckily I can get a new smartphone in 4 months’ time and it sure as hell won’t be a Samsung. I’ll buy one with 2 GB RAM and I want it as bare-bones as possible, no RAM-eaters in the background.

            1. It seems to me that the problem is the “RAM suckers”, not the GC. A device with more RAM simply gives more room for badly behaved apps to get away with misbehaving without adversely affecting the rest of the system.

              In a GC system the adverse effect (of an app not managing its memory considerately) is a GC sweep so that other apps can get the RAM they need.

              In a non-GC system the adverse effect is that other apps simply don’t get the RAM they need at all.

              Choose your poison. 🙂

              I have to say, I’ve never noticed the problems you describe on my Galaxy SII (also with ‘only’ 1GB of RAM), but it is also worth mentioning that since I replaced the Samsung Android ROM with a RootBox ROM it has been like getting a brand new device. Much faster and vastly improved battery life.

              1. Indeed, the only solution is to get rid of the bad apps; whether they are using a GC or not is irrelevant.
                If you’re on a Nexus, the original ROM is good. On Samsung and the rest, you just need to root; nothing else makes sense. It’s just like needing to reinstall a Dell or HP laptop to get rid of the bloatware. 🙂

                And yes, you can work around the system for weak references, but that’ll stay fragile and unsafe.

                Avoiding allocations is also a valid strategy in a manually-managed environment, as even when memory allocation is cheap, setting up and cleaning up instances often isn’t.

                The whole point of GC is that you only have to care about that when it matters, rather than having to hack around all the time in manual memory environments, and “too often” in ARC.

                For Stock Exchange, it’s the NYSE that uses Python.

              2. IMO, rooting an expensive smartphone and losing my warranty just to get sensible memory behavior is not an option. It should just work.

                An interesting question would be: do these “bad apps” really allocate too much memory, or is it a bug/memory leak in the sense that the GC fails to recognize and free orphaned objects for whatever reason? Now *that* would be poison. How do you diagnose memory leaks in a GC system at all? After all, one is *supposed* to simply allocate and let the system manage the disposal. Maybe the developers were bona fide and there’s some freak side effect preventing the GC from freeing objects. Is there such a thing as a circular reference problem in a GC environment? (There’s a small illustration of the kind of leak I mean at the end of this comment.)

                I think I still prefer the deterministic poison because it gives me control.

                The advantage of a deterministic environment is that a memory leak is dead easy to spot when the program terminates (though sometimes hard to locate), e.g. using FastMM4. And if a subroutine temporarily allocates/de-allocates memory, that needn’t necessarily fragment the heap.

                The disadvantage of a GC environment is that you need much more RAM to begin with, because the heap always contains a mixture of active and orphaned objects. I suppose that freeing the orphans asynchronously creates fragments. And if you want to allocate a larger block than the largest empty fragment, the GC has to do complicated stuff. At least that’s how I understand it.

                The newest iPhones/iPads with their larger-footprint 64-bit apps have 1 GB. Androids (32-bit) need 2 GB to be comparably smooth. Hardware to compensate for software.
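
                To illustrate the kind of leak I mean above, here is a small, hypothetical Java fragment (names invented): even under a GC, an object stays alive for as long as anything still references it, so a forgotten listener registration is, for all practical purposes, a leak the collector can never fix.

                    import java.util.ArrayList;
                    import java.util.List;

                    // Under a GC, only unreachable objects can be freed. A listener that
                    // is registered but never unregistered keeps itself (and everything
                    // it references) alive for the lifetime of the process.
                    final class EventBus {
                        private static final List<Object> LISTENERS = new ArrayList<>();

                        static void register(Object listener) {
                            LISTENERS.add(listener);
                        }

                        static void unregister(Object listener) {
                            LISTENERS.remove(listener); // if never called, the "leak" persists
                        }
                    }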

              3. It should just work.

                Well, mine does just work, and did even before I rooted; it’s just even better now than it was before (even running a far more capable and feature-rich launcher, among other improvements). If yours doesn’t, then the difference must be in the apps, since both devices were running the same OS with the same GC.

                Besides which, your argument is circular: “apps aren’t bad, it’s the GC that’s flawed” assumes that the reason the apps are bad is a flawed GC.

                You entirely missed the point of the choice of poisons, which is that a bad app has bad consequences (in this area). Smartphones are actually a perfect environment for GC, since the device spends a lot of its time idle, providing ample opportunity for the GC to do its work unnoticed. GC makes less sense in environments with little or no idle time.

                You also mix in assumptions or implementation specifics of particular memory managers as an argument for the general case.

                Bear in mind that we had deterministic memory management in Delphi long before FastMM existed, and yet when FastMM came along, the fragmentation-induced slow-down, leak-detection difficulty and overall poor performance of the previous memory manager were suddenly exposed. And it wasn’t a GC. 😉

                You cannot simply assume that all GCs are bad any more than you can simply assume that all deterministic implementations are good.

                Androids (32-bit) need 2 GB to be comparably smooth

                This flies in the face of my experience (and others’), where having only 1GB presents no such problem. Which again suggests that the problem isn’t the OS or the device or the GC, but the particular apps that you have on it.

                As for 64-bit iOS vs Android, it’s simply far too early yet to say what impact 64-bit has on apps over time, since the hardware has only just arrived and I don’t think there are that many actual 64-bit apps for iOS. Certainly my partner’s (32-bit) iOS devices are no more immune to slow-down and juddering at times than my Android devices.

                Bad apps will be bad apps. There is only so much that an OS, with any particular memory management approach, can do about that.

        2. GC and the LSE are not necessarily related. Even if they were, the fact is that even at that time .NET provided APIs to avoid GC stalls (as part of .NET 3.5 SP1). In the meantime, the GC in .NET is now a background GC (which reduces stuttering). What’s more, the Disruptor pattern targets Java, is fairly fast – faster than typical C++ applications – and is used as a transaction engine with millisecond-scale transactions.

          Even if it were, that would not mean that desktop applications should not run using a GC. Android is slow(er) than iOS for many reasons, including that it is half-interpreted; even in the public talk describing the JIT in 2010, it is clearly stated that it can achieve roughly half the speed of fully compiled code. Other Google talks describe how many Android apps suffer from overdraw, so slowness and high memory consumption are not necessarily related to reference counting vs. GC.

          Lastly, I am writing a statically compiled system in my free time, and I can say that the big wins are not in pauses (if you take into account that the GC is generational, you will rarely have to do a full GC), but in how static compilers have more time to optimize: they don’t perform just the “hottest” optimizations, but all the optimizations that can be performed on a piece of code. LLVM in particular has many optimizations that target all aspects of an application, whereas Dalvik can apply just the ones that target its common cases. And here is where ART does matter: it can perform many optimizations at no cost to the running application.

          On your last point: the Raspberry Pi runs well with FreePascal, but there are also many .NET games, and most Windows Phone applications and games are written in C#. They don’t judder, and Windows Phone is seen as having a responsive UI. Why do you think that is possible? Or how Magicka runs on .NET, or Xamarin games using MonoGame, or Unity games, etc.?

    2. IMHO, this move would turn a lot of currently working Java code into incompatible code. But a migration path could be via separating old-style GC code from new-style ARC code.

  2. Platform native. This is interesting. Maybe EMB should think about adding a Delphi-to-Java bytecode compiler (like Oxygene for Java)…

  3. I cannot help but keep wondering why they keep farting around with complicated managed environments, when it would be so easy to just provide a native C++ toolkit plus maybe some Python bindings on top of it to enable RAD.

  4. Perhaps there’s something a native compiler for Windows like Delphi could implement. Imagine: you write the application as you always have and compile it as you always have, but deployment is a little different. There would perhaps be some intermediate compile step to some kind of P-code. On creation of the installer, a Delphi compiler is also packaged with the P-code, and this compiles at the time of installation, optimising for the machine. No more copying executables around different machines, though. There could be some security enhancements too, as some kind of machine signature could be compiled into the executable.

  5. Where do you see that this is really a machine-code-compiled exe?
    Why not directly use Lazarus?

  6. I predict that when this comes to pass, Embarcadero’s marketing department will be undeterred by such a little technical change…

    “Announcing Embarcadero Delphi XE10! The only mobile solution with true native* compilation!”

    * Completed source code is uploaded to our servers, where a compilation will be initiated by Native American, Australian Aborigine and New Zealand Maori employees, who will then transmit a binary executable back.

    1. Be sure this will not happen. Because this kind of talk of “native” is not true even today: JavaScript is described by the “Sip” as being “scripted”, and it also states that Xamarin’s Mono on iOS is “interpreted” (even though it is statically compiled).
