JAXenter: Let’s start with a more general question: why is Java worth using in conjunction with machine learning in enterprise environments? What are the advantages, how is Java used, and why not just work with Python?
Christoph Henkelmann: For starters, you have to be careful. Java is not always better than Python. One should not be overly dogmatic as there are many languages you could use.
In this example, I specifically focused on Java because there are many systems implemented in Java out there. This applies to the enterprise sector in particular, i.e. large companies that operate important systems. If I have such a system and would like to extend it with, say, machine learning features, it is great if they can be integrated into the existing structure. On the one hand, there are technical advantages: I have fewer dependencies, and my whole process does not become overly complicated. If I have 20 application servers, I don’t have to maintain a Python installation on each of them when I already have a Java runtime there anyway.
On the other hand, business considerations are very important. Java is one of the most important languages for business. For example, if you want to sell a new product (e.g. a machine learning product) or implement machine learning as a service provider, it is very convenient for the customer to get a Java solution as well. Firstly, on a political level, because Java is established: people are comfortable with it and the whole thing is considered safe. Secondly, there are some technical advantages, although this is where it enters the realm of personal taste. Java is quite good, fast and suitable for the web, so it is not a bad idea. But of course, you can do it in other ways as well.
JAXenter: If you want to run TensorFlow models on a Java server, what do you have to consider regarding the workflow?
Christoph Henkelmann: As usual, that depends, but the preprocessing is a very important point. I have to preprocess my data before I can train my model, and this preprocessing is quite often the most important factor in my model’s quality. And of course I have to make sure that I apply the exact same preprocessing when I run the model live, especially if that happens in another programming language.
Here, I have two options: either I make the preprocessing a part of my model, or I have to make sure that I recreate the preprocessing, which I may have written in Python, identically in Java. I must stress again how important this part is.
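To illustrate the second option, here is a minimal sketch of what recreating a preprocessing step in Java could look like. It assumes, purely for illustration, that the Python training code standardized each feature as (x - mean) / std and that the per-feature means and standard deviations were exported along with the model. The class name `FeatureStandardizer` and its methods are hypothetical and not part of any library.

```java
import java.util.Arrays;

/**
 * Hypothetical helper: applies at serving time the same standardization
 * that is assumed to have been used during training in Python.
 */
final class FeatureStandardizer {
    private final double[] mean;
    private final double[] std;

    /** mean and std are assumed to be exported from the training run. */
    FeatureStandardizer(double[] mean, double[] std) {
        if (mean.length != std.length) {
            throw new IllegalArgumentException("mean and std must have the same length");
        }
        for (double s : std) {
            if (!(s > 0.0)) {
                throw new IllegalArgumentException("every std value must be positive");
            }
        }
        // Defensive copies so callers cannot change the parameters later.
        this.mean = Arrays.copyOf(mean, mean.length);
        this.std = Arrays.copyOf(std, std.length);
    }

    /** Returns (x - mean) / std per feature, narrowed to float for the model input. */
    float[] transform(double[] raw) {
        if (raw.length != mean.length) {
            throw new IllegalArgumentException(
                "expected " + mean.length + " features but got " + raw.length);
        }
        float[] out = new float[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = (float) ((raw[i] - mean[i]) / std[i]);
        }
        return out;
    }
}
```

A practical way to check that both implementations really agree is to run a handful of raw samples through the Python and the Java version and compare the results, allowing for small floating-point differences.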
The other question is organizational in nature and depends very much on the size of my model. If my model does not exceed a few megabytes, I can put it in my WAR (Web Application Archive) or my Java artifact and have a normal workflow. Basically, it’s like working with designers who provide assets for my web application, such as background graphics. I get a ZIP file from the machine learning team with everything in it.
And finally, I only need to know how to load all of this and put it into my application.
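As a rough sketch of that loading step, the following reads every file out of a ZIP archive into memory using only the standard library. The class name `ModelBundle` and the idea of keeping the files in a map are assumptions made for illustration; how the bytes are then handed to TensorFlow depends on the TensorFlow Java version and model format in use and is deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: reads the files of a small model archive into memory. */
final class ModelBundle {
    private ModelBundle() {}

    /**
     * Reads every regular file in the ZIP stream and returns entry name -> content.
     * Only sensible for small archives (a few megabytes), since everything is kept in memory.
     */
    static Map<String, byte[]> readAll(InputStream zipStream) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no data
                }
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zip.read(buffer)) != -1) {
                    content.write(buffer, 0, n);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        }
        return files;
    }
}
```

If the archive is packaged inside the WAR, the stream could come from something like `ModelBundle.class.getResourceAsStream("/model.zip")`, where the resource name is made up for this example. Note that `getResourceAsStream` returns `null` when the resource is missing, so that case needs a check.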
It gets more exciting when my models become really huge, although I haven’t had this in practice yet. You have to be careful then, because there may be an archive with 400 to 500 MB of data that has to be made available on all the application servers. But again, how I organize my workflow is more an organizational problem than a technological one.
JAXenter: This certainly is an exciting topic but we would like to focus on the question of TensorFlow on Java servers in general. Are you using any other tools or software in this context? What does your stack look like?
Christoph Henkelmann: About my stack: I work with Jupyter notebooks because they give me access to a lot of visualizations, which are still easier to use in Python than in Java. I then have a Python session in which I can work in the classic manner, just like it’s shown in all the tutorials. If I need a lot of data, I simply write the preprocessing by hand, for example as a Java command line application, because a simple for loop in Python can sometimes take quite a long time, whereas Java is quite fast. So if I have large amounts of data to preprocess and I already know what my preprocessing looks like, I sometimes write it in Java, simply because it runs faster.
I just had a case like this, where the preprocessing took over 30 minutes in Python and exactly one minute in Java. Of course, I can use the time saved to improve and adjust my preprocessing. What I use when I put something into production also depends on what I want to do: if I build a web application, I use the Ninja web framework, and if it’s middleware, I use Apache CXF.
But I also use other frameworks. In most cases, a database is going to be necessary, and I prefer to work with jOOQ instead of Hibernate. It is a fairly normal, standard Java stack; there’s nothing too remarkable about it. The one important thing is to use TensorBoard when developing the TensorFlow model, because it saves an enormous amount of time and makes developing and debugging more enjoyable.
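Coming back to the Java command line preprocessing mentioned in this answer, a minimal sketch of such a tool might look like the following. It streams rows line by line, so memory use does not grow with the input size. The concrete transformation (clamping values to the range 0 to 255 and scaling them to 0 to 1), the class name `CsvPreprocessor` and the assumption of a header-less, purely numeric CSV are all made up for illustration; this is not the preprocessing from the interview.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical command line tool: scales numeric CSV cells while streaming. */
final class CsvPreprocessor {
    private CsvPreprocessor() {}

    /**
     * Reads comma-separated numeric rows, clamps each value to [0, 255],
     * scales it to [0, 1] and writes the row out. Returns the number of rows written.
     * Assumes no header row and no quoted cells.
     */
    static long process(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        long rows = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",");
            StringBuilder row = new StringBuilder();
            for (int i = 0; i < cells.length; i++) {
                double value = Double.parseDouble(cells[i].trim());
                double clamped = Math.max(0.0, Math.min(255.0, value));
                if (i > 0) {
                    row.append(',');
                }
                row.append(clamped / 255.0);
            }
            writer.write(row.toString());
            writer.write('\n');
            rows++;
        }
        writer.flush();
        return rows;
    }

    /** Reads from standard input and writes to standard output. */
    public static void main(String[] args) throws IOException {
        process(new InputStreamReader(System.in, StandardCharsets.UTF_8),
                new OutputStreamWriter(System.out, StandardCharsets.UTF_8));
    }
}
```

Compiled, this could be run as `java CsvPreprocessor < raw.csv > scaled.csv`, with the file names again being placeholders.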
JAXenter: You mentioned that Java is a web language. What would it look like if you wanted to implement a similar solution for the mobile sector?
Christoph Henkelmann: When we talk about Android, we would theoretically use Java, but in a completely different runtime environment. When I want to use TensorFlow in a mobile environment, I first and foremost work with Google’s TensorFlow Lite. It doesn’t necessarily have to be good, but it is what Google is planning for deployment on iOS and Android, and it’s specifically optimized to run on mobile devices. I’m not so sure about other platforms, however.
JAXenter: Is there a big new trend, or are there any exciting tools in the field of machine learning, that interest you at the moment?
Christoph Henkelmann: Algorithmically speaking, there are two things that are particularly exciting at the moment, although I haven’t had much time to work with them yet. One of them is Attention. As the name suggests, Attention allows my machine learning process to focus on certain parts of the input instead of treating everything equally. That’s very interesting.
The so-called adversarial networks are the other. I think they are really exciting at the moment because, in my opinion, they are incredibly well suited to generating artificial training data. The more data, the better, and if I can simply generate data, that’s perfect. As for technology, CNTK is what I really want to look at next, despite my Java background and the fact that I am more active on Linux servers. But of course, only once Microsoft brings itself to create a Mac version of it. I already mentioned at the Machine Learning Conference 2017 that the TensorFlow API is not always nice. I have also looked at the documentation for CNTK and yes, that is nice. I hope to be able to work with it soon.
JAXenter: What are the possible use cases for CNTK?
Christoph Henkelmann: Microsoft’s CNTK is basically the same as Google’s TensorFlow – with a somewhat stronger focus on language. In the end, though, many algorithms are similar or even the same.