ilastik 0.6 tech preview

Over the last two years, we have rewritten ilastik from the ground up to enable the following features, all of which are implemented in the upcoming version 0.6:

  • easy creation of workflows for specific application scenarios from modular building blocks
  • interactive machine learning on data sets much larger than available RAM
  • quickest possible response to user inputs by performing only strictly necessary re-computations
  • self-configuring parallelization

The aim of this tech preview is to provide a proof of concept and to showcase this entirely new architecture to developers. As of now, we provide sample workflows for pixelwise classification (now on data much larger than RAM) and carving (still RAM-limited, pending improvements).

We are now working on bringing the user interface up to the standard of ilastik 0.5, simplifying workflow creation, and creating more sample workflows.

Overview of the new architecture

For ilastik 0.6, the main goal was to separate ilastik into modular components, so that new workflows (e.g. object detection and classification, tracking of dividing cells) can be assembled with ease. Towards this goal, we have written two libraries:

  • The lazyflow library lets you set up a data-flow graph in which each node represents a function to be computed (such as image features on a given image) and the edges represent inputs and outputs. Because the data dependency graph is known, lazyflow can optimize the computation: when the result for some region of interest is requested, only the computations needed for that result are carried out, and no more. For example, when the result of a convolution is requested, lazyflow automatically enlarges the requested input region by the margin that the kernel size requires, so that no boundary artefacts occur. lazyflow can also parallelize computations, for example by computing block-wise. It is fully asynchronous: when a requested result becomes available, the user is notified via a callback function. To avoid re-calculations, caches can be added to the data-flow graph.
  • Volumina is a viewer for up to 5-dimensional data, written in PyQt. A three-dimensional subset can be browsed by scrolling through orthogonal slice views. Multiple layers (showing, for example, labels, features or predictions) can be stacked on top of each other. As data input, Volumina accepts both plain NumPy arrays and lazyflow outputs. For visible layers, only the parts currently in view are requested, asynchronously, from the lazyflow data-flow graph. Different interaction modes (such as navigation, labeling and object picking) are supported. Volumina can stream data from disk or over the internet (see this YouTube video demonstrating integration with a CATMAID server).
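
The region-of-interest idea described for lazyflow can be illustrated with a minimal one-dimensional sketch. It does not use the lazyflow API: the names ArraySource, SmoothingNode, box_filter_valid and request are hypothetical and exist only for this illustration, and a simple moving average stands in for a real image filter. The smoothing node enlarges each requested region by the kernel radius before asking its input for data, so the result for the region matches what a computation on the full data set would give, while only a small part of the input is ever read.

```python
import numpy as np


def box_filter_valid(block, radius):
    """Moving average over 2*radius+1 samples; output is 2*radius shorter than input."""
    width = 2 * radius + 1
    return np.convolve(block, np.ones(width) / width, mode="valid")


class ArraySource:
    """Hypothetical leaf node: serves slices of an array and logs every request."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
        self.requests = []  # (start, stop) of every region that was asked for

    def request(self, start, stop):
        self.requests.append((start, stop))
        return self.data[start:stop]


class SmoothingNode:
    """Hypothetical filter node: computes only the requested region of interest."""

    def __init__(self, source, length, radius):
        self.source = source
        self.length = length  # total extent of the input data
        self.radius = radius

    def request(self, start, stop):
        r = self.radius
        # Enlarge the region by the kernel radius, clipped to the data extent.
        lo = max(start - r, 0)
        hi = min(stop + r, self.length)
        block = self.source.request(lo, hi)
        # Where the margin was clipped at a true data border, pad by reflection.
        pad_left = r - (start - lo)
        pad_right = r - (hi - stop)
        block = np.pad(block, (pad_left, pad_right), mode="reflect")
        return box_filter_valid(block, r)


data = np.arange(20, dtype=float) ** 2
source = ArraySource(data)
node = SmoothingNode(source, length=len(data), radius=2)

roi_result = node.request(5, 10)    # interior region
border_result = node.request(0, 4)  # region touching the left border

# Reference: filter the whole (reflect-padded) data set, then crop.
reference = box_filter_valid(np.pad(data, 2, mode="reflect"), 2)
print(np.allclose(roi_result, reference[5:10]))
print(np.allclose(border_result, reference[0:4]))
print(source.requests)
```

In this sketch the request is synchronous and nothing is cached. An asynchronous variant would hand the result to a callback instead of returning it, and a cache node placed between the source and the filter would avoid reading the same block twice.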