November 7, 2015

Aggregations in Henshin

The language concepts of Henshin are based on graph transformations. As such, aggregations have not been in the scope of Henshin so far. Aggregations are well known from relational database systems and are a powerful tool for counting entities, calculating minima / maxima or average values etc.

How do aggregations and graph transformations fit together? Well, in Henshin we know the concept of nested multi-rules. In graph transformation terminology this is often referred to as amalgamation. In essence, a transformation rule is extended with parts that are applied with a for-all semantics. These for-all parts are referred to as the multi-rule, whereas the core part is referred to as the kernel rule. In the graphical editor of Henshin, the multi-rule parts are indicated using *-actions.

Now, the trick for supporting amalgamations in Henshin is to allow to use the values of parameters of multi-rules in the kernel rule. What happens is that for one match of the kernel rule, the multi-rule is matched with for-all semantics, and that its parameter values are exposed to the kernel rule as a collection. This enables the use of aggregations functions in the kernel rule, which combine the values of parameters of the multi-rules.

Consider the two example rules below. Both match one :Family object (kernel rule) and all :Person objects in that family (multi-rule). Now, the set of persons can be used as basis for aggregations. In our two example rules, we use as aggregation functions COUNT and AVG to resp. count the number of family members and calculate the average age of the family members.

Note that the expressions Aggregations.COUNT(x) and Aggregations.AVG(a) are not some special syntax for aggregations. They are just calls to ordinary Java methods. The special thing about these methods is that they take a collection of values as parameters (not single values). The Henshin interpreter just calls the (possibly user-defined) functions to calculate the aggregated values. For the interpreter to know which class to use, it must be defined as part of the Java imports of the rule, as shown in the screenshot below. Additionally, you need to ensure that this class is in the Java classpath when executing the Henshin transformation.

The aggregation support is currently available in the development branch of Henshin. You can find the above rules with accompanying Java code in the examples plugins.

May 1, 2015

Towards Modeling of Distributed Graph Algorithms

The most prominent approach for distributed processing of very large graphs today is probably the Bulk Synchronous Parallel (BSP) model. BSP is a bridging model for implementing graph algorithms in such a way that they can be massively parallelized and distributed. One application can be found in the Henshin graph transformation tool, where graph transformation systems can be modeled, and code generated for the BSP framework Apache Giraph.

Although the BSP model is widely accepted and used today, there is no standard implementation framework for BSP algorithms. Frameworks that do provide BSP include Google's Pregel, Apache Giraph and GraphX in Apache Spark. For application developers it is not easy to find out which platform is the most suited one for her problem. An implementation of a graph algorithm in, say, Apache Giraph cannot be reused in other frameworks, even though the underlying concepts of BSP are the same. This is unfortunate, particularly because the overhead of developing, deploying and testing these algorithms is rather high due to the complexity of the distributed frameworks.

This problem can be solved by introducing a modeling layer for BSP algorithms. Instead of directly implementing graph algorithms in Pregel, Giraph or GraphX, the idea is to use a modeling language that supports the concepts of BSP. One possible approach is to use UML 2 state machines for the modeling. The figure below shows a state machine for a BSP-model of the shortest path algorithm.

Using such a model, one could generate code for different platforms, e.g. Pregel, Giraph or GraphX. The code generators need to be implemented only once and are straight-forward to build since the BSP concepts are more or less directly used. Using different code generators, algorithm engineers can automatically derive implementations for different platforms and benchmark them against each other without implementing a single line of code.

In Henshin, the code generation for Apache Giraph is quite complex because the concepts of graph transformations are rather different than BSP. One opportunity to simplify it would be to use BSP models as intermediate target language. Specifically, model transformations can be employed to translate a Henshin model into a BSP model. From that point on, platform-specific code generators could be used to generate the final implementation.

Using a modeling approach would enable the reuse of the many many graph algorithms already implemented in the existing BSP-frameworks. Maybe UML state machines are not the best approach for the modeling -- for instance, one could also come up with a (textual) DSL for BSP. The important point is that an abstraction from the actual implementation platforms is made.  

February 6, 2015

Henshin 1.2

We are happy to announce the release of Henshin 1.2! Henshin is a model / graph transformation tool for EMF models. The new release comes with new features for the graphical editor and a better integration with Apache Giraph.

The graphical editor now supports attribute conditions, which are used to restrict the applicability of transformation rules. In addition, the editor now supports live validation which gives you an immediate warning if you are trying to add an object to two containers or you are using an invalid JavaScript expression etc.

For me the most important new feature of the new version is the improved code generation facilities for Giraph. The new wizard will generate a full-fledged Java-project with all needed libraries, set up a Maven build and even automatically install a local Hadoop test environment if you like. This means that you can model a graph transformation in Henshin and fully automatically generate executable Giraph code that you can directly run in Eclipse. The video below shows you a demo of this new code generator.

Henshin 1.2 is available at our release update site.