January 26, 2013

The Achilles' Heel of Model Checkers

Implementing a model checker is a tricky business. There are a lot of pitfalls that can cause a model checker to produce incorrect results or to become extremely inefficient. I recently stumbled on one particular performance problem in two very different state space generation and analysis tools: the PRISM model checker and the Henshin state space analysis tools. Calling it an Achilles' heel is of course an exaggeration, but the problem is severe: in some situations it kept both tools from scaling to large models.

First a word on the two tools. PRISM is a really powerful model checker for quantitative (stochastic / probabilistic / timed) models. It has been applied to a lot of case studies and my personal experience is that it is very stable and, due to its symbolic engines, scales in general very well. For Henshin, I implemented the state space analysis tools myself, so obviously I think it is a great tool. ;)

Now to the problem: it occurred for certain kinds of models during state space generation. The symptom was the same in both tools: at the beginning the state space generation is really fast, but the larger the state space gets, the slower the generation becomes. To understand the source of the problem, let's take a look at this pseudo code for a state space generator (implemented similarly in PRISM and Henshin):

E = {}
X = [ init ]
while (!is_empty(X)) {
  x = pop(X)
  add(E, x)
  S = succ(x)
  for (s in S) {
    if (!contains(E, s)) {
      push(X, s)
    }
  }
}

E is the set of explored states (initially empty). X is a stack of unexplored states (containing only the initial state at the beginning). The body of the while-loop is executed as long as there are unexplored states. In every iteration, a state is popped from the stack and added to the set of explored states. Then the set S of successor states of this state is computed. For every successor state, we check whether it has already been explored. If not, we push it onto the stack of unexplored states.
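The loop above can be written out as a short, runnable sketch (plain Python, not the actual PRISM or Henshin code; the successor function is passed in as a parameter):

```python
from collections import deque

def generate_state_space(init, succ):
    """Explore all states reachable from `init` via the successor
    function `succ`; return the set of explored states."""
    explored = set()             # E: explored states
    unexplored = deque([init])   # X: stack of unexplored states
    while unexplored:
        x = unexplored.pop()     # constant-time pop
        if x in explored:        # a state may have been pushed twice
            continue
        explored.add(x)
        for s in succ(x):
            if s not in explored:
                unexplored.append(s)   # constant-time push
    return explored

edges = {0: [1, 2], 1: [3], 2: [3], 3: []}
generate_state_space(0, lambda s: edges[s])  # → {0, 1, 2, 3}
```

The containment check `s not in explored` is exactly where the hash function discussed below comes into play.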

Now what is the problem in this piece of code? There is no problem in the code itself. The issue lies in the implementations of the collections used. You should make sure to use a list implementation for X with constant-time push and pop operations (e.g., a linked list). But the real problem lies in the implementation of the set of states E, specifically in the hash function used (and in Henshin also the equality checker). In the state space generators, the hash function is used to assign a state to a memory slot. It is really important that this hash function produces as few hash collisions as possible, because state space sets tend to get very large, and many collisions mean longer containment checks.

In the PTA state space generator of PRISM, states are pairs of a location and a zone (a symbolic representation of a time window). The implemented hash function computed the hash code based only on the location, not the zone. Now, if you have a system comprising 1,000 locations and, say, 1,000,000 states, you have an average of 1,000 zones per location. If you use a simple hash function that uses only the location information, the hash code of every state collides with about 1,000 other states. Thus, a simple containment check requires iterating over all these states. The larger the state space gets, the more collisions you get and the slower the state space generation becomes. By simply using a hash function that also takes the zones into account, the generation time for one instance of the CSMA example dropped from 438 to 12 seconds on my laptop.
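The effect is easy to reproduce with a toy model where states are (location, zone) pairs and zones are plain integers standing in for PRISM's symbolic zone representation:

```python
def location_only_hash(state):
    # Mirrors the original behaviour: the zone is ignored, so all
    # states with the same location land in the same hash bucket.
    return hash(state[0])

def full_hash(state):
    # Mixes both components, spreading states across buckets.
    return hash(state)

# 10 locations x 100 zones = 1,000 states
states = [(loc, zone) for loc in range(10) for zone in range(100)]
len({location_only_hash(s) for s in states})  # only 10 distinct hash values
len({full_hash(s) for s in states})           # close to 1,000
```

With the location-only hash, every bucket holds about 100 states, so each containment check degrades to a linear scan of that bucket.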

In Henshin, I was facing the same problem, but here the solution was not as easy. In Henshin, states are essentially attributed, typed graphs (EMF models). Defining a good hash function for these models that can also be computed efficiently is really not easy. Moreover, checking equality of states boils down to checking graph isomorphism -- a problem for which it is not even known whether it is in P. I think it took me 3 or 4 complete rewrites of the hash function and the equality checker until it was running smoothly. If you are interested in the details, you can check out the source code of the hash function implementation. It is also the basis for the equality checker and, as I found out in a profiling session, it is the most frequently executed code during state space generation in Henshin. Maybe in one of my next posts I can explain some of its details.
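To give a flavor of what such a graph hash has to achieve (this is a generic sketch of the idea, not Henshin's actual implementation): combine node-local invariants in an order-independent way, so that isomorphic graphs are guaranteed to get equal hashes, while unequal hashes cheaply prove non-isomorphism.

```python
def graph_hash(nodes, edges, rounds=2):
    """Hash a directed graph given as a node -> label dict and an edge
    list. Isomorphic graphs yield the same hash; equal hashes still
    require a full isomorphism check, unequal hashes do not."""
    # Start from the node labels (think: types / attributes).
    h = {n: hash(label) for n, label in nodes.items()}
    for _ in range(rounds):
        # Refine each node's hash with its neighbours' hashes, combined
        # order-independently (sums), so node ordering does not matter.
        h = {n: hash((h[n],
                      sum(h[t] for s, t in edges if s == n),
                      sum(h[s] for s, t in edges if t == n)))
             for n in nodes}
    # Combine the node hashes order-independently into a graph hash.
    return sum(h.values()) & 0xFFFFFFFF

g1 = ({'a': 'Node', 'b': 'Node'}, [('a', 'b')])
g2 = ({'x': 'Node', 'y': 'Node'}, [('x', 'y')])
graph_hash(*g1) == graph_hash(*g2)  # → True: isomorphic graphs hash equally
```

The tension is exactly the one described above: more refinement rounds mean fewer collisions but a more expensive hash, and for the hottest code path in the generator that trade-off matters.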

January 17, 2013

Nested Multi-Rules in Henshin

Graph transformations are becoming more and more popular in software engineering. This trend is somewhat surprising, as their foundations were already developed in the 1970s (or even the late 1960s). One reason for this is probably their appealing visual representation, similar to Petri nets or UML diagrams. Graph transformations (and Reo) are also one of the few topics that my wife recalls when she is asked about what I work on.

One of the most popular applications of graph transformation is model-driven engineering and specifically model transformations. In Henshin, we develop a graph transformation based language targeting the Eclipse Modeling Framework.

A neat feature that we have in Henshin are so-called nested multi-rules, which are formally based on a concept called amalgamation. It is essentially a concept for synchronized rule applications, and it greatly increases the expressive power of the Henshin language compared to standard graph transformation rule formats. Usually, graph transformation rules describe a pattern of fixed size and topology that should be matched and transformed. In Henshin, we can use nested multi-rules to match and transform patterns of unbounded size and a, let's call it, regular structure. Let's take a look at an example...

This rule models a simple broadcast protocol in wireless sensor networks. It matches an active node x that has exactly one message. Now, the multi-rule part of this rule consists of the elements on the right with the «...*» stereotypes. This part of the rule is matched and applied as often as possible. So when this rule is executed, a message object is created at every active neighbor node. This single rule describes a non-trivial broadcast protocol completely independently of the topology of the network. Let's see another example...

This is an operational model of the so-called gossiping girls problem. In a nutshell, the problem is: there are n girls, each of them knowing a piece of gossip. How many phone calls are required so that all girls know all secrets? The above rule models a phone call between two girls, say g1 and g2. The rule matches all secrets of g1 that g2 doesn't know and all secrets of g2 that g1 doesn't know. The rule then creates the necessary edges so that, after the rule has been applied, both girls know all matched secrets. Neat, isn't it?
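As an aside, the minimal number of calls can be found by exactly the kind of state space search discussed in the previous post (the known optimum is 2n - 4 calls for n >= 4). A plain Python sketch, unrelated to Henshin's implementation:

```python
from itertools import combinations
from collections import deque

def min_calls(n):
    """Breadth-first search over gossip states: component i of a state
    is the set of secrets girl i currently knows; a call between two
    girls merges their sets."""
    start = tuple(frozenset([i]) for i in range(n))
    goal = tuple(frozenset(range(n)) for _ in range(n))
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, calls = queue.popleft()
        if state == goal:
            return calls          # BFS: the first hit is minimal
        for i, j in combinations(range(n), 2):
            merged = state[i] | state[j]
            nxt = list(state)
            nxt[i] = nxt[j] = merged
            nxt = tuple(nxt)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, calls + 1))

min_calls(4)  # → 4
```

Note that the states are tuples of frozensets, i.e., hashable, so the `seen` set again relies on a decent hash function.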

What we have seen so far are actually only rules with a single multi-rule. In Henshin, you can in fact have more than one multi-rule and even nest them. You can specify the different multi-rules and their nesting structure through a path concept in Henshin's actions (this is what we call the stereotypes on nodes and edges). The rule below is a more complicated example which makes use of nested multi-rules. It is the first part of the classical OO2Rdb-example for exogenous model transformations. The rule translates a package to a database schema, together with all its classes and attributes.

There are many more examples where nested multi-rules allow you to do fairly complex graph manipulations. From an operational point of view, a nice feature of such rules is not only their conciseness, but the fact that they are executed atomically.

In one of my next posts, I will tell you how to define and use dynamically typed rules. These can be used for example to copy arbitrary EMF models using Henshin. Stay tuned.

January 4, 2013

Ray Kurzweil's "How to Create a Mind"

We are not really up to date when it comes to communication technology. The mobile phone that my wife and I possess (yes, one for the two of us) resembles a pocket calculator from the 1980s. Two months ago I had to give a trial lecture for third-semester students in computer science on the topic "Introduction to the Development of Mobile Applications". I talked about the new possibilities and challenges in the mobile sector and explained to them how the lifecycle of an app in Android works. Overall I think I performed pretty well, but I must say that I am very glad that none of the students asked for a demo on my mobile. Anyway, I decided that it is time to change something. I am still not convinced of smartphones, and so instead I got a Kindle Fire HD for my wife Carola for Christmas.

To be honest, most of the time until now it has been me using the Kindle. The first e-book I read on it was "How to Create a Mind" by Ray Kurzweil. Ray Kurzweil is an icon in the Artificial Intelligence (AI) field and has worked for several decades on engineering intelligent systems, particularly in the area of speech and text recognition. The research of his group led to a number of products that can be considered the state of the art in intelligent systems. The most popular is probably the Siri assistant that you may know from your iPhone.

In "How to Create a Mind", Kurzweil describes the state of the art in AI from a biological, a mathematical, and a philosophical perspective. One of the predominant messages of the book is that it is only a matter of a few decades until machines are more intelligent than humans. Kurzweil gives an overview of the structure of the human brain and describes the core concepts of thinking and learning in the neocortex. He describes thoughts as sequential and hierarchically structured patterns in the brain. An example of the hierarchy of patterns is (spoken) language: phonemes form words, words form sentences, sentences form stories. In a similar way, the brain organizes ideas and concepts into hierarchical structures with increasing levels of abstraction.

The sequential and hierarchical structure of patterns in the process of thinking and learning also manifests in the formalism of Hierarchical Hidden Markov Models (HHMMs), which Kurzweil uses and advocates in his book. At their core, HHMMs can be described as structured discrete-time Markov chains with observations. The process of learning involves finding (1) an optimal topology of the HHMM and (2) optimal transition probabilities, based on a training set given as sequences of observations. Problem (2) is the actual learning step and can be solved using variants of the Baum-Welch algorithm, whereas (1) can be solved using evolutionary algorithms.
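A core building block of Baum-Welch is the forward algorithm, which computes the probability of an observation sequence under a given model. Here is a minimal sketch for a plain (non-hierarchical) HMM, purely for illustration:

```python
def forward(obs, pi, A, B):
    """Probability of observing the sequence `obs` under an HMM with
    initial distribution pi, transition matrix A and emission matrix B.
    States and observations are integer indices."""
    n = len(pi)
    # alpha[i]: probability of the observed prefix, ending in state i
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(n)) * B[i][o]
                 for i in range(n)]
    return sum(alpha)

# A toy two-state chain that deterministically alternates and emits
# its own state index:
pi, A, B = [1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]
forward([0, 1, 0], pi, A, B)  # → 1.0
forward([0, 0, 1], pi, A, B)  # → 0.0
```

Baum-Welch iterates this forward pass (plus a symmetric backward pass) to re-estimate A and B until the likelihood of the training sequences stops improving.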

One of the most frequent criticisms of AI is that the methods used are just based on statistical models or manipulations of abstract symbols. In short: there is no real thinking going on, no awareness and no consciousness. Of course, Kurzweil defends AI; e.g., he argues that all decision making in the human brain is likewise based on natural law and statistics. He describes an interesting situation where the AI system Watson was competing against the two best Jeopardy! players. Watson is based on statistical models and computes the probabilities of possible answers. For example, Watson's response to
“In Act 3 of an 1846 Verdi Opera, this ‘Scourge of God’ is stabbed to death by his lover Odabella” 
was “Who is Attila?”. The host asked him (or it?) to specify the answer, and Watson correctly responded with “Who is Attila the Hun?”. Kurzweil makes a point in saying that understanding the language used in the Jeopardy! questions (technically, answers) is a real challenge, and that Watson learned his knowledge by reading the whole of Wikipedia and other human-written text resources, which marks a milestone in the engineering of artificial intelligence.

Kurzweil also gives some clues to the grand philosophical questions in AI which naturally arise when attempting to build machines that reach or exceed human intelligence. For a machine that acts as if it were conscious and as if it had free will, we should also assume that it is conscious and has free will. Thus, Kurzweil believes that machines are conscious if they pass the Turing test. This topic is highly controversial, as this excerpt from Wikipedia shows:
"Searle argues that the experience of consciousness can't be detected by examining the behavior of a machine, a human being or any other animal. Daniel Dennett points out that natural selection cannot preserve a feature of an animal that has no effect on the behavior of the animal, and thus consciousness (...) can't be produced by natural selection. Therefore either natural selection did not produce consciousness, or (...) consciousness can be detected by a suitably designed Turing test."
No matter whether you stand on the side of Kurzweil (as I do on this question) or of Searle, I hope I was able to whet your appetite. If you are a computer scientist or generally interested in AI, I highly recommend reading "How to Create a Mind" (even if you don't own a Kindle).