Sunday, September 18, 2011

The Sugiyama Layout Algorithm (Hierarchical Algorithm) for dummies

We use hierarchical diagrams all the time..heck, i know i do. Whether it's for the company staff chart that the boss wants or the UML diagrams for my projects, i am a keen user of this kind of layout to clearly bring out relationships and/or show the flow of processes ( i use orthogonal diagrams for circuit diagrams...but that's another blog post all together) and so imagine my shock to realise that Hierarchial algorithms are actually a part of a topic known as graph drawing in Computer Science. Anyhow, i spent a lot of time researching and developing Java code to implement a common hierarchial algorithm known as the Sugiyama Layout algorithm and i realised the following 3 things
  1. The Sugiyama layout algorithm is actually a set of algorithms all folded into one. The various algorithms that make up the sugiyama algorithm have been reviewed and updated over the years.
  2. Set theory is everything to understanding the Sugiyama algorithm. It seems that a lot of authors (Sugiyama included...) use set theory symbols to denote the various algorithms within Sugiyama. Now i haven't used set theory in two years and you can imagine the strain i had having to refresh myself when it came to set theory notation. It wasn't all that bad...just slowed me down a bit
  3. Lots of guys struggle with understanding the Sugiyama layout algorithm and so i hope that this blog post makes it easy for folks to take that first step in quickly understanding the Sugiyama Layout algorithm. 
In this blog post, i am going to try and demystify the hierachial layout algorithm commonly known as Sugiyama to make it easy for anyone using advanced texts to better understand what the Sugiyama algorithm is all about. 

What is the Sugiyama Layout algorithm?

The Sugiyama layout algorithm is actually a set of algorithms that were defined by Sugiyama et al in the late 80's. The idea behind these set of algorithms that make up the Sugiyama layout algorithm was to enable the automatic layout of nodes(sometimes called vertices) and edges to a hierachial fashion. If one thinks of  a chart of UML diagrams with inherited classes, then the nodes(or vertices) would be the class diagrams (for instance) with the various propoerties and methods and the edges would be the lines connecting the various vertices/Nodes. To make this blog posting more palatable, i will avoid using set theory as much as possible. easy peasy...

While it would be easy for humans to create a hierachial diagram that is visually appealing, it's another thing all together to have this automatically done with a piece of computer software but this is what exactly the sugiyama algorithm aids to do



<Fig 1: Example of Hierarchical diagram - graph, layer, vertice/Node, edge>

To understand the Sugiyama layout algorithm, one has to first undertand what a visually appealing hierachial diagram is...then the code to automatically lay out the nodes because possible to write. The Sugiyama algorithm lays out a number of conditions that define what a visually appealing hierahial diagram should fulfil:

  1. The number of upward facing edges should be minimised (hierachy)
  2. Vertices should be placed in layers and as uniformly as possible in these layers (uniform distribution)
  3. The number of layers that the edges span should be minimised as much as possible (edges length minimization)
  4. Edges crossings should be minimized as much as possible (edge crossings minimization)
  5. Isomorphic graphs should be drawn identically (identical drawing)
  6. On each layer, vertices should be separated by the smallest distance possible (smallest separation)
  7. Neighbouring vertices in neighbouring layers should be placed as close as possible (closeness)
  8. Each vertice should be placed at the barycenter of the neighbouring vertices (balance)
  9. Bends should be minimised and edges made as straight as possible (linearity)
Don't worry if the above rules don't make sense right now. As you go through the blog post, it will become apparent what the terms 'barycenter' or 'isomorphic graph' mean.

Great, but what exactly is the Sugiyama algorithm and how does it work?

You will notice that above, i just mention what the Sugiyama algorithm is meant to do i.e. automatically lay out nodes in a hierarchial fashion...but it does not explicitly explain how that happens! Well, that';s what this section is all about...explaining how the Sugiyama algorithm works before i delve into the finer details using source code to provide a more detailed explanation.

The Sugiyama algorithm lays out the vertices in a graph (which coincidentally must be an acyclic digraph i.e. a directed graph (the edges have arrows or direction) and where no set of the edges form a cycle) in four stages
  1. The graph is made acyclic (i.e. any cycles formed by edges in the graph are removed). Details of how this is done below
  2. Vertices are assigned to layers in the graph
  3. The order of vertices in each layer is determined to reduce edge crossings as much as possible
  4. The position of the reordered vertices in each layer is determined to enhance balance and closeness.
Four mysterious stages(or algorithms)....that makes up Sugiyama!

There are obviously a lot more details that one will need to understand about each of the four steps above to clearly visualize the internals of Sugiyama. I will go through the four steps above using source code in java to clearly explain the four steps of Sugiyama and how they work.

Before we go on...

You are free to code the nodes/vertices and edges in your software as you see fit. I will explain the way i did it (a model i will be using all through the blog post) but feel free to adapt it to your source code once the stages below are clear.

The Graph that contains all the nodes and edges is denoted by the instance:

Graph graph = graphFrame.getGraph();

Edges are denoted and obtained from an arraylist of all the edges by the following code:

Edge e = (Edge)graphEdges.get(i); // i defines the index/counter for the current edge retrieved

Vertices are denoted and obtained by the following code:

Node n = (Node) graphNodes.get(t) // t is the counter for the current node retrieved

Step 1: making the graph Acyclic

I'll be a bad guy and let you figure this one out. Here's the reason why. When i was implementing the Sugiyama code, i knew the context that the software would be used...to lay out UML diagrams and if you think about it for a second, you will see that it becomes very difficult to create cycles..but i do promose to get back with code on how to do this later.

Step 2: Assigning vertices to layers


This step actually involves 2 steps; assigning vertices to layers in the graph and inserting dummy vertices. Dummy vertices...what's that? Well here's the thing. An edge in a graph(according to the Sugiyama algorithm should have a span of on 1) i.e. it should interconnect 2 vertices in adjacent layers. However, a vertice may be in layer 2 while connected to a vertice in layer 4; thus the edge would have a span of 2. Thus this step involves not just assigning the vertice to a layer but inserting dummy vertices     to edges that have a span of 1. So the question remains...what criteria is used to assign vertices/nodes to layers. Simple

First select the root nodes. These are the nodes that have edges flowing away from them but none coming into them. Then from each of the root nodes recursively go through all the nodes in the graph incrementing the level that a graph should have and assigning this level to the graph. If a node/vertice with a level assigned is encountered, check if the proposed level that it is to be assigned is less than the current level. If this is the case, assign it the proposed level (that is greater than the initial node level)




<fig 2: flow diagram of the algorithm to assigning levels to the nodes in a graph>

Here's a method referred to as getNodeChildren() with the signature indicated:

 private Node getNodeChildren(Node n, ArrayList graphEdges, Integer startCounter)
    {
        /**
         * The algorithmn for this function is as given below
         * starting with sourceNode 0,
         *      loop through all the edges
         *      find which edges have source Node as sourceNode0
         *      get the endNodes for these edges
         *      Add these endNodes as the childNodes for recursion
         *      increment the level that the nodes are found at
         */
        ArrayList endNodes = new ArrayList();
        if(n.getLayer()==-1||startCounter > n.getLayer())
        {
           n.setLayer(startCounter);
        }
        
        for(int i = 0 ; i 0)
        {
            startCounter += 1;
            for(int k = 0 ; k< endNodes.size(); k++)
            {                
                getNodeChildren((Node) endNodes.get(k),graphEdges, startCounter);
            }
        }
        
        // output the layer at which every node is found
       System.out.println("\nLayer: "
                    + n.getLayer()
                    + " Node: "
                    + n.toString());
        return n;
    }

Once all the nodes have a level(or layer number assigned to them, then comes the next step; adding dummy vertices where the edges have a span of more than 1. To do this, here is a simple function that gets the maximum span of all the edges:
 
     /**
     * Function gets the maximum span for all edges to allow
     * for the introduction of dummy vertices
     * @param edges
     * @return the max edge span
     */
    public int getMaxSpan(ArrayList edges)
    {
        int maxSpan = 0;

        for(int i=0;imaxSpan)?(endNodeLayer-startNodeLayer)
                    :maxSpan;
        }

        return maxSpan;
    }

Once the spans for all the edges have been determined, now comes the interesting part, the dummy vertices (and edges) are added:
 
     /**
     * function inserts dummy vertices and breaks down the
     * edges into segments in cases where the edges
     * span is >1
     * @param g the edge whose span is checked if>1
     */
    public void insertDummyVerticesAndEdges(Edge g)
    {
        /**
         * a. Loops through an edge
         *      i. finds out if the span is >1 by checking the
         *          start and End Node layers
         *      ii. Adds dummy vertices if span>1 at
         *          all calculated intermediate points
         *          (endNode.coords - startNode.coords)/span
         *      iii.Add segements to replace edge with start
         *          and end nodes indicated
         * e. remove the long edge whose span >1
         */
        Node startNode = g.getStart();
        Node endNode = g.getEnd();

        // get the span of the edge
        int span = endNode.getLayer()-startNode.getLayer();
        if(span>1)
        {
            System.out.println("\t* Edge "
                    + g.toString()
                    + " has span of "
                    + Integer.toString(span)
                    + " from "
                    + Integer.toString(startNode.getLayer())
                    + " to "
                    + Integer.toString(endNode.getLayer()));

            // calculate the incremental positions for the
            // display of the dummy vertices
            double xIncrement = (endNode.getBounds().getCenterX()
                                        - startNode.getBounds().getCenterX())/span;
            double yIncrement = (endNode.getBounds().getCenterY()
                                        - startNode.getBounds().getCenterY())/span ;
            
            // add dummy vertice at the various
            // layers that the edge spans
            // dummy_vertices_added  = (span -1)
            // @todo create a dummyNode vertice
            int noDummyVertices = span -1;
            Node previousNode = startNode;
            
            for(int i = 0 , currLayer = startNode.getLayer() + 1
                    ; i < noDummyVertices
                    ; i++ , currLayer++)
            {
                PointNode d = new PointNode();
                d.setLayer(currLayer);

                // show that it's a dummy variable
                d.setDummy(); 

                // add the Node to the graphs set of Nodes
                // get the dummy vertices coordinates
                final double xCoord = previousNode.getBounds().getCenterX()
                                    + xIncrement;
                final double yCoord = previousNode.getBounds().getCenterY()
                                    + yIncrement;

                // add Dummy Node
                graph.add(d, new Point2D.Double(xCoord,yCoord));

                System.out.println("\t\tAdded Dummy Node "
                        + d.toString()
                        + " at Layer "
                        + Integer.toString(d.getLayer())
                        + " and coords "
                        + d.getBounds()
                        + " ["
                        + Double.toString(xCoord)
                        + ","
                        + Double.toString(yCoord)
                        +"]");

                // add edge joining previous node
                // to newly created node
                // create new edge
                ClassRelationshipEdge r = new ClassRelationshipEdge();

                // connect the Nodes using the new edge
                graph.connect(r, previousNode, d);

                System.out.println("\t\t\tAdded new edge "
                        + r.toString()
                        + " at "
                        + previousNode.getBounds()
                        + " to "
                        + d.getBounds());                                
                                
                // make sure that the final edge between
                // the last dummy variable and endNode
                // is plotted
                if(i==noDummyVertices-1)
                {
                    // create new edge
                    ClassRelationshipEdge lastEdge = new ClassRelationshipEdge();

                    // connect the Nodes using the new edge
                    graph.connect(lastEdge, d, endNode);

                    System.out.println("\t\t\tAdded Last edge "
                        + r.toString()
                        + " at "
                        + d.getBounds()
                        + " to "
                        + endNode.getBounds());
                }

                // set the previous Node to the current
                // highlighted Node
                previousNode = d;
            }

            // since edge is decomposed to various
            // segments linking the various
            // dummy vertices, remove the edge
            graph.removeEdge(g);
        }
        else
        {
            // span is 1, OK
            // do nothing...
             return;
        }
    }

Step 3: Ordering the vertices to reduce the number of edge crossings:


So now all the nodes have been assigned to specific layers and dummy nodes (and by extension dummy edges) have been added. However, the vertices in each of the layers may be ordered in such a way that a lot of the edges connecting vertices(including the dummy vertices) between adjacent levels are crossing each other a lot. This is highly unacceptable and from the list of aethestic considerations below, the reduction of edge crossings is a key feature of the algorithm


<fig 3: a graph with a lot of edge crossings and one with few edge crossings>

So how do we reorder the vertices in each of the levels to ensure that the edge crossings are reduced as much as possible?

The number of edge crossings in vertices in adjacent layers is obtained by obtaining the adjacency matrices of the edges in adjacent layers, calculating the barycenters of these adjacency matrices and reordering the nodes so that the barycenters of the nodes are arranged from the smallest to the largest. Once a reordering occurs, the barycenters are recalculated and reordering done until the minimum number of edge crossings are obtained.




<fig 3: flow diagram illustrating the algorithm>


I realise that the description above is as cryptic as, well...trying to figure out how to make a chicken sing and so i'll try break down the various components of this step:

(a) Obtaining the adjacency matrices of adjacent layers in the graph where the vertices have been assigned layers


PS: I have had this article in my archives for very long before i decide to publish it. Please let me know if there are any errors &/ ommisions in the comment section below

Sunday, September 4, 2011

Enhancing Preservice training for Medical Students using ICDL - Optimistic or Deceiving?

August 26th marked a great day...and a bad one for me and at least 44 other folks. Could be more but at minimum 45. Let me explain. One of the programs that i work for has the mandate to improve pre-service training. To some of you folks, the term 'preservice' may be quite new if not outright confusing and so that i do not antagonize any further, i'll briefly describe it. 'Preservice' or more definitively 'Preservice Training' is pretty much the training you get while you are still in school(in whatever form that translates to for you...college, campus, nursery...); the training/learning you get before you get into 'service' or into your profession as a graduate. Our focus here was lecturers training students at Medical Teaching Centers; students in Clinical Medicine, Pharmacy, Environmental Science...that kind of thing. Clear? Great. Let's go on. This program that is working to fulfill this mandate decided to implement a very exciting idea, one that i must say i am very very proud of...to train lecturers on basic computer skills; and using the ICDL curriculum. Very exciting indeed!


Well, the program's lean and the team members few . So there weren't a lot of folks to volunteer on this one, just me. Don't get me wrong. I was excited to get 60 lecturers from colleges all across the country to comprehensively cover the basics on computer use but setting up the logistics was going to be a whole other thing, right. But then there was that tenet..."What does not kill you makes you stronger"...that encouraged me on. I rarely shudder at a task before taking it on, irregardless of how mundane, insurmountable or challenging it seems and I wasn't scared of this task either, i was just reflecting on the sheer volume of logistics to be addressed...and they didn't look pretty. What was to be done, you may wonder? Well quite a bit, really. I had to engage an accredited ICDL training college, find a 'spot' with a conducive environment and computer labs with 60 computers to spare and good internet access for 3 weeks, find accommodation for 60 folks, agree on a training schedule with the facilitators from the accredited ICDL college, arrange for transport from the hotel to the training venue and work with 60 or so colleges to get a participant from each...all in 2 weeks and with the companies hideous procedures and paper-work. I was starting to wonder if fate had an insidious plot to pit me against a seemingly Herculean task. Angst wouldn't even begin to describe my feelings at that point.


And at the end of all this the poignant question still hang in the air...would this really benefit the students at the various medical training colleges to obtain better instruction and learning?

Well, i was extremely optimistic advocating for this activity because (without sounding didactic :) ), i did believe that it really would translate to better teaching and learning. A number of lecturers lacked vital computing skills and as a recent survey conducted by a leading (if not the leading) human survey firm here in Kenya established, these lecturers are almost always exposed to a more technologically savvy and vibrant set of individuals in the form of their students. Students who are not only competent in the use of technology but relish its use and in my view would even prefer a mode of instruction that incorporated an advanced use of social networking (twitter and facebook...in case you are wondering), multimedia content in the form of DVD's and youtube videos, Interactive gaming and 3D simulation and eLearning platforms all offered using what is perhaps the world most ubiquitous consumer electronics platform(after the radio, of course); the mobile phone...among other platforms (tablets for instance, provide a very viable opportunity). Anyhow, back to the interesting stuff.

So to cut the long story short, i ended up with 49 faculty members from the various colleges nationwide, 44 of whom undertook the ICDL accreditation examinations! While the training was by large very well conducted, the testing was so poorly administered, it stuck out like a sore thumb to viciate the whole thing. The 'accredited' ICDL training college lacked the capacity to administer the exams (a fact that it was dishonest about) since it did not have the official testing software or its license key and rushed to get things done at the very last minute encumbering the lectures with a dueling one and half days of back-to-back exams and then flooding them with certificates of attendance to hide the shame and dissatisfaction of having none of the lecturers pass all the 7 otherwise straight forward exams. Indeed what was most distressing was that even these dismal results were only possible due to the assistance of an ICDL regional representative who dropped all pending activities and took time to urgently address this. So many other things could have gone wrong!

Anyhow, lessons learnt 1. Let the students undertake ICDL exams at a registered test center not at an alternative venue, it will prolong your life by a couple of days 2. Ensure that every little detail discussed with the ICDL accreddited college is verified...and when i say 'every little detail...' i mean exactly that 'every little detail...'. This may very well turn out to be your Achilles foot. It was mine. And never impose deadlines on the facilitators, let them take the wheel on this one. It is 'their training' in a way 3. Attend as many sessions of the training as possible. The participants feel secure and the facilitators accountable. It shouldn't be like this, I know. Oh well... 4. Make sure that the small but important details are addressed and fitted into the schedule. Aspects like administering the pretest and developing a schedule. They say an email(or anything on hard copy) is worth a thousand explanations. 5. Last but not least, evaluate and talk to both the participants and facilitators during the training. They will either be your best friends or enemies during the course of it.

I have no doubt that most if not all the participants obtained additional skills to navigate around a computer. I saw one participant who was exceptionally keen to learn but who previously had little exposure to computers and i was happy to see him learn something new every day. It is anticipated that this first set of lecturers will now be more confident and enthusiastic to use electronic learning materials, discuss the use of electronic learning materials and interact with their students using forums and emails. Inevitably this program will have to advocate for the inclusion of additional (in-service) content into pre-service curricula...and then the need to have precise and engaging learning content will become more apparent, urgent and necessary.