Classycle logo

2. How Classycle works

Classycle Analyser performs the analysis in five steps:
  1. Reading the class files
  2. Creating the dependency graph of classes
  3. Creating the dependency graph of packages
  4. Detecting cycles
  5. Calculating layer indices
  6. Writing a report
The first two are also done by the Dependency Checker.

2.1 Reading the Class Files

The analysis relies on .class files and not on .java source files. Classycle extracts from a class file the so-called constant pool which contains references to other classes.

2.2 Creating the Dependency Graph of Classes

In order to detect class dependencies Classycle analyses the constant pool. A constant pool contains different types of constants (for more details see Chapter 4 of The Java Virtual Machine Specification). Classycle evaluates two or three of them:
  • Class constant: Refers to a UFT8 constant which contains the fully-qualified class name with '/' instead of '.' as package name separator.
  • UFT8 constant: Contains an arbitray unicode-encoded string.
  • String constant (optionally): Refers to a UFT8 constant which contains a string constant from the Java code.
It is not enough to evalute class constants to find all refered classes. For example, the constant pool of
class SimpleClass {
  String a;
}
contains only class constants for the class itself (SimpleClass) and the superclass (java.lang.Object). But there is no class constant for java.lang.String. Instead, there is a UTF8 constant with the value Ljava/lang/String;. This is a so-called descriptor which represents the type of a field or method. Unfortunately, there is no special constant type for such descriptors. Classycle has to analyse each UTF8 constant whether it is a descriptor or not. Fortunately, descriptors are specified by a grammer (see Section 4.3 of the Java VM Specifications). Thus, if the UTF8 constant is a descriptor in accordance with this grammer Classycle extracts the class names from this descriptor. Otherwise, the UTF8 constant will be ignored. Since version 1.3 Classycle can also handle the extended grammer for Java Generics signatures.

If the commandline option -reflectionPattern is used also string constants in the pool are analysed. A string constant (as it appears in the code of a Java file) is handled as a class name if the following two conditions are fulfilled:

  1. The string is a syntactically valid fully-qualified class name.
  2. The string matches one of the specified patterns.
If the option -reflectionPattern is used but no pattern has been defined only the first condition has to be fulfilled. But is always recommended to specifiy at least one pattern. Otherwise strings like hello or www.w3c.org are handled as class references.

Note, that this reflection option is a must if a class with the following line of code should have a reference to java.lang.Thread:

Class clazz = Thread.class;

This analysis leads to a direct dependency in the following cases:

  • Superclass.
  • Field type.
  • Classes of parameters and return value of any method declared or invoked.
  • Class implementing an invoked method.
  • String constants if the reflection option is used and the above-mentioned conditions are fulfilled.
Array types are not refered. Instead the element type is refered. Note, that some references in the source code do not appear in the class file:
  • Local variables which are never used.
  • Constants (i.e., static final declared fields) which can by resolved at compile time.

Examples:

Source CodeEvaluated ConstantsRefered Classes
class MyException 
          extends Exception {
  int left = java.awt.Label.LEFT;
  MyException(String msg) {
    super(msg);
    Integer n = null;
  }
}
Class constants:

MyException
java/lang/Exception

UFT8 constant with descriptor:

(Ljava/lang/String;)V

java.lang.Exception
java.lang.String
class ExampleClass {
  int[] counts = new int[5];
  String[][] table
      = new String[5][3];
}
Class constants:

ExampleClass
java/lang/String
java/lang/Object

UFT8 constants with descriptor:

[I
[[Ljava/lang/String;
()V

java.lang.Object
java.lang.String
class ExampleClass {
  boolean atLeastOneSystemProperty
      = System.getProperties()
              .keys()
              .hasMoreElements();
}
Class constants:

ExampleClass
java/lang/Object
java/lang/System
java/util/Hashtable
java/util/Enumeration

UFT8 constants with descriptor:

()V
()Ljava/util/Properties;
()Ljava/util/Enumeration;
()Z

java.lang.Object
java.lang.System
java.util.Hashtable
java.util.Enumeration
java.util.Properties
abstract class 
    ExampleClass<T extends Set<Byte>> 
          implements Collection<Short>
{ 
  T handle(Map<Long, 
               ? extends Float[]> map) 
  { 
    return null;
  }
}	
Class constants:

ExampleClass
java/lang/Object
java/util/Collection

UFT8 constants with descriptor:


(Ljava/util/Map;)Ljava/util/Set;
(Ljava/util/Map<Ljava/lang/Long;
   +[Ljava/lang/Float;>;)TT;
<T::Ljava/util/Set
      <Ljava/lang/Byte;>;>
   Ljava/lang/Object;
   Ljava/util/Collection
      <Ljava/lang/Short;>;
java.lang.Object
java.util.Collection
java.util.Map
java.util.Set
java.lang.Long
java.lang.Float
java.lang.Byte
java.lang.Short

After the analysis of the constant pools of all read class files Classycle creates a directed graph (short: digraph): Each vertex represents an analysed class. Vertices are connected by directed edges (usually represented by an arrow). A directed edge will be pointed from vertex A to vertex B if Classycle has detected that A is refering B. The edge is called an outgoing arc for A and an incoming arc for B. In general they will be outgoing arcs to external classes which are not in the set of classes analysed by Classycle. These outgoing arcs will be ignored in step 4 and 5 of the analysis.

2.3 Creating the Dependency Graph of Packages

From the class dependency graph it is easy to get the package dependency graph. Classycle just parses the class graph and extracts the package name of a class from its fully qualified class name. If the class is not in a package a package node named (default package) will be associated with this class. Package A depends on package B if there is at least one class in A which depends on at least one class in B.

2.4 Detecting Cycles

In a digraph one can walk along the arrows from vertex to vertex. Eventually one reaches either a vertex with no outgoing arc or an already visited vertex. In the later case one has detected a cycle. Classycle detects cycles with the help of Tarjan's algorithm for condensation of a digraph into an acyclic digraph of its so-called strong components:
A group of vertices has the property of being strongly connected if each vertex of the group can be reached from each other vertex of the group. A strong component is a strongly connected group of vertices which looses the property of being strongly connected if a vertex outside the group is added. A special case is a strong component containing only one vertex (like the strong component b in the example which contains only vertex C). Classycle reports each strong component with more than one vertex as cycles. In the above example it would detect three cycles.

The condensation leads to a new digraph where each vertex is a strong component of the original digraph. A directed edge goes from strong component a to b if a directed edge goes from at least one vertex of a to at least one vertex of b.

Note, that the detection of package cycles done by JDepend can be wrong. For example, JDepend would identify by mistake vertex C in the above-mentioned graph as cyclic. The reason is presumably that JDepend walks through the graph until it detects a vertex already visited. In this case it marks the starting vertex as cyclic.

2.5 Calculating Layer Indices

After condensation of the dependency graph to an acyclic digraph of its strong components Classycle calculates the layer index for each strong component. The layer index of a vertex in an acyclic digraph is the number of edges of the longest possible walk starting at this vertex. All class/package vertices of a cycle have the same layer index. For example, the classes D, G, and I in the above example have all the layer index 0.

Grouping the classes of an application in layers suggests a layered architecture of the application. In general this is not the case. The layers introduced by the analysis of Classycle are of finer granularity and they violates in general the rule that a layer should only know the layer one level deeper. This rule would forbid the arrows from d to b and c in the example.

To get a layering in this strict sense one should do a further condensation by grouping the strong components in such a way that all edges pointing from a vertex with layer index n to a vertex with index n-1. Unfortunately, there is in general no unique way to do this. For example, one can condense the above example further by forming two groups: One group contains d and the other one c. But a and b may be both either in the d or c group or a in the d group and b in the c group.

2.6 Writing a Report

Finally Classycle writes the result of the analysis in a report. The most extended report is the XML report. It lists all cycles, layers, classes and packages. With this report it is possible to reconstruct the original class/package digraph. Chapter 3 explains the measures listed in the XML report.