Extending Revapi

Extension points

The architecture overview sports a simple diagram that hints at several extension points available in Revapi. Each extension point is configurable and can provide a JSON schema for its configuration.

API Analyzer

An API analyzer is the main interface for implementing API checks for a custom "language". It provides and configures the archive analyzers and the API difference analyzers, both detailed in the example project.

There is an example showing a custom API analyzer implementation.

Archive Analyzer

An archive analyzer is instantiated and configured by the API analyzer to analyze the archives of one version of the API. It represents the results of the analysis as an element forest (i.e. a set of element trees). Archive analyzers can take advantage of the tree filters to leave out some of the elements and save some processing power.

There is an example showing a custom archive analyzer implementation.

Tree Filter

A tree filter can filter out elements from the element forest before they are passed further down the API analysis pipeline. The same set of tree filters is applied to both the old API and new API element forests.

There is an example showing a custom tree filter implementation.

Element Matcher

An element matcher is a kind of "helper" extension that can be used by other extensions, like tree filters or difference transforms, to identify the elements matching some user-defined criteria.

There is an example showing a custom element matcher implementation.

Difference Analyzer

The magic happens in the difference analyzers. Revapi simultaneously traverses the two element forests discovering new or removed elements and matching the comparable elements in them (using a co-iterator). It then passes the matched pairs to the difference analyzer that performs the actual analysis of changes and provides the reports summarizing them.

A report summarizes the differences found between two elements, one from the old API and the other from the new API. Removals and additions are represented by one element of the pair being null.

In addition to the two elements in comparison, the report also contains the list of the differences the analyzer found between the two.
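
The shape of a report described above can be pictured as a small data structure. The following is only a sketch: `ReportSketch`, the use of plain strings for elements, and the difference code `acme.element.removed` are all made up for illustration and are not Revapi's own types or codes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a report; this is NOT Revapi's own class.
final class ReportSketch {
    final String oldElement;        // null when the element only exists in the new API
    final String newElement;        // null when the element only exists in the old API
    final List<String> differences; // made-up difference codes found for this pair

    ReportSketch(String oldElement, String newElement, List<String> differences) {
        if (oldElement == null && newElement == null) {
            throw new IllegalArgumentException("at least one element of the pair is required");
        }
        this.oldElement = oldElement;
        this.newElement = newElement;
        // Defensive copy so the report cannot be changed through the caller's list.
        this.differences = Collections.unmodifiableList(new ArrayList<>(differences));
    }

    // An addition has no counterpart in the old API.
    boolean isAddition() {
        return oldElement == null;
    }

    // A removal has no counterpart in the new API.
    boolean isRemoval() {
        return newElement == null;
    }

    public static void main(String[] args) {
        ReportSketch removed = new ReportSketch("o_d2", null,
                Collections.singletonList("acme.element.removed"));
        System.out.println(removed.isRemoval() + " " + removed.differences);
    }
}
```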

There is an example showing a custom difference analyzer implementation.

Difference Transform

Once the differences are found, they are supplied to the difference transforms. These extensions can, as the name suggests, transform the found differences into different ones or remove them from the results altogether.
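
To make the idea of "transform or remove" concrete, here is a minimal sketch that models a transform as a function returning either a replacement difference or nothing. The `Diff` class, the severity strings and the `acme.*` codes are hypothetical, and the simple one-pass chaining shown here is an assumption of this sketch, not a description of how Revapi actually applies its transforms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class TransformSketch {
    // Hypothetical difference: a code plus a severity label.
    static final class Diff {
        final String code;
        final String severity;

        Diff(String code, String severity) {
            this.code = code;
            this.severity = severity;
        }
    }

    // Runs every difference through the transforms in order.
    // An empty Optional means "drop this difference from the results".
    static List<Diff> apply(List<Diff> found, List<Function<Diff, Optional<Diff>>> transforms) {
        List<Diff> result = new ArrayList<>();
        for (Diff diff : found) {
            Optional<Diff> current = Optional.of(diff);
            for (Function<Diff, Optional<Diff>> transform : transforms) {
                if (!current.isPresent()) {
                    break; // already removed by an earlier transform
                }
                current = transform.apply(current.get());
            }
            current.ifPresent(result::add);
        }
        return result;
    }

    // Removes differences with the made-up code "acme.ignored".
    static Optional<Diff> dropIgnored(Diff d) {
        return d.code.equals("acme.ignored") ? Optional.<Diff>empty() : Optional.of(d);
    }

    // Rewrites "acme.renamed" differences into non-breaking ones.
    static Optional<Diff> downgradeRenames(Diff d) {
        return d.code.equals("acme.renamed")
                ? Optional.of(new Diff(d.code, "NON_BREAKING"))
                : Optional.of(d);
    }

    public static void main(String[] args) {
        List<Diff> found = Arrays.asList(
                new Diff("acme.ignored", "BREAKING"),
                new Diff("acme.renamed", "BREAKING"));
        List<Function<Diff, Optional<Diff>>> transforms = new ArrayList<>();
        transforms.add(TransformSketch::dropIgnored);
        transforms.add(TransformSketch::downgradeRenames);
        for (Diff d : apply(found, transforms)) {
            System.out.println(d.code + " " + d.severity);
        }
    }
}
```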

There is an example showing a custom difference transform implementation.

Reporter

Finally, once the final set of differences is settled, it is passed to the reporters. These are responsible for reporting the found differences to the caller in some way (standard output, a database, XML files, or anything else one can imagine).

There is an example showing how to write a custom reporter implementation.

Analysis Workflow

The caller is first required to supply the pipeline configuration, which tells Revapi what extensions it will have available for analysis, along with other configuration.

The analysis is then executed using an analysis context. This context contains the two APIs that should be compared as well as the configuration of the extensions for that particular analysis.

The following diagram provides a detailed picture of the analysis workflow. The objects marked by the circled E represent a collection of extension instances of respective kinds which can be supplied to Revapi. Each such extension can be configured by the user.

API Traversal

All the elements produced by a single API analyzer need to be mutually comparable. The element trees produced by the archive analyzers are sorted. Revapi takes advantage of this fact when looking for the changed elements.

The traversal is performed in a depth-first manner. Let’s consider the two API trees below. The element names encode the position in the graph so that we can then illustrate the API traversal in text. The o_ and n_ prefixes mean that the element comes from the old (o_) or new (n_) API. The letter following the prefix indicates the "name" and is used merely for identification purposes. Finally, the number following the name indicates the "order" of the element amongst all the elements with the same name in both APIs.

The API traversal goes through both of the trees at the same time and produces pairs of elements where the first element comes from the old API and the second from the new API. Either of those elements can be null, indicating that there is no adequate counterpart in the other API. These pairs are then supplied to the difference analyzers, which produce the lists of differences found between the elements of each pair.

So let’s start the traversal…

(o_a1, n_a1) The elements o_a1 and n_a1 are considered equal (they have the same name and same "order"). When we have a match, we report them and dive into their children.

(o_b1, n_b1) Here, we’re in the same situation as before. The elements compare as equal and therefore we’re diving a level further.

(o_c1, n_c1) Again, the elements are matching and therefore they are both reported at the same time. There are no children to iterate so we’re continuing to the siblings.

(o_c2, null) Here we see the first "odd" thing. Only o_c2 is reported, with no element from the new API. This is because there is no matching element in the new API: the next new element, n_c4, is considered "greater".

(null, n_c4) Here we see a similar situation only with the new API. We’ve already reported o_c2 and there is no further element to report in the old API. But we need to report n_c4 which is "greater" than all the elements in the old API. Therefore, we report it. We’ve depleted all siblings in both APIs so we continue in the upper level.

(o_b2, n_b2) These two elements match and are therefore reported together. Let’s dive into children.

(null, n_d1) Here we see that the element n_d1 from the new API is considered "less" than all the siblings in the old API. Therefore, it is reported first and alone.

(o_d2, null) The next in line for the combined sets of siblings is o_d2. It has no counterpart in the new API and therefore it is again reported alone.

(o_d3, null) The next in line is o_d3. It again doesn’t have a matching counterpart in the new API and so is again reported alone.

(o_d4, n_d4) Now we arrive at elements that are considered equal in both APIs so they’re reported together.

(o_b3, null) In the previous step we finished visiting all the "d" siblings and therefore now we’re on the "b" level. Here, the next in line is o_b3 that has no matching counterpart in the new API.

(null, n_b4) And at last we’ve arrived at the last unreported element, n_b4. We’ve visited all the elements in both APIs.
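
The walk-through above can be reproduced with a short, self-contained sketch of co-iteration over two sorted trees. Everything here is an assumption made for illustration: `ApiNode` and `CoIterationSketch` are hypothetical names rather than Revapi types, the two trees are reconstructed from the pairs listed in the walk-through, and the ordering simply compares the name first and the "order" number second. The sketch also does not descend into the children of unmatched elements, since the walk-through contains no such case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an API element; not a Revapi type.
final class ApiNode implements Comparable<ApiNode> {
    final char name;
    final int order;
    final List<ApiNode> children; // assumed to be sorted already

    ApiNode(char name, int order, ApiNode... children) {
        this.name = name;
        this.order = order;
        this.children = Arrays.asList(children);
    }

    @Override
    public int compareTo(ApiNode other) {
        int byName = Character.compare(name, other.name);
        return byName != 0 ? byName : Integer.compare(order, other.order);
    }

    String label(String prefix) {
        return prefix + name + order;
    }
}

final class CoIterationSketch {
    // Returns the (old, new) pairs in the order a depth-first co-iteration visits them.
    static List<String> pairs(List<ApiNode> oldSiblings, List<ApiNode> newSiblings) {
        List<String> out = new ArrayList<>();
        walk(oldSiblings, newSiblings, out);
        return out;
    }

    private static void walk(List<ApiNode> olds, List<ApiNode> news, List<String> out) {
        int i = 0;
        int j = 0;
        while (i < olds.size() || j < news.size()) {
            ApiNode o = i < olds.size() ? olds.get(i) : null;
            ApiNode n = j < news.size() ? news.get(j) : null;
            // A depleted side counts as "greater" so the other side gets reported.
            int cmp = o == null ? 1 : (n == null ? -1 : o.compareTo(n));
            if (cmp == 0) {
                out.add("(" + o.label("o_") + ", " + n.label("n_") + ")");
                walk(o.children, n.children, out); // matched: dive into the children
                i++;
                j++;
            } else if (cmp < 0) {
                out.add("(" + o.label("o_") + ", null)"); // old element has no counterpart
                i++;
            } else {
                out.add("(null, " + n.label("n_") + ")"); // new element has no counterpart
                j++;
            }
        }
    }

    // Old API tree as implied by the walk-through.
    static ApiNode oldApi() {
        return new ApiNode('a', 1,
                new ApiNode('b', 1, new ApiNode('c', 1), new ApiNode('c', 2)),
                new ApiNode('b', 2, new ApiNode('d', 2), new ApiNode('d', 3), new ApiNode('d', 4)),
                new ApiNode('b', 3));
    }

    // New API tree as implied by the walk-through.
    static ApiNode newApi() {
        return new ApiNode('a', 1,
                new ApiNode('b', 1, new ApiNode('c', 1), new ApiNode('c', 4)),
                new ApiNode('b', 2, new ApiNode('d', 1), new ApiNode('d', 4)),
                new ApiNode('b', 4));
    }

    public static void main(String[] args) {
        for (String pair : pairs(Arrays.asList(oldApi()), Arrays.asList(newApi()))) {
            System.out.println(pair);
        }
    }
}
```

Because both sibling lists are sorted, a single forward pass with two indices is enough to decide whether the current elements match or which one has to be reported alone.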

Packaging Extensions

Extensions should be packaged as ordinary jar files. Revapi is not yet fully modularized (it only defines automatic module names), so it is recommended to place Revapi and all extensions on the classpath, not the module path.

By convention, Revapi extensions are discovered using the service loader. Therefore, if you want your extension to be found by, for example, the revapi-maven-plugin, you need to place the appropriate service file in the META-INF/services directory of your jar (and then add that jar as a dependency of the revapi-maven-plugin).

E.g., if you define a new difference transform in your extension, called com.acme.AcmeDifferenceTransform, you need to create a file called org.revapi.DifferenceTransform in the META-INF/services directory of your jar. Each line in that file should contain the fully qualified class name of an implementation of the difference transform.
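
The lookup side of this convention is the JDK's `java.util.ServiceLoader`. The sketch below shows the mechanism with a hypothetical stand-in interface, `AcmeExtensionPoint`, in place of a real Revapi extension interface, so that it compiles without Revapi on the classpath. It illustrates the general service loader mechanism only and makes no claim about how Revapi itself calls it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical stand-in for an extension interface such as org.revapi.DifferenceTransform.
interface AcmeExtensionPoint {
    String name();
}

final class ServiceLookupSketch {
    // For the stand-in above, the service file would live in the jar at
    //   META-INF/services/AcmeExtensionPoint
    // and contain one fully qualified implementation class name per line.
    // For the example in the text, the file would be
    //   META-INF/services/org.revapi.DifferenceTransform
    // containing the line
    //   com.acme.AcmeDifferenceTransform

    // Collects every implementation of the given interface that the service
    // loader can find through service files visible to the context class loader.
    static <T> List<T> discover(Class<T> extensionPoint) {
        List<T> found = new ArrayList<>();
        for (T extension : ServiceLoader.load(extensionPoint)) {
            found.add(extension);
        }
        return found;
    }

    public static void main(String[] args) {
        List<AcmeExtensionPoint> extensions = discover(AcmeExtensionPoint.class);
        System.out.println("Found " + extensions.size() + " extension(s)");
    }
}
```

Run as is, with no matching service file on the classpath, the lookup finds nothing. An implementation is only discovered once a jar containing both the class and the service file naming it is added to the classpath.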

Take a look at the example extensions which define these service files.