Extending the Alignment API with a new matcher

This version:
http://alignapi.gforge.inria.fr/tutorial/tutorial3/
Author:
Jérôme Euzenat & Cassia Trojahn dos Santos, INRIA & LIG

This tutorial explains, step-by-step, how to add your own ontology matcher, existing or new, to the Alignment API.

Other tutorials can be found here.

This tutorial has been designed for the Alignment API version 4.0.

Extending the Alignment API with your matcher will enable:

There are many different methods for computing alignments. However, they always need at least two ontologies as input and provide an alignment as output (or as an intermediate step because some algorithms are more focussed on merging the ontologies for instance). Sometimes they can take an alignment or various other parameters as input.

The Alignment API has been built around exactly this minimal interface, so it is easy to extend by adding a new matcher. It is used by creating an alignment object, providing the two ontologies, and calling the align method, which takes parameters and an initial alignment as arguments. The alignment object then bears the result of the matching procedure.
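
The call sequence just described (create the object, provide the ontologies, call align, read the result) can be sketched as follows. This is only an illustration of the pattern: SketchProcess and EchoProcess are hypothetical stand-ins written for this example so that it compiles on its own, not classes of the Alignment API, and the ontology URIs are made up. Only the method names init and align come from this tutorial.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for an alignment process; NOT the real API interface.
abstract class SketchProcess {
    protected URI onto1;
    protected URI onto2;
    protected final List<String> cells = new ArrayList<>();

    // Step 1: provide the two ontologies.
    void init(URI o1, URI o2) {
        onto1 = o1;
        onto2 = o2;
    }

    // Step 2: run the matcher; the initial alignment may be null.
    abstract void align(SketchProcess initial, Properties params);

    // Step 3: the object itself bears the result.
    List<String> getCells() {
        return cells;
    }
}

// A trivial "matcher" that records one made-up correspondence.
class EchoProcess extends SketchProcess {
    @Override
    void align(SketchProcess initial, Properties params) {
        if (initial != null) {
            cells.addAll(initial.getCells()); // start from the initial alignment
        }
        cells.add(onto1 + " = " + onto2);
    }
}

class LifecycleDemo {
    static List<String> run() {
        SketchProcess a = new EchoProcess();
        a.init(URI.create("http://example.org/onto1"),
               URI.create("http://example.org/onto2"));
        a.align(null, new Properties());
        return a.getCells();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```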

Preparation

First you must download the Alignment API and check that it works as indicated here.

You will then go to the directory of this tutorial by doing:

$ cd tutorial3

You can clean up previous trials by:

$ rm results/*

We assume that you have developed a matcher MyMatcher.java. Working with it will help you understand the way the Alignment API works. If you have your own matcher, you will have to substitute it for MyMatcher.

MyMatcher can be compiled by:

$ javac -classpath ../../../lib/align.jar:../../../lib/procalign.jar -d results MyMatcher.java

Adding an existing matcher the straightforward way

Embedding your matcher is a very simple task. In general, you do not need to go further than this section to do it. Basically, adding your matcher within the Alignment API amounts to:

  1. get the parameters, mainly the ontologies, for your matcher;
  2. run your matcher;
  3. output the results within the Alignment structure.

Because the Alignment API already comes with a feature-rich implementation, the simplest, and advised, procedure consists of taking advantage of that implementation.

We will do this by simply pointing to an instance of your matcher class from a host Alignment class. This guarantees the independence of both implementations, whose interactions are limited to the three steps above. If you want to achieve a deeper integration, please also read the next section.

Subclassing BasicAlignment

Adding a new matching method amounts to creating a new class implementing the AlignmentProcess interface. Generally, this class can extend the proposed URIAlignment class, which extends the BasicAlignment class. The BasicAlignment class defines the storage structures for ontologies and alignment specification as well as the methods for dealing with alignment display. All methods can be refined (none is final). The only method it does not implement is align itself.

So, the first thing to do is to create a subclass of URIAlignment implementing AlignmentProcess.

package fr.inrialpes.exmo.align.impl;

import java.net.URI;
import java.util.Properties;

import org.semanticweb.owl.align.Alignment;
import org.semanticweb.owl.align.AlignmentProcess;
import org.semanticweb.owl.align.AlignmentException;

import fr.inrialpes.exmo.ontowrap.LoadedOntology;
import fr.inrialpes.exmo.align.impl.URIAlignment;

import my.domain.MyMatcher;

public class MyAlignment extends URIAlignment implements AlignmentProcess {

    public MyAlignment() {}

    public void align( Alignment alignment, Properties params ) throws AlignmentException {
        ... // to be replaced by the pieces of code below
    }
}

Retrieving ontologies

In order to align the ontologies, MyMatcher at least needs to retrieve them. They have been provided to the Alignment at the moment of its initialisation through the init() method. The coordinates of these ontologies have been stored in the Alignment structure. They can be retrieved as URIs in the following way:

URI uri1 = getOntology1URI();
URI uri2 = getOntology2URI();

This provides the real URIs identifying the ontologies if they are available. But it may be preferable to use pointers to the resources actually containing the ontologies. This is what the getFile methods do:

URI url1 = getFile1();
URI url2 = getFile2();

Then, of course, if MyMatcher requires URIs, it is very simple to call it, as in:

MyMatcher matcher = new MyMatcher();
matcher.match( url1, url2 );

If the matcher requires parameters, they can be obtained from the properties passed to the align( Alignment, Properties ) method. These use the standard Java Properties class.
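
As an illustration of reading a matcher parameter, the following sketch pulls a numeric value out of a java.util.Properties object and falls back to a default. The parameter name "threshold" and the helper readThreshold are hypothetical; the tutorial does not name any specific parameter.

```java
import java.util.Properties;

class ParamsDemo {
    // Reads the hypothetical "threshold" parameter, falling back to a default
    // when it is absent or not a valid number.
    static double readThreshold(Properties params, double defaultValue) {
        String raw = params.getProperty("threshold");
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Double.parseDouble(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("threshold", "0.6");
        System.out.println(readThreshold(params, 0.5));
    }
}
```

Inside align, the same helper could be called on the params argument before running the matcher.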

Providing results

Now that the matcher has been run, its result must be communicated to the Alignment. If the matcher class provides its resulting correspondences as an iterator, it is possible to "fill" the Alignment with:

for ( Object[] c : matcher ){
    addAlignCell( (URI)c[0], (URI)c[1], (String)c[2], ((Double)c[3]).doubleValue() );
}

In this case, the entities may be URIs, the relation may be a string, e.g., "=", and the confidence a double, e.g., .375.
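
For the for-each loop over the matcher to compile, the matcher has to be an Iterable over Object[]. A minimal sketch of such a class follows, assuming each correspondence is stored as {entity1, entity2, relation, confidence}. The body of match is a placeholder that emits one made-up correspondence, and the "#Paper" and "#Article" fragments are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical matcher exposing its results as {entity1, entity2, relation, confidence}.
class MyMatcher implements Iterable<Object[]> {
    private final List<Object[]> found = new ArrayList<>();

    // Placeholder matching step: a real matcher would load and compare the ontologies.
    public void match(URI onto1, URI onto2) {
        found.clear();
        found.add(new Object[] {
            URI.create(onto1 + "#Paper"),
            URI.create(onto2 + "#Article"),
            "=",
            Double.valueOf(0.375)
        });
    }

    @Override
    public Iterator<Object[]> iterator() {
        return found.iterator();
    }
}

class MatcherDemo {
    public static void main(String[] args) {
        MyMatcher matcher = new MyMatcher();
        matcher.match(URI.create("http://example.org/onto1"),
                      URI.create("http://example.org/onto2"));
        for (Object[] c : matcher) {
            // In MyAlignment, this is where addAlignCell(...) would be called.
            System.out.println(c[0] + " " + c[2] + " " + c[1] + " (" + c[3] + ")");
        }
    }
}
```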

That's it.

The new matcher is implemented as MyAlignment using MyMatcher. It can be used in any situation in which a matcher is required by the Alignment API. Basically all the tutorials presented here can be played with your new class.

MyAlignment can be compiled by:

$ javac -classpath .:../../../lib/align.jar:../../../lib/procalign.jar -d results MyAlignment.java

and can be used in:

$ java -classpath .:../../../lib/ontowrap.jar:../../../lib/procalign.jar:results fr.inrialpes.exmo.align.cli.Procalign -i MyAlignment file://$CWD/myOnto.owl file://$CWD/edu.mit.visus.bibtex.owl

A more direct implementation of a matcher is also proposed in NewMatcher.java, a matcher which is based on the ObjectAlignment class and uses the Ontology interface to manipulate ontology content.

The full story

In reality, what has been achieved by the previous section is to implement the AlignmentProcess interface of the API. This interface declares only the align() method, but it is also a subinterface of the Alignment interface which requires you to implement many more methods and other classes.

Maybe this is not enough, or not efficient enough. In this case, the best way is to start from one of our classes implementing Alignment and extend it so that it implements your matcher. This is the example given in the NewMatcher.java class.

If this is still not sufficient, the API is declared in the org.semanticweb.owl.align package. You are welcome to reimplement it.

However, this will require reimplementing other types of objects (Cell, Relation) and implementing the full Alignment interface.

Advanced: You can develop a specialized matching algorithm by subclassing the classes provided in the Alignment API implementation (like DistanceAlignment).

Other natural extensions

There are other parts of the Alignment API which may be extended. The most natural ones are:

Extending the alignment language

The Alignment format can be extended to introduce metadata in alignments and correspondences. This is possible through the extensions of the Alignment API. Extensions are handled through the following methods:

public Collection getExtensions();
public String getExtension( String uri, String label );
public void setExtension( String uri, String label, String value );

So extensions are identified by their namespace (uri) and their label. Their value is a String. We publish a list of already declared extensions. If they fit your needs, please use them; if you create new ones, please tell us.
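
To make the (namespace, label) keying concrete, here is a minimal sketch of a store with the same get/set shape as the three methods listed earlier in this section. ExtensionStore is a hypothetical class written for this illustration and says nothing about how the Alignment API stores extensions internally. The namespace and label used in main are made up as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical store: each extension is keyed by the pair (namespace URI, label).
class ExtensionStore {
    private final Map<List<String>, String> table = new LinkedHashMap<>();

    public void setExtension(String uri, String label, String value) {
        table.put(Arrays.asList(uri, label), value);
    }

    // Returns null when no extension is registered under (uri, label).
    public String getExtension(String uri, String label) {
        return table.get(Arrays.asList(uri, label));
    }

    // Each element is {uri, label, value}.
    public Collection<String[]> getExtensions() {
        List<String[]> all = new ArrayList<>();
        for (Map.Entry<List<String>, String> e : table.entrySet()) {
            all.add(new String[] { e.getKey().get(0), e.getKey().get(1), e.getValue() });
        }
        return all;
    }

    public static void main(String[] args) {
        ExtensionStore store = new ExtensionStore();
        store.setExtension("http://example.org/ext#", "tool", "MyMatcher");
        System.out.println(store.getExtension("http://example.org/ext#", "tool"));
    }
}
```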

More advanced: not documented...

Further packaging your matcher

With slightly more work, it is possible to make the new class easier to use.

Packaging for evaluation in the SEALS platform

This is subject to change, but you can find instructions here. Check that these instructions are up-to-date.

Making the resulting class jar-launchable

In order to have this new class directly jar-launchable, it is sufficient to deliver it as a jar file containing the newly introduced classes plus a MANIFEST.MF file referring to all the necessary packages and launching Procalign:

Manifest-Version: 1.0
Created-By: Jerome.Euzenat@inrialpes.fr
Class-Path: align.jar ontowrap.jar procalign.jar mymatcher.jar
Main-Class: fr/inrialpes/exmo/align/cli/Procalign

The jar may then be launched by:

$ java -jar lib/mymatcher.jar file://$CWD/rdf/onto1.owl file://$CWD/rdf/onto2.owl -i my.domain.MyAlignment

Preparing the class for the Alignment server

In order to be visible from the Alignment server, the class must not only implement the AlignmentProcess interface, but it must also declare that it implements it in the class header even if it extends a class that implements the interface. This is a limitation of Java support.
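
This requirement is consistent with how Java reflection reports interfaces: Class.getInterfaces() lists only the interfaces named in a class's own implements clause, not those inherited from a superclass. Assuming the server discovers matchers with a check of this kind (an assumption; this tutorial does not say how the server does it), the sketch below shows the difference. All type names are stand-ins invented for the example.

```java
import java.util.Arrays;

// Stand-in types; none of these belong to the Alignment API.
interface MatchingProcess {}

class BaseAlignment implements MatchingProcess {}

// Inherits the interface but does not declare it in its own header.
class InheritedOnly extends BaseAlignment {}

// Declares the interface again in its own header.
class DeclaredAgain extends BaseAlignment implements MatchingProcess {}

class InterfaceCheckDemo {
    // True only if iface appears in c's own implements clause.
    static boolean declaresDirectly(Class<?> c, Class<?> iface) {
        return Arrays.asList(c.getInterfaces()).contains(iface);
    }

    public static void main(String[] args) {
        System.out.println(declaresDirectly(InheritedOnly.class, MatchingProcess.class)); // false
        System.out.println(declaresDirectly(DeclaredAgain.class, MatchingProcess.class)); // true
        // Both classes are nevertheless assignable to the interface.
        System.out.println(MatchingProcess.class.isAssignableFrom(InheritedOnly.class));  // true
    }
}
```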

Further exercises

More info: http://alignapi.gforge.inria.fr/tutorial/



$Id: index.html 1717 2012-04-03 06:23:27Z euzenat $