Algorithms as Plugins

Cátia Raquel Jesus Vaz edited this page Sep 17, 2023 · 16 revisions

In phyloDB, algorithms are supported as user-defined procedures packaged as Neo4j plugins, using the APOC library. A user-defined procedure is a mechanism for extending Neo4j with custom code that can be invoked directly from Cypher. These procedures can take arguments, perform operations on the database, and return results.

Plugins that extend Neo4j with inference and visualization algorithms will be available as procedures.

To implement an algorithm as a plugin (we use goeBURST as the example), take the following steps:

1. Create a new Procedure

Different procedures can be added by creating a sub-type of the Procedure type:

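import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.logging.Log;
import org.neo4j.procedure.Context;

public abstract class Procedure {
  // Injected by Neo4j when the procedure is invoked.
  @Context
  public GraphDatabaseService database;
  @Context
  public Log log;
}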

Thus, for our example, we define the new InferenceProcedure type. The method must also carry the @Procedure annotation so that it can be executed as a standard procedure. In this annotation, the value property defines the name under which the algorithm is invoked, and the mode property indicates whether the procedure may write data to the database.

public class InferenceProcedure extends algorithm.utils.Procedure {
  /*...*/
  @Procedure(value="algorithms.inference.goeburst", mode=Mode.WRITE)
  public void goeBURST(@Name("project") String project, @Name("dataset") String dataset, @Name("lvs") long lvs, @Name("inference") String inference){
    InferenceService service = new InferenceService(database, log);
    service.goeBURST(project, dataset, inference, lvs);
  }
}
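
Once the plugin is deployed, a procedure registered this way is invoked from Cypher with a CALL statement that uses the name given in the value property. The sketch below is plain Java with no Neo4j dependency: it only builds the parameterized CALL text and the matching parameter map. The class name GoeBurstCall and the argument values in main are hypothetical, and actually running the statement (for example through a driver or the browser) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the Cypher text and parameters for the procedure above.
final class GoeBurstCall {
    // Procedure name taken from the value property of @Procedure.
    static final String NAME = "algorithms.inference.goeburst";

    // Parameterized CALL; argument order follows the @Name declarations.
    static String cypher() {
        return "CALL " + NAME + "($project, $dataset, $lvs, $inference)";
    }

    // Parameter map keyed by the names used in the CALL text.
    static Map<String, Object> parameters(String project, String dataset,
                                          long lvs, String inference) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("project", project);
        params.put("dataset", dataset);
        params.put("lvs", lvs);
        params.put("inference", inference);
        return params;
    }

    public static void main(String[] args) {
        // Placeholder identifiers, not real phyloDB data.
        System.out.println(cypher());
        System.out.println(parameters("project-id", "dataset-id", 3L, "inference-id"));
    }
}
```

Passing the arguments as parameters rather than concatenating them into the query text avoids quoting problems and lets the same statement be reused.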
   

2. Create a new Service

Different services can be created by defining a sub-type of the Service type. A service is responsible for reading the input data from the database, computing the algorithm's result, and storing it back in the database.

public abstract class Service {
  public GraphDatabaseService database;
  public Log log;

  public Service(GraphDatabaseService database, Log log) {
    this.database = database;
    this.log = log;
  }
}

For our example, a new InferenceService type must be defined, as in the following code:

public class InferenceService extends Service {
  /*...*/
  public void goeBURST(String project, String dataset, String analysis, long lvs) {
    InferenceRepository repository = new InferenceRepository(database);
    GoeBURST algorithm = new GoeBURST();
    algorithm.init(project, dataset, analysis, lvs);
    Matrix matrix;
    // Read the input in its own transaction.
    try (Transaction tx1 = database.beginTx()) {
      matrix = repository.read(tx1, project, dataset);
      tx1.commit();
    }
    // Compute outside any transaction.
    Inference inference = algorithm.compute(matrix);
    // Store the result in a second transaction.
    try (Transaction tx2 = database.beginTx()) {
      repository.write(tx2, inference);
      tx2.commit();
    }
  }
}

3. Create a new Repository

Reading and writing data is done through the methods provided by the Repository layer.

To retrieve and store a particular type of data, we must define a sub-type of Repository:

public abstract class Repository<T, R> {
  /*...*/
  public abstract R read(Transaction tx, String... params) throws Exception;
  public abstract void write(Transaction tx, T param);
}

For example, an InferenceRepository type is needed to add the goeBURST algorithm as a plugin.
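
The page does not show InferenceRepository itself, so the following is only a sketch of what a Repository sub-type can look like, not the phyloDB implementation. To keep it runnable without Neo4j, the Transaction parameter is replaced by a hypothetical in-memory stand-in (FakeTx), the input is a plain int[][] matrix, and the result is a list of edge labels; all of these names and types are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the Neo4j Transaction: an in-memory store (assumption for this sketch).
final class FakeTx {
    final Map<String, int[][]> matrices = new HashMap<>();
    final List<String> written = new ArrayList<>();
}

// Same shape as the Repository type above, with FakeTx in place of Transaction.
abstract class SketchRepository<T, R> {
    abstract R read(FakeTx tx, String... params) throws Exception;
    abstract void write(FakeTx tx, T param);
}

// Hypothetical sub-type: reads a distance matrix and writes inferred edges.
final class SketchInferenceRepository extends SketchRepository<List<String>, int[][]> {
    @Override
    int[][] read(FakeTx tx, String... params) throws Exception {
        // The lookup key is built from the parameters, e.g. project and dataset.
        String key = String.join("/", params);
        int[][] matrix = tx.matrices.get(key);
        if (matrix == null) {
            throw new Exception("no data for " + key);
        }
        return matrix;
    }

    @Override
    void write(FakeTx tx, List<String> edges) {
        // Append the computed result to the store.
        tx.written.addAll(edges);
    }
}
```

The two type parameters keep the written type (T) and the read type (R) independent, which matches the service above: it reads a Matrix and writes an Inference.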

4. Create the plugin

If the implementation just described is added to the existing algorithms library, a new version of the library can be created by following the build instructions in the Algorithm jar section. It is also possible to package it as an extension to the algorithms library, producing a separate jar. In both cases, the new jar must be added to the plugins directory of Neo4j, along with its required dependencies:

   \$HOME\instance1
      \db
         \data
         \logs
         \plugins
            algorithms-1.0.jar
            apoc-5.9.0-core.jar
            apoc-5.9.0-extended.jar
      \app
         \logs
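
A quick way to confirm the layout above before starting Neo4j is to check that the expected jars are present in the plugins directory. The following is a minimal sketch using only the Java standard library; the class name PluginCheck is hypothetical, and the jar names are copied from the listing, so they should be adjusted to the versions actually deployed.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports which expected jars are absent from a plugins directory.
final class PluginCheck {
    // Jar names copied from the listing above; adjust to the deployed versions.
    static final List<String> EXPECTED = List.of(
        "algorithms-1.0.jar", "apoc-5.9.0-core.jar", "apoc-5.9.0-extended.jar");

    static List<String> missing(Path pluginsDir) {
        List<String> result = new ArrayList<>();
        for (String jar : EXPECTED) {
            if (!Files.isRegularFile(pluginsDir.resolve(jar))) {
                result.add(jar);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The plugins path is taken from the first argument, defaulting to ./plugins.
        Path dir = Path.of(args.length > 0 ? args[0] : "plugins");
        System.out.println("Missing jars: " + missing(dir));
    }
}
```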