Data Integrator
Developer setup

The most important step is to set the correct Python and Perl paths. Evaluating the "env.sh" script sets them:

$ eval $(./env.sh)
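
The script has to be evaluated rather than simply executed because a child process cannot change the environment of the shell that started it. With `eval $(...)`, the script only prints shell statements and the current shell executes them. The contents of "env.sh" are not shown here, so the following is just a minimal sketch of the pattern: `print_env` is a hypothetical stand-in for the script, and the variable names (PYTHONPATH, PERL5LIB) and the project path are assumptions.

```shell
# Hypothetical stand-in for ./env.sh: it only PRINTS export statements.
# The variable names and directories are assumptions for illustration.
print_env() {
    root=$1
    printf 'export PYTHONPATH="%s/src/python"\n' "$root"
    printf 'export PERL5LIB="%s/src/perl"\n' "$root"
}

# eval executes the printed statements in the current shell,
# so the variables are still set after the command returns.
eval "$(print_env /path/to/project)"
echo "$PYTHONPATH"
```

Running the stand-in without `eval` would only display the export lines and leave the current shell unchanged.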

Directory structure:

doc
Documentation files under SVN control.
src
Sources.
src/perl
Perl sources.
src/perl/cls
Sources providing classes, no shell commands though.
src/perl/cmd
Sources providing shell commands (applying classes).
src/perl/utest
Unit test framework for the Perl API.
src/python
Python sources.
src/python/cls
Sources providing classes, no shell commands though.
src/python/cmd
Sources providing shell commands (applying classes).
src/python/common
Global settings, used by all modules.
src/python/utest
Unit test framework for the Python API.
src/utest
Functions shared between the unit tests.
html
Documentation generated by Doxygen (not under SVN control).
data
Data directory. This is a link to, or a mirror of, the actual data directory.
galaxy
Galaxy XML tool definition files and installation programs.

Main project Makefile

A Makefile is present in the root folder of the project. It currently has the following targets:

  • make doc : regenerates all project documentation into html.
  • make check : runs all the test suites (Python and Perl). Please note: the environment must already be set up.
  • make clean : removes temporary files created by Python and the test suite.
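
Because "make check" depends on the environment, a small guard can catch a missing setup before the test suites start. This is only a sketch: `check_env` is a hypothetical helper, and PYTHONPATH and PERL5LIB are assumed names that should be replaced by whatever "env.sh" actually exports.

```shell
# Hypothetical guard: succeeds only if both variables are non-empty.
# PYTHONPATH and PERL5LIB are assumed names, not taken from env.sh.
check_env() {
    [ -n "${PYTHONPATH:-}" ] && [ -n "${PERL5LIB:-}" ]
}

if check_env; then
    echo "environment ready: run 'make check'"
else
    echo "environment not set: run 'eval \$(./env.sh)' first" >&2
fi
```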

EnsEMBL API installation

"DataIntegrator.cfg" expects the EnsEMBL API to be installed with the layout below, so that all the EnsEMBL API versions can be available at the same time and the user can choose which one to use.

prefix/
The API prefix, as specified in "DataIntegrator.cfg".
prefix/bin/
Required binaries (such as calc_genotypes).
prefix/lib/site_perl/
Installation root of the Perl API. Put all the Perl API source (core, variation, etc.) in this directory.
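
The layout above can be verified before pointing "DataIntegrator.cfg" at a prefix. The following is a minimal sketch: `check_api_prefix` is a hypothetical helper, not part of the project, and it only tests that the two directories exist, not that the API itself is complete.

```shell
# Hypothetical helper: verify that a prefix contains the expected directories.
check_api_prefix() {
    prefix=$1
    status=0
    for dir in "$prefix/bin" "$prefix/lib/site_perl"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        fi
    done
    return "$status"
}

# Example with a throwaway prefix created just for the demonstration.
demo_prefix=$(mktemp -d)
mkdir -p "$demo_prefix/bin" "$demo_prefix/lib/site_perl"
check_api_prefix "$demo_prefix" && echo "layout ok"
```

In real use, the argument would be the API prefix specified in "DataIntegrator.cfg".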

See "Compiling C programs for Ensembl API" for details on compiling the small API C programs.