Data Integrator (Perl API)
Perl coding style

Coding rules

  • All code that doesn't belong to D-Integrator itself goes into Dint::Utils.
  • Common initialization code for the program interface goes into Dint::Init and related modules.
  • Exceptions are handled via the Error module.
  • EnsEMBL exceptions are trapped via Dint::EnsEMBL::Utils::ens_except().
  • All exceptions derive from Dint::Error (which is an Error itself).
  • In general, exceptions are reserved for non-recoverable errors (that is, re-running the code with the same input will always fail).
  • Error::EAgain exceptions are special and can be handled via Dint::EnsEMBL::auto_retry(). Internally, all Dint classes should automatically retry EnsEMBL exceptions unless explicitly stated otherwise.
  • Transient errors are reported by returning undef instead of throwing an exception.
  • All functionality is broken up into classes.
  • All classes use the Class::InsideOut module.
  • All logging is performed via Log::Log4perl.
  • Read-only variables and globals are declared using the Readonly module.
  • All code is strict (warnings recommended).
  • One class per module.
  • Real EnsEMBL modules should be included using Dint::EnsEMBL::use_ens(), which performs delayed inclusion for runtime version switching.
  • Real EnsEMBL functions and variables cannot be used in the module body and/or before a Dint::EnsEMBL::Connector instance has been created.
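
The exception rules above can be illustrated with a small sketch. It is an assumption-laden illustration, not the actual Dint code: My::Error and My::Error::EAgain are hypothetical stand-ins for Dint::Error and the retryable EAgain exception, retry_sketch() is a hypothetical helper that is not the real Dint::EnsEMBL::auto_retry(), and plain die/eval is used in place of the Error module so that the example runs with core Perl only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Dint::Error (the real class derives from Error).
package My::Error;

sub new
{
  my ($class, %args) = @_;
  return bless { %args }, $class;
}

sub text {
  return $_[0]{text};
}

sub throw
{
  my ($class, %args) = @_;
  die $class->new(%args);
}

# Hypothetical retryable exception, playing the role of an EAgain error.
package My::Error::EAgain;
our @ISA = ('My::Error');

package main;

# Run $code up to $max times, retrying only on My::Error::EAgain.
# Any other exception is treated as non-recoverable and re-thrown.
sub retry_sketch
{
  my ($code, $max) = @_;
  for my $attempt (1 .. $max)
  {
    my $result;
    my $ok = eval { $result = $code->($attempt); 1 };
    return $result if($ok);
    die $@ unless(ref($@) && $@->isa('My::Error::EAgain'));
  }
  My::Error->throw(text => "giving up after $max attempts");
}

# Usage: the callback fails twice with a retryable error, then succeeds.
my $value = retry_sketch(sub {
  my ($attempt) = @_;
  My::Error::EAgain->throw(text => 'busy') if($attempt < 3);
  return "ok on attempt $attempt";
}, 5);
print "$value\n";
```

The design choice shown here is that only the retryable class is swallowed by the loop. Everything else propagates immediately, which matches the rule that ordinary exceptions signal errors that would fail again on the same input.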

Formatting conventions

A couple of Perl formatting conventions are followed:

  • Indentation is 2 spaces (expanded tabs).
  • Built-in Perl functions don't use parentheses, unless required for precedence rules.
  • No prototypes for functions.
  • Methods/functions use lower_case_with_underscores.
  • Class names use CamelCase.
  • The constructor, if any, is always new().
  • All modules use Exporter.
  • Conditionals use the right-side evaluation, without negation, for a single branch/statement:
# simple statement
function() if(condition);
# use unless rather than negating the condition
function() unless(condition);
  • Multi-branch conditionals use a right-hanging brace (on the same line) only for single-statement blocks, and a brace on its own line otherwise:
# fully packed
if(condition) {
  iftrue();
} else {
  iffalse();
}
# conditional branch with multiple statements
if(condition)
{
  statement;
  statement;
}
# right-hanging brace to conserve space
if(condition) {
  statement;
}
else
{
  statement;
  statement;
}
# no single-statement branch
if(condition)
{
  statement;
  statement;
  statement;
}
else
{
  statement;
  statement;
}
  • The same brace rules apply for blocks and subs.
  • As a recommendation, for conditional branches, the more likely branch should come first. Where no branch is more likely than the other, the shortest branch comes first.
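
Taken together, the naming and brace conventions above might look as follows in a small class. This is only a sketch: BaseCounter and its methods are hypothetical names invented for illustration, and the class is a plain hash-based object (not Class::InsideOut, and without Exporter) so that it runs with core Perl only.

```perl
use strict;
use warnings;

package BaseCounter;  # hypothetical class; class names use CamelCase

# The constructor is always new(); single-statement sub, right-hanging brace.
sub new {
  return bless { counts => {} }, $_[0];
}

# Methods use lower_case_with_underscores and no prototypes.
sub add_sequence
{
  my ($self, $sequence) = @_;
  return unless(defined $sequence);
  for my $base (split //, uc $sequence) {
    $self->{counts}{$base}++;
  }
  return length $sequence;
}

# Returns the most frequent base seen so far (ties broken alphabetically).
sub most_common_base
{
  my ($self) = @_;
  my $counts = $self->{counts};
  my @sorted = sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b } keys %$counts;
  return $sorted[0];
}

package main;

my $counter = BaseCounter->new;
$counter->add_sequence('acgta');
print $counter->most_common_base, "\n";
```

Built-in functions such as uc, split, length, and keys are called without parentheses, while the condition of a statement modifier keeps them, as in the examples above.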