Project author: rvosa

Project description:
Bio::Phylo::Forest::DBTree - Object API for trees in portable databases
Language: Perl
Repository: git://github.com/rvosa/bio-phylo-forest-dbtree.git
Created: 2013-02-07T19:40:45Z
Project community: https://github.com/rvosa/bio-phylo-forest-dbtree

License: Other


DBTree - toolkit for megatrees in portable SQL databases

Figure 1

An example mapping of a tree topology to a database table. The mapping is
created by processing a Newick tree file (infile.tre) as follows:

    megatree-loader -i infile.tre -d outfile.db

With this mapping, several topological queries run quickly once the output file
is loaded in sqlite3 (or the excellent SQLiteBrowser).

    -- select the most recent common ancestor of C and F
    select MRCA.* from node as MRCA, node as C, node as F
      where C.name='C' and F.name='F'
      and MRCA.left < min(C.left,F.left)
      and MRCA.right > max(C.right,F.right)
      order by MRCA.left desc limit 1;

    -- select the descendants from node n2
    select DESCENDANT.* from node as DESCENDANT, node as MRCA
      where MRCA.name='n2'
      and DESCENDANT.left > MRCA.left
      and DESCENDANT.right < MRCA.right;
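
To try these queries without a DBTree file, the following minimal Python sketch builds a toy `node` table in memory and runs both of them. Everything in it is an assumption made for illustration: the tree is hypothetical (it is not the one in Figure 1), the table has only the `name`, `left` and `right` columns, and the numbering is a plain depth-first nested-set scheme that may differ in detail from what megatree-loader writes. The column names are double-quoted only as a precaution, because LEFT and RIGHT are also SQL join keywords.

```python
import sqlite3

# Hypothetical toy tree, ((A,B)n3,((C,D)n4,(E,F)n5)n2)n1, as (name, children)
TREE = ("n1", [
    ("n3", [("A", []), ("B", [])]),
    ("n2", [
        ("n4", [("C", []), ("D", [])]),
        ("n5", [("E", []), ("F", [])]),
    ]),
])

def index_tree(node, rows, counter=None):
    """Assign nested-set left/right values by depth-first traversal."""
    if counter is None:
        counter = [0]
    name, children = node
    counter[0] += 1          # number assigned when entering the node
    left = counter[0]
    for child in children:
        index_tree(child, rows, counter)
    counter[0] += 1          # number assigned when leaving the node
    rows.append((name, left, counter[0]))
    return rows

conn = sqlite3.connect(":memory:")
conn.execute('create table node (name text, "left" integer, "right" integer)')
conn.executemany("insert into node values (?, ?, ?)", index_tree(TREE, []))

# Most recent common ancestor of C and F: the deepest interval enclosing both
mrca = conn.execute(
    '''select MRCA.name from node as MRCA, node as C, node as F
       where C.name='C' and F.name='F'
       and MRCA."left" < min(C."left", F."left")
       and MRCA."right" > max(C."right", F."right")
       order by MRCA."left" desc limit 1'''
).fetchone()[0]

# Descendants of n2: every interval strictly inside n2's interval
descendants = sorted(row[0] for row in conn.execute(
    '''select DESCENDANT.name from node as DESCENDANT, node as MRCA
       where MRCA.name='n2'
       and DESCENDANT."left" > MRCA."left"
       and DESCENDANT."right" < MRCA."right"'''))

print(mrca, descendants)
```

Both queries reduce to interval comparisons on two integer columns, so neither needs to walk the tree.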

Databases that are indexed in this way can deliver significant performance gains.
For example, a very common use of large, published, static phylogenies is to extract
subtrees from them for downstream analysis (e.g. in phylogenetic comparative studies).
This application is so common that it essentially forms the basis of the success of
Phylomatic and the PhyloTastic project. A similar subtree extraction operation is also
implemented by NCBI as the option to extract the ‘common tree’ from the NCBI taxonomy.
Here, this functionality is made available by the megatree-pruner program. To benchmark
its performance against a naive approach that operates on Newick strings, a pruner
script based on DendroPy was run side by side with megatree-pruner on randomly
selected sets of tips from the OpenTree topology. The performance difference is shown
below:

Figure 2
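
To illustrate why subtree extraction suits this indexing, here is a small Python sketch that finds the nodes retained when a tree is pruned to a set of tips. It is not the algorithm used by megatree-pruner; the function name `induced_subtree`, the three-column table and the hard-coded rows (a hypothetical toy tree) are all assumptions made for the example.

```python
import sqlite3

# Hypothetical nested-set rows (name, left, right) for the toy tree
# ((A,B)n3,((C,D)n4,(E,F)n5)n2)n1; not taken from a real DBTree file.
ROWS = [
    ("n1", 1, 22), ("n3", 2, 7), ("A", 3, 4), ("B", 5, 6),
    ("n2", 8, 21), ("n4", 9, 14), ("C", 10, 11), ("D", 12, 13),
    ("n5", 15, 20), ("E", 16, 17), ("F", 18, 19),
]

def induced_subtree(conn, tips):
    """Return names of the nodes kept when pruning the tree to `tips`."""
    marks = ",".join("?" for _ in tips)
    # Every (ancestor, tip) pair for the requested tips, via interval containment
    pairs = conn.execute(
        f'''select A.name, A."left", T.name from node as A, node as T
            where T.name in ({marks})
            and A."left" < T."left" and A."right" > T."right"''',
        list(tips),
    ).fetchall()
    covered = {}  # ancestor name -> (left, set of requested tips below it)
    for name, left, tip in pairs:
        covered.setdefault(name, (left, set()))[1].add(tip)
    # Ancestors covering the same tip set lie on one root-ward path; only the
    # deepest of them (largest left) is a branching point in the pruned tree.
    deepest = {}
    for name, (left, below) in covered.items():
        key = frozenset(below)
        if len(key) > 1 and left > deepest.get(key, (0, None))[0]:
            deepest[key] = (left, name)
    return sorted(tips) + sorted(name for _, name in deepest.values())

conn = sqlite3.connect(":memory:")
conn.execute('create table node (name text, "left" integer, "right" integer)')
conn.executemany("insert into node values (?, ?, ?)", ROWS)

kept = induced_subtree(conn, ["A", "C", "F"])
print(kept)
```

For tips A, C and F the sketch keeps n1 and n2 as branching points, which corresponds to the pruned topology (A,(C,F)n2)n1. The work is proportional to the number of requested tips and their ancestors, not to the size of the whole tree.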

Installation

The following installation instructions describe four different ways to install the
package. Unless you know what you are doing, the first way is probably the best one.

1. From BioConda

On many Linux-like operating systems as well as MacOSX, the entire installation completes
with this single command:

    conda install -c bioconda perl-bio-phylo-forest-dbtree

2. From the Comprehensive Perl Archive Network (CPAN)

On many Linux-like operating systems as well as MacOSX, the entire installation completes
with this single command:

    sudo cpanm Bio::Phylo::Forest::DBTree

  • Advantages - it’s simple and all prerequisites are automatically installed. You will
    obtain the latest stable release from CPAN, which is amply tested.
  • Disadvantages - you will likely get code that is a lot older than the latest work
    on this package.

3. From GitHub

On many Linux-like operating systems as well as MacOSX, you can install the latest code
from the repository with this single command:

    sudo cpanm git://github.com/rvosa/bio-phylo-forest-dbtree.git

  • Advantages - it’s simple and all prerequisites are automatically installed. You will
    get the latest code, including any new features and bug fixes.
  • Disadvantages - you will install untested, recent code, which might include new bugs
    or unfinished features, in your system folders.

4. From an archive snapshot

This is the approach you might take if you want complete control over the installation,
and/or if there is a specific archive (such as Zenodo release 10.5281/zenodo.1035856)
you wish to install or verify.

This approach starts by installing the prerequisites manually:

    # do this only if you don't have these already
    sudo cpanm Bio::Phylo
    sudo cpanm DBIx::Class
    sudo cpanm DBD::SQLite

Then, unpack the archive, move into the top level folder, and issue the build commands:

    perl Makefile.PL
    make
    make test

Finally, you can opt to install the built products (using sudo make install), or
keep them in the present location, which would require you to update two environment
variables:

    # add the script folder inside the archive to the search path for executables
    export PATH="$PATH":`pwd`/script
    # add the lib folder to the search path for perl libraries
    export PERL5LIB="$PERL5LIB":`pwd`/lib

BUGS

Please report any bugs or feature requests on the GitHub bug tracker:
https://github.com/rvosa/bio-phylo-forest-dbtree/issues

Copyright 2013-2019 Rutger Vos, All Rights Reserved. This program is free software;
you can redistribute it and/or modify it under the same terms as Perl itself, i.e.
a choice between the following licenses:

SEE ALSO

Several curated, large phylogenies released by ongoing projects are made available as
database files that this distribution can operate on. These are: