A modular system for evolutionary analysis
Our knowledge of the characteristics of the world's diverse species (including their DNA sequences) is growing daily. To analyze these data for evolutionary patterns, biologists are relying increasingly on specialized software. A handful of biologist-programmers (ourselves included) have struggled to write and maintain a few massive multi-purpose programs widely used in evolutionary analysis, and many others have written small programs for particular analyses. The inertia of the massive programs and the isolation and idiosyncrasies of the small programs have prompted us to develop Mesquite, a new modular system that will allow many programmers to contribute building blocks to a common system. Mesquite began with an emphasis on phylogenetic analysis, but its flexibility allows for other uses in evolutionary biology. It is written in Java for ease of development and platform independence. The Mesquite documentation has a page describing why Mesquite was made and commenting on the relationship between Mesquite and MacClade.
In Mesquite, modules cooperate to perform analyses. There are modules for reading and writing file components, for drawing charts, for performing evolutionary calculations, and so on. Different modules that perform the same basic duty might use different assumptions, criteria, or types of data, and thus alternative analyses can be selected by choice of which module to employ. For example, modules of the NumberForTree class respond with a number when given an evolutionary (phylogenetic) tree. Different NumberForTree modules could return different numbers: one might return the symmetry of the tree, another an assessment of the likelihood of the tree based on DNA data. A user, examining a chart showing symmetry of a series of trees, could interchange NumberForTree modules to view likelihood of the trees instead.
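The interchangeability described above can be illustrated with a minimal sketch in modern Java. Everything in it is hypothetical: the `Node` tree, the `TreeNumber` interface, and the `chart` method are stand-ins invented for illustration, not Mesquite's actual classes or signatures. The asymmetry measure (largest minus smallest child clade size, summed over internal nodes) is one simple way to quantify symmetry, and tree height takes the place of likelihood, which would need sequence data and a substitution model.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these types are Mesquite's real classes.
class ModuleSwapSketch {

    /** Minimal rooted tree: a node with zero or more children. */
    static final class Node {
        final List<Node> children;
        Node(Node... children) { this.children = Arrays.asList(children); }

        int leafCount() {
            if (children.isEmpty()) return 1;
            int total = 0;
            for (Node child : children) total += child.leafCount();
            return total;
        }
    }

    /** The shared duty: respond with a number when given a tree. */
    interface TreeNumber {
        String name();
        double numberFor(Node tree);
    }

    /** Asymmetry: largest minus smallest child clade size, summed over internal nodes. */
    static final class Imbalance implements TreeNumber {
        public String name() { return "imbalance"; }
        public double numberFor(Node tree) {
            if (tree.children.isEmpty()) return 0;
            int min = Integer.MAX_VALUE, max = 0;
            double below = 0;
            for (Node child : tree.children) {
                int n = child.leafCount();
                min = Math.min(min, n);
                max = Math.max(max, n);
                below += numberFor(child);
            }
            return (max - min) + below;
        }
    }

    /** Height: number of edges on the longest root-to-leaf path. */
    static final class Height implements TreeNumber {
        public String name() { return "height"; }
        public double numberFor(Node tree) {
            double deepest = 0;
            for (Node child : tree.children) deepest = Math.max(deepest, 1 + numberFor(child));
            return deepest;
        }
    }

    /** Stand-in for a chart: it knows only the TreeNumber duty, not which module fills it. */
    static String chart(TreeNumber module, List<Node> trees) {
        StringBuilder sb = new StringBuilder(module.name()).append(':');
        for (Node t : trees) sb.append(' ').append(module.numberFor(t));
        return sb.toString();
    }

    /** Ladder-shaped four-taxon tree, (((A,B),C),D). */
    static Node caterpillar() {
        return new Node(new Node(new Node(new Node(), new Node()), new Node()), new Node());
    }

    /** Symmetric four-taxon tree, ((A,B),(C,D)). */
    static Node balanced() {
        return new Node(new Node(new Node(), new Node()), new Node(new Node(), new Node()));
    }

    public static void main(String[] args) {
        List<Node> trees = Arrays.asList(caterpillar(), balanced());
        // Same chart, same trees; only the module is swapped.
        System.out.println(chart(new Imbalance(), trees));
        System.out.println(chart(new Height(), trees));
    }
}
```

The `chart` method never names a concrete module, so a new `TreeNumber` implementation can be dropped in without changing it. That is the property the paragraph above attributes to NumberForTree modules.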
A programmer can contribute by writing a new module that returns a number for a tree, knowing that such a module can take advantage of the Mesquite infrastructure (charts, data editors, tree manipulators, etc.). Likewise, a programmer might write a statistical or graphical module that analyzes or plots numbers for trees, knowing that such a module can take advantage of all the NumberForTree modules that have been and will be written. The Mesquite system is therefore analogous to the children's toys in which a creature can be made by choosing one head from several alternative heads, one torso, limbs, and a tail, in a mix-and-match fashion. As different programmers make different heads, torsos, and limbs, the number of different creatures (analyses) that can be assembled rises multiplicatively.
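The multiplicative growth can be made concrete with a small sketch, again in modern Java and again under assumptions of our own choosing. Trees are reduced to plain Newick strings, the two "number for tree" functions and the two summary functions are hypothetical examples, and nothing here reflects how Mesquite actually registers or pairs its modules.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of mix-and-match composition; not Mesquite's real API.
class MixAndMatchSketch {

    /** Tip count of a simple Newick string: one more than the number of commas. */
    static double tipCount(String newick) {
        return newick.chars().filter(c -> c == ',').count() + 1;
    }

    /** Height of a simple Newick string: deepest nesting of parentheses. */
    static double height(String newick) {
        int depth = 0, deepest = 0;
        for (char c : newick.toCharArray()) {
            if (c == '(') deepest = Math.max(deepest, ++depth);
            else if (c == ')') depth--;
        }
        return deepest;
    }

    /** "Heads": modules that turn one tree into one number. */
    static Map<String, ToDoubleFunction<String>> numberModules() {
        Map<String, ToDoubleFunction<String>> m = new LinkedHashMap<>();
        m.put("tips", MixAndMatchSketch::tipCount);
        m.put("height", MixAndMatchSketch::height);
        return m;
    }

    /** "Torsos": modules that summarize the numbers from a series of trees. */
    static Map<String, ToDoubleFunction<double[]>> summaryModules() {
        Map<String, ToDoubleFunction<double[]>> m = new LinkedHashMap<>();
        m.put("mean", xs -> Arrays.stream(xs).average().orElse(Double.NaN));
        m.put("max", xs -> Arrays.stream(xs).max().orElse(Double.NaN));
        return m;
    }

    /** Pairs every summary module with every number module. */
    static Map<String, Double> allAnalyses(List<String> trees) {
        Map<String, Double> results = new LinkedHashMap<>();
        for (Map.Entry<String, ToDoubleFunction<double[]>> s : summaryModules().entrySet()) {
            for (Map.Entry<String, ToDoubleFunction<String>> n : numberModules().entrySet()) {
                double[] values = trees.stream().mapToDouble(n.getValue()).toArray();
                results.put(s.getKey() + " of " + n.getKey(), s.getValue().applyAsDouble(values));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> trees = Arrays.asList("(((A,B),C),D);", "((A,B),(C,D));");
        // 2 summary modules x 2 number modules = 4 distinct analyses.
        allAnalyses(trees).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Adding a third entry to either map raises the number of analyses from four to six without touching the other map, which is the sense in which contributions combine multiplicatively rather than additively.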
The documentation has more details on Mesquite's modular architecture and how it works.
Because of its modular nature, the eventual scope of analyses that can be done with Mesquite is unpredictable. Modules written so far include ones for basic character analysis (parsimony and likelihood), comparative biology, molecular evolution, population genetics and morphometrics. The Mesquite documentation summarizes what Mesquite is planned to do.
Mesquite has been released as a public beta version, available for download here. As it is a big project (over 120,000 lines of code, over 900 total Java classes), we anticipate that numerous bugs will be found initially. Within a couple of months we hope to have various packages working well enough to produce publishable results.
If you're interested, you can check out screenshots, and the older screenshots available from the talk introducing Mesquite at the 2000 Evolution meetings. There is also a tutorial introducing the basics of using Mesquite.
We have not yet released source code broadly, primarily because we don't yet know what sort of license would be appropriate (suggestions welcome). We plan to have some style of open source, thus encouraging a community of contributing programmers. If you're interested in writing modules please contact us.
Also, we plan to increase Mesquite's connectedness in the world outside the local hard disk via distributed processing and interaction with Internet databases such as the Tree of Life. We already have Mesquite communicating with programs such as Swofford's PAUP*, and we hope to have it link to other programming efforts including Drummond and Strimmer's PAL.
Mesquite requires a Java 1.1 or higher virtual machine. On the Macintosh operating system, this means that MRJ 2.2 or later is needed. On Windows, Linux and Unix, we have found Mesquite to behave reasonably well under JRE or JDK 1.1.8, 1.2, and 1.3. We have tested it most thoroughly on Mac OS and Windows.
Mesquite was developed with the assistance of a Fellowship to WPM from the David and Lucile Packard Foundation.
24 July 2001